APPROX_DISTINCT() GROUP BY ...
All functions > GROUP BY > APPROX_DISTINCT() GROUP BY ...
Returns the approximate number of distinct values in the group.
Syntax
APPROX_DISTINCT(expr) [ FILTER (WHERE condition) ] [ GROUP BY feature [, feature ...] ]
Notes
- Uses HyperLogLog algorithm for efficient approximate counting
- Much faster than COUNT(DISTINCT) for large datasets
- Provides probabilistic estimate with controllable error rate
- NULL values are excluded from the count
- Precision parameter controls accuracy vs memory tradeoff
- Typical accuracy: within 2-3% of exact count
- Returns 0 for empty groups
- Can be used with WHERE clause to filter before aggregation
- Can be used with GROUP BY clause for grouped aggregation
Examples
FeatureQL
SELECT
f1 := ZIP(ARRAY[1, 2, 3] AS value).TRANSFORM(SELECT APPROX_DISTINCT(value)).UNWRAP_ONE() -- Approximate distinct count on a small group
;Result
| f1 BIGINT |
|---|
| 3 |