APPROX_DISTINCT() OVER ...
Returns the approximate number of distinct values in the window frame.
Syntax
APPROX_DISTINCT(expr [, precision]) OVER ([PARTITION BY expr [, ...]] [ORDER BY sort_item [, ...]] [ROWS|RANGE|GROUPS frame])
Notes
- Returns an approximate count of distinct values, computed with the HyperLogLog algorithm
- Typically much faster than an exact COUNT(DISTINCT) on large datasets, at the cost of a small estimation error
- Typical error rate is around 2.3%
- Always returns a BIGINT
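The 2.3% figure is consistent with the standard HyperLogLog error estimate of roughly 1.04 / sqrt(m) for m registers, which comes to about 2.3% at m = 2048. Whether this engine uses that register count, and how the optional precision argument maps onto it, is not stated on this page. The Python sketch below only illustrates the arithmetic under that assumption; `hll_standard_error` is a hypothetical helper, not part of FeatureQL.

```python
import math


def hll_standard_error(num_registers: int) -> float:
    """Approximate relative standard error of a HyperLogLog estimate."""
    return 1.04 / math.sqrt(num_registers)


# Assumption: 2048 registers. The engine's actual setting is not documented here.
print(f"{hll_standard_error(2048):.2%}")
```

Under the same formula, quadrupling the number of registers halves the expected error, which is the usual trade-off between memory and accuracy for this kind of sketch.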
Examples
FeatureQL
SELECT
f1 := ZIP(ARRAY[1, 2, 3, 4] AS id, ARRAY['a', 'b', 'b', 'c'] AS v).TRANSFORM(SELECT APPROX_DISTINCT(v) OVER (ORDER BY id ASC)).UNWRAP() -- Per-row approximate distinct counts (BIGINTs), not the VARCHAR values in v
;
Result
| f1 ARRAY |
|---|
| [1, 2, 2, 3] |
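The result matches what an exact running distinct count would give, assuming that a window with ORDER BY and no explicit frame covers the rows from the start of the partition through the current row (the usual SQL default, not confirmed on this page). The Python sketch below reproduces that semantics with a plain set in place of HyperLogLog, so it is exact and not approximate; `running_distinct_counts` is a hypothetical helper for illustration only.

```python
def running_distinct_counts(ids, values):
    """Exact distinct count over rows from the first through the current one, ordered by id."""
    seen = set()
    counts = []
    # Process rows in ascending id order, mirroring ORDER BY id ASC.
    for _, value in sorted(zip(ids, values), key=lambda pair: pair[0]):
        seen.add(value)
        counts.append(len(seen))
    return counts


print(running_distinct_counts([1, 2, 3, 4], ["a", "b", "b", "c"]))  # [1, 2, 2, 3]
```

The third row repeats 'b', so the count stays at 2 there. On inputs this small an approximate counter is expected to agree with the exact count; differences would only show up at much higher cardinalities.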