APPROX_DISTINCT() GROUP BY ...

Returns the approximate number of distinct values in the group.

Syntax

Diagram(
  Sequence(
    Terminal("APPROX_DISTINCT"),
    Terminal("("),
    NonTerminal("expr"),
    Terminal(")"),
    Choice(0, Skip(),
      Sequence(
        Terminal("FILTER"),
        Terminal("("),
        Terminal("WHERE"),
        NonTerminal("condition"),
        Terminal(")")
      )
    ),
    Choice(0, Skip(),
      Sequence(
        Terminal("GROUP BY"),
        OneOrMore(NonTerminal("feature"), Terminal(","))
      )
    )
  )
)
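| Parameter | Type    | Required | Description                                             |
|-----------|---------|----------|---------------------------------------------------------|
| expr      | T       | Yes      | The expression to count distinct values of              |
| condition | BOOLEAN | No       | The condition to filter the values before aggregation   |
| feature   | FEATURE | No       | The features to group by (many features are supported)  |

| Parameter | Type    | Required | Description                                             |
|-----------|---------|----------|---------------------------------------------------------|
| expr      | T       | Yes      | The expression to count distinct values of              |
| precision | DOUBLE  | Yes      | Precision parameter for accuracy control                |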
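
To make the roles of expr, condition, and feature concrete, the following Python sketch computes the exact per-group distinct count that APPROX_DISTINCT estimates. It is an illustrative reference model, not the engine's implementation: the function name `distinct_by_group`, the row layout, and the column names are hypothetical, and it assumes FILTER applies to the aggregated values rather than to group creation, as in standard SQL.

```python
from collections import defaultdict


def distinct_by_group(rows, expr, condition=None, features=()):
    """Exact reference for the value APPROX_DISTINCT estimates per group.

    rows      -- iterable of dicts (hypothetical row layout)
    expr      -- callable returning the value to count for a row
    condition -- optional callable; rows failing it are skipped (FILTER)
    features  -- column names that form the group key (GROUP BY)
    """
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[f] for f in features)
        bucket = groups[key]  # the group exists even if nothing qualifies
        if condition is not None and not condition(row):
            continue  # filtered out before aggregation
        value = expr(row)
        if value is not None:  # NULL values are not counted
            bucket.add(value)
    return {key: len(values) for key, values in groups.items()}


# Hypothetical sample data, for illustration only.
rows = [
    {"customer": "a", "product": "p1", "amount": 10.0},
    {"customer": "a", "product": "p2", "amount": 0.0},
    {"customer": "a", "product": "p1", "amount": 5.0},
    {"customer": "b", "product": None, "amount": 7.0},
]

counts = distinct_by_group(
    rows,
    expr=lambda r: r["product"],
    condition=lambda r: r["amount"] > 0,
    features=("customer",),
)
print(counts)  # {('a',): 1, ('b',): 0}
```

Customer "a" has one distinct product after filtering, and customer "b" yields 0 because its only value is NULL.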

Notes

  • Uses the HyperLogLog algorithm for efficient approximate counting
  • Much faster than COUNT(DISTINCT) on large datasets
  • Provides a probabilistic estimate with a controllable error rate
  • NULL values are excluded from the count
  • The precision parameter controls the trade-off between accuracy and memory use
  • Typical accuracy: within 2-3% of the exact count
  • Returns 0 for empty groups
  • Can be combined with a FILTER (WHERE condition) clause to filter values before aggregation
  • Can be combined with a GROUP BY clause for grouped aggregation
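
The notes above can be illustrated with a minimal HyperLogLog sketch in Python. This is a textbook version written for illustration, not the engine's actual implementation: the class name, the SHA-1 based hash, and the parameter `p` (number of index bits) are assumptions, and `p` does not necessarily correspond to the DOUBLE precision parameter documented above. With m = 2^p registers, the standard error of HyperLogLog is roughly 1.04 / sqrt(m), so more registers cost more memory and give a tighter estimate.

```python
import hashlib
import math


class HyperLogLog:
    """Textbook HyperLogLog with a 64-bit hash (illustrative sketch only)."""

    def __init__(self, p=12):
        self.p = p                     # index bits (assumed knob, p >= 7)
        self.m = 1 << p                # number of registers
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant, m >= 128

    def add(self, value):
        if value is None:              # NULL values are not counted
            return
        digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash
        idx = h >> (64 - self.p)                     # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        raw = self.alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting); gives 0 when empty.
            return round(self.m * math.log(self.m / zeros))
        return round(raw)

    def standard_error(self):
        return 1.04 / math.sqrt(self.m)


hll = HyperLogLog(p=12)
for i in range(50_000):
    hll.add(i)
    hll.add(i)                         # duplicates do not change the estimate
hll.add(None)                          # ignored

print("estimate:", hll.count())
print("expected relative error: about", round(hll.standard_error(), 4))
```

The sketch stores only 2^p small integers regardless of how many values are added, which is why approximate counting can be much cheaper than tracking every distinct value exactly.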
Last update at: 2026/03/03 16:47:38