Growth accounting functions

Growth accounting functions track user engagement and retention metrics using an efficient bitwise representation of activity. This approach enables fast computation of cohort analytics, retention rates, and user lifecycle stages.

Core concept: Activity bits

Activity bits encode up to 60 days of user activity as a single BIGINT, where each bit represents whether a user was active on a specific day relative to a reference date.

  • Bit 0 (rightmost): Reference date (today)
  • Bit 1: Yesterday
  • Bit 2: 2 days ago
  • ...
  • Bit 59: 59 days ago

Example: Binary 10000000111 means active today, 1 day ago, 2 days ago, and 10 days ago.

Conversion functions

TO_ACTIVITY_BITS()

Converts an array of dates to bitwise representation.

TO_ACTIVITY_BITS(
    dates: ARRAY(DATE),
    reference_date: DATE
) -> BIGINT
sql

FROM_ACTIVITY_BITS()

Converts bitwise representation back to date array.

FROM_ACTIVITY_BITS(
    activity_bits: BIGINT,
    reference_date: DATE
) -> ARRAY(DATE)
sql

BINARY_REPRESENTATION()

Returns human readable binary string (for debugging).

BINARY_REPRESENTATION(activity_bits: BIGINT) -> VARCHAR
sql

Activity analysis functions

IS_ACTIVE()

Checks if user was active within a time window.

IS_ACTIVE(
    activity_bits: BIGINT,
    days_back: INT,              -- Check last N days
    offset_days: INT = 0          -- Starting from M days ago
) -> BOOLEAN
sql

COUNT_ACTIVE()

Counts active days within a time window.

COUNT_ACTIVE(
    activity_bits: BIGINT,
    days_back: INT,
    offset_days: INT = 0
) -> INT
sql

RECENCY()

Returns days since last activity (0 = active today).

RECENCY(activity_bits: BIGINT) -> INT
sql

User lifecycle classification

ACTIVITY_STATUS()

Classifies users into lifecycle stages by comparing activity across time periods.

ACTIVITY_STATUS(
    activity_bits: BIGINT,
    current_period_days: INT,
    previous_period_days: INT = current_period_days
) -> VARCHAR
sql

Returns one of:

  • new: First activity in current period
  • retained: Active in both current and previous period
  • resurrected: Active in current but not previous period (was active before)
  • churned: Not active in current but was in previous period
  • stale: No activity in either period

Complete example

Practical patterns

Weekly active users (WAU)

-- Users active in last 7 days
COUNT_IF(IS_ACTIVE(activity_bits, 7))
sql

Daily active users trending

-- Compare today vs yesterday
WITH
    ACTIVE_TODAY := IS_ACTIVE(activity_bits, 1),
    ACTIVE_YESTERDAY := IS_ACTIVE(activity_bits, 1, 1)
SELECT
    COUNT_IF(ACTIVE_TODAY) AS dau_today,
    COUNT_IF(ACTIVE_YESTERDAY) AS dau_yesterday
sql

Retention cohorts

-- 7 day retention for users active 14 days ago
COUNT_IF(
    IS_ACTIVE(activity_bits, 7, 7)  -- Active days 7-14
    AND IS_ACTIVE(activity_bits, 7)  -- Also active days 0-7
) / COUNT_IF(IS_ACTIVE(activity_bits, 7, 7))
sql

Engagement intensity

-- Classify by activity frequency
CASE
    WHEN COUNT_ACTIVE(activity_bits, 7) >= 6 THEN 'power_user'
    WHEN COUNT_ACTIVE(activity_bits, 7) >= 3 THEN 'regular'
    WHEN COUNT_ACTIVE(activity_bits, 7) >= 1 THEN 'casual'
    ELSE 'inactive'
END
sql

Churn risk scoring

-- Higher score = higher churn risk
CASE
    WHEN RECENCY(activity_bits) = 0 THEN 0  -- Active today
    WHEN RECENCY(activity_bits) <= 3 THEN 1  -- Recently active
    WHEN RECENCY(activity_bits) <= 7 THEN 2  -- Week old
    WHEN RECENCY(activity_bits) <= 14 THEN 3  -- Two weeks
    ELSE 4  -- High risk
END
sql

Updating activity bits

UPDATE_ACTIVITY_BIT()

Updates activity bits with new activity, automatically shifting historical data.

UPDATE_ACTIVITY_BIT(
    activity_bits: BIGINT,
    is_active_today: BOOLEAN
) -> BIGINT
sql

Usage pattern for daily updates:

-- Shift history and mark today's activity
NEW_BITS := UPDATE_ACTIVITY_BIT(
    OLD_BITS,
    USER_HAD_ACTIVITY_TODAY
)
sql

Implementation notes

Storage efficiency

  • 60 days of history in 8 bytes (vs 60+ bytes for date array)
  • Enables columnar compression for analytics databases
  • Reduces memory footprint for real time calculations

Performance characteristics

  • Bitwise operations are CPU native instructions
  • No array scanning or date comparisons needed
  • Parallelizable across user segments
  • Cache friendly data structure

Limitations

  • Maximum 60 day lookback (BIGINT constraint)
  • Day granularity only (no hourly metrics)
  • Reference date must be consistent across calculations

Best practices

  1. Consistent reference dates: Always use the same reference date (typically TODAY) within a query
  2. Period alignment: Ensure comparison periods don't overlap for accurate lifecycle classification
  3. Null handling: Treat NULL activity_bits as never active user
  4. Incremental updates: Use UPDATE_ACTIVITY_BIT for efficient daily refreshes
  5. Debugging: Use BINARY_REPRESENTATION to visually verify bit patterns
Last update at: 2025/10/13 10:23:46