Growth accounting functions
Growth accounting functions track user engagement and retention metrics using an efficient bitwise representation of activity. This approach enables fast computation of cohort analytics, retention rates, and user lifecycle stages.
Core concept: Activity bits
Activity bits encode up to 60 days of user activity as a single BIGINT, where each bit represents whether a user was active on a specific day relative to a reference date.
- Bit 0 (rightmost): Reference date (today)
- Bit 1: Yesterday
- Bit 2: 2 days ago
- ...
- Bit 59: 59 days ago
Example: Binary 10000000111
means active today, 1 day ago, 2 days ago, and 10 days ago.
Conversion functions
TO_ACTIVITY_BITS()
Converts an array of dates to bitwise representation.
TO_ACTIVITY_BITS(
dates: ARRAY(DATE),
reference_date: DATE
) -> BIGINT
FROM_ACTIVITY_BITS()
Converts bitwise representation back to date array.
FROM_ACTIVITY_BITS(
activity_bits: BIGINT,
reference_date: DATE
) -> ARRAY(DATE)
BINARY_REPRESENTATION()
Returns human readable binary string (for debugging).
BINARY_REPRESENTATION(activity_bits: BIGINT) -> VARCHAR
Activity analysis functions
IS_ACTIVE()
Checks if user was active within a time window.
IS_ACTIVE(
activity_bits: BIGINT,
days_back: INT, -- Check last N days
offset_days: INT = 0 -- Starting from M days ago
) -> BOOLEAN
COUNT_ACTIVE()
Counts active days within a time window.
COUNT_ACTIVE(
activity_bits: BIGINT,
days_back: INT,
offset_days: INT = 0
) -> INT
RECENCY()
Returns days since last activity (0 = active today).
RECENCY(activity_bits: BIGINT) -> INT
User lifecycle classification
ACTIVITY_STATUS()
Classifies users into lifecycle stages by comparing activity across time periods.
ACTIVITY_STATUS(
activity_bits: BIGINT,
current_period_days: INT,
previous_period_days: INT = current_period_days
) -> VARCHAR
Returns one of:
- new: First activity in current period
- retained: Active in both current and previous period
- resurrected: Active in current but not previous period (was active before)
- churned: Not active in current but was in previous period
- stale: No activity in either period
Complete example
Practical patterns
Weekly active users (WAU)
-- Users active in last 7 days
COUNT_IF(IS_ACTIVE(activity_bits, 7))
Daily active users trending
-- Compare today vs yesterday
WITH
ACTIVE_TODAY := IS_ACTIVE(activity_bits, 1),
ACTIVE_YESTERDAY := IS_ACTIVE(activity_bits, 1, 1)
SELECT
COUNT_IF(ACTIVE_TODAY) AS dau_today,
COUNT_IF(ACTIVE_YESTERDAY) AS dau_yesterday
Retention cohorts
-- 7 day retention for users active 14 days ago
COUNT_IF(
IS_ACTIVE(activity_bits, 7, 7) -- Active days 7-14
AND IS_ACTIVE(activity_bits, 7) -- Also active days 0-7
) / COUNT_IF(IS_ACTIVE(activity_bits, 7, 7))
Engagement intensity
-- Classify by activity frequency
CASE
WHEN COUNT_ACTIVE(activity_bits, 7) >= 6 THEN 'power_user'
WHEN COUNT_ACTIVE(activity_bits, 7) >= 3 THEN 'regular'
WHEN COUNT_ACTIVE(activity_bits, 7) >= 1 THEN 'casual'
ELSE 'inactive'
END
Churn risk scoring
-- Higher score = higher churn risk
CASE
WHEN RECENCY(activity_bits) = 0 THEN 0 -- Active today
WHEN RECENCY(activity_bits) <= 3 THEN 1 -- Recently active
WHEN RECENCY(activity_bits) <= 7 THEN 2 -- Week old
WHEN RECENCY(activity_bits) <= 14 THEN 3 -- Two weeks
ELSE 4 -- High risk
END
Updating activity bits
UPDATE_ACTIVITY_BIT()
Updates activity bits with new activity, automatically shifting historical data.
UPDATE_ACTIVITY_BIT(
activity_bits: BIGINT,
is_active_today: BOOLEAN
) -> BIGINT
Usage pattern for daily updates:
-- Shift history and mark today's activity
NEW_BITS := UPDATE_ACTIVITY_BIT(
OLD_BITS,
USER_HAD_ACTIVITY_TODAY
)
Implementation notes
Storage efficiency
- 60 days of history in 8 bytes (vs 60+ bytes for date array)
- Enables columnar compression for analytics databases
- Reduces memory footprint for real time calculations
Performance characteristics
- Bitwise operations are CPU native instructions
- No array scanning or date comparisons needed
- Parallelizable across user segments
- Cache friendly data structure
Limitations
- Maximum 60 day lookback (BIGINT constraint)
- Day granularity only (no hourly metrics)
- Reference date must be consistent across calculations
Best practices
- Consistent reference dates: Always use the same reference date (typically TODAY) within a query
- Period alignment: Ensure comparison periods don't overlap for accurate lifecycle classification
- Null handling: Treat NULL activity_bits as never active user
- Incremental updates: Use UPDATE_ACTIVITY_BIT for efficient daily refreshes
- Debugging: Use BINARY_REPRESENTATION to visually verify bit patterns