FeatureQL homepage tutorial
This tutorial will guide you through the basics of FeatureQL by building customer analytics features for an e-commerce dataset.
Introduction
FeatureQL is a declarative language for defining and computing features. It combines the familiarity of SQL with powerful abstractions for working with entities, relationships, and aggregations.
In this tutorial, you will learn how to:
- Define entities and map them to external tables
- Compute aggregations across relationships with
RELATED() - Work with arrays using
ZIP()andEXTEND() - Create parameterized features with
MACRO() - Run controlled experiments with deterministic hashing
Create data
First, let's create a simple e-commerce schema with customers, orders, and a pre-aggregated "One Big Table" (OBT) for customer order summaries.
We have three tables:
home.dim_customers: customer dimension with id, name, and creation datehome.fct_orders: order facts with price and timestampshome.agg_customers_obt: pre-aggregated table with each customer's last order and array of all order ids
Create features
Now let's define our feature namespace and map the external tables to FeatureQL entities.
This creates:
- Two entities:
customersandorders - Primary key inputs for each entity
- External column mappings that bind table columns to features
- Keysets that define how to enumerate all keys for each entity
Example 1: customer LTV
Let's compute customer lifetime value (total spend) using the RELATED() function, which performs a GROUP BY and JOIN in one step.
FeatureQL
The expression customer_id.RELATED(SUM(order_price) GROUP BY order_customer_id) aggregates order prices by customer and joins the result back to each customer.
Equivalent SQL
In SQL, this requires a CTE with GROUP BY, then a LEFT JOIN. FeatureQL expresses the same logic more concisely.
Example 2: customer LTV from array
Sometimes your data is stored in a denormalized format with arrays
of foreign keys (common in OBT or key-value stores). FeatureQL handles
this with ZIP() and EXTEND().
FeatureQL
ZIP(customer_orders AS order_id)unpacks the array into rowsEXTEND(...) WITH order_pricejoins each order_id to get its priceARRAY_SUM()aggregates the prices back to a scalar
Equivalent SQL
The SQL version requires UNNEST, a JOIN, and re-aggregation.
Example 3: send promocode
Let's build a business rule: show a promocode to customers who haven't ordered in 30+ days and have LTV > 1000.
First, we persist customer_ltv as a reusable feature:
FeatureQL
We use RELATED() to look up the last order's timestamp through
the last_order_id foreign key, then compute recency and apply
the business rule.
Equivalent SQL
Example 4: send promocode with experiment
Now let's add A/B testing. We want to test whether a shorter recency threshold (90 days vs 30 days) affects conversion.
First, persist the recency feature:
FeatureQL
Key concepts:
INPUT(BIGINT)declares a parameter typeMACRO(... USING INPUT threshold)creates a parameterized expressionHASH01()provides deterministic bucketing for experiment assignment- The same macro is instantiated with different thresholds for control/test
Equivalent SQL
The SQL version requires duplicating the business logic and manually implementing the hash-based bucketing.
Summary
In this tutorial you learned how to:
- Map external tables to FeatureQL entities with
EXTERNAL_COLUMNS() - Compute aggregations across relationships with
RELATED() - Work with arrays using
ZIP()andEXTEND() - Create reusable parameterized features with
MACRO() - Implement deterministic experiment assignment with
HASH01()
FeatureQL lets you express complex feature logic concisely while maintaining clear semantics and composability.