Build customer analytics with FeatureQL
FeatureQL replaces multi-CTE SQL with composable feature definitions. This tutorial builds customer analytics for an e-commerce dataset, progressing from simple aggregations to parameterized A/B tests.
Set up the data
The tutorial uses three tables: a customer dimension, an order fact table, and a pre-aggregated "One Big Table" (OBT) that stores each customer's last order and an array of all their order IDs.
/* SQL */
--
CREATE SCHEMA IF NOT EXISTS home;
DROP TABLE IF EXISTS home.dim_customers;
DROP TABLE IF EXISTS home.fct_orders;
DROP TABLE IF EXISTS home.agg_customers_obt;
--
CREATE TABLE home.dim_customers (
customer_id BIGINT,
name VARCHAR,
created_at DATE
);
INSERT INTO home.dim_customers VALUES
(100, 'Alice', '2022-03-15'),
(101, 'Bob', '2023-06-10'),
(102, 'Charlie', '2024-01-20'),
(103, 'Diana', '2024-11-01');
--
CREATE TABLE home.fct_orders (
order_id BIGINT,
order_customer_id BIGINT,
price DECIMAL(10, 2),
created_at TIMESTAMP
);
INSERT INTO home.fct_orders VALUES
(1001, 100, 450.00, '2025-06-01 08:00:00'),
(1002, 100, 380.00, '2025-07-15 14:45:00'),
(1003, 101, 600.00, '2025-08-20 17:30:00'),
(1004, 101, 550.00, '2025-10-05 16:20:00'),
(1005, 100, 520.00, '2025-11-12 11:30:00'),
(1006, 102, 1200.00, '2025-11-15 14:30:00'),
(1007, 102, 300.00, '2025-12-20 10:00:00'),
(1008, 101, 400.00, '2025-12-25 09:45:00'),
(1009, 103, 850.00, '2026-01-10 12:00:00'),
(1010, 103, 400.00, '2026-01-14 09:30:00');
--
CREATE TABLE home.agg_customers_obt (
customer_id BIGINT,
last_order_id BIGINT,
orders BIGINT[]
);
INSERT INTO home.agg_customers_obt
SELECT
order_customer_id AS customer_id,
MAX_BY(order_id, created_at) AS last_order_id,
ARRAY_AGG(order_id)
FROM home.fct_orders
GROUP BY order_customer_id;| Count BIGINT |
|---|
| 4 |
Define entities and map external data
Every FeatureQL model starts by declaring entities and mapping table columns to features. This single CREATE FEATURES statement sets up two entities (customers and orders), their primary key inputs, and the EXTERNAL_COLUMNS() mappings that connect each table column to a named feature.
CREATE OR REPLACE FEATURES IN fm.home AS
SELECT
-- Entities
customers := ENTITY(),
orders := ENTITY(),
-- Primary keys
customer_id := INPUT(BIGINT#customers),
order_id := INPUT(BIGINT#orders),
-- Customer dimensions
tables.dim_customers := EXTERNAL_COLUMNS(
customer_id BIGINT#customers BIND TO customer_id,
name VARCHAR,
created_at TIMESTAMP
FROM TABLE(home.dim_customers)
),
customer_name := tables.dim_customers[name],
customer_created_at := tables.dim_customers[created_at],
-- Order facts
tables.fct_orders := EXTERNAL_COLUMNS(
order_id BIGINT#orders BIND TO order_id,
order_customer_id BIGINT#customers,
price DECIMAL,
created_at TIMESTAMP
FROM TABLE(home.fct_orders)
),
order_price := tables.fct_orders[price],
order_customer_id := tables.fct_orders[order_customer_id],
order_created_at := tables.fct_orders[created_at],
-- Customer facts aggregations (OBT)
tables.agg_customers_obt := EXTERNAL_COLUMNS(
customer_id BIGINT#customers BIND TO customer_id,
last_order_id BIGINT#orders,
orders ARRAY(BIGINT#orders),
FROM TABLE(home.agg_customers_obt)
),
last_order_id := tables.agg_customers_obt[last_order_id],
customer_orders := tables.agg_customers_obt[orders],
-- Keysets: Define where to find the keys for each entity.
dim_customers_keyset := KEYSET(
'all', customers,
'SELECT customer_id AS "fm.home.customer_id"
FROM home.dim_customers'
),
fct_orders_keyset := KEYSET(
'all', orders,
'SELECT order_id AS "fm.home.order_id"
FROM home.fct_orders'
),| feature_name VARCHAR | status VARCHAR | message VARCHAR |
|---|---|---|
| fm.home.customers | CREATED | Feature created as not exists |
| fm.home.orders | CREATED | Feature created as not exists |
| fm.home.customer_id | CREATED | Feature created as not exists |
| fm.home.order_id | CREATED | Feature created as not exists |
| fm.home.tables.dim_customers | CREATED | Feature created as not exists |
| fm.home.customer_name | CREATED | Feature created as not exists |
| fm.home.customer_created_at | CREATED | Feature created as not exists |
| fm.home.tables.fct_orders | CREATED | Feature created as not exists |
| fm.home.order_price | CREATED | Feature created as not exists |
| fm.home.order_customer_id | CREATED | Feature created as not exists |
| fm.home.order_created_at | CREATED | Feature created as not exists |
| fm.home.tables.agg_customers_obt | CREATED | Feature created as not exists |
| fm.home.last_order_id | CREATED | Feature created as not exists |
| fm.home.customer_orders | CREATED | Feature created as not exists |
| fm.home.dim_customers_keyset | CREATED | Feature created as not exists |
| fm.home.fct_orders_keyset | CREATED | Feature created as not exists |
Notice the BIND TO clauses — they tell FeatureQL which column is the key for each entity. The KEYSET() definitions at the bottom declare how to enumerate all keys, which lets later queries use @BIND_KEYSET() instead of listing IDs manually.
Customer lifetime value with RELATED()
RELATED() aggregates data across entity boundaries in a single expression. Here, it sums order_price grouped by order_customer_id and joins the result back to each customer — replacing the CTE + LEFT JOIN pattern you'd write in SQL.
FeatureQL
SELECT
customer_id,
customer_ltv := customer_id.RELATED(
SUM(order_price)
GROUP BY order_customer_id
),
FROM fm.home
FOR
customer_id := @BIND_KEYSET(all, customers),
order_id := @BIND_KEYSET(all, orders),
;| fm.home.customer_id BIGINT | customer_ltv VARCHAR |
|---|---|
| 100 | 1350.0 |
| 101 | 1550.0 |
| 102 | 1500.0 |
| 103 | 1250.0 |
Equivalent SQL
/* SQL */
WITH orders_agg AS (
SELECT
order_customer_id,
SUM(price) AS customer_ltv
FROM home.fct_orders
GROUP BY order_customer_id
)
SELECT
c.customer_id,
o.customer_ltv
FROM home.dim_customers c
LEFT JOIN orders_agg o
ON c.customer_id = o.order_customer_id
;| customer_id BIGINT | customer_ltv VARCHAR |
|---|---|
| 100 | 1350.0 |
| 101 | 1550.0 |
| 102 | 1500.0 |
| 103 | 1250.0 |
The SQL version requires a CTE with GROUP BY, then a LEFT JOIN. FeatureQL expresses the same logic in one line: customer_id.RELATED(SUM(order_price) GROUP BY order_customer_id).
Customer lifetime value from arrays
When data is stored in a denormalized format — arrays of foreign keys in an OBT or key-value store — FeatureQL handles it with ZIP() and EXTEND(). ZIP() unpacks a scalar array into an array of rows, EXTEND() enriches each row with data from another entity, and ARRAY_SUM() aggregates back to a scalar.
FeatureQL
WITH
customer_orders_details := EXTEND(
ZIP(customer_orders AS order_id)
WITH order_price AS order_price
VIA order_id BIND TO order_id
)
SELECT
customer_id,
customer_ltv_arr := ARRAY_SUM(customer_orders_details[order_price]),
FROM fm.home
FOR
customer_id := @BIND_KEYSET(all, customers),
;| fm.home.customer_id BIGINT | customer_ltv_arr VARCHAR |
|---|---|
| 100 | 1350.0 |
| 101 | 1550.0 |
| 102 | 1500.0 |
| 103 | 1250.0 |
Equivalent SQL
/* SQL */
WITH orders_unnested AS (
SELECT
c.customer_id,
UNNEST(c.orders) AS order_id
FROM home.agg_customers_obt c
),
orders_with_price AS (
SELECT
u.customer_id,
o.price
FROM orders_unnested u
LEFT JOIN home.fct_orders o
ON u.order_id = o.order_id
)
SELECT
customer_id,
SUM(price) AS customer_ltv_arr
FROM orders_with_price
GROUP BY customer_id
;| customer_id BIGINT | customer_ltv_arr VARCHAR |
|---|---|
| 100 | 1350.0 |
| 101 | 1550.0 |
| 102 | 1500.0 |
| 103 | 1250.0 |
The SQL version requires UNNEST, a JOIN, and re-aggregation — three separate steps for what FeatureQL chains in a single expression.
Business rules: send a promo code
Features compose naturally. Here, we persist customer_ltv so it can be reused across queries, then build a business rule: show a promo code to customers who haven't ordered in 30+ days and have LTV above 1000.
CREATE OR REPLACE FEATURES IN fm.home AS
SELECT
customer_ltv := customer_id.RELATED(
SUM(order_price)
GROUP BY order_customer_id
)
;| feature_name VARCHAR | status VARCHAR | message VARCHAR |
|---|---|---|
| fm.home.customer_ltv | CREATED | Feature created as not exists |
The RELATED() call on last_order_id follows the foreign key to look up the order's timestamp, and DATE_DIFF() computes recency. The boolean show_promocode combines both conditions.
FeatureQL
SELECT
customer_id,
customer_ltv,
recency := DATE_DIFF(
last_order_id.RELATED(order_created_at),
TIMESTAMP '2026-02-01',
'day'
),
show_promocode := recency > 30 AND customer_ltv > 1000::DECIMAL,
FROM fm.home
FOR
customer_id := @BIND_KEYSET(all, customers),
order_id := @BIND_KEYSET(all, orders),
;| fm.home.customer_id BIGINT | fm.home.customer_ltv VARCHAR | recency BIGINT | show_promocode VARCHAR |
|---|---|---|---|
| 100 | 1350.0 | 81 | true |
| 101 | 1550.0 | 38 | true |
| 102 | 1500.0 | 43 | true |
| 103 | 1250.0 | 18 | false |
Equivalent SQL
/* SQL */
WITH customer_ltv AS (
SELECT
order_customer_id,
SUM(price) AS customer_ltv
FROM home.fct_orders
GROUP BY order_customer_id
)
SELECT
c.customer_id,
ltv.customer_ltv,
DATE_DIFF('day', o.created_at, DATE '2026-02-01') AS recency,
DATE_DIFF('day', o.created_at, DATE '2026-02-01') > 30
AND ltv.customer_ltv > 1000.00 AS show_promocode
FROM home.dim_customers c
LEFT JOIN home.agg_customers_obt agg
ON c.customer_id = agg.customer_id
LEFT JOIN home.fct_orders o
ON agg.last_order_id = o.order_id
LEFT JOIN customer_ltv ltv
ON c.customer_id = ltv.order_customer_id
;| customer_id BIGINT | customer_ltv VARCHAR | recency BIGINT | show_promocode VARCHAR |
|---|---|---|---|
| 100 | 1350.0 | 81 | true |
| 101 | 1550.0 | 38 | true |
| 102 | 1500.0 | 43 | true |
| 103 | 1250.0 | 18 | false |
A/B testing with MACRO() and HASH01()
MACRO() turns any expression into a parameterized template. Here, the promo-code rule becomes a macro with a configurable recency threshold. HASH01() provides deterministic bucketing — the same customer always lands in the same experiment arm, with no external randomization service needed.
First, persist the recency feature so both experiment arms can reference it:
CREATE OR REPLACE FEATURES IN fm.home AS
SELECT
recency := DATE_DIFF(
last_order_id.RELATED(order_created_at),
TIMESTAMP '2026-02-01',
'day'
)
;| feature_name VARCHAR | status VARCHAR | message VARCHAR |
|---|---|---|
| fm.home.recency | CREATED | Feature created as not exists |
The macro show_promocode_macro accepts a threshold input and produces a boolean. It's instantiated twice — once with 30 days (control) and once with 90 days (test). HASH01() assigns each customer to an arm based on a deterministic hash of their ID.
FeatureQL
WITH
threshold := INPUT(BIGINT),
show_promocode_macro := MACRO(
recency > threshold AND customer_ltv > 1000.
USING INPUT threshold
),
show_promocode_control := show_promocode_macro(30),
show_promocode_test := show_promocode_macro(90),
expose_to_test :=
HASH01(UNSAFE_CAST(customer_id AS VARCHAR) || 'SALT')
BETWEEN.5 0e0 AND 0.5e0,
SELECT
customer_id,
show_promocode_control,
expose_to_test,
show_promocode := CASE
WHEN expose_to_test THEN show_promocode_test
ELSE show_promocode_control
END,
FROM fm.home
FOR
customer_id := @BIND_KEYSET(all, customers),
order_id := @BIND_KEYSET(all, orders),
;| fm.home.customer_id BIGINT | show_promocode_control BOOLEAN | expose_to_test BOOLEAN | show_promocode BOOLEAN |
|---|---|---|---|
| 100 | true | false | true |
| 102 | true | true | false |
| 101 | true | false | true |
| 103 | false | true | false |
Equivalent SQL
/* SQL */
WITH customer_ltv AS (
SELECT
order_customer_id,
SUM(price) AS customer_ltv
FROM home.fct_orders
GROUP BY order_customer_id
),
customer_recency AS (
SELECT
c.customer_id,
DATE_DIFF('day', o.created_at, DATE '2026-02-01') AS recency
FROM home.dim_customers c
LEFT JOIN home.agg_customers_obt agg
ON c.customer_id = agg.customer_id
LEFT JOIN home.fct_orders o
ON agg.last_order_id = o.order_id
)
SELECT
c.customer_id,
r.recency > 30
AND ltv.customer_ltv > 1000.00 AS show_promocode_control,
CAST(
CAST(
('0x' || SUBSTRING(
MD5(ARRAY_TO_STRING(
ARRAY[c.customer_id::VARCHAR, 'SALT'], ''
)), 1, 15
)) AS BIGINT
) AS DOUBLE
) / POW(2, 60) BETWEEN 0 AND 0.50 AS expose_to_test,
CASE
WHEN expose_to_test
THEN (r.recency > 90 AND ltv.customer_ltv > 1000.00)
ELSE (r.recency > 30 AND ltv.customer_ltv > 1000.00)
END AS show_promocode
FROM home.dim_customers c
LEFT JOIN customer_recency r
ON c.customer_id = r.customer_id
LEFT JOIN customer_ltv ltv
ON c.customer_id = ltv.order_customer_id
;| customer_id BIGINT | show_promocode_control BOOLEAN | expose_to_test BOOLEAN | show_promocode VARCHAR |
|---|---|---|---|
| 100 | true | false | true |
| 101 | true | false | true |
| 102 | true | true | false |
| 103 | false | true | false |
The SQL version duplicates the business logic for each arm and manually implements hash-based bucketing. The FeatureQL version defines the logic once and parameterizes the difference.
What's next
This tutorial covered the core workflow: map external data with EXTERNAL_COLUMNS(), aggregate across entities with RELATED(), work with arrays using ZIP() and EXTEND(), create reusable logic with MACRO(), and run experiments with HASH01().
For a deeper look at the e-commerce data model with more entity types and relationship patterns, see the E-commerce tutorial . For language fundamentals, start with Hello World .