LLM reference — common recipes
Ready-to-use patterns for common FeatureQL tasks. Load this page when you need to solve a specific problem and want a working template.
Feature engineering vs analytics
FeatureQL serves two modes. The query structure is the same; the difference is intent and how you bind inputs.
Feature engineering: describe a known entity
You compute features for specific entity instances (e.g., "customer 42's total spend"). Keys come from a known set of entity IDs.
WITH
CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
-- ... entity definitions, data sources, derived features ...
TOTAL_SPEND := ENRICHED_ORDERS.TRANSFORM(`SELECT SUM(amount)`).UNWRAP_ONE(),
SELECT
CUSTOMER_ID,
TOTAL_SPEND
FOR
CUSTOMER_ID := BIND_VALUES(ARRAY[42, 43, 44]);
-- or: CUSTOMER_ID := BIND_SQL(SELECT customer_id FROM active_customers); Feature engineering queries can run online (realtime serving) or offline (batch).
Analytics: explore and aggregate across entities
You compute metrics across a population (e.g., "revenue by country"). Keys are the full dataset; you filter and aggregate in the query.
WITH
CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
-- ... same definitions ...
TOTAL_SPEND := ENRICHED_ORDERS.TRANSFORM(`SELECT SUM(amount)`).UNWRAP_ONE(),
SELECT
CUSTOMER_ID,
CUSTOMER_COUNTRY,
TOTAL_SPEND
FOR
CUSTOMER_ID := BIND_SQL(SELECT customer_id FROM dim_customers)
WHERE TOTAL_SPEND > 1000e0
ORDER BY TOTAL_SPEND DESC
LIMIT 100; Analytics queries are always offline (batch).
Aggregation recipes
Sum, count, average on an array
TOTAL := ORDERS.TRANSFORM(`SELECT SUM(amount)`).UNWRAP_ONE(),
COUNT := ORDERS.TRANSFORM(`SELECT COUNT(*)`).UNWRAP_ONE(),
AVG_AMOUNT := ORDERS.TRANSFORM(`SELECT AVG(amount)`).UNWRAP_ONE(), Aggregation with filter
JULY_REVENUE := ORDERS.TRANSFORM(`
SELECT SUM(amount) WHERE order_date >= DATE '2017-07-01' AND order_date < DATE '2017-08-01'
`).UNWRAP_ONE(), Group by within an array
REVENUE_BY_MONTH := ORDERS.TRANSFORM(`
WITH month := EXTRACT(MONTH FROM order_date)
SELECT month, SUM(amount) GROUP BY month AS total
ORDER BY month ASC
`),
-- Result type: ARRAY(ROW(month BIGINT, total DOUBLE)) Aggregation across entities with RELATED
-- Count orders per customer (no array needed)
ORDER_COUNT := CUSTOMER_ID.RELATED(COUNT(1) GROUP BY FCT_ORDERS[customer_id]),
-- Sum with filter
JULY_REVENUE := CUSTOMER_ID.RELATED(
SUM(ORDER_AMOUNT) FILTER (WHERE IS_JULY)
GROUP BY FCT_ORDERS[customer_id]
), Filtering recipes
Filter an array of rows
HIGH_VALUE := ORDERS.TRANSFORM(`SELECT * WHERE amount > 100e0`),
RECENT := ORDERS.TRANSFORM(`SELECT * WHERE order_date >= DATE '2024-01-01'`), Top N / Bottom N
TOP_3 := ORDERS.TRANSFORM(`SELECT * ORDER BY amount DESC LIMIT 3`),
BOTTOM_3 := ORDERS.TRANSFORM(`SELECT * ORDER BY amount ASC LIMIT 3`), Boolean existence check
HAS_HIGH_VALUE := ORDERS.TRANSFORM(`SELECT BOOL_OR(amount > 100e0)`).UNWRAP_ONE(),
ALL_SHIPPED := ORDERS.TRANSFORM(`SELECT BOOL_AND(status = 'shipped')`).UNWRAP_ONE(), Distinct values
UNIQUE_CATEGORIES := ORDERS.TRANSFORM(`SELECT DISTINCT category`),
-- Result: ARRAY(ROW(category VARCHAR))
-- As a simple array:
UNIQUE_CATEGORIES_FLAT := ORDERS.TRANSFORM(`SELECT DISTINCT category`).UNWRAP(),
-- Result: ARRAY(VARCHAR) Enrichment recipes
Enrich scalar FK array with entity data
-- customer_order_ids is ARRAY(BIGINT#ORDERS)
ENRICHED := EXTEND(
ZIP(CUSTOMER_ORDER_IDS AS order_id)
WITH ORDER_DATE, ORDER_AMOUNT
VIA order_id BOUND TO ORDER_ID
), Enrich existing array of rows
-- order_items is ARRAY(ROW(product_id BIGINT#PRODUCTS, quantity BIGINT))
ENRICHED_ITEMS := EXTEND(
ORDER_ITEMS
WITH PRODUCT_NAME, PRODUCT_PRICE
VIA product_id BOUND TO PRODUCT_ID
), Multi-hop enrichment (Customer → Orders → Items → Products)
-- Step 1: Enrich customer's orders with order-level data
CUSTOMER_ORDERS := EXTEND(
ZIP(CUSTOMER_ORDER_IDS AS order_id)
WITH ORDER_DATE, ORDER_ITEMS
VIA order_id BOUND TO ORDER_ID
),
-- Step 2: For each order, enrich items with product data
-- This requires a second EXTEND at the order level, not inside TRANSFORM
ORDER_ITEMS_ENRICHED := EXTEND(
ORDER_ITEMS
WITH PRODUCT_NAME, PRODUCT_PRICE
VIA product_id BOUND TO PRODUCT_ID
),
-- Step 3: Combine in the customer's order enrichment
CUSTOMER_ORDERS_FULL := EXTEND(
ZIP(CUSTOMER_ORDER_IDS AS order_id)
WITH ORDER_DATE, ORDER_ITEMS_ENRICHED AS items
VIA order_id BOUND TO ORDER_ID
), Array manipulation recipes
Extract field from array of rows as simple array
-- array_of_rows[field_name] extracts that field as an array
ORDER_AMOUNTS := ENRICHED_ORDERS[amount], -- ARRAY(DOUBLE)
ORDER_DATES := ENRICHED_ORDERS[order_date], -- ARRAY(DATE) Combine arrays into array of rows
COMBINED := ZIP(
ARRAY[1, 2, 3] AS id,
ARRAY['a', 'b', 'c'] AS letter,
ARRAY[10e0, 20e0, 30e0] AS value
),
-- Result: ARRAY(ROW(id BIGINT, letter VARCHAR, value DOUBLE)) Merge arrays by join key
MERGED := ARRAY_MERGE(orders_array, returns_array JOINED ON order_id), Concatenate arrays with deduplication
ALL_ITEMS := ARRAY_CONCAT(order1_items, order2_items DEDUPLICATED ON product_id), Array aggregation functions (no TRANSFORM needed)
TOTAL := ARRAY_SUM(amounts),
MAX_VAL := ARRAY_MAX(amounts),
COUNT := ARRAY_LENGTH(amounts),
SORTED := ARRAY_SORT(amounts),
UNIQUE := ARRAY_DISTINCT(categories),
HAS_ITEM := CONTAINS(product_ids, 42), Date and time recipes
-- Extract parts
YEAR := EXTRACT(YEAR FROM order_date),
MONTH := EXTRACT(MONTH FROM order_date),
-- Truncate
MONTH_START := DATE_TRUNC('MONTH', order_date),
-- Arithmetic
NEXT_WEEK := DATE_ADD(order_date, 'DAY', 7),
DAYS_SINCE := DATE_DIFF('DAY', order_date, @literal(CURRENT_DATE())),
-- Format
FORMATTED := DATE_FORMAT(order_date, '%Y-%m-%d'),
-- Parse
PARSED := DATE_PARSE('2024-01-15', '%Y-%m-%d'), String recipes
FULL_NAME := first_name || ' ' || last_name,
CLEAN := ' hello '.TRIM().UPPER(),
PARTS := SPLIT('a,b,c', ','), -- ARRAY['a', 'b', 'c']
FIRST := SPLIT_PART('a.b.c', '.', 1), -- 'a'
MATCHES := name LIKE 'Jo%',
PADDED := LPAD(id::VARCHAR, 5, '0'), -- '00042' Conditional recipes
-- Simple ternary
LABEL := IF(amount > 100e0, 'high', 'low'),
-- Multi-branch
TIER := CASE WHEN(
ARRAY[amount > 1000e0, amount > 100e0, amount > 10e0],
ARRAY['platinum', 'gold', 'silver'],
'bronze'
),
-- Value matching
STATUS_LABEL := CASE WHEN VALUE(
status_code,
ARRAY[1, 2, 3],
ARRAY['active', 'pending', 'closed'],
'unknown'
),
-- First non-null
DISPLAY_NAME := COALESCE(preferred_name, full_name, email), Window function recipes
Window functions work at the top-level SELECT (not inside TRANSFORM):
SELECT
CUSTOMER_ID,
TOTAL_SPEND,
RANK := RANK() OVER (ORDER BY TOTAL_SPEND DESC),
RUNNING_TOTAL := SUM(TOTAL_SPEND) OVER (ORDER BY CUSTOMER_ID ASC),
PERCENTILE := NTILE(4) OVER (ORDER BY TOTAL_SPEND DESC), GROUP BY at top level (analytics)
In FeatureQL, GROUP BY is attached to each aggregate feature, not a separate clause:
SELECT
CUSTOMER_COUNTRY,
TOTAL_REVENUE := SUM(REVENUE) GROUP BY CUSTOMER_COUNTRY,
CUSTOMER_COUNT := COUNT(1) GROUP BY CUSTOMER_COUNTRY,
AVG_REVENUE := AVG(REVENUE) GROUP BY CUSTOMER_COUNTRY,
FOR
CUSTOMER_ID := BIND_SQL(SELECT customer_id FROM dim_customers); Hybrid queries (FeatureQL inside SQL)
When you need SQL capabilities that FeatureQL doesn't have (e.g., complex window functions, UNION), wrap FeatureQL in SQL:
/* SQL */
SELECT * FROM (
/* FEATUREQL */
WITH
ID := INPUT(BIGINT),
SELECT
ID,
VALUE := ID * 2
FOR
ID := BIND_VALUES(ARRAY[1, 2, 3])
)
WHERE VALUE > 2; Function catalog quick reference
When you need a function, check SHOW FUNCTION SIGNATURES WHERE NAME = 'name' first. Here are the most-used categories:
| Category | Key functions |
|---|---|
| Core | ENTITY(), INPUT(), BIND_VALUE(), BIND_VALUES(), BIND_SQL(), BIND_COLUMNS() |
| Data sources | EXTERNAL_COLUMNS(), INLINE_COLUMNS(), EXTERNAL_SQL() |
| Relationships | RELATED(), EXTEND(), TRANSFORM() |
| Array of rows | ZIP(), UNWRAP(), UNWRAP_ONE(), ARRAY_MERGE(), ARRAY_CONCAT() |
| Array | ARRAY_LENGTH(), ARRAY_SUM(), CONTAINS(), ARRAY_SORT(), FLATTEN() |
| Aggregation | SUM(), COUNT(), AVG(), MIN(), MAX(), ARRAY_AGG(), COUNT_DISTINCT() |
| Conditional | IF(), CASE WHEN(), COALESCE() |
| String | CONCAT(), LOWER(), UPPER(), TRIM(), SPLIT(), REPLACE(), SUBSTR() |
| Date | EXTRACT(), DATE_ADD(), DATE_DIFF(), DATE_TRUNC(), DATE_FORMAT() |
| Math | ROUND(), ABS(), FLOOR(), CEIL(), SQRT(), POW() |
| JSON | JSON_PARSE(), JSON_EXTRACT(), JSON_EXTRACT_SCALAR() |
| Indexes | INDEX_UNIQUE(), INDEX_MULTI(), ELEMENT_AT_KEY() |
| Metaprogramming | @literal(), @echo(), EVAL_CONST(), CONST |
| Business | HASH01(), GRADUAL_ROLLOUT(), SCD_AT_TIME() |