FeatureQL homepage tutorial

This tutorial will guide you through the basics of FeatureQL by building customer analytics features for an e-commerce dataset.

Introduction

FeatureQL is a declarative language for defining and computing features. It combines the familiarity of SQL with powerful abstractions for working with entities, relationships, and aggregations.

In this tutorial, you will learn how to:

Define entities and map them to external tables
Compute aggregations across relationships with RELATED()
Work with arrays using ZIP() and EXTEND()
Create parameterized features with MACRO()
Run controlled experiments with deterministic hashing

Create data

First, let's create a simple e-commerce schema with customers, orders, and a pre-aggregated "One Big Table" (OBT) for customer order summaries.

We have three tables:

home.dim_customers: customer dimension with id, name, and creation date
home.fct_orders: order facts with price and timestamps
home.agg_customers_obt: pre-aggregated table with each customer's last order and array of all order ids

Create features

Now let's define our feature namespace and map the external tables to FeatureQL entities.

This creates:

Two entities: customers and orders
Primary key inputs for each entity
External column mappings that bind table columns to features
Keysets that define how to enumerate all keys for each entity

Example 1: customer LTV

Let's compute customer lifetime value (total spend) using the RELATED() function, which performs a GROUP BY and JOIN in one step.

FeatureQL

The expression customer_id.RELATED(SUM(order_price) GROUP BY order_customer_id) aggregates order prices by customer and joins the result back to each customer.

Equivalent SQL

In SQL, this requires a CTE with GROUP BY, then a LEFT JOIN. FeatureQL expresses the same logic more concisely.

Example 2: customer LTV from array

Sometimes your data is stored in a denormalized format with arrays of foreign keys (common in OBT or key-value stores). FeatureQL handles this with ZIP() and EXTEND().

FeatureQL

ZIP(customer_orders AS order_id) unpacks the array into rows
EXTEND(...) WITH order_price joins each order_id to get its price
ARRAY_SUM() aggregates the prices back to a scalar

Equivalent SQL

The SQL version requires UNNEST, a JOIN, and re-aggregation.

Example 3: send promocode

Let's build a business rule: show a promocode to customers who haven't ordered in 30+ days and have LTV > 1000.

First, we persist customer_ltv as a reusable feature:

FeatureQL

We use RELATED() to look up the last order's timestamp through the last_order_id foreign key, then compute recency and apply the business rule.

Equivalent SQL

Example 4: send promocode with experiment

Now let's add A/B testing. We want to test whether a shorter recency threshold (90 days vs 30 days) affects conversion.

First, persist the recency feature:

FeatureQL

Key concepts:

INPUT(BIGINT) declares a parameter type
MACRO(... USING INPUT threshold) creates a parameterized expression
HASH01() provides deterministic bucketing for experiment assignment
The same macro is instantiated with different thresholds for control/test

Equivalent SQL

The SQL version requires duplicating the business logic and manually implementing the hash-based bucketing.

Summary

In this tutorial you learned how to:

Map external tables to FeatureQL entities with EXTERNAL_COLUMNS()
Compute aggregations across relationships with RELATED()
Work with arrays using ZIP() and EXTEND()
Create reusable parameterized features with MACRO()
Implement deterministic experiment assignment with HASH01()

FeatureQL lets you express complex feature logic concisely while maintaining clear semantics and composability.

Suggest changes to this page

Last update at: 2026/02/02 15:16:47

On this page

Introduction Create data Create features Example 1: customer LTV FeatureQL Equivalent SQL Example 2: customer LTV from array FeatureQL Equivalent SQL Example 3: send promocode FeatureQL Equivalent SQL Example 4: send promocode with experiment FeatureQL Equivalent SQL Summary