FeatureQL homepage tutorial

This tutorial will guide you through the basics of FeatureQL by building customer analytics features for an e-commerce dataset.

Introduction

FeatureQL is a declarative language for defining and computing features. It combines the familiarity of SQL with powerful abstractions for working with entities, relationships, and aggregations.

In this tutorial, you will learn how to:

  1. Define entities and map them to external tables
  2. Compute aggregations across relationships with RELATED()
  3. Work with arrays using ZIP() and EXTEND()
  4. Create parameterized features with MACRO()
  5. Run controlled experiments with deterministic hashing

Create data

First, let's create a simple e-commerce schema with customers, orders, and a pre-aggregated "One Big Table" (OBT) for customer order summaries.

We have three tables:

  • home.dim_customers: customer dimension with id, name, and creation date
  • home.fct_orders: order facts with price and timestamps
  • home.agg_customers_obt: pre-aggregated table with each customer's last order and array of all order ids

Create features

Now let's define our feature namespace and map the external tables to FeatureQL entities.

This creates:

  • Two entities: customers and orders
  • Primary key inputs for each entity
  • External column mappings that bind table columns to features
  • Keysets that define how to enumerate all keys for each entity

Example 1: customer LTV

Let's compute customer lifetime value (total spend) using the RELATED() function, which performs a GROUP BY and JOIN in one step.

FeatureQL

The expression customer_id.RELATED(SUM(order_price) GROUP BY order_customer_id) aggregates order prices by customer and joins the result back to each customer.

Equivalent SQL

In SQL, this requires a CTE with GROUP BY, then a LEFT JOIN. FeatureQL expresses the same logic more concisely.

Example 2: customer LTV from array

Sometimes your data is stored in a denormalized format with arrays of foreign keys (common in OBT or key-value stores). FeatureQL handles this with ZIP() and EXTEND().

FeatureQL

  • ZIP(customer_orders AS order_id) unpacks the array into rows
  • EXTEND(...) WITH order_price joins each order_id to get its price
  • ARRAY_SUM() aggregates the prices back to a scalar

Equivalent SQL

The SQL version requires UNNEST, a JOIN, and re-aggregation.

Example 3: send promocode

Let's build a business rule: show a promocode to customers who haven't ordered in 30+ days and have LTV > 1000.

First, we persist customer_ltv as a reusable feature:

FeatureQL

We use RELATED() to look up the last order's timestamp through the last_order_id foreign key, then compute recency and apply the business rule.

Equivalent SQL

Example 4: send promocode with experiment

Now let's add A/B testing. We want to test whether a shorter recency threshold (90 days vs 30 days) affects conversion.

First, persist the recency feature:

FeatureQL

Key concepts:

  • INPUT(BIGINT) declares a parameter type
  • MACRO(... USING INPUT threshold) creates a parameterized expression
  • HASH01() provides deterministic bucketing for experiment assignment
  • The same macro is instantiated with different thresholds for control/test

Equivalent SQL

The SQL version requires duplicating the business logic and manually implementing the hash-based bucketing.

Summary

In this tutorial you learned how to:

  • Map external tables to FeatureQL entities with EXTERNAL_COLUMNS()
  • Compute aggregations across relationships with RELATED()
  • Work with arrays using ZIP() and EXTEND()
  • Create reusable parameterized features with MACRO()
  • Implement deterministic experiment assignment with HASH01()

FeatureQL lets you express complex feature logic concisely while maintaining clear semantics and composability.

Last update at: 2026/02/02 15:16:47
Last updated: 2026-02-02 15:17:27