FeatureQL for the Impatient

This guide helps SQL practitioners understand FeatureQL by comparing it with familiar SQL concepts.

Core concepts

Features are functions

In SQL, we create one table from other tables.

In FeatureQL, we define features as formulas based on other features.

Unlike tables, when we persist features with CREATE FEATURES AS, no data is evaluated. Only their definition is stored in the feature registry, similar to a view or function in SQL.

Think of features as database functions with automatic parameter detection. When we reference other features, they automatically become inputs to our formula.

Features are pure functions: their output depends only on their inputs. The only exception is the EXTERNAL functions, which map external data to features.

We can evaluate features by binding values to their inputs using the BIND functions.

Simple example step by step

Let's create a customer segment.

First, the minimal setup:

Now let's create some mockup external data:

@fql-playground(impatient_create_table_customer1)

We can now map features to existing table columns by using the EXTERNAL_COLUMNS function:

Then finally add our business logic:

Persisted and local definitions

Persisted features (stored in the registry):

  • Defined with the CREATE FEATURES AS statement
  • Available everywhere in the organization
  • Namespaced (e.g., fm.finance.revenue, fm.marketing.campaign_roi)
  • Have granular permissions

Local features (defined in the WITH clause):

  • Defined in the query scope
  • Override persisted features if same name
  • Useful for prototyping

In practice, core features like CUSTOMER_ID, CUSTOMER_LAST_LOGIN_DATE would have been persisted in the registry using this query:

End-users would then just use the features in their queries like this:

Presentation layer and hybrid queries for analytics

Feature definitions are typically used to create fine-grained datasets for analytics.

But the ultimate goal is to present data to the end-user, often filtered or at a coarse level of aggregation.

This is typically done in a presentation layer.

For maximum flexibility, you can also mix SQL and FeatureQL to integrate with existing SQL based clients: the SQL part of the query is generated by the visualisation tool itself.

@fql-playground(impatient_7_hybrid)

Structural patterns

All JOIN and GROUP BY operations in feature definitions are described using the RELATED() function.

To illustrate the main structural patterns, let's create more mockup data for our ecommerce dataset.

A second table containing customer information:

@fql-playground(impatient_create_table_customer2)

A table of orders with a foreign key to the customers table.

@fql-playground(impatient_create_table_orders)

Join PK on PK

Informations related to the customer can be queried without join:

@fql-playground(impatient_8_join_pk_on_pk)

Join FK on PK

Informations related to the order can be queried by joining on the customer ID.

@fql-playground(impatient_9_join_fk_on_pk)

Join PK on FK

If we want to get the revenue for each customer, we can group orders by customer ID and sum the revenue, then join the customer information on this table.

@fql-playground(impatient_10_join_pk_on_fk)

Join FK on FK

Let's imagine we want to compute the percentage that represents each order in the total revenue for each customer. We need to join the orders table on the customer ID and then join the customer information on this table.

@fql-playground(impatient_11_join_fk_on_fk)

Array of rows operations

FeatureQL promotes ECM (Entity Centric Modeling) that stores facts as arrays or arrays of rows within each dimension table (also known as One Big Table modeling).

FeatureQL has then strong support for ARRAY and ARRAY of ROWs that are the core data structures of ECM.

OperationSyntaxDescriptionScope
Access rows<array_of_rows>[1] or <array_of_rows>[1:3]Extract one or a range of rows as an array of rowsrows
Access fields<array_of_rows>[amount]Extract fields from an array of rows as a array of values (one field) or an array of rows (many fields)fields
UnwrapUNWRAP(<array_of_rows>)Unwraps an array of one field rows as a array of values (convenient when the name of the field is not named)fields
ConcatenateARRAY_CONCAT(<array_of_rows1>, <array_of_rows2> (DEDUPLICATED ON <field1>, <field2>))Concatenates the rows of the two arrays of rows, possibly deduplicated on one or many fields valuesrows
ZipZIP(<array_of_rows1>, <array_of_rows2>)Zips one or many arrays into an array of rowsfields
Merge (by positions)ARRAY_MERGE(<array_of_rows1>, <array_of_rows2>)Combines two or more arrays of rowsfields
Merge (by values)ARRAY_MERGE(<array_of_rows1>, <array_of_rows2> (JOINED ON <field1>, <field2>))Merge the rows of two arrays of rows by joining on a fieldfields and rows
ExtendEXTEND(<array_of_rows> WITH <feature>)Adds features as fields to an array of rows by joining on an existing fieldfields
TransformTRANSFORM(<array_of_rows>, <subquery>)Applies transformations to an array of rows (map, filter, aggregate)fields and rows

Example

Let's imagine we want to compute the top 3 categories of products bought by each customer in the last 84 days.

We imagine we have a OBT table that contains the list of orders for each customer and their date of purchase.

@fql-playground(impatient_create_customer_obt)

We can use TRANSFORM() to filter the 84 days window and then use EXTEND() to add the product categories to the array of rows. Then use TRANSFORM() again to group by order category and count the number of orders.

@fql-playground(impatient_12_transform_customer_obt)

Developer tools

To be efficient in a collaborative environment, FeatureQL provides a set of developer tools to help you write and test your features.

Metaprogramming

@echo and @literal compute values at compilation time.

This is useful for computed literal parameters without having to use external ways of generating text such as jinja templates.

QUERY := @literal(`SELECT * FROM dim_customers WHERE date = '` || CURRENT_DATE() || `'`),
sql

Testing and reusability

  • VARIANT(): Create feature variations (e.g., test different thresholds, swap sources, etc.)
  • MACRO(): Define reusable functions in FeatureQL
  • UDF(): Define reusable functions in the language of the backend database

Business functions

FeatureQL provides specialized functions for analytics and machine learning in enterprise environments.

Historical

  • SCD_AT_TIME(): Get the state of a slowly changing dimension (SCD type 2) at a certain point in time
  • ACCUMULATION_AT_TIME(): Get the state of an accumulation of facts at a certain point in time

Experimentation

  • HASH01(): Generate a deterministic value between 0 and 1 based on the input string.
  • GRADUAL_ROLLOUT(): Enable controlled, time based feature deployment with deterministic assignment. Unlike simple percentage rollouts, it smoothly transitions from 0% to 100% coverage over a defined time window.

Growth accounting

  • TO_ACTIVITY_BITS(): Converts an array of dates to a dense 60 days bitwise representation called activity bits.
  • FROM_ACTIVITY_BITS(): Converts the activity bits back to an array of dates.
  • IS_ACTIVE(): Checks if a user was active within a time window using the activity bits.
  • COUNT_ACTIVE(): Counts the number of active days within a time window using the activity bits.
  • RECENCY(): Returns days since last activity (0 = active today) using the activity bits.
  • ACTIVITY_STATUS(): Classifies users into lifecycle stages by comparing activity across time periods using the activity bits.

Simulation

  • SAMPLE_VALUE(): Samples from a discrete distribution with custom probabilities.
  • SAMPLE_UNIFORM(): Samples from a uniform distribution between min and max values.
  • SAMPLE_GAUSSIAN(): Samples from a normal distribution between two bounds.
  • PROBABILITY_TO_EVENT(): Returns the discrete event (true or false) with the given probability based on the given hashing key.

Discovery tools

Discovery tools are used to explore metadata about the features and the system.

  • SHOW FEATURES: Browse the catalog of features
  • SHOW CREATE FEATURES <name>: See the definition of a feature
  • SHOW FUNCTION <function>: Get the documentation of a function
  • SHOW DOCS: Browse the documentation
Last update at: 2025/12/05 16:03:14
Last updated: 2025-12-05 16:07:55