# Bind keyset
FeatureQL's key selection approach varies between online and offline evaluation scenarios, requiring different strategies for defining the scope of feature computation.
## Online vs. offline evaluation patterns
**Real-time serving:**
- Precise key specification: "Calculate revenue for user_id = 1"
- Real-time feature serving with specific entity instances
- Typically involves `BIND_VALUE()` or small `BIND_VALUES()` operations
**Batch analytics:**
- Subset selection: "Calculate revenue for all orders from yesterday"
- Batch processing across entity populations
- Requires efficient key space definition and filtering
The @BIND_KEYSET() function is designed specifically for offline analytics scenarios, enabling sophisticated key selection based on business criteria while maintaining FeatureQL's entity-centric feature evaluation principles.
## How it works
@BIND_KEYSET() is a metaprogramming function (prefixed with @) that resolves at compile time. It works in two steps:
1. **Define** a keyset using the `KEYSET()` function when creating persisted features. A keyset associates an entity with a SQL query that returns the set of keys for that entity.
2. **Reference** the keyset using `@BIND_KEYSET()` in the `FOR` clause of an evaluation query. At compile time, `@BIND_KEYSET()` expands into the appropriate `BIND_SQL()` call.
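The two-step resolution can be sketched in Python. This is an illustrative model of the compile-time lookup only, not FeatureQL internals; the `Keyset` class, the `registry` list, and the `orders_src` table name are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Keyset:
    label: str    # e.g. 'ALL' or 'BY_DATE'
    entity: str   # fully qualified entity name
    sql: str      # query returning the key column(s)
    tables: list = field(default_factory=list)  # optional associated tables

# Step 1: KEYSET() registers a definition alongside its entity.
registry = [Keyset('ALL', 'FM.TUTORIALS.KEYSETS1.ORDERS',
                   'SELECT order_id FROM orders_src')]

def expand_bind_keyset(label, entity):
    """Step 2: resolve the (label, entity) pair and emit a BIND_SQL() call."""
    for ks in registry:
        if ks.label == label and ks.entity == entity:
            return f"BIND_SQL('{ks.sql}')"
    raise LookupError(f'no keyset {label!r} for entity {entity!r}')

print(expand_bind_keyset('ALL', 'FM.TUTORIALS.KEYSETS1.ORDERS'))
# BIND_SQL('SELECT order_id FROM orders_src')
```

The key point is that the expansion happens before execution: the evaluation query never sees `@BIND_KEYSET()` itself, only the resulting `BIND_SQL()`.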
## Syntax
Defining a keyset (in CREATE FEATURES):

```
<keyset_name> := KEYSET(
    '<label>',
    <entity>,
    '<SQL query>',
    ARRAY['<table>', ...]  -- optional
)
```

Using a keyset (in evaluation queries):

```
FOR
    <input> := @BIND_KEYSET(<label>, <entity> [, <param1>, ...])
```

| Parameter | Description |
|---|---|
| label | The keyset label defined in KEYSET() (e.g., ALL, BY_DATE) |
| entity | The entity reference (a local alias or a fully qualified name like FM.NAMESPACE.ENTITY) |
| param1, param2, ... | Optional parameters substituted into $1, $2, ... placeholders in the keyset SQL |
## Basic usage: single keyset
The simplest case defines a keyset with a static SQL query and uses it to bind all entity keys.
First, create the entity and its keyset:
```
CREATE OR REPLACE FEATURES IN FM.TUTORIALS.KEYSETS1 AS
SELECT
    ORDERS := ENTITY(),
    ORDER_KEYSET1 := KEYSET('ALL', ORDERS, `SELECT order_id FROM (VALUES (10), (11), (12)) AS t(order_id)`)
;
```

| feature_name VARCHAR | status VARCHAR | message VARCHAR |
|---|---|---|
| FM.TUTORIALS.KEYSETS1.ORDERS | CREATED | Feature created as not exists |
| FM.TUTORIALS.KEYSETS1.ORDER_KEYSET1 | CREATED | Feature created as not exists |
Then use @BIND_KEYSET() to evaluate features for all keys in the set:
```
WITH
    ORDERS := FM.TUTORIALS.KEYSETS1.ORDERS,
    ORDER_ID := INPUT(BIGINT#ORDERS)
SELECT
    ORDER_ID,
FOR
    ORDER_ID := @bind_keyset(ALL, ORDERS)
;
```

| ORDER_ID BIGINT |
|---|
| 10 |
| 11 |
| 12 |
The @BIND_KEYSET(ALL, ORDERS) call resolves to a BIND_SQL() that executes the SQL query defined in the ORDER_KEYSET1 keyset, returning order IDs 10, 11, and 12.
## Multiple keysets with table disambiguation
When an entity has multiple keysets with the same label, @BIND_KEYSET() automatically selects the right one based on which tables the query references. This is useful when different data sources have different key populations.
First, set up two tables with different data:
```
/* SQL */
CREATE SCHEMA IF NOT EXISTS keysets;
--
CREATE OR REPLACE TABLE keysets.table1 AS SELECT num AS key_id, 'table1_' || num::varchar AS value FROM UNNEST(ARRAY[1,2,3]) AS t(num);
CREATE OR REPLACE TABLE keysets.table2 AS SELECT num AS key_id, 'table2_' || num::varchar AS value FROM UNNEST(ARRAY[1,2,3]) AS t(num);
```

| Count BIGINT |
|---|
| 3 |
Create two keysets with the same 'SHARED' label, each associated with a different table:
```
CREATE OR REPLACE FEATURES IN FM.TUTORIALS.KEYSETS2 AS
SELECT
    KEYS := ENTITY(),
    KEYSET_TABLE1 := KEYSET('SHARED', KEYS, `SELECT 1 as key_id`, ARRAY['keysets.table1']),
    KEYSET_TABLE2 := KEYSET('SHARED', KEYS, `SELECT 2 as key_id`, ARRAY['keysets.table2'])
;
```

| feature_name VARCHAR | status VARCHAR | message VARCHAR |
|---|---|---|
| FM.TUTORIALS.KEYSETS2.KEYS | CREATED | Feature created as not exists |
| FM.TUTORIALS.KEYSETS2.KEYSET_TABLE1 | CREATED | Feature created as not exists |
| FM.TUTORIALS.KEYSETS2.KEYSET_TABLE2 | CREATED | Feature created as not exists |
When the query uses EXTERNAL_COLUMNS from table1, @BIND_KEYSET() automatically picks the keyset associated with table1:
```
WITH
    KEY_ID := INPUT(BIGINT),
SELECT
    VALUE := EXTERNAL_COLUMNS(
        key_id BIGINT BIND TO KEY_ID,
        value VARCHAR
        FROM TABLE(keysets.table1)
    )[value],
FOR
    KEY_ID := @bind_keyset(SHARED, FM.TUTORIALS.KEYSETS2.KEYS)
;
```

| VALUE VARCHAR |
|---|
| table1_1 |
And when the query references table2, the keyset for table2 is selected instead:
```
WITH
    KEY_ID := INPUT(BIGINT),
SELECT
    VALUE := EXTERNAL_COLUMNS(
        key_id BIGINT BIND TO KEY_ID,
        value VARCHAR
        FROM TABLE(keysets.table2)
    )[value],
FOR
    KEY_ID := @bind_keyset(SHARED, FM.TUTORIALS.KEYSETS2.KEYS)
;
```

| VALUE VARCHAR |
|---|
| table2_2 |
The fourth argument to KEYSET() — ARRAY['keysets.table1'] — tells the system which tables the keyset is associated with, enabling automatic disambiguation.
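The selection rule can be modeled as a simple intersection test: among keysets sharing the requested label, pick the one whose associated tables overlap the tables the query references. This Python sketch is an illustration of the documented behavior, not the actual resolver; the function and data-structure names are hypothetical.

```python
# Candidate keysets sharing the 'SHARED' label, mirroring the example above.
keysets = [
    {'label': 'SHARED', 'sql': 'SELECT 1 as key_id', 'tables': {'keysets.table1'}},
    {'label': 'SHARED', 'sql': 'SELECT 2 as key_id', 'tables': {'keysets.table2'}},
]

def pick_keyset(label, referenced_tables):
    """Select the keyset whose associated tables the query references."""
    matches = [ks for ks in keysets
               if ks['label'] == label and ks['tables'] & referenced_tables]
    if len(matches) != 1:
        raise LookupError(f'ambiguous or missing keyset for label {label!r}')
    return matches[0]['sql']

# A query reading keysets.table1 resolves to the first keyset:
print(pick_keyset('SHARED', {'keysets.table1'}))  # SELECT 1 as key_id
print(pick_keyset('SHARED', {'keysets.table2'}))  # SELECT 2 as key_id
```

Under this model, a query that references neither table (or both) could not be disambiguated, which is why associating each keyset with its tables matters when labels are shared.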
## Parameterized keysets
Keyset SQL queries can include $1, $2, ... placeholders that are filled in at query time via additional arguments to @BIND_KEYSET(). This enables dynamic key selection based on runtime parameters.
Create tables and parameterized keysets (note the $1 placeholder in the SQL):
```
/* SQL */
CREATE SCHEMA IF NOT EXISTS keysets;
--
CREATE OR REPLACE TABLE keysets.tablec1 AS SELECT num AS key_id, 'tablec1_' || num::varchar AS value FROM UNNEST(GENERATE_SERIES(1, 30)) AS t(num);
CREATE OR REPLACE TABLE keysets.tablec2 AS SELECT num AS key_id, 'tablec2_' || num::varchar AS value FROM UNNEST(GENERATE_SERIES(1, 30)) AS t(num);
```

| Count BIGINT |
|---|
| 30 |
Then create the keysets that reference those tables, using $1 as a placeholder for the parameter that will be supplied at query time:
```
CREATE OR REPLACE FEATURES IN FM.TUTORIALS.KEYSETS2 AS
SELECT
    KEYSC := ENTITY(),
    KEYSET_TABLEC1 := KEYSET('SHARED', KEYSC, 'SELECT 1+$1 as key_id', ARRAY['keysets.tablec1']),
    KEYSET_TABLEC2 := KEYSET('SHARED', KEYSC, 'SELECT 2+$1 as key_id', ARRAY['keysets.tablec2'])
;
```

| feature_name VARCHAR | status VARCHAR | message VARCHAR |
|---|---|---|
| FM.TUTORIALS.KEYSETS2.KEYSC | CREATED | Feature created as not exists |
| FM.TUTORIALS.KEYSETS2.KEYSET_TABLEC1 | CREATED | Feature created as not exists |
| FM.TUTORIALS.KEYSETS2.KEYSET_TABLEC2 | CREATED | Feature created as not exists |
Pass a computed parameter to @BIND_KEYSET() — the value is substituted into $1 in the keyset SQL:
```
CONST
    BASE := 5
WITH
    KEY_ID := INPUT(BIGINT),
SELECT
    VALUE := EXTERNAL_COLUMNS(
        key_id BIGINT BIND TO KEY_ID,
        value VARCHAR
        FROM TABLE(keysets.tablec1)
    )[value],
    PARAM_1_OF_BINDKEYSET := VALUE(@literal(BASE*2))
FOR
    KEY_ID := @bind_keyset(SHARED, FM.TUTORIALS.KEYSETS2.KEYSC, CAST(BASE*2 AS VARCHAR))
;
```

| VALUE VARCHAR | PARAM_1_OF_BINDKEYSET BIGINT |
|---|---|
| tablec1_11 | 10 |
The same pattern works with different parameter values, selecting from the appropriate table:
```
CONST
    BASE := 5
WITH
    KEY_ID := INPUT(BIGINT),
SELECT
    VALUE := EXTERNAL_COLUMNS(
        key_id BIGINT BIND TO KEY_ID,
        value VARCHAR
        FROM TABLE(keysets.tablec2)
    )[value],
    PARAM_1_OF_BINDKEYSET := VALUE(@literal(BASE*4))
FOR
    KEY_ID := @bind_keyset(SHARED, FM.TUTORIALS.KEYSETS2.KEYSC, CAST(BASE*4 AS VARCHAR))
;
```

| VALUE VARCHAR | PARAM_1_OF_BINDKEYSET BIGINT |
|---|---|
| tablec2_22 | 20 |
Parameters are passed as strings (use CAST(... AS VARCHAR) for non-string values) and substituted into the keyset SQL at compile time via CONST evaluation.
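The substitution itself is plain positional string replacement, which this Python sketch illustrates. It is a hedged model of the documented `$1`, `$2` behavior, not the FeatureQL implementation; the function name is hypothetical.

```python
import re

def substitute_params(sql, params):
    """Replace $1, $2, ... in the keyset SQL with string parameters.

    params is a list of strings; $1 maps to params[0], $2 to params[1], etc.
    """
    def repl(match):
        idx = int(match.group(1)) - 1
        return params[idx]
    return re.sub(r'\$(\d+)', repl, sql)

# BASE := 5, so CAST(BASE*2 AS VARCHAR) supplies '10' for $1:
print(substitute_params('SELECT 1+$1 as key_id', ['10']))
# SELECT 1+10 as key_id
```

This matches the observed result above: the keyset for tablec1 evaluates `1+10`, binding key 11 and returning `tablec1_11`.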
## Date-filtered keysets
A common pattern in batch analytics is filtering entity populations by date — for example, computing features for all customers who had activity on a specific day. Parameterized keysets make this straightforward.
Create a table with customer activity across multiple dates:
```
/* SQL */
CREATE SCHEMA IF NOT EXISTS keysets;
--
CREATE OR REPLACE TABLE keysets.customers_by_date AS
SELECT * FROM (VALUES
    (1, DATE '2024-01-15', 100.0),
    (2, DATE '2024-01-15', 200.0),
    (3, DATE '2024-01-15', 150.0),
    (1, DATE '2024-01-16', 110.0),
    (2, DATE '2024-01-16', 220.0),
    (3, DATE '2024-01-16', 165.0),
    (4, DATE '2024-01-16', 300.0),
    (5, DATE '2024-01-16', 250.0)
) AS t(customer_id, ds, revenue);
```

| Count BIGINT |
|---|
| 8 |
Define a keyset with a date filter using the $1 placeholder:
```
CREATE OR REPLACE FEATURES IN FM.TUTORIALS.KEYSETS3 AS
SELECT
    CUSTOMERS := ENTITY(),
    KEYSET_BY_DATE := KEYSET('BY_DATE', CUSTOMERS, `SELECT DISTINCT customer_id FROM keysets.customers_by_date WHERE ds = '$1'`, ARRAY['keysets.customers_by_date'])
;
```

| feature_name VARCHAR | status VARCHAR | message VARCHAR |
|---|---|---|
| FM.TUTORIALS.KEYSETS3.CUSTOMERS | CREATED | Feature created as not exists |
| FM.TUTORIALS.KEYSETS3.KEYSET_BY_DATE | CREATED | Feature created as not exists |
Query customers for a specific date — only customers active on that date are included:
```
CONST
    TARGET_DATE := DATE '2024-01-15'
WITH
    CUSTOMER_ID := INPUT(BIGINT),
SELECT
    CUSTOMER_ID,
    REVENUE := EXTERNAL_COLUMNS(
        customer_id BIGINT BIND TO CUSTOMER_ID,
        ds DATE BIND TO @literal(TARGET_DATE),
        revenue DOUBLE
        FROM TABLE(keysets.customers_by_date)
    )[revenue],
FOR
    CUSTOMER_ID := @bind_keyset(BY_DATE, FM.TUTORIALS.KEYSETS3.CUSTOMERS, TARGET_DATE)
;
```

| CUSTOMER_ID BIGINT | REVENUE DOUBLE |
|---|---|
| 1 | 100.0 |
| 2 | 200.0 |
| 3 | 150.0 |
Change the date parameter to get a different customer population. Here, SECOND_DATE is computed from TARGET_DATE using DATE_ADD():
```
CONST
    TARGET_DATE := DATE '2024-01-15',
    SECOND_DATE := DATE_ADD(TARGET_DATE::TIMESTAMP, 'DAY', 1)
WITH
    CUSTOMER_ID := INPUT(BIGINT),
SELECT
    CUSTOMER_ID,
    REVENUE := EXTERNAL_COLUMNS(
        customer_id BIGINT BIND TO CUSTOMER_ID,
        ds DATE BIND TO @literal(SECOND_DATE),
        revenue DOUBLE
        FROM TABLE(keysets.customers_by_date)
    )[revenue],
FOR
    CUSTOMER_ID := @bind_keyset(BY_DATE, FM.TUTORIALS.KEYSETS3.CUSTOMERS, SECOND_DATE)
;
```

| CUSTOMER_ID BIGINT | REVENUE DOUBLE |
|---|---|
| 1 | 110.0 |
| 2 | 220.0 |
| 3 | 165.0 |
| 4 | 300.0 |
| 5 | 250.0 |
Notice how the BIND TO @literal(TARGET_DATE) in EXTERNAL_COLUMNS and the parameter in @BIND_KEYSET() work together: the keyset selects which customers to evaluate, while the bound literal filters the data source to the correct date partition.
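The keyset half of this pattern behaves like a parameterized DISTINCT query over the activity table. A toy Python stand-in for the BY_DATE keyset SQL, using the same rows created above (illustrative only; `keys_for_date` is not a FeatureQL function):

```python
from datetime import date

# The keysets.customers_by_date rows, as (customer_id, ds) pairs.
rows = [
    (1, date(2024, 1, 15)), (2, date(2024, 1, 15)), (3, date(2024, 1, 15)),
    (1, date(2024, 1, 16)), (2, date(2024, 1, 16)), (3, date(2024, 1, 16)),
    (4, date(2024, 1, 16)), (5, date(2024, 1, 16)),
]

def keys_for_date(ds):
    """SELECT DISTINCT customer_id ... WHERE ds = $1, as plain Python."""
    return sorted({cid for cid, d in rows if d == ds})

print(keys_for_date(date(2024, 1, 15)))  # [1, 2, 3]
print(keys_for_date(date(2024, 1, 16)))  # [1, 2, 3, 4, 5]
```

The two result sets correspond exactly to the customer populations evaluated in the two queries above.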
## Best practices
- **Use descriptive keyset labels:** `'ALL'`, `'BY_DATE'`, `'ACTIVE'` all document the intent of each keyset
- **Keep keyset SQL simple:** the SQL query should only return key columns; complex filtering logic belongs in features, not in keyset definitions
- **Leverage table disambiguation:** when multiple keysets share a label, associate each with its relevant tables so `@BIND_KEYSET()` can auto-select the correct one
- **Parameterize for flexibility:** use `$1`, `$2` placeholders for date ranges, cohort filters, or any dynamic criteria rather than creating separate keysets for each variation
- **Combine with `CONST`:** define parameters once and reference them in both `@BIND_KEYSET()` and `BIND TO @literal()` for consistency