Bind keyset

FeatureQL's key selection approach varies between online and offline evaluation scenarios, requiring different strategies for defining the scope of feature computation.

Online vs. Offline evaluation patterns

Real-time serving:

  • Precise key specification: "Calculate revenue for user_id = 1"
  • Real-time feature serving with specific entity instances
  • Typically involves BIND_VALUE() or small BIND_VALUES() operations

Batch analytics:

  • Subset selection: "Calculate revenue for all orders from yesterday"
  • Batch processing across entity populations
  • Requires efficient key space definition and filtering

The @BIND_KEYSET() function is designed specifically for offline analytics scenarios, enabling sophisticated key selection based on business criteria while maintaining FeatureQL's entity-centric feature evaluation principles.

How it works

@BIND_KEYSET() is a metaprogramming function (prefixed with @) that resolves at compile time. It works in two steps:

  1. Define a keyset using the KEYSET() function when creating persisted features. A keyset associates an entity with a SQL query that returns the set of keys for that entity.
  2. Reference the keyset using @BIND_KEYSET() in the FOR clause of an evaluation query. At compile time, @BIND_KEYSET() expands into the appropriate BIND_SQL() call.

Syntax

Defining a keyset (in CREATE FEATURES):

<keyset_name> := KEYSET(
    '<label>',
    <entity>,
    '<SQL query>',
    ARRAY['<table>', ...]    -- optional
)

Using a keyset (in evaluation queries):

FOR
    <input> := @BIND_KEYSET(<label>, <entity> [, <param1>, ...])
ParameterDescription
labelThe keyset label defined in KEYSET() (e.g., ALL, BY_DATE)
entityThe entity reference (can be a local alias or a fully qualified name like FM.NAMESPACE.ENTITY)
param1, param2, ...Optional parameters substituted into $1, $2, ... placeholders in the keyset SQL

Basic usage: single keyset

The simplest case defines a keyset with a static SQL query and uses it to bind all entity keys.

First, create the entity and its keyset:

FeatureQL
CREATE OR REPLACE FEATURES IN FM.TUTORIALS.KEYSETS1 AS
SELECT
    ORDERS := ENTITY(),
    ORDER_KEYSET1 := KEYSET('ALL', ORDERS, `SELECT order_id FROM (VALUES (10), (11), (12)) AS t(order_id)`)
;
Result
feature_name VARCHARstatus VARCHARmessage VARCHAR
FM.TUTORIALS.KEYSETS1.ORDERSCREATEDFeature created as not exists
FM.TUTORIALS.KEYSETS1.ORDER_KEYSET1CREATEDFeature created as not exists

Then use @BIND_KEYSET() to evaluate features for all keys in the set:

FeatureQL
WITH
    ORDERS := FM.TUTORIALS.KEYSETS1.ORDERS,
    ORDER_ID := INPUT(BIGINT#ORDERS)
SELECT
    ORDER_ID,
FOR
    ORDER_ID := @bind_keyset(ALL, ORDERS)
;
Result
ORDER_ID BIGINT
10
11
12

The @BIND_KEYSET(ALL, ORDERS) call resolves to a BIND_SQL() that executes the SQL query defined in the ORDER_KEYSET1 keyset, returning order IDs 10, 11, and 12.

Multiple keysets with table disambiguation

When an entity has multiple keysets with the same label, @BIND_KEYSET() automatically selects the right one based on which tables the query references. This is useful when different data sources have different key populations.

First, set up two tables with different data:

FeatureQL
/* SQL */
CREATE SCHEMA IF NOT exists keysets;
--
CREATE OR REPLACE TABLE keysets.table1 AS SELECT num as key_id, 'table1_' || num::varchar as value FROM UNNEST(ARRAY[1,2,3]) AS t(num);
CREATE OR REPLACE TABLE keysets.table2 AS SELECT num as key_id, 'table2_' || num::varchar as value FROM UNNEST(ARRAY[1,2,3]) AS t(num);
Result
Count BIGINT
3

Create two keysets with the same 'SHARED' label, each associated with a different table:

FeatureQL
CREATE OR REPLACE FEATURES IN FM.TUTORIALS.KEYSETS2 AS
SELECT
    KEYS := ENTITY(),
    KEYSET_TABLE1 := KEYSET('SHARED', KEYS, `SELECT 1 as key_id`, ARRAY['keysets.table1']),
    KEYSET_TABLE2 := KEYSET('SHARED', KEYS, `SELECT 2 as key_id`, ARRAY['keysets.table2']),
;
Result
feature_name VARCHARstatus VARCHARmessage VARCHAR
FM.TUTORIALS.KEYSETS2.KEYSCREATEDFeature created as not exists
FM.TUTORIALS.KEYSETS2.KEYSET_TABLE1CREATEDFeature created as not exists
FM.TUTORIALS.KEYSETS2.KEYSET_TABLE2CREATEDFeature created as not exists

When the query uses EXTERNAL_COLUMNS from table1, @BIND_KEYSET() automatically picks the keyset associated with table1:

FeatureQL
WITH
    KEY_ID := INPUT(BIGINT),
SELECT
    VALUE := EXTERNAL_COLUMNS(
        key_id BIGINT BIND TO KEY_ID,
        value VARCHAR
        FROM TABLE(keysets.table1)
    )[value],
FOR
    KEY_ID := @bind_keyset(SHARED, FM.TUTORIALS.KEYSETS2.KEYS)
;
Result
VALUE VARCHAR
table1_1

And when the query references table2, the keyset for table2 is selected instead:

FeatureQL
WITH
    KEY_ID := INPUT(BIGINT),
SELECT
    VALUE := EXTERNAL_COLUMNS(
        key_id BIGINT BIND TO KEY_ID,
        value VARCHAR
        FROM TABLE(keysets.table2)
    )[value],
FOR
    KEY_ID := @bind_keyset(SHARED, FM.TUTORIALS.KEYSETS2.KEYS)
;
Result
VALUE VARCHAR
table2_2

The fourth argument to KEYSET()ARRAY['keysets.table1'] — tells the system which tables the keyset is associated with, enabling automatic disambiguation.

Parameterized keysets

Keyset SQL queries can include $1, $2, ... placeholders that are filled in at query time via additional arguments to @BIND_KEYSET(). This enables dynamic key selection based on runtime parameters.

Create tables and parameterized keysets (note the $1 placeholder in the SQL):

FeatureQL
/* SQL */
CREATE SCHEMA IF NOT exists keysets;
--
CREATE OR REPLACE TABLE keysets.tablec1 AS SELECT num as key_id, 'tablec1_' || num::varchar as value FROM UNNEST(GENERATE_SERIES(1, 30)) AS t(num);
CREATE OR REPLACE TABLE keysets.tablec2 AS SELECT num as key_id, 'tablec2_' || num::varchar as value FROM UNNEST(GENERATE_SERIES(1, 30)) AS t(num);
Result
Count BIGINT
30

Then create the keysets that reference those tables, using $1 as a placeholder for the parameter that will be supplied at query time:

FeatureQL
CREATE OR REPLACE FEATURES IN FM.TUTORIALS.KEYSETS2 AS
SELECT
    KEYSC := ENTITY(),
    KEYSET_TABLEC1 := KEYSET('SHARED', KEYSC, 'SELECT 1+$1 as key_id', ARRAY['keysets.tablec1']),
    KEYSET_TABLEC2 := KEYSET('SHARED', KEYSC, 'SELECT 2+$1 as key_id', ARRAY['keysets.tablec2']),
;
Result
feature_name VARCHARstatus VARCHARmessage VARCHAR
FM.TUTORIALS.KEYSETS2.KEYSCCREATEDFeature created as not exists
FM.TUTORIALS.KEYSETS2.KEYSET_TABLEC1CREATEDFeature created as not exists
FM.TUTORIALS.KEYSETS2.KEYSET_TABLEC2CREATEDFeature created as not exists

Pass a computed parameter to @BIND_KEYSET() — the value is substituted into $1 in the keyset SQL:

FeatureQL
CONST
    BASE := 5
WITH
    KEY_ID := INPUT(BIGINT),
SELECT
    VALUE := EXTERNAL_COLUMNS(
        key_id BIGINT BIND TO KEY_ID,
        value VARCHAR
        FROM TABLE(keysets.tablec1)
    )[value],
    PARAM_1_OF_BINDKEYSET := VALUE(@literal(BASE*2))
FOR
    KEY_ID := @bind_keyset(SHARED, FM.TUTORIALS.KEYSETS2.KEYSC, CAST(BASE*2 AS VARCHAR))
;
Result
VALUE VARCHARPARAM_1_OF_BINDKEYSET BIGINT
tablec1_1110

The same pattern works with different parameter values, selecting from the appropriate table:

FeatureQL
CONST
    BASE := 5
WITH
    KEY_ID := INPUT(BIGINT),
SELECT
    VALUE := EXTERNAL_COLUMNS(
        key_id BIGINT BIND TO KEY_ID,
        value VARCHAR
        FROM TABLE(keysets.tablec2)
    )[value],
    PARAM_1_OF_BINDKEYSET := VALUE(@literal(BASE*4))
FOR
    KEY_ID := @bind_keyset(SHARED, FM.TUTORIALS.KEYSETS2.KEYSC, CAST(BASE*4 AS VARCHAR))
;
Result
VALUE VARCHARPARAM_1_OF_BINDKEYSET BIGINT
tablec2_2220

Parameters are passed as strings (use CAST(... AS VARCHAR) for non-string values) and substituted into the keyset SQL at compile time via CONST evaluation.

Date-filtered keysets

A common pattern in batch analytics is filtering entity populations by date — for example, computing features for all customers who had activity on a specific day. Parameterized keysets make this straightforward.

Create a table with customer activity across multiple dates:

FeatureQL
/* SQL */
CREATE SCHEMA IF NOT EXISTS keysets;
--
CREATE OR REPLACE TABLE keysets.customers_by_date AS
SELECT * FROM (VALUES
    (1, DATE '2024-01-15', 100.0),
    (2, DATE '2024-01-15', 200.0),
    (3, DATE '2024-01-15', 150.0),
    (1, DATE '2024-01-16', 110.0),
    (2, DATE '2024-01-16', 220.0),
    (3, DATE '2024-01-16', 165.0),
    (4, DATE '2024-01-16', 300.0),
    (5, DATE '2024-01-16', 250.0)
) AS t(customer_id, ds, revenue);
Result
Count BIGINT
8

Define a keyset with a date filter using the $1 placeholder:

FeatureQL
CREATE OR REPLACE FEATURES IN FM.TUTORIALS.KEYSETS3 AS
SELECT
    CUSTOMERS := ENTITY(),
    KEYSET_BY_DATE := KEYSET('BY_DATE', CUSTOMERS, `SELECT DISTINCT customer_id FROM keysets.customers_by_date WHERE ds = '$1'`, ARRAY['keysets.customers_by_date']),
;
Result
feature_name VARCHARstatus VARCHARmessage VARCHAR
FM.TUTORIALS.KEYSETS3.CUSTOMERSCREATEDFeature created as not exists
FM.TUTORIALS.KEYSETS3.KEYSET_BY_DATECREATEDFeature created as not exists

Query customers for a specific date — only customers active on that date are included:

FeatureQL
CONST
    TARGET_DATE := DATE '2024-01-15'
WITH
    CUSTOMER_ID := INPUT(BIGINT),
SELECT
    CUSTOMER_ID,
    REVENUE := EXTERNAL_COLUMNS(
        customer_id BIGINT BIND TO CUSTOMER_ID,
        ds DATE BIND TO @literal(TARGET_DATE),
        revenue DOUBLE
        FROM TABLE(keysets.customers_by_date)
    )[revenue],
FOR
    CUSTOMER_ID := @bind_keyset(BY_DATE, FM.TUTORIALS.KEYSETS3.CUSTOMERS, TARGET_DATE)
;
Result
CUSTOMER_ID BIGINTREVENUE VARCHAR
1100.0
2200.0
3150.0

Change the date parameter to get a different customer population. Here, SECOND_DATE is computed from TARGET_DATE using DATE_ADD():

FeatureQL
CONST
    TARGET_DATE := DATE '2024-01-15',
    SECOND_DATE := DATE_ADD(TARGET_DATE::TIMESTAMP, 'DAY', 1)
WITH
    CUSTOMER_ID := INPUT(BIGINT),
SELECT
    CUSTOMER_ID,
    REVENUE := EXTERNAL_COLUMNS(
        customer_id BIGINT BIND TO CUSTOMER_ID,
        ds DATE BIND TO @literal(SECOND_DATE),
        revenue DOUBLE
        FROM TABLE(keysets.customers_by_date)
    )[revenue],
FOR
    CUSTOMER_ID := @bind_keyset(BY_DATE, FM.TUTORIALS.KEYSETS3.CUSTOMERS, SECOND_DATE)
;
Result
CUSTOMER_ID BIGINTREVENUE VARCHAR
1110.0
2220.0
3165.0
4300.0
5250.0

Notice how the BIND TO @literal(TARGET_DATE) in EXTERNAL_COLUMNS and the parameter in @BIND_KEYSET() work together: the keyset selects which customers to evaluate, while the bound literal filters the data source to the correct date partition.

Best practices

  • Use descriptive keyset labels: 'ALL', 'BY_DATE', 'ACTIVE' — labels help document the intent of each keyset
  • Keep keyset SQL simple: The SQL query should only return key columns. Complex filtering logic belongs in features, not in keyset definitions
  • Leverage table disambiguation: When multiple keysets share a label, associate each with its relevant tables so @BIND_KEYSET() can auto-select the correct one
  • Parameterize for flexibility: Use $1, $2 placeholders for date ranges, cohort filters, or any dynamic criteria rather than creating separate keysets for each variation
  • Combine with CONST: Use CONST to define parameters once and reference them in both @BIND_KEYSET() and BIND TO @literal() for consistency
Last update at: 2026/03/03 16:47:38
Last updated: 2026-03-03 16:48:19