LLM reference — data sources and binding

How to connect features to external data and provide input values. Load this page when you need to write EXTERNAL_COLUMNS(), INLINE_COLUMNS(), EXTERNAL_SQL(), or any BIND_XXX() pattern.

The binding contract

Every FeatureQL query must bind concrete values to its INPUT() features via the FOR clause. Without binding, the query will not execute.

```sql
WITH
    ID := INPUT(BIGINT),
SELECT
    ID,
    DOUBLED := ID * 2
FOR
    ID := BIND_VALUES(ARRAY[1, 2, 3]);
```

Binding patterns

```sql
-- Single value (one row)
FOR NAME := BIND_VALUE('Alice');

-- Array of values (multiple rows, one input)
FOR ID := BIND_VALUES(ARRAY[1, 3, 5]);

-- Multiple inputs paired together (no cross-product)
FOR (INPUT1, INPUT2) := BIND_VALUES(ARRAY[ROW(1, 'a'), ROW(3, 'b')]);

-- Separate bindings = Cartesian product (n × m rows)
FOR INPUT1 := BIND_VALUES(ARRAY[1, 3]), INPUT2 := BIND_VALUES(ARRAY[2, 4]);

-- From a SQL query (backend-specific SQL)
FOR (ID, NAME) := BIND_SQL(SELECT id, name FROM (VALUES (1, 'Alice'), (2, 'Bob')) AS t(id, name));

-- From table columns
FOR (INPUT1, INPUT2) := BIND_COLUMNS(col1, col2 FROM TABLE(schema.table));
```

EXTERNAL_COLUMNS — the primary data source mapping

EXTERNAL_COLUMNS() is the recommended way to map table data to features. It handles 95% of use cases. You declare columns, bind key columns to inputs, and specify a source. FeatureQL generates all join logic.

General pattern

```sql
FEATURE_NAME := EXTERNAL_COLUMNS(
    key_column TYPE BOUND TO SOME_INPUT,    -- Key column linked to an INPUT feature
    value_column1 TYPE,                     -- Value columns
    value_column2 TYPE
    FROM <SOURCE>(...)                      -- TABLE(), SQL(), or VIEW()
),
-- Access individual fields with [] operator:
FIELD1 := FEATURE_NAME[value_column1],
FIELD2 := FEATURE_NAME[value_column2],
```

The BOUND TO clause links a key column to an INPUT() feature, enforcing 0 or 1 row per input value. The result is a ROW — extract fields with [].

From a table (simplest)

```sql
DIM_CUSTOMERS := EXTERNAL_COLUMNS(
    customer_id BIGINT BOUND TO CUSTOMER_ID,
    name VARCHAR,
    email VARCHAR
    FROM TABLE(schema.dim_customers)
),
CUSTOMER_NAME := DIM_CUSTOMERS[name],
CUSTOMER_EMAIL := DIM_CUSTOMERS[email],
```

From a SQL query (when you need transforms or joins)

```sql
DIM_CUSTOMERS := EXTERNAL_COLUMNS(
    customer_id BIGINT BOUND TO CUSTOMER_ID,
    name VARCHAR,
    total_orders BIGINT
    FROM SQL(
        SELECT c.customer_id, c.name, COUNT(o.order_id) AS total_orders
        FROM customers c LEFT JOIN orders o ON c.customer_id = o.customer_id
        GROUP BY c.customer_id, c.name
    )
),
```

From a view (realtime serving)

```sql
DIM_CUSTOMERS := EXTERNAL_COLUMNS(
    customer_id BIGINT BOUND TO CUSTOMER_ID,
    name VARCHAR
    FROM VIEW(ext.v_dim_customers)
),
```

Entity annotations on columns

Use #ENTITY annotations on columns that are foreign keys to maintain semantic relationships:

```sql
FCT_ORDERS := EXTERNAL_COLUMNS(
    order_id BIGINT#ORDERS BOUND TO ORDER_ID,
    customer_id BIGINT#CUSTOMERS,              -- Foreign key to CUSTOMERS entity
    amount DOUBLE,
    order_date DATE
    FROM TABLE(schema.fct_orders)
),
ORDER_CUSTOMER_ID := FCT_ORDERS[customer_id],  -- Typed as BIGINT#CUSTOMERS
```

INLINE_COLUMNS — for prototyping and tests

INLINE_COLUMNS() uses the same syntax as EXTERNAL_COLUMNS() but embeds the data directly in the query. Use it for prototyping and tests without a database.

With CSV

```sql
DIM_CUSTOMERS := INLINE_COLUMNS(
    customer_id BIGINT BOUND TO CUSTOMER_ID,
    name VARCHAR,
    country VARCHAR
    FROM CSV(
        customer_id,name,country
        101,Alice,US
        102,Bob,FR
        103,Charlie,UK
    )
),
CUSTOMER_NAME := DIM_CUSTOMERS[name],
```

With JSON

```sql
DIM_CUSTOMERS := INLINE_COLUMNS(
    customer_id BIGINT BOUND TO CUSTOMER_ID,
    name VARCHAR
    FROM JSON([
        {"customer_id": 101, "name": "Alice"},
        {"customer_id": 102, "name": "Bob"}
    ])
),
```

EXTERNAL_SQL — full control (use sparingly)

EXTERNAL_SQL() gives you full control over the SQL query and join conditions. Use it only when EXTERNAL_COLUMNS() cannot express what you need — typically for complex join conditions or backend-specific SQL.

```sql
SOURCE := EXTERNAL_SQL(
    'SELECT customer_id, name, email FROM customers',
    ON `SELF.customer_id = %CUSTOMER_ID`,     -- Join condition with %INPUT placeholders
    AS ROW(customer_id BIGINT, name VARCHAR, email VARCHAR)
),
CUSTOMER_NAME := SOURCE[name],
```

Key differences from EXTERNAL_COLUMNS:

  • You write raw SQL as a string (must match the backend dialect)
  • You specify join conditions explicitly with ON using SELF.column = %INPUT_NAME syntax
  • You must declare the return type as ROW(...) explicitly
  • The SQL string can include GROUP BY for aggregation

When to use EXTERNAL_SQL

  • Complex join conditions that EXTERNAL_COLUMNS() cannot express
  • Backend-specific SQL features (e.g., BigQuery UNNEST, Trino-specific functions)
  • Dynamic SQL via metaprogramming (@echo(), @literal())
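As an illustration of backend-specific SQL, the raw SQL string can use functions that only the target backend understands. This is a hedged sketch: the schema.customers table, its tags column, and the CUSTOMER_TAG_LIST feature name are hypothetical, and ARRAY_TO_STRING assumes a BigQuery backend.

```sql
-- Hypothetical example: a BigQuery-specific function inside the SQL string.
CUSTOMER_TAG_LIST := EXTERNAL_SQL(
    'SELECT customer_id, ARRAY_TO_STRING(tags, ",") AS tag_list FROM schema.customers',
    ON `SELF.customer_id = %CUSTOMER_ID`,
    AS ROW(customer_id BIGINT, tag_list VARCHAR)
)[tag_list],
```

Because the string is passed through verbatim, the query will only run on backends whose dialect supports the functions used.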

Aggregation example (one-to-many → single row)

```sql
CUSTOMER_ORDER_IDS := EXTERNAL_SQL(
    'SELECT customer_id, ARRAY_AGG(order_id) as order_ids FROM orders GROUP BY customer_id',
    ON `SELF.customer_id = %CUSTOMER_ID`,
    AS ROW(customer_id BIGINT, order_ids ARRAY(BIGINT#ORDERS))
)[order_ids],
```

Quick mockup patterns for prototyping

When you just need to test logic without real tables:

All columns as inputs (fastest)

```sql
WITH
    ID := INPUT(BIGINT),
    NAME := INPUT(VARCHAR),
    GREETING := 'Hello, ' || NAME || '!',
SELECT ID, GREETING
FOR (ID, NAME) := BIND_VALUES(ARRAY[ROW(1, 'Alice'), ROW(2, 'Bob')]);
```

Key-value separation with INLINE_COLUMNS (mirrors production)

```sql
WITH
    CUSTOMER_ID := INPUT(BIGINT),
    DIM := INLINE_COLUMNS(
        customer_id BIGINT BOUND TO CUSTOMER_ID,
        name VARCHAR, country VARCHAR
        FROM CSV(customer_id,name,country
            1,Alice,US
            2,Bob,FR)
    ),
    GREETING := 'Hello, ' || DIM[name] || ' from ' || DIM[country] || '!',
SELECT CUSTOMER_ID, GREETING
FOR CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2]);
```

The second pattern is preferred because it mirrors production structure: keys are bound, values are looked up. When ready for production, replace INLINE_COLUMNS with EXTERNAL_COLUMNS ... FROM TABLE(...).
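For instance, the prototype above could be promoted like this, assuming the real data lives in a hypothetical schema.dim_customers table with the same columns. Only the data source changes; the feature logic stays identical.

```sql
WITH
    CUSTOMER_ID := INPUT(BIGINT),
    DIM := EXTERNAL_COLUMNS(
        customer_id BIGINT BOUND TO CUSTOMER_ID,
        name VARCHAR, country VARCHAR
        FROM TABLE(schema.dim_customers)    -- was: INLINE_COLUMNS ... FROM CSV(...)
    ),
    GREETING := 'Hello, ' || DIM[name] || ' from ' || DIM[country] || '!',
SELECT CUSTOMER_ID, GREETING
FOR CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2]);
```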

Rules for data source queries

  1. Keep SQL simple. SELECT with optional GROUP BY only. No WHERE clauses — filters belong in features, not in the data source.
  2. One mapping per entity+table. Don't create multiple mappings to the same table for the same entity.
  3. 0 or 1 row per input. The BOUND TO constraint enforces this. For one-to-many, aggregate with ARRAY_AGG in the SQL or use EXTERNAL_SQL with GROUP BY.
  4. Cast in the SQL if needed. If a column type doesn't match, cast it in the source SQL: CAST(date_col AS DATE) AS order_date.
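Rule 4 in practice, as a sketch: the schema.raw_orders table and its VARCHAR order_date_str column are hypothetical.

```sql
-- Cast mismatched source types inside the FROM SQL(...) query,
-- so the declared column types match what the SQL returns.
DIM_ORDERS := EXTERNAL_COLUMNS(
    order_id BIGINT BOUND TO ORDER_ID,
    order_date DATE
    FROM SQL(
        SELECT order_id, CAST(order_date_str AS DATE) AS order_date
        FROM schema.raw_orders
    )
),
```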
Last update at: 2026/02/17 19:13:00