LLM reference — data sources and binding

How to connect features to external data and provide input values. Load this page when you need to write EXTERNAL_COLUMNS(), INLINE_COLUMNS(), or any BIND_XXX() pattern.

The binding contract

If your SELECT depends on any INPUT() (directly or transitively), you must bind every such input in FOR using BIND_VALUE, BIND_VALUES, or BIND_COLUMNS. Queries whose result graph has no INPUT() may omit FOR (literals and/or persisted features only).

WITH
    ID := INPUT(BIGINT),
SELECT
    ID,
    DOUBLED := ID * 2
FOR
    ID := BIND_VALUES(ARRAY[1, 2, 3]);
sql

Binding patterns

-- Single value (one row)
FOR NAME := BIND_VALUE('Alice');

-- Array of values (multiple rows, one input)
FOR ID := BIND_VALUES(ARRAY[1, 3, 5]);

-- Multiple inputs paired together (no cross-product)
FOR (INPUT1, INPUT2) := BIND_VALUES(ARRAY[ROW(1, 'a'), ROW(3, 'b')]);

-- Separate bindings = Cartesian product (n × m rows) — requires FOR CROSS (plain FOR is rejected)
FOR CROSS
    INPUT1 := BIND_VALUES(ARRAY[1, 3]),
    INPUT2 := BIND_VALUES(ARRAY[2, 4]);

-- From table columns
FOR (INPUT1, INPUT2) := BIND_COLUMNS(col1, col2 FROM TABLE(schema.table));

-- From a SQL query (backend-specific SQL)
FOR (ID, NAME) := BIND_COLUMNS(id, name FROM SQL(SELECT id, name FROM (VALUES (1, 'Alice'), (2, 'Bob')) AS t(id, name)));
sql

See Binding inputs for pairing vs Cartesian product and when FOR CROSS is required.

EXTERNAL_COLUMNS — the primary data source mapping

EXTERNAL_COLUMNS() is the recommended way to map table data to features. It handles 95% of use cases. You declare columns, bind key columns to inputs, and specify a source. FeatureQL generates all join logic.

General pattern

FEATURE_NAME := EXTERNAL_COLUMNS(
    key_column TYPE BIND TO SOME_INPUT,   -- Key column linked to an INPUT feature
    value_column1 TYPE,                     -- Value columns
    value_column2 TYPE
    FROM <SOURCE>(...)                      -- TABLE(), SQL(), or VIEW()
),
-- Access individual fields with [] operator:
FIELD1 := FEATURE_NAME[value_column1],
FIELD2 := FEATURE_NAME[value_column2],
sql

The BIND TO clause (two words, not BOUND TO) links a key column to an INPUT() feature, enforcing 0 or 1 row per input value. The result is a ROW — extract fields with [].

Always include the #ENTITY annotation on key columns. If CUSTOMER_ID is INPUT(BIGINT#CUSTOMERS), then the key column must be customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID — not just customer_id BIGINT BIND TO CUSTOMER_ID. The #ENTITY tag is what enables RELATED() and EXTEND() to resolve join paths. Without it, cross-entity operations may fail silently or produce wrong results.

From a table (simplest)

DIM_CUSTOMERS := EXTERNAL_COLUMNS(
    customer_id BIGINT BIND TO CUSTOMER_ID,
    name VARCHAR,
    email VARCHAR
    FROM TABLE(schema.dim_customers)
),
CUSTOMER_NAME := DIM_CUSTOMERS[name],
CUSTOMER_EMAIL := DIM_CUSTOMERS[email],
sql

From a SQL query (when you need transforms or joins)

DIM_CUSTOMERS := EXTERNAL_COLUMNS(
    customer_id BIGINT BIND TO CUSTOMER_ID,
    name VARCHAR,
    total_orders BIGINT
    FROM SQL(
        SELECT c.customer_id, c.name, COUNT(o.order_id) as total_orders
        FROM customers c LEFT JOIN orders o ON c.id = o.customer_id
        GROUP BY c.customer_id, c.name
    )
),
sql

Structural SQL vs business filters: Joins and grouping to produce a stable key → row mapping (one row per customer_id here) are normal. Avoid pushing business filters (WHERE revenue > …, campaign rules) into mapping SQL — express those as FeatureQL features so they stay reusable and testable.

From a view (realtime serving)

DIM_CUSTOMERS := EXTERNAL_COLUMNS(
    customer_id BIGINT BIND TO CUSTOMER_ID,
    name VARCHAR
    FROM VIEW(ext.v_dim_customers)
),
sql

Entity annotations on columns

Use #ENTITY annotations on columns that are foreign keys to maintain semantic relationships:

FCT_ORDERS := EXTERNAL_COLUMNS(
    order_id BIGINT#ORDERS BIND TO ORDER_ID,
    customer_id BIGINT#CUSTOMERS,              -- Foreign key to CUSTOMERS entity
    amount DOUBLE,
    order_date DATE
    FROM TABLE(schema.fct_orders)
),
ORDER_CUSTOMER_ID := FCT_ORDERS[customer_id],  -- Typed as BIGINT#CUSTOMERS
sql

INLINE_COLUMNS — for prototyping and tests

Same syntax as EXTERNAL_COLUMNS() but embeds data directly in the query. Use for prototyping without a database.

With CSV

DIM_CUSTOMERS := INLINE_COLUMNS(
    customer_id BIGINT BIND TO CUSTOMER_ID,
    name VARCHAR,
    country VARCHAR
    FROM CSV(
        customer_id,name,country
        101,Alice,US
        102,Bob,FR
        103,Charlie,UK
    )
),
CUSTOMER_NAME := DIM_CUSTOMERS[name],
sql

LLM / API precision — FROM CSV must be multiline. The parser expects a header line and one data row per line inside CSV( … ). If you emit the whole CSV as one physical line (typical mistake: building the FeatureQL string inside a single-line JSON value or curl -d '...' without a heredoc), you get UE/EXTERNAL-COLUMNS-CSV-NO-DATA (only a header, no rows). First-shot fixes: (1) use a multiline query (triple-quoted string in Python; shell heredoc), or (2) use FROM JSON([...]) below so the mock needs no embedded newlines, or (3) use FROM SQL(SELECT … FROM (VALUES …)) for tiny tables.

With JSON

DIM_CUSTOMERS := INLINE_COLUMNS(
    customer_id BIGINT BIND TO CUSTOMER_ID,
    name VARCHAR
    FROM JSON([
        {"customer_id": 101, "name": "Alice"},
        {"customer_id": 102, "name": "Bob"}
    ])
),
sql

Aggregation example (one-to-many → single row)

CUSTOMER_ORDER_IDS := EXTERNAL_COLUMNS(
    customer_id BIGINT BIND TO CUSTOMER_ID,
    order_ids ARRAY(BIGINT#ORDERS)
    FROM SQL(SELECT customer_id, ARRAY_AGG(order_id) as order_ids FROM orders GROUP BY customer_id)
)[order_ids],
sql

Quick mockup patterns for prototyping

When you just need to test logic without real tables:

All columns as inputs (fastest but not recommended)

WITH
    ID := INPUT(BIGINT),
    NAME := INPUT(VARCHAR),
    GREETING := 'Hello, ' || NAME || '!',
SELECT ID, GREETING
FOR (ID, NAME) := BIND_VALUES(ARRAY[ROW(1, 'Alice'), ROW(2, 'Bob')]);
sql

Key-value separation with INLINE_COLUMNS (mirrors production)

WITH
    CUSTOMER_ID := INPUT(BIGINT),
    DIM := INLINE_COLUMNS(
        customer_id BIGINT BIND TO CUSTOMER_ID,
        name VARCHAR, country VARCHAR
        FROM CSV(customer_id,name,country
            1,Alice,US
            2,Bob,FR)
    ),
    GREETING := 'Hello, ' || DIM[name] || ' from ' || DIM[country] || '!',
SELECT CUSTOMER_ID, GREETING
FOR CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2]);
sql

The second pattern is preferred because it mirrors production structure: keys are bound, values are looked up. When ready for production, replace INLINE_COLUMNS with EXTERNAL_COLUMNS ... FROM TABLE(...).

Rules for data source queries

  1. 0 or 1 row per input. CRITICAL! Be sure to bind the key columns to inputs. One set of bounded inputs must return at most one row.
  2. Keep mapping SQL boring. Prefer raw tables/views; use SQL only for stable reshaping (joins, grouping to one row per key). Push business filters and metrics into FeatureQL features. Heavier transformation belongs in your warehouse / silver layer, not in ad hoc mapping SQL.
  3. Cast in the SQL if needed. However, if a column type doesn't match available FeatureQL types, cast it in the source SQL: CAST(date_col AS DATE) AS order_date.
  4. One mapping per entity+table. Don't create multiple mappings to the same table for the same entity.

Common data source mistakes (LLM checklist)

  1. Missing #ENTITY on key columns. If your INPUT is BIGINT#CUSTOMERS, the key column in EXTERNAL_COLUMNS must be customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID. Omitting #CUSTOMERS breaks entity resolution for RELATED() / EXTEND().

  2. Missing BIND TO on key columns. Every EXTERNAL_COLUMNS must have exactly one column with BIND TO pointing at the entity's INPUT(). Without it, FeatureQL cannot look up rows by key.

  3. Bare field names in RELATED expressions. When aggregating inside RELATED(), reference fields through the mapping feature: SUM(ORD_ITEMS[unit_price]), not bare SUM(unit_price). Bare names are only valid inside TRANSFORM where they refer to row fields.

  4. Multiple BIND TO in one mapping. An EXTERNAL_COLUMNS can only have one key (one BIND TO). If you need data keyed by two different entities, create two separate mappings.

  5. Using BIGINT[] instead of ARRAY(BIGINT#ORDERS). Always use FeatureQL type syntax with entity annotations: ARRAY(BIGINT#ORDERS), not SQL-style BIGINT[].

Tests / sandboxes: BIND_COLUMNS(… FROM SQL(SELECT … FROM (VALUES …))) is a portable pattern when you have no real table; production uses TABLE() / VIEW().

Last update at: 2026/05/05 08:09:16