LLM reference — data sources and binding
How to connect features to external data and provide input values. Load this page when you need to write EXTERNAL_COLUMNS(), INLINE_COLUMNS(), or any BIND_XXX() pattern.
The binding contract
If your SELECT depends on any INPUT() (directly or transitively), you must bind every such input in FOR using BIND_VALUE, BIND_VALUES, or BIND_COLUMNS. Queries whose result graph has no INPUT() may omit FOR (literals and/or persisted features only).
WITH
ID := INPUT(BIGINT),
SELECT
ID,
DOUBLED := ID * 2
FOR
ID := BIND_VALUES(ARRAY[1, 2, 3]); Binding patterns
-- Single value (one row)
FOR NAME := BIND_VALUE('Alice');
-- Array of values (multiple rows, one input)
FOR ID := BIND_VALUES(ARRAY[1, 3, 5]);
-- Multiple inputs paired together (no cross-product)
FOR (INPUT1, INPUT2) := BIND_VALUES(ARRAY[ROW(1, 'a'), ROW(3, 'b')]);
-- Separate bindings = Cartesian product (n × m rows) — requires FOR CROSS (plain FOR is rejected)
FOR CROSS
INPUT1 := BIND_VALUES(ARRAY[1, 3]),
INPUT2 := BIND_VALUES(ARRAY[2, 4]);
-- From table columns
FOR (INPUT1, INPUT2) := BIND_COLUMNS(col1, col2 FROM TABLE(schema.table));
-- From a SQL query (backend-specific SQL)
FOR (ID, NAME) := BIND_COLUMNS(id, name FROM SQL(SELECT id, name FROM (VALUES (1, 'Alice'), (2, 'Bob')) AS t(id, name)));
See Binding inputs for pairing vs Cartesian product and when FOR CROSS is required.
EXTERNAL_COLUMNS — the primary data source mapping
EXTERNAL_COLUMNS() is the recommended way to map table data to features. It handles 95% of use cases. You declare columns, bind key columns to inputs, and specify a source. FeatureQL generates all join logic.
General pattern
FEATURE_NAME := EXTERNAL_COLUMNS(
key_column TYPE BIND TO SOME_INPUT, -- Key column linked to an INPUT feature
value_column1 TYPE, -- Value columns
value_column2 TYPE
FROM <SOURCE>(...) -- TABLE(), SQL(), or VIEW()
),
-- Access individual fields with [] operator:
FIELD1 := FEATURE_NAME[value_column1],
FIELD2 := FEATURE_NAME[value_column2], The BIND TO clause (two words, not BOUND TO) links a key column to an INPUT() feature, enforcing 0 or 1 row per input value. The result is a ROW — extract fields with [].
Always include the #ENTITY annotation on key columns. If CUSTOMER_ID is INPUT(BIGINT#CUSTOMERS), then the key column must be customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID — not just customer_id BIGINT BIND TO CUSTOMER_ID. The #ENTITY tag is what enables RELATED() and EXTEND() to resolve join paths. Without it, cross-entity operations may fail silently or produce wrong results.
From a table (simplest)
DIM_CUSTOMERS := EXTERNAL_COLUMNS(
customer_id BIGINT BIND TO CUSTOMER_ID,
name VARCHAR,
email VARCHAR
FROM TABLE(schema.dim_customers)
),
CUSTOMER_NAME := DIM_CUSTOMERS[name],
CUSTOMER_EMAIL := DIM_CUSTOMERS[email], From a SQL query (when you need transforms or joins)
DIM_CUSTOMERS := EXTERNAL_COLUMNS(
customer_id BIGINT BIND TO CUSTOMER_ID,
name VARCHAR,
total_orders BIGINT
FROM SQL(
SELECT c.customer_id, c.name, COUNT(o.order_id) as total_orders
FROM customers c LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.customer_id, c.name
)
), Structural SQL vs business filters: Joins and grouping to produce a stable key → row mapping (one row per customer_id here) are normal. Avoid pushing business filters (WHERE revenue > …, campaign rules) into mapping SQL — express those as FeatureQL features so they stay reusable and testable.
From a view (realtime serving)
DIM_CUSTOMERS := EXTERNAL_COLUMNS(
customer_id BIGINT BIND TO CUSTOMER_ID,
name VARCHAR
FROM VIEW(ext.v_dim_customers)
), Entity annotations on columns
Use #ENTITY annotations on columns that are foreign keys to maintain semantic relationships:
FCT_ORDERS := EXTERNAL_COLUMNS(
order_id BIGINT#ORDERS BIND TO ORDER_ID,
customer_id BIGINT#CUSTOMERS, -- Foreign key to CUSTOMERS entity
amount DOUBLE,
order_date DATE
FROM TABLE(schema.fct_orders)
),
ORDER_CUSTOMER_ID := FCT_ORDERS[customer_id], -- Typed as BIGINT#CUSTOMERS INLINE_COLUMNS — for prototyping and tests
Same syntax as EXTERNAL_COLUMNS() but embeds data directly in the query. Use for prototyping without a database.
With CSV
DIM_CUSTOMERS := INLINE_COLUMNS(
customer_id BIGINT BIND TO CUSTOMER_ID,
name VARCHAR,
country VARCHAR
FROM CSV(
customer_id,name,country
101,Alice,US
102,Bob,FR
103,Charlie,UK
)
),
CUSTOMER_NAME := DIM_CUSTOMERS[name], LLM / API precision — FROM CSV must be multiline. The parser expects a header line and one data row per line inside CSV( … ). If you emit the whole CSV as one physical line (typical mistake: building the FeatureQL string inside a single-line JSON value or curl -d '...' without a heredoc), you get UE/EXTERNAL-COLUMNS-CSV-NO-DATA (only a header, no rows). First-shot fixes: (1) use a multiline query (triple-quoted string in Python; shell heredoc), or (2) use FROM JSON([...]) below so the mock needs no embedded newlines, or (3) use FROM SQL(SELECT … FROM (VALUES …)) for tiny tables.
With JSON
DIM_CUSTOMERS := INLINE_COLUMNS(
customer_id BIGINT BIND TO CUSTOMER_ID,
name VARCHAR
FROM JSON([
{"customer_id": 101, "name": "Alice"},
{"customer_id": 102, "name": "Bob"}
])
), Aggregation example (one-to-many → single row)
CUSTOMER_ORDER_IDS := EXTERNAL_COLUMNS(
customer_id BIGINT BIND TO CUSTOMER_ID,
order_ids ARRAY(BIGINT#ORDERS)
FROM SQL(SELECT customer_id, ARRAY_AGG(order_id) as order_ids FROM orders GROUP BY customer_id)
)[order_ids], Quick mockup patterns for prototyping
When you just need to test logic without real tables:
All columns as inputs (fastest but not recommended)
WITH
ID := INPUT(BIGINT),
NAME := INPUT(VARCHAR),
GREETING := 'Hello, ' || NAME || '!',
SELECT ID, GREETING
FOR (ID, NAME) := BIND_VALUES(ARRAY[ROW(1, 'Alice'), ROW(2, 'Bob')]); Key-value separation with INLINE_COLUMNS (mirrors production)
WITH
CUSTOMER_ID := INPUT(BIGINT),
DIM := INLINE_COLUMNS(
customer_id BIGINT BIND TO CUSTOMER_ID,
name VARCHAR, country VARCHAR
FROM CSV(customer_id,name,country
1,Alice,US
2,Bob,FR)
),
GREETING := 'Hello, ' || DIM[name] || ' from ' || DIM[country] || '!',
SELECT CUSTOMER_ID, GREETING
FOR CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2]); The second pattern is preferred because it mirrors production structure: keys are bound, values are looked up. When ready for production, replace INLINE_COLUMNS with EXTERNAL_COLUMNS ... FROM TABLE(...).
Rules for data source queries
- 0 or 1 row per input. CRITICAL! Be sure to bind the key columns to inputs. One set of bounded inputs must return at most one row.
- Keep mapping SQL boring. Prefer raw tables/views; use SQL only for stable reshaping (joins, grouping to one row per key). Push business filters and metrics into FeatureQL features. Heavier transformation belongs in your warehouse / silver layer, not in ad hoc mapping SQL.
- Cast in the SQL if needed. However, if a column type doesn't match available FeatureQL types, cast it in the source SQL:
CAST(date_col AS DATE) AS order_date. - One mapping per entity+table. Don't create multiple mappings to the same table for the same entity.
Common data source mistakes (LLM checklist)
Missing
#ENTITYon key columns. If your INPUT isBIGINT#CUSTOMERS, the key column inEXTERNAL_COLUMNSmust becustomer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID. Omitting#CUSTOMERSbreaks entity resolution forRELATED()/EXTEND().Missing
BIND TOon key columns. EveryEXTERNAL_COLUMNSmust have exactly one column withBIND TOpointing at the entity'sINPUT(). Without it, FeatureQL cannot look up rows by key.Bare field names in RELATED expressions. When aggregating inside
RELATED(), reference fields through the mapping feature:SUM(ORD_ITEMS[unit_price]), not bareSUM(unit_price). Bare names are only valid insideTRANSFORMwhere they refer to row fields.Multiple BIND TO in one mapping. An
EXTERNAL_COLUMNScan only have one key (oneBIND TO). If you need data keyed by two different entities, create two separate mappings.Using
BIGINT[]instead ofARRAY(BIGINT#ORDERS). Always use FeatureQL type syntax with entity annotations:ARRAY(BIGINT#ORDERS), not SQL-styleBIGINT[].
Tests / sandboxes: BIND_COLUMNS(… FROM SQL(SELECT … FROM (VALUES …))) is a portable pattern when you have no real table; production uses TABLE() / VIEW().