Pitfall patterns and edge cases
Patterns that trip up LLMs (and humans) because they require careful attention to scoping rules, type precision, and function availability. Each pattern was validated by having a fast LLM solve the problem blind, then executing the solution to find real errors.
P1: Computed fields inside TRANSFORM (WITH clause)
When you need a derived field that doesn't exist in the enriched array — like extracting a quarter from a date — use the WITH clause inside TRANSFORM to compute it before the SELECT:
WITH
CUSTOMERS := ENTITY(),
ORDERS := ENTITY(),
CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
ORDER_ID := INPUT(BIGINT#ORDERS),
DIM_CUSTOMERS := EXTERNAL_COLUMNS(
customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
name VARCHAR,
country VARCHAR,
signup_date DATE,
FROM TABLE(pitfall.customers)
),
FCT_ORDERS := EXTERNAL_COLUMNS(
order_id BIGINT#ORDERS BIND TO ORDER_ID,
order_customer_id BIGINT#CUSTOMERS,
amount DOUBLE,
category VARCHAR,
order_date DATE,
status VARCHAR,
FROM TABLE(pitfall.orders)
),
CUSTOMER_ORDERS_OBT := EXTERNAL_COLUMNS(
customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
order_ids ARRAY(BIGINT#ORDERS),
FROM TABLE(pitfall.customer_orders_obt)
),
CUSTOMER_NAME := DIM_CUSTOMERS[name],
ORDER_IDS := CUSTOMER_ORDERS_OBT[order_ids],
ORDER_AMOUNT := FCT_ORDERS[amount],
ORDER_CATEGORY := FCT_ORDERS[category],
ORDER_ORDER_DATE := FCT_ORDERS[order_date],
ENRICHED_ORDERS := EXTEND(
ZIP(ORDER_IDS AS order_id)
WITH ORDER_AMOUNT AS amount, ORDER_CATEGORY AS category, ORDER_ORDER_DATE AS order_date
VIA order_id BIND TO ORDER_ID
),
REVENUE_BY_QUARTER := ENRICHED_ORDERS.TRANSFORM(
WITH quarter := EXTRACT(QUARTER FROM order_date)
SELECT quarter, SUM(amount) GROUP BY quarter AS total_amount
ORDER BY quarter
),
SELECT
CUSTOMER_ID,
CUSTOMER_NAME,
REVENUE_BY_QUARTER,
FOR CROSS
CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2, 3]),
ORDER_ID := BIND_VALUES(SEQUENCE(10, 17)),
;| CUSTOMER_ID BIGINT | CUSTOMER_NAME VARCHAR | REVENUE_BY_QUARTER VARCHAR |
|---|---|---|
| 1 | Alice | [{quarter: 1, total_amount: 170.0}, {quarter: 2, total_amount: 30.0}] |
| 2 | Bob | [{quarter: 1, total_amount: 200.0}, {quarter: 2, total_amount: 80.0}] |
| 3 | Charlie | [{quarter: 2, total_amount: 150.0}, {quarter: 3, total_amount: 135.0}] |
The WITH quarter := EXTRACT(QUARTER FROM order_date) creates a temporary field visible only within that TRANSFORM's SELECT. This is often missed — LLMs try to compute the quarter outside the array scope where order_date isn't available.
P2: Filtered aggregation with RELATED
RELATED() supports FILTER (WHERE ...) for conditional aggregation. The predicate must be a named boolean feature at the correct entity grain — not an inline expression:
WITH
CUSTOMERS := ENTITY(),
ORDERS := ENTITY(),
CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
ORDER_ID := INPUT(BIGINT#ORDERS),
DIM_CUSTOMERS := EXTERNAL_COLUMNS(
customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
name VARCHAR,
FROM TABLE(pitfall.customers)
),
FCT_ORDERS := EXTERNAL_COLUMNS(
order_id BIGINT#ORDERS BIND TO ORDER_ID,
order_customer_id BIGINT#CUSTOMERS,
amount DOUBLE,
status VARCHAR,
FROM TABLE(pitfall.orders)
),
CUSTOMER_NAME := DIM_CUSTOMERS[name],
ORDER_STATUS := FCT_ORDERS[status],
IS_SHIPPED := ORDER_STATUS = 'shipped',
TOTAL_REVENUE := CUSTOMER_ID.RELATED(SUM(FCT_ORDERS[amount]) GROUP BY FCT_ORDERS[order_customer_id]),
SHIPPED_REVENUE := CUSTOMER_ID.RELATED(
SUM(FCT_ORDERS[amount]) FILTER (WHERE IS_SHIPPED)
GROUP BY FCT_ORDERS[order_customer_id]
),
SHIPPED_RATIO := SHIPPED_REVENUE / TOTAL_REVENUE,
SELECT
CUSTOMER_ID,
CUSTOMER_NAME,
TOTAL_REVENUE,
SHIPPED_REVENUE,
SHIPPED_RATIO,
FOR CROSS
CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2, 3]),
ORDER_ID := BIND_VALUES(SEQUENCE(10, 17)),
;| CUSTOMER_ID BIGINT | CUSTOMER_NAME VARCHAR | TOTAL_REVENUE VARCHAR | SHIPPED_REVENUE VARCHAR | SHIPPED_RATIO VARCHAR |
|---|---|---|---|---|
| 1 | Alice | 200 | 170 | 0.85 |
| 2 | Bob | 280 | 200 | 0.7142857142857143 |
| 3 | Charlie | 285 | 285 | 1 |
LLM pitfall: using NULLIF() for safe division. FeatureQL does not have NULLIF — use IF(TOTAL_REVENUE = 0, NULL, SHIPPED_REVENUE / TOTAL_REVENUE) or handle NULLs with COALESCE() downstream. In this dataset all customers have orders, so simple division works.
P3: Scalar injection with CARRY — the >.N precision rule
CARRY() injects a customer-level scalar into each row of an array, making it available as a field inside TRANSFORM(). This is the correct way to compare per-row values against a per-entity aggregate:
WITH
CUSTOMERS := ENTITY(),
ORDERS := ENTITY(),
CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
ORDER_ID := INPUT(BIGINT#ORDERS),
CUSTOMER_ORDERS_OBT := EXTERNAL_COLUMNS(
customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
order_ids ARRAY(BIGINT#ORDERS),
FROM TABLE(pitfall.customer_orders_obt)
),
FCT_ORDERS := EXTERNAL_COLUMNS(
order_id BIGINT#ORDERS BIND TO ORDER_ID,
amount DOUBLE,
FROM TABLE(pitfall.orders)
),
ORDER_IDS := CUSTOMER_ORDERS_OBT[order_ids],
ORDER_AMOUNT := FCT_ORDERS[amount],
ENRICHED_ORDERS := EXTEND(
ZIP(ORDER_IDS AS order_id)
WITH ORDER_AMOUNT AS amount
VIA order_id BIND TO ORDER_ID
),
AVG_AMOUNT := ENRICHED_ORDERS.TRANSFORM(SELECT AVG(amount)).UNWRAP_ONE(),
ORDERS_FLAGGED := ENRICHED_ORDERS.CARRY(AVG_AMOUNT).TRANSFORM(
SELECT order_id, amount, IF(amount >.2 avg_amount, 'above', 'below') AS flag
),
SELECT
CUSTOMER_ID,
AVG_AMOUNT,
ORDERS_FLAGGED,
FOR CROSS
CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2, 3]),
ORDER_ID := BIND_VALUES(SEQUENCE(10, 17)),
;| CUSTOMER_ID BIGINT | AVG_AMOUNT VARCHAR | ORDERS_FLAGGED VARCHAR |
|---|---|---|
| 1 | 66.66666666666667 | [{order_id: 10, amount: 50.0, flag: below}, {order_id: 11, amount: 120.0, flag: above}, {order_id: 12, amount: 30.0, flag: below}] |
| 2 | 140 | [{order_id: 13, amount: 200.0, flag: above}, {order_id: 14, amount: 80.0, flag: below}] |
| 3 | 95 | [{order_id: 15, amount: 150.0, flag: above}, {order_id: 16, amount: 45.0, flag: below}, {order_id: 17, amount: 90.0, flag: below}] |
Critical pitfall: DOUBLE comparisons require precision syntax. amount > avg_amount fails with COMPARISON-DIGITS-EMPTY. You must use >.N (e.g., >.2 for 2 digits of precision) when comparing DOUBLE values. This is the single most common runtime error LLMs hit.
P4: Composing RELATED results into derived features
Features returned by RELATED() are regular features — you can combine them with string concatenation, boolean logic, or arithmetic:
WITH
AIRPORTS := ENTITY(),
FLIGHTS := ENTITY(),
AIRPORT_ID := INPUT(BIGINT#AIRPORTS),
FLIGHT_ID := INPUT(BIGINT#FLIGHTS),
DIM_AIRPORTS := EXTERNAL_COLUMNS(
airport_id BIGINT#AIRPORTS BIND TO AIRPORT_ID,
name VARCHAR,
city VARCHAR,
country VARCHAR,
FROM TABLE(airport.dim_airports)
),
FCT_FLIGHTS := EXTERNAL_COLUMNS(
flight_id BIGINT#FLIGHTS BIND TO FLIGHT_ID,
flight_origin_id BIGINT#AIRPORTS,
flight_dest_id BIGINT#AIRPORTS,
departure_time TIMESTAMP,
airline VARCHAR,
FROM TABLE(airport.fct_flights)
),
AIRPORT_CITY := DIM_AIRPORTS[city],
AIRPORT_COUNTRY := DIM_AIRPORTS[country],
FLIGHT_AIRLINE := FCT_FLIGHTS[airline],
ORIGIN_CITY := RELATED(AIRPORT_CITY VIA FCT_FLIGHTS[flight_origin_id]),
DEST_CITY := RELATED(AIRPORT_CITY VIA FCT_FLIGHTS[flight_dest_id]),
ORIGIN_COUNTRY := RELATED(AIRPORT_COUNTRY VIA FCT_FLIGHTS[flight_origin_id]),
DEST_COUNTRY := RELATED(AIRPORT_COUNTRY VIA FCT_FLIGHTS[flight_dest_id]),
ROUTE_LABEL := ORIGIN_CITY || ' → ' || DEST_CITY,
IS_DOMESTIC := ORIGIN_COUNTRY = DEST_COUNTRY,
SELECT
FLIGHT_ID,
FLIGHT_AIRLINE AS AIRLINE,
ROUTE_LABEL,
IS_DOMESTIC,
FOR CROSS
FLIGHT_ID := BIND_VALUES(SEQUENCE(200, 207)),
AIRPORT_ID := BIND_VALUES(SEQUENCE(100, 104)),
;| FLIGHT_ID BIGINT | AIRLINE VARCHAR | ROUTE_LABEL VARCHAR | IS_DOMESTIC VARCHAR |
|---|---|---|---|
| 200 | BA | London → Paris | FALSE |
| 201 | KLM | London → Amsterdam | FALSE |
| 202 | AF | Paris → Madrid | FALSE |
| 203 | KLM | Amsterdam → London | FALSE |
| 204 | AZ | Amsterdam → Rome | FALSE |
| 205 | IB | Madrid → London | FALSE |
| 206 | IB | Madrid → Amsterdam | FALSE |
| 207 | AZ | Rome → Paris | FALSE |
ROUTE_LABEL concatenates two RELATED() results. IS_DOMESTIC compares two RELATED() results with =. No special syntax is needed — FeatureQL features compose freely regardless of how they were computed.
P5: Triple-chained TRANSFORM to extract a scalar
When you need to go from an enriched array → grouped aggregation → top pick → single scalar value, chain three TRANSFORM() calls ending with UNWRAP_ONE():
WITH
CUSTOMERS := ENTITY(),
ORDERS := ENTITY(),
CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
ORDER_ID := INPUT(BIGINT#ORDERS),
DIM_CUSTOMERS := EXTERNAL_COLUMNS(
customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
name VARCHAR,
FROM TABLE(pitfall.customers)
),
FCT_ORDERS := EXTERNAL_COLUMNS(
order_id BIGINT#ORDERS BIND TO ORDER_ID,
order_customer_id BIGINT#CUSTOMERS,
amount DOUBLE,
category VARCHAR,
FROM TABLE(pitfall.orders)
),
CUSTOMER_ORDERS_OBT := EXTERNAL_COLUMNS(
customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
order_ids ARRAY(BIGINT#ORDERS),
FROM TABLE(pitfall.customer_orders_obt)
),
CUSTOMER_NAME := DIM_CUSTOMERS[name],
ORDER_IDS := CUSTOMER_ORDERS_OBT[order_ids],
ORDER_AMOUNT := FCT_ORDERS[amount],
ORDER_CATEGORY := FCT_ORDERS[category],
ENRICHED_ORDERS := EXTEND(
ZIP(ORDER_IDS AS order_id)
WITH ORDER_AMOUNT AS amount, ORDER_CATEGORY AS category
VIA order_id BIND TO ORDER_ID
),
REVENUE_BY_CATEGORY := ENRICHED_ORDERS.TRANSFORM(
SELECT category, SUM(amount) GROUP BY category AS total
ORDER BY total DESC
),
TOP_CATEGORY_ROW := REVENUE_BY_CATEGORY.TRANSFORM(
SELECT category, total ORDER BY total DESC LIMIT 1
),
TOP_CATEGORY_NAME := TOP_CATEGORY_ROW.TRANSFORM(SELECT category).UNWRAP_ONE(),
SELECT
CUSTOMER_ID,
CUSTOMER_NAME,
TOP_CATEGORY_NAME,
FOR CROSS
CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2, 3]),
ORDER_ID := BIND_VALUES(SEQUENCE(10, 17)),
;| CUSTOMER_ID BIGINT | CUSTOMER_NAME VARCHAR | TOP_CATEGORY_NAME VARCHAR |
|---|---|---|
| 1 | Alice | books |
| 2 | Bob | clothing |
| 3 | Charlie | books |
The chain: .TRANSFORM(GROUP BY) → .TRANSFORM(ORDER BY LIMIT 1) → .TRANSFORM(SELECT field).UNWRAP_ONE(). Each TRANSFORM operates on the output of the previous one. The final UNWRAP_ONE() extracts the VARCHAR from a single-element ARRAY(ROW(category VARCHAR)).
P6: RELATED with computed boolean filter (cross-entity)
FILTER (WHERE ...) in RELATED() can reference boolean features that themselves depend on other RELATED() calls. The feature IS_INTL_FLIGHT compares origin and destination countries (both computed via RELATED()), and NUM_INTERNATIONAL_DEPARTURES uses that boolean as a filter:
WITH
AIRPORTS := ENTITY(),
FLIGHTS := ENTITY(),
AIRPORT_ID := INPUT(BIGINT#AIRPORTS),
FLIGHT_ID := INPUT(BIGINT#FLIGHTS),
DIM_AIRPORTS := EXTERNAL_COLUMNS(
airport_id BIGINT#AIRPORTS BIND TO AIRPORT_ID,
name VARCHAR,
country VARCHAR,
FROM TABLE(airport.dim_airports)
),
FCT_FLIGHTS := EXTERNAL_COLUMNS(
flight_id BIGINT#FLIGHTS BIND TO FLIGHT_ID,
flight_origin_id BIGINT#AIRPORTS,
flight_dest_id BIGINT#AIRPORTS,
FROM TABLE(airport.fct_flights)
),
AIRPORT_NAME := DIM_AIRPORTS[name],
AIRPORT_COUNTRY := DIM_AIRPORTS[country],
ORIGIN_COUNTRY_FL := RELATED(AIRPORT_COUNTRY VIA FCT_FLIGHTS[flight_origin_id]),
DEST_COUNTRY_FL := RELATED(AIRPORT_COUNTRY VIA FCT_FLIGHTS[flight_dest_id]),
IS_INTL_FLIGHT := ORIGIN_COUNTRY_FL <> DEST_COUNTRY_FL,
NUM_INTERNATIONAL_DEPARTURES := AIRPORT_ID.RELATED(
COUNT(1) FILTER (WHERE IS_INTL_FLIGHT) GROUP BY FCT_FLIGHTS[flight_origin_id]
),
SELECT
AIRPORT_ID,
AIRPORT_NAME,
NUM_INTERNATIONAL_DEPARTURES,
FOR CROSS
AIRPORT_ID := BIND_VALUES(SEQUENCE(100, 104)),
FLIGHT_ID := BIND_VALUES(SEQUENCE(200, 207)),
ORDER BY NUM_INTERNATIONAL_DEPARTURES DESC
LIMIT 1
;| AIRPORT_ID BIGINT | AIRPORT_NAME VARCHAR | NUM_INTERNATIONAL_DEPARTURES BIGINT |
|---|---|---|
| 100 | Heathrow | 2 |
This demonstrates that FeatureQL correctly resolves deep dependency chains inside RELATED() filters.
LLM error catalog from blind testing
| Error | Root cause | Fix |
|---|---|---|
FUNCTION-NOT-FOUND: NULLIF | SQL function not available in FeatureQL | Use IF(x = 0, NULL, ...) or COALESCE() |
COMPARISON-DIGITS-EMPTY | Comparing DOUBLE values with > instead of >.N | Use >.2, <.2, etc. for DOUBLE comparisons |
| Extra columns in EXTERNAL_COLUMNS | Mapping unnecessary columns (harmless but verbose) | Only map columns you actually use |
What's next
- CARRY vs IMPORT — when to use each
- Tricky modeling patterns — dual FK, successive aggregation, many-to-many
- TRANSFORM reference — full WITH/SELECT/WHERE/GROUP BY syntax