Pitfall patterns and edge cases

Patterns that trip up LLMs (and humans) because they require careful attention to scoping rules, type precision, and function availability. Each pattern was validated by having a fast LLM solve the problem blind, then executing the solution to find real errors.

P1: Computed fields inside TRANSFORM (WITH clause)

When you need a derived field that doesn't exist in the enriched array — like extracting a quarter from a date — use the WITH clause inside TRANSFORM to compute it before the SELECT:

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    ORDERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    ORDER_ID := INPUT(BIGINT#ORDERS),
    DIM_CUSTOMERS := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        name VARCHAR,
        country VARCHAR,
        signup_date DATE,
        FROM TABLE(pitfall.customers)
    ),
    FCT_ORDERS := EXTERNAL_COLUMNS(
        order_id BIGINT#ORDERS BIND TO ORDER_ID,
        order_customer_id BIGINT#CUSTOMERS,
        amount DOUBLE,
        category VARCHAR,
        order_date DATE,
        status VARCHAR,
        FROM TABLE(pitfall.orders)
    ),
    CUSTOMER_ORDERS_OBT := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        order_ids ARRAY(BIGINT#ORDERS),
        FROM TABLE(pitfall.customer_orders_obt)
    ),
    CUSTOMER_NAME := DIM_CUSTOMERS[name],
    ORDER_IDS := CUSTOMER_ORDERS_OBT[order_ids],
    ORDER_AMOUNT := FCT_ORDERS[amount],
    ORDER_CATEGORY := FCT_ORDERS[category],
    ORDER_ORDER_DATE := FCT_ORDERS[order_date],
    ENRICHED_ORDERS := EXTEND(
        ZIP(ORDER_IDS AS order_id)
        WITH ORDER_AMOUNT AS amount, ORDER_CATEGORY AS category, ORDER_ORDER_DATE AS order_date
        VIA order_id BIND TO ORDER_ID
    ),
    REVENUE_BY_QUARTER := ENRICHED_ORDERS.TRANSFORM(
        WITH quarter := EXTRACT(QUARTER FROM order_date)
        SELECT quarter, SUM(amount) GROUP BY quarter AS total_amount
        ORDER BY quarter
    ),
SELECT
    CUSTOMER_ID,
    CUSTOMER_NAME,
    REVENUE_BY_QUARTER,
FOR CROSS
    CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2, 3]),
    ORDER_ID := BIND_VALUES(SEQUENCE(10, 17)),
;
Result
CUSTOMER_ID BIGINTCUSTOMER_NAME VARCHARREVENUE_BY_QUARTER VARCHAR
1Alice[{quarter: 1, total_amount: 170.0}, {quarter: 2, total_amount: 30.0}]
2Bob[{quarter: 1, total_amount: 200.0}, {quarter: 2, total_amount: 80.0}]
3Charlie[{quarter: 2, total_amount: 150.0}, {quarter: 3, total_amount: 135.0}]

The WITH quarter := EXTRACT(QUARTER FROM order_date) creates a temporary field visible only within that TRANSFORM's SELECT. This is often missed — LLMs try to compute the quarter outside the array scope where order_date isn't available.

P2: Filtered aggregation with RELATED

RELATED() supports FILTER (WHERE ...) for conditional aggregation. The predicate must be a named boolean feature at the correct entity grain — not an inline expression:

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    ORDERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    ORDER_ID := INPUT(BIGINT#ORDERS),
    DIM_CUSTOMERS := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        name VARCHAR,
        FROM TABLE(pitfall.customers)
    ),
    FCT_ORDERS := EXTERNAL_COLUMNS(
        order_id BIGINT#ORDERS BIND TO ORDER_ID,
        order_customer_id BIGINT#CUSTOMERS,
        amount DOUBLE,
        status VARCHAR,
        FROM TABLE(pitfall.orders)
    ),
    CUSTOMER_NAME := DIM_CUSTOMERS[name],
    ORDER_STATUS := FCT_ORDERS[status],
    IS_SHIPPED := ORDER_STATUS = 'shipped',
    TOTAL_REVENUE := CUSTOMER_ID.RELATED(SUM(FCT_ORDERS[amount]) GROUP BY FCT_ORDERS[order_customer_id]),
    SHIPPED_REVENUE := CUSTOMER_ID.RELATED(
        SUM(FCT_ORDERS[amount]) FILTER (WHERE IS_SHIPPED)
        GROUP BY FCT_ORDERS[order_customer_id]
    ),
    SHIPPED_RATIO := SHIPPED_REVENUE / TOTAL_REVENUE,
SELECT
    CUSTOMER_ID,
    CUSTOMER_NAME,
    TOTAL_REVENUE,
    SHIPPED_REVENUE,
    SHIPPED_RATIO,
FOR CROSS
    CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2, 3]),
    ORDER_ID := BIND_VALUES(SEQUENCE(10, 17)),
;
Result
CUSTOMER_ID BIGINTCUSTOMER_NAME VARCHARTOTAL_REVENUE VARCHARSHIPPED_REVENUE VARCHARSHIPPED_RATIO VARCHAR
1Alice2001700.85
2Bob2802000.7142857142857143
3Charlie2852851

LLM pitfall: using NULLIF() for safe division. FeatureQL does not have NULLIF — use IF(TOTAL_REVENUE = 0, NULL, SHIPPED_REVENUE / TOTAL_REVENUE) or handle NULLs with COALESCE() downstream. In this dataset all customers have orders, so simple division works.

P3: Scalar injection with CARRY — the >.N precision rule

CARRY() injects a customer-level scalar into each row of an array, making it available as a field inside TRANSFORM(). This is the correct way to compare per-row values against a per-entity aggregate:

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    ORDERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    ORDER_ID := INPUT(BIGINT#ORDERS),
    CUSTOMER_ORDERS_OBT := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        order_ids ARRAY(BIGINT#ORDERS),
        FROM TABLE(pitfall.customer_orders_obt)
    ),
    FCT_ORDERS := EXTERNAL_COLUMNS(
        order_id BIGINT#ORDERS BIND TO ORDER_ID,
        amount DOUBLE,
        FROM TABLE(pitfall.orders)
    ),
    ORDER_IDS := CUSTOMER_ORDERS_OBT[order_ids],
    ORDER_AMOUNT := FCT_ORDERS[amount],
    ENRICHED_ORDERS := EXTEND(
        ZIP(ORDER_IDS AS order_id)
        WITH ORDER_AMOUNT AS amount
        VIA order_id BIND TO ORDER_ID
    ),
    AVG_AMOUNT := ENRICHED_ORDERS.TRANSFORM(SELECT AVG(amount)).UNWRAP_ONE(),
    ORDERS_FLAGGED := ENRICHED_ORDERS.CARRY(AVG_AMOUNT).TRANSFORM(
        SELECT order_id, amount, IF(amount >.2 avg_amount, 'above', 'below') AS flag
    ),
SELECT
    CUSTOMER_ID,
    AVG_AMOUNT,
    ORDERS_FLAGGED,
FOR CROSS
    CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2, 3]),
    ORDER_ID := BIND_VALUES(SEQUENCE(10, 17)),
;
Result
CUSTOMER_ID BIGINTAVG_AMOUNT VARCHARORDERS_FLAGGED VARCHAR
166.66666666666667[{order_id: 10, amount: 50.0, flag: below}, {order_id: 11, amount: 120.0, flag: above}, {order_id: 12, amount: 30.0, flag: below}]
2140[{order_id: 13, amount: 200.0, flag: above}, {order_id: 14, amount: 80.0, flag: below}]
395[{order_id: 15, amount: 150.0, flag: above}, {order_id: 16, amount: 45.0, flag: below}, {order_id: 17, amount: 90.0, flag: below}]

Critical pitfall: DOUBLE comparisons require precision syntax. amount > avg_amount fails with COMPARISON-DIGITS-EMPTY. You must use >.N (e.g., >.2 for 2 digits of precision) when comparing DOUBLE values. This is the single most common runtime error LLMs hit.

P4: Composing RELATED results into derived features

Features returned by RELATED() are regular features — you can combine them with string concatenation, boolean logic, or arithmetic:

FeatureQL
WITH
    AIRPORTS := ENTITY(),
    FLIGHTS := ENTITY(),
    AIRPORT_ID := INPUT(BIGINT#AIRPORTS),
    FLIGHT_ID := INPUT(BIGINT#FLIGHTS),
    DIM_AIRPORTS := EXTERNAL_COLUMNS(
        airport_id BIGINT#AIRPORTS BIND TO AIRPORT_ID,
        name VARCHAR,
        city VARCHAR,
        country VARCHAR,
        FROM TABLE(airport.dim_airports)
    ),
    FCT_FLIGHTS := EXTERNAL_COLUMNS(
        flight_id BIGINT#FLIGHTS BIND TO FLIGHT_ID,
        flight_origin_id BIGINT#AIRPORTS,
        flight_dest_id BIGINT#AIRPORTS,
        departure_time TIMESTAMP,
        airline VARCHAR,
        FROM TABLE(airport.fct_flights)
    ),
    AIRPORT_CITY := DIM_AIRPORTS[city],
    AIRPORT_COUNTRY := DIM_AIRPORTS[country],
    FLIGHT_AIRLINE := FCT_FLIGHTS[airline],
    ORIGIN_CITY := RELATED(AIRPORT_CITY VIA FCT_FLIGHTS[flight_origin_id]),
    DEST_CITY := RELATED(AIRPORT_CITY VIA FCT_FLIGHTS[flight_dest_id]),
    ORIGIN_COUNTRY := RELATED(AIRPORT_COUNTRY VIA FCT_FLIGHTS[flight_origin_id]),
    DEST_COUNTRY := RELATED(AIRPORT_COUNTRY VIA FCT_FLIGHTS[flight_dest_id]),
    ROUTE_LABEL := ORIGIN_CITY || ' → ' || DEST_CITY,
    IS_DOMESTIC := ORIGIN_COUNTRY = DEST_COUNTRY,
SELECT
    FLIGHT_ID,
    FLIGHT_AIRLINE AS AIRLINE,
    ROUTE_LABEL,
    IS_DOMESTIC,
FOR CROSS
    FLIGHT_ID := BIND_VALUES(SEQUENCE(200, 207)),
    AIRPORT_ID := BIND_VALUES(SEQUENCE(100, 104)),
;
Result
FLIGHT_ID BIGINTAIRLINE VARCHARROUTE_LABEL VARCHARIS_DOMESTIC VARCHAR
200BALondon → ParisFALSE
201KLMLondon → AmsterdamFALSE
202AFParis → MadridFALSE
203KLMAmsterdam → LondonFALSE
204AZAmsterdam → RomeFALSE
205IBMadrid → LondonFALSE
206IBMadrid → AmsterdamFALSE
207AZRome → ParisFALSE

ROUTE_LABEL concatenates two RELATED() results. IS_DOMESTIC compares two RELATED() results with =. No special syntax is needed — FeatureQL features compose freely regardless of how they were computed.

P5: Triple-chained TRANSFORM to extract a scalar

When you need to go from an enriched array → grouped aggregation → top pick → single scalar value, chain three TRANSFORM() calls ending with UNWRAP_ONE():

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    ORDERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    ORDER_ID := INPUT(BIGINT#ORDERS),
    DIM_CUSTOMERS := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        name VARCHAR,
        FROM TABLE(pitfall.customers)
    ),
    FCT_ORDERS := EXTERNAL_COLUMNS(
        order_id BIGINT#ORDERS BIND TO ORDER_ID,
        order_customer_id BIGINT#CUSTOMERS,
        amount DOUBLE,
        category VARCHAR,
        FROM TABLE(pitfall.orders)
    ),
    CUSTOMER_ORDERS_OBT := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        order_ids ARRAY(BIGINT#ORDERS),
        FROM TABLE(pitfall.customer_orders_obt)
    ),
    CUSTOMER_NAME := DIM_CUSTOMERS[name],
    ORDER_IDS := CUSTOMER_ORDERS_OBT[order_ids],
    ORDER_AMOUNT := FCT_ORDERS[amount],
    ORDER_CATEGORY := FCT_ORDERS[category],
    ENRICHED_ORDERS := EXTEND(
        ZIP(ORDER_IDS AS order_id)
        WITH ORDER_AMOUNT AS amount, ORDER_CATEGORY AS category
        VIA order_id BIND TO ORDER_ID
    ),
    REVENUE_BY_CATEGORY := ENRICHED_ORDERS.TRANSFORM(
        SELECT category, SUM(amount) GROUP BY category AS total
        ORDER BY total DESC
    ),
    TOP_CATEGORY_ROW := REVENUE_BY_CATEGORY.TRANSFORM(
        SELECT category, total ORDER BY total DESC LIMIT 1
    ),
    TOP_CATEGORY_NAME := TOP_CATEGORY_ROW.TRANSFORM(SELECT category).UNWRAP_ONE(),
SELECT
    CUSTOMER_ID,
    CUSTOMER_NAME,
    TOP_CATEGORY_NAME,
FOR CROSS
    CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2, 3]),
    ORDER_ID := BIND_VALUES(SEQUENCE(10, 17)),
;
Result
CUSTOMER_ID BIGINTCUSTOMER_NAME VARCHARTOP_CATEGORY_NAME VARCHAR
1Alicebooks
2Bobclothing
3Charliebooks

The chain: .TRANSFORM(GROUP BY).TRANSFORM(ORDER BY LIMIT 1).TRANSFORM(SELECT field).UNWRAP_ONE(). Each TRANSFORM operates on the output of the previous one. The final UNWRAP_ONE() extracts the VARCHAR from a single-element ARRAY(ROW(category VARCHAR)).

P6: RELATED with computed boolean filter (cross-entity)

FILTER (WHERE ...) in RELATED() can reference boolean features that themselves depend on other RELATED() calls. The feature IS_INTL_FLIGHT compares origin and destination countries (both computed via RELATED()), and NUM_INTERNATIONAL_DEPARTURES uses that boolean as a filter:

FeatureQL
WITH
    AIRPORTS := ENTITY(),
    FLIGHTS := ENTITY(),
    AIRPORT_ID := INPUT(BIGINT#AIRPORTS),
    FLIGHT_ID := INPUT(BIGINT#FLIGHTS),
    DIM_AIRPORTS := EXTERNAL_COLUMNS(
        airport_id BIGINT#AIRPORTS BIND TO AIRPORT_ID,
        name VARCHAR,
        country VARCHAR,
        FROM TABLE(airport.dim_airports)
    ),
    FCT_FLIGHTS := EXTERNAL_COLUMNS(
        flight_id BIGINT#FLIGHTS BIND TO FLIGHT_ID,
        flight_origin_id BIGINT#AIRPORTS,
        flight_dest_id BIGINT#AIRPORTS,
        FROM TABLE(airport.fct_flights)
    ),
    AIRPORT_NAME := DIM_AIRPORTS[name],
    AIRPORT_COUNTRY := DIM_AIRPORTS[country],
    ORIGIN_COUNTRY_FL := RELATED(AIRPORT_COUNTRY VIA FCT_FLIGHTS[flight_origin_id]),
    DEST_COUNTRY_FL := RELATED(AIRPORT_COUNTRY VIA FCT_FLIGHTS[flight_dest_id]),
    IS_INTL_FLIGHT := ORIGIN_COUNTRY_FL <> DEST_COUNTRY_FL,
    NUM_INTERNATIONAL_DEPARTURES := AIRPORT_ID.RELATED(
        COUNT(1) FILTER (WHERE IS_INTL_FLIGHT) GROUP BY FCT_FLIGHTS[flight_origin_id]
    ),
SELECT
    AIRPORT_ID,
    AIRPORT_NAME,
    NUM_INTERNATIONAL_DEPARTURES,
FOR CROSS
    AIRPORT_ID := BIND_VALUES(SEQUENCE(100, 104)),
    FLIGHT_ID := BIND_VALUES(SEQUENCE(200, 207)),
ORDER BY NUM_INTERNATIONAL_DEPARTURES DESC
LIMIT 1
;
Result
AIRPORT_ID BIGINTAIRPORT_NAME VARCHARNUM_INTERNATIONAL_DEPARTURES BIGINT
100Heathrow2

This demonstrates that FeatureQL correctly resolves deep dependency chains inside RELATED() filters.

LLM error catalog from blind testing

ErrorRoot causeFix
FUNCTION-NOT-FOUND: NULLIFSQL function not available in FeatureQLUse IF(x = 0, NULL, ...) or COALESCE()
COMPARISON-DIGITS-EMPTYComparing DOUBLE values with > instead of >.NUse >.2, <.2, etc. for DOUBLE comparisons
Extra columns in EXTERNAL_COLUMNSMapping unnecessary columns (harmless but verbose)Only map columns you actually use

What's next

Last update at: 2026/04/27 15:40:31