LLM reference — advanced features
Advanced FeatureQL features for reuse, experimentation, polymorphism, and analytics at top-level grain. Load this page when you need MACRO(), VARIANT(), UDF(), top-level GROUP BY, UNNEST, or window functions.
MACRO — reusable parameterized functions
MACRO() wraps a feature expression into a callable function: you declare which INPUT() features become parameters, and the result can be called with different arguments.
Syntax
MACRO_NAME := MACRO(expression USING param1, param2, ...)
Creating and calling
WITH
RADIUS := INPUT(DOUBLE),
AREA_CIRCLE := MACRO(PI() * POW(RADIUS, 2e0) USING RADIUS)
SELECT
SMALL := AREA_CIRCLE(2e0), -- 12.566...
LARGE := AREA_CIRCLE(5e0) -- 78.539...
;
Arguments can be expressions: AREA_CIRCLE(1e0 + 1e0) works.
Multiple parameters
Parameter order in USING determines argument order:
WITH
L := INPUT(DOUBLE),
W := INPUT(DOUBLE),
AREA_RECT := MACRO(L * W USING L, W)
SELECT
A1 := AREA_RECT(2e0, 5e0), -- 10.0
A2 := AREA_RECT(5e0, 3e0) -- 15.0
;
Chaining syntax
The chained value becomes the first argument: 2e0.AREA_CIRCLE() is equivalent to AREA_CIRCLE(2e0).
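A minimal sketch of the chained form, reusing the AREA_CIRCLE macro defined above (both features should yield the same value):

```
WITH
RADIUS := INPUT(DOUBLE),
AREA_CIRCLE := MACRO(PI() * POW(RADIUS, 2e0) USING RADIUS)
SELECT
CHAINED := 2e0.AREA_CIRCLE(), -- same as AREA_CIRCLE(2e0): 12.566...
DIRECT := AREA_CIRCLE(2e0)
;
```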
Persisting macros
CREATE OR REPLACE FEATURES IN fm.utils AS
SELECT
RADIUS := INPUT(DOUBLE),
AREA_CIRCLE := MACRO(PI() * POW(RADIUS, 2e0) USING RADIUS);
Then use from any query: SELECT RESULT := AREA_CIRCLE(3e0) FROM fm.utils;
Limitations
Macros are not polymorphic — each parameter accepts only the type declared in its INPUT(). For multi-type support, use UDF() instead.
VARIANT — what-if analysis via dependency substitution
VARIANT() creates a new version of an existing feature by swapping out specific dependencies. The original feature and all other dependencies stay unchanged — only the named replacements get substituted.
Syntax
VARIANT(feature REPLACING dep1, dep2, ... WITH replacement1, replacement2, ...)
Basic usage
WITH
EXPR := INPUT(BIGINT),
POWER := INPUT(BIGINT),
OPERATION := POW(EXPR, POWER),
RESULT_5_6 := VARIANT(OPERATION REPLACING EXPR, POWER WITH 5, 6), -- 15625
RESULT_2_3 := VARIANT(OPERATION REPLACING EXPR, POWER WITH 2, 3) -- 8
SELECT RESULT_5_6, RESULT_2_3;
Comparing original and variants side by side
WITH
EXPR := INPUT(BIGINT),
POWER := INPUT(BIGINT),
OPERATION := POW(EXPR, POWER)
SELECT
ORIGINAL := OPERATION,
VAR_A := VARIANT(OPERATION REPLACING EXPR WITH 10),
VAR_B := VARIANT(OPERATION REPLACING POWER WITH 3)
FOR (EXPR, POWER) := BIND_VALUES(ARRAY[ROW(2, 4)]);
-- Result: 16, 10000, 8
Replacing formulas, not just inputs
VARIANT() replaces dependencies, not the feature itself. To swap a formula, create an alias and replace the alias's dependency:
WITH
X := INPUT(BIGINT),
BASE_FORMULA := X * 2,
ALIAS := BASE_FORMULA,
ALTERNATIVE := VARIANT(ALIAS REPLACING BASE_FORMULA WITH X * 3)
SELECT ALIAS, ALTERNATIVE
FOR X := BIND_VALUE(5);
-- Result: 10, 15
Binding through variants
You cannot put BIND_VALUES() directly inside a replacement. Bind to a separate input and use that as the replacement:
WITH
EXPR := INPUT(BIGINT),
POWER := INPUT(BIGINT),
OPERATION := POW(EXPR, POWER),
BINDED := INPUT(BIGINT),
RESULT := VARIANT(OPERATION REPLACING EXPR, POWER WITH BINDED, 3)
SELECT RESULT
FOR BINDED := BIND_VALUES(ARRAY[2, 3, 5]);
-- Result: 8, 27, 125
Shared dependencies
When a variant replaces a dependency shared by other features, each feature gets its own version of the dependency chain. VARIANT(A REPLACING D WITH D_new) does not affect B even if B also depends on D through the same intermediate.
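A sketch of that isolation using the syntax from this page (feature names and values are illustrative): A and B both depend on D, but the variant of A leaves B untouched.

```
WITH
X := INPUT(BIGINT),
D := X + 1,
A := D * 10,
B := D * 100,
A_VAR := VARIANT(A REPLACING D WITH X + 5)
SELECT A, B, A_VAR
FOR X := BIND_VALUE(1);
-- Expected: A = 20, B = 200, A_VAR = 60 (B still sees the original D)
```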
When to use VARIANT vs MACRO
| | VARIANT() | MACRO() |
|---|---|---|
| Purpose | Alternative versions of existing features | Reusable function templates |
| Mechanism | Replaces dependencies in an existing DAG | Parameterizes an expression |
| Best for | Experimentation, A/B testing, what-if analysis | DRY patterns, reusable calculations |
| Scope | Affects the entire dependency chain | Self-contained |
UDF — polymorphic custom functions
UDFs extend FeatureQL with custom functions that support multiple input types. Unlike macros, a single UDF name can accept DATE or TIMESTAMP and produce different output types.
Syntax
UDF(JSON '{
"positions": ["param1", "param2"],
"signatures": [
["INPUT_TYPE1", "OUTPUT_TYPE1"],
["INPUT_TYPE2", "OUTPUT_TYPE2"]
],
"templates": {
"SQL": "backend_sql_template_using_{{ jexpr(p(''param1'')) }}"
}
}') AS function_name
Example: quarter formatter
WITH
UDF(JSON '{
"positions": ["date_input"],
"signatures": [
["DATE", "VARCHAR"],
["TIMESTAMP", "VARCHAR"]
],
"templates": {
"SQL": "CONCAT(CAST(YEAR({{ jexpr(p(''date_input'')) }}) AS VARCHAR), ''Q'', CAST(QUARTER({{ jexpr(p(''date_input'')) }}) AS VARCHAR))"
}
}') AS normalized_quarter
SELECT
normalized_quarter(DATE '2025-08-15') AS Q_DATE, -- '2025Q3'
TIMESTAMP '2025-10-15 14:30:00'.normalized_quarter() AS Q_TS -- '2025Q4'
;
When to use UDF vs MACRO
Use a UDF when you need polymorphism (multiple input types) or backend-specific SQL. Use a macro for everything else — it is simpler and does not require JSON configuration.
Top-level GROUP BY
In FeatureQL, GROUP BY lives inside each aggregate expression, not as a separate clause. Different features in the same query can group by different keys.
Single key
WITH
ID := INPUT(BIGINT),
DATA := EXTERNAL_COLUMNS(
id BIGINT BIND TO ID, category VARCHAR, amount DOUBLE
FROM TABLE(schema.sales)
),
CATEGORY := DATA[category],
AMOUNT := DATA[amount]
SELECT
CATEGORY,
TOTAL := SUM(AMOUNT) GROUP BY CATEGORY,
CNT := COUNT(1) GROUP BY CATEGORY
FOR ID := BIND_COLUMNS(id FROM TABLE(schema.sales));
FILTER inside aggregates
FILTER (WHERE ...) applies a condition during aggregation without affecting other features:
TOTAL_HIGH := SUM(AMOUNT) FILTER (WHERE AMOUNT > 100e0) GROUP BY CATEGORY,
Use COALESCE(..., 0e0) when all rows for a group may be filtered out (the aggregate returns NULL otherwise).
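A fuller sketch combining FILTER and COALESCE, reusing the sales setup above (it assumes GROUP BY attaches to the inner aggregate, following the pattern shown on this page):

```
WITH
ID := INPUT(BIGINT),
DATA := EXTERNAL_COLUMNS(
id BIGINT BIND TO ID, category VARCHAR, amount DOUBLE
FROM TABLE(schema.sales)
),
CATEGORY := DATA[category],
AMOUNT := DATA[amount]
SELECT
CATEGORY,
-- 0.0 instead of NULL for categories where every row is filtered out
TOTAL_HIGH := COALESCE(
SUM(AMOUNT) FILTER (WHERE AMOUNT > 100e0) GROUP BY CATEGORY,
0e0
)
FOR ID := BIND_COLUMNS(id FROM TABLE(schema.sales));
```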
UNNEST — flatten arrays to rows
UNNEST() expands an array of rows so each element becomes its own output row. It replaces the entity-per-row grain with element-per-row grain.
Rules
- UNNEST() is exclusive: you can only return unnested features and features derived from them in SELECT. You cannot mix unnested and non-unnested features.
- For plain arrays (not arrays of rows), wrap with ZIP() first: UNNEST(ZIP(my_array AS value)).
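A minimal sketch of the ZIP() pattern for a plain (non-row) array; the SCORES values are illustrative:

```
WITH
ID := INPUT(BIGINT),
SCORES := ARRAY[10e0, 20e0, 30e0]
SELECT
UNNEST(ZIP(SCORES AS value)) AS U,
SCORE := U[value]
FOR ID := BIND_VALUES(ARRAY[1]);
-- Three output rows: 10.0, 20.0, 30.0
```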
Basic pattern
WITH
ID := INPUT(BIGINT),
ORDERS := ... -- ARRAY(ROW(order_id BIGINT, amount DOUBLE))
SELECT
UNNEST(ORDERS) AS U,
ORDER_ID := U[order_id],
AMOUNT := U[amount]
FOR ID := BIND_VALUES(ARRAY[1, 2]);
Unnest then aggregate
Flatten first, then re-aggregate at a different grain:
SELECT
UNNEST(ORDERS) AS U,
CATEGORY := U[category],
TOTAL := SUM(U[amount]) GROUP BY CATEGORY
FOR ...;
Window functions
Window functions compute a value for each row based on a "window" of related rows, without collapsing into fewer rows. They use OVER() just like SQL.
Syntax
FEATURE := aggregate(expression) OVER (
[PARTITION BY key1, key2, ...]
[ORDER BY expr1 [ASC|DESC], ...]
)
Common patterns
SELECT
RUNNING_TOTAL := SUM(AMOUNT) OVER (ORDER BY ID ASC),
RANK := RANK() OVER (ORDER BY AMOUNT DESC),
BY_CAT := SUM(AMOUNT) OVER (PARTITION BY CATEGORY),
RUNNING_BY_CAT := SUM(AMOUNT) OVER (PARTITION BY CATEGORY ORDER BY ID ASC),
PREV := LAG(AMOUNT, 1) OVER (ORDER BY ID ASC),
NEXT := LEAD(AMOUNT, 1) OVER (ORDER BY ID ASC),
BUCKET := NTILE(4) OVER (ORDER BY AMOUNT DESC),
RN := ROW_NUMBER() OVER (ORDER BY ID ASC)
FROM fm.namespace
FOR ID := BIND_VALUES(SEQUENCE(1, 10));
Available window functions
All standard aggregates (SUM, AVG, MIN, MAX, COUNT, ARRAY_AGG, etc.) work with OVER(). Dedicated window functions:
| Function | Description |
|---|---|
| ROW_NUMBER() | Sequential number within partition |
| RANK() | Rank with gaps for ties |
| DENSE_RANK() | Rank without gaps |
| NTILE(n) | Distribute into n buckets |
| LEAD(expr, offset) | Value from offset rows ahead |
| LAG(expr, offset) | Value from offset rows behind |
| FIRST_VALUE(expr) | First value in window frame |
| LAST_VALUE(expr) | Last value in window frame |
Inside TRANSFORM
Window functions also work inside TRANSFORM() inner queries, operating on the fields of the array:
RANKED_ORDERS := ORDERS.TRANSFORM(
SELECT *, RANK() OVER (ORDER BY amount DESC) AS rank
),
INDEX lookups (efficient repeated access)
When you need to look up specific rows by a field value repeatedly, building an index is more efficient than scanning with TRANSFORM(... WHERE ...) each time.
-- Unique index: one row per key (MAP)
IDX := INDEX_UNIQUE(orders_array BY order_id),
ONE_ORDER := ELEMENT_AT_KEY(IDX, 42), -- ROW or NULL
-- Multi index: multiple rows per key (MAP to arrays)
IDX_CAT := INDEX_MULTI(products_array BY category),
ELECTRONICS := ELEMENTS_AT_KEY(IDX_CAT, 'electronics'), -- ARRAY(ROW) or NULL
Use INDEX_UNIQUE + ELEMENT_AT_KEY for unique fields (IDs). Use INDEX_MULTI + ELEMENTS_AT_KEY for non-unique fields (categories, statuses).
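A sketch of a unique-index lookup in a full query, following the earlier ORDERS pattern (the ORDERS source is elided as on this page; the field names are illustrative):

```
WITH
ID := INPUT(BIGINT),
ORDERS := ..., -- ARRAY(ROW(order_id BIGINT, amount DOUBLE))
IDX := INDEX_UNIQUE(ORDERS BY order_id),
ORDER_42 := ELEMENT_AT_KEY(IDX, 42) -- ROW or NULL
SELECT
FOUND := ORDER_42 IS NOT NULL,
AMOUNT_42 := ORDER_42[amount]
FOR ID := BIND_VALUES(ARRAY[1, 2]);
```

The index is built once per entity, so repeated lookups avoid rescanning the array the way TRANSFORM(... WHERE ...) would.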