LLM reference — advanced features
Advanced FeatureQL features for reuse, experimentation, polymorphism, and analytics at top-level grain. Load this page when you need MACRO(), VARIANT(), UDF(), top-level GROUP BY, UNNEST, or window functions.
MACRO — reusable parameterized functions
MACRO() wraps a feature expression into a callable function: you declare which INPUT() features become parameters, and the result can be called with different arguments.
Syntax
MACRO_NAME := MACRO(expression USING param1, param2, ...)
Creating and calling
WITH
RADIUS := INPUT(DOUBLE),
AREA_CIRCLE := MACRO(PI() * POW(RADIUS, 2e0) USING RADIUS)
SELECT
SMALL := AREA_CIRCLE(2e0), -- 12.566...
LARGE := AREA_CIRCLE(5e0) -- 78.539...
;
Arguments can be expressions: AREA_CIRCLE(1e0 + 1e0) works.
Multiple parameters
Parameter order in USING determines argument order:
WITH
L := INPUT(DOUBLE),
W := INPUT(DOUBLE),
AREA_RECT := MACRO(L * W USING L, W)
SELECT
A1 := AREA_RECT(2e0, 5e0), -- 10.0
A2 := AREA_RECT(5e0, 3e0) -- 15.0
;
Chaining syntax
The chained value becomes the first argument: 2e0.AREA_CIRCLE() is equivalent to AREA_CIRCLE(2e0).
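A minimal sketch of the chained form, reusing the AREA_CIRCLE macro defined above (both features should yield the same value):

```
WITH
RADIUS := INPUT(DOUBLE),
AREA_CIRCLE := MACRO(PI() * POW(RADIUS, 2e0) USING RADIUS)
SELECT
CHAINED := 2e0.AREA_CIRCLE(), -- same as AREA_CIRCLE(2e0): 12.566...
DIRECT := AREA_CIRCLE(2e0)
;
```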
Persisting macros
CREATE OR REPLACE FEATURES IN fm.utils AS
SELECT
RADIUS := INPUT(DOUBLE),
AREA_CIRCLE := MACRO(PI() * POW(RADIUS, 2e0) USING RADIUS);
Then use from any query: SELECT RESULT := AREA_CIRCLE(3e0) FROM fm.utils;
Limitations
Macros are not polymorphic — each parameter accepts only the type declared in its INPUT(). For multi-type support, use UDF() instead.
VARIANT — what-if analysis via dependency substitution
VARIANT() creates a new version of an existing feature by swapping out specific dependencies. The original feature and all other dependencies stay unchanged — only the named replacements get substituted.
Syntax
VARIANT(feature REPLACING dep1, dep2, ... WITH replacement1, replacement2, ...)
Basic usage
WITH
EXPR := INPUT(BIGINT),
POWER := INPUT(BIGINT),
OPERATION := POW(EXPR, POWER),
RESULT_5_6 := VARIANT(OPERATION REPLACING EXPR, POWER WITH 5, 6), -- 15625
RESULT_2_3 := VARIANT(OPERATION REPLACING EXPR, POWER WITH 2, 3) -- 8
SELECT RESULT_5_6, RESULT_2_3;
Comparing original and variants side by side
WITH
EXPR := INPUT(BIGINT),
POWER := INPUT(BIGINT),
OPERATION := POW(EXPR, POWER)
SELECT
ORIGINAL := OPERATION,
VAR_A := VARIANT(OPERATION REPLACING EXPR WITH 10),
VAR_B := VARIANT(OPERATION REPLACING POWER WITH 3)
FOR (EXPR, POWER) := BIND_VALUES(ARRAY[ROW(2, 4)]);
-- Result: 16, 10000, 8
Replacing formulas, not just inputs
VARIANT() replaces dependencies, not the feature itself. To swap a formula, create an alias and replace the alias's dependency:
WITH
X := INPUT(BIGINT),
BASE_FORMULA := X * 2,
ALIAS := BASE_FORMULA,
ALTERNATIVE := VARIANT(ALIAS REPLACING BASE_FORMULA WITH X * 3)
SELECT ALIAS, ALTERNATIVE
FOR X := BIND_VALUE(5);
-- Result: 10, 15
Binding through variants
You cannot put BIND_VALUES() directly inside a replacement. Bind to a separate input and use that as the replacement:
WITH
EXPR := INPUT(BIGINT),
POWER := INPUT(BIGINT),
OPERATION := POW(EXPR, POWER),
BINDED := INPUT(BIGINT),
RESULT := VARIANT(OPERATION REPLACING EXPR, POWER WITH BINDED, 3)
SELECT RESULT
FOR BINDED := BIND_VALUES(ARRAY[2, 3, 5]);
-- Result: 8, 27, 125
Shared dependencies
When a variant replaces a dependency shared by other features, each feature gets its own version of the dependency chain. VARIANT(A REPLACING D WITH D_new) does not affect B even if B also depends on D through the same intermediate.
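A sketch of that isolation using the syntax from this page (feature names and values are illustrative): A and B both depend on D, but the variant of A leaves B untouched.

```
WITH
X := INPUT(BIGINT),
D := X + 1,
A := D * 10,
B := D * 100,
A_VAR := VARIANT(A REPLACING D WITH X + 5)
SELECT A, B, A_VAR
FOR X := BIND_VALUE(1);
-- Expected: A = 20, B = 200, A_VAR = 60 (B still sees the original D)
```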
When to use VARIANT vs MACRO
| | VARIANT() | MACRO() |
|---|---|---|
| Purpose | Alternative versions of existing features | Reusable function templates |
| Mechanism | Replaces dependencies in an existing DAG | Parameterizes an expression |
| Best for | Experimentation, A/B testing, what-if analysis | DRY patterns, reusable calculations |
| Scope | Affects the entire dependency chain | Self-contained |
UDF — polymorphic custom functions
UDFs extend FeatureQL with custom functions that support multiple input types. Unlike macros, a single UDF name can accept DATE or TIMESTAMP and produce different output types.
Syntax
UDF(JSON '{
"positions": ["param1", "param2"],
"signatures": [
["INPUT_TYPE1", "OUTPUT_TYPE1"],
["INPUT_TYPE2", "OUTPUT_TYPE2"]
],
"templates": {
"SQL": "backend_sql_template_using_{{ jexpr(p(''param1'')) }}"
}
}') AS function_name
Example: quarter formatter
WITH
UDF(JSON '{
"positions": ["date_input"],
"signatures": [
["DATE", "VARCHAR"],
["TIMESTAMP", "VARCHAR"]
],
"templates": {
"SQL": "CONCAT(CAST(YEAR({{ jexpr(p(''date_input'')) }}) AS VARCHAR), ''Q'', CAST(QUARTER({{ jexpr(p(''date_input'')) }}) AS VARCHAR))"
}
}') AS normalized_quarter
SELECT
normalized_quarter(DATE '2025-08-15') AS Q_DATE, -- '2025Q3'
TIMESTAMP '2025-10-15 14:30:00'.normalized_quarter() AS Q_TS -- '2025Q4'
;
When to use UDF vs MACRO
Use a UDF when you need polymorphism (multiple input types) or backend-specific SQL. Use a macro for everything else — it is simpler and does not require JSON configuration.
Top-level GROUP BY
In FeatureQL, GROUP BY lives inside each aggregate expression, not as a separate clause. Different features in the same query can group by different keys.
Single key
WITH
ID := INPUT(BIGINT),
DATA := EXTERNAL_COLUMNS(
id BIGINT BIND TO ID, category VARCHAR, amount DOUBLE
FROM TABLE(schema.sales)
),
CATEGORY := DATA[category],
AMOUNT := DATA[amount]
SELECT
CATEGORY,
TOTAL := SUM(AMOUNT) GROUP BY CATEGORY,
CNT := COUNT(1) GROUP BY CATEGORY
FOR ID := BIND_COLUMNS(id FROM TABLE(schema.sales));
FILTER inside aggregates
FILTER (WHERE ...) applies a condition during aggregation without affecting other features:
TOTAL_HIGH := SUM(AMOUNT) FILTER (WHERE AMOUNT > 100e0) GROUP BY CATEGORY,
Use COALESCE(..., 0e0) when all rows for a group may be filtered out (the aggregate returns NULL otherwise).
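A fuller sketch combining FILTER and COALESCE, reusing the sales setup above (it assumes GROUP BY attaches to the inner aggregate, following the pattern shown on this page):

```
WITH
ID := INPUT(BIGINT),
DATA := EXTERNAL_COLUMNS(
id BIGINT BIND TO ID, category VARCHAR, amount DOUBLE
FROM TABLE(schema.sales)
),
CATEGORY := DATA[category],
AMOUNT := DATA[amount]
SELECT
CATEGORY,
-- 0.0 instead of NULL for categories where every row is filtered out
TOTAL_HIGH := COALESCE(
SUM(AMOUNT) FILTER (WHERE AMOUNT > 100e0) GROUP BY CATEGORY,
0e0
)
FOR ID := BIND_COLUMNS(id FROM TABLE(schema.sales));
```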
UNNEST — flatten arrays to rows
UNNEST() expands an array of rows so each element becomes its own output row. It replaces the entity-per-row grain with element-per-row grain.
Rules
- UNNEST() is exclusive: you can only return unnested features and features derived from them in SELECT. You cannot mix unnested and non-unnested features.
- For plain arrays (not arrays of rows), wrap with ZIP() first: UNNEST(ZIP(my_array AS value)).
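A minimal sketch of the ZIP() pattern for a plain (non-row) array; the SCORES values are illustrative:

```
WITH
ID := INPUT(BIGINT),
SCORES := ARRAY[10e0, 20e0, 30e0]
SELECT
UNNEST(ZIP(SCORES AS value)) AS U,
SCORE := U[value]
FOR ID := BIND_VALUES(ARRAY[1]);
-- Three output rows: 10.0, 20.0, 30.0
```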
Basic pattern
WITH
ID := INPUT(BIGINT),
ORDERS := ... -- ARRAY(ROW(order_id BIGINT, amount DOUBLE))
SELECT
UNNEST(ORDERS) AS U,
ORDER_ID := U[order_id],
AMOUNT := U[amount]
FOR ID := BIND_VALUES(ARRAY[1, 2]);
Unnest then aggregate
Flatten first, then re-aggregate at a different grain:
SELECT
UNNEST(ORDERS) AS U,
CATEGORY := U[category],
TOTAL := SUM(U[amount]) GROUP BY CATEGORY
FOR ...;
Window functions
Window functions compute a value for each row based on a "window" of related rows, without collapsing into fewer rows. They use OVER() just like SQL.
Syntax
FEATURE := aggregate(expression) OVER (
[PARTITION BY key1, key2, ...]
[ORDER BY expr1 [ASC|DESC], ...]
)
Common patterns
SELECT
RUNNING_TOTAL := SUM(AMOUNT) OVER (ORDER BY ID ASC),
RANK := RANK() OVER (ORDER BY AMOUNT DESC),
BY_CAT := SUM(AMOUNT) OVER (PARTITION BY CATEGORY),
RUNNING_BY_CAT := SUM(AMOUNT) OVER (PARTITION BY CATEGORY ORDER BY ID ASC),
PREV := LAG(AMOUNT, 1) OVER (ORDER BY ID ASC),
NEXT := LEAD(AMOUNT, 1) OVER (ORDER BY ID ASC),
BUCKET := NTILE(4) OVER (ORDER BY AMOUNT DESC),
RN := ROW_NUMBER() OVER (ORDER BY ID ASC)
FROM fm.namespace
FOR ID := BIND_VALUES(SEQUENCE(1, 10));
Available window functions
All standard aggregates (SUM, AVG, MIN, MAX, COUNT, ARRAY_AGG, etc.) work with OVER(). Dedicated window functions:
| Function | Description |
|---|---|
| ROW_NUMBER() | Sequential number within partition |
| RANK() | Rank with gaps for ties |
| DENSE_RANK() | Rank without gaps |
| NTILE(n) | Distribute into n buckets |
| LEAD(expr, offset) | Value from offset rows ahead |
| LAG(expr, offset) | Value from offset rows behind |
| FIRST_VALUE(expr) | First value in window frame |
| LAST_VALUE(expr) | Last value in window frame |
Inside TRANSFORM
Window functions also work inside TRANSFORM() inner queries, operating on the fields of the array:
RANKED_ORDERS := ORDERS.TRANSFORM(
SELECT *, RANK() OVER (ORDER BY amount DESC) AS rank
),
INDEX lookups (efficient repeated access)
When you need to look up specific rows by a field value repeatedly, building an index is more efficient than scanning with TRANSFORM(... WHERE ...) each time.
-- Unique index: one row per key (MAP)
IDX := INDEX_UNIQUE(orders_array BY order_id),
ONE_ORDER := ELEMENT_AT_KEY(IDX, 42), -- ROW or NULL
-- Multi index: multiple rows per key (MAP to arrays)
IDX_CAT := INDEX_MULTI(products_array BY category),
ELECTRONICS := ELEMENTS_AT_KEY(IDX_CAT, 'electronics'), -- ARRAY(ROW) or NULL
Use INDEX_UNIQUE + ELEMENT_AT_KEY for unique fields (IDs). Use INDEX_MULTI + ELEMENTS_AT_KEY for non-unique fields (categories, statuses).
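A sketch of a unique-index lookup in a full query, following the earlier ORDERS pattern (the ORDERS source is elided as on this page; the field names are illustrative):

```
WITH
ID := INPUT(BIGINT),
ORDERS := ..., -- ARRAY(ROW(order_id BIGINT, amount DOUBLE))
IDX := INDEX_UNIQUE(ORDERS BY order_id),
ORDER_42 := ELEMENT_AT_KEY(IDX, 42) -- ROW or NULL
SELECT
FOUND := ORDER_42 IS NOT NULL,
AMOUNT_42 := ORDER_42[amount]
FOR ID := BIND_VALUES(ARRAY[1, 2]);
```

The index is built once per entity, so repeated lookups avoid rescanning the array the way TRANSFORM(... WHERE ...) would.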