Backend Particularities

FeatureQL transpiles to multiple SQL backends (DuckDB, Trino, BigQuery, DataFusion), and each backend has its own quirks. FeatureQL normalizes most of these differences so your queries return identical results everywhere — but there are cases where you need to be aware of the underlying backend.

Consistent output across backends

Some array operations produce different orderings depending on the backend. For example, ARRAY_DISTINCT() returns sorted arrays in Trino but unsorted arrays in DuckDB. FeatureQL's default functions normalize this by sorting the output on all backends, even if it costs a small performance penalty.

When performance matters more than deterministic ordering, use the _UNSAFE variants. These skip the normalization step and return whatever the backend produces natively:

FunctionBehavior
ARRAY_DISTINCT()Sorted arrays on all backends
ARRAY_UNION()Sorted arrays on all backends
ARRAY_EXCEPT()Sorted arrays on all backends
ARRAY_INTERSECT()Sorted arrays on all backends
ARRAY_DISTINCT_UNSAFE()Sorted on Trino, unsorted on DuckDB
ARRAY_UNION_UNSAFE()Sorted on Trino, unsorted on DuckDB
ARRAY_EXCEPT_UNSAFE()Sorted on Trino, unsorted on DuckDB
ARRAY_INTERSECT_UNSAFE()Sorted on Trino, unsorted on DuckDB
When to use _UNSAFE

Only use _UNSAFE variants when you're processing large arrays and profiling shows the sort step is a bottleneck. For most queries, the safe defaults are fast enough and save you from debugging non-deterministic test failures.

Inline data syntax in BIND_SQL and EXTERNAL_SQL

BIND_SQL(), EXTERNAL_SQL(), and EXTERNAL_VIEW() embed raw SQL that runs directly on the target backend. Since this SQL is passed through without translation, you must write it in the backend's native dialect.

The most common case is inlining small datasets for testing or prototyping. Here's how to do it on each backend:

Standard SQL (Trino, DataFusion)

Single feature:

SELECT FEATURE1 FROM UNNEST(ARRAY[1,2,3]) AS t(FEATURE1)
sql

Multiple features:

SELECT FEATURE1, FEATURE2 FROM (VALUES
    (1, 'A'),
    (2, 'B'),
    (3, 'C')
) AS t(FEATURE1, FEATURE2)
sql

BigQuery

Single feature:

SELECT * FROM UNNEST([1,2,3]) AS FEATURE1
sql

Multiple features:

SELECT AS STRUCT customer_id, attribute
FROM UNNEST(CAST([
    STRUCT(1, 'attribute'),
    STRUCT(2, 'attribute'),
    STRUCT(3, 'attribute'),
] AS ARRAY<STRUCT<customer_id INT64, attribute STRING>>))
sql

DuckDB

DuckDB supports the standard SQL syntax above, plus an alternative using typed UNNEST:

SELECT UNNEST(ARRAY[
    (1, 'A'),
    (2, 'B'),
    (3, 'C')
]::ROW(FEATURE1 BIGINT, FEATURE2 VARCHAR)[], recursive:=true)
sql
These SQL snippets are not FeatureQL

The code blocks above are raw SQL passed as string arguments to BIND_SQL() or EXTERNAL_SQL(). They are not parsed by FeatureQL — they go directly to the backend engine. Use FeatureQL's INLINE_COLUMNS() with CSV or JSON format when you want backend-agnostic inline data. See Inlining Mockup Data for details.

Last update at: 2026/03/03 16:47:38
Last updated: 2026-03-03 16:48:19