Common patterns
This page collects recurring patterns for working with arrays of rows. Each pattern solves a problem you'll encounter regularly when building entity-centric features.
Extracting scalars with UNWRAP and UNWRAP_ONE
After a TRANSFORM(), the result is still an array of rows — even when you only selected a single field or aggregated down to a single value. Two functions bridge the gap back to simpler types:
UNWRAP()extracts a single-field array of rows into a plain array.[{amount: 10}, {amount: 20}]becomes[10, 20].UNWRAP_ONE()goes one step further: it extracts a single value from a single-row, single-field result. Useful after aggregations likeSELECT SUM(...).
A common workflow: use bracket notation array[field] to extract a field as a plain array, UNWRAP() after filtering, and UNWRAP_ONE() after aggregating.
SELECT
ORDERS := ARRAY[
ROW(101, 49.99e0),
ROW(102, 129.00e0),
ROW(103, 15.50e0),
ROW(104, 89.99e0)
]::ARRAY(ROW(order_id BIGINT, amount DOUBLE)),
ALL_AMOUNTS := ORDERS[amount],
HIGH_VALUE_AMOUNTS := UNWRAP(ORDERS.TRANSFORM(SELECT amount WHERE amount >.5 50e0)),
TOTAL_SPENT := ORDERS.TRANSFORM(SELECT SUM(amount)).UNWRAP_ONE()
;| ORDERS VARCHAR | ALL_AMOUNTS ARRAY | HIGH_VALUE_AMOUNTS ARRAY | TOTAL_SPENT VARCHAR |
|---|---|---|---|
| [{order_id: 101, amount: 49.99}, {order_id: 102, amount: 129.0}, {order_id: 103, amount: 15.5}, {order_id: 104, amount: 89.99}] | [49.99, 129.0, 15.5, 89.99] | [129.0, 89.99] | 284.48 |
Enrich first, then transform
TRANSFORM() is self-contained — it cannot reference features outside the array. When you need external data inside a transform (e.g., filtering orders by category, where category comes from another entity), the solution is to enrich the array with EXTEND() first, then transform the enriched result.
This pattern is the standard way to combine cross-entity data with array operations:
- Start with a scalar array of foreign keys (e.g.,
order_ids) - Use
EXTEND()to add fields from the referenced entity - Use
TRANSFORM()on the enriched array to filter, aggregate, or sort
WITH
CUSTOMERS := ENTITY(),
CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
CUSTOMER_SOURCE := INLINE_COLUMNS(
customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
order_ids ARRAY(BIGINT#ORDERS)
FROM CSV(
customer_id,order_ids
1,"[10, 13]"
2,"[11, 14, 15]"
)
),
ORDERS := ENTITY(),
ORDER_ID := INPUT(BIGINT#ORDERS),
ORDER_SOURCE := INLINE_COLUMNS(
order_id BIGINT#ORDERS BIND TO ORDER_ID,
price DOUBLE,
category VARCHAR
FROM CSV(
order_id,price,category
10,12.51,food
11,13.20,food
13,25.70,electronics
14,30.26,food
15,10.00,electronics
)
),
ORDER_PRICE := ORDER_SOURCE[price],
ORDER_CATEGORY := ORDER_SOURCE[category],
ENRICHED_ORDERS := EXTEND(
ZIP(CUSTOMER_SOURCE[order_ids] AS order_id)
WITH ORDER_PRICE as ORDER_PRICE, ORDER_CATEGORY as ORDER_CATEGORY
VIA order_id BIND TO ORDER_ID
),
REVENUE := ENRICHED_ORDERS.TRANSFORM(SELECT SUM(order_price) WHERE order_category = 'food').UNWRAP_ONE()
SELECT
CUSTOMER_ID,
ENRICHED_ORDERS,
REVENUE
FOR
CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2])
;| CUSTOMER_ID BIGINT | ENRICHED_ORDERS VARCHAR | REVENUE VARCHAR |
|---|---|---|
| 1 | [{order_id: 10, order_price: 12.51, order_category: food}, {order_id: 13, order_price: 25.7, order_category: electronics}] | 12.51 |
| 2 | [{order_id: 11, order_price: 13.2, order_category: food}, {order_id: 14, order_price: 30.26, order_category: food}, {order_id: 15, order_price: 10.0, order_category: electronics}] | 43.46 |
The EXTEND() step adds order_price and order_category to each row. The TRANSFORM() step can then filter on order_category and sum order_price — fields that weren't in the original array.
Top-N rows
Sorting and limiting inside TRANSFORM() gives you the top-N pattern: the N highest (or lowest) rows by some field. This is the array-of-rows equivalent of ROW_NUMBER() OVER (ORDER BY ...) <= N in SQL.
WITH
CUSTOMER_ID := INPUT(BIGINT),
ORDERS := ARRAY[
ROW(101, 49.99e0),
ROW(102, 129.00e0),
ROW(103, 15.50e0),
ROW(104, CUSTOMER_ID::DOUBLE * 100e0)
]::ARRAY(ROW(order_id BIGINT, amount DOUBLE)),
TOP_2_ORDERS := ORDERS.TRANSFORM(SELECT * ORDER BY amount DESC LIMIT 2)
SELECT
CUSTOMER_ID,
TOP_2_ORDERS
FOR
CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2])
;| CUSTOMER_ID BIGINT | TOP_2_ORDERS VARCHAR |
|---|---|
| 1 | [{order_id: 102, amount: 129.0}, {order_id: 104, amount: 100.0}] |
| 2 | [{order_id: 104, amount: 200.0}, {order_id: 102, amount: 129.0}] |
Array functions vs TRANSFORM equivalence
Many built-in array functions — ARRAY_SUM(), ARRAY_MIN(), ARRAY_SORT(), etc. — are shortcuts for patterns you could write with ZIP() + TRANSFORM(). The built-in functions are more concise and should be preferred, but understanding the equivalence helps when you need a custom operation that doesn't have a built-in.
The pattern is always the same: ZIP(array AS name).TRANSFORM(SELECT ...).UNWRAP() (or .UNWRAP_ONE() for scalar results).
SELECT
BASE := ARRAY[1,2,3,4],
ARRAY_SUM1 := ARRAY_SUM(BASE),
ARRAY_SUM2 := ZIP(BASE as v).TRANSFORM(SELECT SUM(v)).UNWRAP_ONE(),
ARRAY_AVG1 := ARRAY_AVG(BASE)::DECIMAL(3,1), -- TODO: Keep it as DOUBLE triggers a bug with ARRAY_DIST2
ARRAY_AVG2 := ZIP(BASE as v).TRANSFORM(SELECT AVG(v)).UNWRAP_ONE(),
ARRAY_MIN1 := ARRAY_MIN(BASE),
ARRAY_MIN2 := ZIP(BASE as v).TRANSFORM(SELECT MIN(v)).UNWRAP_ONE(),
ARRAY_MAX1 := ARRAY_MAX(BASE),
ARRAY_MAX2 := ZIP(BASE as v).TRANSFORM(SELECT MAX(v)).UNWRAP_ONE(),
ARRAY_DIST1 := ARRAY_DISTINCT(BASE),
ARRAY_DIST2 := (ZIP(BASE as v).TRANSFORM(SELECT v, COUNT(1) GROUP BY v ORDER BY v))[v], -- Note the need of an aggregate metric
ARRAY_SORT1 := ARRAY_SORT(BASE),
ARRAY_SORT2 := ZIP(BASE as v).TRANSFORM(SELECT v ORDER BY v).UNWRAP(),
ARRAY_NORM1 := NORMALIZE(BASE),
ARRAY_NORM2 := ZIP(BASE as v, REPEAT(ARRAY_SUM(BASE), ARRAY_LENGTH(BASE)) as s).TRANSFORM(SELECT v / s).UNWRAP(),
;| BASE ARRAY | ARRAY_SUM1 BIGINT | ARRAY_SUM2 BIGINT | ARRAY_AVG1 VARCHAR | ARRAY_AVG2 VARCHAR | ARRAY_MIN1 BIGINT | ARRAY_MIN2 BIGINT | ARRAY_MAX1 BIGINT | ARRAY_MAX2 BIGINT | ARRAY_DIST1 ARRAY | ARRAY_DIST2 ARRAY | ARRAY_SORT1 ARRAY | ARRAY_SORT2 ARRAY | ARRAY_NORM1 ARRAY | ARRAY_NORM2 ARRAY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| [1, 2, 3, 4] | 10 | 10 | 2.5 | 2.5 | 1 | 1 | 4 | 4 | [1, 2, 3, 4] | [1, 2, 3, 4] | [1, 2, 3, 4] | [1, 2, 3, 4] | [0.1, 0.2, 0.3, 0.4] | [0.1, 0.2, 0.3, 0.4] |
Joins vs window functions inside TRANSFORM
When you need to add an aggregate value back to each row (like "count of items in this category" next to each item), there are two approaches:
- TRANSFORM + GROUP BY, then ARRAY_MERGE JOINED ON — aggregate separately, then merge the result back by key
- TRANSFORM with window functions — compute the aggregate inline using
OVER(PARTITION BY ...)
Both produce the same result. Window functions are more concise; the merge approach is more explicit and easier to debug.
WITH
KEY_ID := INPUT(BIGINT),
CATEGORIES := ARRAY[ROW(1, 'cat' || KEY_ID::VARCHAR as field_2), ROW(2, 'cat2'), ROW(3, 'cat3')]::ROW(id BIGINT, category VARCHAR)[],
CATEGORIES_GR := CATEGORIES.TRANSFORM(SELECT CATEGORY, COUNT(1) GROUP BY CATEGORY AS NUM ORDER BY CATEGORY),
CATEGORIES_MG := ARRAY_MERGE(CATEGORIES, CATEGORIES_GR JOIN ON category),
CATEGORIES_WF := CATEGORIES.TRANSFORM(SELECT ID, CATEGORY, COUNT(1) OVER(PARTITION BY CATEGORY) AS NUM ORDER BY ID),
SELECT
KEY_ID,
CATEGORIES,
CATEGORIES_GR,
CATEGORIES_MG,
CATEGORIES_WF -- BUG because of PARTITION BY CATEGORY instead of PARTITION BY KEY_ID, CATEGORY
FOR
KEY_ID := BIND_VALUES(ARRAY[1,2,3])
;| KEY_ID BIGINT | CATEGORIES VARCHAR | CATEGORIES_GR VARCHAR | CATEGORIES_MG VARCHAR | CATEGORIES_WF VARCHAR |
|---|---|---|---|---|
| 1 | [{id: 1, category: cat1}, {id: 2, category: cat2}, {id: 3, category: cat3}] | [{category: cat1, num: 1}, {category: cat2, num: 1}, {category: cat3, num: 1}] | [{id: 1, category: cat1, num: 1}, {id: 2, category: cat2, num: 1}, {id: 3, category: cat3, num: 1}] | [{id: 1, category: cat1, num: 1}, {id: 2, category: cat2, num: 1}, {id: 3, category: cat3, num: 1}] |
| 2 | [{id: 1, category: cat2}, {id: 2, category: cat2}, {id: 3, category: cat3}] | [{category: cat2, num: 2}, {category: cat3, num: 1}] | [{id: 1, category: cat2, num: 2}, {id: 2, category: cat2, num: 2}, {id: 3, category: cat3, num: 1}] | [{id: 1, category: cat2, num: 2}, {id: 2, category: cat2, num: 2}, {id: 3, category: cat3, num: 1}] |
| 3 | [{id: 1, category: cat3}, {id: 2, category: cat2}, {id: 3, category: cat3}] | [{category: cat2, num: 1}, {category: cat3, num: 2}] | [{id: 1, category: cat3, num: 2}, {id: 2, category: cat2, num: 1}, {id: 3, category: cat3, num: 2}] | [{id: 1, category: cat3, num: 2}, {id: 2, category: cat2, num: 1}, {id: 3, category: cat3, num: 2}] |
Matrix operations
Arrays of arrays (matrices) can be manipulated by combining ZIP(), TRANSFORM(), and UNWRAP(). The trick is to ZIP the matrix into an array of rows where each row contains one sub-array, then transform each sub-array independently.
WITH
INP := INPUT(BIGINT)
SELECT
INP,
MATRIX := ARRAY[ARRAY[INP,2,3], ARRAY[4,5,6], ARRAY[7,8,9]],
TWO_FIRST_ROWS := MATRIX[1:2],
TWO_FIRST_COLUMNS := ZIP(MATRIX as value).TRANSFORM(SELECT value[1:2]).UNWRAP(),
SUM_ALL_ELTS := ZIP(MATRIX as value).TRANSFORM(SELECT SUM(ARRAY_SUM(value))).UNWRAP_ONE(),
SUM_DIAGONAL := ZIP(REPEAT(INP, ARRAY_LENGTH(MATRIX)) as inp, MATRIX as value).TRANSFORM(SELECT SUM(element_at_pos(value, ROW_NUMBER() OVER (PARTITION BY inp)))).UNWRAP_ONE(),
FOR
INP := BIND_VALUES(SEQUENCE(1,3))| INP BIGINT | MATRIX ARRAY | TWO_FIRST_ROWS ARRAY | TWO_FIRST_COLUMNS ARRAY | SUM_ALL_ELTS BIGINT | SUM_DIAGONAL BIGINT |
|---|---|---|---|---|---|
| 1 | [[1, 2, 3], [4, 5, 6], [7, 8, 9]] | [[1, 2, 3], [4, 5, 6]] | [[1, 2], [4, 5], [7, 8]] | 45 | 15 |
| 2 | [[2, 2, 3], [4, 5, 6], [7, 8, 9]] | [[2, 2, 3], [4, 5, 6]] | [[2, 2], [4, 5], [7, 8]] | 46 | 16 |
| 3 | [[3, 2, 3], [4, 5, 6], [7, 8, 9]] | [[3, 2, 3], [4, 5, 6]] | [[3, 2], [4, 5], [7, 8]] | 47 | 17 |
See also
- EXTEND() — enriching arrays with cross-entity data
- TRANSFORM() — full syntax reference
- Lookup with indexes — efficient repeated lookups
- Functions reference — complete list of array-of-rows functions