Data Modeling Patterns

FeatureMesh works with both star-schema and One Big Table (OBT) data models. The choice affects how you write your features — star-schema relies on RELATED() to traverse entities, while OBT gives you pre-aggregated arrays that you query directly with TRANSFORM().

Star-schema: join at query time

In a star-schema, facts and dimensions live in separate tables. You use RELATED() to join them when building features:

-- Star-schema: look up store category for each order
ORDER_CATEGORY := RELATED(STORE_CATEGORY VIA ORDER_SOURCE[order_store_id])
sql

This is the natural fit when your data warehouse follows a normalized or Kimball-style design. FeatureQL resolves the joins automatically based on entity annotations.

OBT: pre-aggregated arrays

In an OBT model, related facts are already aggregated into arrays on the entity row. A customer row might contain an order_history column with all their orders as an array of rows. You query it directly with TRANSFORM():

-- OBT: filter and aggregate directly on the pre-materialized array
JULY_REVENUE := order_history.TRANSFORM(
    SELECT SUM(amount) WHERE order_date >= DATE '2024-07-01'
        AND order_date < DATE '2024-08-01'
).UNWRAP_ONE()
sql

An OBT column is logically equivalent to RELATED(ARRAY_AGG(fact) VIA entity_id) — it just skips the join at query time because the aggregation was done during materialization.

Choosing between the two approaches

Both approaches can coexist. A common pattern is to use EXTEND() for ad-hoc enrichment (star-schema style) while also maintaining pre-materialized OBT columns for heavily queried paths.

Direct aggregation with EXTEND()

You can aggregate across entities in a single expression using EXTEND():

REVENUE_BY_COUNTRY := EXTEND(
    ROW(order_country)
    WITH SUM(price) GROUP BY order_country AS revenue_by_country
    VIA order_country BIND TO order_country
)[revenue_by_country]
sql

Flexible alternative: persist the array, then TRANSFORM()

The same result can be achieved by first building an array of rows, then aggregating:

-- Build the array (can be persisted as a feature)
COUNTRY_ORDERS := EXTEND(
    ROW(country)
    WITH ARRAY_AGG(ROW(order_id, price)) GROUP BY order_country AS country_orders
    VIA country BIND TO order_country
)[country_orders],

-- Aggregate on the persisted array
REVENUE_BY_COUNTRY := COUNTRY_ORDERS.TRANSFORM(SELECT SUM(price)).UNWRAP_ONE()
sql

The second approach is more verbose, but the intermediate COUNTRY_ORDERS array can be persisted as a feature and reused across many downstream computations — filtering, grouping, time-windowing — without re-joining the source tables each time.

When to use which

ScenarioApproach
Ad-hoc exploration, few queriesStar-schema with RELATED() / EXTEND()
Heavily queried entity enrichmentPre-materialized OBT arrays with TRANSFORM()
Both exploration and productionStar-schema for development, persist key arrays as OBT for production

See Leverage Entity-Centered Modeling for a deeper dive on building and maintaining OBT tables.

Last update at: 2026/03/18 16:18:57
Last updated: 2026-03-18 16:19:34