Batch analytics

By defining features on top of analytics tables (BigQuery, Trino, DuckDB), FeatureMesh acts as a semantic layer for your data warehouse. But unlike in a traditional semantic layer, features compose: any feature can build on any other feature, and VARIANT() lets you test alternatives without duplicating logic.

How it works

Register your data sources in the feature registry:

```sql
CREATE FEATURES AS
SELECT
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    CURRENT_DATE := INPUT(DATE),

    TABLES.CUSTOMERS_HISTORY := INLINE_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        country_code VARCHAR(2),
        orders_details ARRAY(ROW(order_id BIGINT, amount DECIMAL, order_date DATE))
        FROM TABLE(analytics.customers_history)
    ),
    COUNTRY_CODE := TABLES.CUSTOMERS_HISTORY[country_code],
    ORDERS_DETAILS := TABLES.CUSTOMERS_HISTORY[orders_details]
;
```

Then define the features you need:

```sql
CREATE FEATURES AS
    LIFETIME_VALUE := ARRAY_SUM(ORDERS_DETAILS[amount]),
    DAYS_SINCE_LAST_ORDER := DATE_DIFF(CURRENT_DATE, ARRAY_MAX(ORDERS_DETAILS[order_date]), 'day'),
    AD_THRESHOLD := 90,
    CUSTOMER_SEGMENT := CASE
        WHEN DAYS_SINCE_LAST_ORDER < AD_THRESHOLD THEN
            CASE
                WHEN LIFETIME_VALUE > 10000 THEN 'VIP'
                WHEN LIFETIME_VALUE > 2000 THEN 'Premium'
                ELSE 'Standard'
            END
        ELSE 'Inactive'
    END
;
```

Now anyone can query these features from anywhere (Python, SQL clients, BI tools):

```sql
SELECT
    CUSTOMER_ID,
    CUSTOMER_SEGMENT
FOR
    CUSTOMER_ID := BIND_COLUMNS(id FROM SQL(`SELECT id FROM customers WHERE country = 'US'`))
;
```

FeatureQL transpiles to optimized SQL that runs natively on your warehouse. No data movement, no new infrastructure.
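
To make the transpilation point concrete, here is the general shape of plain SQL that the CUSTOMER_SEGMENT feature could compile down to, run against an in-memory SQLite table as a stand-in for the warehouse. Table names, sample rows, and the hardcoded as-of date are invented for this sketch; it illustrates the shape of the output, not FeatureMesh's actual generated SQL.

```python
import sqlite3

# Stand-in warehouse: one row per order, rather than the nested
# orders_details array used in the registry above.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount REAL, order_date TEXT);
    INSERT INTO orders VALUES
        (1, 6000, '2026-03-01'),  -- recent, high lifetime value
        (1, 5000, '2026-02-15'),
        (2,  500, '2025-01-01');  -- long inactive
""")

# Hand-written equivalent of LIFETIME_VALUE, DAYS_SINCE_LAST_ORDER and
# CUSTOMER_SEGMENT, with CURRENT_DATE bound to a fixed as-of date.
SEGMENT_SQL = """
    SELECT customer_id,
           CASE
               WHEN days_since_last_order < 90 THEN
                   CASE
                       WHEN lifetime_value > 10000 THEN 'VIP'
                       WHEN lifetime_value > 2000  THEN 'Premium'
                       ELSE 'Standard'
                   END
               ELSE 'Inactive'
           END AS customer_segment
    FROM (
        SELECT customer_id,
               SUM(amount) AS lifetime_value,
               CAST(julianday('2026-03-18') - julianday(MAX(order_date)) AS INTEGER)
                   AS days_since_last_order
        FROM orders
        GROUP BY customer_id
    ) AS per_customer
"""
segments = dict(con.execute(SEGMENT_SQL).fetchall())
print(segments[1], segments[2])  # VIP Inactive
```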

Composition is where it gets interesting

So far, this looks like what any semantic layer can do. The difference is what comes next.

Reuse customer segment to qualify orders, splitting revenue by segment:

```sql
WITH
    ORDER_SEGMENT := RELATED(CUSTOMER_SEGMENT VIA CUSTOMER_ID),
SELECT
    ORDER_SEGMENT,
    SUM(AMOUNT) AS REVENUE
GROUP BY ORDER_SEGMENT;
```
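
Conceptually, RELATED(CUSTOMER_SEGMENT VIA CUSTOMER_ID) is a join from the order grain up to the customer-level feature. A rough sketch of the hand-written equivalent, with invented table names, SQLite standing in for the warehouse, and the segment feature pre-materialized as a table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    -- stand-in for the customer-level CUSTOMER_SEGMENT feature, already computed
    CREATE TABLE customer_segment (customer_id INTEGER, segment TEXT);
    INSERT INTO orders VALUES (10, 1, 100), (11, 1, 50), (12, 2, 25);
    INSERT INTO customer_segment VALUES (1, 'VIP'), (2, 'Inactive');
""")

# The join that RELATED(... VIA CUSTOMER_ID) would spare you writing by hand.
revenue = dict(con.execute("""
    SELECT cs.segment, SUM(o.amount) AS revenue
    FROM orders o
    JOIN customer_segment cs USING (customer_id)
    GROUP BY cs.segment
""").fetchall())
print(revenue["VIP"], revenue["Inactive"])  # 150.0 25.0
```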

Create a variant to test a different activity threshold:

```sql
WITH
    ORDER_SEGMENT_30DAYS := VARIANT(ORDER_SEGMENT REPLACING AD_THRESHOLD WITH 30),
SELECT
    ORDER_SEGMENT_30DAYS,
    SUM(AMOUNT) AS REVENUE
GROUP BY ORDER_SEGMENT_30DAYS;
```
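
One way to picture VARIANT() is as a substitution in the feature dependency graph: the same graph is evaluated, with a single node swapped out. The toy model below uses an invented data model to show that idea; it is not how FeatureMesh is implemented internally.

```python
# Features as graph nodes: name -> (list of dependencies, compute function).
FEATURES = {
    "AD_THRESHOLD": ([], lambda: 90),
    "DAYS_SINCE_LAST_ORDER": ([], lambda: 40),
    "CUSTOMER_SEGMENT": (
        ["DAYS_SINCE_LAST_ORDER", "AD_THRESHOLD"],
        lambda days, threshold: "Active" if days < threshold else "Inactive",
    ),
}

def evaluate(name, overrides=None):
    """Resolve a feature by recursively evaluating its dependencies,
    consulting `overrides` first -- the essence of VARIANT(... REPLACING ...)."""
    graph = {**FEATURES, **(overrides or {})}
    deps, fn = graph[name]
    return fn(*(evaluate(d, overrides) for d in deps))

base = evaluate("CUSTOMER_SEGMENT")                                        # 40 < 90
variant = evaluate("CUSTOMER_SEGMENT", {"AD_THRESHOLD": ([], lambda: 30)})  # 40 >= 30
print(base, variant)  # Active Inactive
```

The original CUSTOMER_SEGMENT definition is untouched; the variant only changes which node is looked up when the graph is walked.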

Or make the threshold dynamic per country:

```sql
WITH
    AD_THRESHOLD_BY_COUNTRY := CASE
        WHEN COUNTRY_CODE = 'US' THEN 30
        WHEN COUNTRY_CODE = 'FR' THEN 60
        ELSE 90
    END,
    ORDER_SEGMENT_BY_COUNTRY := VARIANT(ORDER_SEGMENT REPLACING AD_THRESHOLD WITH AD_THRESHOLD_BY_COUNTRY),
SELECT
    ORDER_SEGMENT_BY_COUNTRY,
    SUM(AMOUNT) AS REVENUE
GROUP BY ORDER_SEGMENT_BY_COUNTRY;
```

None of this required modifying the original CUSTOMER_SEGMENT definition. The three examples build on each other, reuse the same dependency graph, and would each require significantly more hand-written SQL to express directly. This depth of composition is what separates FeatureMesh from traditional semantic layers.

How it compares

Traditional semantic layers (LookML, Cube, dbt metrics) define metrics in YAML or config files. They give you named metrics, but not composable ones: you cannot build a metric on top of another metric, swap a dependency to test an alternative, or cross entity boundaries with a single reference.

FeatureMesh also differs in how it validates changes. The registry checks type compatibility, dependency existence, and consistency on every mutation. You cannot persist a feature that would break something downstream. Traditional semantic layers rely on CI/CD and code review to catch these errors; FeatureMesh catches them at the language level.
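
As a sketch of the idea behind registry-level validation: a mutation is rejected if it would leave a downstream feature without one of its dependencies. The registry structure below is invented for illustration and is not FeatureMesh's actual API.

```python
# feature name -> list of direct dependencies
registry = {
    "ORDERS_DETAILS": [],
    "AD_THRESHOLD": [],
    "LIFETIME_VALUE": ["ORDERS_DETAILS"],
    "CUSTOMER_SEGMENT": ["LIFETIME_VALUE", "AD_THRESHOLD"],
}

def drop_feature(registry, name):
    """Remove a feature, refusing if anything downstream still depends on it."""
    dependents = sorted(f for f, deps in registry.items() if name in deps)
    if dependents:
        raise ValueError(f"cannot drop {name}: required by {dependents}")
    return {f: deps for f, deps in registry.items() if f != name}

try:
    drop_feature(registry, "AD_THRESHOLD")  # CUSTOMER_SEGMENT depends on it
except ValueError as err:
    print(err)  # cannot drop AD_THRESHOLD: required by ['CUSTOMER_SEGMENT']
```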

Adoption is gradual. FeatureMesh works alongside your existing SQL. Start with one feature, add more over time.

Last update at: 2026/03/18 16:18:57