# Batch analytics

By defining features on top of analytics tables (BigQuery, Trino, DuckDB...), FeatureMesh becomes a semantic layer for your data warehouse.

## What you get automatically

**No more metric drift:** Define `customer_lifetime_value` once in FeatureQL. Every query, dashboard, and model uses the exact same logic.

**Works with your stack:** FeatureQL compiles to native SQL. Your BI tools, notebooks, and dbt models don't need to change.

**Build on what exists:** Combine existing features into new ones. When logic changes, update one definition instead of hunting through queries.
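
As a minimal sketch of that composition (`AVG_ORDER_VALUE` and `ARRAY_LENGTH` are assumptions for illustration; `LIFETIME_VALUE` and `ORDERS_DETAILS` are defined in the next section):

```sql
-- A minimal sketch: a derived feature built from existing ones.
-- ARRAY_LENGTH is an assumed function; LIFETIME_VALUE and ORDERS_DETAILS
-- are defined under "How it works" below.
CREATE FEATURES AS
    AVG_ORDER_VALUE := LIFETIME_VALUE / ARRAY_LENGTH(ORDERS_DETAILS)
;
```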

## How it works

First, register your data sources in the feature registry:

```sql
CREATE FEATURES AS
SELECT
    -- Declare the entity and the inputs that queries will bind
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    CURRENT_DATE := INPUT(DATE),

    -- Map a warehouse table into the registry; customer_id is the key
    -- that ties rows to the CUSTOMERS entity
    TABLES.CUSTOMERS_HISTORY := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BOUND TO CUSTOMER_ID,
        country_code VARCHAR(2),
        orders_details ARRAY(ROW(order_id BIGINT, amount DECIMAL, order_date DATE))
        FROM analytics.customers_history
    ),

    -- Expose table columns as features
    COUNTRY_CODE := TABLES.CUSTOMERS_HISTORY[country_code],
    ORDERS_DETAILS := TABLES.CUSTOMERS_HISTORY[orders_details]
;
```

Then define the features you need:

```sql
CREATE FEATURES AS
    -- Total spend across all of a customer's orders
    LIFETIME_VALUE := ARRAY_SUM(ORDERS_DETAILS[amount]),
    -- Recency, measured against the CURRENT_DATE input
    DAYS_SINCE_LAST_ORDER := DATE_DIFF(CURRENT_DATE, ARRAY_MAX(ORDERS_DETAILS[order_date]), 'day'),
    -- Activity threshold in days; constants are features too
    AD_THRESHOLD := 90,
    CUSTOMER_SEGMENT := CASE
        WHEN DAYS_SINCE_LAST_ORDER < AD_THRESHOLD THEN
            CASE
                WHEN LIFETIME_VALUE > 10000 THEN 'VIP'
                WHEN LIFETIME_VALUE > 2000 THEN 'Premium'
                ELSE 'Standard'
            END
        ELSE 'Inactive'
    END
;
```

Now anyone can query these features from anywhere (Python, SQL clients, BI tools):

```sql
SELECT
    CUSTOMER_ID,
    CUSTOMER_SEGMENT
FOR
    -- FOR binds the declared inputs; here, the set of customers to score
    CUSTOMER_ID := BIND_SQL(`SELECT id FROM customers WHERE country = 'US'`);
```
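
Since `CURRENT_DATE` was declared as an input too, the `FOR` clause can presumably bind it as well, for example to compute features as of a fixed date. A sketch extrapolating the syntax above, not confirmed FeatureQL:

```sql
-- A sketch, extrapolated from the example above: binding CURRENT_DATE
-- explicitly to pin a point-in-time value.
SELECT
    CUSTOMER_ID,
    DAYS_SINCE_LAST_ORDER
FOR
    CUSTOMER_ID := BIND_SQL(`SELECT id FROM customers`),
    CURRENT_DATE := DATE '2025-01-01';
```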

FeatureQL transpiles to optimized SQL that runs natively on your warehouse. No data movement, no new infrastructure.
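
For intuition, here is a rough, hand-written approximation of what the segment query above might compile to on a Trino-style engine; the actual output depends on the target dialect and FeatureMesh's optimizer:

```sql
-- Illustrative only: an approximation, not FeatureMesh's actual output.
SELECT
    h.customer_id,
    CASE
        WHEN DATE_DIFF('day', MAX(o.order_date), CURRENT_DATE) < 90 THEN
            CASE
                WHEN SUM(o.amount) > 10000 THEN 'VIP'
                WHEN SUM(o.amount) > 2000 THEN 'Premium'
                ELSE 'Standard'
            END
        ELSE 'Inactive'
    END AS customer_segment
FROM analytics.customers_history AS h
CROSS JOIN UNNEST(h.orders_details) AS o (order_id, amount, order_date)
WHERE h.customer_id IN (SELECT id FROM customers WHERE country = 'US')
GROUP BY h.customer_id;
```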

## Deep composition changes everything

So far, this looks like what any semantic layer can do. The difference is what comes next.

You can reuse `CUSTOMER_SEGMENT` to qualify orders, splitting revenue by segment:

```sql
WITH
    -- Pull the customer-level segment onto each order via the entity link;
    -- AMOUNT is assumed to be an order-level feature
    ORDER_SEGMENT := RELATED(CUSTOMER_SEGMENT VIA CUSTOMER_ID),
SELECT
    ORDER_SEGMENT,
    SUM(AMOUNT) AS REVENUE
GROUP BY ORDER_SEGMENT;
```

You can also create variants of existing features by replacing any of their dependencies, such as the threshold:

```sql
WITH
    ORDER_SEGMENT_30DAYS := VARIANT(ORDER_SEGMENT REPLACING AD_THRESHOLD WITH 30),
SELECT
    ORDER_SEGMENT_30DAYS,
    SUM(AMOUNT) AS REVENUE
GROUP BY ORDER_SEGMENT_30DAYS;
```

Or even implement dynamic logic for different countries:

```sql
WITH
    AD_THRESHOLD_BY_COUNTRY := CASE
        WHEN COUNTRY_CODE = 'US' THEN 30
        WHEN COUNTRY_CODE = 'FR' THEN 60
        ELSE 90
    END,
    ORDER_SEGMENT_BY_COUNTRY := VARIANT(ORDER_SEGMENT REPLACING AD_THRESHOLD WITH AD_THRESHOLD_BY_COUNTRY),
SELECT
    ORDER_SEGMENT_BY_COUNTRY,
    SUM(AMOUNT) AS REVENUE
GROUP BY ORDER_SEGMENT_BY_COUNTRY;
```

## Better than traditional semantic layers

**Registry-based, not YAML files:** Traditional semantic layers define metrics in YAML files that go through git workflows and deployments. FeatureMesh lets you add features directly via FeatureQL queries. The registry validates consistency on every change, so you can't accidentally break downstream dependencies.
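
For example, publishing a new feature is just another statement against the registry; a minimal sketch reusing the syntax above (`IS_VIP` is a hypothetical name):

```sql
-- A minimal sketch: a new feature registered directly, no YAML file or
-- deployment step. IS_VIP is hypothetical, built on CUSTOMER_SEGMENT.
CREATE FEATURES AS
    IS_VIP := CUSTOMER_SEGMENT = 'VIP'
;
```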

**Full query flexibility:** Tools like LookML give you predefined metrics but lock you into their query syntax. FeatureMesh lets you write any SQL you want while still referencing consistent feature definitions.
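
A sketch of what that flexibility looks like, assuming standard aggregates and `HAVING` work as in plain SQL:

```sql
-- A sketch: ad hoc SQL over governed feature definitions. AVG and HAVING
-- are standard SQL; their support here is assumed, not shown elsewhere.
SELECT
    COUNTRY_CODE,
    AVG(LIFETIME_VALUE) AS AVG_LTV
GROUP BY COUNTRY_CODE
HAVING AVG(LIFETIME_VALUE) > 2000;
```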

**Gradual adoption:** Most semantic layers require you to model your entire schema upfront. FeatureMesh works alongside your existing SQL. Start with one metric, add more over time. No big-bang migration.
