Batch analytics
By defining features on top of analytics tables (BigQuery, Trino, DuckDB), FeatureMesh acts as a semantic layer for your data warehouse. But unlike traditional semantic layers, features compose: any feature can build on any other feature, and VARIANT() lets you test alternatives without duplicating logic.
How it works
Register your data sources in the feature registry:
CREATE FEATURES AS
SELECT
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    CURRENT_DATE := INPUT(DATE),
    TABLES.CUSTOMERS_HISTORY := INLINE_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        country_code VARCHAR(2),
        orders_details ARRAY(ROW(order_id BIGINT, amount DECIMAL, order_date DATE))
        FROM TABLE(analytics.customers_history)
    ),
    COUNTRY_CODE := TABLES.CUSTOMERS_HISTORY[country_code],
    ORDERS_DETAILS := TABLES.CUSTOMERS_HISTORY[orders_details]
;

Then define the features you need:
CREATE FEATURES AS
    LIFETIME_VALUE := ARRAY_SUM(ORDERS_DETAILS[amount]),
    DAYS_SINCE_LAST_ORDER := DATE_DIFF(CURRENT_DATE, ARRAY_MAX(ORDERS_DETAILS[order_date]), 'day'),
    AD_THRESHOLD := 90,
    CUSTOMER_SEGMENT := CASE
        WHEN DAYS_SINCE_LAST_ORDER < AD_THRESHOLD THEN
            CASE
                WHEN LIFETIME_VALUE > 10000 THEN 'VIP'
                WHEN LIFETIME_VALUE > 2000 THEN 'Premium'
                ELSE 'Standard'
            END
        ELSE 'Inactive'
    END
;

Now anyone can query these features from anywhere (Python, SQL clients, BI tools):
SELECT
    CUSTOMER_ID,
    CUSTOMER_SEGMENT
FOR
    CUSTOMER_ID := BIND_COLUMNS(id FROM SQL(`SELECT id FROM customers WHERE country = 'US'`))

FeatureQL transpiles to optimized SQL that runs natively on your warehouse. No data movement, no new infrastructure.
Composition is where it gets interesting
So far, this looks like what any semantic layer can do. The difference is what comes next.
Reuse CUSTOMER_SEGMENT to label each order with its customer's segment, splitting revenue by segment:
WITH
    ORDER_SEGMENT := RELATED(CUSTOMER_SEGMENT VIA CUSTOMER_ID),
SELECT
    ORDER_SEGMENT,
    SUM(AMOUNT) AS REVENUE
GROUP BY ORDER_SEGMENT;

Create a variant to test a different activity threshold:
WITH
    ORDER_SEGMENT_30DAYS := VARIANT(ORDER_SEGMENT REPLACING AD_THRESHOLD WITH 30),
SELECT
    ORDER_SEGMENT_30DAYS,
    SUM(AMOUNT) AS REVENUE
GROUP BY ORDER_SEGMENT_30DAYS;

Or make the threshold dynamic per country:
WITH
    AD_THRESHOLD_BY_COUNTRY := CASE
        WHEN COUNTRY_CODE = 'US' THEN 30
        WHEN COUNTRY_CODE = 'FR' THEN 60
        ELSE 90
    END,
    ORDER_SEGMENT_BY_COUNTRY := VARIANT(ORDER_SEGMENT REPLACING AD_THRESHOLD WITH AD_THRESHOLD_BY_COUNTRY),
SELECT
    ORDER_SEGMENT_BY_COUNTRY,
    SUM(AMOUNT) AS REVENUE
GROUP BY ORDER_SEGMENT_BY_COUNTRY;

None of this required modifying the original CUSTOMER_SEGMENT definition. The three examples build on each other, reuse the same dependency graph, and would each take significantly more SQL to express. This depth of composition is what separates FeatureMesh from traditional semantic layers.
How it compares
Traditional semantic layers (LookML, Cube, dbt metrics) define metrics in YAML or config files. They give you named metrics, but not composable ones: you cannot build a metric on top of another metric, swap a dependency to test an alternative, or cross entity boundaries with a single reference.
FeatureMesh also differs in how it validates changes. The registry checks type compatibility, dependency existence, and consistency on every mutation. You cannot persist a feature that would break something downstream. Traditional semantic layers rely on CI/CD and code review to catch these errors; FeatureMesh catches them at the language level.
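A toy illustration of that registry-level guarantee (hypothetical names; FeatureMesh's actual registry tracks much more, including types): refuse to persist a feature whose dependencies do not exist, and refuse to drop a feature that something downstream still uses.

```python
# Toy dependency-checked registry; illustrative only, not FeatureMesh's implementation.
class FeatureRegistry:
    def __init__(self):
        self._deps = {}  # feature name -> set of dependency names

    def create(self, name, deps=()):
        missing = [d for d in deps if d not in self._deps]
        if missing:
            raise ValueError(f"cannot create {name}: unknown dependencies {missing}")
        self._deps[name] = set(deps)

    def drop(self, name):
        dependents = [f for f, ds in self._deps.items() if name in ds]
        if dependents:
            raise ValueError(f"cannot drop {name}: still used by {dependents}")
        del self._deps[name]

reg = FeatureRegistry()
reg.create("LIFETIME_VALUE")
reg.create("AD_THRESHOLD")
reg.create("CUSTOMER_SEGMENT", deps=["LIFETIME_VALUE", "AD_THRESHOLD"])

try:
    reg.drop("AD_THRESHOLD")  # rejected: CUSTOMER_SEGMENT depends on it
except ValueError as e:
    print(e)
```

Because every mutation is validated against the dependency graph, a breaking change is rejected at write time instead of surfacing later in a failed dashboard or pipeline.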
Adoption can be gradual: FeatureMesh works alongside your existing SQL. Start with one feature and add more over time.