FeatureQL in your organization

Adopting a new language is a serious decision. This page makes the case for adopting FeatureQL: when it is worth the investment, where it outperforms SQL for organizational use, and where SQL remains the right tool.

The comparison is structured around the properties that matter when business logic is shared across teams and systems: correctness, composability, governance, readability, learnability, ecosystem integration, and AI readiness.

Correctness and safety

SQL teams are not working without safety nets. Schema constraints, foreign keys, data modeling, linters, dbt tests, and code review all contribute to correctness. Those practices are real and effective.

What they cannot catch is a class of errors that live below the language level. If a customer ID column and an order ID column are both integers, SQL lets you join them. The query compiles, runs, and produces silently wrong results. Implicit type coercion varies between backends: a query that widens BIGINT to DOUBLE in DuckDB might truncate precision in BigQuery. These are not problems of practice; they are problems of language design. No amount of testing or review eliminates them entirely, because the language itself does not distinguish between a customer ID and an order ID.

FeatureQL moves this safety into the language. Entity annotations (BIGINT#CUSTOMERS vs BIGINT#ORDERS) are not configuration or documentation; they are part of the type system, checked by the compiler. This is different from semantic layers like LookML or dbt Semantic Layer, which declare relationships in YAML or config files but do not enforce them at the expression level. In FeatureQL, CUSTOMER_ID.RELATED(ORDER_CITY) fails to compile if the entity annotations don't match. No implicit coercion means a feature produces the same result on every backend. The registry validates type compatibility, dependency existence, and graph acyclicity on every change, before anything runs.
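A hedged sketch of what this looks like in practice (the declaration syntax here is illustrative, and SUPPLIER_NAME is a hypothetical column with no declared relationship to CUSTOMERS; the annotation behavior follows the rules described above):

```
-- Same physical type, different entities: the compiler keeps them apart.
CUSTOMER_ID : BIGINT#CUSTOMERS
ORDER_ID    : BIGINT#ORDERS

CUSTOMER_ID.RELATED(ORDER_CITY)     -- compiles: entity annotations resolve the join
CUSTOMER_ID.RELATED(SUPPLIER_NAME)  -- compile error: no relationship declared
                                    -- between CUSTOMERS and the supplier entity
```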

The practical consequence: when a feature compiles and the registry accepts it, the entire dependency tree is valid and will produce consistent results across backends.

Composability

SQL has real reuse mechanisms: views, CTEs, macros, dbt models, UDFs, stored procedures. These are genuine tools that experienced teams use effectively. But they all operate at the table level or the query level. A view is a table. A dbt model is a table. A CTE is query scoped. A UDF is backend specific. You can reference these units, but you cannot take one column from one definition, compose it with a column from another definition, swap a dependency to test an alternative, and have the system resolve the data fetching automatically.

Feature stores like Tecton and metrics layers like Transform address parts of this problem. They manage features or metrics as reusable units. But they treat the language as a given (you write features in Python or SQL) and focus on the infrastructure: storage, serving, versioning.

FeatureQL's difference is that composability is in the language itself, not in a framework around it. Each feature is a node in a typed DAG. Any feature can reference any other feature, across entity boundaries, with automatic join resolution from the type system. VARIANT() replaces a named dependency and the entire downstream graph recomputes. This is not a framework pattern that can be broken by undisciplined code; it is enforced by the compiler.

The reason this matters is that the properties reinforce each other. Purity makes composition safe (no hidden side effects when you combine features). Entity annotations make composition correct (the compiler catches join mismatches). The registry makes composition discoverable (you can find and inspect what others have published). Remove any one of them and implement the rest as a framework on top of SQL, and gaps open that no amount of tooling can close. This is why bundling these properties into a language, rather than layering them as tools, is the core design decision.

A data engineer publishes CUSTOMER_LTV. An analyst builds CUSTOMER_SEGMENT on top of it. A data scientist creates VARIANT(CUSTOMER_SEGMENT REPLACING AD_THRESHOLD WITH 30) to test an alternative. A product manager uses the segment to qualify orders with RELATED(CUSTOMER_SEGMENT VIA CUSTOMER_ID). Each person builds on the others' work without coordination, without copying logic, and without understanding the underlying data sources.
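Sketched in FeatureQL, that chain might look like the following (the conditional syntax, segment labels, and threshold value are illustrative; VARIANT and RELATED behave as described above):

```
CUSTOMER_LTV     := CUSTOMER_ID.RELATED(SUM(ORDER_AMOUNT) GROUP BY ORDER_CUSTOMER_ID)
CUSTOMER_SEGMENT := IF(CUSTOMER_LTV > AD_THRESHOLD, 'HIGH_VALUE', 'STANDARD')
SEGMENT_TEST     := VARIANT(CUSTOMER_SEGMENT REPLACING AD_THRESHOLD WITH 30)
ORDER_SEGMENT    := RELATED(CUSTOMER_SEGMENT VIA CUSTOMER_ID)
```

Each definition references only names, never tables; the compiler resolves the joins and data fetching for the whole chain.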

Governance and operability

SQL the language has no built in governance. But SQL teams are not ungoverned. dbt provides versioned models and lineage. Git provides history. CI pipelines provide validation. Catalog systems like DataHub and Amundsen provide discoverability. These tools work, and many organizations run successfully with them.

The gap is that governance lives outside the language. A dbt model is a SQL file in git. You can version it, review it, and test it. But dbt cannot prevent you from deploying a model that breaks a downstream model's type expectations, because SQL does not encode those expectations. The validation happens at runtime or in CI, not at the language level. Lineage is reconstructed from metadata, not enforced by the compiler. The audit trail is in git commits, not in the logic system itself.

FeatureQL moves governance into the language runtime. The registry validates type compatibility and dependency consistency on every mutation, before it takes effect. You cannot persist a feature that would break a downstream dependency. Feature lifecycle management (DEV, BETA, ACTIVE, DEPRECATED) prevents promoting unstable logic to production. Dependency tracking (IS_UPSTREAM_OF, IS_DOWNSTREAM_OF) is built in, not reconstructed.
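As a sketch, registry interactions might look like this (the statement syntax is hypothetical; the lifecycle states and dependency relations are the ones named above):

```
-- Promotion is gated: rejected if any downstream feature's
-- type expectations would break.
PROMOTE CUSTOMER_LTV TO ACTIVE

-- Lineage is a first-class query, not reconstructed metadata.
SHOW FEATURES WHERE IS_DOWNSTREAM_OF(CUSTOMER_LTV)
```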

This is a tradeoff. The benefit is that governance is guaranteed, not optional. The cost is a dependency on the registry as a runtime component. For teams where consistency and auditability are strict requirements (regulated industries, shared business logic across many systems), the tradeoff favors FeatureQL. For teams with simpler needs, dbt + git may be sufficient.

Readability

Simple SQL is extremely readable. Complex SQL is famously not. A 200 line query with five CTEs, three self joins, and window functions is hard for even experienced engineers to follow. There is no built in mechanism for naming intermediate calculations at the column level, and complexity grows non linearly with query length.

FeatureQL features are named, isolated, and independently inspectable. A feature reads like a business rule: CUSTOMER_LTV := CUSTOMER_ID.RELATED(SUM(ORDER_AMOUNT) GROUP BY ORDER_CUSTOMER_ID). Complex logic is decomposed into a graph of named pieces, each testable on its own. The two scope level restriction (feature graph + TRANSFORM inner queries) means you never trace through arbitrarily nested subqueries.
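For example, logic that would otherwise live in one long query decomposes into a handful of named nodes (CUSTOMER_ORDER_COUNT, CUSTOMER_IS_HIGH_LTV, and the thresholds are hypothetical names used for illustration):

```
CUSTOMER_LTV         := CUSTOMER_ID.RELATED(SUM(ORDER_AMOUNT) GROUP BY ORDER_CUSTOMER_ID)
CUSTOMER_ORDER_COUNT := CUSTOMER_ID.RELATED(COUNT(ORDER_ID) GROUP BY ORDER_CUSTOMER_ID)
CUSTOMER_IS_HIGH_LTV := CUSTOMER_LTV > 1000 AND CUSTOMER_ORDER_COUNT > 5
```

Each node can be inspected and tested on its own, which is exactly the property a flat SQL file lacks.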

Both languages get harder to read as complexity grows. A large feature graph has its own complexity; 500 nodes in a dependency graph is not automatically easier to navigate than a long SQL file. The difference is that each node in the graph is independently named, typed, testable, and inspectable, which provides structure that a flat SQL file lacks. Readability improves not because the complexity disappears, but because it decomposes into manageable pieces with explicit interfaces.

Learnability

SQL's SELECT / FROM / WHERE / GROUP BY is one of the most learnable syntaxes ever designed for a data language; the basics can be picked up in a weekend. FeatureQL requires learning a new mental model, which is a real cost.

In practice, the learning curve depends on the role.

Business users and analysts who think in spreadsheet formulas will find FeatureQL's model natural: a feature is a named cell formula that takes inputs and references other formulas. They work with features that data engineers have already published, writing queries like SELECT CUSTOMER_ID, CUSTOMER_SEGMENT FROM FM.ECOMM FOR CUSTOMER_ID := BIND_COLUMNS(...). This layer is lightweight and resembles SQL closely. Most analysts are productive within days for consumption queries, and within a few weeks for defining their own features.

Data engineers set up the infrastructure: entity declarations, source mappings, type annotations. This layer is heavier than equivalent SQL because it encodes more information (entity relationships, type constraints, source bindings). The investment pays off because this work is done once and shared broadly; every analyst and data scientist builds on top of it without repeating it.

SQL's learning curve is gentler at the start but steeper at the high end: window functions, correlated subqueries, and the interaction between GROUP BY and HAVING trip up even experienced users for years. FeatureQL's learning curve is steeper at the start (new mental model) but flattens once the entity centric pattern clicks, because composition replaces complexity.

Ecosystem integration

SQL is the universal interface for data. Every BI tool, every database, every notebook speaks SQL. Nothing matches this ecosystem reach.

FeatureQL addresses this through integration rather than replacement. The proxy is transparent to existing SQL clients: tools like Tableau, Looker, and Metabase connect to what they think is Trino or BigQuery. FeatureQL queries can be embedded inside regular SQL as comments that the proxy intercepts, so even tools that do schema introspection remain compatible. The Python library provides Jupyter integration. The REST API enables real time serving.

FeatureQL also addresses a problem that SQL's ubiquity obscures: there is not one SQL, but many incompatible dialects. BigQuery SQL, Trino SQL, DuckDB SQL, and PostgreSQL SQL are not the same language. Migrating a complex query between backends is real work. FeatureQL's transpilation to multiple backends means a feature defined once runs correctly on any supported backend, which is a form of portability that SQL itself does not provide.

The tradeoff is an additional layer in the execution path. When a query passes through the proxy and transpiler, debugging failures may require inspecting the generated SQL alongside the FeatureQL source. The --show-sql flag and debug mode exist for this purpose, but the added indirection is a real operational cost.

AI readiness

LLMs are unreliable at generating complex SQL. The more joins, subqueries, and window functions a query requires, the more likely the output is subtly wrong. The LLM must reason about schema, join paths, grouping granularity, and null handling simultaneously, with a vast space of syntactically valid but semantically incorrect possibilities.

Structured metadata helps. Semantic layers, metric stores, and well documented dbt projects all give LLMs better context than a raw schema. But these tools provide structure around SQL; the LLM still generates SQL, and SQL has no type system to catch the semantic errors that LLMs commonly make.

FeatureQL's difference is that the safety properties are in the language the LLM writes, not in a layer around it. A registry of named, typed features acts as a structured vocabulary: the LLM composes existing features instead of reasoning from raw tables. The entity annotation type system rejects invalid joins at compile time. The flat DAG structure is easier to generate correctly than nested SQL. VARIANT() maps directly to natural language "what if" questions.
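For instance, a natural language question like "what if the segment threshold were 30?" maps onto a single construct rather than a rewritten query:

```
VARIANT(CUSTOMER_SEGMENT REPLACING AD_THRESHOLD WITH 30)
```

And if the LLM composes features whose entity annotations do not match, the expression fails to compile instead of returning plausible but wrong rows.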

More importantly, every AI authored decision expressed as a FeatureQL feature is an inspectable, versioned, reproducible artifact. It can be traced, replayed, and audited. This matters for regulated industries, where the ability to answer "exactly what rule produced this decision, and can we reproduce it?" is a prerequisite for AI adoption. A SQL query can be logged, but it is not registered, typed, dependency tracked, or guaranteed to produce the same result when re-executed against a different backend. A FeatureQL feature is all of those things by construction.

Last update at: 2026/03/14 14:27:33