FeatureQL language

FeatureQL is a SQL dialect for data transformations with three design principles:

Abstracted: Separates transformation logic from storage and execution engines. You can swap data sources or execution backends without rewriting transformations.

Atomic: Works at the column level instead of SQL's table level. This makes dependencies explicit and enables granular lineage tracking.

Functional: Enforces pure transformations that can be composed together. Each transformation is self-contained and predictable.
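The Atomic and Functional principles can be illustrated outside FeatureQL itself. The following is a minimal Python sketch, not FeatureQL syntax and not how FeatureQL is implemented: the `Feature` class, the `lineage` and `evaluate` helpers, and the column names are all hypothetical, chosen only to show how column-level pure functions make dependencies explicit and composable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple


@dataclass(frozen=True)
class Feature:
    """Hypothetical column-level feature: a name, its inputs, and a pure function."""
    name: str
    inputs: Tuple[str, ...]
    fn: Callable[..., float]


def lineage(name: str, registry: Dict[str, Feature]) -> Set[str]:
    """Return every upstream name (features and raw columns) that `name` depends on."""
    feature = registry.get(name)
    if feature is None:
        return set()  # a raw column has no upstream dependencies
    upstream: Set[str] = set()
    for dep in feature.inputs:
        upstream.add(dep)
        upstream |= lineage(dep, registry)
    return upstream


def evaluate(name: str, row: Dict[str, float], registry: Dict[str, Feature]) -> float:
    """Compute a feature for one row by recursively evaluating its inputs."""
    if name in row:
        return row[name]  # raw column supplied by the data source
    feature = registry[name]
    args = [evaluate(dep, row, registry) for dep in feature.inputs]
    return feature.fn(*args)


# Two illustrative features; the second composes the first.
REGISTRY = {
    f.name: f
    for f in (
        Feature("revenue", ("price", "quantity"), lambda p, q: p * q),
        Feature("net_revenue", ("revenue", "discount"), lambda r, d: r - d),
    )
}

row = {"price": 10.0, "quantity": 3, "discount": 5.0}
net = evaluate("net_revenue", row, REGISTRY)
deps = lineage("net_revenue", REGISTRY)
```

Because each feature declares its inputs, the dependency set of `net_revenue` can be derived without running any query, and changing the `revenue` definition in one place changes every feature built on it.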

FeatureQL transpiles to native SQL for DuckDB, Trino, BigQuery, and DataFusion, so it works with your existing database infrastructure.

What problems it solves

Reusable business logic: Instead of copying the same revenue calculation across 50 queries, define it once as a feature. When the calculation changes, update one place, not 50.

Consistent metrics: Everyone uses the same definition of "active customer" or "churn rate" from the feature catalog. No more meetings about why numbers don't match.

Rapid experimentation: Compose features like building blocks. Test variations with VARIANT(), swap calculation methods instantly, and iterate without rewriting entire queries.

One language, all use cases: The same features power dashboards, ML models, real-time APIs, and operational systems. No translation between analytics SQL and application code.

Simpler complex queries: Working with nested data (like all orders per customer) is natural with FeatureQL's array operations. A query that would take around 100 lines of SQL with multiple CTEs can shrink to around 10 clear lines.

Faster debugging: Features are testable functions. You can test each feature in isolation instead of debugging a 500-line SQL query.

Better collaboration: Data teams publish verified features that analysts and engineers reuse. New team members become productive quickly instead of having to absorb years of tribal knowledge.

Last update at: 2025/10/13 10:23:46
