Analytics data sources
EXTERNAL_SQL()
connects FeatureQL to external data sources by executing SQL queries and mapping their results to features. This enables seamless integration with existing data warehouses, databases, and analytical systems.
Understanding EXTERNAL_SQL()
The EXTERNAL_SQL()
function bridges FeatureQL features with external data through structured queries:
SELECT
source := EXTERNAL_SQL(
⟨query⟩ -- SQL query string (literal or computed)
ON ⟨conditions⟩ -- Optional: join/filter conditions
AS ROW(field_name1 ⟨datatype1⟩, field_name2 ⟨datatype2⟩, ...)
),
feature1 := source[field_name1], -- Map column to feature
feature2 := source[field_name2] + 1 -- Use in expressions
;
Parameters explained:
query
: SQL query string that conforms to your backend's dialect. Can be computed using metaprogramming.conditions
: Optional join conditions usingSELF.column=%INPUT_NAME
syntaxdatatype
: ROW type specification defining the expected result schema
Key principles:
- Type safety: All columns must be explicitly typed in the ROW specification
- Entity linking: Use entity annotations (
BIGINT#CUSTOMERS
) to maintain semantic relationships - Result cardinality: Must return 0 or 1 row per unique input combination
Static vs. Dynamic Filtering
Static filtering with compile-time values
When you have compile-time constants (like partition dates), include them directly in the query as a computed literal.
Key characteristics:
- Compile-time filtering: The
STATIC_PARTITION
constant is embedded in the query - Efficient execution: Database can optimize based on partitions or other known filter values
- Use cases: Daily partitions for dimension tables or facts tables
Dynamic filtering with runtime conditions
For values that vary per input row, include them in the query as an input feature placeholder starting with %
.
Key characteristics:
- Runtime filtering: Conditions are applied per input using
SELF.column=%INPUT_NAME
- Flexible joins: Can handle complex relationships between inputs and external data
- Use cases: Point-in-time lookups
Non-lateral join compatibility
For databases that don't support LEFT LATERAL JOIN
, separate the data source from join logic:
Backend compatibility considerations:
- DuckDB/Trino: Support lateral joins for optimal performance
- BigQuery: Uses different syntax but similar semantics
Best practices
- Query optimization: Do not put overly complicated queries in the
EXTERNAL_SQL()
function, stay close to the table sources - Type precision: Use entity types (
BIGINT#ENTITY
)