EXTERNAL_COLUMNS vs EXTERNAL_SQL

FeatureQL provides three ways to map external data to features. Each trades off simplicity for flexibility, so choosing the right one depends on how much control you need over the SQL and join logic.

EXTERNAL_COLUMNS() with FROM TABLE() — the simplest option. FeatureQL generates all SQL for you.
EXTERNAL_COLUMNS() with FROM SQL() — you write the source query, but FeatureQL still handles the join.
EXTERNAL_SQL() — you control both the source query and the join conditions.

The same pattern applies to binding functions: BIND_COLUMNS() mirrors EXTERNAL_COLUMNS(), and BIND_SQL() mirrors EXTERNAL_SQL().

EXTERNAL_COLUMNS with TABLE

The most declarative approach. You specify column names, types, and which column is the key. FeatureQL generates the SELECT and join logic automatically:

FeatureQL

WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DIM_CUSTOMERS := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BOUND TO CUSTOMER_ID,
        name VARCHAR,
        FROM TABLE(ecomm.dim_customers)
    ),
    CUSTOMER_NAME := DIM_CUSTOMERS[name],
SELECT
    CUSTOMER_ID := BIND_COLUMNS(customer_id FROM TABLE(ecomm.dim_customers)),
    DIM_CUSTOMERS,
    CUSTOMER_NAME
;

Result

CUSTOMER_ID BIGINT	DIM_CUSTOMERS ROW	CUSTOMER_NAME VARCHAR
100	{customer_id: 100, name: John Doe}	John Doe
101	{customer_id: 101, name: Jane Doe}	Jane Doe
102	{customer_id: 102, name: Jack Doe}	Jack Doe

EXTERNAL_COLUMNS with SQL

When you need to customize the source query — for example, to join two tables or apply a transformation — wrap it in FROM SQL(). FeatureQL still manages the join between the result and your bound keys:

FeatureQL

WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DIM_CUSTOMERS := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BOUND TO CUSTOMER_ID,
        name VARCHAR,
        FROM SQL(SELECT * FROM ecomm.dim_customers)
    ),
    CUSTOMER_NAME := DIM_CUSTOMERS[name],
SELECT
    CUSTOMER_ID := BIND_COLUMNS(customer_id FROM SQL(SELECT customer_id FROM ecomm.dim_customers)),
    DIM_CUSTOMERS,
    CUSTOMER_NAME
;

Result

CUSTOMER_ID BIGINT	DIM_CUSTOMERS ROW	CUSTOMER_NAME VARCHAR
100	{customer_id: 100, name: John Doe}	John Doe
101	{customer_id: 101, name: Jane Doe}	Jane Doe
102	{customer_id: 102, name: Jack Doe}	Jack Doe

EXTERNAL_SQL

For full control over both the source query and the join logic. You write the SQL string and specify join conditions explicitly with the ON clause using SELF.column=%INPUT_NAME syntax:

FeatureQL

WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DIM_CUSTOMERS := EXTERNAL_SQL(
        `SELECT * FROM ecomm.dim_customers`
        ON `SELF.customer_id=%CUSTOMER_ID`
        AS ROW(customer_id BIGINT#CUSTOMERS, name VARCHAR)
    ),
    CUSTOMER_NAME := DIM_CUSTOMERS[name],
SELECT
    CUSTOMER_ID := BIND_SQL(SELECT customer_id FROM ecomm.dim_customers),
    DIM_CUSTOMERS,
    CUSTOMER_NAME
;

Result

CUSTOMER_ID BIGINT	DIM_CUSTOMERS ROW	CUSTOMER_NAME VARCHAR
100	{customer_id: 100, name: John Doe}	John Doe
101	{customer_id: 101, name: Jane Doe}	Jane Doe
102	{customer_id: 102, name: Jack Doe}	Jack Doe

Understanding EXTERNAL_SQL() in depth

source := EXTERNAL_SQL(
    ⟨query⟩                 -- SQL query string (literal or computed)
    ON ⟨conditions⟩         -- Optional: join/filter conditions
    AS ROW(field1 ⟨type1⟩, field2 ⟨type2⟩, ...)
),
feature1 := source[field1],
feature2 := source[field2] + 1

sql

All columns must be explicitly typed in the ROW specification, and entity annotations (BIGINT#CUSTOMERS) preserve semantic relationships. The query must return 0 or 1 row per unique input combination. Keep the SQL close to the table source — complex transformations belong in features, not inside EXTERNAL_SQL().

Static vs dynamic filtering

Compile-time values

When filter values are known at compile time (like a partition date), embed them directly in the query using metaprogramming. The database can optimize based on the known value:

FeatureQL

CONST
    STATIC_PARTITION := '2025-01-05'
WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    SOURCE_CUSTOMER1 := EXTERNAL_SQL(
        @literal(`SELECT * FROM (VALUES
            (1, '2025-01-05', '1-5'),
            (2, '2025-01-05', '2-5'),
            (3, '2025-01-05', '3-5'),
            (1, '2025-01-06', '1-6'),
            (2, '2025-01-06', '2-6'),
            (3, '2025-01-06', '3-6')
        ) AS t(customer_id, ds, attribute1)
        WHERE customer_id=%CUSTOMER_ID AND ds='` || STATIC_PARTITION || `'`)
        AS ROW(customer_id BIGINT#CUSTOMERS, ds VARCHAR, attribute1 VARCHAR)
    ),
    ATTRIBUTE1 := SOURCE_CUSTOMER1[attribute1]
SELECT
    CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2]),
    STATIC_PARTITION := @literal(STATIC_PARTITION),
    ATTRIBUTE1
;

Result

CUSTOMER_ID BIGINT	STATIC_PARTITION VARCHAR	ATTRIBUTE1 VARCHAR
1	2025-01-05	1-5
2	2025-01-05	2-5

The STATIC_PARTITION constant is resolved at compile time and injected into the SQL string via @literal(). This is ideal for daily partition filters on dimension or fact tables.

Runtime values

When filter values vary per input row, use the ON clause with SELF.column=%INPUT_NAME placeholders. FeatureQL translates this into a lateral join:

FeatureQL

WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DYNAMIC_PARTITION := INPUT(VARCHAR),
    SOURCE_CUSTOMER1 := EXTERNAL_SQL(
        `SELECT * FROM (VALUES
            (1, '2025-01-05', '1-5'),
            (2, '2025-01-05', '2-5'),
            (3, '2025-01-05', '3-5'),
            (1, '2025-01-06', '1-6'),
            (2, '2025-01-06', '2-6'),
            (3, '2025-01-06', '3-6')
        ) AS t(customer_id, ds, attribute1)
        WHERE customer_id=%CUSTOMER_ID AND ds=%DYNAMIC_PARTITION`
        AS ROW(customer_id BIGINT#CUSTOMERS, ds VARCHAR, attribute1 VARCHAR)
    ),
    ATTRIBUTE1 := SOURCE_CUSTOMER1[attribute1]
SELECT
    (CUSTOMER_ID, DYNAMIC_PARTITION) := BIND_VALUES(ARRAY[
        ROW(1, '2025-01-05'),
        ROW(2, '2025-01-05'),
        ROW(3, '2025-01-05'),
        ROW(1, '2025-01-06')
    ]),
    ATTRIBUTE1
;

Result

CUSTOMER_ID BIGINT	DYNAMIC_PARTITION VARCHAR	ATTRIBUTE1 VARCHAR
1	2025-01-05	1-5
1	2025-01-06	1-6
2	2025-01-05	2-5
3	2025-01-05	3-5

Each input row gets its own lookup, so the filter conditions can reference any bound input.

Backends without lateral join support

BigQuery and DataFusion don't support LEFT LATERAL JOIN. For these backends, separate the data source from the join logic — put the full dataset in the query and the filter conditions in the ON clause:

FeatureQL

WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DYNAMIC_PARTITION := INPUT(VARCHAR),
    SOURCE_CUSTOMER1 := EXTERNAL_SQL(
        `SELECT * FROM (VALUES
            (1, '2025-01-05', '1-5'),
            (2, '2025-01-05', '2-5'),
            (3, '2025-01-05', '3-5'),
            (1, '2025-01-06', '1-6'),
            (2, '2025-01-06', '2-6'),
            (3, '2025-01-06', '3-6')
        ) AS t(customer_id, ds, attribute1)`
        ON `SELF.customer_id=%CUSTOMER_ID AND SELF.ds=%DYNAMIC_PARTITION`
        AS ROW(customer_id BIGINT#CUSTOMERS, ds VARCHAR, attribute1 VARCHAR)
    ),
    ATTRIBUTE1 := SOURCE_CUSTOMER1[attribute1]
SELECT
    (CUSTOMER_ID, DYNAMIC_PARTITION) := BIND_VALUES(ARRAY[
        ROW(1, '2025-01-05'),
        ROW(2, '2025-01-05'),
        ROW(3, '2025-01-05'),
        ROW(1, '2025-01-06')
    ]),
    ATTRIBUTE1
;

Result

CUSTOMER_ID BIGINT	DYNAMIC_PARTITION VARCHAR	ATTRIBUTE1 VARCHAR
1	2025-01-05	1-5
1	2025-01-06	1-6
2	2025-01-05	2-5
3	2025-01-05	3-5

When to use each approach

Approach	Use when	Limitations
`EXTERNAL_COLUMNS` + `TABLE`	Simple table lookups	No custom SQL
`EXTERNAL_COLUMNS` + `SQL`	Custom source query, standard join	No custom join conditions
`EXTERNAL_SQL`	Full control over query and join	More verbose, requires explicit typing

See data source mapping for the full reference.

Suggest changes to this page

Last update at: 2026/02/17 19:13:00

On this page

EXTERNAL_COLUMNS with TABLE EXTERNAL_COLUMNS with SQL EXTERNAL_SQL Understanding EXTERNAL_SQL() in depth Static vs dynamic filtering Compile-time values Runtime values Backends without lateral join support When to use each approach