EXTERNAL_COLUMNS vs EXTERNAL_SQL

FeatureQL provides three ways to map external data to features. Each trades off simplicity for flexibility, so choosing the right one depends on how much control you need over the SQL and join logic.

  • EXTERNAL_COLUMNS() with FROM TABLE() — the simplest option. FeatureQL generates all SQL for you.
  • EXTERNAL_COLUMNS() with FROM SQL() — you write the source query, but FeatureQL still handles the join.
  • EXTERNAL_SQL() — you control both the source query and the join conditions.

The same pattern applies to binding functions: BIND_COLUMNS() mirrors EXTERNAL_COLUMNS(), and BIND_SQL() mirrors EXTERNAL_SQL().

EXTERNAL_COLUMNS with TABLE

The most declarative approach. You specify column names, types, and which column is the key. FeatureQL generates the SELECT and join logic automatically:

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DIM_CUSTOMERS := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BOUND TO CUSTOMER_ID,
        name VARCHAR,
        FROM TABLE(ecomm.dim_customers)
    ),
    CUSTOMER_NAME := DIM_CUSTOMERS[name],
SELECT
    CUSTOMER_ID := BIND_COLUMNS(customer_id FROM TABLE(ecomm.dim_customers)),
    DIM_CUSTOMERS,
    CUSTOMER_NAME
;
Result
CUSTOMER_ID BIGINTDIM_CUSTOMERS ROWCUSTOMER_NAME VARCHAR
100{customer_id: 100, name: John Doe}John Doe
101{customer_id: 101, name: Jane Doe}Jane Doe
102{customer_id: 102, name: Jack Doe}Jack Doe

EXTERNAL_COLUMNS with SQL

When you need to customize the source query — for example, to join two tables or apply a transformation — wrap it in FROM SQL(). FeatureQL still manages the join between the result and your bound keys:

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DIM_CUSTOMERS := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BOUND TO CUSTOMER_ID,
        name VARCHAR,
        FROM SQL(SELECT * FROM ecomm.dim_customers)
    ),
    CUSTOMER_NAME := DIM_CUSTOMERS[name],
SELECT
    CUSTOMER_ID := BIND_COLUMNS(customer_id FROM SQL(SELECT customer_id FROM ecomm.dim_customers)),
    DIM_CUSTOMERS,
    CUSTOMER_NAME
;
Result
CUSTOMER_ID BIGINTDIM_CUSTOMERS ROWCUSTOMER_NAME VARCHAR
100{customer_id: 100, name: John Doe}John Doe
101{customer_id: 101, name: Jane Doe}Jane Doe
102{customer_id: 102, name: Jack Doe}Jack Doe

EXTERNAL_SQL

For full control over both the source query and the join logic. You write the SQL string and specify join conditions explicitly with the ON clause using SELF.column=%INPUT_NAME syntax:

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DIM_CUSTOMERS := EXTERNAL_SQL(
        `SELECT * FROM ecomm.dim_customers`
        ON `SELF.customer_id=%CUSTOMER_ID`
        AS ROW(customer_id BIGINT#CUSTOMERS, name VARCHAR)
    ),
    CUSTOMER_NAME := DIM_CUSTOMERS[name],
SELECT
    CUSTOMER_ID := BIND_SQL(SELECT customer_id FROM ecomm.dim_customers),
    DIM_CUSTOMERS,
    CUSTOMER_NAME
;
Result
CUSTOMER_ID BIGINTDIM_CUSTOMERS ROWCUSTOMER_NAME VARCHAR
100{customer_id: 100, name: John Doe}John Doe
101{customer_id: 101, name: Jane Doe}Jane Doe
102{customer_id: 102, name: Jack Doe}Jack Doe

Understanding EXTERNAL_SQL() in depth

source := EXTERNAL_SQL(
    ⟨query⟩                 -- SQL query string (literal or computed)
    ON ⟨conditions⟩         -- Optional: join/filter conditions
    AS ROW(field1 ⟨type1⟩, field2 ⟨type2⟩, ...)
),
feature1 := source[field1],
feature2 := source[field2] + 1
sql

All columns must be explicitly typed in the ROW specification, and entity annotations (BIGINT#CUSTOMERS) preserve semantic relationships. The query must return 0 or 1 row per unique input combination. Keep the SQL close to the table source — complex transformations belong in features, not inside EXTERNAL_SQL().

Static vs dynamic filtering

Compile-time values

When filter values are known at compile time (like a partition date), embed them directly in the query using metaprogramming. The database can optimize based on the known value:

FeatureQL
CONST
    STATIC_PARTITION := '2025-01-05'
WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    SOURCE_CUSTOMER1 := EXTERNAL_SQL(
        @literal(`SELECT * FROM (VALUES
            (1, '2025-01-05', '1-5'),
            (2, '2025-01-05', '2-5'),
            (3, '2025-01-05', '3-5'),
            (1, '2025-01-06', '1-6'),
            (2, '2025-01-06', '2-6'),
            (3, '2025-01-06', '3-6')
        ) AS t(customer_id, ds, attribute1)
        WHERE customer_id=%CUSTOMER_ID AND ds='` || STATIC_PARTITION || `'`)
        AS ROW(customer_id BIGINT#CUSTOMERS, ds VARCHAR, attribute1 VARCHAR)
    ),
    ATTRIBUTE1 := SOURCE_CUSTOMER1[attribute1]
SELECT
    CUSTOMER_ID := BIND_VALUES(ARRAY[1, 2]),
    STATIC_PARTITION := @literal(STATIC_PARTITION),
    ATTRIBUTE1
;
Result
CUSTOMER_ID BIGINTSTATIC_PARTITION VARCHARATTRIBUTE1 VARCHAR
12025-01-051-5
22025-01-052-5

The STATIC_PARTITION constant is resolved at compile time and injected into the SQL string via @literal(). This is ideal for daily partition filters on dimension or fact tables.

Runtime values

When filter values vary per input row, use the ON clause with SELF.column=%INPUT_NAME placeholders. FeatureQL translates this into a lateral join:

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DYNAMIC_PARTITION := INPUT(VARCHAR),
    SOURCE_CUSTOMER1 := EXTERNAL_SQL(
        `SELECT * FROM (VALUES
            (1, '2025-01-05', '1-5'),
            (2, '2025-01-05', '2-5'),
            (3, '2025-01-05', '3-5'),
            (1, '2025-01-06', '1-6'),
            (2, '2025-01-06', '2-6'),
            (3, '2025-01-06', '3-6')
        ) AS t(customer_id, ds, attribute1)
        WHERE customer_id=%CUSTOMER_ID AND ds=%DYNAMIC_PARTITION`
        AS ROW(customer_id BIGINT#CUSTOMERS, ds VARCHAR, attribute1 VARCHAR)
    ),
    ATTRIBUTE1 := SOURCE_CUSTOMER1[attribute1]
SELECT
    (CUSTOMER_ID, DYNAMIC_PARTITION) := BIND_VALUES(ARRAY[
        ROW(1, '2025-01-05'),
        ROW(2, '2025-01-05'),
        ROW(3, '2025-01-05'),
        ROW(1, '2025-01-06')
    ]),
    ATTRIBUTE1
;
Result
CUSTOMER_ID BIGINTDYNAMIC_PARTITION VARCHARATTRIBUTE1 VARCHAR
12025-01-051-5
12025-01-061-6
22025-01-052-5
32025-01-053-5

Each input row gets its own lookup, so the filter conditions can reference any bound input.

Backends without lateral join support

BigQuery and DataFusion don't support LEFT LATERAL JOIN. For these backends, separate the data source from the join logic — put the full dataset in the query and the filter conditions in the ON clause:

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DYNAMIC_PARTITION := INPUT(VARCHAR),
    SOURCE_CUSTOMER1 := EXTERNAL_SQL(
        `SELECT * FROM (VALUES
            (1, '2025-01-05', '1-5'),
            (2, '2025-01-05', '2-5'),
            (3, '2025-01-05', '3-5'),
            (1, '2025-01-06', '1-6'),
            (2, '2025-01-06', '2-6'),
            (3, '2025-01-06', '3-6')
        ) AS t(customer_id, ds, attribute1)`
        ON `SELF.customer_id=%CUSTOMER_ID AND SELF.ds=%DYNAMIC_PARTITION`
        AS ROW(customer_id BIGINT#CUSTOMERS, ds VARCHAR, attribute1 VARCHAR)
    ),
    ATTRIBUTE1 := SOURCE_CUSTOMER1[attribute1]
SELECT
    (CUSTOMER_ID, DYNAMIC_PARTITION) := BIND_VALUES(ARRAY[
        ROW(1, '2025-01-05'),
        ROW(2, '2025-01-05'),
        ROW(3, '2025-01-05'),
        ROW(1, '2025-01-06')
    ]),
    ATTRIBUTE1
;
Result
CUSTOMER_ID BIGINTDYNAMIC_PARTITION VARCHARATTRIBUTE1 VARCHAR
12025-01-051-5
12025-01-061-6
22025-01-052-5
32025-01-053-5

When to use each approach

ApproachUse whenLimitations
EXTERNAL_COLUMNS + TABLESimple table lookupsNo custom SQL
EXTERNAL_COLUMNS + SQLCustom source query, standard joinNo custom join conditions
EXTERNAL_SQLFull control over query and joinMore verbose, requires explicit typing

See data source mapping for the full reference.

Last update at: 2026/02/17 19:13:00
Last updated: 2026-02-17 19:13:38