External Columns

EXTERNAL_COLUMNS() connects features to columns in your data warehouse. It is the primary way to map external data into FeatureQL for batch analytics — you declare which columns you need, bind key columns to inputs, and FeatureQL handles the rest.

For full syntax and source types (TABLE(), SQL(), CSV(), JSON()), see Mapping to data sources. This page focuses on how EXTERNAL_COLUMNS() fits into batch analytics workflows.

Multiple source types, same result

FeatureQL supports several source types for column-based mappings: EXTERNAL_SQL() for raw SQL queries, EXTERNAL_COLUMNS() with TABLE() or SQL() for declarative column mapping, and INLINE_COLUMNS() with CSV() or JSON() for embedded data. All of them produce the same result when given the same data.

The example below defines the same customer data source four ways — EXTERNAL_SQL(), EXTERNAL_COLUMNS() with SQL(), INLINE_COLUMNS() with CSV(), and INLINE_COLUMNS() with JSON() — and confirms they all return identical rows:

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DYNAMIC_PARTITION := INPUT(VARCHAR),
    SOURCE_CUSTOMER_BASE := EXTERNAL_SQL(
        `SELECT * FROM (VALUES
            (1, '2025-01-05', '1-5'),
            (2, '2025-01-05', '2-5'),
            (3, '2025-01-05', '3-5'),
            (1, '2025-01-06', '1-6'),
            (2, '2025-01-06', '2-6'),
            (3, '2025-01-06', '3-6')
        ) AS t(customer_id, ds, attribute1)
        WHERE customer_id=%CUSTOMER_ID AND ds=%DYNAMIC_PARTITION`
        AS ROW(customer_id BIGINT#CUSTOMERS, ds VARCHAR, attribute1 VARCHAR)
    ),
    -- SQL source version
    SOURCE_CUSTOMER_SQL := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        ds VARCHAR BIND TO DYNAMIC_PARTITION,
        attribute1 VARCHAR
        FROM SQL(
            SELECT * FROM (VALUES
                (1, '2025-01-05', '1-5'),
                (2, '2025-01-05', '2-5'),
                (3, '2025-01-05', '3-5'),
                (1, '2025-01-06', '1-6'),
                (2, '2025-01-06', '2-6'),
                (3, '2025-01-06', '3-6')
            ) AS t(customer_id, ds, attribute1)
        )
    ),
    -- CSV source version (headers required)
    SOURCE_CUSTOMER_CSV := INLINE_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        ds VARCHAR BIND TO DYNAMIC_PARTITION,
        attribute1 VARCHAR
        FROM CSV(
            customer_id,ds,attribute1
            1,2025-01-05,1-5
            2,2025-01-05,2-5
            3,2025-01-05,3-5
            1,2025-01-06,1-6
            2,2025-01-06,2-6
            3,2025-01-06,3-6
        )
    ),
    -- JSON source version (array of objects)
    SOURCE_CUSTOMER_JSON := INLINE_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        ds VARCHAR BIND TO DYNAMIC_PARTITION,
        attribute1 VARCHAR
        FROM JSON([
            {"customer_id": 1, "ds": "2025-01-05", "attribute1": "1-5"},
            {"customer_id": 2, "ds": "2025-01-05", "attribute1": "2-5"},
            {"customer_id": 3, "ds": "2025-01-05", "attribute1": "3-5"},
            {"customer_id": 1, "ds": "2025-01-06", "attribute1": "1-6"},
            {"customer_id": 2, "ds": "2025-01-06", "attribute1": "2-6"},
            {"customer_id": 3, "ds": "2025-01-06", "attribute1": "3-6"}
        ])
    ),
SELECT
    (CUSTOMER_ID, DYNAMIC_PARTITION) := BIND_VALUES(ARRAY[
        ROW(1, '2025-01-05'),
        ROW(2, '2025-01-05'),
        ROW(3, '2025-01-05'),
        ROW(1, '2025-01-06')
    ]),
    ALL(
        SOURCE_CUSTOMER_BASE=SOURCE_CUSTOMER_SQL,
        SOURCE_CUSTOMER_BASE=SOURCE_CUSTOMER_CSV,
        SOURCE_CUSTOMER_BASE=SOURCE_CUSTOMER_JSON
    ) AS all_are_equal,
    SOURCE_CUSTOMER_BASE,
    SOURCE_CUSTOMER_SQL,
    SOURCE_CUSTOMER_CSV,
    SOURCE_CUSTOMER_JSON
;
Result
| CUSTOMER_ID BIGINT | DYNAMIC_PARTITION VARCHAR | ALL_ARE_EQUAL BOOLEAN | SOURCE_CUSTOMER_BASE ROW | SOURCE_CUSTOMER_SQL ROW | SOURCE_CUSTOMER_CSV ROW | SOURCE_CUSTOMER_JSON ROW |
|---|---|---|---|---|---|---|
| 1 | 2025-01-05 | true | {customer_id: 1, ds: 2025-01-05, attribute1: 1-5} | {customer_id: 1, ds: 2025-01-05, attribute1: 1-5} | {customer_id: 1, ds: 2025-01-05, attribute1: 1-5} | {customer_id: 1, ds: 2025-01-05, attribute1: 1-5} |
| 2 | 2025-01-05 | true | {customer_id: 2, ds: 2025-01-05, attribute1: 2-5} | {customer_id: 2, ds: 2025-01-05, attribute1: 2-5} | {customer_id: 2, ds: 2025-01-05, attribute1: 2-5} | {customer_id: 2, ds: 2025-01-05, attribute1: 2-5} |
| 3 | 2025-01-05 | true | {customer_id: 3, ds: 2025-01-05, attribute1: 3-5} | {customer_id: 3, ds: 2025-01-05, attribute1: 3-5} | {customer_id: 3, ds: 2025-01-05, attribute1: 3-5} | {customer_id: 3, ds: 2025-01-05, attribute1: 3-5} |
| 1 | 2025-01-06 | true | {customer_id: 1, ds: 2025-01-06, attribute1: 1-6} | {customer_id: 1, ds: 2025-01-06, attribute1: 1-6} | {customer_id: 1, ds: 2025-01-06, attribute1: 1-6} | {customer_id: 1, ds: 2025-01-06, attribute1: 1-6} |

Notice the DYNAMIC_PARTITION input: it acts as a second key column (BIND TO DYNAMIC_PARTITION), filtering rows by date partition. This pattern is common in batch analytics where you evaluate features for a specific date slice.

When to use which

| Source | Best for |
|---|---|
| EXTERNAL_COLUMNS() with TABLE() | Most cases: simple, declarative, optimizable |
| EXTERNAL_COLUMNS() with SQL() | When you need to transform or pre-filter data before mapping |
| INLINE_COLUMNS() with CSV() or JSON() | Prototyping and tests: data lives inside the query |
| EXTERNAL_SQL() | Complex joins or when you need full SQL control |
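
The TABLE() variant in the first row is the only one not demonstrated above. As a minimal sketch, assuming a warehouse table named analytics.customers with matching columns (the table name and the exact TABLE() argument form are assumptions here; see Mapping to data sources for the authoritative syntax), the mapping might look like:

FeatureQL
    -- Hypothetical warehouse table analytics.customers: declare only the columns
    -- you need, bind the key columns to inputs, and let FeatureQL push down the
    -- filtering — no hand-written SQL required.
    SOURCE_CUSTOMER_TABLE := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        ds VARCHAR BIND TO DYNAMIC_PARTITION,
        attribute1 VARCHAR
        FROM TABLE(analytics.customers)
    )

Given the same underlying rows, this would return the same result as the four sources in the example above.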

For a detailed comparison of EXTERNAL_COLUMNS() vs EXTERNAL_SQL(), see EXTERNAL_COLUMNS vs EXTERNAL_SQL.

Last update at: 2026/03/03 16:47:38