External Columns

EXTERNAL_COLUMNS() connects features to columns in your data warehouse. It is the primary way to map external data into FeatureQL for batch analytics — you declare which columns you need, bind key columns to inputs, and FeatureQL handles the rest.

For full syntax and source types (TABLE(), SQL(), CSV(), JSON()), see Mapping to data sources. This page focuses on how EXTERNAL_COLUMNS() fits into batch analytics workflows.

Multiple source types, same result

FeatureQL supports several source types for column-based mappings: EXTERNAL_SQL() for raw SQL queries, EXTERNAL_COLUMNS() with TABLE() or SQL() for declarative column mapping, and INLINE_COLUMNS() with CSV() or JSON() for embedded data. All of them produce the same result when given the same data.

The example below defines the same customer data source four ways — EXTERNAL_SQL(), EXTERNAL_COLUMNS() with SQL(), INLINE_COLUMNS() with CSV(), and INLINE_COLUMNS() with JSON() — and confirms they all return identical rows:

FeatureQL
WITH
    CUSTOMERS := ENTITY(),
    CUSTOMER_ID := INPUT(BIGINT#CUSTOMERS),
    DYNAMIC_PARTITION := INPUT(VARCHAR),
    SOURCE_CUSTOMER_BASE := EXTERNAL_SQL(
        `SELECT * FROM (VALUES
            (1, '2025-01-05', '1-5'),
            (2, '2025-01-05', '2-5'),
            (3, '2025-01-05', '3-5'),
            (1, '2025-01-06', '1-6'),
            (2, '2025-01-06', '2-6'),
            (3, '2025-01-06', '3-6')
        ) AS t(customer_id, ds, attribute1)
        WHERE customer_id=%CUSTOMER_ID AND ds=%DYNAMIC_PARTITION`
        AS ROW(customer_id BIGINT#CUSTOMERS, ds VARCHAR, attribute1 VARCHAR)
    ),
    -- SQL source version
    SOURCE_CUSTOMER_SQL := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        ds VARCHAR BIND TO DYNAMIC_PARTITION,
        attribute1 VARCHAR
        FROM SQL(
            SELECT * FROM (VALUES
                (1, '2025-01-05', '1-5'),
                (2, '2025-01-05', '2-5'),
                (3, '2025-01-05', '3-5'),
                (1, '2025-01-06', '1-6'),
                (2, '2025-01-06', '2-6'),
                (3, '2025-01-06', '3-6')
            ) AS t(customer_id, ds, attribute1)
        )
    ),
    -- CSV source version (headers required)
    SOURCE_CUSTOMER_CSV := INLINE_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        ds VARCHAR BIND TO DYNAMIC_PARTITION,
        attribute1 VARCHAR
        FROM CSV(
            customer_id,ds,attribute1
            1,2025-01-05,1-5
            2,2025-01-05,2-5
            3,2025-01-05,3-5
            1,2025-01-06,1-6
            2,2025-01-06,2-6
            3,2025-01-06,3-6
        )
    ),
    -- JSON source version (array of objects)
    SOURCE_CUSTOMER_JSON := INLINE_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        ds VARCHAR BIND TO DYNAMIC_PARTITION,
        attribute1 VARCHAR
        FROM JSON([
            {"customer_id": 1, "ds": "2025-01-05", "attribute1": "1-5"},
            {"customer_id": 2, "ds": "2025-01-05", "attribute1": "2-5"},
            {"customer_id": 3, "ds": "2025-01-05", "attribute1": "3-5"},
            {"customer_id": 1, "ds": "2025-01-06", "attribute1": "1-6"},
            {"customer_id": 2, "ds": "2025-01-06", "attribute1": "2-6"},
            {"customer_id": 3, "ds": "2025-01-06", "attribute1": "3-6"}
        ])
    ),
SELECT
    (CUSTOMER_ID, DYNAMIC_PARTITION) := BIND_VALUES(ARRAY[
        ROW(1, '2025-01-05'),
        ROW(2, '2025-01-05'),
        ROW(3, '2025-01-05'),
        ROW(1, '2025-01-06')
    ]),
    ALL(
        SOURCE_CUSTOMER_BASE=SOURCE_CUSTOMER_SQL,
        SOURCE_CUSTOMER_BASE=SOURCE_CUSTOMER_CSV,
        SOURCE_CUSTOMER_BASE=SOURCE_CUSTOMER_JSON
    ) AS all_are_equal,
    SOURCE_CUSTOMER_BASE,
    SOURCE_CUSTOMER_SQL,
    SOURCE_CUSTOMER_CSV,
    SOURCE_CUSTOMER_JSON
;
Result
| CUSTOMER_ID BIGINT | DYNAMIC_PARTITION VARCHAR | ALL_ARE_EQUAL BOOLEAN | SOURCE_CUSTOMER_BASE ROW | SOURCE_CUSTOMER_SQL ROW | SOURCE_CUSTOMER_CSV ROW | SOURCE_CUSTOMER_JSON ROW |
|---|---|---|---|---|---|---|
| 1 | 2025-01-05 | true | {customer_id: 1, ds: 2025-01-05, attribute1: 1-5} | {customer_id: 1, ds: 2025-01-05, attribute1: 1-5} | {customer_id: 1, ds: 2025-01-05, attribute1: 1-5} | {customer_id: 1, ds: 2025-01-05, attribute1: 1-5} |
| 2 | 2025-01-05 | true | {customer_id: 2, ds: 2025-01-05, attribute1: 2-5} | {customer_id: 2, ds: 2025-01-05, attribute1: 2-5} | {customer_id: 2, ds: 2025-01-05, attribute1: 2-5} | {customer_id: 2, ds: 2025-01-05, attribute1: 2-5} |
| 3 | 2025-01-05 | true | {customer_id: 3, ds: 2025-01-05, attribute1: 3-5} | {customer_id: 3, ds: 2025-01-05, attribute1: 3-5} | {customer_id: 3, ds: 2025-01-05, attribute1: 3-5} | {customer_id: 3, ds: 2025-01-05, attribute1: 3-5} |
| 1 | 2025-01-06 | true | {customer_id: 1, ds: 2025-01-06, attribute1: 1-6} | {customer_id: 1, ds: 2025-01-06, attribute1: 1-6} | {customer_id: 1, ds: 2025-01-06, attribute1: 1-6} | {customer_id: 1, ds: 2025-01-06, attribute1: 1-6} |

Notice the DYNAMIC_PARTITION input: it acts as a second key column (BIND TO DYNAMIC_PARTITION), filtering rows by date partition. This pattern is common in batch analytics where you evaluate features for a specific date slice.

When to use which

| Source | Best for |
|---|---|
| EXTERNAL_COLUMNS() with TABLE() | Most cases: simple, declarative, optimizable |
| EXTERNAL_COLUMNS() with SQL() | When you need to transform or pre-filter data before mapping |
| INLINE_COLUMNS() with CSV() or JSON() | Prototyping and tests: data lives inside the query |
| EXTERNAL_SQL() | Complex joins or when you need full SQL control |
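
The TABLE() variant in the first row is the only one not demonstrated above. As a minimal sketch, assuming a warehouse table named analytics.customers with matching columns (the table name and the exact TABLE() argument form are assumptions here; see Mapping to data sources for the authoritative syntax), the mapping might look like:

FeatureQL
    -- Hypothetical warehouse table analytics.customers: declare only the columns
    -- you need, bind the key columns to inputs, and let FeatureQL push down the
    -- filtering — no hand-written SQL required.
    SOURCE_CUSTOMER_TABLE := EXTERNAL_COLUMNS(
        customer_id BIGINT#CUSTOMERS BIND TO CUSTOMER_ID,
        ds VARCHAR BIND TO DYNAMIC_PARTITION,
        attribute1 VARCHAR
        FROM TABLE(analytics.customers)
    )

Given the same underlying rows, this would return the same result as the four sources in the example above.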

For a detailed comparison of EXTERNAL_COLUMNS() vs EXTERNAL_SQL(), see EXTERNAL_COLUMNS vs EXTERNAL_SQL.

Last update at: 2026/03/03 16:47:38