Python library guide

This guide covers configuration, backend setup, error handling, and result objects for the featuremesh Python package. For installation and basic usage, see the Python library page.

Configuration

Use set_default() to choose local vs managed registry, SQLite path (local), managed/serving API endpoints, and defaults for the Jupyter magic:

from featuremesh import Registry, set_default

# Transpilation + persistence: LOCAL (default) or MANAGED
set_default("registry", Registry.MANAGED)

# Local mode: SQLite file for persisted features (default ./featuremesh.db)
set_default("local.db_path", "./featuremesh.db")

# Managed batch API (BatchClient when registry=MANAGED)
set_default("managed.host", "https://api.featuremesh.com")
set_default("managed.path", "/v1/featureql")
set_default("managed.timeout", 30)
set_default("managed.verify_ssl", True)

# Serving API (ServingClient)
set_default("serving.host", "http://host.docker.internal:10090")
set_default("serving.path", "/v1/featureql")
set_default("serving.timeout", 30)
set_default("serving.verify_ssl", True)

# Magic defaults (used when a flag is omitted on %%featureql)
set_default("debug_mode", False)
set_default("show_sql", False)

# Get current settings
from featuremesh import get_default, get_all_defaults

debug_mode = get_default("debug_mode")
all_settings = get_all_defaults()

Valid keys are exactly those accepted by set_default (see get_all_defaults()). Older registry.* keys were renamed to managed.*.

Local mode and SQL backends

With registry=LOCAL (the default), BatchClient uses the bundled engine and SQLite persistence. Execution is limited to Backend.DUCKDB in this release. To run transpiled SQL on Trino, BigQuery, or DataFusion, set set_default("registry", Registry.MANAGED), pass a project access_token, and provide a sql_executor that matches your chosen Backend.

Backend setup

Each backend needs a sql_executor function that takes a SQL string and returns a Pandas DataFrame. Here are ready-to-use examples for each supported backend.
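The executor contract is simply a callable that takes SQL text and returns a Pandas DataFrame. As an illustration of that shape only (SQLite is not one of the supported backends), a minimal executor looks like this:

```python
import sqlite3
import pandas as pd

def query_sqlite(sql: str) -> pd.DataFrame:
    """Illustrates the sql_executor contract: SQL string in, DataFrame out."""
    conn = sqlite3.connect(":memory:")
    try:
        return pd.read_sql_query(sql, conn)
    finally:
        conn.close()

df = query_sqlite("SELECT 1 AS a, 'x' AS b")
```

Any callable with this signature works; the examples below follow the same pattern for each real backend.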

DuckDB

from featuremesh import BatchClient, Backend, Registry, set_default
import duckdb

set_default("registry", Registry.MANAGED)  # backend setup examples use managed mode

# Option 1: Using a persistent connection
_duckdb_conn = None

def get_duckdb_conn(storage_path: str = ":memory:"):
    """Get or create a DuckDB connection."""
    global _duckdb_conn
    if _duckdb_conn is None:
        _duckdb_conn = duckdb.connect(storage_path)
    return _duckdb_conn

def query_duckdb(sql: str, storage_path: str = ":memory:"):
    """Execute SQL query and return results as DataFrame."""
    conn = get_duckdb_conn(storage_path)
    result = conn.sql(sql)
    return result.df()

client = BatchClient(
    access_token=__YOUR_ACCESS_TOKEN__,
    backend=Backend.DUCKDB,
    sql_executor=query_duckdb
)

# Option 2: Simple in-memory executor
def simple_duckdb_executor(sql: str):
    return duckdb.sql(sql).df()

client = BatchClient(
    access_token=__YOUR_ACCESS_TOKEN__,
    backend=Backend.DUCKDB,
    sql_executor=simple_duckdb_executor
)

Trino

from featuremesh import BatchClient, Backend
import pandas as pd
import trino.dbapi

def query_trino(sql: str):
    """Execute SQL query on Trino and return results as DataFrame."""
    # Configure your Trino connection details
    conn = trino.dbapi.connect(
        host="localhost",  # or host.docker.internal for docker
        port=8080,
        user="admin",
        catalog="memory",
        schema="default"
    )
    cur = conn.cursor()
    cur.execute(sql)

    # Fetch results
    cols = cur.description
    rows = cur.fetchall()

    if len(rows) > 0:
        df = pd.DataFrame(rows, columns=[col[0] for col in cols])
        return df
    else:
        return pd.DataFrame()

client = BatchClient(
    access_token=__YOUR_ACCESS_TOKEN__,
    backend=Backend.TRINO,
    sql_executor=query_trino
)

# For production with OAuth2 authentication:
import trino.auth

def query_trino_oauth(sql: str):
    """Execute SQL query on Trino with OAuth2 authentication."""
    conn = trino.dbapi.connect(
        host="trino.your-domain.com",
        port=443,
        user="your-username",
        catalog="your-catalog",
        schema="default",
        http_scheme="https",
        auth=trino.auth.OAuth2Authentication()
    )
    cur = conn.cursor()
    cur.execute(sql)
    cols = cur.description
    rows = cur.fetchall()

    if len(rows) > 0:
        return pd.DataFrame(rows, columns=[col[0] for col in cols])
    return pd.DataFrame()

BigQuery

from featuremesh import BatchClient, Backend
from google.cloud import bigquery

def query_bigquery(sql: str):
    """Execute SQL query on BigQuery and return results as DataFrame."""
    client = bigquery.Client(project=__YOUR_PROJECT_ID__)
    return client.query(sql).to_dataframe()

client = BatchClient(
    access_token=__YOUR_ACCESS_TOKEN__,
    backend=Backend.BIGQUERY,
    sql_executor=query_bigquery
)

Error handling

All operations return result objects with structured error information. Check result.success before accessing the DataFrame:

result = client.query("""
    WITH
        FEATURE1 := INPUT(BIGINT)
    SELECT
        FEATURE1 := BIND_VALUES(ARRAY[1, 2, 3]),
        FEATURE2 := FEATURE1 * 2
""")

if result.success:
    print("Query succeeded!")
    print(result.dataframe)
else:
    print("Query failed!")
    for error in result.errors:
        print(f"Error [{error.code}]: {error.message}")
        if error.context:
            print(f"Context: {error.context}")

For richer display in notebooks, use the result helpers:

# Prints errors/warnings/SQL/SLT/debug as requested; returns the dataframe if show_dataframe=True
result.display(show_sql=True, show_slt=False, show_debug=False, show_dataframe=True)

# Markdown helpers (Help / Describe / Validate)
client.help("zip").display()
client.describe("fm.demo").display()
client.validate("SELECT F := 1;").display()

Translation only

You can also translate FeatureQL to SQL without executing it — useful for debugging or integrating with other tools:

# Available with BatchClient
featureql_query = """
    WITH
        FEATURE1 := INPUT(BIGINT)
    SELECT
        FEATURE1 := BIND_VALUES(ARRAY[1, 2, 3]),
        FEATURE2 := FEATURE1 * 2
"""
translate_result = client.translate(featureql_query)

print(translate_result.sql)      # Generated SQL
print(translate_result.success)  # True if translation succeeded

Debug mode

Pass debug_mode=True to see the intermediate translation steps — useful for understanding how FeatureQL resolves dependencies and generates SQL:

result = client.query("""
    WITH
        FEATURE1 := INPUT(BIGINT)
    SELECT
        FEATURE1 := BIND_VALUES(ARRAY[1, 2, 3]),
        FEATURE2 := FEATURE1 * 2
""", debug_mode=True)

if result.debug_logs:
    print(result.debug_logs)

Result objects

QueryResult

Returned by BatchClient.query() and ServingClient.query(). Important fields:

  • success — True when there are no errors and a dataframe was produced
  • dataframe, sql, slt — slt is populated for batch execution (website-style SLT); ServingClient sets slt to None
  • translate_seconds, execute_seconds — wall time for translation and SQL execution (batch)
  • column_types — list of (name, type) pairs from the API when present
  • errors / warnings — Errors and Warnings collections (iterate like lists)
  • debug_logs — DebugLogs mapping when debug_mode=True
  • display(...), to_dict() — notebook-friendly printing and JSON-serializable export

In Jupyter, %%featureql --hook NAME stores to_dict() in NAME: a plain dict whose "dataframe" value is a list of row records (list[dict]), not a DataFrame. For a real QueryResult / DataFrame, call client.query(...) or use the magic’s return value when --hide-dataframe is omitted (_ / Out[n]). See the Python library page.
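If you want to keep working with the hooked payload as a DataFrame, you can rebuild one from the row records yourself. A minimal sketch, assuming a hook dict shaped as described above (the field values here are illustrative):

```python
import pandas as pd

# Illustrative shape of a %%featureql --hook payload:
# "dataframe" holds row records (list[dict]), not a DataFrame
hooked = {
    "success": True,
    "dataframe": [
        {"FEATURE1": 1, "FEATURE2": 2},
        {"FEATURE1": 2, "FEATURE2": 4},
        {"FEATURE1": 3, "FEATURE2": 6},
    ],
}

# Rebuild a real DataFrame from the row records
df = pd.DataFrame(hooked["dataframe"])
```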

TranslateResult

Returned by BatchClient.translate() only (no translate on ServingClient):

  • Same error/warning/debug patterns as above
  • full_response — raw registry/engine payload when available
  • client_type — e.g. BatchClient(local) vs BatchClient(managed)

HelpResult, DescribeResult, ValidateResult

Returned by BatchClient.help(), BatchClient.describe(), and BatchClient.validate(). Each has text (markdown), optional structured row data, display(), and to_dict().

JSON encoding

import json
from featuremesh import FeatureMeshJSONEncoder

json.dumps(result, cls=FeatureMeshJSONEncoder)

Test suite

BatchClient.sltest() runs bundled SLT-style checks against documentation CODE_SAMPLE snippets. By default it uses SHOW DOCS (INCLUDE (CONTENT)) with optional where and limit. Pass source= with a custom FeatureQL query that returns NAME and CONTENT columns when you do not want that built-in fetch.

Use labels=[...] with snippet headers skipif and onlyif (sqllogictest-style):

  • Active labels are the union of the client’s SQL backend name (always present, lowercased) and every string in labels. Extra labels only add tags; they never replace or hide the backend name. On a DuckDB client, for example, onlyif duckdb still matches even if you also pass labels=["trino"], because both duckdb and trino are active. To exercise backend-specific snippets, use a BatchClient with the matching backend (or rely on skipif / onlyif against the real backend name).
  • skipif X: skip the snippet if X is in the active set—for example skipif trino skips when the batch client is Trino, or skipif exclude-slt-bugs when you pass labels=["exclude-slt-bugs"].
  • onlyif: if a snippet declares one or more onlyif lines, it runs when at least one of those tokens is in the active set (OR across lines). For example, onlyif identified-as-bug is skipped by default and runs only when you pass that string in labels, while onlyif trino runs when the client backend is Trino. skipif is evaluated only after the onlyif gate passes.

Other keyword arguments include halt_on_fail, quiet_engine_output, and force_no_schema. The return value is a summary dict (passed, failed, skipped, blocked, per-status lists, timing fields). See the method docstring for the full API.

Version

import featuremesh
print(featuremesh.__version__)

Last update at: 2026/04/27 15:40:31