Variants
The VARIANT() function enables you to create a new version of a feature by replacing specific dependencies with alternative definitions.
VARIANT(
ORIGINAL_FEATURE: feature, -- The feature to create a variant of
ORIGINAL_DEPENDENCIES: list of features, -- Dependencies to replace
OVERWRITES: list of features -- New definitions to use
)
This is equivalent to overwriting those dependencies, but only within the dependency chain of the specified feature; other features that use the same dependencies are unaffected.
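To make the substitution semantics concrete, here is a minimal Python sketch of the idea. It is a toy model, not FeatureQL's implementation: the `evaluate` and `variant` helpers, the feature names (`base`, `rate`, `rate_exp`, `price`), and the positional pairing of the two lists are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Toy model, not FeatureQL: a feature is (function, names of its dependencies).
Feature = Tuple[Callable[..., float], List[str]]


def evaluate(features: Dict[str, Feature], name: str,
             replace: Optional[Dict[str, str]] = None) -> float:
    """Evaluate `name`, reading the overwrite wherever a replaced dependency appears."""
    replace = replace or {}
    func, deps = features[replace.get(name, name)]
    return func(*(evaluate(features, dep, replace) for dep in deps))


def variant(features: Dict[str, Feature], original: str,
            original_dependencies: List[str], overwrites: List[str]):
    """Mimic VARIANT(original, original_dependencies, overwrites).

    Assumption: the two lists are paired by position.
    """
    if len(original_dependencies) != len(overwrites):
        raise ValueError("each replaced dependency needs exactly one overwrite")
    replace = dict(zip(original_dependencies, overwrites))
    return lambda: evaluate(features, original, replace)


# Hypothetical features, used only for this illustration.
features: Dict[str, Feature] = {
    "base": (lambda: 100.0, []),
    "rate": (lambda: 0.5, []),
    "rate_exp": (lambda: 0.25, []),
    "price": (lambda base, rate: base * rate, ["base", "rate"]),
}

price_exp = variant(features, "price", ["rate"], ["rate_exp"])
print(evaluate(features, "price"))  # 50.0 (original definition)
print(price_exp())                  # 25.0 (rate replaced by rate_exp)
```

The original `price` is untouched: the replacement only applies when evaluating through the variant.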
Flexible composition
FeatureQL combines SQL's expressiveness with transformation patterns:
decision = source.transform_a().transform_b()
-- Create a variant of decision to run an experiment
decision_exp = variant(decision REPLACING transform_a WITH transform_a_exp)
-- decision_exp = source.transform_a_exp().transform_b()
This composition design enables:
- Flexible prototyping through interchangeable data sources
- Controlled experimentation with isolated changes
- Testing at any level of the DAG
- Performance optimization via materialization strategies
Where traditional SQL requires managing complex table relationships, FeatureQL focuses on the essence of data transformation: clear operations and their logical composition, unified for analytics, machine learning features, and backend data processing.
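The `decision` / `decision_exp` pattern above can be sketched in Python as a pipeline in which one step is swapped out. This is only an analogy under stated assumptions: the arithmetic inside each transform and the `compose` and `variant` helpers are hypothetical and do not reflect how FeatureQL resolves features internally.

```python
from functools import reduce
from typing import Callable, List

Step = Callable[[int], int]


# Hypothetical transforms; the arithmetic is arbitrary and only for illustration.
def transform_a(x: int) -> int:
    return x + 1


def transform_a_exp(x: int) -> int:
    return x + 10


def transform_b(x: int) -> int:
    return x * 2


def compose(steps: List[Step]) -> Step:
    """Chain the steps left to right, like source.step1().step2()."""
    return lambda source: reduce(lambda value, step: step(value), steps, source)


def variant(steps: List[Step], replacing: Step, with_step: Step) -> List[Step]:
    """Return a copy of the pipeline with one step swapped; the original list is unchanged."""
    return [with_step if step is replacing else step for step in steps]


decision_steps = [transform_a, transform_b]
decision = compose(decision_steps)
decision_exp = compose(variant(decision_steps, transform_a, transform_a_exp))

print(decision(3))      # (3 + 1) * 2 = 8
print(decision_exp(3))  # (3 + 10) * 2 = 26
```

Both pipelines coexist, which is what allows an experiment to run next to the original definition.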
Basic Example
You can evaluate the same feature with two different input overwrites in a single query:
SELECT
RADIUS := INPUT(DOUBLE),
AREA_CIRCLE := PI() * POW(RADIUS, 2),
-- Create variants with different RADIUS values
AREA_CIRCLE_2 := VARIANT(AREA_CIRCLE, ARRAY[RADIUS], ARRAY[2.0]), -- Will return PI() * POW(2.0, 2)
AREA_CIRCLE_5 := VARIANT(AREA_CIRCLE, ARRAY[RADIUS], ARRAY[5.0]) -- Will return PI() * POW(5.0, 2)
;
Notes:
- To simplify this syntax and transform a feature into a reusable template, please refer to section 1.3.5.1 Macros.
- For more advanced examples, including shared dependencies and nested variants, please refer to section 2.4.2 Feature variants.
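As a rough analogy for readers more familiar with general-purpose languages, the two variants in the basic example behave like the same function with its input bound to different values. The Python sketch below uses `functools.partial` for that binding; it illustrates the idea only and says nothing about how FeatureQL evaluates the query.

```python
import math
from functools import partial


def area_circle(radius: float) -> float:
    """Same formula as AREA_CIRCLE: PI() * POW(RADIUS, 2)."""
    return math.pi * math.pow(radius, 2)


# Each "variant" fixes the RADIUS input to a constant.
area_circle_2 = partial(area_circle, radius=2.0)
area_circle_5 = partial(area_circle, radius=5.0)

print(area_circle_2())  # pi * 2.0 ** 2
print(area_circle_5())  # pi * 5.0 ** 2
```

The unbound `area_circle` still requires a radius, just as `AREA_CIRCLE` still depends on the `RADIUS` input.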
Impact on shared dependencies
Consider this scenario:
-- Original features
SELECT
A := FUNC(C),
B := FUNC(C),
C := FUNC(D, E)
;
-- Create variant of A by replacing D with D_new
SELECT
A_variant := VARIANT(A, [D], [D_new])
;
When A_variant is created:
- It needs a modified version of C that uses D_new instead of D: C_variant := FUNC(D_new, E).
- B continues to use the original C with D: B := FUNC(C).
- A_variant and B no longer share the same C: A_variant := FUNC(C_variant).
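The scenario above can be reproduced with a small Python sketch that rewrites a dependency graph. It is a simplified model under stated assumptions: the `make_variant` and `depends_on` helpers and the `_variant` naming suffix are hypothetical, chosen to match the names used in this section.

```python
from typing import Dict, List

# Feature name -> names of its dependencies; inputs such as D and E have no entry.
Graph = Dict[str, List[str]]


def depends_on(graph: Graph, name: str, replaced: Dict[str, str]) -> bool:
    """True if `name` reaches a replaced feature through its dependencies."""
    return any(dep in replaced or depends_on(graph, dep, replaced)
               for dep in graph.get(name, []))


def make_variant(graph: Graph, original: str, variant_name: str,
                 mapping: Dict[str, str]) -> Graph:
    """Return only the extra definitions the variant needs; `graph` is not modified."""
    extra: Graph = {}

    def resolve(name: str) -> str:
        if name in mapping:
            return mapping[name]                 # replaced dependency
        if not depends_on(graph, name, mapping):
            return name                          # unaffected, shared as is
        new_name = variant_name if name == original else f"{name}_variant"
        extra[new_name] = [resolve(dep) for dep in graph[name]]
        return new_name

    resolve(original)
    return extra


graph: Graph = {"A": ["C"], "B": ["C"], "C": ["D", "E"]}
extra = make_variant(graph, "A", "A_variant", {"D": "D_new"})

for name, deps in extra.items():
    print(f"{name} := FUNC({', '.join(deps)})")
print(f"B := FUNC({', '.join(graph['B'])})")  # B still reads the original C
```

Only the features on the path between A and the replaced D are duplicated; B and the original C keep their definitions.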
For caching and tracking, each feature is identified by its name plus a hash of its entire dependency tree rather than by its name alone. A feature whose dependency chain was changed by a variant therefore never collides with the original.
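One way to picture this identity scheme is the Python sketch below. The hashing details are assumptions: the source does not say which hash function or encoding FeatureQL uses, so SHA-256 truncated to 12 hex characters and the `feature_id` helper are purely illustrative.

```python
import hashlib
from typing import Dict, List

Graph = Dict[str, List[str]]  # feature name -> dependency names


def feature_id(graph: Graph, name: str) -> str:
    """Name plus a hash covering the whole dependency tree below the feature."""
    dep_ids = ",".join(feature_id(graph, dep) for dep in graph.get(name, []))
    payload = f"{name}({dep_ids})"
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]  # assumed scheme
    return f"{name}#{digest}"


original: Graph = {"A": ["C"], "C": ["D", "E"]}
in_variant: Graph = {"A": ["C"], "C": ["D_new", "E"]}  # same names, D swapped for D_new

print(feature_id(original, "C") == feature_id(in_variant, "C"))  # False
print(feature_id(original, "A") == feature_id(in_variant, "A"))  # False
print(feature_id(original, "E") == feature_id(in_variant, "E"))  # True
```

A keeps the same name and the same direct definition in both graphs, yet its identifier changes because something deeper in its tree changed, while the untouched input E keeps its identifier and can be shared.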
Nested Variants
When a DAG contains multiple VARIANT calls, substitutions are applied following the path from the original feature toward the root. Substitutions closer to the original feature take precedence over those closer to the root.
-- Original feature definitions
SELECT
C := FUNC(D, E), -- Base computation
B := FUNC(C), -- Intermediate computation
F := FUNC(G), -- Another base computation
A := FUNC(B, C, F), -- Final computation uses both chains
D_new := FUNC(F) -- D_new depends on F
;
-- Create nested variants
SELECT
-- First variant: replace D with D_new in the dependency chain
A_variant := VARIANT(
A, -- Original feature
[D], -- Replace D
[D_new] -- with D_new (which depends on F)
),
-- Nested variant: builds upon A_variant and also replaces F with F_new
A_variant_nested := VARIANT(
A_variant, -- Build on previous variant
[F], -- Replace F
[F_new] -- with F_new
)
;
When resolved, this creates contextualized features like this:
-- First level variant (A_variant)
F := FUNC(G)
D_new := FUNC(F)
C_ctx_D_D_new := FUNC(D_new, E)
B_ctx_D_D_new := FUNC(C_ctx_D_D_new)
A_variant := FUNC(B_ctx_D_D_new, C_ctx_D_D_new, F)
-- Nested variant (A_variant_nested) - incorporates both changes
F_new := FUNC(G)
D_new_ctx_F_F_new := FUNC(F_new) -- D_new gets contextualized with F_new
C_ctx_D_D_new_F_F_new := FUNC(D_new_ctx_F_F_new, E)
B_ctx_D_D_new_F_F_new := FUNC(C_ctx_D_D_new_F_F_new)
A_variant_nested := FUNC(B_ctx_D_D_new_F_F_new, C_ctx_D_D_new_F_F_new, F_new)
This shows a more complex interaction where:
- The first variant introduces D_new which depends on F
- When F is replaced with F_new in the nested variant, it affects both:
  - The direct F dependency in A
  - The indirect F dependency through D_new
- All the contextualized features properly reflect both changes in their names
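The contextualized names listed above can be reproduced with a short Python sketch. It rests on several assumptions that the source does not confirm: a nested variant is modeled as the original feature resolved with the merged replacements, the `_ctx_` suffix lists only the replacements that actually affect a feature, and conflicting replacements (the precedence rule) are not modeled. The `resolve_variant` helper is hypothetical.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]  # feature name -> dependency names; inputs have no entry


def resolve_variant(graph: Graph, original: str, variant_name: str,
                    mapping: Dict[str, str]) -> Graph:
    """Return the contextualized definitions needed to compute the variant."""
    out: Graph = {}

    def resolve(name: str) -> Tuple[str, List[str]]:
        """Return (resolved name, replaced features that affected it)."""
        if name in mapping:
            new_name, applied = resolve(mapping[name])
            return new_name, [name] + applied
        if name not in graph:                    # plain input, nothing to rewrite
            return name, []
        resolved: List[str] = []
        applied: List[str] = []
        for dep in graph[name]:
            dep_name, dep_applied = resolve(dep)
            resolved.append(dep_name)
            applied += [key for key in dep_applied if key not in applied]
        if not applied:                          # unaffected, keep the shared feature
            return name, []
        suffix = "_".join(f"{key}_{mapping[key]}" for key in mapping if key in applied)
        new_name = variant_name if name == original else f"{name}_ctx_{suffix}"
        out[new_name] = resolved
        return new_name, applied

    resolve(original)
    return out


graph: Graph = {
    "C": ["D", "E"], "B": ["C"], "F": ["G"],
    "A": ["B", "C", "F"], "D_new": ["F"], "F_new": ["G"],
}
first = resolve_variant(graph, "A", "A_variant", {"D": "D_new"})
nested = resolve_variant(graph, "A", "A_variant_nested", {"D": "D_new", "F": "F_new"})

for name, deps in nested.items():
    print(f"{name} := FUNC({', '.join(deps)})")
```

In this model D_new only receives the `F_F_new` suffix, because the D replacement does not change anything inside D_new itself, whereas C and B carry both suffixes.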