expr: Make sqlfunc macro generic over container element types #35431
Open
antiguru wants to merge 11 commits into MaterializeInc:main
Add phantom type parameters to DatumList, DatumMap, and Array (defaulting to Datum<'a>) so the #[sqlfunc] proc macro can track which input type parameter flows to the output.

The macro detects generic type parameters like T in `fn foo<T>(a: DatumList<T>) -> T`, auto-derives the output_type_expr, and erases T → Datum<'a> in the emitted code. Supports multiple generic type parameters (e.g., `<A, B>`) and all container types: DatumList, Array, DatumMap, Range.

Also adds DatumList::iter_typed() which yields typed T elements via a FromDatum trait, enabling compile-time type safety while monomorphizing to identity at runtime.

Migrates range_lower, range_upper, list_list_concat, list_element_concat, element_list_concat, list_remove, and map_get_value to use the new generic syntax.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
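The phantom-parameter idea in this commit can be illustrated with a minimal, self-contained sketch. It is not the code from `mz_repr`: the three-variant `Datum`, the slice-backed `DatumList`, the `TypedIter` wrapper, and `list_first` are simplified stand-ins assumed for illustration. Only the names `DatumList`, `FromDatum`, `iter_typed`, and the `PhantomData<fn() -> T>` marker come from the PR text.

```rust
use std::marker::PhantomData;

/// Simplified stand-in for the untyped value type (assumption: the real
/// `Datum<'a>` has many more variants).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Datum<'a> {
    Null,
    Int64(i64),
    String(&'a str),
}

/// Converts an untyped `Datum` into a typed element.
pub trait FromDatum<'a> {
    fn from_datum(d: Datum<'a>) -> Self;
}

/// The identity conversion: this is what the default `T = Datum<'a>` uses.
impl<'a> FromDatum<'a> for Datum<'a> {
    fn from_datum(d: Datum<'a>) -> Self {
        d
    }
}

/// Hypothetical typed element, used only to show a non-identity conversion.
impl<'a> FromDatum<'a> for i64 {
    fn from_datum(d: Datum<'a>) -> Self {
        match d {
            Datum::Int64(i) => i,
            other => panic!("expected Int64, got {other:?}"),
        }
    }
}

/// Storage stays untyped; `T` is a compile-time tag that defaults to `Datum<'a>`.
pub struct DatumList<'a, T = Datum<'a>> {
    data: &'a [Datum<'a>],
    _phantom: PhantomData<fn() -> T>,
}

impl<'a, T> DatumList<'a, T> {
    pub fn new(data: &'a [Datum<'a>]) -> Self {
        DatumList { data, _phantom: PhantomData }
    }
}

/// Typed iterator that wraps the untyped one and converts each element.
pub struct TypedIter<'a, T> {
    inner: std::slice::Iter<'a, Datum<'a>>,
    _phantom: PhantomData<fn() -> T>,
}

impl<'a, T: FromDatum<'a>> Iterator for TypedIter<'a, T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        self.inner.next().map(|d| T::from_datum(*d))
    }
}

impl<'a, T: FromDatum<'a>> DatumList<'a, T> {
    pub fn iter_typed(&self) -> TypedIter<'a, T> {
        TypedIter { inner: self.data.iter(), _phantom: PhantomData }
    }
}

/// A function generic over the element type: the signature alone says that
/// the output type is the list's element type.
pub fn list_first<'a, T: FromDatum<'a>>(list: DatumList<'a, T>) -> Option<T> {
    list.iter_typed().next()
}
```

Because `T` defaults to `Datum<'a>`, existing code that writes plain `DatumList<'a>` keeps compiling, while `DatumList<'a, i64>` in this sketch yields `i64` values from `iter_typed()`.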
antiguru commented Mar 11, 2026
* Add typed map iterator (`DatumDictTypedIter`) wrapping `DatumDictIter` and `DatumMap::iter_typed()` method
* Add `RowArena::make_datum_list()` returning typed `DatumList<'a, T>`
* Migrate list concat/remove functions to use `make_datum_list`
* Change `map_get_value` return type to `Option<T>` using `iter_typed()`
* Make `DatumList::empty()` and `DatumMap::empty()` generic over T
* Make `Debug` for `DatumMap` generic over T

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add insta tests showing macro expansion and erasure for:

* Unary generic over Range<T> (range_lower)
* Binary generic over DatumList<T> (list_list_concat)
* Multi-generic binary with A, B (multi_generic)

Also addresses remaining PR feedback:

* Add DatumDictTypedIter wrapping DatumDictIter (no duplication)
* Add RowArena::make_datum_list for typed list construction
* Use iter_typed() in list/map functions
* Make DatumList/DatumMap::empty() and DatumMap Debug generic
* Fix line overflow in quote! macros

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of erasing T → Datum<'a> in the emitted function, keep the generic signature intact so users see readable, type-safe code.

Split FromDatum into two traits:

* FromDatum<'a>: converts Datum<'a> → T (for iteration)
* IntoDatum<'a>: converts T → Datum<'a> (for packing)

Add RowPacker::push_datum and push_list_elems for generic element types. Users add FromDatum<'a> bounds to their function signatures where needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`make_datum_list` previously accepted a `FnOnce(&mut RowPacker)` closure, which made it easy to accidentally double-wrap elements (e.g., calling `push_list_elems` inside the closure would create a nested list). Change it to accept an `IntoIterator<Item = T>` instead, which guarantees only elements of type `T` are pushed and prevents the double-wrapping bug that caused `list_list_concat` to produce `List([List([...])])` instead of `List([...])`. Simplify the list function implementations to use iterator combinators. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
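The double-wrapping hazard described in this commit can be reproduced in a toy model. The sketch below assumes a simplified `Value` enum and a `Packer` type standing in for `Datum` and `RowPacker`; `make_list_with`, `make_list`, `IntoElem`, and both `concat` functions are hypothetical names, not the signatures in `mz_repr`.

```rust
/// Toy value type (assumption: stands in for `Datum`).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    List(Vec<Value>),
}

/// Toy packer (assumption: stands in for `RowPacker`).
#[derive(Default)]
pub struct Packer {
    items: Vec<Value>,
}

impl Packer {
    pub fn push(&mut self, v: Value) {
        self.items.push(v);
    }

    /// Pushes one nested list whose elements are produced by `f`.
    pub fn push_list_with(&mut self, f: impl FnOnce(&mut Packer)) {
        let mut inner = Packer::default();
        f(&mut inner);
        self.items.push(Value::List(inner.items));
    }
}

/// Closure-based constructor: the caller gets the whole packer and can push
/// anything, including another list.
pub fn make_list_with(f: impl FnOnce(&mut Packer)) -> Value {
    let mut p = Packer::default();
    f(&mut p);
    Value::List(p.items)
}

/// Hypothetical element trait, implemented here only for `i64`.
pub trait IntoElem {
    fn into_elem(self) -> Value;
}

impl IntoElem for i64 {
    fn into_elem(self) -> Value {
        Value::Int(self)
    }
}

/// Iterator-based constructor: the caller can only supply elements of type `T`.
pub fn make_list<T: IntoElem>(elems: impl IntoIterator<Item = T>) -> Value {
    Value::List(elems.into_iter().map(T::into_elem).collect())
}

/// Accidentally wraps the concatenation in a second list.
pub fn concat_buggy(a: &[i64], b: &[i64]) -> Value {
    make_list_with(|p| {
        p.push_list_with(|inner| {
            for x in a.iter().chain(b) {
                inner.push(Value::Int(*x));
            }
        });
    })
}

/// The iterator-based API leaves no room for the extra wrapper.
pub fn concat(a: &[i64], b: &[i64]) -> Value {
    make_list(a.iter().chain(b).copied())
}
```

In this model `concat_buggy` type-checks but returns a list containing one list, whereas `concat` can only ever produce a flat list because `make_list` never exposes the packer.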
…methods

Replace the `IntoDatum` trait with `Borrow<Datum<'a>>` as a supertrait on `FromDatum`. This makes `push_datum` and `push_list_elems` on `RowPacker` unnecessary since callers can use the existing `push` method with `*val.borrow()`.

Also refactor `DatumListTypedIter` to wrap `DatumListIter` and delegate to it, matching the pattern already used by `DatumDictTypedIter`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
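A rough sketch of the `Borrow<Datum<'a>>` supertrait arrangement follows. The reduced `Datum`, the `Int64Elem` newtype, the `Vec`-backed `RowPacker`, and `pack_all` are assumptions made for illustration; only the trait shape and the `*val.borrow()` idiom are taken from the commit message.

```rust
use std::borrow::Borrow;

/// Simplified stand-in for the untyped value type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Datum<'a> {
    Null,
    Int64(i64),
    String(&'a str),
}

/// One trait covers both directions: `from_datum` for iteration, and the
/// `Borrow` supertrait for getting the underlying `Datum` back when packing.
pub trait FromDatum<'a>: Borrow<Datum<'a>> {
    fn from_datum(d: Datum<'a>) -> Self;
}

/// `Datum<'a>: Borrow<Datum<'a>>` already holds via the standard library's
/// blanket `impl<T> Borrow<T> for T`.
impl<'a> FromDatum<'a> for Datum<'a> {
    fn from_datum(d: Datum<'a>) -> Self {
        d
    }
}

/// Hypothetical typed element. `Borrow` hands out a reference, so a typed
/// element in this sketch has to store the `Datum` it was built from.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Int64Elem<'a>(Datum<'a>);

impl<'a> Borrow<Datum<'a>> for Int64Elem<'a> {
    fn borrow(&self) -> &Datum<'a> {
        &self.0
    }
}

impl<'a> FromDatum<'a> for Int64Elem<'a> {
    fn from_datum(d: Datum<'a>) -> Self {
        Int64Elem(d)
    }
}

/// Toy packer with only the plain `push` method.
#[derive(Default)]
pub struct RowPacker<'a> {
    datums: Vec<Datum<'a>>,
}

impl<'a> RowPacker<'a> {
    pub fn push(&mut self, d: Datum<'a>) {
        self.datums.push(d);
    }

    pub fn datums(&self) -> &[Datum<'a>] {
        &self.datums
    }
}

/// Generic packing needs no extra `push_datum` method: borrow, copy, push.
pub fn pack_all<'a, T: FromDatum<'a>>(
    packer: &mut RowPacker<'a>,
    elems: impl IntoIterator<Item = T>,
) {
    for e in elems {
        // Explicitly typed form of `*e.borrow()`.
        let d: &Datum<'a> = e.borrow();
        packer.push(*d);
    }
}
```

One consequence visible in the sketch: a primitive such as `i64` cannot implement this `FromDatum`, since it has no `Datum` to lend out. Whether the PR relies on that restriction is not stated in the commit message.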
…ility

The `element_list_concat` and `list_element_concat` functions now correctly report their output as non-nullable since they always produce a list, even when one input is NULL. Update the expected descriptors for `mz_dataflow_channel_operators_per_worker` and `mz_dataflow_channel_operators` to reflect this.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Convert `cast_array_to_list_one_dim` and `array_index` to use generic type parameters, removing their explicit `output_type_expr` attributes. Generalize `Array::dims()` and `Array::elements()` from `impl<'a> Array<'a>` to `impl<'a, T> Array<'a, T>` so they work with typed arrays. Add cross-container derivation case `(InDatumList, InArray)` to the sqlfunc macro so it can auto-derive the output type when a function converts from `Array<T>` to `DatumList<T>`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Convert range_union, range_intersection, range_empty, range_lower_inc, range_upper_inc, range_lower_inf, and range_upper_inf to use generic type parameters instead of concrete Datum types, enabling auto-derived output_type_expr in the sqlfunc macro. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
antiguru commented Mar 11, 2026
* Remove no-op `emitted_func` if/else blocks in unary, binary, variadic
* Rename `iter_typed` to `typed_iter` on DatumList and DatumMap
* Add private constructors `DatumList::new()` and `DatumMap::new()`, use them at all construction sites
* Handle `(Bare, Bare)` case in output type derivation by forwarding the input type
* Prefer container usages over bare in tuple generic classification
* Extract container type name strings to constants and add compile-time test verifying they match actual type names
* Update design doc to reflect current implementation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add phantom type parameters to `DatumList`, `DatumMap`, and `Array` (defaulting to `Datum<'a>`) so the `#[sqlfunc]` proc macro can track which input type parameter flows to the output.

The macro detects generic type parameters like `T` in `fn foo<T>(a: DatumList<T>) -> T`, classifies their structural position (bare, in list, in array, in map, in range), and auto-derives `output_type_expr`.

Since `T` defaults to `Datum<'a>`, the emitted function compiles without type erasure.

Changes in `mz_repr`:

* Add `PhantomData<fn() -> T>` to `DatumList`, `DatumMap`; propagate `T` through `Array` via its `DatumList` field
* `DatumList::new()`, `DatumMap::new()` to centralize `PhantomData` bookkeeping
* `FromDatum` trait with `Borrow<Datum<'a>>` supertrait, replacing the previous `IntoDatum` trait
* `typed_iter()` on `DatumList` and `DatumMap` yielding typed `T` elements via `FromDatum`
* `RowArena::make_datum_list()` accepting `impl IntoIterator<Item = T>` for type-safe list construction

Changes in `mz_expr_derive_impl`:

* `classify_generic_usage` to detect how `T` appears in types (bare, in container, absent)
* `derive_output_type_for_generic` covering all container-to-container, container-to-bare, bare-to-bare, and cross-container derivations
* Support for multiple generic type parameters (e.g., `<A, B>`)

Migrated functions (removing manual `output_type_expr`):

* `list_list_concat`, `list_element_concat`, `element_list_concat`, `list_remove`
* `cast_array_to_list_one_dim`, `array_index`
* `map_get_value`
* `range_lower`, `range_upper`, `range_empty`, `range_lower_inc`, `range_upper_inc`, `range_lower_inf`, `range_upper_inf`, `range_union`, `range_intersection`

Other:

* `doc/developer/design/20260311_sqlfunc_generic.md`

🤖 Generated with Claude Code
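The output-type derivation described above can be modeled as "unwrap the input to reach `T`, then rewrap `T` for the output". The sketch below is a toy under that assumption: the `SqlType` and `Usage` enums and the three functions are hypothetical, and the real macro works on token streams at compile time rather than computing types at runtime. Classification of where `T` appears is taken as already done.

```rust
/// Toy SQL type model (assumption: far smaller than the real type system).
#[derive(Debug, Clone, PartialEq)]
pub enum SqlType {
    Int64,
    Text,
    List(Box<SqlType>),
    Array(Box<SqlType>),
    Map(Box<SqlType>),
    Range(Box<SqlType>),
}

/// Structural position of a generic parameter `T` inside a Rust type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Usage {
    Bare,
    InList,
    InArray,
    InMap,
    InRange,
}

/// Recovers the SQL type bound to `T` from the input's SQL type.
/// Returns `None` when the input type does not match the claimed position.
fn element_of(usage: Usage, ty: &SqlType) -> Option<SqlType> {
    match (usage, ty) {
        (Usage::Bare, t) => Some(t.clone()),
        (Usage::InList, SqlType::List(e))
        | (Usage::InArray, SqlType::Array(e))
        | (Usage::InMap, SqlType::Map(e))
        | (Usage::InRange, SqlType::Range(e)) => Some((**e).clone()),
        _ => None,
    }
}

/// Rebuilds the output SQL type around the type bound to `T`.
fn wrap(usage: Usage, elem: SqlType) -> SqlType {
    match usage {
        Usage::Bare => elem,
        Usage::InList => SqlType::List(Box::new(elem)),
        Usage::InArray => SqlType::Array(Box::new(elem)),
        Usage::InMap => SqlType::Map(Box::new(elem)),
        Usage::InRange => SqlType::Range(Box::new(elem)),
    }
}

/// Derives the output type from the (output, input) positions of `T`.
pub fn derive_output_type(
    output: Usage,
    input: Usage,
    input_ty: &SqlType,
) -> Option<SqlType> {
    element_of(input, input_ty).map(|elem| wrap(output, elem))
}
```

In this model, a `range_lower`-style signature `Range<T> -> T` corresponds to `(Bare, InRange)`, the `cast_array_to_list_one_dim` case `Array<T> -> DatumList<T>` to `(InList, InArray)`, and `(Bare, Bare)` simply forwards the input type. Nullability is not modeled here.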