[Expr Serde] [PR 2/X] Operation type enum conversion helper#534
[Expr Serde] [PR 2/X] Operation type enum conversion helper#534evindj wants to merge 8 commits intoapache:mainfrom
Conversation
| // Helper to test round-trip serialization | ||
| // Uses string comparison since expressions may have different internal identity | ||
| // but the same semantic meaning (i.e., ToString() output matches) | ||
| void TestRoundTrip(const Expression& expr) { |
There was a problem hiding this comment.
You add this helper function but not used anywhere.
src/iceberg/CMakeLists.txt
Outdated
| expression/expression.cc | ||
| expression/expressions.cc | ||
| expression/inclusive_metrics_evaluator.cc | ||
| expression/json_internal.cc |
There was a problem hiding this comment.
By convention, this file should be named json.cc, as internal is typically reserved for header files. I'm aware that we already have a json_internal.cc, but that’s legacy. I think json_serde.cc/ json_serde_internal.h would be a better choice. What do you think, @wgtmac?
| #include "iceberg/iceberg_export.h" | ||
| #include "iceberg/result.h" | ||
|
|
||
| /// \file iceberg/expression/json_internal.h |
There was a problem hiding this comment.
This comment appears to be stale.
| ICEBERG_EXPORT nlohmann::json ToJson(const Model& model); | ||
|
|
||
| /// \note Don't forget to add `ICEBERG_DEFINE_FROM_JSON` to the end of | ||
| /// `json_internal.cc` to define the `FromJson` function for the model. |
| /// \return The operation type string (e.g., "eq", "lt-eq", "is-null") | ||
| ICEBERG_EXPORT std::string_view ToStringOperationType(Expression::Operation op); | ||
|
|
||
| /// Check if an operation is a unary predicate (no values) |
There was a problem hiding this comment.
Maybe we don't need the explanation in the parentheses, unary predicate is self-explanatory?
| /// Check if an operation is a unary predicate (no values) | ||
| ICEBERG_EXPORT bool IsUnaryOperation(Expression::Operation op); | ||
|
|
||
| /// Check if an operation is a set predicate (multiple values) |
We have the convention that only header files should use the `_internal` suffix, so that iceberg_install_all_headers does not install them. However, we have some existing .cc files with the _internal suffix, which should be fixed. This commit only addresses json_internal.h/.cc, since we discussed renaming them to json_serde_internal.h and json_serde.cc in apache#534
We have the convention that only header files should use the `_internal` suffix, so that iceberg_install_all_headers does not install them. However, we have some existing .cc files with the _internal suffix, which should be fixed. This commit only addresses json_internal.h/.cc, since we discussed renaming them to json_serde_internal.h and json_serde.cc in #534
| } // namespace | ||
|
|
||
| /// Check if an operation is a unary predicate (no values) | ||
| bool IsUnaryOperation(Expression::Operation op) { |
There was a problem hiding this comment.
Is it better to add a iceberg/util/expression_util.h for this and similar functions?
|
|
||
| Result<std::shared_ptr<Expression>> ExpressionFromJson(const nlohmann::json& json) { | ||
| // Handle boolean | ||
| if (json.is_boolean()) { |
There was a problem hiding this comment.
Should we combine this PR with another one as they have duplicate content?
| /// | ||
| /// \param type_str The JSON type string | ||
| /// \return The corresponding Operation or an error if unknown | ||
| Result<Expression::Operation> OperationTypeFromString(const std::string_view type_str) { |
There was a problem hiding this comment.
| Result<Expression::Operation> OperationTypeFromString(const std::string_view type_str) { | |
| Result<Expression::Operation> OperationTypeFromString(std::string_view type_str) { |
| return kTypeStartsWith; | ||
| case Expression::Operation::kNotStartsWith: | ||
| return kTypeNotStartsWith; | ||
| default: |
There was a problem hiding this comment.
Can we remove the default branch and add std::unreachable in the end? In this way, complier will complain if we forget to add a new case in the future.
|
closing in favor of PR #532 |
Summary
This PR is a small change that will be used later. It is a simple string to enum and enum to string conversion. It does not add any functionality
Test
Added a unit tests for one of the case, just to confirm it works.
./build/src/iceberg/test/expression_test --gtest_filter="ExpressionJson*" ✔ 1322 11:28:25
Running main() from gmock_main.cc
Note: Google Test filter = ExpressionJson*
[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from ExpressionJsonTest
[ RUN ] ExpressionJsonTest.TrueExpression
[ OK ] ExpressionJsonTest.TrueExpression (0 ms)
[ RUN ] ExpressionJsonTest.FalseExpression
[ OK ] ExpressionJsonTest.FalseExpression (0 ms)
[ RUN ] ExpressionJsonTest.OpToString
[ OK ] ExpressionJsonTest.OpToString (0 ms)
[----------] 3 tests from ExpressionJsonTest (0 ms total)
[----------] Global test environment tear-down
[==========] 3 tests from 1 test suite ran. (0 ms total)
[ PASSED ] 3 tests.