feat: support Spark luhn_check expression#3573
Conversation
|
Thanks @n0r0shi! Kicking off CI. |
Register datafusion-spark's SparkLuhnCheck UDF and add StaticInvoke handler for ExpressionImplUtils.isLuhnNumber (Spark 3.5+).
|
Fixed a spotless issue. Could you run a CI again? |
andygrove
left a comment
There was a problem hiding this comment.
Nice work handling the RuntimeReplaceable → StaticInvoke chain. The dispatch approach in statics.scala is clean and follows the existing readSidePadding pattern well. Good job documenting why Comet sees StaticInvoke rather than luhn_check directly.
One question on the implementation. Would CometScalarFunction("luhn_check") work here instead of the custom CometLuhnCheck handler? That's the pattern used for readSidePadding, and it would simplify the code. CometScalarFunction already maps expr.children through exprToProtoInternal and calls scalarFunctionExprToProto. I can see the custom handler uses scalarFunctionExprToProtoWithReturnType with an explicit BooleanType return type. Is that needed for correctness, or would the generic path work?
The test coverage is good with valid, invalid, non-numeric, empty, and null cases. The SQL file framework in spark/src/test/resources/sql-tests/expressions/string/ would be a good fit if you'd like to move them there.
Summary
datafusion-spark'sSparkLuhnCheckUDF and addStaticInvokehandler forExpressionImplUtils.isLuhnNumberluhn_checkwas introduced in Spark 3.5 asRuntimeReplaceable, so Comet sees it as aStaticInvoke.Test plan