
feat: support Spark luhn_check expression#3573

Open
n0r0shi wants to merge 3 commits into apache:main from n0r0shi:luhn-check

Conversation


@n0r0shi n0r0shi commented Feb 23, 2026

Summary

  • Register datafusion-spark's SparkLuhnCheck UDF and add StaticInvoke handler for ExpressionImplUtils.isLuhnNumber
  • luhn_check was introduced in Spark 3.5 as RuntimeReplaceable, so Comet sees it as a StaticInvoke.

Test plan

  • Column data with valid, invalid, non-numeric, empty, and null values
  • Literal value
  • Null handling
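
As background for these cases, the checksum that luhn_check computes can be sketched in Rust. This is a minimal standalone sketch, not the actual SparkLuhnCheck / datafusion-spark code; note that in Spark a NULL input yields NULL, which this plain `&str` signature does not model.

```rust
// Hypothetical sketch of Luhn-check semantics (not the datafusion-spark implementation).
// Spark's luhn_check returns false for empty or non-numeric input and NULL for NULL input.
fn luhn_check(s: &str) -> bool {
    // Empty strings and any non-digit character are invalid.
    if s.is_empty() || !s.chars().all(|c| c.is_ascii_digit()) {
        return false;
    }
    // Walk digits right to left, doubling every second digit;
    // doubled values over 9 have 9 subtracted (equivalent to summing their digits).
    let sum: u32 = s
        .chars()
        .rev()
        .enumerate()
        .map(|(i, c)| {
            let d = c.to_digit(10).unwrap();
            if i % 2 == 1 {
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();
    // Valid numbers have a checksum divisible by 10.
    sum % 10 == 0
}

fn main() {
    assert!(luhn_check("79927398713"));   // valid checksum
    assert!(!luhn_check("79927398714"));  // invalid checksum
    assert!(!luhn_check("4111-1111"));    // non-numeric characters
    assert!(!luhn_check(""));             // empty string
    println!("ok");
}
```

These assertions mirror the valid, invalid, non-numeric, and empty cases in the test plan above.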

@mbutrovich (Contributor)

Thanks @n0r0shi! Kicking off CI.

Register datafusion-spark's SparkLuhnCheck UDF and add StaticInvoke
handler for ExpressionImplUtils.isLuhnNumber (Spark 3.5+).
@n0r0shi n0r0shi marked this pull request as ready for review February 24, 2026 09:49
n0r0shi (Author) commented Feb 24, 2026

Fixed a Spotless formatting issue. Could you run CI again?

@andygrove (Member) left a comment

Nice work handling the RuntimeReplaceable → StaticInvoke chain. The dispatch approach in statics.scala is clean and follows the existing readSidePadding pattern well. Good job documenting why Comet sees a StaticInvoke rather than luhn_check directly.

One question on the implementation. Would CometScalarFunction("luhn_check") work here instead of the custom CometLuhnCheck handler? That's the pattern used for readSidePadding, and it would simplify the code. CometScalarFunction already maps expr.children through exprToProtoInternal and calls scalarFunctionExprToProto. I can see the custom handler uses scalarFunctionExprToProtoWithReturnType with an explicit BooleanType return type. Is that needed for correctness, or would the generic path work?

The test coverage is good with valid, invalid, non-numeric, empty, and null cases. The SQL file framework in spark/src/test/resources/sql-tests/expressions/string/ would be a good fit if you'd like to move them there.

