Dynamic protocol classes using upstream json schemas #2727

Open
dpkp wants to merge 23 commits into master from dpkp/api-message

dpkp (Owner) commented Mar 9, 2026

kafka.protocol.new.* implements protocol classes generated dynamically from the upstream (official) JSON schemas. It builds data-container classes for all request/response structs as well as all internal structs. Struct attributes are accessed by name, as with the current classes, with one alteration: upstream CamelCase names are converted to snake_case.
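To illustrate the general idea (not the code in this branch), here is a minimal sketch of building a data-container class from a schema dict and converting CamelCase field names to snake_case. The schema fragment is only loosely modelled on the upstream layout, and `ExampleRequest`, `to_snake`, and `build_data_class` are hypothetical names invented for this example.

```python
import re

# Hypothetical schema fragment, loosely modelled on the upstream JSON layout.
SCHEMA = {
    "name": "ExampleRequest",
    "fields": [
        {"name": "ClientSoftwareName", "type": "string"},
        {"name": "ClientSoftwareVersion", "type": "string"},
    ],
}


def to_snake(name):
    """Convert CamelCase to snake_case (runs of capitals are not split)."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def build_data_class(schema):
    """Create a class whose attributes mirror the schema's fields."""
    field_names = tuple(to_snake(f["name"]) for f in schema["fields"])
    defaults = {to_snake(f["name"]): f.get("default") for f in schema["fields"]}

    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(field_names)
        if unknown:
            raise TypeError(f"unknown fields: {sorted(unknown)}")
        for name in field_names:
            setattr(self, name, kwargs.get(name, defaults[name]))

    def __repr__(self):
        args = ", ".join(f"{n}={getattr(self, n)!r}" for n in field_names)
        return f"{type(self).__name__}({args})"

    namespace = {"__slots__": field_names, "__init__": __init__, "__repr__": __repr__}
    return type(schema["name"], (), namespace)


ExampleRequest = build_data_class(SCHEMA)
request = ExampleRequest(client_software_name="kafka-python")
print(request)
```

Fields left unset fall back to the schema default (here `None`), and an unknown keyword raises `TypeError` rather than silently creating a new attribute.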

The API version can be specified using the same Request[version] interface as currently used, or it can be deferred and set later with request.version = 2. The version can also be passed as an argument at encoding time: request.encode(version=2).
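The three ways of choosing a version can be sketched with a toy class. This is an illustration under stated assumptions, not the implementation in this branch: `ToyRequest`, `VersionedMeta`, and the two-field wire format are all made up for the example.

```python
import struct


class VersionedMeta(type):
    """Metaclass so that ToyRequest[2] returns a class pinned to version 2."""

    def __getitem__(cls, version):
        name = f"{cls.__name__}_v{version}"
        return type(cls)(name, (cls,), {"_class_version": version})


class ToyRequest(metaclass=VersionedMeta):
    _class_version = None  # unset on the primary (unversioned) class

    def __init__(self, client_id=""):
        self.client_id = client_id
        self.version = self._class_version  # may be assigned later

    def encode(self, version=None):
        # An explicit argument wins over the instance-level version.
        chosen = version if version is not None else self.version
        if chosen is None:
            raise ValueError("version must be set before encoding")
        raw = self.client_id.encode("utf-8")
        # Toy wire format: int16 version, int16 length, then the bytes.
        return struct.pack(">hh", chosen, len(raw)) + raw


pinned = ToyRequest[2]("x").encode()        # version chosen via subscript

deferred = ToyRequest("x")
deferred.version = 2                        # version set after construction
late = ToyRequest("x").encode(version=2)    # version passed at encode time

print(pinned == deferred.encode() == late)
```

All three paths produce the same bytes; the only difference is when the version decision is made.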

The motivation is primarily to remove the maintenance burden of hand-curated protocol definitions, to improve support for newer "flexible" schema versions (this new approach supports all "tagged" fields transparently: they are read and set by name just like other fields), and to enable using a single request instance as a data container before deciding on the API version (though we'll see how this works in practice).
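For context on what "tagged" fields involve on the wire: in flexible versions, a tagged-field section is an unsigned varint count followed by (tag, size, data) entries, with tag and size also encoded as unsigned varints. The sketch below shows that framing and how known tags could be mapped to names while unknown ones are preserved. The helper names and the `known` mapping are hypothetical and do not reflect how this branch implements it.

```python
def encode_uvarint(n):
    """Encode a non-negative int as an unsigned varint (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uvarint(buf, pos):
    """Decode an unsigned varint at buf[pos]; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_tagged(fields):
    """fields maps tag number -> already-encoded bytes for that field."""
    out = encode_uvarint(len(fields))
    for tag in sorted(fields):
        data = fields[tag]
        out += encode_uvarint(tag) + encode_uvarint(len(data)) + data
    return out


def decode_tagged(buf, known):
    """Split a tagged section into named fields and unknown tags."""
    count, pos = decode_uvarint(buf, 0)
    named, unknown = {}, {}
    for _ in range(count):
        tag, pos = decode_uvarint(buf, pos)
        size, pos = decode_uvarint(buf, pos)
        data = bytes(buf[pos:pos + size])
        pos += size
        if tag in known:
            named[known[tag]] = data
        else:
            unknown[tag] = data
    return named, unknown


section = encode_tagged({0: b"ab", 5: b"z"})
named, unknown = decode_tagged(section, known={0: "cluster_id"})
print(named, unknown)
```

Keeping unknown tags in a separate mapping lets a decoder round-trip data written by a newer peer without understanding it.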

Assisted by gemini, which I used for design feedback, help fixing bugs, and filling out the protocol test suite. I actually started this branch sometime last year, before the coding agents were born / reasonably good. I decided to dust it off and complete it with gemini. So here we are. I ended up not using gemini as much as I thought I would, but I did find it very useful.

dpkp force-pushed the dpkp/api-message branch 6 times, most recently from 284d7db to bb53bd9, March 11, 2026 20:45
dpkp added 23 commits March 11, 2026 15:27
  kafka.protocol.messages.json
  ApiArray
  Dont need AbstractType
  Simplify field default()
  ApiSchema(ApiStruct); add more opts to encode/decode
  Add more api_versions test coverage for (1) buffer underruns, (2) response error parsing
  First working version: ApiVersionsRequest w/ tests
  Rename ApiSchema -> ApiMessage
  Split schema.py into class files; add is_array() / is_struct() / is_struct_array() to Field to break circular refs
  Improve nested field and data_class access w/ .fields[]
  Gemini-created api compatibility tests...
  Simplify with Field.has_data_class()
  Implement .tags and .unknown_tags + add tests (gemini)
  Add .load_json_file() to ApiStruct and ApiHeader (gemini)
  Minor fixups for print_class to handle empty/no-field schemas
  Move new protocol bits to kafka.protocol.new ; add kafka.protocol.messages.api_versions
  Add default for tagged struct types; improve NotImplementedError
  Fix ApiArray.inner_type def
  Add Metadata/Fetch/Produce/ListOffsets to kafka.protocol.messages
  .messages.json -> .new.schema; .messages -> .new.messages
  Move new api message tests to test/protocol/new/messages/
  minor formatting fixups for fetch/list/metadata/produce tests
  Add consumer group messages for new protocol
  Update field default handling for literal ints/bools; add test_field.py
  Handle commonStructs in schema json
  Add transactional producer api messages to protocol/new/
  Add admin client messages to protocol/new/
  Add sasl messages to protocol/new/
  Fixup kafka.protocol.new tests
  replace ApiStructData.build_data_class with ApiStructMeta metaclass
  Full metaclass support for ApiMessage and ApiHeader

    * Returns data-class wrappers; drop .data_class calls from tests
    * Drop ApiMessage.from_json_file()
    * Move custom code from ApiMessage to ApiVersionsResponse class
    * Move ApiStructMeta to api_struct_data.py
    * Use weakrefs to avoid circular struct + data_class ref cycles
    * decode() is now class method
    * Drop extra 'Data' from struct data_class names
    * ApiMessage version classes are now simple wrappers around primary
    * Add [None] version lookup to get back to primary class
    * Allow subscript lookups from version classes
    * Calling field instance passes through to data_class

  Simplify ApiStruct.encode and support dict items
  use @classproperty in ApiDataStruct and ApiMessage
  ProduceRequest.expect_response() override
  _version -> _class_version
  Add __slots__ to base classes
  Include Field name in encode/decode errors; decode as default if field not valid for version
  ApiMessage stateful _header attr and with_header()
  ApiMessage API_KEY/API_VERSION; add instance-level _version
dpkp force-pushed the dpkp/api-message branch from bb53bd9 to 0720821, March 11, 2026 22:30