Skip to content

Conversation

@pratiktiwari13
Copy link

Fixes #1032

Currently the snippets of codes in question don't honour the array whose collection is already in progress. The proposed solution is to make sure the context is checked before the array is initialized.

@stleary
Copy link
Owner

stleary commented Jan 29, 2026

Good catch, this is a real bug in forceList when processing XML-to-JSON. However, the code changes will need some work:

  • Please maintain consistency with non-forceList behavior by calling context.append(tagName, "") when a middle or ending empty tag is found. The code currently does not do this.
  • It seems to me that you only need to check context.has(tagName) in both code locations, but the code has additional conditions, which I believe can never occur. For example, if context.isEmpty(), then context.opt(tagName) has to be null, does it not?

@stleary
Copy link
Owner

stleary commented Jan 29, 2026

Also, there is an existing issue where the first entry is empty. Currently it is being dropped, but the non-forceList behavior is to insert an initial empty string. The issues are:

  • Consistency: One of the elements is being dropped, which is inconistent with non-forceList
  • Backwards compatibility: Does this bug justify changing the behavior? Normally backwards compatibility is paramount except for egregious bugs, but we grant more leeway for XML processing.
  • Workaround availability: If they don't like empty array entries, they can be filtered out by the user

Note: if you don't want to address this as being out of scope, let me know, and I will open a new issue to fix it.

[stleary update 1/30/2026] A new issue was added for this: #1040

@pratiktiwari13
Copy link
Author

pratiktiwari13 commented Jan 30, 2026

Good catch, this is a real bug in forceList when processing XML-to-JSON. However, the code changes will need some work:

  • Please maintain consistency with non-forceList behavior by calling context.append(tagName, "") when a middle or ending empty tag is found. The code currently does not do this.
  • It seems to me that you only need to check context.has(tagName) in both code locations, but the code has additional conditions, which I believe can never occur. For example, if context.isEmpty(), then context.opt(tagName) has to be null, does it not?
  1. If I understood you correctly you're talking about the case where an empty element that's not in the forceList, needs to be marked as an empty string and that is something which is already happening. Otherwise if it is in the forceList, it needs to be initialized as an empty array right?
  2. Fixed with context.isEmpty(). Sorry about that!

@pratiktiwari13
Copy link
Author

pratiktiwari13 commented Jan 30, 2026

Also, there is an existing issue where the first entry is empty. Currently it is being dropped, but the non-forceList behavior is to insert an initial empty string. The issues are:

  • Consistency: One of the elements is being dropped, which is inconistent with non-forceList
  • Backwards compatibility: Does this bug justify changing the behavior? Normally backwards compatibility is paramount except for egregious bugs, but we grant more leeway for XML processing.
  • Workaround availability: If they don't like empty array entries, they can be filtered out by the user

Note: if you don't want to address this as being out of scope, let me know, and I will open a new issue to fix it.

Does this bug justify changing the behavior?

I think I would also lean more towards preserving the empty value and have user filter it out as and when needed, because consistency becomes an expectation of the user, if I purely think from a DX point of view.

Note: if you don't want to address this as being out of scope, let me know, and I will open a new issue to fix it.

I don't have a strong preference here, but I think a new issue would be helpful to reason about the changes. I can pick that up as well.

@stleary
Copy link
Owner

stleary commented Jan 30, 2026

Please add this test case to the unit tests. This captures what I was trying to explain in the comments.
Getting this test to pass will probably break the new tests that you added, so they will need to be adjusted as well.
It will probably also fix the new issue #1040 as a side effect, but no worries if it does not.

    @Test
    public void testForceListConsistencyWithDefault() {
        final String originalXml = "<root><id>0</id><id>1</id><id/><id></id></root>";
        final String expectedJsonString = "{\"root\":{\"id\":[0,1,\"\",\"\"]}}";

        // confirm expected result of default array-of-tags processing
        JSONObject json = XML.toJSONObject(originalXml);
        assertEquals(expectedJsonString, json.toString());

        // confirm forceList array-of-tags processing is consistent with default processing
        HashSet<String> forceListCandidates = new HashSet<>();
        forceListCandidates.add("id");
        json = XML.toJSONObject(originalXml,
                new XMLParserConfiguration()
                        .withForceList(forceListCandidates));
        assertEquals(expectedJsonString, json.toString());
    }

@pratiktiwari13 pratiktiwari13 force-pushed the bugfix/empty-force-list branch from b897d59 to 510a03a Compare January 31, 2026 05:05
@sonarqubecloud
Copy link

@pratiktiwari13
Copy link
Author

pratiktiwari13 commented Jan 31, 2026

@stleary I have added the fix for #1040 , but since the resulting json, though might be a valid JSON with mixed types, it is not very type-safe unmarshalling friendly. I understand that changing that will result in a big hit to backwards-compatibility, but maybe in XMLParserConfiguration we can have a way to not append empty string if withIgnoreArrayEmpty is set to true ? or maybe give an option withEmptyAsNull as null might be more type-safe than empty string? Of course this is regardless of forceList or non-forceList

@stleary
Copy link
Owner

stleary commented Jan 31, 2026

What problem does this code solve?
Empty XML elements were not processed correctly when the tag was in the forceList list. This update matches the non-forceList behavior for empty elements: an empty string is set as the new JSONArray element.

Does the code still compile with Java6?
Yes

Risks
Low

Changes to the API?
No

Will this require a new release?
No

Should the documentation be updated?
No

Does it break the unit tests?
No, new unit tests were added

Was any code refactored in this commit?
No

Review status
APPROVED

Starting 3-day comment window

@stleary
Copy link
Owner

stleary commented Jan 31, 2026

the resulting json, though might be a valid JSON with mixed types, it is not very type-safe unmarshalling friendly. I understand that changing that will result in a big hit to backwards-compatibility, but maybe in XMLParserConfiguration we can have a way to not append empty string if withIgnoreArrayEmpty is set to true ? or maybe give an option withEmptyAsNull as null might be more type-safe than empty string? Of course this is regardless of forceList or non-forceList

Good point. Config entries for tags with default values can be added, but I consider this a corner case with an available reasonable workaround where the developer can filter out or modify the non-compliant values. Will address it if someone raises an issue.

@pratiktiwari13
Copy link
Author

Makes sense, modifying them to null can be a reasonable solution even for massive arrays for now. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

XMLParserConfiguration's forceList resets the whole array when empty elements present in the middle or end

2 participants