Implement streaming FVDL processing with memory tracking to optimize …#920

Open
mneeta wants to merge 1 commit into fortify:dev/v3.x from mneeta:feature/aviator-Stax-fvdl-processor
Conversation

mneeta commented Feb 16, 2026

Problem

The existing FVDL processing relied on DOM-based parsing, which loads the entire XML document into memory.
For large FPR/FVDL files, this resulted in high memory consumption and scalability limitations.

Solution

This PR introduces a streaming-based FVDL processor that parses the XML incrementally instead of loading it entirely into memory.

Key additions:

  • StreamingFVDLProcessor
  • Parser components for metadata, description, and trace
  • Memory tracking support via MemoryTracker
  • YAML-based language comment configuration

The new implementation significantly reduces peak memory usage during parsing.
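
For readers unfamiliar with the approach, here is a minimal sketch of incremental parsing with the JDK's StAX API (`javax.xml.stream`). It is not the code in this PR: the class `StreamingFvdlSketch`, the `PeakHeapTracker` helper (a hypothetical stand-in for `MemoryTracker`), and the choice of the `InstanceID` element are assumptions made for illustration only.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class StreamingFvdlSketch {
    /** Hypothetical stand-in for the PR's MemoryTracker: remembers the highest used-heap sample. */
    static final class PeakHeapTracker {
        private long peakBytes;

        void sample() {
            Runtime rt = Runtime.getRuntime();
            peakBytes = Math.max(peakBytes, rt.totalMemory() - rt.freeMemory());
        }

        long peakBytes() {
            return peakBytes;
        }
    }

    /** Pulls events one at a time; only the current event and the collected ids are held in memory. */
    static List<String> readInstanceIds(InputStream in, PeakHeapTracker tracker) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        // Disable DTDs and external entities, since the input may be untrusted.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        XMLStreamReader reader = factory.createXMLStreamReader(in);
        List<String> ids = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                // "InstanceID" is an assumed element name used only for this example.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "InstanceID".equals(reader.getLocalName())) {
                    ids.add(reader.getElementText());
                    tracker.sample();
                }
            }
        } finally {
            reader.close();
        }
        return ids;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<FVDL><Vulnerabilities>"
                + "<Vulnerability><InstanceInfo><InstanceID>A1</InstanceID></InstanceInfo></Vulnerability>"
                + "<Vulnerability><InstanceInfo><InstanceID>B2</InstanceID></InstanceInfo></Vulnerability>"
                + "</Vulnerabilities></FVDL>";
        PeakHeapTracker tracker = new PeakHeapTracker();
        List<String> ids = readInstanceIds(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), tracker);
        System.out.println(ids); // prints [A1, B2]
        System.out.println("peak used heap (bytes): " + tracker.peakBytes());
    }
}
```

Sampling `Runtime` heap figures gives only a rough indication, because the garbage collector may or may not have run between samples; a profiler or GC logs would give more reliable numbers.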

rsenden (Contributor) commented Feb 16, 2026

To what extent is this streaming parser related to the DOM-based parser logic used by other Fortify products? I guess other Fortify products could also benefit from a streaming parser. At the same time, we want to use the same parsing logic everywhere to ensure consistency and avoid hard-to-identify differences in behavior (like we've had in the past between SSC & AWB, for example).

As a side note, eventually we'll likely want to move FPR parsing code to fcli-common or similar, as we'd want to reuse this functionality to provide FPR-based fcli commands (i.e., as an enhanced replacement for FPRUtility).

mneeta (Author) commented Feb 17, 2026

Thanks for raising this; it is an important consideration.

The current streaming implementation is functionally aligned with the existing DOM-based parsing logic. The goal was not to introduce new parsing semantics, but to replicate the same behavior while avoiding loading the entire FVDL document into memory.

Specifically:

  • The streaming parser follows the same structural interpretation of FVDL elements as the DOM-based implementation.
  • No changes were made to business logic or vulnerability interpretation.
  • The focus was strictly on improving memory efficiency and scalability.
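
One way the alignment described above could be checked is a parity test that runs both parsing styles over the same input and compares the results. The sketch below is an assumption-laden illustration, not code from this PR: `FvdlParityCheckSketch`, both helper methods, and the `InstanceID` element name are hypothetical, and a real check would compare full vulnerability objects rather than a single field.

```java
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class FvdlParityCheckSketch {
    /** DOM path: builds the whole tree in memory, then queries it. */
    static List<String> idsViaDom(byte[] xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // Reject DOCTYPE declarations, since the input may be untrusted.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        NodeList nodes = doc.getElementsByTagName("InstanceID"); // assumed element name
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            ids.add(nodes.item(i).getTextContent());
        }
        return ids;
    }

    /** StAX path: reads the same elements without materializing the tree. */
    static List<String> idsViaStax(byte[] xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = factory.createXMLStreamReader(new ByteArrayInputStream(xml));
        List<String> ids = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "InstanceID".equals(reader.getLocalName())) {
                    ids.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = ("<FVDL><Vulnerabilities>"
                + "<Vulnerability><InstanceInfo><InstanceID>A1</InstanceID></InstanceInfo></Vulnerability>"
                + "<Vulnerability><InstanceInfo><InstanceID>B2</InstanceID></InstanceInfo></Vulnerability>"
                + "</Vulnerabilities></FVDL>").getBytes(StandardCharsets.UTF_8);
        // Both paths should yield the same ids in document order.
        System.out.println(idsViaDom(xml).equals(idsViaStax(xml))); // prints true
    }
}
```

Running such a comparison over a corpus of real FPR files would give stronger evidence of behavioral parity than a single synthetic document.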

I fully agree that consistency across Fortify products is critical to avoid behavioral differences. Moving the parsing logic into a shared component (e.g., fcli-common) would be architecturally beneficial. However, given the tight timeline for this release, such refactoring would be difficult to complete safely. This would be a good candidate for a follow-up improvement.
