pr: stream simple layouts for non-file inputs #11244

Open
mattsu2020 wants to merge 4 commits into uutils:main from mattsu2020:pr_zero

@mattsu2020
Contributor

Summary

This PR adds a streaming path for simple pr layouts when reading from non-file inputs such as FIFOs and character devices.

The main goal is to avoid buffering the entire input in memory for cases where pr can process the input line by line. This also makes inputs like /dev/zero start producing output immediately instead of waiting for a full read.
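
As a rough illustration of how a supported non-file input might be recognized on Unix, the sketch below checks the file type through the standard library's `FileTypeExt` trait. The helper names `is_streamable_file_type` and `is_streamable_input` are hypothetical and not taken from this PR; the actual selection logic in the patch may differ.

```rust
use std::fs::{self, FileType};
use std::io;
use std::os::unix::fs::FileTypeExt; // Unix-only: provides is_fifo / is_char_device

/// Hypothetical helper (not from the PR): true for the input kinds the
/// summary names as streaming candidates, i.e. FIFOs and character devices.
fn is_streamable_file_type(file_type: &FileType) -> bool {
    file_type.is_fifo() || file_type.is_char_device()
}

/// Hypothetical helper (not from the PR): look up the file type of `path`
/// and report whether it is a FIFO or character device.
fn is_streamable_input(path: &str) -> io::Result<bool> {
    let file_type = fs::metadata(path)?.file_type();
    Ok(is_streamable_file_type(&file_type))
}
```

Under these assumptions, a character device such as `/dev/zero` would be classified as streamable, while regular files and directories would stay on the buffered path.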

What changed

  • add a streaming implementation for simple pr layouts

  • use the streaming path for supported non-file inputs

  • align streaming output with the existing buffered behavior, including page trailer formatting

  • validate UTF-8 incrementally while streaming and report invalid input cleanly

  • fall back to the buffered path when the selected page layout leaves no printable content lines

  • add regression tests for:

    • character-device input
    • invalid UTF-8 from a FIFO
    • FIFO/file parity in zero-content-line edge cases

Related: pr /dev/zero [SIGPIPE|SIGKILL] #11139
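
To make the incremental UTF-8 validation bullet concrete, here is a minimal sketch of one way to validate chunked input using only `std::str::from_utf8` and `Utf8Error`. The `Utf8Stream` type, its methods, and the `InvalidUtf8` error are hypothetical names for illustration and are not the PR's implementation. The idea is to carry an incomplete multibyte tail over to the next chunk and to reject it if it is still incomplete at EOF.

```rust
use std::str;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct InvalidUtf8;

/// Hypothetical incremental validator: holds bytes that are not yet
/// known to form complete UTF-8 sequences.
#[derive(Default)]
struct Utf8Stream {
    pending: Vec<u8>,
}

impl Utf8Stream {
    /// Feed one chunk and return the text that is now known to be valid.
    fn push(&mut self, chunk: &[u8]) -> Result<String, InvalidUtf8> {
        self.pending.extend_from_slice(chunk);
        let valid_len = match str::from_utf8(&self.pending) {
            Ok(text) => text.len(),
            // error_len() == None means the data ended in the middle of a
            // sequence, so keep the tail and wait for the next chunk.
            Err(e) if e.error_len().is_none() => e.valid_up_to(),
            // Any other error is a definitely invalid byte sequence.
            Err(_) => return Err(InvalidUtf8),
        };
        // Keep the incomplete tail in `pending`, hand back the valid prefix.
        let tail = self.pending.split_off(valid_len);
        let valid = std::mem::replace(&mut self.pending, tail);
        Ok(String::from_utf8(valid).expect("prefix was validated above"))
    }

    /// Call at EOF: a leftover tail is a truncated multibyte sequence.
    fn finish(self) -> Result<(), InvalidUtf8> {
        if self.pending.is_empty() {
            Ok(())
        } else {
            Err(InvalidUtf8)
        }
    }
}
```

A caller would feed each chunk read from the FIFO to `push`, emit the returned text, and call `finish` once the reader reports EOF.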

@sylvestre
Contributor

The list of commits isn't ideal. Could you please improve it? Thanks.

Add a streaming code path for simple pr layouts so pipes and devices can be processed without buffering the full input in memory. Keep the import-only cleanup in the same commit to avoid review noise.
Tighten streaming mode selection for direct non-file inputs and align page trailer formatting with the existing buffered implementation. Update the regression test to exercise the supported char-device path.
Reject invalid UTF-8 incrementally in the streaming path, including truncated multibyte tails at EOF, and cover the failure mode with a FIFO-based regression test. Fold the spell-checker update into the same commit because it is introduced by the new test.
Avoid the streaming path for non-file inputs when the requested layout leaves zero printable content lines, and add FIFO-vs-file regressions for those edge cases. Keep the unix-only test import cleanup with the helper it belongs to.
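
The mode selection these commit messages describe could look roughly like the sketch below. Everything here is an assumption for illustration: the `ReadMode` enum, the function names, and the way content lines are computed from the page length are not taken from the patch.

```rust
/// Hypothetical read modes for this sketch.
#[derive(Debug, PartialEq)]
enum ReadMode {
    Streaming,
    Buffered,
}

/// Assumed model: lines left for content once header and trailer lines are
/// subtracted from the page length. Saturates at zero instead of underflowing.
fn content_lines_per_page(page_length: usize, header_lines: usize, trailer_lines: usize) -> usize {
    page_length.saturating_sub(header_lines + trailer_lines)
}

/// Stream only when the input is a non-file input, the layout is simple,
/// and each page has at least one content line; otherwise use the buffered path.
fn choose_read_mode(is_non_file_input: bool, is_simple_layout: bool, content_lines: usize) -> ReadMode {
    if is_non_file_input && is_simple_layout && content_lines > 0 {
        ReadMode::Streaming
    } else {
        ReadMode::Buffered
    }
}
```

With a 66-line page and five-line header and trailer, this model leaves 56 content lines and would permit streaming; a layout that leaves zero content lines would fall back to the buffered path.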
@github-actions

github-actions bot commented Mar 9, 2026

GNU testsuite comparison:

Skip an intermittent issue tests/date/date-locale-hour (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/rm/many-dir-entries-vs-OOM is now passing!
Congrats! The gnu test tests/tail/pipe-f is now passing!

@codspeed-hq

codspeed-hq bot commented Mar 9, 2026

Merging this PR will improve performance by 3.64%

⚡ 1 improved benchmark
✅ 297 untouched benchmarks
⏩ 48 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|------|-----------|------|------|------------|
| Simulation | du_summarize_balanced_tree[(5, 4, 10)] | 6.7 ms | 6.5 ms | +3.64% |

Comparing mattsu2020:pr_zero (8a28cc6) with main (995c9e0)

Footnotes

  1. 48 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
