Parser architecture rework#4

Draft
xnacly wants to merge 15 commits into master from parser-architecture-rework

Conversation

@xnacly (Owner) commented Feb 18, 2026

This PR reworks the parser to remove intermediate values and avoidable allocations, and to make it faster overall through less recursion and less GC pressure.

Goals:

  • replace recursion in the parser with an explicit stack for intermediate containers, jumping around inside Parser.parse based on token type (this should also take care of feat: recursive json object test #2, since the stack overflow on deep recursion is replaced by an OOM 😹)
  • replace inline allocations with more preallocation to avoid slice and map growth
  • replace io.ReadAll with a syscall.Mmap in a new libjson.FromFile function
  • deal with escapes in strings, which is somehow still missing
  • add t_string_escapes so that only strings containing escapes pay the cost of calling unescapeInPlace (escape handling slowed things down due to multiple extra branches)
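The explicit-stack idea can be sketched as follows. This is a minimal illustration, not the PR's actual Parser.parse: the tok type and the array-only container handling are made up for the example, and the real parser also tracks objects.

```go
package main

import "fmt"

// tok is a hypothetical token: '[' and ']' open/close an array,
// 'v' carries a scalar value.
type tok struct {
	kind byte
	val  any
}

// parse walks the token stream with an explicit stack of open
// containers instead of recursing, so nesting depth is limited by
// available heap memory rather than by the call stack.
func parse(toks []tok) any {
	var stack [][]any
	var root any
	emit := func(v any) {
		if len(stack) == 0 {
			root = v
		} else {
			stack[len(stack)-1] = append(stack[len(stack)-1], v)
		}
	}
	for _, t := range toks {
		switch t.kind {
		case '[':
			stack = append(stack, []any{})
		case ']':
			top := stack[len(stack)-1]
			stack = stack[:len(stack)-1]
			emit(top)
		case 'v':
			emit(t.val)
		}
	}
	return root
}

func main() {
	// deep nesting is handled without growing the call stack at all
	depth := 100000
	toks := make([]tok, 0, 2*depth+1)
	for i := 0; i < depth; i++ {
		toks = append(toks, tok{kind: '['})
	}
	toks = append(toks, tok{kind: 'v', val: 1})
	for i := 0; i < depth; i++ {
		toks = append(toks, tok{kind: ']'})
	}
	fmt.Println(parse(toks) != nil)
}
```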

Pre-PR benchmarks:

Test input is generated with test/gen.py

| Input size | library       | time    | faster |
| ---------- | ------------- | ------- | ------ |
| 1MB        | libjson       | 9.6ms   | 1.57x  |
|            | encoding/json | 15.0ms  |        |
| 2MB        | libjson       | 36.2ms  | 1.82x  |
|            | encoding/json | 65.9ms  |        |
| 5MB        | libjson       | 74.1ms  | 1.71x  |
|            | encoding/json | 126.6ms |        |

encoding/json

nogc
$ go run cmd/lj.go -libjson=false -s -nogc -pprof test/10MB.json
$ go tool pprof 10MB.json.pprof
(pprof) top
Showing nodes accounting for 120ms, 100% of 120ms total
Showing top 10 nodes out of 36
      flat  flat%   sum%        cum   cum%
      30ms 25.00% 25.00%       40ms 33.33%  encoding/json.(*Decoder).readValue
      10ms  8.33% 33.33%       10ms  8.33%  encoding/json.(*decodeState).convertNumber
      10ms  8.33% 41.67%       20ms 16.67%  encoding/json.(*decodeState).literalInterface
      10ms  8.33% 50.00%       10ms  8.33%  encoding/json.stateEndValue
      10ms  8.33% 58.33%       10ms  8.33%  internal/chacha8rand.(*State).Next
      10ms  8.33% 66.67%       10ms  8.33%  internal/runtime/maps.(*ctrlGroup).setEmpty
      10ms  8.33% 75.00%       10ms  8.33%  runtime.convTslice
      10ms  8.33% 83.33%       10ms  8.33%  runtime.getMCache
      10ms  8.33% 91.67%       10ms  8.33%  runtime.memclrNoHeapPointers
      10ms  8.33%   100%       10ms  8.33%  runtime.memmove
$ hyperfine "go run cmd/lj.go -libjson=false -s -nogc -pprof test/10MB.json"
Benchmark 1: go run cmd/lj.go -libjson=false -s -nogc -pprof test/10MB.json
  Time (mean ± σ):     163.9 ms ±   3.4 ms    [User: 130.0 ms, System: 88.7 ms]
  Range (min … max):   159.0 ms … 171.9 ms    17 runs
gc
$  go run cmd/lj.go -libjson=false -s -pprof test/10MB.json
$ go tool pprof 10MB.json.pprof
(pprof) top
Showing nodes accounting for 150ms, 83.33% of 180ms total
Showing top 10 nodes out of 55
      flat  flat%   sum%        cum   cum%
      20ms 11.11% 11.11%       40ms 22.22%  encoding/json.(*Decoder).readValue
      20ms 11.11% 22.22%       20ms 11.11%  internal/runtime/gc/scan.scanSpanPackedAVX512
      20ms 11.11% 33.33%       20ms 11.11%  runtime.memclrNoHeapPointers
      20ms 11.11% 44.44%       20ms 11.11%  runtime.memmove
      20ms 11.11% 55.56%       20ms 11.11%  runtime.suspendG
      10ms  5.56% 61.11%       30ms 16.67%  encoding/json.(*decodeState).scanWhile
      10ms  5.56% 66.67%       10ms  5.56%  encoding/json.(*scanner).pushParseState
      10ms  5.56% 72.22%       20ms 11.11%  encoding/json.stateBeginValue
      10ms  5.56% 77.78%       10ms  5.56%  encoding/json.stateEndValue
      10ms  5.56% 83.33%       10ms  5.56%  internal/runtime/maps.(*ctrlGroup).setEmpty
$ hyperfine "go run cmd/lj.go -libjson=false -s -pprof test/10MB.json"
Benchmark 1: go run cmd/lj.go -libjson=false -s -pprof test/10MB.json
  Time (mean ± σ):     160.5 ms ±   3.0 ms    [User: 186.3 ms, System: 79.4 ms]
  Range (min … max):   156.3 ms … 167.9 ms    18 runs

libjson

nogc
$ go run cmd/lj.go -nogc -s -pprof test/10MB.json
$ go tool pprof 10MB.json.pprof
(pprof) top
Showing nodes accounting for 60ms, 100% of 60ms total
Showing top 10 nodes out of 31
      flat  flat%   sum%        cum   cum%
      20ms 33.33% 33.33%       20ms 33.33%  runtime.memclrNoHeapPointers
      10ms 16.67% 50.00%       10ms 16.67%  github.com/xnacly/libjson.(*lexer).next
      10ms 16.67% 66.67%       10ms 16.67%  github.com/xnacly/libjson.pow10 (inline)
      10ms 16.67% 83.33%       10ms 16.67%  internal/runtime/maps.(*ctrlGroup).setEmpty (inline)
      10ms 16.67%   100%       10ms 16.67%  runtime.rand
         0     0%   100%       10ms 16.67%  github.com/xnacly/libjson.(*parser).advance
         0     0%   100%       50ms 83.33%  github.com/xnacly/libjson.(*parser).array
         0     0%   100%       20ms 33.33%  github.com/xnacly/libjson.(*parser).atom
         0     0%   100%       50ms 83.33%  github.com/xnacly/libjson.(*parser).expression
         0     0%   100%       50ms 83.33%  github.com/xnacly/libjson.(*parser).object
$ hyperfine "go run cmd/lj.go -nogc -s -pprof test/10MB.json"
Benchmark 1: go run cmd/lj.go -nogc -s -pprof test/10MB.json
  Time (mean ± σ):     106.6 ms ±   2.1 ms    [User: 84.9 ms, System: 76.6 ms]
  Range (min … max):   103.8 ms … 110.7 ms    28 runs
gc
$ go run cmd/lj.go -s -pprof test/10MB.json
$ go tool pprof 10MB.json.pprof
(pprof) top
Showing nodes accounting for 100ms, 100% of 100ms total
Showing top 10 nodes out of 39
      flat  flat%   sum%        cum   cum%
      10ms 10.00% 10.00%       10ms 10.00%  github.com/xnacly/libjson.pow10
      10ms 10.00% 20.00%       10ms 10.00%  internal/runtime/gc/scan.scanSpanPackedAVX512
      10ms 10.00% 30.00%       10ms 10.00%  internal/runtime/maps.(*ctrlGroup).setEmpty
      10ms 10.00% 40.00%       10ms 10.00%  runtime.acquirem (inline)
      10ms 10.00% 50.00%       10ms 10.00%  runtime.findObject
      10ms 10.00% 60.00%       10ms 10.00%  runtime.heapArenaOf
      10ms 10.00% 70.00%       30ms 30.00%  runtime.mallocgcSmallScanNoHeader
      10ms 10.00% 80.00%       10ms 10.00%  runtime.memclrNoHeapPointers
      10ms 10.00% 90.00%       10ms 10.00%  runtime.nextFreeFast (inline)
      10ms 10.00%   100%       10ms 10.00%  runtime.typePointers.next
$ hyperfine "go run cmd/lj.go -s -pprof test/10MB.json"
Benchmark 1: go run cmd/lj.go -s -pprof test/10MB.json
  Time (mean ± σ):     105.5 ms ±   2.5 ms    [User: 116.6 ms, System: 77.9 ms]
  Range (min … max):   101.5 ms … 110.0 ms    27 runs

Before this change, "\uD834\uDD1E" would result in "�DD1E", but should have resulted in "��", since both surrogates were left unmerged.
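For reference, fully correct handling merges a high/low surrogate pair into a single rune and maps anything unpaired to U+FFFD. A sketch using the standard library follows; the decodeU name and simplified hex handling are made up for illustration, not taken from the PR:

```go
package main

import (
	"fmt"
	"strconv"
	"unicode/utf16"
)

// decodeU decodes one \uXXXX escape (hex1), optionally followed by a
// second one (hex2), merging UTF-16 surrogate pairs into one rune.
// Unpaired or wrongly paired surrogates become U+FFFD.
func decodeU(hex1, hex2 string) rune {
	r1v, _ := strconv.ParseUint(hex1, 16, 32)
	r1 := rune(r1v)
	if !utf16.IsSurrogate(r1) {
		return r1
	}
	if hex2 != "" {
		r2v, _ := strconv.ParseUint(hex2, 16, 32)
		// DecodeRune returns U+FFFD itself when the pair is invalid
		if r := utf16.DecodeRune(r1, rune(r2v)); r != 0xFFFD {
			return r
		}
	}
	return 0xFFFD // replacement character for a lone surrogate
}

func main() {
	fmt.Printf("%c\n", decodeU("D834", "DD1E")) // the pair merges into U+1D11E
}
```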
Previously, parsing 100MB of JSON input (600ms total) spent 60ms in unnecessary bounds checks (CALL runtime.panicBounds(SB)); this is now reduced to 20ms by moving explicit bounds checks before indexing, reusing indexed slots, and merging manual out-of-loop increments.
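The hoisting referred to here is the usual Go bounds-check-elimination pattern: one explicit check that dominates the later accesses lets the compiler drop the implicit per-index checks. A toy illustration, not the PR's code:

```go
package main

import "fmt"

// readPair reads two bytes starting at i. The single explicit range
// check up front proves b[i] and b[i+1] are in bounds, so the
// compiler can elide the implicit checks (and their
// CALL runtime.panicBounds) on the two indexing expressions below.
func readPair(b []byte, i int) (byte, byte, bool) {
	if i < 0 || i+1 >= len(b) {
		return 0, 0, false
	}
	return b[i], b[i+1], true
}

func main() {
	hi, lo, ok := readPair([]byte{0xDE, 0xAD, 0xBE, 0xEF}, 2)
	fmt.Println(hi, lo, ok)
}
```

Whether a given check is actually elided can be verified with `go build -gcflags=-d=ssa/check_bce` on the real code.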
Reduced the time spent in unescapeInPlace by 30ms (from 5.75% to 3.41%).
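unescapeInPlace presumably follows the classic two-cursor scheme, where a write cursor trails the read cursor over the same backing array so no extra buffer is allocated. A reduced sketch covering only a few escapes (the real function has to handle the full JSON escape set, including \uXXXX):

```go
package main

import "fmt"

// unescapeInPlace rewrites simple escapes (\n, \t, \", \\) into the
// slice's own backing array and returns the shortened slice.
func unescapeInPlace(b []byte) []byte {
	w := 0 // write cursor, always <= read cursor r
	for r := 0; r < len(b); r++ {
		c := b[r]
		if c == '\\' && r+1 < len(b) {
			r++
			switch b[r] {
			case 'n':
				c = '\n'
			case 't':
				c = '\t'
			case '"':
				c = '"'
			case '\\':
				c = '\\'
			default:
				c = b[r] // simplified: pass unknown escapes through
			}
		}
		b[w] = c
		w++
	}
	return b[:w]
}

func main() {
	fmt.Printf("%q\n", unescapeInPlace([]byte(`a\nb\"c`)))
}
```

This also motivates the t_string_escapes token flag mentioned in the goals: strings without a backslash can skip this pass entirely.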
This commit removes the need to hash JSON object keys at parse time by replacing the previously used map[string]any with the new obj struct:

    | Benchmark | LibJson B/op | EncodingJson B/op | LibJson x Less Memory | LibJson Allocs | EncodingJson Allocs | LibJson x Fewer Allocs |
    | --------- | ------------ | ----------------- | --------------------- | -------------- | ------------------- | ---------------------- |
    | Naive     | 29,632,671   | 42,744,497        | 1.44x                 | 450,023        | 1,050,031           | 2.33x                  |
    | Escaped   | 22,471,438   | 37,544,412        | 1.67x                 | 350,023        | 1,100,030           | 3.14x                  |
    | Hard      | 121,444,318  | 173,944,500       | 1.43x                 | 1,400,023      | 3,000,032           | 2.14x                  |
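The commit message doesn't show the struct itself, but the idea can be sketched as parse-ordered parallel slices with no hashing on insert. The field names and the linear-scan Get below are guesses for illustration; the real obj may well build a map lazily on first lookup:

```go
package main

import "fmt"

// obj stores keys and values in parse order; appending never hashes.
type obj struct {
	keys   []string
	values []any
}

// Get looks a key up with a linear scan, paying the comparison cost
// only when (and if) the object is actually queried.
func (o *obj) Get(key string) (any, bool) {
	for i, k := range o.keys {
		if k == key {
			return o.values[i], true
		}
	}
	return nil, false
}

func main() {
	o := &obj{}
	o.keys = append(o.keys, "id")
	o.values = append(o.values, 12345)
	v, ok := o.Get("id")
	fmt.Println(v, ok)
}
```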

These changes result in a ~10-15% speedup and allow libjson to hit the ~2x-faster-than-encoding/json milestone, for instance with 1MB, 5MB, 10MB and 100MB files filled with:

    {
        "id": 12345,
        "name": "very_long_string_with_escapes_and_unicode_abcdefghijklmnopqrstuvwxyz_0123456789",
        "description": "This string contains\nmultiple\nlines\nand \"quotes\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"\"",
        "nested": {
            "level1": {
                "level2": {
                    "level3": {
                        "level4": {
                            "array": [
                                "short",
                                "string_with_escape\\n",
                                "another\\tvalue",
                                "unicode\u2603",
                                "escaped_quote_\"_and_backslash_\\",
                                11234567890,1234567890,1234567890,1234567890,1234567890,1234567890,1234567890,1234567890,1234567890,1234567890,1234567890,1234567890,234567890,
                                -1.2345e67,
                                3.1415926535897932384626433832795028841971,
                                true,
                                false,
                                null,
                                "\u0041\u0042\u0043\u00A9\u20AC\u0041\u0042\u0043\u00A9\u20AC\u0041\u0042\u0043\u00A9\u20AC\u0041\u0042\u0043\u00A9\u20AC\u0041\u0042\u0043\u00A9\u20AC\u0041\u0042\u0043\u00A9\u20AC\u0041\u0042\u0043\u00A9\u20AC\u0041\u0042\u0043\u00A9\u20AC\u0041\u0042\u0043\u00A9\u20AC\u0041\u0042\u0043\u00A9\u20AC\u0041\u0042\u0043\u00A9\u20AC",
                                "mix\\n\\t\\r\\\\\\\"end"
                            ]
                        }
                    }
                }
            }
        }
    }

libjson now outperforms encoding/json:

    $ cd ./benchmarks
    $ ./bench.sh | rg "faster"
    1.72 ± 0.15 times faster than ./test -s -libjson=false ./1MB.json
    1.89 ± 0.11 times faster than ./test -s -libjson=false ./5MB.json
    1.90 ± 0.06 times faster than ./test -s -libjson=false ./10MB.json
    1.95 ± 0.05 times faster than ./test -s -libjson=false ./100MB.json