Large hash with mixed symbol and string key causing segfaults with to_json #929

@Jiig

In Ruby 3.4.6 and 4.0.0 we are seeing strange exceptions, and even segfaults, when generating JSON strings from large hashes.

Hash example:

test_data = {
  "source" => "source",
  "startTime" => Time.now.utc.iso8601,
  "endTime" => Time.now.utc.iso8601,
  "createTime" => Time.now.utc.iso8601,
  "dataType" => "foo",
  "uuid" => SecureRandom.uuid,
  "flag" => true,
  "data" => [[]...] # Large array of arrays, sub arrays contain date string and lots of floats
}

I've traced the trigger in our code to a later assignment that adds a key to the hash as a symbol instead of a string:

test_data[:isTest] = true

I've fixed this in our code by using a string key (as we should have been doing all along).
However, the segfaults are concerning, and the exceptions themselves don't make much sense to me.
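
For anyone hitting the same thing, here is a minimal sketch of the string-key fix, applied just before serialization. `normalize_keys` is a hypothetical helper name, not part of the json gem, and it only touches top-level keys. It relies on `Hash#transform_keys`, where the later entry wins if two keys stringify to the same name.

```ruby
require 'json'

# Hypothetical helper (not part of the json gem): returns a copy of the
# hash whose top-level keys are all Strings. If a symbol key and a string
# key stringify to the same name, the later entry wins.
def normalize_keys(hash)
  hash.transform_keys(&:to_s)
end

payload = { "flag" => true, "source" => "source" }
payload[:isTest] = true # symbol key added later, as in our code

normalized = normalize_keys(payload)
puts normalized.keys.inspect
puts JSON.generate(normalized)
```

This copies the hash, so for very large payloads it may be preferable to fix the key at the assignment site instead.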

When I call to_json on the hash I get one of two exceptions:

'JSON::Ext::Generator::GeneratorMethods::Hash#to_json': no implicit conversion of Integer into String (TypeError)

Or sometimes it's a Float:

'JSON::Ext::Generator::GeneratorMethods::Hash#to_json': no implicit conversion of Float into String (TypeError)

And then sometimes it segfaults and crashes ruby (see link for crash dump):

https://gist.github.com/Jiig/b89f887c9c601345afeade76dfffd2a7

I'm not sure whether it's the layout of these hashes in tandem with the mixed keys, or whether using mixed keys at all in large hashes is causing the issue.

And if the answer is "don't use mixed keys", I understand; I just wanted to raise the fact that it's crashing Ruby.

Ruby code to reproduce:

require 'json'
require 'securerandom'
require 'time' # Time#iso8601; only needed on Ruby versions older than 3.4

test_data = {
  "source" => "source",
  "startTime" => Time.now.utc.iso8601,
  "endTime" => Time.now.utc.iso8601,
  "createTime" => Time.now.utc.iso8601,
  "dataType" => "foo",
  "uuid" => SecureRandom.uuid,
  "flag" => true,
  "data" => []
}

# Changing the amount of data points changes the behavior
# small amounts like 100 have no issue, but once you start getting over 300 or so it starts having issues (didn't find an exact number)
(1..10000).each do |n|
  test_data["data"] << [
    Time.now.utc.iso8601,
    Random.rand(-100000.0..100000.0),
    Random.rand(-100000.0..100000.0),
    Random.rand(-100000.0..100000.0),
    Random.rand(-100000.0..100000.0),
    Random.rand(-100000.0..100000.0),
    Random.rand(-100000.0..100000.0),
    1.0,
    0.0,
    0.0,
    1.0,
    0.0,
    1.0
  ]
end
# Convert the random data to JSON, doesn't have any issues as expected
test_data.to_json

# Then if we set the symbol key:
test_data[:flag] = false

# The to_json method starts raising exceptions or segfaults
# Worth noting that the JSON.pretty_generate method does not have the same issue
# JSON.pretty_generate(test_data) # If you replace the to_json below with this it doesn't have any issues
test_data.to_json
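
Note that the repro ends up with both "flag" and :flag in the same hash, and both serialize to the same JSON member name. Whether that collision is related to the crash is not established here, but a quick check like the sketch below can flag mixed or colliding keys before calling to_json. `key_report` is a hypothetical helper name, not part of the json gem.

```ruby
# Hypothetical diagnostic helper: reports which key classes a hash uses and
# which keys would collide once converted to JSON member names (strings).
def key_report(hash)
  collisions = hash.keys.group_by(&:to_s).select { |_, keys| keys.size > 1 }
  { classes: hash.keys.map(&:class).uniq, collisions: collisions }
end

sample = { "flag" => true, "uuid" => "abc" }
sample[:flag] = false # same name as the existing string key

report = key_report(sample)
puts report[:classes].inspect
puts report[:collisions].inspect
```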
