Summary
JSON.pretty_generate currently preserves the insertion order of Ruby hashes. As a result, the output depends on the order in which a hash was built, so two equal hashes can serialize differently. This makes diffs noisy and reproducibility harder. Add an option to sort hash keys during generation to produce stable, predictable JSON output.
Motivation / Problem
- In many workflows (config generation, test fixtures, CI artifacts), stable serialization is critical.
- Hash insertion order may differ across code paths, Ruby versions, or data sources, causing semantically identical objects to produce different JSON.
- This complicates:
  - Git diffs and code reviews
  - Caching and content hashing
  - Snapshot testing
  - Reproducible builds
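The problem can be reproduced with the current json library. The sketch below builds the same configuration in two different orders; the `fingerprint` helper and the variable names are made up for illustration and are not part of any library.

```ruby
require "json"
require "digest"

# Hypothetical helper: content hash of an object's JSON serialization.
def fingerprint(obj)
  Digest::SHA256.hexdigest(JSON.generate(obj))
end

# The same data, assembled in a different order by two code paths.
from_defaults = { "host" => "localhost", "port" => 8080 }
from_env      = { "port" => 8080, "host" => "localhost" }

puts from_defaults == from_env                               # true: Hash equality ignores order
puts JSON.generate(from_defaults) == JSON.generate(from_env) # false: key order differs
puts fingerprint(from_defaults) == fingerprint(from_env)     # false: so content hashes differ
```

Ruby treats the two hashes as equal, yet their serialized forms and any cache key or snapshot derived from them do not match.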
Proposed Solution
Introduce an option to JSON.pretty_generate (and possibly JSON.generate) to sort object keys lexicographically.
API Options (one of):
- Keyword argument:
  JSON.pretty_generate(obj, sort_keys: true)
- Extend JSON::State:
  state = JSON::State.new(sort_keys: true)
  JSON.pretty_generate(obj, state)
Behavior
- When sort_keys: true, all hashes are serialized with keys sorted (string comparison).
- Default remains false to preserve backward compatibility and performance characteristics.
Example
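require "json"

obj = { b: 1, a: 2 }

# Current behavior: insertion order is preserved.
JSON.pretty_generate(obj)
# => {
#      "b": 1,
#      "a": 2
#    }

# Proposed behavior with the new option:
JSON.pretty_generate(obj, sort_keys: true)
# => {
#      "a": 2,
#      "b": 1
#    }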
Alternatives Considered
- Pre-sorting hashes before serialization
- Relying on insertion order discipline
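The pre-sorting workaround can be done today with a small recursive helper. The following is a minimal sketch, not the proposed implementation: `deep_sort_keys` is a hypothetical name, and ordering keys by their `to_s` form is an assumption about what "string comparison" should mean.

```ruby
require "json"

# Hypothetical helper: returns a copy of the structure in which every hash,
# at any depth, has its keys ordered by their string form.
def deep_sort_keys(value)
  case value
  when Hash
    value.sort_by { |key, _| key.to_s }
         .map { |key, child| [key, deep_sort_keys(child)] }
         .to_h
  when Array
    # Array order is meaningful in JSON, so only the elements are visited.
    value.map { |child| deep_sort_keys(child) }
  else
    value
  end
end

puts JSON.pretty_generate(deep_sort_keys({ b: 1, a: 2 }))
# {
#   "a": 2,
#   "b": 1
# }
```

Every call site has to remember to apply the helper, and it copies the whole structure before serialization, which is the main argument for a built-in option.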
Impact
- Improves determinism and reproducibility across tooling and environments
- Reduces diff noise and improves developer experience
- Aligns with behavior available in other ecosystems (e.g., Python’s json.dumps(sort_keys=True))
Performance Considerations
- Sorting adds per-object overhead that grows with the key count (O(n log n) for an object with n keys, assuming a comparison sort)
- Acceptable when opt-in; no impact on default behavior
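A rough way to gauge the overhead is to time generation with and without a Ruby-level pre-sort. This is only a sketch: `sorted_copy`, `sample_hash`, and `compare_generation` are hypothetical helpers, the sizes are arbitrary, and a sort implemented inside the generator would likely cost less than this Ruby-level copy.

```ruby
require "json"
require "benchmark"

# Hypothetical stand-in for the proposed option: sorts one level of keys.
def sorted_copy(hash)
  hash.sort_by { |key, _| key.to_s }.to_h
end

# Builds a flat hash whose keys are inserted in a shuffled (but seeded) order.
def sample_hash(size)
  (1..size).map { |i| ["key#{i}", i] }.shuffle(random: Random.new(42)).to_h
end

# Times plain generation against generation preceded by a key sort.
def compare_generation(size: 1_000, iterations: 200)
  data = sample_hash(size)
  unsorted = Benchmark.realtime { iterations.times { JSON.generate(data) } }
  sorted   = Benchmark.realtime { iterations.times { JSON.generate(sorted_copy(data)) } }
  { unsorted: unsorted, sorted: sorted }
end

result = compare_generation
puts format("unsorted: %.4fs, sorted: %.4fs", result[:unsorted], result[:sorted])
```

The absolute numbers depend on the machine and Ruby version; the ratio between the two timings is the more useful figure.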
Backward Compatibility
- Fully backward compatible if default remains unsorted
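Test Plan
- Unit tests verifying the behavior described above
- Benchmark comparison with and without sorting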
Open Questions
- Should sorting be strictly lexicographic on stringified keys?
- Should there be a global default toggle via JSON::State configuration?
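The first question matters because plain string comparison in Ruby has a few properties that may surprise users. The sketch below shows them with core methods only; `stringified_order` is a hypothetical helper illustrating one possible rule, not a decision.

```ruby
# Hypothetical rule: order keys by their string form.
def stringified_order(keys)
  keys.sort_by(&:to_s)
end

# String comparison is byte-wise: digits sort before uppercase letters,
# uppercase before lowercase, and "10" before "9".
p ["b", "a", "B", "10", "9"].sort   # => ["10", "9", "B", "a", "b"]

# Hashes can mix key types, which a plain sort cannot compare.
mixed = [:b, "a", 10]
begin
  mixed.sort
rescue ArgumentError
  puts "plain sort cannot compare mixed key types"
end

# Sorting on the stringified form sidesteps that.
p stringified_order(mixed)          # => [10, "a", :b]
```

Sorting on stringified keys always yields a total order, at the price of ordering numeric keys as text.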
Additional Context
This feature would support reproducible outputs in CI pipelines and long-lived systems where deterministic artifacts are a requirement.