
Add support for 'orjson' and introduce a 'ResponseDecoder' API for deserialization #1385

Open
sirosen wants to merge 2 commits into globus:main from sirosen:orjson-encoder-and-decoder


sirosen commented May 5, 2026

Goal

Why?

Provide support for orjson as an alternative way of encoding and decoding HTTP body data.
This should be opt-in because it could be breaking for some users, but in a future major version it should become the default.

It should be possible to set this passively to change "the whole SDK" via an env var, but for application authors, there should be explicit True/False programmatic control somewhere.

Extensibility (e.g., to other JSON implementations or other formats) is a non-goal, but the design should not be hostile to future needs.

What Changed

Add a new OrjsonRequestEncoder, which mirrors the JSONRequestEncoder type.

Add a new decoders module, containing ResponseDecoder and OrjsonResponseDecoder.
Both provide a method, get_body_json(response) for decoding a response body as JSON.
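
For illustration, a minimal sketch of how two such decoders could look. This is not the code in this PR: the class names and get_body_json come from the description above, while reading `response.content`, raising RuntimeError from the constructor, and the `FakeResponse` stand-in are assumptions made for the example.

```python
from __future__ import annotations

import json
import typing as t


class ResponseDecoder:
    """Decode response bodies with the stdlib json module."""

    def get_body_json(self, response: t.Any) -> t.Any:
        # assumption: the response exposes its raw bytes as `.content`,
        # as requests.Response does
        return json.loads(response.content)


class OrjsonResponseDecoder(ResponseDecoder):
    """Decode response bodies with orjson, failing early if it is missing."""

    def __init__(self) -> None:
        try:
            import orjson
        except ImportError as err:
            # assumption: the error type used here is illustrative
            raise RuntimeError("orjson is not installed") from err
        self._loads = orjson.loads

    def get_body_json(self, response: t.Any) -> t.Any:
        return self._loads(response.content)


class FakeResponse:
    """Hypothetical stand-in for requests.Response, used only in this sketch."""

    def __init__(self, content: bytes) -> None:
        self.content = content
```

Both decoders parse straight from bytes, so a caller can swap one for the other without changing how the response is handled.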

The transport selects which encoder and decoder to use based on an init-time flag, use_orjson=True, or an env var, GLOBUS_SDK_USE_ORJSON=1.
Setting either of these when orjson is not installed raises a RuntimeError on transport init.
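
A rough sketch of the selection logic described above, under stated assumptions. `resolve_use_orjson` is a hypothetical helper name, and both the precedence (explicit flag over env var) and the rule that only the literal value "1" enables the setting are assumptions; the PR text only states that both controls exist and that a missing orjson raises RuntimeError.

```python
from __future__ import annotations

import importlib.util
import os
import typing as t


def resolve_use_orjson(
    use_orjson: bool | None = None,
    environ: t.Mapping[str, str] | None = None,
) -> bool:
    """Decide whether orjson should be used (hypothetical helper)."""
    env = os.environ if environ is None else environ
    if use_orjson is None:
        # assumption: the env var is consulted only when no explicit flag is given
        use_orjson = env.get("GLOBUS_SDK_USE_ORJSON") == "1"
    if use_orjson and importlib.util.find_spec("orjson") is None:
        # fail early, at init time, rather than on the first request
        raise RuntimeError("use of orjson was requested, but orjson is not installed")
    return use_orjson
```

Checking availability with `importlib.util.find_spec` avoids importing orjson just to decide whether it can be used.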

Decoding happens in several kinds of objects: responses, errors, and retry hooks.
Each of these must be customized in some way so that it decodes with the configured decoder.
To support this, the "currently active" transport is now available through a contextvar and a narrow getter interface, RequestsTransport.get_current_transport().
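
A simplified sketch of the contextvar pattern, assuming a stand-in `Transport` class in place of RequestsTransport. The variable name `_CURRENT_TRANSPORT` and the `_as_current` helper are hypothetical; only get_current_transport() and its LookupError behavior are taken from this PR.

```python
from __future__ import annotations

import contextlib
import contextvars
import typing as t

# hypothetical module-level variable; no default, so .get() can raise LookupError
_CURRENT_TRANSPORT: contextvars.ContextVar[Transport] = contextvars.ContextVar(
    "current_transport"
)


class Transport:
    """Simplified stand-in for RequestsTransport."""

    @contextlib.contextmanager
    def _as_current(self) -> t.Iterator[None]:
        # private helper: mark this transport as active for the duration of a block
        token = _CURRENT_TRANSPORT.set(self)
        try:
            yield
        finally:
            _CURRENT_TRANSPORT.reset(token)

    @staticmethod
    def get_current_transport() -> Transport:
        # ContextVar.get() raises LookupError when no value has been set
        return _CURRENT_TRANSPORT.get()

    def request(self) -> Transport:
        with self._as_current():
            # response parsing, error construction, and retry hooks running here
            # can look up the active transport without it being passed in
            return Transport.get_current_transport()
```

Resetting via the token restores whatever value was set before, so nested or concurrent use in separate contexts does not leak the "current" transport.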

Tests are extended to cover orjson testing, including several new tests, an orjson dependency group, new frozen requirements, tox config, and CI config.
Because responses read the client's decoder, a large number of test tweaks are needed to provide a "better" client mock which satisfies this requirement.

In CI and the default tox env list, we only test orjson on a small selection of Python versions, to moderate the expansion of our test matrix.

Tradeoffs & Open Questions

There aren't perfect solutions in the current state of the SDK when accounting for backwards compatibility.

requests.Response in public APIs

Anywhere that the SDK is passing a requests.Response through a public interface [1], we have lost control of that interface.
We would now like to be passing a pair (response, decoder), but changing these interfaces would be breaking.
Making changes in a future major version to always pass around our own response objects is one option to solve this.

All three of the contexts in which the decoder is passed (retry, response, api errors) are tricky.
Only one of those three (retry hooks) operates at the abstraction layer of the transport, and even that has open questions.

In all of these cases, the "get current transport" contextvar is used to sidestep our inability to fully control these APIs as we would like.
(A previous draft added the decoder to RetryContext directly, so retry hooks have a cleaner solution available than the other two contexts.)

RequestsTransport.encoders

Currently, encoding behaviors are defined at the class level, via a mapping RequestsTransport.encoders.
Users can access, customize, and replace the JSON encoder today by reading or writing RequestsTransport.encoders["json"].

Unfortunately, a class var acts as both the class-level and instance-level control, since all instances reference the same dict.
Any instance-level customization will be paradigm-breaking here, since class-level customization will stop being effective.

In reality, SDK users are probably not touching this at all.
GitHub code searches in-org show no results, and our own usages are some of the most elaborate.
Even my own work where I have enabled orjson did not use this path -- not based on foresight, but because it was simpler to encode things in custom client methods.

I have chosen a moderate path, in which we

  • deprecate RequestsTransport.encoders (the docs now tell users not to modify or use it)
  • copy it at runtime into encoder_map, an instance attribute
  • modify encoder_map during init when orjson use is enabled
    • this modification explicitly checks whether the user replaced the json encoder (as someone might do today)

This is actually mildly breaking because transports will no longer transparently pick up on modifications to the class var, even with orjson disabled.
I think that tradeoff is worth it, but we could modify this to be even more considerate.
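
The copy-on-init approach can be sketched roughly as follows. The classes are simplified stand-ins rather than the SDK types, and leaving a user-supplied JSON encoder untouched is an assumption about what the "did the user replace it" check does.

```python
from __future__ import annotations

import typing as t


class JSONEncoder:
    """Stand-in for JSONRequestEncoder."""


class OrjsonEncoder(JSONEncoder):
    """Stand-in for OrjsonRequestEncoder."""


class Transport:
    # deprecated class-level mapping, shared by every instance
    encoders: dict[str, t.Any] = {"json": JSONEncoder()}

    def __init__(self, use_orjson: bool = False) -> None:
        # shallow copy: later edits to the class dict no longer reach this instance
        self.encoder_map = dict(self.encoders)
        if use_orjson:
            current = self.encoder_map.get("json")
            # assumption: a user-supplied replacement is left alone
            if type(current) is JSONEncoder:
                self.encoder_map["json"] = OrjsonEncoder()
```

The mildly breaking part is visible here: once an instance exists, writing to `Transport.encoders` has no effect on that instance's `encoder_map`.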

requests import deferral and import paths

Because API errors now need access to globus_sdk.transport, these import paths are more heavily implicated in other parts of the SDK.
This breaks the tests that enforce deferred imports of requests, which is deferred because it is very slow to import.
To rectify this, a number of requests imports, specifically, are now guarded by TYPE_CHECKING and deferred.

In general, the whole SDK is trying to lazily load requests, so this is all directionally aligned.
It does, however, highlight the new dependency (globus_sdk.exc -> globus_sdk.transport).
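
As a generic illustration of the deferral pattern (not a copy of the SDK's modules), a module can reference requests types without importing requests at load time. The function names `describe` and `send` are hypothetical.

```python
from __future__ import annotations

import typing as t

if t.TYPE_CHECKING:
    # seen by type checkers only; never executed at runtime
    import requests


def describe(response: requests.Response) -> str:
    # annotations are stored as strings because of the __future__ import,
    # so `requests` does not need to be imported for this to be defined
    return f"HTTP {response.status_code}"


def send(url: str) -> requests.Response:
    # the import cost is paid on first use rather than at module import
    import requests

    return requests.get(url, timeout=10)
```

Importing this module never triggers the requests import; only calling `send` does.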

Our imports are getting a little bit tangled -- lots of parts of the SDK pull in exc for GlobusSDKUsageError, but exc.api needs transport.
We should think about rearranging to clean this up. Many of the deferred import arrangements could be seen as punting on this issue.

Diff Info

GitHub's diff looks massive because it's counting all those lines of lockfile/frozen-deps content. True diff is much more modest, but still nontrivial:

$ git diff main --shortstat -- src tests
 26 files changed, 599 insertions(+), 102 deletions(-)

Changelog

Added

  • The SDK now supports use of orjson as an alternative JSON encoder and decoder.
    When GLOBUS_SDK_USE_ORJSON=1 is set, request sending and response decoding will use orjson.

    • Use of orjson is optional, but if the variable is set and orjson is not installed, a RuntimeError is raised when the transport is initialized.

    • The setting can also be configured on transport objects with the init option, use_orjson=True.

    • In a future major version of the SDK, use of orjson will default to true when it is available.

  • RequestsTransport objects are now visible via RequestsTransport.get_current_transport(), a staticmethod, while the transport is sending a request or being used to handle a response.
    This method raises a LookupError if there is no currently active transport.

Deprecated

  • The RequestsTransport class supports configuration of request encoding via a class-variable mapping, encoders. This limits the ability of the SDK to apply per-object customizations, as in the case of orjson support. The class variable encoders is deprecated, and users should use the new encoder_map instance variable instead.

Footnotes

  1. Some of these interfaces were originally conceived as public and some were not, which adds further complexity.

sirosen force-pushed the orjson-encoder-and-decoder branch 5 times, most recently from cfb2376 to 66a8dde, May 7, 2026 17:29
sirosen added 2 commits May 7, 2026 12:34
In order to support `orjson` for serialization, add a new
`OrjsonRequestEncoder`, which mirrors the `JSONRequestEncoder` type.

The transport can select which encoder to use based on an init-time flag.
This setting highlights the awkwardness of the class-level (read: global)
mapping of strings to encoders. Therefore, it is now copied into an
instance attribute, which is then post-hoc modified if `use_orjson=True`.

On the decoding side, the story is more complex, as the objects doing
decoding are various: responses, errors, and retry hooks. Only one of
those three (retry hooks) operates at the abstraction layer of the
transport. Therefore, the changes to support a defined decoder type, and
provide it in these contexts, are as follows:

- RetryContext objects now include the response decoder

- response objects use their client object's decoder (specifically,
  `self.client.transport.decoder`)

- API errors have their own decoder setting, which can be _injected_ via
  a contextvar API, but which is not publicly controllable via init --
  this avoids compatibility issues around changing the init signature of
  our error types

Selection of orjson is provided via an init arg to the transport and via
an env var, `GLOBUS_SDK_USE_ORJSON=1`. If the setting is enabled but
`orjson` is not installed, instantiating encoders and decoders (of the
`Orjson*` types) will emit errors, to provide an early-error experience.

Because API errors now need access to `globus_sdk.transport`, these
import paths are more heavily implicated in other parts of the SDK.
This breaks tests which help enforce deferred imports of `requests`, due
to its very slow import time. To rectify, a number of `requests` imports,
specifically, are now `TYPE_CHECKING` flagged and deferred.

Tests are extended to cover `orjson` testing, including several new
tests, an `orjson` dependency group, new frozen requirements, tox config,
and CI config. Because responses read the client's decoder, a large
number of test tweaks are needed to provide a "better" client mock which
satisfies this requirement.

In CI and the default tox env list, we only test `orjson` on a small
selection of Python versions, to moderate the expansion of our test
matrix.

1. Add a contextvar to the transport module, which tracks the "current
   or active transport"
2. Add a private helper for setting that value
3. Add a *public* method which gets that value; LookupError if none
4. Add a private method which gets the current decoder out of the current
   transport, with a "safe" failover -- this is purely an internal
   convenience

This replaces GlobusAPIError decoder injection, responses picking the
decoder off of their client object, and retry contexts carrying the
decoder explicitly.
sirosen force-pushed the orjson-encoder-and-decoder branch from 66a8dde to 9a0c311, May 7, 2026 17:34
sirosen marked this pull request as ready for review May 7, 2026 17:35