
Bug(ChainRules): gradient/pullback return NoTangent instead of zeros when f does not depend on x #1016

@bdrhill

Description

AutoChainRules (e.g. AutoChainRules(ZygoteRuleConfig())) returns ChainRulesCore.NoTangent() as the gradient/pullback result when the underlying rrule_via_ad reports a NoTangent cotangent — most commonly when f does not depend on x. DI does not convert this sentinel into a numeric zero before returning, so callers see a NoTangent instead of a value with the same shape and type as x.

This is the same class of issue as #1011 (AutoZygote returning nothing), but in a different code path (ext/DifferentiationInterfaceChainRulesCoreExt/reverse_onearg.jl).

Affected operators:

| Operator | Behavior |
| --- | --- |
| gradient | returns NoTangent() |
| value_and_gradient | tangent slot is NoTangent() |
| pullback | returns NoTangent() |
| jacobian | MethodError from arroftup_to_tupofarr(::Tuple{NoTangent}, ::Float64) |
| derivative (scalar→scalar) | MethodError from arroftup_to_tupofarr |
| pushforward (scalar→scalar) | MethodError from arroftup_to_tupofarr |

The same pattern affects vector-output constant functions (pullback returns NoTangent, jacobian errors).

MWE

using DifferentiationInterface
using ChainRules
using Zygote: ZygoteRuleConfig

backend = AutoChainRules(ZygoteRuleConfig())
fc(x) = 42.0
x = [1.0, 2.0]

gradient(fc, backend, x)
# ChainRulesCore.NoTangent()
# expected: [0.0, 0.0]

value_and_gradient(fc, backend, x)
# (42.0, ChainRulesCore.NoTangent())
# expected: (42.0, [0.0, 0.0])

pullback(fc, backend, x, (1.0,))[1]
# ChainRulesCore.NoTangent()
# expected: [0.0, 0.0]

jacobian(fc, backend, x)
# MethodError (see stacktrace)

derivative(t -> 42.0, backend, 1.5)
# MethodError (see stacktrace)

Stacktrace

julia> derivative(t -> 42.0, backend, 1.5)
ERROR: MethodError: no method matching arroftup_to_tupofarr(::Tuple{ChainRulesCore.NoTangent}, ::Float64)
The function `arroftup_to_tupofarr` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  arroftup_to_tupofarr(!Matched::NTuple{B, var"#s56"} where var"#s56"<:Number, ::Number) where B
   @ DifferentiationInterface ~/dev/DifferentiationInterface.jl/DifferentiationInterface/src/utils/linalg.jl:44
  arroftup_to_tupofarr(!Matched::AbstractArray{<:NTuple{B, var"#s3"} where var"#s3"<:Number}, !Matched::GPUArraysCore.AbstractGPUArray{<:Number}) where B
   @ DifferentiationInterfaceGPUArraysCoreExt ~/dev/DifferentiationInterface.jl/DifferentiationInterface/ext/DifferentiationInterfaceGPUArraysCoreExt/DifferentiationInterfaceGPUArraysCoreExt.jl:21
  arroftup_to_tupofarr(!Matched::AbstractArray{<:NTuple{B, var"#s6"} where var"#s6"<:Number}, !Matched::AbstractArray{<:Number}) where B
   @ DifferentiationInterface ~/dev/DifferentiationInterface.jl/DifferentiationInterface/src/utils/linalg.jl:46

Native ChainRules Behavior

Native rrule_via_ad does report NoTangent, which ChainRulesCore treats as a structural zero (NoTangent is a subtype of AbstractZero):

julia> using ChainRules; using Zygote
julia> rc = Zygote.ZygoteRuleConfig();
julia> y, pb = ChainRules.rrule_via_ad(rc, x -> 42.0, [1.0, 2.0]);
julia> pb(1.0)
(NoTangent(), NoTangent())

So at the boundary of the AD library this is expected. The DI extension is the layer that should normalize NoTangent into a numeric zero (with the shape and element type of x), the same way #1011 proposes for nothing in the Zygote extension.

Expected Behavior

For f whose gradient is zero (e.g. f(x) = 42.0):

  • gradient(f, backend, x) returns zero(x) (here [0.0, 0.0])
  • pullback(f, backend, x, (1.0,))[1] returns zero(x)
  • jacobian(f, backend, x) returns a zero matrix of shape (length(y), length(x))
  • derivative and pushforward for scalar input return zero(y)

Pattern: in unthunk(pb(dy)[2]), convert NoTangent to zero(x) (or the matching shape/type) before returning.
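
The pattern above can be sketched with dispatch on ChainRulesCore's AbstractZero supertype. This is a minimal illustration, not the actual fix: normalize_tangent and constant_pb are hypothetical names that do not exist in DifferentiationInterface, and constant_pb only stands in for the pullback that rrule_via_ad returns for a constant function (it mirrors the (NoTangent(), NoTangent()) output shown under "Native ChainRules Behavior").

```julia
using ChainRulesCore: AbstractZero, NoTangent, unthunk

# Hypothetical helper (assumption, not an existing DI function):
# ordinary tangents pass through unchanged, while structural zeros
# such as NoTangent() are replaced by a numeric zero shaped like x.
normalize_tangent(dx, x) = dx
normalize_tangent(::AbstractZero, x) = zero(x)

# Hypothetical stand-in for the pullback of a constant function:
# it ignores the cotangent dy and reports NoTangent for every argument.
constant_pb(dy) = (NoTangent(), NoTangent())

x = [1.0, 2.0]

# Same access pattern as in the issue: unthunk(pb(dy)[2]), then normalize.
dx = normalize_tangent(unthunk(constant_pb(1.0)[2]), x)
```

Here dx is a Vector{Float64} with the same length as x. The sketch relies on zero(x) being defined for the input type, so inputs without a zero method (for example arbitrary structs) would need a different fallback.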

Cross-Backend Comparison

The same functions on other backends return zeros without issue:

gradient(x -> 42.0, AutoForwardDiff(), [1.0, 2.0])    # [0.0, 0.0]
gradient(x -> 42.0, AutoFiniteDiff(), [1.0, 2.0])     # ~ [0.0, 0.0]

For AutoChainRules, other candidates are piecewise-constant functions such as f(x) = sign(x[1]), in regions where Zygote returns nothing. However, sum(sign.(x)) for non-zero x already returns an array of zeros, so the user-visible failure mode is mostly constant-output functions.

Backend

  • Backend: AutoChainRules(Zygote.ZygoteRuleConfig())
  • Native API returns NoTangent: Yes (a structural zero, subtype of AbstractZero in ChainRulesCore)
  • Works with other backends: Yes (ForwardDiff/FiniteDiff/Enzyme return zeros)

Environment

  • Julia 1.12.5
  • DifferentiationInterface v0.7.18 (this branch)
  • ChainRules v1.x, Zygote v0.7.x
  • ADTypes v1.22.0

Full environment

julia> Pkg.status()
  [47edcb42] ADTypes v1.22.0
  [082447d4] ChainRules v1.x
  [a0c0ee7d] DifferentiationInterface v0.7.18 `../../..`
  [a82114a7] DifferentiationInterfaceTest v0.11.0 `../../../../DifferentiationInterfaceTest`
  [e88e6eb3] Zygote v0.7.x

julia> versioninfo()
Julia Version 1.12.5
Commit 5fe89b8ddc1 (2026-02-09 16:05 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz

🤖 I am a robot. This is an experiment in agentic bug-catching under the supervision of @adrhill and @gdalle (#1008). Contents may be hallucinated.
