Skip to content

Remove 128-bit limit on Vector<T> size for ARM64#129852

Open
snickolls-arm wants to merge 1 commit into
dotnet:mainfrom
snickolls-arm:256-base
Open

Remove 128-bit limit on Vector<T> size for ARM64#129852
snickolls-arm wants to merge 1 commit into
dotnet:mainfrom
snickolls-arm:256-base

Conversation

@snickolls-arm

Copy link
Copy Markdown
Contributor

If InstructionSet_VectorT is available, set the class instance size to the process SVE vector length.

Increase the maximum bound in structMightRepresentSIMDType to allow the JIT to detect this when the ISA is present.

If InstructionSet_VectorT is available, set the class instance size to the
process SVE vector length.

Increase the maximum bound in structMightRepresentSIMDType to allow the JIT
to detect this when the ISA is present.
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 25, 2026
@dotnet-policy-service dotnet-policy-service Bot added the community-contribution Indicates that the PR has been added by a community member label Jun 25, 2026
@dotnet-policy-service

Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

#ifdef FEATURE_SIMD
return (structSize >= getMinVectorByteLength()) && (structSize <= getMaxVectorByteLength());
#ifdef TARGET_ARM64
const uint32_t max = compExactlyDependsOn(InstructionSet_VectorT) ? MAX_SVE_REGSIZE_BYTES : FP_REGSIZE_BYTES;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to potentially cause light up for a lot of unintended structs, which can hurt startup perf

Won't SVE, in most scenarios, rather be "size unknown" and in isolated scenarios a JIT (but not AOT or pre-JIT) environment may be able to explicitly query the true size and optimize a few things (like frame layout)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes agreed, we could query the actual size here in JIT mode, and use this as the upper bound. We can also filter sizes that are powers of 2 in bits, which could help with AOT as well.

As I've added the primitive for it, I could just do this in this PR.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've added the primitive for it, I could just do this in this PR.

Sorry please ignore this comment, I've got confused with another patch I'm preparing. I will add the optimization to that patch instead, which adds a primitive to read the VL from Vector<T> metadata.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a circular dependency between querying the size of Vector<T> and calling structSizeMightRepresentSIMDType. Vector<T> needs to have been seen by the JIT to query the size, but the JIT will typically not pattern match for Vector<T> (in getBaseTypeAndSizeOfSIMDType) until structSizeMightRepresentSIMDType is true.

I can't find a way to try and look for a class handle by name, and I'm assuming this is by design? So to make this optimization happen, I think I'd need to cache a Vector<T> handle when it's found, and have some sort of ready state to check whether it's been seen yet. Then switch to the optimal maximum bound when it's available.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to limit it to types that are annotated with CORINFO_FLG_INTRINSIC.

it looks like nearly every place callign structSizeMightRepresentSimdType is effectively doing something similar to this:

structSize = m_compiler->info.compCompHnd->getClassSize(structHnd);

if (m_compiler->structSizeMightRepresentSIMDType(structSize))

So I think we could pretty easily change this to pass down the CORINFO_CLASS_HANDLE (structHnd in the example) instead of the structSize -or- have structSizeMightRepresentSimdType also take in the result of getClassAttribs (which is often being queried adjacent to this already).

Then we simply check ((structFlags & (CORINFO_FLG_CONTAINS_GC_PTR | CORINFO_FLG_BYREF_LIKE | CORINFO_FLAG_INTRINSIC)) == CORINFO_FLAG_INTRINSIC) and the originalSize < MaxSimdSize, which helps us avoid saying "true" for every possible struct that is less than the max size.

Comment thread src/coreclr/vm/methodtablebuilder.cpp
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants