Remove 128-bit limit on Vector<T> size for ARM64#129852
Conversation
If InstructionSet_VectorT is available, set the class instance size to the process SVE vector length. Increase the maximum bound in structMightRepresentSIMDType to allow the JIT to detect this when the ISA is present.
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
| #ifdef FEATURE_SIMD | ||
| return (structSize >= getMinVectorByteLength()) && (structSize <= getMaxVectorByteLength()); | ||
| #ifdef TARGET_ARM64 | ||
| const uint32_t max = compExactlyDependsOn(InstructionSet_VectorT) ? MAX_SVE_REGSIZE_BYTES : FP_REGSIZE_BYTES; |
There was a problem hiding this comment.
This is going to potentially cause light up for a lot of unintended structs, which can hurt startup perf
Won't SVE, in most scenarios, rather be "size unknown" and in isolated scenarios a JIT (but not AOT or pre-JIT) environment may be able to explicitly query the true size and optimize a few things (like frame layout)
There was a problem hiding this comment.
Yes agreed, we could query the actual size here in JIT mode, and use this as the upper bound. We can also filter sizes that are powers of 2 in bits, which could help with AOT as well.
As I've added the primitive for it, I could just do this in this PR.
There was a problem hiding this comment.
As I've added the primitive for it, I could just do this in this PR.
Sorry please ignore this comment, I've got confused with another patch I'm preparing. I will add the optimization to that patch instead, which adds a primitive to read the VL from Vector<T> metadata.
There was a problem hiding this comment.
There's a circular dependency between querying the size of Vector<T> and calling structSizeMightRepresentSIMDType. Vector<T> needs to have been seen by the JIT to query the size, but the JIT will typically not pattern match for Vector<T> (in getBaseTypeAndSizeOfSIMDType) until structSizeMightRepresentSIMDType is true.
I can't find a way to try and look for a class handle by name, and I'm assuming this is by design? So to make this optimization happen, I think I'd need to cache a Vector<T> handle when it's found, and have some sort of ready state to check whether it's been seen yet. Then switch to the optimal maximum bound when it's available.
There was a problem hiding this comment.
We should be able to limit it to types that are annotated with CORINFO_FLG_INTRINSIC.
it looks like nearly every place callign structSizeMightRepresentSimdType is effectively doing something similar to this:
structSize = m_compiler->info.compCompHnd->getClassSize(structHnd);
if (m_compiler->structSizeMightRepresentSIMDType(structSize))So I think we could pretty easily change this to pass down the CORINFO_CLASS_HANDLE (structHnd in the example) instead of the structSize -or- have structSizeMightRepresentSimdType also take in the result of getClassAttribs (which is often being queried adjacent to this already).
Then we simply check ((structFlags & (CORINFO_FLG_CONTAINS_GC_PTR | CORINFO_FLG_BYREF_LIKE | CORINFO_FLAG_INTRINSIC)) == CORINFO_FLAG_INTRINSIC) and the originalSize < MaxSimdSize, which helps us avoid saying "true" for every possible struct that is less than the max size.
If
InstructionSet_VectorTis available, set the class instance size to the process SVE vector length.Increase the maximum bound in
structMightRepresentSIMDTypeto allow the JIT to detect this when the ISA is present.