
Add architecture registration + fallback loading path for newly released HF model types #27

Merged — codewithdark-git merged 6 commits into main from copilot/add-flexible-model-registration, Apr 25, 2026
Conversation

Contributor

Copilot AI commented Apr 25, 2026

  • Confirm baseline state and inspect current fallback implementation/docs/tests
  • Add trust_remote_code safety warning for unregistered/new architectures
  • Improve fallback RuntimeError with resolved/base model-type guidance and registration example
  • Ensure defaults/polish: explicit base_model_fallback default path, fallback-order comment, unbounded regex cache
  • Add fallback quantization regression test
  • Update loading docs with security note and a concrete "released yesterday" example
  • Run targeted lint/tests for touched files and summarize results
  • Address automated review feedback (test import order)

Copilot AI and others added 2 commits April 25, 2026 07:18
Copilot AI changed the title from "[WIP] Add flexible model class registration and fallback system" to "Add architecture registration + fallback loading path for newly released HF model types", Apr 25, 2026
Copilot AI requested a review from codewithdark-git April 25, 2026 07:25
Owner

@codewithdark-git codewithdark-git left a comment


@copilot Thank you for this PR. It’s a very well-targeted and timely improvement. Directly addressing the “architecture not recognized by transformers” error for brand-new models is one of the highest-priority issues for QuantLLM right now. The overall direction — adding a lightweight registration system + multi-tier fallback loading — is excellent and aligns perfectly with our goal of staying competitive with Unsloth on new-model support.

What I Like

  • Clean and focused scope.
  • Good API design: register_architecture(), model_type_override, base_model_fallback, and from_config_only are intuitive and powerful.
  • Smart use of token-based regex matching with caching instead of naive string checks.
  • Solid integration with the existing quantization path (_should_apply_quantization).
  • Clear error messages with actionable suggestions — this is a big UX win.
  • Comprehensive tests and updated documentation (including contribution template).

This PR already makes QuantLLM significantly more robust for recently released models.
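For context, the registration + multi-tier fallback pattern praised above can be sketched with plain Python. This is an illustrative sketch only — the function names mirror those mentioned in this review, but the signatures and internals are assumptions, not QuantLLM's actual implementation:

```python
# Illustrative sketch of an architecture registry with multi-tier fallback.
# Names mirror the review discussion; internals are assumed, not QuantLLM's code.
_ARCHITECTURE_REGISTRY: dict[str, str] = {}

def register_architecture(model_type: str, base_model_type: str) -> None:
    """Map an unrecognized model_type onto a known base implementation."""
    _ARCHITECTURE_REGISTRY[model_type] = base_model_type

def resolve_model_type(model_type: str, known_types: set[str]) -> str:
    """Resolution order: exact match first, then the registry, then a helpful error."""
    if model_type in known_types:
        return model_type
    if model_type in _ARCHITECTURE_REGISTRY:
        return _ARCHITECTURE_REGISTRY[model_type]
    raise RuntimeError(
        f"Architecture '{model_type}' is not recognized. "
        f"Try: register_architecture('{model_type}', base_model_type='llama')"
    )

# A brand-new (hypothetical) architecture falls back to a known base type.
register_architecture("qwen3_next", "llama")
print(resolve_model_type("qwen3_next", {"llama", "mistral"}))  # → llama
```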

Suggested Improvements (Before Merging)

Here are a few concrete refinements I recommend:

  1. Security Warning for trust_remote_code=True
    Since this flag is now more prominently exposed for new models, we should add a prominent warning (both in code and docs) when it is used with unregistered or very new models. Something like:

    if trust_remote_code and not is_registered:
        logger.warning("trust_remote_code=True was enabled for an unregistered architecture. "
                       "Only use this for models from trusted sources.")

    Also add a clear note in docs/guide/loading-models.md.

  2. Improve Fallback Error Message
    The current RuntimeError is good, but we can make it even more helpful by including the resolved base model type and a one-line registration example:

    raise RuntimeError(
        f"Architecture '{config.model_type}' is not recognized.\n"
        f"Try: register_architecture('{config.model_type}', base_model_type='llama')\n"
        f"or use model_type_override='llama'."
    )
  3. Make base_model_fallback=True the Default
    For maximum “it just works” experience with new models, consider making base_model_fallback=True the default in turbo() and TurboModel.from_pretrained(). Users can still disable it with base_model_fallback=False if needed. This would reduce friction significantly.

  4. Minor Code Polish

    • Ensure _compiled_model_name_pattern uses functools.lru_cache(maxsize=None) for thread-safety and performance.
    • Add a small comment explaining the fallback priority order in _load_model_with_fallback.
    • Verify that when fallback succeeds, quantization (NF4, double quant, etc.) is still applied correctly — maybe add one integration test for 4-bit loading after fallback.
  5. Documentation
    Add one concrete real-world-style example in the loading guide using a hypothetical model released “yesterday” (e.g., Qwen3-8B or similar).
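As a reference for point 4, the cached token-based matching could look roughly like the following. This is a sketch under assumed semantics (whole-token matching on `-`, `_`, `/` boundaries), not the library's actual `_compiled_model_name_pattern`:

```python
# Sketch of a cached, token-based model-name matcher (illustrative assumptions;
# not QuantLLM's actual implementation).
import functools
import re

@functools.lru_cache(maxsize=None)
def _compiled_model_name_pattern(model_type: str) -> re.Pattern:
    # Match the model type as a whole token (e.g. "llama" in "meta-llama/Llama-3"),
    # not as a bare substring, to avoid false positives like "ollama".
    return re.compile(
        rf"(?:^|[-_/]){re.escape(model_type)}(?:$|[-_/0-9])", re.IGNORECASE
    )

def name_matches(model_name: str, model_type: str) -> bool:
    return _compiled_model_name_pattern(model_type).search(model_name) is not None

print(name_matches("meta-llama/Llama-3-8B", "llama"))  # → True
print(name_matches("mistralai/Mistral-7B", "llama"))   # → False
```

`functools.lru_cache(maxsize=None)` keeps the compiled pattern per model type without eviction, and the stdlib cache is safe to call from multiple threads.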

Overall Verdict

This PR is very close to ready — I’d rate it ~85% complete. With the above adjustments (especially the security warning and default fallback behavior), it will be production-ready and a strong addition to the library.

Once these points are addressed, I’m happy to approve and merge.

Let me know if you’d like me to push any of these changes directly or if you have questions about any suggestion.

Thanks again for the solid work!

@codewithdark-git codewithdark-git marked this pull request as ready for review April 25, 2026 07:44
Copilot AI review requested due to automatic review settings April 25, 2026 07:44


Copilot AI and others added 3 commits April 25, 2026 13:28
Agent-Logs-Url: https://github.com/codewithdark-git/QuantLLM/sessions/8867f3b4-18ae-4207-b2e8-51444418c7aa

Co-authored-by: codewithdark-git <144595403+codewithdark-git@users.noreply.github.com>
Copilot AI requested a review from codewithdark-git April 25, 2026 13:40
@codewithdark-git codewithdark-git merged commit c32c63d into main Apr 25, 2026
2 of 4 checks passed


Development

Successfully merging this pull request may close these issues.

New LLM Architectures Not Recognized by Transformers — Add Flexible Model Class Registration and Fallback System
