Strip Gemma 4 thinking tokens from translation output by PolynomialDivision · Pull Request #29 · LibreTranslate/LTEngine

PolynomialDivision · 2026-06-18T17:45:08Z

I am sorry, the "Big" PR was not supposed to be merged anymore, since I thought splitting it up into several smaller ones would be good for review. Now it got a bit scrambled, and some PRs I squashed differently into the other ones, so the code is now in a weird state. I am going through everything and checking again. Sorry for that. So this PR is still needed since Gemma 4 answers contain this "thought".

PolynomialDivision · 2026-06-18T17:54:12Z

For the gemma4 model we need update-llama-cpp-2026-06-17 from llama-cpp-rs. But I guess it is only a matter of days when it is merged. :D It contains an important fix for the chat template. Sorry again, that I screwed up the PRs.

Gemma 4 emits thinking content in two forms: - <|channel>thought\n...<channel|>answer (full block with closing tag) - <|channel>thought answer (no closing tag, space-separated) Handle both cases so thinking tokens never leak into the translation result.

Emit a warning when apply_chat_template fails and ltengine falls back to the hardcoded Gemma prompt format.

pierotofy · 2026-06-18T19:20:59Z

This looks OK, although it's a bit of a hack, we can include this, but long term it might be better to explicitly disable thinking mode from certain models, based on https://ai.google.dev/gemma/docs/capabilities/thinking it should be possible.

PolynomialDivision · 2026-06-18T19:30:19Z

This looks OK, although it's a bit of a hack, we can include this, but long term it might be better to explicitly disable thinking mode from certain models, based on https://ai.google.dev/gemma/docs/capabilities/thinking it should be possible.

I will look into this. :)

PolynomialDivision marked this pull request as draft June 18, 2026 18:02

PolynomialDivision force-pushed the strip-gemma4 branch from eb5702d to d1052b5 Compare June 18, 2026 18:10

Log chat template fallback failures

9e897d7

Emit a warning when apply_chat_template fails and ltengine falls back to the hardcoded Gemma prompt format.

PolynomialDivision force-pushed the strip-gemma4 branch from 711e00f to 9e897d7 Compare June 18, 2026 18:19

PolynomialDivision marked this pull request as ready for review June 18, 2026 18:25

pierotofy merged commit 594a1f7 into LibreTranslate:main Jun 18, 2026
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Strip Gemma 4 thinking tokens from translation output#29

Strip Gemma 4 thinking tokens from translation output#29
pierotofy merged 2 commits into
LibreTranslate:mainfrom
PolynomialDivision:strip-gemma4

PolynomialDivision commented Jun 18, 2026

Uh oh!

PolynomialDivision commented Jun 18, 2026

Uh oh!

pierotofy commented Jun 18, 2026

Uh oh!

Uh oh!

PolynomialDivision commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

PolynomialDivision commented Jun 18, 2026

Uh oh!

PolynomialDivision commented Jun 18, 2026

Uh oh!

pierotofy commented Jun 18, 2026

Uh oh!

Uh oh!

PolynomialDivision commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants