docs: add documentation for automated EvalBench evaluation workflow and test configuration by omkargaikwad23 · Pull Request #172 · gemini-cli-extensions/cloud-sql-postgresql

omkargaikwad23 · 2026-05-08T10:47:05Z

prernakakkar-google · 2026-05-08T10:54:00Z

+1.  Open `evals/gemini_dataset.json` (and/or `evals/claude_dataset.json`).
+2.  Add a new scenario block with a unique `id`, a clear `starting_prompt`, a detailed `conversation_plan`, and the `expected_trajectory` of tool calls.
+3.  Apply the `ci:run-evals` label while creating your pull request to trigger the evaluation pipeline.
+4.  Evaluation metrics and outcomes are uploaded to BigQuery and can be monitored on the team's centralized evaluation dashboards.


This dashboard is internal, so let's not mention it in the public repo.
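
For reference, the scenario block described in step 2 of the quoted diff could be checked locally before the label is applied. This is only a rough sketch: the four field names come from the diff, but the value shapes, the example prompt, and the tool name are assumptions, and `missing_fields` and `duplicate_ids` are hypothetical helpers, not part of EvalBench.

```python
import json

# Field names come from step 2 of the quoted diff; the value shapes are assumptions.
REQUIRED_FIELDS = ("id", "starting_prompt", "conversation_plan", "expected_trajectory")


def missing_fields(scenario: dict) -> list[str]:
    """Return the required fields that are absent or empty in a scenario."""
    return [name for name in REQUIRED_FIELDS if not scenario.get(name)]


def duplicate_ids(scenarios: list[dict]) -> set[str]:
    """Return every `id` that appears more than once in a list of scenarios."""
    seen: set[str] = set()
    dupes: set[str] = set()
    for scenario in scenarios:
        scenario_id = scenario.get("id")
        if scenario_id is None:
            continue  # a missing id is reported by missing_fields instead
        if scenario_id in seen:
            dupes.add(scenario_id)
        seen.add(scenario_id)
    return dupes


# Hypothetical scenario; the prompt, plan, and tool name are invented for illustration.
example = {
    "id": "list-instances-basic",
    "starting_prompt": "List my Cloud SQL for PostgreSQL instances.",
    "conversation_plan": "Ask for the instance list, then confirm the result.",
    "expected_trajectory": ["hypothetical_list_instances_tool"],
}

if __name__ == "__main__":
    print(json.dumps(example, indent=2))
    print("missing:", missing_fields(example))
```

The real layout of `evals/gemini_dataset.json` may differ (for example, scenarios nested under a top-level key), so the helpers take plain dicts rather than reading the file directly.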

docs: add documentation for automated EvalBench evaluation workflow and test configuration

8296b10

omkargaikwad23 requested review from a team as code owners May 8, 2026 10:47

github-actions Bot assigned prernakakkar-google May 8, 2026

github-actions Bot requested a review from prernakakkar-google May 8, 2026 10:47

prernakakkar-google reviewed May 8, 2026

docs: update evaluation pipeline instructions to reference Cloud Build and maintainer review process

0dedfcc

omkargaikwad23 requested a review from prernakakkar-google May 8, 2026 12:43

omkargaikwad23 wants to merge 2 commits into main from evals-doc-update
