feat: improve codebase by LeadcodeDev · Pull Request #3 · LeadcodeDev/sqlx_gen

LeadcodeDev · 2026-06-03T19:57:16Z

No description provided.

sqlx 0.8's PgTypeInfo::with_name does not accept schema-qualified names like "agent.canal_type_enum"; emitting them causes runtime decode errors. Always emit the unqualified type name and rely on the connection's search_path to resolve non-public schemas.

35-task TDD plan covering 78 findings across security, codegen/typemap, error handling, tests/CI, and SQL ↔ Rust conformity.

Introduces codegen::identifiers with quote_ident, quote_qualified, and is_safe_ident helpers. Foundation for preventing SQL injection in generated CRUD code by quoting table/column/schema names per dialect (backticks for MySQL, double quotes for Postgres/SQLite).
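A minimal sketch of what per-dialect quoting could look like. The `Dialect` enum, the function signatures, and the choice to double embedded quote characters are assumptions made for illustration; the actual helpers in codegen::identifiers may be shaped differently.

```rust
/// Hypothetical dialect selector; the crate's own database-kind type may differ.
#[derive(Clone, Copy)]
pub enum Dialect {
    Postgres,
    MySql,
    Sqlite,
}

/// Wrap `ident` in the dialect's quote character.
/// Assumption: an embedded quote character is escaped by doubling it.
pub fn quote_ident(ident: &str, dialect: Dialect) -> String {
    let q = match dialect {
        Dialect::MySql => '`',
        Dialect::Postgres | Dialect::Sqlite => '"',
    };
    let mut out = String::with_capacity(ident.len() + 2);
    out.push(q);
    for c in ident.chars() {
        if c == q {
            out.push(q); // double the quote so it cannot terminate the identifier
        }
        out.push(c);
    }
    out.push(q);
    out
}

/// Quote an optional schema and a table name part by part.
pub fn quote_qualified(schema: Option<&str>, table: &str, dialect: Dialect) -> String {
    match schema {
        Some(s) => format!("{}.{}", quote_ident(s, dialect), quote_ident(table, dialect)),
        None => quote_ident(table, dialect),
    }
}
```

With this sketch, a column named select becomes "select" on Postgres, and a MySQL name containing a backtick keeps the backtick inside the quoted identifier instead of closing it.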

Every table, schema, and column name interpolated into generated SQL strings now goes through identifiers::quote_ident / quote_qualified. Prevents SQL injection when DB metadata contains quote characters or reserved words, and produces correct SQL for identifiers that would otherwise be ambiguous (e.g. columns named "select"). Updated 13 existing tests whose substring assertions encoded the prior unquoted output. Fixed the junction_entity test fixture to split schema/table properly instead of relying on a dotted table_name.

Each Rust type passed via --type-overrides is now parsed by syn::parse_str::<syn::Type> before being injected into generated code. Rejects empty keys/values, missing '=', and strings that aren't a single valid Rust type. Closes the code-injection path where "--type-overrides jsonb=Vec<u8>; fn pwned() {}" would have been emitted verbatim into the output.

Connection failures previously bubbled up the raw sqlx::Error which can include the full database URL (user:password@host) in its Display implementation. Wrap the pool.connect() error in a new Error::Connection variant that carries a redacted URL, and add a redact_url helper that replaces the password with "****".
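As an illustration of the redaction step, here is a sketch that masks the password with plain string slicing. The signature and the parsing strategy are assumptions; the real redact_url may rely on a URL parser and handle more edge cases (for example, unencoded '/' inside a password).

```rust
/// Replace the password in `scheme://user:password@host/...` with "****".
/// URLs without a scheme, userinfo, or password are returned unchanged.
pub fn redact_url(url: &str) -> String {
    // Locate the end of the scheme separator, e.g. "postgres://".
    let Some(scheme_end) = url.find("://").map(|i| i + 3) else {
        return url.to_string();
    };
    let rest = &url[scheme_end..];

    // The authority part ends at the first '/', '?' or '#'.
    let authority_end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let authority = &rest[..authority_end];

    // Userinfo ends at the last '@' of the authority.
    let Some(at) = authority.rfind('@') else {
        return url.to_string();
    };
    let userinfo = &authority[..at];

    // The password starts after the first ':' of the userinfo.
    let Some(colon) = userinfo.find(':') else {
        return url.to_string();
    };

    format!("{}{}:****{}", &url[..scheme_end], &userinfo[..colon], &rest[at..])
}
```

The redacted string is what an Error::Connection-style variant would carry, so the Display output never includes the secret.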

parse_and_format / parse_and_format_with_tab_spaces previously called std::process::exit(1) when prettyplease failed to parse the generated TokenStream. That kills the user's build with no recovery path and no useful diagnostic if sqlx-gen is ever used as a library. Now both helpers return Result<String, error::Error>; format_tokens, format_tokens_with_imports, and codegen::generate propagate via ?. The error message includes the failing token stream and a request to file an issue, since this only fires on internal codegen bugs. Test helpers across struct_gen, enum_gen, composite_gen, domain_gen, crud_gen, codegen::mod, and e2e_sqlite were updated to .unwrap() the Result — they assert on the happy path and want a clear panic if it breaks.

Removed 7× .expect() on MySQL information_schema Vec<u8> → String conversions, replaced with utf8_field helper that returns Error::Config on invalid UTF-8. Removed 5× .last_mut().unwrap() panic risks across postgres.rs and mysql.rs (tables, views, enums, composites). Each now returns Error::Config with an "internal sqlx-gen bug" message that points the user at filing an issue rather than crashing the build.
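The byte-to-string conversion can be sketched as below. The `Error` enum here is a local stand-in and the `utf8_field` signature (including the `field` label parameter) is hypothetical; only the idea of returning a config error instead of panicking comes from the commit message.

```rust
/// Local stand-in for the crate's error type.
#[derive(Debug, PartialEq)]
pub enum Error {
    Config(String),
}

/// Convert a raw information_schema value to a String,
/// reporting invalid UTF-8 as an error rather than panicking.
pub fn utf8_field(bytes: Vec<u8>, field: &str) -> Result<String, Error> {
    String::from_utf8(bytes)
        .map_err(|e| Error::Config(format!("column `{field}` is not valid UTF-8: {e}")))
}
```

Callers can then propagate the failure with `?` where an `.expect()` used to sit.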

write_atomic streams into a sibling NamedTempFile then renames into place, so a Ctrl-C or disk-full error never leaves a half-written .rs file that would break the user's next build. validate_safe_filename rejects path separators, "..", absolute paths, empty names, and non-.rs extensions before any write happens. Defends against the rare case where introspected table names flow into the output filename and could otherwise escape output_dir.
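The filename checks listed above could be sketched roughly as follows. The error type (a plain String) and the exact rule set are assumptions; in particular this sketch rejects ".." only as a path component, which may be stricter or looser than the real validate_safe_filename.

```rust
use std::path::{Component, Path};

/// Accept only a bare `<name>.rs` file name that cannot escape the output directory.
pub fn validate_safe_filename(name: &str) -> Result<(), String> {
    if name.is_empty() {
        return Err("file name is empty".to_string());
    }
    // Reject both separator styles regardless of the host platform.
    if name.contains('/') || name.contains('\\') {
        return Err(format!("file name `{name}` contains a path separator"));
    }
    let path = Path::new(name);
    if path.is_absolute() {
        return Err(format!("file name `{name}` is an absolute path"));
    }
    // The name must be exactly one normal component (rules out "." and "..").
    let mut components = path.components();
    match (components.next(), components.next()) {
        (Some(Component::Normal(_)), None) => {}
        _ => return Err(format!("file name `{name}` is not a plain file name")),
    }
    if path.extension().and_then(|ext| ext.to_str()) != Some("rs") {
        return Err(format!("file name `{name}` does not end in .rs"));
    }
    Ok(())
}
```

Running the validation before any temp file is created keeps the atomic write itself free of path-handling concerns.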

Runs on every push to main and every PR. Three primary jobs:
- test: cargo test --all (unit + sqlite-based integration)
- fmt: rustfmt --check
- clippy: -D warnings on all targets

Two optional jobs spin up Postgres 16 and MySQL 8.0 services and run the e2e_postgres / e2e_mysql test files (added in upcoming commits). These are marked continue-on-error until the e2e suites exist.

…e decimal
- MySQL `bit(1)` → bool (idiomatic boolean column); bit(N>1) stays Vec<u8>
- MySQL `boolean`/`bool` aliases → bool (previously fell through to String)
- Postgres `interval` → PgInterval with the corresponding import (was hitting the String fallback)
- SQLite `NUMERIC`/`DECIMAL` → Decimal instead of f64; matches the precision-safe behaviour already shipped for Postgres and MySQL

Before this commit, a Postgres column of type my_enum[] was mapped to Vec<MyEnum> but the generated MyEnum had no PgHasArrayType impl. At runtime sqlx then bailed with "unsupported type _my_enum of column #N" because it could not resolve the array's element type info. Now both enum_gen and composite_gen emit an `impl PgHasArrayType` whose array_type_info() returns PgTypeInfo::with_name("_<name>"), which matches how Postgres names array types. Gated on DatabaseKind::Postgres so MySQL and SQLite output is unchanged.

Two SQL enum values like 'foo bar' and 'foo_bar' both collapse to the Rust identifier FooBar via to_upper_camel_case, which previously generated code that would not compile. check_variant_collisions runs during codegen::generate and returns a clear Error::Config pointing at the conflicting variants and the Rust identifier they share.
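A sketch of the collision check, under stated assumptions: `upper_camel` is a simplified stand-in for the to_upper_camel_case conversion (it does not replicate every word-boundary rule), and the function signature and String error are hypothetical.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for to_upper_camel_case: split on non-alphanumeric
/// characters and capitalise each word.
fn upper_camel(value: &str) -> String {
    value
        .split(|c: char| !c.is_alphanumeric())
        .filter(|word| !word.is_empty())
        .map(|word| {
            let mut chars = word.chars();
            match chars.next() {
                Some(first) => first
                    .to_uppercase()
                    .chain(chars.flat_map(char::to_lowercase))
                    .collect::<String>(),
                None => String::new(),
            }
        })
        .collect()
}

/// Fail if two SQL enum values map to the same Rust variant identifier.
pub fn check_variant_collisions(enum_name: &str, values: &[&str]) -> Result<(), String> {
    // Rust identifier -> first SQL value that produced it.
    let mut seen: BTreeMap<String, &str> = BTreeMap::new();
    for &value in values {
        let ident = upper_camel(value);
        if let Some(previous) = seen.insert(ident.clone(), value) {
            return Err(format!(
                "enum `{enum_name}`: SQL values '{previous}' and '{value}' both map to Rust variant `{ident}`"
            ));
        }
    }
    Ok(())
}
```

Reporting both SQL values alongside the shared identifier tells the user exactly which values to rename or override.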

Columns named "user-id", "created at", "123foo" etc. previously produced Rust code that wouldn't compile because format_ident! cannot encode dashes/spaces/leading digits. sanitize_rust_ident:
- replaces every non-alphanumeric (and non-_) character with '_'
- prefixes a '_' if the result starts with a digit
- falls back to "_field" on an empty string

The original DB column name is preserved via the existing #[sqlx(rename = "<original>")] rewrite, so reads and writes still hit the right column.
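The three rules above can be sketched in a few lines. Treating "alphanumeric" as ASCII-only is an assumption of this sketch, and Rust keywords are not handled here.

```rust
/// Turn an arbitrary column name into something usable as a Rust field identifier.
pub fn sanitize_rust_ident(name: &str) -> String {
    // Rule 1: every character that is not [A-Za-z0-9_] becomes '_'.
    let mut out: String = name
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect();

    // Rule 3: an empty input falls back to a fixed placeholder.
    if out.is_empty() {
        return "_field".to_string();
    }

    // Rule 2: identifiers cannot start with a digit.
    if out.starts_with(|c: char| c.is_ascii_digit()) {
        out.insert(0, '_');
    }
    out
}
```

Because the original name is kept in the rename attribute, the sanitised identifier only has to be valid Rust, not reversible.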

Tables in "public" (Postgres), "main" (SQLite), or "dbo" no longer get their schema rendered into every generated SELECT/INSERT/UPDATE/DELETE. The qualified form is still used for non-default schemas, where it is required for unambiguous resolution.
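A small sketch of the default-schema rule. The function names are hypothetical, and identifier quoting is deliberately left out to keep the example focused on when the schema is rendered.

```rust
/// Schemas the commit message names as defaults.
pub fn is_default_schema(schema: &str) -> bool {
    matches!(schema, "public" | "main" | "dbo")
}

/// Render a table reference, qualifying it only for non-default schemas.
pub fn table_ref(schema: Option<&str>, table: &str) -> String {
    match schema {
        Some(s) if !is_default_schema(s) => format!("{s}.{table}"),
        _ => table.to_string(),
    }
}
```

Under this rule a table in public is emitted as users, while one in a schema such as agent is emitted as agent.users.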

The audit flagged inline ENUMs as potentially broken, but the existing per-variant #[sqlx(rename)] emitted whenever the camelCase identifier differs from the SQL value is exactly what sqlx::Type expects for text encoding on MySQL/SQLite. These tests pin that behaviour for both lowercase and case-sensitive variants so a future refactor can't silently regress it.

`--domain-style alias` (default) keeps the existing `pub type Email = String;` behaviour. `--domain-style newtype` instead emits

    #[derive(..., sqlx::Type)]
    #[sqlx(transparent)]
    pub struct Email(pub String);

so the user can attach validation, traits, or accessors to the domain. Both styles share the same doc-comment and codegen plumbing via the new DomainStyle enum and generate_with_domain_style entry point. CLI defaults preserve current behaviour exactly.

SQLite has no native enum type, so users encode them with

    TEXT CHECK (status IN ('active', 'inactive'))

extract_check_enums parses sqlite_master.sql for each table, looks for that pattern column-by-column, and synthesises an EnumInfo plus rewrites the column's udt_name to <table>_<col>_enum. From there the existing enum/typemap pipeline takes over and emits a real Rust enum that round-trips via per-variant #[sqlx(rename)].

contextualize_sqlx_error inspects the SQLSTATE on a sqlx::Error and re-raises:
- 42501 / 28000 → PermissionDenied with a hint about the DB user's privileges on information_schema / pg_catalog / sqlite_master
- 42P01 / 3F000 / 42S02 → SchemaNotFound with a hint about --schemas

Other sqlx::Error values still fall through to the existing Error::Database variant, so the public API and behaviour are unchanged for unrelated failures.
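The SQLSTATE mapping can be sketched without a sqlx dependency by working on the code string directly. The `Error` enum, its fields, the hint texts, and the `classify_sqlstate` name are all hypothetical; the real function takes a sqlx::Error and would first extract the code from the database error.

```rust
/// Local stand-in for the crate's error type.
#[derive(Debug, PartialEq)]
pub enum Error {
    PermissionDenied { hint: &'static str },
    SchemaNotFound { hint: &'static str },
    Database(String),
}

/// Map a SQLSTATE code to a more specific error, falling back to `Database`.
pub fn classify_sqlstate(sqlstate: Option<&str>, message: &str) -> Error {
    match sqlstate {
        // Insufficient privilege / invalid authorization.
        Some("42501") | Some("28000") => Error::PermissionDenied {
            hint: "check the DB user's privileges on the catalog tables",
        },
        // Undefined table / invalid schema name / MySQL table-not-found.
        Some("42P01") | Some("3F000") | Some("42S02") => Error::SchemaNotFound {
            hint: "check the values passed to --schemas",
        },
        // Everything else keeps the generic database error.
        _ => Error::Database(message.to_string()),
    }
}
```

Keeping the fallback arm generic is what leaves behaviour unchanged for unrelated failures.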

LAST_INSERT_ID() only returns a meaningful value when the table has a single AUTO_INCREMENT primary key. For composite PKs:
- include every PK column in InsertParams so the user can supply them
- run the INSERT with the bound values
- SELECT the freshly inserted row by binding the same PK values

build_insert_method_parsed and build_insert_many_transactionally_method both branch on pk_fields.len(); single-PK MySQL flows continue to use LAST_INSERT_ID exactly as before. Postgres / SQLite are unaffected because their RETURNING * already handled this case.

compile_check.rs validates that codegen output is loadable in two modes:
1. Fast path (always on): each GeneratedFile is parsed with syn::parse_file. Catches malformed attributes, unclosed braces, invalid identifiers, and anything else that breaks at the AST level. Runs across Postgres, MySQL, SQLite, and the newtype-domain variant.
2. Deep path (gated on SQLX_GEN_COMPILE_CHECK=1): scaffolds a temporary downstream crate, drops the generated code into src/lib.rs, and runs `cargo check` with the full sqlx dependency tree. This is the only check that confirms the emitted derives and #[sqlx(...)] attributes are actually accepted by sqlx itself.

Postgres' information_schema.columns reports the schema in which a column's user-defined type lives (e.g. "auth" for an auth.role enum column, "pg_catalog" for builtins). Capture it on every column so the typemap and codegen layers can disambiguate two schemas declaring a type with the same name.
- Adds udt_schema: Option<String> to ColumnInfo
- Postgres fetch_tables / fetch_views select COALESCE(udt_schema, '') and unpack to None when empty
- MySQL, SQLite, and synthetic test fixtures keep it None
- ColumnInfo derives Default so future test code can use struct update syntax

When the same SQL name (e.g. "role") exists in two non-default schemas, sqlx-gen now prefixes the Rust identifier with the schema's PascalCase form: auth.role → AuthRole, billing.role → BillingRole. The bare PascalCase ("Role") is reserved for unique names and for the default schema even when a collision exists.

Plumbing:
- codegen::rust_type_name_for + type_name_has_cross_schema_collision as the single source of truth, callable from typemap and from each *_gen module.
- typemap::postgres exposes map_type_qualified that takes the column's udt_schema (added in the previous commit) so cross-schema duplicate lookups land on the right (schema, name) pair.
- enum_gen::generate_enum_with_schema wraps the legacy entry point and propagates the SchemaInfo so the emitted Rust enum carries the prefixed name. composite_gen and domain_gen call rust_type_name_for directly since they already receive SchemaInfo.
- codegen::generate now calls generate_enum_with_schema.

The SQL #[sqlx(type_name = "...")] attribute is still emitted in its unqualified form because sqlx 0.8 doesn't accept "schema.type"; users remain responsible for setting search_path on the connection.
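The naming rule can be sketched as below. This is one reading of the commit message, not the actual implementation: the parameter list (a flat slice of (schema, name) pairs instead of SchemaInfo), the simplified `pascal` helper, and treating only "public" as the default schema are assumptions.

```rust
use std::collections::BTreeSet;

/// Simplified PascalCase conversion for snake_case names.
fn pascal(s: &str) -> String {
    s.split('_')
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                Some(first) => first.to_uppercase().chain(chars).collect::<String>(),
                None => String::new(),
            }
        })
        .collect()
}

/// True when `name` is declared in more than one schema.
pub fn type_name_has_cross_schema_collision(name: &str, all_types: &[(&str, &str)]) -> bool {
    let schemas: BTreeSet<&str> = all_types
        .iter()
        .filter(|(_, n)| *n == name)
        .map(|(s, _)| *s)
        .collect();
    schemas.len() > 1
}

/// Prefix the Rust name with the schema only for colliding, non-default-schema types.
pub fn rust_type_name_for(schema: &str, name: &str, all_types: &[(&str, &str)]) -> String {
    let is_default = schema == "public"; // assumption: Postgres default only
    if !is_default && type_name_has_cross_schema_collision(name, all_types) {
        format!("{}{}", pascal(schema), pascal(name))
    } else {
        pascal(name)
    }
}
```

Routing every caller through a single function like this is what keeps the typemap and the generators in agreement about the final Rust name.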

When an enum or composite lives in a schema other than public, sqlx 0.8 cannot resolve its unqualified type_name unless the connection's search_path includes that schema. To make this discoverable:
- Emit a /// doc-comment on every non-default-schema enum and composite spelling out the requirement with a copy-paste-ready SET search_path snippet
- Add codegen::required_pg_search_path(&schema_info), which returns the sorted, deduplicated list of schemas needed
- Make the CLI log the exact SET search_path line after introspection when the result references any non-default schemas
- Document the whole flow (after_connect hook + collision prefixing) in a new "PostgreSQL — multi-schema setup" section in README.md
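A sketch of how the sorted, deduplicated schema list and the logged statement might be derived. The real required_pg_search_path takes a SchemaInfo; the slice-based signature, the `set_search_path_statement` helper, and appending public at the end of the statement are assumptions of this example.

```rust
use std::collections::BTreeSet;

/// Collect the non-default schemas that own at least one enum or composite.
/// `type_schemas` is a hypothetical flat list with one entry per type.
pub fn required_pg_search_path(type_schemas: &[&str]) -> Vec<String> {
    // A BTreeSet both deduplicates and yields the schemas in sorted order.
    let unique: BTreeSet<&str> = type_schemas
        .iter()
        .copied()
        .filter(|schema| *schema != "public")
        .collect();
    unique.into_iter().map(str::to_string).collect()
}

/// Build the statement to log, or None when no extra schema is needed.
pub fn set_search_path_statement(schemas: &[String]) -> Option<String> {
    if schemas.is_empty() {
        return None;
    }
    Some(format!("SET search_path TO {}, public", schemas.join(", ")))
}
```

A deterministic order keeps the generated doc-comments and the CLI log stable between runs, which avoids spurious diffs in generated files.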

…ype)

#[derive(sqlx::Type)] combined with #[sqlx(type_name = "x")] already auto-generates `impl PgHasArrayType` pointing at `_x` in sqlx 0.8+. The manual impl added by Task 27 collided with the derive output, producing E0119 "conflicting implementations" in any downstream crate that consumed the generated types. Remove the manual block from enum_gen and composite_gen, replace the "must emit" tests with "must NOT emit" regressions across all three dialects, and rely on the sqlx derive for array support.

Every column, table, and schema reference was previously emitted with unconditional dialect quotes. For lowercase ASCII names that aren't reserved words this produced noisy SQL ("agent"."agent__connector", "connector_id" = $1) without any added safety. quote_ident now defers to is_safe_unquoted: an identifier is emitted bare when it starts with a lowercase letter or underscore, contains only ASCII lowercase / digits / underscores, and is not in a curated ~100-word SQL reserved list (sorted, binary-searched). quote_ident_always remains for sites that genuinely need to force the quotes. quote_qualified composes per-part. This means agent.agent__connector instead of "agent"."agent__connector" on the user's reported schema, while user-supplied DB names that collide with SELECT / order / user etc. still get quoted defensively.
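The bare-identifier test described above can be sketched as follows. The reserved list here is a tiny illustrative subset rather than the curated ~100-word list, and the signature is an assumption.

```rust
/// Illustrative subset of reserved words. Must stay sorted for binary_search.
const RESERVED: &[&str] = &[
    "all", "and", "as", "by", "from", "group", "order", "select", "table", "user", "where",
];

/// True when `ident` can be emitted without quotes:
/// starts with [a-z_], continues with [a-z0-9_], and is not reserved.
pub fn is_safe_unquoted(ident: &str) -> bool {
    let mut chars = ident.chars();
    let starts_ok = matches!(chars.next(), Some(c) if c.is_ascii_lowercase() || c == '_');
    starts_ok
        && chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
        && RESERVED.binary_search(&ident).is_err()
}
```

Any identifier that fails this predicate, including mixed-case names, would still go through the quoting path, so the optimisation only affects names that are unambiguous when left bare.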

The two crates used to declare their version independently (0.5.5 in both, but with sqlx-gen-macros pinned at 0.5.4 inside sqlx-gen). A single field bump would have to happen in three places before they matched again, and the cross-dep made silent drift easy to ship.
- Root Cargo.toml grows [workspace.package] with version, edition, rust-version, license, repository, keywords, categories.
- Root Cargo.toml grows [workspace.dependencies] declaring every dependency once, including the internal sqlx-gen-macros (now always = the workspace version) and every external crate.
- Each member crate inherits with `*.workspace = true`. Per-crate Cargo.toml shrinks to per-crate fields only (name, description, features, bin).
- .gitignore now excludes /docs/superpowers/ so locally generated audit/plan files stay out of the repo.

LeadcodeDev added 30 commits June 3, 2026 16:05

docs: add Rust engineering audit and remediation plan

3c132b3

35-task TDD plan covering 78 findings across security, codegen/typemap, error handling, tests/CI, and SQL ↔ Rust conformity.

chore: ignore macOS .DS_Store files

5ca1a07

fix: wire enum collision check into codegen::generate

0d6dbf8

feat: accept Postgres array type names in both _x and x[] notations

89d1891

chore: document MSRV as rust 1.75 in both crates

dd6c90e

fix: short-circuit insert_many_transactionally on empty input

ccb11c1

feat: warn when introspection returns no tables/views/enums

810d732

fix: pick raw-string fence width that avoids embedded #

c89459b

chore: silence clippy warnings (manual strip_prefix, broken doc list)

c66a189

style: apply rustfmt

ac18eb1

LeadcodeDev added 8 commits June 3, 2026 17:04

style: apply rustfmt to compile_check harness

6a3c3ac

chore: uprade version

b37bb3f

LeadcodeDev self-assigned this Jun 3, 2026

LeadcodeDev added the enhancement New feature or request label Jun 3, 2026

LeadcodeDev merged commit 76cdf2e into main Jun 3, 2026
5 checks passed

LeadcodeDev deleted the feat/improve-codebase branch June 3, 2026 19:58

LeadcodeDev merged 38 commits into main from feat/improve-codebase