Skip to content

Add DELETE /users/{id} with no-resources rule#317

Open
igennova wants to merge 2 commits intoopenml:mainfrom
igennova:issue/194
Open

Add DELETE /users/{id} with no-resources rule#317
igennova wants to merge 2 commits intoopenml:mainfrom
igennova:issue/194

Conversation

@igennova
Copy link
Copy Markdown
Contributor

@igennova igennova commented Apr 19, 2026

Fix : #194

Description

Adds a new DELETE /users/{user_id} endpoint (Phase 1) so users can delete
their own OpenML account, and administrators can delete any account, as long
as the account has no uploaded resources (datasets, flows, runs, studies).

Phase 1

  • 204 No Content — account deleted successfully
  • 401 Unauthorized — no / invalid API key
  • 403 Forbidden — non-admin tries to delete another user's account
  • 404 Not Founduser_id does not exist
  • 409 Conflict — user still has uploaded datasets / flows / runs / studies
    (RFC 9457 AccountHasResourcesError)

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 19, 2026

Walkthrough

This change implements a user account deletion feature. It adds two new error classes (UserNotFoundError and AccountHasResourcesError) to support RFC 9457 problem details. Three database helper functions are introduced to check user existence, count user resources, and delete user data. A new FastAPI router defines a DELETE endpoint that enforces authorization rules, validates user existence, checks for owned resources, and performs the deletion. The implementation includes comprehensive test coverage for various scenarios including authentication, authorization, missing users, resource conflicts, and successful deletion cases.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main change: adding a DELETE /users/{id} endpoint with a no-resources constraint rule.
Description check ✅ Passed The pull request description clearly relates to the changeset, detailing a new DELETE /users/{user_id} endpoint with specific status codes and resource validation rules.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • The delete flow performs three separate DB operations (existence check, resource count, and deletion) without an explicit transaction, so consider wrapping these in a single transaction or using DB constraints to avoid race conditions where the user or their resources change between checks.
  • In delete_user_rows, the two DELETE statements could leave the DB in an inconsistent state if the second fails after the first succeeds; consider enforcing this via foreign keys with cascading deletes or executing both deletes within an explicit transaction block.
  • The tests insert users with hard-coded group_id = 2; consider using a named constant or fixture for this group ID so that tests are less brittle to changes in group configuration.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The delete flow performs three separate DB operations (existence check, resource count, and deletion) without an explicit transaction, so consider wrapping these in a single transaction or using DB constraints to avoid race conditions where the user or their resources change between checks.
- In `delete_user_rows`, the two DELETE statements could leave the DB in an inconsistent state if the second fails after the first succeeds; consider enforcing this via foreign keys with cascading deletes or executing both deletes within an explicit transaction block.
- The tests insert users with hard-coded `group_id = 2`; consider using a named constant or fixture for this group ID so that tests are less brittle to changes in group configuration.

## Individual Comments

### Comment 1
<location path="src/routers/openml/users.py" line_range="26-35" />
<code_context>
+async def delete_user_account(
</code_context>
<issue_to_address>
**issue (bug_risk):** There is a race window between checking for resources and deleting the account.

Because `count_uploaded_resources` and `delete_user_rows` operate on different DBs/connections (`expdb` vs `userdb`), we can’t easily enforce a single transaction, so another process could create new datasets/flows/runs/studies in between. Consider either documenting this as a best-effort check or tightening invariants so new resources cannot be created once deletion starts (e.g., DB constraints or a soft-delete flag plus background cleanup).
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src/routers/openml/users.py
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 19, 2026

Codecov Report

❌ Patch coverage is 98.30508% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.70%. Comparing base (621e1c3) to head (e3b016d).

Files with missing lines Patch % Lines
src/routers/openml/users.py 92.00% 2 Missing ⚠️

❌ Your patch check has failed because the patch coverage (98.30%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #317      +/-   ##
==========================================
+ Coverage   93.53%   93.70%   +0.17%     
==========================================
  Files          67       69       +2     
  Lines        3109     3227     +118     
  Branches      221      224       +3     
==========================================
+ Hits         2908     3024     +116     
- Misses        143      145       +2     
  Partials       58       58              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
src/routers/openml/users.py (2)

54-54: Nit: use HTTPStatus.NO_CONTENT instead of the literal.

Matches the idiom used elsewhere in this repo (and in the responses= dict keys above this is fine as int, but the return value tends to use the enum).

♻️ Proposed tweak
-from typing import Annotated
+from http import HTTPStatus
+from typing import Annotated
@@
-    await database.users.delete_user_rows(user_id=user_id, userdb=userdb)
-    return Response(status_code=204)
+    await database.users.delete_user_rows(user_id=user_id, userdb=userdb)
+    return Response(status_code=HTTPStatus.NO_CONTENT)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/users.py` at line 54, Replace the literal 204 return in
the users route (the line returning Response(status_code=204) in
src/routers/openml/users.py) with the enum HTTPStatus.NO_CONTENT; also add or
update the import from http import HTTPStatus at the top of the module if it’s
missing so the Response uses HTTPStatus.NO_CONTENT instead of the integer
literal.

26-53: Consider audit logging for admin-initiated deletions.

Admin deletions of another user's account are a high-impact, irreversible action but there's no logger.info(...) / structured audit event on the success path. Since the rest of the codebase already uses loguru with contextual binding, a single log line (with actor_id, target_id, whether it was self vs admin) would make this much easier to triage later without adding behavior changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/users.py` around lines 26 - 53, Add an audit log on the
success path of delete_user_account: after the permission checks and
before/after calling database.users.delete_user_rows, emit a loguru
structured/info event that records actor_id (current_user.user_id), target_id
(user_id), and whether the action was self_deletion or admin_initiated (use
current_user.is_admin() to determine). Use the same contextual binding pattern
as other modules (bind actor_id/target_id) and ensure the log is only emitted
for successful deletions performed by admins or users deleting their own account
so it appears in audit traces without changing behavior.
src/database/users.py (1)

107-116: Optional: guard against an unexpected multi-row match.

DELETE FROM users WHERE id = :user_id relies on id being the PK (safe). Consider also asserting rowcount == 1 after the second execute (or wrapping in a SELECT ... FOR UPDATE earlier) to make unexpected races (user already deleted between exists_by_id and here) observable — otherwise the endpoint silently returns 204 even if nothing was deleted.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/database/users.py` around lines 107 - 116, In delete_user_rows, guard
against unexpected multi/no-row deletes by inspecting the result of the second
DELETE (the one against "users WHERE id = :user_id")—capture the execute return
(from AsyncConnection.execute), check result.rowcount == 1 and if not either
raise an appropriate exception or return an observable error; alternatively
document or implement an earlier SELECT ... FOR UPDATE to lock/verify the user
before deletion. Ensure you reference the second execute in delete_user_rows so
races where the user was already removed do not silently succeed.
tests/routers/openml/users_delete_test.py (2)

49-61: Fragile reliance on seeded test data (user 16 / dataset 130).

The test depends on the test fixture DB having exactly "user 16 owns dataset 130". This works today but will silently break if the seed data is regenerated or re-numbered. Consider either:

  • Creating a disposable user + inserting a minimal dataset row with uploader=new_id inside the test (mirrors the pattern used in the success tests), OR
  • Pulling the expected user/dataset IDs from a shared constants module so the linkage is discoverable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routers/openml/users_delete_test.py` around lines 49 - 61, The test
test_delete_user_conflict_when_user_has_resources currently depends on
hard-coded seeded IDs (user 16 / dataset 130); change it to create its own
disposable user and a minimal dataset tied to that user's id (set
dataset.uploader = new_id) inside the test before calling py_api.delete
(mirroring the pattern used in the successful delete tests), then assert the
response is HTTPStatus.CONFLICT, content-type problem+json, body["type"] ==
AccountHasResourcesError.uri and that "datasets" appears in body["detail"];
alternatively, if your test suite provides a shared constants module for seeded
IDs, replace the hard-coded 16/130 with those constants so the ownership link is
explicit.

16-21: Minor: assert application/problem+json content-type here too.

The other error-response tests (test_delete_user_not_found, test_delete_user_conflict_when_user_has_resources) verify the RFC 9457 media type but this one does not, even though the same handler produces it. Adding the assertion gives full coverage of the error envelope.

🧪 Proposed addition
     assert response.status_code == HTTPStatus.UNAUTHORIZED
+    assert response.headers["content-type"] == "application/problem+json"
     body = response.json()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/routers/openml/users_delete_test.py` around lines 16 - 21, The test
test_delete_user_missing_auth is missing an assertion that the error response
uses the RFC 9457 media type; add an assertion after calling
py_api.delete("/users/1") to verify the response Content-Type is
"application/problem+json" (or startswith that value to tolerate charset), so
the test mirrors the other error-response tests (test_delete_user_not_found,
test_delete_user_conflict_when_user_has_resources) and fully covers the error
envelope.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/database/users.py`:
- Around line 107-116: In delete_user_rows, guard against unexpected
multi/no-row deletes by inspecting the result of the second DELETE (the one
against "users WHERE id = :user_id")—capture the execute return (from
AsyncConnection.execute), check result.rowcount == 1 and if not either raise an
appropriate exception or return an observable error; alternatively document or
implement an earlier SELECT ... FOR UPDATE to lock/verify the user before
deletion. Ensure you reference the second execute in delete_user_rows so races
where the user was already removed do not silently succeed.

In `@src/routers/openml/users.py`:
- Line 54: Replace the literal 204 return in the users route (the line returning
Response(status_code=204) in src/routers/openml/users.py) with the enum
HTTPStatus.NO_CONTENT; also add or update the import from http import HTTPStatus
at the top of the module if it’s missing so the Response uses
HTTPStatus.NO_CONTENT instead of the integer literal.
- Around line 26-53: Add an audit log on the success path of
delete_user_account: after the permission checks and before/after calling
database.users.delete_user_rows, emit a loguru structured/info event that
records actor_id (current_user.user_id), target_id (user_id), and whether the
action was self_deletion or admin_initiated (use current_user.is_admin() to
determine). Use the same contextual binding pattern as other modules (bind
actor_id/target_id) and ensure the log is only emitted for successful deletions
performed by admins or users deleting their own account so it appears in audit
traces without changing behavior.

In `@tests/routers/openml/users_delete_test.py`:
- Around line 49-61: The test test_delete_user_conflict_when_user_has_resources
currently depends on hard-coded seeded IDs (user 16 / dataset 130); change it to
create its own disposable user and a minimal dataset tied to that user's id (set
dataset.uploader = new_id) inside the test before calling py_api.delete
(mirroring the pattern used in the successful delete tests), then assert the
response is HTTPStatus.CONFLICT, content-type problem+json, body["type"] ==
AccountHasResourcesError.uri and that "datasets" appears in body["detail"];
alternatively, if your test suite provides a shared constants module for seeded
IDs, replace the hard-coded 16/130 with those constants so the ownership link is
explicit.
- Around line 16-21: The test test_delete_user_missing_auth is missing an
assertion that the error response uses the RFC 9457 media type; add an assertion
after calling py_api.delete("/users/1") to verify the response Content-Type is
"application/problem+json" (or startswith that value to tolerate charset), so
the test mirrors the other error-response tests (test_delete_user_not_found,
test_delete_user_conflict_when_user_has_resources) and fully covers the error
envelope.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: aadbfb3f-753a-450b-933c-0489450c1f46

📥 Commits

Reviewing files that changed from the base of the PR and between 621e1c3 and 885399d.

📒 Files selected for processing (5)
  • src/core/errors.py
  • src/database/users.py
  • src/main.py
  • src/routers/openml/users.py
  • tests/routers/openml/users_delete_test.py

Comment thread src/database/users.py
@igennova
Copy link
Copy Markdown
Contributor Author

@PGijsbers
I kept the explicit UNION ALL across the 15 user-FK tables instead of switching to a data-driven loop. It keeps the same single round-trip performance, matches the plain SQL style already used in src/database, and keeps this PR focused on Phase 1. If we need to reuse this list in Phase 2 (for example, cascade delete), I can move it into a USER_REFERENCE_COLUMNS tuple. Any schema drift is already handled by the IntegrityError safety net in the route handler.
Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Endpoint to delete an account

1 participant