Add DELETE /users/{id} with no-resources rule#317
Add DELETE /users/{id} with no-resources rule#317igennova wants to merge 2 commits intoopenml:mainfrom
Conversation
WalkthroughThis change implements a user account deletion feature. It adds two new error classes ( 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The delete flow performs three separate DB operations (existence check, resource count, and deletion) without an explicit transaction, so consider wrapping these in a single transaction or using DB constraints to avoid race conditions where the user or their resources change between checks.
- In
delete_user_rows, the two DELETE statements could leave the DB in an inconsistent state if the second fails after the first succeeds; consider enforcing this via foreign keys with cascading deletes or executing both deletes within an explicit transaction block. - The tests insert users with hard-coded
group_id = 2; consider using a named constant or fixture for this group ID so that tests are less brittle to changes in group configuration.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The delete flow performs three separate DB operations (existence check, resource count, and deletion) without an explicit transaction, so consider wrapping these in a single transaction or using DB constraints to avoid race conditions where the user or their resources change between checks.
- In `delete_user_rows`, the two DELETE statements could leave the DB in an inconsistent state if the second fails after the first succeeds; consider enforcing this via foreign keys with cascading deletes or executing both deletes within an explicit transaction block.
- The tests insert users with hard-coded `group_id = 2`; consider using a named constant or fixture for this group ID so that tests are less brittle to changes in group configuration.
## Individual Comments
### Comment 1
<location path="src/routers/openml/users.py" line_range="26-35" />
<code_context>
+async def delete_user_account(
</code_context>
<issue_to_address>
**issue (bug_risk):** There is a race window between checking for resources and deleting the account.
Because `count_uploaded_resources` and `delete_user_rows` operate on different DBs/connections (`expdb` vs `userdb`), we can’t easily enforce a single transaction, so another process could create new datasets/flows/runs/studies in between. Consider either documenting this as a best-effort check or tightening invariants so new resources cannot be created once deletion starts (e.g., DB constraints or a soft-delete flag plus background cleanup).
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (98.30%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #317 +/- ##
==========================================
+ Coverage 93.53% 93.70% +0.17%
==========================================
Files 67 69 +2
Lines 3109 3227 +118
Branches 221 224 +3
==========================================
+ Hits 2908 3024 +116
- Misses 143 145 +2
Partials 58 58 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
src/routers/openml/users.py (2)
54-54: Nit: useHTTPStatus.NO_CONTENTinstead of the literal.Matches the idiom used elsewhere in this repo (and in the
responses=dict keys above this is fine as int, but the return value tends to use the enum).♻️ Proposed tweak
-from typing import Annotated +from http import HTTPStatus +from typing import Annotated @@ - await database.users.delete_user_rows(user_id=user_id, userdb=userdb) - return Response(status_code=204) + await database.users.delete_user_rows(user_id=user_id, userdb=userdb) + return Response(status_code=HTTPStatus.NO_CONTENT)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/users.py` at line 54, Replace the literal 204 return in the users route (the line returning Response(status_code=204) in src/routers/openml/users.py) with the enum HTTPStatus.NO_CONTENT; also add or update the import from http import HTTPStatus at the top of the module if it’s missing so the Response uses HTTPStatus.NO_CONTENT instead of the integer literal.
26-53: Consider audit logging for admin-initiated deletions.Admin deletions of another user's account are a high-impact, irreversible action but there's no
logger.info(...)/ structured audit event on the success path. Since the rest of the codebase already usesloguruwith contextual binding, a single log line (withactor_id,target_id, whether it was self vs admin) would make this much easier to triage later without adding behavior changes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/users.py` around lines 26 - 53, Add an audit log on the success path of delete_user_account: after the permission checks and before/after calling database.users.delete_user_rows, emit a loguru structured/info event that records actor_id (current_user.user_id), target_id (user_id), and whether the action was self_deletion or admin_initiated (use current_user.is_admin() to determine). Use the same contextual binding pattern as other modules (bind actor_id/target_id) and ensure the log is only emitted for successful deletions performed by admins or users deleting their own account so it appears in audit traces without changing behavior.src/database/users.py (1)
107-116: Optional: guard against an unexpected multi-row match.
DELETE FROM users WHERE id = :user_idrelies onidbeing the PK (safe). Consider also assertingrowcount == 1after the second execute (or wrapping in aSELECT ... FOR UPDATEearlier) to make unexpected races (user already deleted betweenexists_by_idand here) observable — otherwise the endpoint silently returns 204 even if nothing was deleted.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/database/users.py` around lines 107 - 116, In delete_user_rows, guard against unexpected multi/no-row deletes by inspecting the result of the second DELETE (the one against "users WHERE id = :user_id")—capture the execute return (from AsyncConnection.execute), check result.rowcount == 1 and if not either raise an appropriate exception or return an observable error; alternatively document or implement an earlier SELECT ... FOR UPDATE to lock/verify the user before deletion. Ensure you reference the second execute in delete_user_rows so races where the user was already removed do not silently succeed.tests/routers/openml/users_delete_test.py (2)
49-61: Fragile reliance on seeded test data (user 16 / dataset 130).The test depends on the test fixture DB having exactly "user 16 owns dataset 130". This works today but will silently break if the seed data is regenerated or re-numbered. Consider either:
- Creating a disposable user + inserting a minimal
datasetrow withuploader=new_idinside the test (mirrors the pattern used in the success tests), OR- Pulling the expected user/dataset IDs from a shared constants module so the linkage is discoverable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/routers/openml/users_delete_test.py` around lines 49 - 61, The test test_delete_user_conflict_when_user_has_resources currently depends on hard-coded seeded IDs (user 16 / dataset 130); change it to create its own disposable user and a minimal dataset tied to that user's id (set dataset.uploader = new_id) inside the test before calling py_api.delete (mirroring the pattern used in the successful delete tests), then assert the response is HTTPStatus.CONFLICT, content-type problem+json, body["type"] == AccountHasResourcesError.uri and that "datasets" appears in body["detail"]; alternatively, if your test suite provides a shared constants module for seeded IDs, replace the hard-coded 16/130 with those constants so the ownership link is explicit.
16-21: Minor: assertapplication/problem+jsoncontent-type here too.The other error-response tests (
test_delete_user_not_found,test_delete_user_conflict_when_user_has_resources) verify the RFC 9457 media type but this one does not, even though the same handler produces it. Adding the assertion gives full coverage of the error envelope.🧪 Proposed addition
assert response.status_code == HTTPStatus.UNAUTHORIZED + assert response.headers["content-type"] == "application/problem+json" body = response.json()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/routers/openml/users_delete_test.py` around lines 16 - 21, The test test_delete_user_missing_auth is missing an assertion that the error response uses the RFC 9457 media type; add an assertion after calling py_api.delete("/users/1") to verify the response Content-Type is "application/problem+json" (or startswith that value to tolerate charset), so the test mirrors the other error-response tests (test_delete_user_not_found, test_delete_user_conflict_when_user_has_resources) and fully covers the error envelope.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/database/users.py`:
- Around line 107-116: In delete_user_rows, guard against unexpected
multi/no-row deletes by inspecting the result of the second DELETE (the one
against "users WHERE id = :user_id")—capture the execute return (from
AsyncConnection.execute), check result.rowcount == 1 and if not either raise an
appropriate exception or return an observable error; alternatively document or
implement an earlier SELECT ... FOR UPDATE to lock/verify the user before
deletion. Ensure you reference the second execute in delete_user_rows so races
where the user was already removed do not silently succeed.
In `@src/routers/openml/users.py`:
- Line 54: Replace the literal 204 return in the users route (the line returning
Response(status_code=204) in src/routers/openml/users.py) with the enum
HTTPStatus.NO_CONTENT; also add or update the import from http import HTTPStatus
at the top of the module if it’s missing so the Response uses
HTTPStatus.NO_CONTENT instead of the integer literal.
- Around line 26-53: Add an audit log on the success path of
delete_user_account: after the permission checks and before/after calling
database.users.delete_user_rows, emit a loguru structured/info event that
records actor_id (current_user.user_id), target_id (user_id), and whether the
action was self_deletion or admin_initiated (use current_user.is_admin() to
determine). Use the same contextual binding pattern as other modules (bind
actor_id/target_id) and ensure the log is only emitted for successful deletions
performed by admins or users deleting their own account so it appears in audit
traces without changing behavior.
In `@tests/routers/openml/users_delete_test.py`:
- Around line 49-61: The test test_delete_user_conflict_when_user_has_resources
currently depends on hard-coded seeded IDs (user 16 / dataset 130); change it to
create its own disposable user and a minimal dataset tied to that user's id (set
dataset.uploader = new_id) inside the test before calling py_api.delete
(mirroring the pattern used in the successful delete tests), then assert the
response is HTTPStatus.CONFLICT, content-type problem+json, body["type"] ==
AccountHasResourcesError.uri and that "datasets" appears in body["detail"];
alternatively, if your test suite provides a shared constants module for seeded
IDs, replace the hard-coded 16/130 with those constants so the ownership link is
explicit.
- Around line 16-21: The test test_delete_user_missing_auth is missing an
assertion that the error response uses the RFC 9457 media type; add an assertion
after calling py_api.delete("/users/1") to verify the response Content-Type is
"application/problem+json" (or startswith that value to tolerate charset), so
the test mirrors the other error-response tests (test_delete_user_not_found,
test_delete_user_conflict_when_user_has_resources) and fully covers the error
envelope.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: aadbfb3f-753a-450b-933c-0489450c1f46
📒 Files selected for processing (5)
src/core/errors.pysrc/database/users.pysrc/main.pysrc/routers/openml/users.pytests/routers/openml/users_delete_test.py
|
@PGijsbers |
Fix : #194
Description
Adds a new
DELETE /users/{user_id}endpoint (Phase 1) so users can deletetheir own OpenML account, and administrators can delete any account, as long
as the account has no uploaded resources (datasets, flows, runs, studies).
Phase 1
204 No Content— account deleted successfully401 Unauthorized— no / invalid API key403 Forbidden— non-admin tries to delete another user's account404 Not Found—user_iddoes not exist409 Conflict— user still has uploaded datasets / flows / runs / studies(RFC 9457
AccountHasResourcesError)