Add data dumps #2047
Data dump-affecting changes

This pull request changes DB schema and may affect what data is included in the data dump. Please review:

db/migrate/20260120032240_create_dumps.rb
    create_table :dumps do |t|

db/migrate/20260513135320_add_defaults_for_data_dumps.rb
    change_column_default :filters, :user_id, -1
    change_column_default :flags, :escalated, false
    change_column_default :users, :sign_in_count, 0
    change_column_default :users, :failed_attempts, 0
    change_column_default :users, :deleted, false

db/migrate/20260513160917_add_automatic_to_dumps.rb
    add_column :dumps, :automatic, :boolean, null: false, default: false

db/migrate/20260513185013_add_link_to_dumps.rb
    add_column :dumps, :link, :string

db/migrate/20260519211432_add_checksum_to_dumps.rb
    add_column :dumps, :checksum, :string

db/migrate/20260520085943_add_more_default_values.rb
    change_column_default :votes, :created_at, '2000-01-01T00:00:00.000000Z'
    change_column_default :votes, :updated_at, '2000-01-01T00:00:00.000000Z'

db/migrate/20260520112652_add_filter_default_name.rb
    change_column_default :filters, :name, ''

db/migrate/20260520175923_add_user_creation_defaults.rb
    change_column_default :users, :created_at, '2000-01-01T00:00:00.000000Z'
    change_column_default :users, :updated_at, '2000-01-01T00:00:00.000000Z'
    change_column_default :community_users, :created_at, '2000-01-01T00:00:00.000000Z'
    change_column_default :community_users, :updated_at, '2000-01-01T00:00:00.000000Z'
Oh, I think you're right. A cleanup job sounds like a good idea (and is separable from this PR).
I have no idea why a sign-in rate limit test has suddenly started failing and have run out of time to work on this for a few days - if anyone else has time to look please feel free.
Fetching directly is also dangerous in CI
It's because of the new default creation/update times for users. Don't merge until we resolve.
…forge interaction
cellio left a comment
Retested with the latest changes and all looks good now: new users have the correct (real) timestamps, and data dumps show 1970 for users, community users, and votes. Changing the default in the dump DB only from within the dump job is clever!
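To illustrate the behaviour described in the comment above, here is a minimal plain-Ruby simulation, not the PR's actual code. It assumes the dump copy simply omits the real timestamps so that a placeholder default (the Unix epoch here) is filled in, while the live data is left untouched. `DUMP_DEFAULTS` and `row_for_dump` are hypothetical names used only for this sketch.

```ruby
require 'time'

# Hypothetical placeholder defaults that exist only in the dump copy.
DUMP_DEFAULTS = {
  'created_at' => Time.at(0).utc,
  'updated_at' => Time.at(0).utc
}.freeze

# Drop the real timestamps from a row, then let the dump-only defaults fill in.
def row_for_dump(row)
  without_timestamps = row.reject { |column, _| DUMP_DEFAULTS.key?(column) }
  DUMP_DEFAULTS.merge(without_timestamps)
end

live_row = { 'id' => 7, 'username' => 'alice', 'created_at' => Time.utc(2026, 5, 20, 9, 30) }
dumped_row = row_for_dump(live_row)

puts live_row['created_at'].iso8601   # the live row keeps its real timestamp
puts dumped_row['created_at'].iso8601 # the dumped row shows the placeholder
```

The live row is never mutated, which mirrors the observation that new users keep their real timestamps while the dump shows 1970.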
Add data dumps. Weekly export of the entire database minus anything sensitive, uploaded to S3 and made available via a new page linked in the footer. Also adds an option for manually-created dump records, intended for quarterly uploads to Archive.org.
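As a rough sketch of the "minus anything sensitive" step and the `checksum` column, the snippet below strips a set of columns from each row and computes a digest of the resulting payload. Everything here is an assumption for illustration: the column list in `SENSITIVE_COLUMNS`, the JSON-lines format, the choice of SHA-256, and the helper names `scrub_row` and `build_dump` are not taken from the PR.

```ruby
require 'digest'
require 'json'

# Hypothetical list of sensitive columns per table; the PR may choose differently.
SENSITIVE_COLUMNS = {
  'users' => %w[email encrypted_password current_sign_in_ip]
}.freeze

# Return a copy of the row without the table's sensitive columns.
def scrub_row(table, row)
  hidden = SENSITIVE_COLUMNS.fetch(table, [])
  row.reject { |column, _| hidden.include?(column) }
end

# Serialize scrubbed rows as JSON lines and compute a checksum of the payload.
# SHA-256 is an assumption; the PR only shows that a checksum string is stored.
def build_dump(table, rows)
  payload = rows.map { |row| JSON.generate(scrub_row(table, row)) }.join("\n")
  { payload: payload, checksum: Digest::SHA256.hexdigest(payload) }
end

rows = [{ 'id' => 1, 'username' => 'alice', 'email' => 'alice@example.com' }]
dump = build_dump('users', rows)

puts dump[:payload]
puts dump[:checksum]
```

Storing the digest next to the download link would let anyone who fetches a dump verify that the file arrived intact.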
Incorporates #1950 by cherry-pick.