Archiving PDFs #6

@fcbond

For all of my lab's papers, and the papers I have cited, I would like to combat bit-rot by storing the PDFs locally (or, in fact, on GitHub Pages).

There are about 2500-3000 papers. If we say the average PDF size is 4MB then this is 10-12GB, too much for GitHub Pages (limit 1GB).

If we just did my papers, then this is (* 4 400) = 1600MB (about 1.6GB), which is over the limit at 4MB each but probably doable, since my papers are generally less than 4MB.
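The estimates above can be checked with a few lines of Python. This is only a rough sketch: it assumes 1GB is treated as 1000MB, that the 400 in the estimate is the number of my own papers, and the function names are made up for illustration.

```python
# Rough storage estimate for the PDF archive.
# Assumption: the 1GB GitHub Pages limit is treated as 1000MB.
GH_PAGES_LIMIT_MB = 1000


def archive_size_mb(n_papers, avg_pdf_mb=4):
    """Estimated total size in MB for n_papers PDFs of avg_pdf_mb each."""
    return n_papers * avg_pdf_mb


def max_avg_pdf_mb(n_papers, limit_mb=GH_PAGES_LIMIT_MB):
    """Largest average PDF size (MB) that keeps n_papers under the limit."""
    return limit_mb / n_papers


for label, n in [("all papers (low)", 2500),
                 ("all papers (high)", 3000),
                 ("my papers only", 400)]:
    total = archive_size_mb(n)
    fits = total <= GH_PAGES_LIMIT_MB
    print(f"{label}: {n} x 4MB = {total}MB, fits in 1GB: {fits}")
```

Under these assumptions, `max_avg_pdf_mb(400)` gives the average size the 400 papers would need to stay under for everything to fit, which is the number worth comparing against the real files.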

It would also be good to have OCR results, à la gwern, for scanned papers, if I have any, e.g. with OCRmyPDF: https://github.com/ocrmypdf/OCRmyPDF
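As a possible starting point for the OCR step, here is a minimal sketch that shells out to the `ocrmypdf` command-line tool. It assumes `ocrmypdf` is installed and on the PATH; the helper names `ocr_command` and `ocr_pdf` are hypothetical. The `--skip-text` option is meant to leave pages that already have a text layer alone, but it is worth checking against the installed version's `--help`.

```python
import subprocess


def ocr_command(src, dst):
    """Build the ocrmypdf command line for one PDF (hypothetical helper).

    --skip-text asks ocrmypdf not to OCR pages that already contain text,
    so born-digital PDFs pass through without being re-recognised.
    """
    return ["ocrmypdf", "--skip-text", str(src), str(dst)]


def ocr_pdf(src, dst):
    """Run ocrmypdf on src, writing the searchable PDF to dst.

    Raises subprocess.CalledProcessError if ocrmypdf exits with an error.
    """
    subprocess.run(ocr_command(src, dst), check=True)
```

Calling `ocr_pdf("scanned.pdf", "scanned-ocr.pdf")` (file names made up) would then write a searchable copy next to the original.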

Maybe the PDFs for other people's papers should go in a local archive instead.

Let's see if this is doable. I could say that if a paper is in the ACL Anthology I don't need to archive it, which would cut things down a lot, ...
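The ACL Anthology rule could be sketched as a simple URL filter. This assumes each paper is available as a dict with an optional `url` field, which is a made-up record format and not how the bibliography is actually stored. The host names cover the current `aclanthology.org` site and the older `aclweb.org/anthology` links.

```python
from urllib.parse import urlparse

# Hosts treated as "already archived by the ACL Anthology" (assumption).
ANTHOLOGY_HOSTS = {"aclanthology.org", "www.aclanthology.org"}
LEGACY_HOSTS = {"aclweb.org", "www.aclweb.org"}


def in_acl_anthology(url):
    """Return True if url points at the ACL Anthology."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in ANTHOLOGY_HOSTS:
        return True
    # Older Anthology links lived under aclweb.org/anthology/...
    return host in LEGACY_HOSTS and parsed.path.startswith("/anthology")


def needs_local_copy(papers):
    """Keep only the papers that are not hosted in the ACL Anthology.

    papers is a list of dicts with an optional "url" key (hypothetical
    format); papers without a URL are kept, since nothing else hosts them.
    """
    return [p for p in papers if not in_acl_anthology(p.get("url", ""))]
```

Running `needs_local_copy` over the full list and counting the result would show how much the rule actually cuts things down.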
