For all of my lab papers, and papers I have cited, I would like to combat bit-rot by storing the PDFs locally (or, in fact, in GitHub Pages).
There are about 2500-3000 papers. If we say the average PDF size is 4MB, then at the upper end this is 3000 x 4MB = 12GB, too much for GitHub Pages (limit 1GB).
If we just did my papers, then this is (* 4 400) ~ 1600MB. At face value that is still over the 1GB limit, but it is probably doable, since my papers are generally less than 4MB.
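The arithmetic above can be checked with a small sketch. The function names and the `GH_PAGES_LIMIT_MB` constant are hypothetical, and the 1GB limit is simply the figure quoted in this note. The last helper measures the real total from a directory of PDFs, which would replace the 4MB guess with an actual number.

```python
from pathlib import Path

# Limit quoted in the note (1GB), expressed in MB. Assumption: 1GB = 1000MB.
GH_PAGES_LIMIT_MB = 1000


def estimated_total_mb(n_papers, avg_mb):
    """Back-of-envelope total: number of papers times average PDF size in MB."""
    return n_papers * avg_mb


def max_avg_mb(n_papers, limit_mb=GH_PAGES_LIMIT_MB):
    """Largest average PDF size (MB) that keeps n_papers under the limit."""
    return limit_mb / n_papers


def measured_total_mb(directory):
    """Sum the sizes of all *.pdf files under directory, in MB (10^6 bytes)."""
    total_bytes = sum(p.stat().st_size for p in Path(directory).rglob("*.pdf"))
    return total_bytes / 1_000_000


if __name__ == "__main__":
    print(estimated_total_mb(3000, 4))  # all papers, upper end
    print(estimated_total_mb(400, 4))   # only my papers
    print(max_avg_mb(400))              # average size needed to fit 400 papers
```

So 400 papers fit under the limit only if they average 2.5MB or less, which matches the hunch that it is doable because my papers are generally smaller than 4MB.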
It would also be good to have OCR results, à la gwern, for scanned papers, if I have any: https://github.com/ocrmypdf/OCRmyPDF
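A minimal sketch of how the OCR step might be scripted, assuming the `ocrmypdf` command-line tool is installed and on the PATH. The helper names are hypothetical. The `--skip-text` flag asks OCRmyPDF to leave pages that already contain text alone, so born-digital PDFs pass through and only scanned pages get OCR.

```python
import shutil
import subprocess


def build_ocr_command(src, dst):
    """Build the ocrmypdf invocation for one PDF.

    --skip-text skips pages that already have a text layer.
    """
    return ["ocrmypdf", "--skip-text", str(src), str(dst)]


def ocr_pdf(src, dst):
    """Run OCR on src, writing dst. Returns True on success.

    Returns False without running anything if ocrmypdf is not installed.
    """
    if shutil.which("ocrmypdf") is None:
        return False
    result = subprocess.run(build_ocr_command(src, dst), check=False)
    return result.returncode == 0
```

Writing to a separate output file keeps the original scan untouched, which seems safer for an archive.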
Maybe the PDFs for other people's papers should be a local archive.
Let's see if this is doable. I could say that if a paper is in the ACL Anthology I don't need to archive it, which would cut things down a lot, ...
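The ACL Anthology rule could be sketched as a filter over the bibliography. This assumes each paper is a dict with a `url` key, which is a hypothetical record shape, and it treats a paper as "in the Anthology" when its URL points at aclanthology.org or at the older aclweb.org/anthology address. Papers whose entry links elsewhere (for example to arXiv) would not be caught by this check.

```python
from urllib.parse import urlparse

# Current Anthology host names.
ACL_HOSTS = {"aclanthology.org", "www.aclanthology.org"}
# Older Anthology links lived under aclweb.org/anthology.
OLD_ACL_HOSTS = {"aclweb.org", "www.aclweb.org"}


def in_acl_anthology(url):
    """Return True if url looks like an ACL Anthology link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in ACL_HOSTS:
        return True
    return host in OLD_ACL_HOSTS and parsed.path.startswith("/anthology")


def needs_local_copy(papers):
    """Keep only the papers that are not already hosted by the ACL Anthology."""
    return [p for p in papers if not in_acl_anthology(p.get("url", ""))]
```

Running `needs_local_copy` over the full list before estimating sizes would show how much the rule actually cuts things down.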