The original post: /r/datahoarder by /u/EDACerton on 2024-12-22 20:15:38.

One of my friends is a technical writer/editor, but the company they work for is about to go bankrupt. All of their work lives on the company's website, and they expect the site to disappear when the business collapses (taking their portfolio of work with it).

They asked me to scrape/archive the site so that they would have a copy of their work. I've tried HTTrack, but I'm getting poor results because the site relies heavily on JavaScript.

Does anyone know of any tools that could scrape all of their pages to something like PDFs?
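For what it's worth, the closest thing I've found to a workaround is headless Chromium's `--print-to-pdf` flag, since a real browser engine renders the JavaScript before printing. A rough sketch of what I mean (the `urls.txt` file and the `chromium` binary name are assumptions; on some systems the binary is `google-chrome` or `chromium-browser`):

```python
# Sketch: render JS-heavy pages with headless Chromium and save each as a PDF.
# Assumes a urls.txt with one URL per line; "chromium" is an assumed binary name.
import subprocess
from pathlib import Path
from urllib.parse import urlparse

def pdf_command(url: str, out_dir: str = "archive") -> list[str]:
    """Build the headless-Chromium command line that prints one page to PDF."""
    # Derive a filename from the URL path ("/" or empty path becomes "index").
    name = urlparse(url).path.strip("/").replace("/", "_") or "index"
    out = Path(out_dir) / f"{name}.pdf"
    return [
        "chromium",
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={out}",
        url,
    ]

if Path("urls.txt").exists():
    Path("archive").mkdir(exist_ok=True)
    for url in Path("urls.txt").read_text().splitlines():
        if url.strip():
            subprocess.run(pdf_command(url.strip()), check=True)
```

This still needs a list of URLs up front, though, so I'd love a tool that handles the crawling part too.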