Reply To: Huddersfield Chronicle (1850-1900)
So… some good news and some bad news!
The newspaper page content integrates well with the site and it’s definitely been worthwhile to attempt the project.
However, I’ve had to abandon the loading of the pages for technical reasons. As I mentioned in an earlier comment, “Whether [it’s] achievable with the current web server hardware remains to be seen!” Sadly, it’s not achievable with the current hardware…
The site runs on MediaWiki (the same software as Wikipedia) and adding the newspaper content wasn’t a major issue in itself. As you’d expect, the MediaWiki database grew as the content was added.
The site also uses CirrusSearch/Elasticsearch to power the search facility, and that’s where I hit issues: the additional content caused the Elasticsearch indexes to grow to around 30GB. In an ideal world, you’d want the indexes to be held in the server’s memory, but that would mean upgrading from a £20 per month server to one that costs £160 per month (or an annual increase in the hosting costs of £1,680… gulp!). Instead, the indexes had to sit on disk, which makes searching a much slower (and more CPU intensive) process.
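For anyone wanting to check the sums, here is a minimal sketch of the hosting arithmetic in Python. The function name is purely illustrative, and the only inputs are the two monthly prices quoted above:

```python
def annual_increase(current_monthly: int, upgraded_monthly: int) -> int:
    """Extra hosting cost per year (in pounds) when moving between monthly plans."""
    # Monthly difference multiplied by 12 months.
    return (upgraded_monthly - current_monthly) * 12


# Figures from the post: £20 per month now, £160 per month for the larger server.
extra = annual_increase(20, 160)
print(f"Extra hosting cost: £{extra:,} per year")  # prints "Extra hosting cost: £1,680 per year"
```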
I’m going to roll back the addition of most of the newspaper content and lick my wounds whilst I mull over the options!