Site Updates and Feature Requests
25 April 2020 at 11:32 am #156
This topic is for details of site updates such as new features, and also nerdy stuff such as software or server updates.
If there’s a new feature that you’d like to see added to the site or perhaps you don’t think that something is working properly, feel free to leave a reply!
main web server:
- Ubuntu 20.04 LTS running on Linode 2GB (London) + backup service
- Linode Object Storage (~100GB)
- Nginx 1.18.x
- Apache 2.4.x with ModSecurity & mod_perl
- PHP 7.4.x (FPM)
- MediaWiki 1.35.x (LTS) with CirrusSearch
database servers:
- 3 x Debian 11 running on Linode 2GB (London)
- Galera Cluster (3 nodes) running MariaDB 10.6.x
search index server:
- Debian 11 running on Linode 4GB (London)
major config changes:
- 02/05/2020: moved ElasticSearch from the web server onto the database server (not ideal, as ES is a memory hog but needed to free up memory on the web server to cope with the huge increase in traffic since lockdown)
- 29/05/2020: changed the MediaWiki job queue to a continuous service
- 30/07/2020: moved ElasticSearch back onto the main web server as traffic levels have dropped slightly from the peak around late May 2020
- 15/08/2020: migrated database to new Ubuntu 20.04 server running MariaDB
- 18/08/2020: upgraded database server hardware and moved ElasticSearch onto it
- 08/12/2020: site upgraded to MediaWiki 1.35 (LTS)
- 30/12/2020: hardware upgrade to MySQL/ElasticSearch server (2GB to 8GB)
- 31/01/2021: site upgraded to MediaWiki 1.35.1 (LTS)
- 28/05/2021: site upgraded to MediaWiki 1.35.2 (LTS)
- 28/07/2021: site patched to MediaWiki 1.35.3 (LTS)
- 07/10/2021: ElasticSearch moved to separate 4GB server / database server reduced to 4GB
- 13/10/2021: site patched to MediaWiki 1.35.4 (LTS)
- 21/11/2021: memcached replaced by Redis
- 11/12/2021: ElasticSearch server patched against CVE-2021-44228
- 21/12/2021: migrated databases across to a 3-node Galera Cluster
- 30/12/2021: site patched to MediaWiki 1.35.5 (LTS)
- 30/12/2021: removed Old Maps UK and Pastscape from the “Details & Links” tab for locations as both sites are now sadly defunct
- 30/12/2021: fixed the Bench Mark Database link in the “Details & Links” tab to work with their new site
- 28/04/2022: site patched to MediaWiki 1.35.6 (LTS)
- 09/08/2022: site patched to MediaWiki 1.35.7 (LTS)
- 29/11/2022: site patched to MediaWiki 1.35.8 (LTS)
- 10/01/2023: site patched to MediaWiki 1.35.9 (LTS)
future plans:
- rewrite MW skin to resolve current JS loading issues
- migrate to MediaWiki 1.39 (LTS)
- integrate Omeka content directly into the wiki (so that the content appears in searches, etc) and just use Omeka as a back-end management tool
25 April 2020 at 11:53 am #157
- 1 x Linode 2GB (web server) – $120 per year
- 1 x Linode 4GB (search index server) – $240 per year
- 3 x Linode 2GB (3 node MariaDB Galera cluster) – $360 per year
- Linode Backups Service – $150 per year
- Linode Object Storage – $60 per year
- domain name – $25 per year
- email hosting – $15 per year
- USD: $970 + VAT
- UK (inc. VAT): approx £860 per year
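As a quick check of the arithmetic above, here is a minimal sketch that re-adds the listed costs. It assumes the standard UK VAT rate of 20% applies to the whole amount; the exchange rate it prints is only the rate implied by the approximate £860 figure, not an official one.

```python
# Annual hosting costs in USD, copied from the list above.
costs_usd = {
    "web server (1 x Linode 2GB)": 120,
    "search index server (1 x Linode 4GB)": 240,
    "database cluster (3 x Linode 2GB)": 360,
    "Linode Backups Service": 150,
    "Linode Object Storage": 60,
    "domain name": 25,
    "email hosting": 15,
}

UK_VAT_RATE = 0.20      # assumption: standard UK VAT rate on the full amount
QUOTED_GBP_TOTAL = 860  # the approximate yearly figure quoted above

total_usd = sum(costs_usd.values())
total_usd_inc_vat = total_usd * (1 + UK_VAT_RATE)

# Exchange rate implied by the quoted GBP figure (derived, not looked up).
implied_gbp_per_usd = QUOTED_GBP_TOTAL / total_usd_inc_vat

print(f"USD total ex VAT:  ${total_usd}")
print(f"USD total inc VAT: ${total_usd_inc_vat:.2f}")
print(f"Implied GBP per USD: {implied_gbp_per_usd:.2f}")
```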
Synonym searching is now enabled on Huddersfield Exposed. This is primarily being used to provide search across common variant spellings, such as:
- Hellawell / Helliwell / Hellwell
- Hinchliffe / Hinchcliff / Hinchcliffe / Hinchliff
- Lingards / Lingarths
- Linthwaite / Linfit
- Slaithwaite / Slawit
The initial emphasis is on adding synonyms to improve searching across the Holmfirth Flood Project content, where spellings of names and places vary from one newspaper report to the next.
For example, a search for linfit hall will also match linthwaite hall…
Using quotation marks overrides the synonym searching, allowing you to focus on a specific spelling…
The current list of synonyms can be viewed here: https://huddersfield.exposed/wiki/HuddersfieldExposed:Synonyms
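For anyone curious how this behaves, here is a toy sketch of synonym expansion in Python. It is not the site's actual search configuration (that is handled by the search engine itself); the function names are made up for illustration, and quoted phrases are simplified to exact words without checking word order.

```python
import re

# Synonym groups taken from the examples in this post.
SYNONYM_GROUPS = [
    {"hellawell", "helliwell", "hellwell"},
    {"hinchliffe", "hinchcliff", "hinchcliffe", "hinchliff"},
    {"lingards", "lingarths"},
    {"linthwaite", "linfit"},
    {"slaithwaite", "slawit"},
]

# Map each spelling to its whole group for quick lookup.
SYNONYMS = {word: group for group in SYNONYM_GROUPS for word in group}


def expand_query(query: str) -> list[set[str]]:
    """Return one set of acceptable spellings per search term.

    Words inside double quotes are kept as the exact spelling typed.
    """
    terms = []
    # Each match is either a quoted phrase or a bare word.
    for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', query.lower()):
        if quoted:
            terms.extend({word} for word in quoted.split())
        elif bare:
            terms.append(set(SYNONYMS.get(bare, {bare})))
    return terms


def matches(query: str, text: str) -> bool:
    """True if every query term (or one of its synonyms) appears in text."""
    words = set(re.findall(r"\w+", text.lower()))
    return all(options & words for options in expand_query(query))


print(matches("linfit hall", "Linthwaite Hall"))    # synonym match
print(matches('"linfit" hall', "Linthwaite Hall"))  # quotes force exact spelling
```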
If there are any synonyms you’d like adding, please let me know.
15 August 2020 at 5:02 pm #313
The site’s database has been migrated to a new server running MariaDB.
27 November 2020 at 4:17 pm #378
The site was an early adopter of Linode’s Object Storage product and much of the multimedia content is stored “in the cloud”. At the time, the only option was to use the US-based storage.
As Linode now offer a European option and are able to support custom SSL certificates, I’ve begun migrating the bulk of the content across and it will be served from the domain storage.huddersfield.exposed. This should give slightly faster loading times for everyone in the UK.
7 December 2020 at 9:17 am #381
The 1.34 branch of MediaWiki is now officially “end of life”.
Although there’s no immediate rush to move to 1.35 (LTS), I’ve decided it’d be better to carry out the upgrade before we begin loading the content of the newspaper OCR project as that will add over 43,000 new pages to the site.
I’ve carried out a successful test upgrade overnight, so the plan is to take the site offline for a short period tomorrow morning (Tuesday 8 December) to do the upgrade on the live server. If all goes well, it should only take about 15 minutes.
EDIT: the upgrade to 1.35 is now complete.
30 January 2021 at 3:33 pm #407
The server hardware upgrades necessitated by the newspaper OCR project mean that the database server is now the fastest server and has spare CPU capacity. Therefore, I’ve begun migrating the mod_perl API code from the main web server onto the database server. The API code was written to be fast but is now even quicker 🙂 In turn, this frees up capacity on the main web server.
28 May 2021 at 3:48 pm #7243
The site is now upgraded to MediaWiki 1.35.2
The bad news is that a test upgrade to 1.36.0 failed and it looks like the skin used by the site will need to be rewritten from scratch, as it was written a couple of years ago using the now outdated MW Skinning Guide. Ho hum!
Since the 1.35 branch is a long term support (LTS) release, my current intention is to develop a new skin for the site and then migrate to the next stable LTS release at some point prior to September 2023.
21 December 2021 at 9:06 pm #7533
To try and make the website more resilient, I’ve completed a migration of the databases from a single Linode 4GB server to a 3-node MariaDB Galera Cluster (where each node is a Linode 2GB server).
8 February 2022 at 4:26 pm #7789
I’ve started work on this “future plan” item:
- integrate Omeka content directly into the wiki (so that the content appears in searches, etc) and just use Omeka as a back-end management tool
When Huddersfield Exposed first launched, Flickr offered a reasonable amount of free storage for multimedia so it made sense to use that as the tool for storing and managing the content. Unfortunately, all that changed in 2018. We then moved to using a local install of Omeka Classic to manage the multimedia and uncoupled the Flickr integration. The downside was that the main site (which uses MediaWiki) became separate to the multimedia archive (Omeka).
Moving forward, I’m now using the Omeka API to help embed multimedia into the main site.
As of today, running a search within the site, e.g. for Moldgreen, will include multimedia content that is stored in the main MediaWiki page index. Previously I’d been running the search keywords against the Omeka API, which tended to bring back irrelevant results when using multiple keywords.
The main thing that’s still to do is to provide a new page on the wiki that will allow you to run searches against only the multimedia content.
27 March 2022 at 2:20 pm #7933
The site’s map feature (which displays icons of geoindexed locations) has just had a major update which should improve performance…
The icons are delivered via an API which takes the boundaries of the displayed map and, combined with the current zoom level, decides which markers should be displayed. Previously, the API response included details such as geocoordinates, location name, location ID, icon details, etc. and, if a large number of markers needed to be displayed, the response size could be large, perhaps up to around 15kB for a particularly busy map. Each time the user adjusted the map (scrolling, zooming, etc), another API request was needed, generating network traffic and CPU load on the server.
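To give a rough idea of the kind of decision the API makes, here is a minimal Python sketch of choosing markers from the map bounds and zoom level. The real API is not written in Python, and the `min_zoom` field, the function name and the sample data are all hypothetical; the post doesn't say how zoom level is actually taken into account.

```python
from dataclasses import dataclass, asdict


@dataclass
class Location:
    id: int
    name: str
    lat: float
    lon: float
    icon: str
    min_zoom: int  # hypothetical: lowest zoom level at which the marker shows


def markers_in_view(locations, south, west, north, east, zoom):
    """Return full marker details for locations inside the map bounds."""
    return [
        asdict(loc)
        for loc in locations
        if south <= loc.lat <= north
        and west <= loc.lon <= east
        and zoom >= loc.min_zoom
    ]


# Made-up sample data, purely for illustration.
SAMPLE = [
    Location(1, "Sample location A", 53.645, -1.780, "church", 12),
    Location(2, "Sample location B", 53.620, -1.850, "mill", 15),
    Location(3, "Sample location C", 53.900, -1.500, "pub", 12),
]

visible = markers_in_view(SAMPLE, south=53.60, west=-1.90,
                          north=53.70, east=-1.70, zoom=13)
print([marker["name"] for marker in visible])
```

In this old-style response every marker carries its full details, which is why a busy map could produce a large payload.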
The new update uses the web browser’s localStorage (if available) to cache a full copy of all geoindexed locations, taken from a JSON file on the server. The browser will periodically check to see if the localStorage cache is up-to-date and, if not, pull a fresh copy of the JSON file from the server. The API response is now much smaller (around 1kB) and contains only the location IDs of the markers to display; the rest of the information (geocoordinates, name, icon, etc) is retrieved from localStorage.
28 April 2022 at 10:08 am #8089
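To illustrate the localStorage caching described in the map update above, here is a small Python sketch of the logic. The real code runs as JavaScript in the browser; here a plain dict stands in for localStorage, and the `version` field, class name and sample locations are assumptions made up for the example. It also checks for staleness on every lookup, whereas the site only checks periodically.

```python
import json

# Stand-in for the JSON file on the server: a version stamp plus every location.
SERVER_FILE = json.dumps({
    "version": 2,
    "locations": {
        "101": {"name": "Example Hall", "lat": 53.64, "lon": -1.78, "icon": "house"},
        "102": {"name": "Example Mill", "lat": 53.62, "lon": -1.85, "icon": "mill"},
    },
})


def server_version() -> int:
    """Hypothetical cheap check for the current version of the JSON file."""
    return json.loads(SERVER_FILE)["version"]


class LocationCache:
    def __init__(self, storage: dict):
        self.storage = storage  # maps string keys to strings, like localStorage
        self.downloads = 0      # counts how often the full file was fetched

    def refresh_if_stale(self) -> None:
        """Download the full file only if nothing is cached or it is out of date."""
        cached = self.storage.get("locations")
        if cached is not None and json.loads(cached)["version"] == server_version():
            return
        self.storage["locations"] = SERVER_FILE
        self.downloads += 1

    def resolve(self, marker_ids):
        """Turn the IDs from the small API response into full marker details."""
        self.refresh_if_stale()
        locations = json.loads(self.storage["locations"])["locations"]
        return [locations[str(i)] for i in marker_ids if str(i) in locations]


cache = LocationCache(storage={})
first = cache.resolve([102])        # the API response now holds only IDs
second = cache.resolve([101, 102])  # served from the cache, no new download
print(first[0]["name"], cache.downloads)
```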