Clouds, Cookies and Migration Part 2: Clouds

Once upon a time, a little parrot decided to migrate across the vast ocean to the cloud lands, with nothing more than a handful of cookies.

The previous post described why I decided to migrate to a new web hosting provider, TSO Host. I had the choice between their cPanel account and their cloud account. I opted for the cloud account.

The term “cloud computing” or “in the cloud” can conjure up a fluffy image of data floating around in the air. It’s actually far more down-to-earth and basically entails a bunch of computers in a data centre that are on 24/7 so that the data on them can be accessed across the world at any time in any time zone.

Imagine a computer without a monitor, keyboard or mouse (because it’s only accessed remotely). It’s a server or, rather, has a server process running on it. This is a bit of a simplification, but in this case the server essentially serves up files on request (provided access is permitted). Now imagine that computer sitting on a rack full of other computers. Imagine row upon row upon row of these racks, filling up the entire room. All whirring away because obviously they have to be on all the time. They’re using up electricity and generating heat: far too much heat, so they have to be cooled. Those devices could be storing private data, such as company documents or your personal photos, or they could be storing the files that make up a website. These data centres not only need to be protected from hacking, they also need to be protected from physical intrusion (i.e. burglary).

In that sense, the cPanel shared hosting account is also cloud computing. The files that make up the web site are all contained on a single server in a data centre. The cloud web hosting provided by TSO Host is a particular type of cloud computing that uses a cluster server where the files that make up the web site are synchronized across multiple servers. This makes the web site more reliable. If one device goes down, the others in the cluster can keep the site on-line.

In the end, I decided on the cloud account and opted for the free migration help when I signed up. The migration team copied over all mail accounts, databases and the contents of the public_html directory from my account on Hostgator’s server. I had some files outside of that directory but they were small by comparison, and it was easy enough for me to archive them and copy them over.

There are pros and cons with the cloud cluster versus the cPanel single server. The cloud has a far simpler dashboard with a very primitive text file editor. This isn’t normally a problem for me as I mostly use the secure shell (ssh) to access my files so I can edit them with vi or vim. It’s necessary to first activate ssh, and you have to wait about 15 minutes after activating it before logging in. I then deactivate ssh after I’ve finished.

I had the choice of migrating my SSL certificate (for a fee) but it was less than a month from its expiry date so I didn’t bother. Instead I switched to Let’s Encrypt which is available for both the cPanel and cloud accounts. Unfortunately, the cloud account doesn’t support Let’s Encrypt for sub-domains. Apparently it is supported for sub-domains on cPanel. The TSO Host support staff said I could switch to cPanel if I preferred but, after considering it, I decided not to bother. My website is small enough not to really need sub-domains and I hadn’t advertised the intended change, so I moved the shop back to its original location.

For about a week I had my website files on both the Hostgator server and the TSO Host cluster while I made all the necessary changes. If you visited the site during that time you would’ve been viewing the old files on the Hostgator server.

The first thing I had to do was change any absolute paths. With the cPanel server, the home directory is usually in the form /home/username but with the cluster the home directory varies depending on which device in the cluster you’re on. For most of the scripts I could obtain the path from the DOCUMENT_ROOT server setting. With osCommerce (used by the on-line store), the configuration file conveniently defines a constant that stores the shop’s path so that just required a minor edit to reference DOCUMENT_ROOT, but there were also a few absolute paths stored in the shop’s database, such as the location of the public and private key (used to encrypt the order information sent to PayPal). I modified the software to allow me to store relative paths in the database instead. The only place that I now have a hard-coded absolute path (to a specific device in the cluster) is in the .htaccess files. I haven’t found any way around this.
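As a rough illustration of the idea (written in Python rather than the PHP the site actually uses), the sketch below builds an absolute path from the DOCUMENT_ROOT value supplied by the web server and a relative path of the kind that might be stored in a database. The function name and the example paths are invented for illustration; they aren’t taken from osCommerce or from my own scripts.

```python
import os


def resolve_site_path(relative_path, environ=None):
    """Join a stored relative path onto the server's document root.

    DOCUMENT_ROOT is set by the web server for CGI-style scripts.
    The helper itself is hypothetical and only shows the principle.
    """
    if environ is None:
        environ = os.environ
    root = environ.get("DOCUMENT_ROOT")
    if not root:
        raise RuntimeError("DOCUMENT_ROOT is not set")
    # normpath tidies up any "../" segments in the stored path.
    return os.path.normpath(os.path.join(root, relative_path))
```

Because the root is looked up on each request, the same stored value keeps working whichever device in the cluster happens to serve the page.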

The other change I needed to make was to update my PHP files to work with PHP 7.3 as they were previously running on an older version and contained deprecated commands. With TSO Host I have the option to switch to an older version, such as PHP 5.4, which I did initially to ensure the scripts worked, but obviously it’s better to use the latest version, which comes with extra security measures. Once I’d finished making the necessary modifications I switched to PHP 7.3.

You might be wondering what happened to my Perl CGI scripts that I mentioned in my previous post. It turns out they don’t work here either. The missing modules are still missing, but at least I’m now getting an understandable error message from cpan: I don’t have permission to install them. Perhaps they’re not pre-installed because they’re now obsolete or have vulnerabilities or haven’t been vetted. Anyway, I’ve now decided that I’d rather use PHP which provides the necessary functions that those scripts require without the need to depend on extra libraries or modules. All those Perl scripts should now automatically redirect to the new PHP replacements.

Once I’d made all the modifications necessary to make the site work on the new cloud server cluster it was time to change the nameservers. (Basically, when you type an address into your web browser it has to ask the nameserver for directions.) After the switch was made, I then went back to my Hostgator account and deleted all files, databases etc because I’m paranoid, or rather tidy, before closing my account.

Since then I’ve been working on the remaining new PHP scripts in between a lot of travelling and other commitments. I also installed WordPress. This was easy to do from the cloud dashboard, and the installation tool sensibly chose not to use stupidly obvious admin or database names (they seem to just be randomly generated strings). My plan is to republish the articles from my old blog here, although I may omit any time-sensitive information (such as giveaways that have now closed).

WordPress isn’t as easy to tinker with as osCommerce. For example, osCommerce has a constant defined in the configuration file that has the relative path to the admin directory. This makes it really easy to rename. WordPress, on the other hand, hard-codes the relative admin path. While it’s technically possible to alter this by editing all the files that reference this path, the changes will be lost when upgrading to a new version. Whilst one shouldn’t rely on obscurity as the only form of defence, there’s no point in making things too obvious. (Consider the Lonely Mountain in “The Hobbit”. The hidden door could only be unlocked with a key at a certain time, but that didn’t mean the dwarves went around putting up signs saying “secret entrance this way”.) There are, however, security plugins that restrict the number of login attempts etc.

Both osCommerce and WordPress have the database credentials in a configuration file within their installation paths. This is normally protected from public viewing by the server settings, but an accidental mis-edit or deletion of the .htaccess files could cause the contents of those configuration files to be shown as plain text, exposing the credentials (user name, password, database name etc). So I’ve moved them out of those configuration files to a location that can’t be accessed by a browser.
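The general principle can be sketched as follows, again in Python for illustration rather than the PHP that osCommerce and WordPress use. The function name, the JSON format and the file layout are all assumptions made for the example; the point is simply that the loader refuses to read credentials from anywhere under the browser-accessible document root.

```python
import json
import os


def load_db_credentials(config_path, document_root):
    """Read database credentials from a file kept outside the web root.

    Hypothetical helper: raises ValueError if the file could be reached
    by a browser, i.e. if it lives inside document_root.
    """
    config_real = os.path.realpath(config_path)
    root_real = os.path.realpath(document_root)
    # commonpath equals the root only when the file sits inside the root.
    if os.path.commonpath([config_real, root_real]) == root_real:
        raise ValueError("credentials file must live outside the document root")
    with open(config_real, encoding="utf-8") as handle:
        return json.load(handle)
```

With this arrangement, a broken or missing .htaccess file can at worst expose the script that calls the loader, not the credentials themselves.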

I’ve added some new PHP scripts that have replaced static pages, such as the gallery that’s now searchable, the book page and the site map. The “new book alert” Perl CGI script has been replaced with the more general book list. There have been a few glitches, but hopefully they’re all now fixed, and there are some more updates still to do but the main scripts are done.

With my previous web hosting company, this site had one strictly necessary cookie for the online store. The new cloud account has a second strictly necessary cookie. These cookies will be discussed in the next post.

Clouds, Cookies and Migration Part 1: Migration

Once upon a time, a little parrot decided to migrate across the vast ocean to the cloud lands, with nothing more than a handful of cookies.

I switched web hosting provider in May, and have since made quite a few changes to the site. It all took much longer than I had originally intended. I’m sorry for any disruption, but everything should hopefully now be all sorted. One of the new additions is this blog, and it seems appropriate that this first post should explain the reason for all the changes, but first a little background about the origins of Dickimaw Books.

I had some tutorials on that I had developed from some training courses for staff and postgraduates that I occasionally taught at the University of East Anglia (UEA) where I’m an honorary lecturer (that is, I’m not a full-time paid member of staff; I just do the odd little project or occasional short staff and PG training courses in LaTeX). I provided the online tutorials as a supplement to the courses, which I made available under the GNU Free Documentation License (GNU FDL). After a while I discovered that people from outside of the courses were reading and sharing them. Some people were also printing them because they preferred to read hard copies. The tutorials were gaining in popularity but there’s no guarantee that “” will continue to be available. It was originally “”. It may have another name change, or may be retired. Most of all, it’s quite complicated to access the files on that server when off-site, which makes them hard for me to update. My honorary status is reviewed every few years and if it isn’t renewed I’ll lose my UEA account. I needed a domain name and server that I had more control over and easier access to.

I started to wonder about providing printed versions. Would people want to buy a book that’s freely available online? Free resources are often subsidised by adverts, but I find them intrusive. There are some sites so loaded with ads that it’s almost impossible to actually read the real page content. I’d rather not have ads on my site if possible, but without them I have to sell enough copies to cover the costs. It was a bit of a gamble.

If you’ve bought a copy or copies of my books, a big thank you. It’s because of you that this site is still going and free of ads.

I’ve already written about being an independent writer/publisher but I didn’t mention web hosting. With only a small budget available, I had to opt for a low-cost shared hosting account. I started out with Hostgator’s “hatchling” account. I’ve been a computer programmer for over 25 years (my introduction to programming was a Modula-2 first year undergraduate course in 1988), and I had written some web pages and scripts before setting up my own site, but this was the first time I was responsible for an entire domain.

The shared hosting account came with an easy-to-use web application that a contact form could use to email me a message according to a template. Unfortunately it proved far too easy to use for spam-bots, and it also seemed to have a problem with messages that contained a backslash, so pretty much every LaTeX-related message was garbled. I replaced the forms with Perl CGI scripts since I was already familiar with programming in Perl. This fixed the problem of the garbled messages but the spam was still getting through.

I knew enough to be wary of bots trying to inject spammy links or malicious code and I’ve been on Usenet long enough to know about the existence of trolls, but I didn’t realise until then about the troll-bots whose sole purpose seems to be to seek out forms and post inflammatory comments based on certain text found on the page (such as “books”). At least, I’m assuming they were posted by bots. Any real person who can mistake a bug report form for a thought-provoking literary article has to be singularly devoid of intelligence.

(If I’m mistaken and it turns out they were posted by a real person rather than a bot, I do apologise if I caused any offence. I’ll rephrase it: people who deliberately write content with the express intent to insult or inflame while cravenly hiding behind the cloak of anonymity are the same kind of witless cowards who would, a hundred years ago, have been the type to write illiterate poison pen letters to their local community. The community is much wider these days, but it’s the same mentality.)

My first attempt to reduce the spam was to use Google’s reCAPTCHA. The premise seemed quite good initially. It’s a CAPTCHA-like system designed to check if the user is human (rather than a bot) whilst at the same time help with the digitization of books. It showed a wiggly, distorted word that you needed to confirm in the box (common in CAPTCHAs), and also a word scanned from a book, which you also needed to confirm. However, when I looked at the scripts with the reCAPTCHA at a later date I noticed that the scanned words had been replaced with photos of house number plates, which struck me as a bit creepy. I ended up removing them, and I tried other approaches, such as getting the user to confirm an ID (for bug reports and feature requests) or making the form multi-paged (which allows the form to be customized, depending on the initial settings, but also makes it a little harder for a bot to follow).

My original plan was just to provide a website to host the online versions of my books and related information, but I started to add more stuff and eventually decided to try e-commerce. I opted for the open-source osCommerce which is written in PHP. The advantage with an open source project is that I can modify the code to better suit my requirements. This was my first introduction to PHP, and at first I only made minor modifications to make it specifically a book store rather than a more general store (such as changing “manufacturer” to “author”). Later I tried to make it more mobile-friendly. The great thing about osCommerce is that the upgrades are provided in terms of differences between the old version and the new one. It’s less convenient than simply clicking on an upgrade button that will do everything for me, but it means I’m less likely to lose my changes.

If you happen to be planning on setting up your own online store, I recommend you get an expert to write up the legal documents (such as the terms and conditions). A few years ago my electricity supplier decided to launch a brand new site with user accounts to manage bills etc online. I dutifully started filling in the new account form and, being the pedantic person I am, I followed the “terms and conditions” link so that I could read it. (I don’t like ticking boxes to say I’ve read something when I haven’t even glanced at it.) I found myself on a page that was blank except for the title “Cookies”, so I contacted the company. I received an apology and the “correct” URL, which turned out to be their cookie policy page. I wrote back and asked for their actual terms and conditions page. Their reply effectively said that was their only terms and conditions and that I should stop fussing and just tick the check box. (They didn’t literally write that last bit, but that was the subtext.)

Terms and conditions are there to protect the site owner by imposing restrictions (don’t post nasty stuff on our site, don’t try breaking it, don’t sue us if it goes off-line occasionally). A site shouldn’t have to provide such obvious conditions. Society expects visitors to behave well: wipe your feet on the doormat, don’t smash the crockery or insult the other guests. The terms and conditions may go beyond that (which is why it’s a good idea to check them). A cookie policy (as well as a privacy policy), on the other hand, is legally required in countries with legislation such as the General Data Protection Regulation (GDPR) as it relates to your personal data.

My concern with that particular form wasn’t that the company wasn’t trying to impose any conditions on my use of the site. The problem was that it gave the impression that the site was designed by an amateur who didn’t know the difference between the two types of document. If they didn’t know that, how could I be confident that they knew how to securely store my data?

Trust is fundamental to business. It doesn’t matter how good your product is, a certain level of trust is required for people to buy from your store. This is particularly true for small on-line businesses that have a tendency to pop up and disappear. One of the things I realised I had to do was add an SSL certificate to the site. This wasn’t an option with Hostgator’s hatchling account. There was a button on the cPanel dashboard, but it just popped up a message saying I needed to upgrade my account in order to include SSL. I decided to upgrade to the business account, which included a free SSL certificate.

When I first started with Hostgator, the support channels included email and a ticketing system. It was only when my site encountered a problem in late 2017 that I discovered that some of these channels had gone. My Perl CGI scripts suddenly stopped working. The CTAN team alerted me to the problem. They have a tool that checks all the links on their site and the ones leading to my FAQ were now triggering an error code.

I soon discovered that some of the Perl modules that those scripts depended on were no longer installed. The most obvious solution was to reinstall them, but when I tried I received a rather unhelpful “an error has occurred” message. I modified all of the scripts so that they no longer triggered an error code and just displayed a message saying the particular function was currently unavailable while I investigated further. I couldn’t find a link to the ticketing system so I tried to send an email to the support address but received an automated reply telling me to phone or use the on-line chat. I didn’t fancy being stuck in a trans-Atlantic telephone queue so I tried the on-line chat. The operator didn’t know anything about Perl and eventually gave up and apologised that the technical support staff were all currently unavailable. Please try again later. I found on the forum that I wasn’t the only one experiencing this problem, but there were no solutions offered.

That was when I first considered moving to a new web hosting provider. I started looking around and noticed that many of the shared hosting accounts seemed to use cPanel as a convenient web application to allow site administrators to access the site files, email accounts, databases etc. I investigated cPanel and found that a new version had recently come out. There was a note advising that following the upgrade it may be necessary to reinstall some Perl modules. It therefore seemed likely that it was a cPanel update that had caused the modules to disappear, in which case moving to another web hosting provider that used the same new version of cPanel might not help. In the end, I decided to stick with Hostgator and replace the Perl CGI scripts with PHP. This turned out to take a lot longer than I had originally anticipated due to other more urgent commitments.

A few months ago I started wondering about adding a blog to this site. I was previously using the author blog on GoodReads, but I don’t have much control over the interface, and I’ve started to become wary about putting content on third-party sites. I investigated WordPress (another open-source web application written in PHP) and considered adding it to my site. The business shared hosting account provided the option for sub-domains. By this time, I was starting to get quite adventurous. Perhaps I could have a sub-domain for the shop and another sub-domain for the blog.

I started with the shop. I made a temporary index page for to say that it was currently off-line and set to work moving all the files over to a sub-domain. I ran through some test transactions and everything seemed to work fine. Then I decided that I ought to review all the security settings just to make sure everything was as tight as it could be.

The shop doesn’t store any financial details. It works by transferring the customer to PayPal to perform the actual payment. The shop, therefore, has to communicate with PayPal to provide the transaction details (customer name, invoice address, shipping address, cost etc). PayPal, in turn, has to send a message back to the shop to say the transaction has been completed. The customer is returned to the store, which triggers the confirmation email and the store empties the customer’s basket. The communications between the store and PayPal have to be made securely (in case anyone attempts to intercept them and also to prevent any naughty customer from trying to alter the total cost).

While reviewing all the settings I noticed that “verify SSL” was set to false and that looked like the kind of thing that ought to be on. So I switched it to true and ran a test transaction. An error message appeared when I was returned from PayPal’s sandbox account to the store. This was rather annoying. So I copied the message into a search engine to find a solution. I encountered a lot of questions about the problem but the only “solution” offered was to switch off the “verify SSL” setting. This didn’t strike me as a particularly good answer.

Near the “verify SSL” setting there’s a “test connection” button, so I tried that out. It displayed a red “failure” when the setting was on and a green “success” when it was off. The most obvious next step was to find out exactly what that test did. It took a bit of rummaging around the PHP code, but I finally discovered that it was calling curl (client URL, a command-line tool for transferring data). I modified the code so that on failure it would also provide curl’s error message. This gave me a new bit of information in my search, and this time I finally found an answer. My version of curl was likely too old. I found the version number (7.19.7) and looked it up in curl’s release page. It was over 9 years old with 40 known vulnerabilities. Definitely time for an upgrade.
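To give a flavour of what such a connection test does, here is a minimal sketch in Python (not the PHP and curl code that osCommerce actually uses). The function name, its parameters and the idea of passing in the opener are all assumptions made for illustration. It attempts an HTTPS request with certificate verification on or off and, like my modification, returns the underlying error message on failure instead of a bare “failure”.

```python
import ssl
import urllib.request


def check_connection(url, verify=True, timeout=10,
                     open_fn=urllib.request.urlopen):
    """Return (ok, message) for an HTTPS request to url.

    Hypothetical helper: with verify=False the certificate checks are
    skipped, which is what switching off a "verify SSL" setting amounts to.
    """
    context = ssl.create_default_context()
    if not verify:
        # check_hostname must be disabled before setting CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with open_fn(url, timeout=timeout, context=context) as response:
            return True, f"HTTP {response.status}"
    except OSError as error:
        # URLError, HTTPError and ssl.SSLError are all OSError subclasses,
        # so the specific reason is passed back to the caller.
        return False, str(error)
```

A call such as check_connection("https://example.com/") against a host with an invalid certificate would come back as a failure with the reason attached, which is exactly the extra piece of information that made the search for a fix possible.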

So I was back on Hostgator’s chat. “An operator will be with you in 5 minutes.” Fifteen minutes later I was finally connected with someone. I was staggered by the response to my request to upgrade curl: sorry, that option isn’t available for the shared hosting accounts. “I hope you understand,” the operator concluded. I reiterated that it was over 9 years old with 40 known vulnerabilities, but I received the same response, again concluding with “I hope you understand”.

Quite frankly, I don’t understand. I think it’s reprehensible of a web hosting provider to not have regular updates for common tools that form an essential part of web security, particularly for a business account that’s marketed as being suitable for e-commerce.

I replied that, in that case, I would find a new web hosting provider and did the internet equivalent of slamming the phone on the hook: I clicked on the close button. So now I really had to migrate and I once again began investigating alternatives.

This was back in May and my news feed had recently included articles about fake reviews, which made me quite wary. Was the 5 star “X is brilliant” written by a genuinely happy customer of X or was it written by someone paid by X? Was the 1 star “X is awful so I switched to Y who are brilliant” really written by a dissatisfied customer of X or was it written by someone paid by Y to boost Y and slate the opposition? How old is the review? Companies can improve or deteriorate over time. An old low rating could’ve prompted them to change for the better.

I decided not to be strongly influenced by ratings and reviews but instead draw up a list according to my requirements and find out from each company in turn what version of curl they had installed. Given the way software updates usually work, the chances are that if curl was up-to-date then so would the other tools I also require.
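If you have shell access, the version can be read from the first line of the output of “curl --version”. The sketch below, in Python, shows one way to compare such a line against a minimum. The function names are made up for the example and the minimum version used in the usage note is arbitrary, not a recommendation.

```python
import re


def parse_curl_version(version_output):
    """Extract (major, minor, patch) from the first line of `curl --version`."""
    match = re.match(r"curl (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognised curl version output")
    return tuple(int(part) for part in match.groups())


def is_at_least(version_output, minimum):
    """True if the reported version is no older than the minimum tuple."""
    # Comparing tuples of integers avoids the trap of comparing strings,
    # where "7.9" would sort after "7.19".
    return parse_curl_version(version_output) >= minimum
```

For example, is_at_least("curl 7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7", (7, 60, 0)) is false, which matches the verdict I reached by looking the number up by hand.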

So, what are my primary requirements? Shared hosting (small budget), Linux (I don’t want to waste time learning how to use an unfamiliar operating system), Apache (web server), MySQL (for my databases), PHP (web scripts), Perl (web scripts and non-web command line tools for maintenance etc). That doesn’t really narrow the list as this is quite a common set of requirements.

In order to filter the list further so that I had an easier starting point, I decided to add a secondary requirement that I was willing to drop if I couldn’t find a satisfactory fit: a company with a UK base so I didn’t have to worry about international helplines, currency conversion or bank fees for foreign transactions.

In the end I opted for TSO Host. The chat operator was quick to respond, helpful, and provided me with a curl version number that I checked and found to be less than 2 months old with no known vulnerabilities. I had the option to select either a cPanel account or a cloud account. There are three equivalent packages for each type of account. The price varies according to the package but doesn’t depend on whether you opt for cPanel or cloud. Since I was already familiar with cPanel the operator suggested I might prefer that, which I originally agreed with, but if I changed my mind I could switch to the cloud.

The next post deals with migrating from Hostgator’s cPanel single server to TSO Host’s cloud server cluster.