Another Migration

In the first post of this blog, I wrote about my decision to migrate to a new web hosting provider back in 2019. Last week, the site migrated again, but this time I stayed with the same web hosting provider. I moved from the cloud hosting platform (which uses a server cluster) to a newer single server platform.

Migrating a web site is rather like moving house. The removal company moves the content and will connect the large appliances in your new home, but there are a lot of little bits and pieces that you have to do yourself. You have to let everyone know you’ve moved and you need to get used to the new layout. Those handy tools that were in a certain location in the old place are now somewhere else. Gadgets need re-configuring. A convenient local service isn’t available and another one needs to be found.

In an analogous way, the web hosting company’s migration team moved the files and databases from the old servers onto the new one and set things up, but the new server has different paths and configurations that needed to be taken into account. Certain files lost their executable bit, which had to be restored. Some code that worked in the old location didn’t work in the new environment and had to be modified. The mail boxes had to be created manually, DNS records needed changing, and custom cron jobs had to be checked and set up.

The Domain Name System (DNS) provides public records associated with every domain. When you type an address in your browser, the browser needs to know where to go to fetch the file associated with that address. The DNS records provide the route to the server for the given domain (dickimaw-books.com in this case) and the information is cached (usually for around 24 hours) so that the browser doesn’t have to keep looking up the information as you move from one page to the next. Similarly, when you send an email, your mail server has to look up the appropriate entry in the DNS record to find out how to route your message.
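
If you’re curious, you can perform this kind of lookup yourself. Here’s a minimal PHP sketch (PHP being the language most of this site’s scripts are written in) using the built-in dns_get_record() function to fetch the address and mail-routing records for a domain:

    <?php
    // The A (address) records tell browsers where to find the web server;
    // the MX (mail exchange) records tell mail servers how to route email.
    $addressRecords = dns_get_record('dickimaw-books.com', DNS_A);
    $mailRecords    = dns_get_record('dickimaw-books.com', DNS_MX);

    foreach ($addressRecords as $record) {
        echo "web server: {$record['ip']}\n";
    }
    foreach ($mailRecords as $record) {
        echo "mail server: {$record['target']} (priority {$record['pri']})\n";
    }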

When a site moves to a new server, all these records need to be updated, but there’s an additional delay as a result of caching. For a while, emails can’t be delivered and visitors are directed to the old server; then, when the old site certificate becomes invalid, they find themselves confronted with a big scary warning message from the browser until the new certificate is sorted out.

There was a moment last week when I wondered why I’d been mad enough to consider migrating the site. Sure, the old cloud hosting package had its problems and it could be a little slow, but at least it had worked and I knew what tools were available and where to find them. However, eventually things were sorted out, the new server is much faster, and the stricter PHP settings flagged up a few bugs that I’ve now fixed.

Once the migration was successfully completed, the final step was to cancel the old cloud hosting package, but just before I did that I learnt that it had been marked for obsolescence and I would have had to migrate in a month’s time anyway. So it all worked out for the best in the end. I’m sorry if you encountered any problems while trying to access the site last week, but it should mostly be operational now (except for the shop, which requires some further testing before it can be reopened).

If you are a regular visitor to the site, you may have noticed that there’s a new “Account” link in the main navigation bar. This is something I’ve been working on for some months now, and it was while working on it that I became so frustrated with the limitations of the cloud hosting package that I decided to move. I’ll describe it in more detail in the next post.

Clouds, Cookies and Migration Part 3: Cookies

Once upon a time, a little parrot decided to migrate across the vast ocean to the cloud lands, with nothing more than a handful of cookies.

This is a continuation of the story about this site’s migration to a cloud cluster web hosting account. The first part described what prompted the move, and the second part described the move from a single server to a cloud cluster.

This site now has two strictly necessary cookies, plus one optional cookie that’s only set if you explicitly request it (on the settings page). It’s hard to visit a site these days without a box popping up about cookies, and cookies have had a lot of bad press because they’re often used for tracking. However, despite the number of sites screaming at you about their cookies, they’re still a bit of a mystery to some Internet users: some secret thingummy that follows you around, spying on your every move.

When you click on a link, select a bookmark or type an address in your browser’s location bar, your browser sends a request to the domain’s server for the contents of the particular page. The returned content is usually (but not always) in a format called hypertext markup language (HTML). For example, if you visit www.example.com/test.html, this corresponds to a file called test.html on example.com’s server. In this case, all the server has to do is send the contents of this file to the browser. This is a “static page” because it’s always the same (until someone edits the test.html file).

Now suppose you visit www.example.com/test.php, which corresponds to a file called test.php. This is a script (an application) that, when run by the server, creates the HTML content that needs to be sent to the browser. This is a “dynamic page” because the content may change depending on certain conditions.
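
As a trivial illustration, a test.php along these lines produces different content each time it’s run (this is just a sketch, not the test.php discussed below):

    <?php
    // A dynamic page: the server runs this script and sends the generated
    // HTML to the browser, so the content can vary from request to request.
    echo '<html><body>';
    echo 'Hello, guest. The server time is ' . date('H:i:s') . '.';
    echo '</body></html>';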

The hypertext transfer protocol (HTTP) used by web pages (both static and dynamic) is stateless. That means a web page can’t remember anything, so it has to be repeatedly retold everything it needs to know. (Your browser may remember the information you’ve typed into forms to help autofill if you need to fill in the form again, but the server doesn’t remember.)

[Image of text greeting guest and prompting for guest’s name.]

Suppose Alice visits www.example.com/test.php. Her browser asks the example.com server for test.php and that script sends back the text “Hello, guest. What’s your name?” with a box for Alice to supply her name. She does this and clicks on the submit button.

[Image greeting Alice and asking for her favourite colour.]

Her browser now asks the server for test.php along with the information that the “name” parameter has been set to “Alice”. This time the script sends back the text “Hello, Alice. What’s your favourite colour?” with a box for Alice to specify her favourite colour. So she enters “purple” and clicks the submit button. This time her browser asks the server for test.php along with the information that the “colour” parameter has been set to “purple”. Now the script knows the colour but it’s forgotten her name.

The usual method is for the script to include the information it’s already been supplied with as hidden parameters in the new form so that the user doesn’t have to keep retyping the same information. So let’s start again.

The browser asks the server for test.php with no further information, so the script sends its default message “Hello, guest. What’s your name?” with a box for the name. Alice fills it in and clicks on the submit button. The browser then asks the server for test.php along with the information that the “name” parameter has been set to “Alice”. The script sends back the text “Hello, Alice. What’s your favourite colour?” with the colour box, but it also includes markup that defines a hidden parameter called “name” that’s set to “Alice”.

Hidden parameters aren’t visible on the page, but now when Alice clicks on the submit button her browser asks the server for test.php and tells it the name is “Alice” and the colour is “purple”. So this time the script knows both her name and her favourite colour, and sends back the text “Hello, Alice” with an instruction that the browser should display her name in purple. The next time Alice visits test.php she’ll be greeted with the default message, because the script has forgotten the information she previously provided.
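
For the technically inclined, here’s a rough sketch of what such a test.php might look like in PHP. It’s an illustration of the hidden-parameter technique, not real code from this site:

    <?php
    // Each request starts from scratch: any values already supplied have to
    // be carried forward as hidden form parameters or they'll be forgotten.
    $name   = $_POST['name']   ?? null;
    $colour = $_POST['colour'] ?? null;

    if ($name === null) {
        // Nothing supplied yet: show the default greeting.
        echo '<form method="post">';
        echo "Hello, guest. What's your name? <input name=\"name\">";
        echo '<input type="submit"></form>';
    } elseif ($colour === null) {
        // The name was supplied: ask for the colour, and carry the name
        // forward in a hidden parameter.
        $safeName = htmlspecialchars($name);
        echo '<form method="post">';
        echo "Hello, $safeName. What's your favourite colour? ";
        echo '<input name="colour">';
        echo "<input type=\"hidden\" name=\"name\" value=\"$safeName\">";
        echo '<input type="submit"></form>';
    } else {
        // Both supplied: greet the visitor in their favourite colour.
        $safeName   = htmlspecialchars($name);
        $safeColour = htmlspecialchars($colour);
        echo "<span style=\"color: $safeColour;\">Hello, $safeName</span>";
    }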

There are two types of parameters: “get” and “post”. The get parameters are included in the web address, for example, www.example.com/test.php?name=Alice&colour=purple (the question mark separates the address of the script from the parameters that need to be supplied to the script). If Alice sends this address to Bob and he clicks on the link, he’ll be greeted with the message “Hello, Alice”.

Post parameters aren’t included in the web address. They can only be sent when you submit a form.¹ If, after completing the form, Alice shares the page with Bob and he clicks on the link then he’ll be at the start.
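
In PHP terms, the distinction is simply which variable the parameters arrive in. A sketch, reusing the hypothetical test.php above:

    <?php
    // Get parameters are part of the address:
    //   www.example.com/test.php?name=Alice&colour=purple
    $name = $_GET['name'] ?? 'guest';

    // Post parameters arrive in the body of the request when a form with
    // method="post" is submitted; they aren't part of the address.
    $name = $_POST['name'] ?? 'guest';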

Now let’s suppose that both Alice and Bob visit dickimaw-books.com/shop. Alice adds a copy of The Foolish Hedgehog to the basket, and Bob adds a copy of LaTeX for Complete Novices. Both of them click on the “Cart Contents” link, which leads them to dickimaw-books.com/shop/shopping_cart.php, but how does the shopping_cart.php script know that it needs to show Alice a link to The Foolish Hedgehog and Bob a link to LaTeX for Complete Novices?

Each visitor is assigned a session ID, which is a long sequence of letters and numbers that uniquely identifies the visitor. The server stores this ID in a database along with the shopping cart contents, but it needs to be told the ID along with the page request so that it can match it up with the ID in its database.

If the session ID is passed as a parameter, it can be a hidden post parameter in a form (when Alice or Bob click on a button), but it would have to be a get parameter (part of the address) when they click on a link.

Suppose Alice clicks on a link to Quack, Quack Quack, Give My Hat Back! and decides that Eve might be interested in it, so she shares the link with Eve. Eve clicks on the link but it includes Alice’s session ID as a get parameter, so the shop thinks that Eve is Alice. If Alice happens to be logged in then Eve will find herself logged in as Alice.

Ooh, thinks Eve. I wonder what kind of things Alice has been ordering? Eve clicks on the order history link, and the script sends her a list of Alice’s orders because Eve is using Alice’s session ID. Ooh, thinks Eve. Alice ordered a copy of The Private Enemy and sent it to Bob. Oh so that’s his address. I think I’ll pop round this evening to find out why he’s been avoiding me. Won’t he be surprised to find out I know he’s received a present from Alice!

A more secure solution is for the server to tell the browser to save the session ID and send it along with any requests for pages on that website. That way the ID doesn’t get included in any links so it won’t be accidentally shared. (It could be intentionally intercepted by an eavesdropper but that’s another story.) So when Alice now logs into her account the server creates a new session ID and asks the browser to save it. The browser writes the ID in a small file (on Alice’s computer or mobile device) that it associates with the website’s domain (dickimaw-books.com in this case). That file is called a cookie.
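
PHP has this mechanism built in. Here’s a hedged sketch (the shop actually uses osCommerce’s own session handling, which works along the same lines):

    <?php
    // session_start() looks for the session cookie in the request; if there
    // isn't one, it generates a new ID and asks the browser to store it.
    session_start();

    // Data in $_SESSION is stored on the server, keyed by the session ID,
    // so only the ID itself travels back and forth in the cookie.
    $_SESSION['cart'][] = 'The Foolish Hedgehog';

    echo session_id(); // the long sequence of letters and numbers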

This particular type of cookie (used to store a session ID) is normally a session cookie, which means that it expires at the end of the session (that is, when the browser is closed). It’s the browser’s responsibility to delete expired cookies (although you can explicitly delete cookies through the browser’s privacy settings). If the site has a “remember me” option then a persistent cookie is needed instead. Also, some browsers use “session restoring”, which effectively makes session cookies persistent, so it’s a good idea to check your browser configuration.

This site has two session cookies: one that contains the session ID for the shop and one for load balancing. The shop cookie is only valid on the dickimaw-books.com/shop path, not across the entire site (so the browser will only send it while you’re visiting the shop). The load balancing cookie helps to distribute the visitors to this site across the servers in the cluster.

A persistent cookie is one that stays around after the session has ended. It has an expiry date, and again the browser should delete it once that date has passed. This site has a persistent cookie (called “dickimaw_settings”) that will only be set if you change the default site settings. When you return to the site, your browser will be able to tell the server your preferences (which are stored in the cookie) so you don’t have to keep changing them whenever you visit.
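
Creating such a cookie is a one-line affair in PHP. A sketch (the “dickimaw_settings” name is real, but the contents and lifetime here are guesses):

    <?php
    // A persistent cookie: the expiry time tells the browser to keep it
    // after the session ends.
    setcookie('dickimaw_settings', 'hide_notice=1',
        time() + 60 * 60 * 24 * 365, // expire in a year (a guess)
        '/');                        // valid across the whole site

    // A session cookie is the same call with no expiry time (it lasts until
    // the browser is closed), and restricting the path means the browser
    // only sends it for pages under /shop (the name here is made up):
    setcookie('shop_session_id', 'abc123', 0, '/shop');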

If I enable comments for this blog, there will be additional cookies so that you can create an account and log in to post comments or modify your preferences. These are covered in the blog’s privacy policy in case I decide to enable blog commenting at some later date. Private areas of the site have other cookies, but they’re private so you shouldn’t be going there anyway.

The load balancing and session ID cookies are known as “strictly necessary” cookies because they’re needed to ensure that the website functions efficiently and to help protect you against accidentally sharing your session ID. The cookie that stores your preferences is an optional cookie because the site is able to function without it.

Some sites only have strictly necessary cookies and so have a box informing you of this and all you can do is dismiss the box (by clicking on “okay”, “got it!”, “close” or whatever). Some sites have optional cookies so the notification box will ask you if you want those optional cookies or not.

I once visited a site where I clicked on “no thanks” because I was getting annoyed with all these stupid cookie boxes popping up at me while I was searching for something. The site changed its content so that it simply displayed a churlish message saying that if I wasn’t going to accept its cookies it wouldn’t show me anything. (In hindsight, I should’ve just switched to reader view.) Some weeks later I happened to land on that site again and it still displayed the same curt message. But wait a minute, how did the site remember my response? It had, of course, used a cookie to store the fact that I didn’t want that site to create any cookies.

So, if a site just has a couple of strictly necessary session cookies but also has a box telling you about those cookies then it will also need a persistent cookie to record that you’ve acknowledged the cookie notice. This is also a strictly necessary cookie because the site can’t function properly if you have to dismiss a box every time you move from one page to the next (or reload a page). So you now have two session cookies that will expire after your session ends and a persistent cookie that will linger long after those session cookies have gone just in case you happen to revisit the site at a later date.

The Information Commissioner’s Office (ICO) stipulates that “you cannot set non-essential cookies on your website’s homepage before the user has consented to them”. So this site doesn’t set the optional settings cookie unless you explicitly request it on the settings page.

Consent usually isn’t required for strictly necessary cookies that are needed for online shopping or load balancing but, as the ICO points out, “it is still good practice to provide users with information about these cookies, even if you do not need consent”. So this site has a cookie notice that’s designed to be visible but not block the page content. If you don’t want to keep looking at it, you can use the settings page to hide it, but this will, of course, create a cookie to store your preference.

The normal advice is to use your browser privacy settings to allow first-party cookies but block third-party cookies. If a page on one website embeds content from another website then the browser will send the cookies that belong to the website you’re visiting (first-party cookies) but will also send the cookies that belong to the website that provides the embedded content (third-party cookies). It’s these third-party cookies that are following you around.

The next blog post will describe the settings page in more detail.

[Update 2021-06-05: after migrating from the cloud cluster to a single server account, the load balancing cookie is no longer created. However, the implementation of the new site account means that there’s a new strictly necessary session cookie that, like the shop session cookie, is used to store your session ID, and there’s a new optional cookie that’s created if you select the “trust this device” checkbox when authenticating with a time-based one-time password (TOTP).]


¹Post parameters can also be sent by constructing the HTTP request directly, rather than submitting a form, but I don’t want to get too technical here.

Clouds, Cookies and Migration Part 2: Clouds

Once upon a time, a little parrot decided to migrate across the vast ocean to the cloud lands, with nothing more than a handful of cookies.

The previous post described why I decided to migrate to a new web hosting provider, TSO Host. I had the choice between their cPanel account and their cloud account. I opted for the cloud account.

The term “cloud computing” or “in the cloud” can conjure up a fluffy image of data floating around in the air. It’s actually far more down-to-earth and basically entails a bunch of computers in a data centre that are on 24/7 so that the data on them can be accessed from anywhere in the world, at any time, in any time zone.

Imagine a computer without a monitor, keyboard or mouse (because it’s only accessed remotely). It’s a server or, rather, has a server process running on it. This is a bit of a simplification, but in this case the server essentially serves up files on request (provided access is permitted). Now imagine that computer sitting on a rack full of other computers. Imagine row upon row upon row of these racks, filling up the entire room. All whirring away, because obviously they have to be on all the time. They’re using up electricity and generating heat: far too much heat, so they have to be cooled. Those devices could be storing private data, such as company documents or your personal photos, or they could be storing the files that make up a website. These data centres not only need to be protected from hacking but also from physical intrusion (i.e. burglary).

In that sense, the cPanel shared hosting account is also cloud computing. The files that make up the website are all contained on a single server in a data centre. The cloud web hosting provided by TSO Host is a particular type of cloud computing that uses a server cluster, where the files that make up the website are synchronized across multiple servers. This makes the website more reliable: if one device goes down, the others in the cluster can keep the site on-line.

In the end, I decided on the cloud account and opted for the free migration help when I signed up. The migration team copied over all mail accounts, databases and the contents of the public_html directory from my account on Hostgator’s server. I had some files outside of that directory but they were small by comparison, and it was easy enough for me to archive them and copy them over.

There are pros and cons with the cloud cluster versus the cPanel single server. The cloud has a far simpler dashboard with a very primitive text file editor. This isn’t normally a problem for me as I mostly use the secure shell (ssh) to access my files so I can edit them with vi or vim. It’s necessary to activate ssh first, and then wait about 15 minutes before you can log in. I deactivate ssh again once I’ve finished.

I had the choice of migrating my SSL certificate (for a fee) but it was less than a month from its expiry date so I didn’t bother. Instead I switched to Let’s Encrypt, which is available for both the cPanel and cloud accounts. Unfortunately, the cloud account doesn’t support Let’s Encrypt for sub-domains (apparently cPanel does). The TSO Host support staff said I could switch to cPanel if I preferred but, after considering it, I decided not to bother. My website is small enough not to really need sub-domains, and I hadn’t advertised the intended change, so I moved the shop back to its original location.

For about a week I had my website files on both the Hostgator server and the TSO Host cluster while I made all the necessary changes. If you visited the site during that time you would’ve been viewing the old files on the Hostgator server.

The first thing I had to do was change any absolute paths. With the cPanel server, the home directory is usually of the form /home/username, but with the cluster the home directory varies depending on which device in the cluster you’re on. For most of the scripts I could obtain the path from the DOCUMENT_ROOT server setting. With osCommerce (used by the on-line store), the configuration file conveniently defines a constant that stores the shop’s path, so that just required a minor edit to reference DOCUMENT_ROOT, but there were also a few absolute paths stored in the shop’s database, such as the location of the public and private keys (used to encrypt the order information sent to PayPal). I modified the software to allow me to store relative paths in the database instead. The only place where I still have a hard-coded absolute path (to a specific device in the cluster) is in the .htaccess files. I haven’t found any way around this.
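
A sketch of the kind of change involved (the constant and file names here are invented; the real osCommerce configuration is more elaborate):

    <?php
    // Before: an absolute path that's only correct on one device.
    // define('SHOP_PATH', '/home/username/public_html/shop/');

    // After: derive the path from the server's DOCUMENT_ROOT setting, which
    // is correct whichever device in the cluster happens to run the script.
    define('SHOP_PATH', $_SERVER['DOCUMENT_ROOT'] . '/shop/');

    // Relative paths stored in the database can be resolved the same way
    // (this key file name is hypothetical):
    $keyFile = SHOP_PATH . 'includes/keys/paypal_public.pem';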

The other change I needed to make was to update my PHP files to work with PHP 7.3, as they were previously running on an older version and contained deprecated functions. With TSO Host I have the option to switch to an older version, such as PHP 5.4, which I did initially to ensure the scripts worked, but obviously it’s better to use the latest version, which comes with extra security measures. Once I’d finished making the necessary modifications I switched to PHP 7.3.
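
I won’t list every modification, but this gives the flavour of it. These are common examples of code that ran happily on PHP 5.x and fails on 7.x (not necessarily the exact functions in my scripts):

    <?php
    $id = $_GET['id'] ?? '';

    // The old mysql_* extension was removed in PHP 7; mysqli replaces it:
    // $link = mysql_connect('localhost', 'user', 'pass');   // fatal on 7.x
    $link = mysqli_connect('localhost', 'user', 'pass', 'shop');

    // The POSIX regex functions (ereg and friends) were also removed:
    // if (ereg('^[0-9]+$', $id)) { ... }                    // fatal on 7.x
    if (preg_match('/^[0-9]+$/', $id)) {
        // ...
    }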

You might be wondering what happened to the Perl CGI scripts that I mentioned in my previous post. It turns out they don’t work here either. The missing modules are still missing, but at least I’m now getting an understandable error message from cpan: I don’t have permission to install them. Perhaps they’re not pre-installed because they’re now obsolete, or have vulnerabilities, or haven’t been vetted. Anyway, I’ve now decided that I’d rather use PHP, which provides the functions those scripts require without depending on extra libraries or modules. All those Perl scripts should now automatically redirect to the new PHP replacements.

Once I’d made all the modifications necessary to make the site work on the new cloud server cluster, it was time to change the nameservers. (Basically, when you type an address into your web browser it has to ask the nameserver for directions.) After the switch was made, I then went back to my Hostgator account and deleted all the files, databases etc (because I’m paranoid, er, tidy) before closing my account.

Since then I’ve been working on the remaining new PHP scripts in between a lot of travelling and other commitments. I also installed WordPress. This was easy to do from the cloud dashboard, and the installation tool sensibly chose not to use stupidly obvious admin or database names (they seem to just be randomly generated strings). My plan is to republish the articles from my old blog here, although I may omit any time-sensitive information (such as giveaways that have now closed).

WordPress isn’t as easy to tinker with as osCommerce. For example, osCommerce has a constant defined in the configuration file that has the relative path to the admin directory. This makes it really easy to rename. WordPress, on the other hand, hard-codes the relative admin path. While it’s technically possible to alter this by editing all the files that reference this path, the changes will be lost when upgrading to a new version. Whilst one shouldn’t rely on obscurity as the only form of defence, there’s no point in making things too obvious. (Consider the Lonely Mountain in “The Hobbit”. The hidden door could only be unlocked with a key at a certain time, but that didn’t mean the dwarves went around putting up signs saying “secret entrance this way”.) There are, however, security plugins that restrict the number of login attempts etc.

Both osCommerce and WordPress have the database credentials in a configuration file within their installation paths. This is normally protected from public viewing by the server settings, but an accidental mis-edit or deletion of the .htaccess files could cause the contents of those configuration files to be shown as plain text, exposing the credentials (user name, password, database name etc). So I’ve moved them out of those configuration files to a location that can’t be accessed by a browser.
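
The idea, roughly (the directory and constant names are invented for this illustration):

    <?php
    // The credentials file lives *above* the web root, so no browser request
    // can ever reach it, even if the .htaccess protection is lost.
    $privateDir = dirname($_SERVER['DOCUMENT_ROOT']) . '/private';
    require $privateDir . '/db-credentials.php'; // defines DB_HOST, DB_USER, ...

    $db = mysqli_connect(DB_HOST, DB_USER, DB_PASSWORD, DB_NAME);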

I’ve added some new PHP scripts that have replaced static pages, such as the gallery that’s now searchable, the book page and the site map. The “new book alert” Perl CGI script has been replaced with the more general book list. There have been a few glitches, but hopefully they’re all now fixed, and there are some more updates still to do but the main scripts are done.

With my previous web hosting company, this site had one strictly necessary cookie for the online store. The new cloud account has a second strictly necessary cookie. These cookies will be discussed in the next post.

Clouds, Cookies and Migration Part 1: Migration

Once upon a time, a little parrot decided to migrate across the vast ocean to the cloud lands, with nothing more than a handful of cookies.

I switched web hosting provider in May, and have since made quite a few changes to the site. It all took much longer than I had originally intended. I’m sorry for any disruption, but everything should hopefully now be all sorted. One of the new additions is this blog, and it seems appropriate that this first post should explain the reason for all the changes, but first a little background about the origins of Dickimaw Books.

I had some tutorials on theoval.cmp.uea.ac.uk/~nlct that I had developed from some training courses for staff and postgraduates that I occasionally taught at the University of East Anglia (UEA), where I’m an honorary lecturer (that is, I’m not a full-time paid member of staff; I just do the odd little project or occasional short staff and PG training courses in LaTeX). I provided the online tutorials as a supplement to the courses, and made them available under the GNU Free Documentation License (GNU FDL). After a while I discovered that people from outside of the courses were reading and sharing them. Some people were also printing them because they preferred to read hard copies.

The tutorials were gaining in popularity, but there’s no guarantee that “theoval.cmp.uea.ac.uk” will continue to be available. It was originally “theoval.sys.uea.ac.uk”. It may have another name change, or may be retired. Most of all, it’s quite complicated to access the files on that server when off-site, which makes them hard for me to update. My honorary status is reviewed every few years and if it isn’t renewed I’ll lose my UEA account. I needed a domain name and server that I had more control over and easier access to.

I started to wonder about providing printed versions. Would people want to buy a book that’s freely available online? Free resources are often subsidised by adverts, but I find them intrusive. There are some sites so loaded with ads that it’s almost impossible to actually read the real page content. I’d rather not have ads on my site if possible, but without them I have to sell enough copies to cover the costs. It was a bit of a gamble.

If you’ve bought a copy or copies of my books, a big thank you. It’s because of you that this site is still going and free of ads.

I’ve already written about being an independent writer/publisher but I didn’t mention web hosting. With only a small budget available, I had to opt for a low-cost shared hosting account. I started out with Hostgator’s “hatchling” account. I’ve been a computer programmer for over 25 years (my introduction to programming was a Modula-2 first year undergraduate course in 1988), and I had written some web pages and scripts before setting up my own site, but this was the first time I was responsible for an entire domain.

The shared hosting account came with an easy-to-use web application that a contact form could use to email me a message according to a template. Unfortunately it proved far too easy for spam-bots to use as well, and it also seemed to have a problem with messages that contained a backslash, so pretty much every LaTeX-related message was garbled. I replaced the forms with Perl CGI scripts, since I was already familiar with programming in Perl. This fixed the problem of the garbled messages but the spam was still getting through.

I knew enough to be wary of bots trying to inject spammy links or malicious code and I’ve been on Usenet long enough to know about the existence of trolls, but I didn’t realise until then about the troll-bots whose sole purpose seems to be to seek out forms and post inflammatory comments based on certain text found on the page (such as “books”). At least, I’m assuming they were posted by bots. Any real person who can mistake a bug report form for a thought-provoking literary article has to be singularly devoid of intelligence.

(If I’m mistaken and it turns out they were posted by a real person rather than a bot, I do apologise if I caused any offence. I’ll rephrase it: people who deliberately write content with the express intent to insult or inflame while cravenly hiding behind the cloak of anonymity are the same kind of witless cowards who would, a hundred years ago, have been the type to write illiterate poison pen letters to their local community. The community is much wider these days, but it’s the same mentality.)

My first attempt to reduce the spam was to use Google’s reCAPTCHA. The premise seemed quite good initially. It’s a CAPTCHA-like system designed to check that the user is human (rather than a bot) whilst at the same time helping with the digitization of books. It showed a wiggly, distorted word that you needed to confirm in the box (common in CAPTCHAs), and also a word scanned from a book, which you also needed to confirm. However, when I later looked at the scripts that used reCAPTCHA, I noticed that the scanned words had been replaced with photos of house number plates, which struck me as a bit creepy. I ended up removing them, and I tried other approaches, such as getting the user to confirm an ID (for bug reports and feature requests) or making the form multi-paged (which allows the form to be customized depending on the initial settings, but also makes it a little harder for a bot to follow).

My original plan was just to provide a website to host the online versions of my books and related information, but I started to add more stuff and eventually decided to try e-commerce. I opted for the open-source osCommerce which is written in PHP. The advantage with an open source project is that I can modify the code to better suit my requirements. This was my first introduction to PHP, and at first I only made minor modifications to make it specifically a book store rather than a more general store (such as changing “manufacturer” to “author”). Later I tried to make it more mobile-friendly. The great thing about osCommerce is that the upgrades are provided in terms of differences between the old version and the new one. It’s less convenient than simply clicking on an upgrade button that will do everything for me, but it means I’m less likely to lose my changes.

If you happen to be planning on setting up your own online store, I recommend you get an expert to write up the legal documents (such as the terms and conditions). A few years ago my electricity supplier decided to launch a brand new site with user accounts to manage bills etc online. I dutifully started filling in the new account form and, being the pedantic person I am, I followed the “terms and conditions” link so that I could read it. (I don’t like ticking boxes to say I’ve read something when I haven’t even glanced at it.) I found myself on a page that was blank except for the title “Cookies”, so I contacted the company. I received an apology and the “correct” URL, which turned out to be their cookie policy page. I wrote back and asked for their actual terms and conditions page. Their reply effectively said that was their only terms and conditions and that I should stop fussing and just tick the check box. (They didn’t literally write that last bit, but that was the subtext.)

Terms and conditions are there to protect the site owner by imposing restrictions (don’t post nasty stuff on our site, don’t try breaking it, don’t sue us if it goes off-line occasionally). A site shouldn’t have to provide such obvious conditions. Society expects visitors to behave well: wipe your feet on the doormat, don’t smash the crockery or insult the other guests. The terms and conditions may go beyond that (which is why it’s a good idea to check them). A cookie policy (as well as a privacy policy), on the other hand, is legally required in countries with legislation such as the General Data Protection Regulation (GDPR) as it relates to your personal data.

My concern with that particular form wasn’t that the company wasn’t trying to impose any conditions on my use of the site. The problem was that it gave the impression that the site was designed by an amateur who didn’t know the difference between the two types of document. If they didn’t know that, how could I be confident that they knew how to securely store my data?

Trust is fundamental to business. It doesn’t matter how good your product is, a certain level of trust is required for people to buy from your store. This is particularly true for on-line small businesses, which have a tendency to pop up and disappear. One of the things I realised I had to do was add an SSL certificate to the site. This wasn’t an option with Hostgator’s hatchling account. There was a button on the cPanel dashboard, but it just popped up a message saying I needed to upgrade my account in order to include SSL. I decided to upgrade to the business account, which included a free SSL certificate.

When I first started with Hostgator, the support channels included email and a ticketing system. It was only when my site encountered a problem in late 2017 that I discovered that some of these channels had gone. My Perl CGI scripts suddenly stopped working. The CTAN team alerted me to the problem: they have a tool that checks all the links on their site, and the ones leading to my FAQ had started triggering an error code.

I soon discovered that some of the Perl modules that those scripts depended on were no longer installed. The most obvious solution was to reinstall them, but when I tried I received a rather unhelpful “an error has occurred” message. I modified all of the scripts so that they no longer triggered an error code and just displayed a message saying the particular function was currently unavailable while I investigated further. I couldn’t find a link to the ticketing system so I tried to send an email to the support address but received an automated reply telling me to phone or use the on-line chat. I didn’t fancy being stuck in a trans-Atlantic telephone queue so I tried the on-line chat. The operator didn’t know anything about Perl and eventually gave up and apologised that the technical support staff were all currently unavailable. Please try again later. I found on the forum that I wasn’t the only one experiencing this problem, but there were no solutions offered.

That was when I first considered moving to a new web hosting provider. I started looking around and noticed that many of the shared hosting accounts seemed to use cPanel as a convenient web application to allow site administrators to access the site files, email accounts, databases etc. I investigated cPanel and found that a new version had recently come out. There was a note advising that following the upgrade it might be necessary to reinstall some Perl modules. It therefore seemed likely that a cPanel update had caused the modules to disappear, in which case moving to a new web hosting provider might not help. In the end, I decided to stick with Hostgator and replace the Perl CGI scripts with PHP. This turned out to take a lot longer than I had originally anticipated due to other more urgent commitments.

A few months ago I started wondering about adding a blog to this site. I was previously using the author blog on GoodReads, but I don’t have much control over the interface, and I’ve started to become wary about putting content on third-party sites. I investigated WordPress (another open-source web application written in PHP) and considered adding it to my site. The business shared hosting account provided the option for sub-domains. By this time, I was starting to get quite adventurous. Perhaps I could have a sub-domain for the shop and another sub-domain for the blog.

I started with the shop. I made a temporary index page for dickimaw-books.com/shop to say that it was currently off-line and set to work moving all the files over to a sub-domain. I ran through some test transactions and everything seemed to work fine. Then I decided that I ought to review all the security settings just to make sure everything was as tight as it could be.

The shop doesn’t store any financial details. It works by transferring the customer to PayPal to perform the actual payment. The shop, therefore, has to communicate with PayPal to provide the transaction details (customer name, invoice address, shipping address, cost etc). PayPal, in turn, has to send a message back to the shop to say the transaction has been completed. The customer is returned to the store, which triggers the confirmation email and the store empties the customer’s basket. The communications between the store and PayPal have to be made securely (in case anyone attempts to intercept them and also to prevent any naughty customer from trying to alter the total cost).
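
To give a flavour of that message-back step, here’s a hedged sketch based on PayPal’s Instant Payment Notification (IPN) mechanism; the actual osCommerce payment module is considerably more involved:

    <?php
    // PayPal posts the transaction details to the shop. To confirm they're
    // genuine, the shop posts them straight back (over HTTPS) with
    // cmd=_notify-validate, and PayPal replies VERIFIED or INVALID.
    $postback = 'cmd=_notify-validate&' . http_build_query($_POST);

    $ch = curl_init('https://ipnpb.sandbox.paypal.com/cgi-bin/webscr');
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postback);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    curl_close($ch);

    if ($response === 'VERIFIED') {
        // Safe to record the order, send the confirmation email and empty
        // the customer's basket.
    }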

While reviewing all the settings I noticed that “verify SSL” was set to false, and that looked like the kind of thing that ought to be on. So I switched it to true and ran a test transaction. An error message appeared when I was returned from PayPal’s sandbox account to the store. This was rather annoying. So I copied the message into a search engine to find a solution. I encountered a lot of questions about the problem but the only “solution” offered was to switch off the “verify SSL” setting. This didn’t strike me as a particularly good answer.

Near the “verify SSL” setting there’s a “test connection” button, so I tried that out. It displayed a red “failure” when the setting was on and a green “success” when it was off. The most obvious next step was to find out exactly what that test did. It took a bit of rummaging around the PHP code, but I finally discovered that it was calling curl (client URL, a library and command-line tool for transferring data). I modified the code so that on failure it would also provide curl’s error message. This gave me a new bit of information for my search, and this time I finally found an answer: my version of curl was likely too old. I found the version number (7.19.7) and looked it up on curl’s releases page. It was over 9 years old with 40 known vulnerabilities. Definitely time for an upgrade.
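
For anyone curious, the modification amounted to something like this (a simplified sketch, not the actual osCommerce code):

    <?php
    $ch = curl_init('https://www.sandbox.paypal.com/cgi-bin/webscr');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true); // the "verify SSL" setting
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);

    if (curl_exec($ch) === false) {
        // Surface curl's own error message instead of a bare "failure".
        // An ancient curl (7.19.7 in this case) can fail here because it
        // predates the TLS versions and certificates PayPal now requires.
        echo 'failure: ' . curl_error($ch);
    } else {
        echo 'success';
    }
    curl_close($ch);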

So I was back on Hostgator’s chat. “An operator will be with you in 5 minutes.” Fifteen minutes later I was finally connected with someone. I was staggered by the response to my request to upgrade curl: sorry, that option isn’t available for the shared hosting accounts. “I hope you understand,” the operator concluded. I reiterated that it was over 9 years old with 40 known vulnerabilities, but I received the same response, again concluding with “I hope you understand”.

Quite frankly, I don’t understand. I think it’s reprehensible of a web hosting provider to not have regular updates for common tools that form an essential part of web security, particularly for a business account that’s marketed as being suitable for e-commerce.

I replied that, in that case, I would find a new web hosting provider, and did the internet equivalent of slamming the phone back on the hook: I clicked on the close button. So now I really had to migrate, and I once again began investigating alternatives.

This was back in May and my news feed had recently included articles about fake reviews, which made me quite wary. Was the 5 star “X is brilliant” written by a genuinely happy customer of X or was it written by someone paid by X? Was the 1 star “X is awful so I switched to Y who are brilliant” really written by a dissatisfied customer of X or was it written by someone paid by Y to boost Y and slate the opposition? How old is the review? Companies can improve or deteriorate over time. An old low rating could’ve prompted them to change for the better.

I decided not to be strongly influenced by ratings and reviews but instead to draw up a list according to my requirements and find out from each company in turn what version of curl they had installed. Given the way software updates usually work, the chances are that if curl was up-to-date then the other tools I require would be too.

So, what are my primary requirements? Shared hosting (small budget), Linux (I don’t want to waste time learning an unfamiliar operating system), Apache (web server), MySQL (for my databases), PHP (web scripts), Perl (web scripts and non-web command line tools for maintenance etc). That doesn’t really narrow the field, as this is quite a common set of requirements.

In order to filter the list further so that I had an easier starting point, I decided to add a secondary requirement that I was willing to drop if I couldn’t find a satisfactory fit: a company with a UK base so I didn’t have to worry about international helplines, currency conversion or bank fees for foreign transactions.

In the end I opted for TSO Host. The chat operator was quick to respond, helpful, and provided me with a curl version number that I checked and found to be less than 2 months old with no known vulnerabilities. I had the option to select either a cPanel account or a cloud account, with three equivalent packages for each type. The price varies according to the package but doesn’t depend on whether you opt for cPanel or cloud. Since I was already familiar with cPanel, the operator suggested I might prefer that, and I initially agreed, but I could switch to the cloud later if I changed my mind.

The next post deals with migrating from Hostgator’s cPanel single server to TSO Host’s cloud server cluster.