This blog posting represents the views of the author, David Fosberry. Those opinions may change over time. They do not constitute an expert legal or financial opinion.
If you have comments on this blog posting, please email me.
Posted on 18th May 2017
There have been many reports over the last few days about the chaos caused by British Airways and its IT failure.
Very many types of business in the modern world are fully dependent upon IT systems to operate: banking, telecoms, air travel (not just airlines, but also airports, air-traffic control, etc.), the full gamut of Internet-based businesses, and so on. Most of these companies seem to have understood how vital it is, for both them and their customers, to ensure that their systems are reliable and robust, but it seems that BA "didn't get the memo". Now, high-reliability (usually called high-availability) systems are something that I know quite a lot about, and the experts quoted by the BBC's Tech Tent are right: a power surge is no excuse, and power management is a vital part of any business-critical system design. To put this into context for those readers who are not familiar with the subject matter, let me describe a typical disaster recovery plan:
Of course, none of this is any use if the fall-back systems and processes don't work, which seems to be the case here. When you spend millions on redundant systems and data back-ups, you have to test that they work. You must test that, when a system or a whole site fails, the load is properly switched to other systems (and that you can put the system back into its normal operating mode once the fault is repaired). You must also test that the software and processes to restore data from back-ups actually work. It seems likely that BA failed to do this, since its systems stopped working. Of course, I do not know whether BA simply failed to put in place a properly reliable set of systems and processes, or whether it at least tried to do so but failed to test that they worked properly in the event of failure. Either way, the outcome is simply unacceptable, and the impact on its customers was major and intolerable. This simply cements BA's position as one of the world's worst airlines, one with a cavalier and irresponsible attitude to its customers. BA's CEO said that the "flight disruption had nothing to do with cutting costs". I beg to differ. Clearly, not enough money was spent on building and testing the disaster recovery plan.
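For readers who like to see such things concretely, the two kinds of test described above (fail-over with fail-back, and restoring from back-up) can be sketched as a toy simulation. This is only an illustration under stated assumptions: the `Site` and `Cluster` classes and the `failover_drill` and `restore_drill` functions are hypothetical names invented for this sketch, and nothing here reflects how BA's systems, or any real high-availability product, are built.

```python
import hashlib


class Site:
    """A hypothetical data centre that is either up or down."""

    def __init__(self, name):
        self.name = name
        self.up = True


class Cluster:
    """Routes load to the first healthy site, in priority order."""

    def __init__(self, sites):
        self.sites = sites

    def active_site(self):
        for site in self.sites:
            if site.up:
                return site
        raise RuntimeError("total outage: no healthy site")


def failover_drill(cluster):
    """Fail the primary, check that another site takes the load, then fail back."""
    primary = cluster.sites[0]
    primary.up = False  # simulate losing the whole primary site
    try:
        took_over = cluster.active_site() is not primary
    except RuntimeError:
        took_over = False  # nothing was left to take over the load
    finally:
        primary.up = True  # the fault is repaired
    failed_back = cluster.active_site() is primary
    return took_over and failed_back


def restore_drill(original, restored):
    """A back-up only counts if the restored bytes match the original."""
    return hashlib.sha256(original).digest() == hashlib.sha256(restored).digest()


if __name__ == "__main__":
    cluster = Cluster([Site("primary"), Site("standby")])
    print("fail-over drill passed:", failover_drill(cluster))
    print("restore drill passed:", restore_drill(b"bookings", b"bookings"))
```

The point of the sketch is that both drills return a simple pass or fail, so they can be run regularly rather than discovered to be broken on the day of a real outage.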