But they stopped doing daily downtime, right? And those sites you mentioned have revenue of, what, billions of dollars a year? And I was one of those people having their play time interrupted daily, and it was always back up in minutes.
Finally, EVE has daily downtime of half an hour. No one bats an eye. The Elite server goes down for two minutes a day, and everyone loses their mind.
And, trust me, people were losing their mind. Anything FD does is heresy around here.
Yes, they stopped that, it seems, and I was simply explaining why daily downtimes at the same time of day are bad for some people. No wonder they went haywire...
The technical issue is not about millions in investment, but about design and planning. The big ones were simply an example.
The requirement of an uninterrupted service is standard these days. It does not depend on the company size, nor is it extremely "expensive" (in fact you save costs). What you need is simply enough resources to have a single node replaced with a spare during maintenance of each server and then move on to the next (usually you use virtualized servers anyway). Actually it is common to set up the "new patch version" on the spare, let sessions migrate to the new server from one "old" node, delete the "old" server and clone a new one... rinse and repeat. Only the management database / session broker needs to track the connected clients and sessions. (Databases have used high availability methods since the 90s, so that shouldn't be too difficult to operate.)
This is somewhat simplified, and there are other methods available depending on the server OS for example, but in general it is not rocket science.
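Roughly, in Python, the "rinse and repeat" loop looks like this. It is a toy sketch only, not Frontier's actual setup (which nobody here knows): the `Node` class, the `rolling_update` function and the node names are made up for illustration, and "migrating a session" is reduced to moving a name from one list to another.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical game server node; not modelled on any real system."""
    name: str
    version: str
    sessions: list = field(default_factory=list)


def rolling_update(nodes, spare, new_version):
    """Replace every node with a patched one, one at a time.

    `spare` is an idle node. It is patched first, takes over the sessions
    of one old node, and the emptied old node becomes the next spare.
    Returns the updated fleet and the leftover spare.
    """
    updated = []
    for old in nodes:
        spare.version = new_version          # set up the new patch version on the spare
        spare.sessions.extend(old.sessions)  # let sessions migrate to the patched node
        old.sessions.clear()                 # the old node is now empty and can go
        updated.append(spare)
        # "Delete and clone": here the emptied node is simply reused as the next spare.
        spare = old
    spare.version = new_version              # patch the leftover spare as well
    return updated, spare


fleet = [Node("node-1", "1.0", ["alice", "bob"]), Node("node-2", "1.0", ["carol"])]
fleet, spare = rolling_update(fleet, Node("spare", "1.0"), "1.1")
for node in fleet:
    print(node.name, node.version, node.sessions)
```

At no point in the loop is more than one node out of service, which is why one spare is enough in this simplified picture.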
It is not so much about heresy as about people from Frontier "bragging" about how professional and experienced they are in the business. I'll give them that: it is a complex project and not easy to build, but they had an advantage most companies or providers don't have: they could plan, design and build everything according to their own ideas. No stakeholders, BUs or customers telling them which application, database, or third-party middleware to integrate.
So, sorry if I sound a bit harsh, but I wonder why they didn't take an approach that does not require downtime to update or add something. Just my opinion, and you may be happy with how it is. I'm just pretty certain it should not be necessary...
...and by the way, what is EVE???
The difference may be that nobody cares if two Google server instances run different versions of the software for several minutes during deployment, as two clients almost never write the same data in those examples. With Elite, multiple players influence the same data, which makes out-of-sync servers running different versions a real problem for a consistent galaxy.
It is just a guess, as no matter how good anyone is at software development, you cannot make assumptions about a software system whose implementation details you do not know.
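A toy illustration of that concern, in Python. Everything in it is invented (the price rules, the function names, the numbers); it only stands in for "the patch changed some game logic that writes to shared data" and says nothing about how Elite really works.

```python
# Toy model: one shared market price, written by servers running different versions.
# The price rules below are invented purely for illustration.

def sell_v1(price, units):
    """Old version: each unit sold lowers the price by 1 credit."""
    return price - units


def sell_v2(price, units):
    """Patched version: each unit sold lowers the price by 2 credits."""
    return price - 2 * units


def run(servers, start_price=100, units=5):
    """Apply one sale per server, in order, to the same shared price."""
    price = start_price
    for sell in servers:
        price = sell(price, units)
    return price


same_version = run([sell_v2, sell_v2])
mixed_versions = run([sell_v1, sell_v2])
print(same_version, mixed_versions)
```

With both servers on the same version the shared price ends up at one value; with mixed versions it ends up at another, depending on which server each player happened to be connected to.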
As I explained above, I am not making assumptions about the software; I am talking about the underlying infrastructure standards of a 21st-century datacenter / cloud service.
Let me explain: if you want a taxi service to bring people from A to B without "downtime", all you need is at least one more operational taxi than the number of rides required at any time... no need to know what car model they are, whether they are automatic or manual, etc.
Unlike taxis, with today's virtualization technologies you can get a new virtual server node cloned and running in minutes.
A good high availability design makes sure you can simply add or replace an instance without impact and have a spare (or multiple spares) available at all times. Before you shut down an instance, simply have all players moved to a spare, prevent new players from entering the server flagged for maintenance, and then shut it down or replace it with the new version. It's usually faster to clone a new server than to upgrade or install it, yet where unique IDs, certificates etc. are involved, you may need to upgrade in place. In my experience, there is still no need to shut down all servers at once... unless you have a basic design flaw.
By the way, the basic design concept for this high availability approach is from the late 90s...
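A minimal sketch of that drain step, assuming a hypothetical session broker that knows which players sit on which node. The `SessionBroker` class and its methods are made up for illustration; a real broker would also have to hand over live game state, which is skipped here.

```python
class SessionBroker:
    """Hypothetical broker that tracks which players are on which node."""

    def __init__(self, node_names):
        self.players = {name: set() for name in node_names}
        self.draining = set()  # nodes flagged for maintenance

    def connect(self, player):
        """Place a player on the least loaded node that is not draining."""
        open_nodes = [n for n in self.players if n not in self.draining]
        if not open_nodes:
            raise RuntimeError("no node available: spare capacity is missing")
        target = min(open_nodes, key=lambda n: len(self.players[n]))
        self.players[target].add(player)
        return target

    def drain(self, node):
        """Flag a node for maintenance and move its players elsewhere."""
        self.draining.add(node)
        for player in sorted(self.players[node]):
            self.connect(player)                # place the player on another node first
            self.players[node].discard(player)  # then release the old slot

    def replace(self, node):
        """Bring the emptied node back after it was patched or re-cloned."""
        if self.players[node]:
            raise ValueError("drain the node first")
        self.draining.discard(node)


broker = SessionBroker(["a", "b", "spare"])
for p in ["p1", "p2", "p3", "p4"]:
    broker.connect(p)
broker.drain("a")
print(broker.players)
broker.replace("a")
```

While "a" is flagged, new and moved players only ever land on "b" or "spare", so the node can be shut down empty and put back into rotation afterwards.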