This holds up to a point. Systems that must have very little downtime get progressively more complex.
Need to handle machine failures? Push a bugfix without causing downtime? Maintain a single state across the redundant machines? Handle load spikes, including ones created on purpose to DoS you?
All of this requires additional mechanisms, which can themselves cause failures.