This holds up to a point. Systems that must have very little downtime get progressively more complex.

Need to handle machine failures? Push a bugfix without causing downtime? Maintain a single consistent state across redundant machines? Handle load spikes, including ones created on purpose to DoS you?

All of this requires additional mechanisms, which can themselves cause failures.
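
To put rough numbers on that last point, here is a minimal sketch of the arithmetic. It assumes a deliberately simplified model that is not from the text above: two identical machines that fail independently, sitting behind one failover mechanism that must itself be working for the system to be up. The function name and the availability figures are made up for illustration.

```python
def redundant_pair_availability(machine: float, failover: float) -> float:
    """Availability of two redundant machines behind one failover mechanism.

    Simplified model (an assumption, not a measurement): machines fail
    independently, and the system is up only when the failover mechanism
    works AND at least one machine is up.
    """
    at_least_one_machine_up = 1 - (1 - machine) ** 2
    return failover * at_least_one_machine_up


if __name__ == "__main__":
    single_machine = 0.99  # hypothetical availability of one machine

    # A very reliable failover mechanism: redundancy helps.
    print(redundant_pair_availability(single_machine, failover=0.999))

    # A failover mechanism less reliable than the machine it protects:
    # the "more redundant" system is now worse than a single machine.
    print(redundant_pair_availability(single_machine, failover=0.98))
```

Under these assumptions the pair with the 0.999 mechanism comes out at roughly 0.9989, better than the single machine at 0.99, while the pair with the 0.98 mechanism lands below 0.98. In this model the added mechanism caps the availability of the whole system.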