I wish there were a way to "pause" incoming requests in web servers. Most deployments (migrations + code) take only a few seconds, and I'd rather have some users wait 2 seconds for a request to finish than have their request hit a 500 (due to inconsistent code/database) or a 503 (from putting the site into maintenance mode).
Usually we aren't deploying a schema change that's really huge so we just go for it and let the application crash for those users who happen to hit a place where the code/schema are out of sync.
Zero-downtime (no-crash) deployments seem like too much effort for too little gain.
HAProxy allows re-dispatching failed requests some number of times. If you have an extremely brief outage due to a deploy, redispatching failed requests 3 times may be sufficient. I imagine other load balancers have similar functionality.
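A minimal sketch of what that looks like in an HAProxy config, assuming a backend named `app` with two servers (the names and addresses are placeholders):

```
backend app
    # Retry a failed connection up to 3 times...
    retries 3
    # ...and allow retries to be redispatched to a different server
    option redispatch
    server web1 10.0.0.1:8080 check
    server web2 10.0.0.2:8080 check
```

With health checks enabled, a server that's mid-deploy gets marked down and retried requests land on its peer instead.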
One approach which works pretty well in many apps is having something like a CDN or Varnish in front that can serve stale content when the backend is unreachable. That way the code you need to bootstrap your app keeps being served as long as your edge cache is running, and your JavaScript can do some sort of retry+backoff for failed requests, or even check for image error states to trigger a reload.
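For the Varnish side, a minimal VCL sketch (Varnish 4+ syntax) that keeps objects around past their TTL so stale copies can be served while the backend is down during a deploy:

```
sub vcl_backend_response {
    # Keep objects for 2 minutes beyond their TTL ("grace" period),
    # so a stale copy can be served if the backend is unreachable.
    set beresp.grace = 2m;
}
```

The grace window just needs to comfortably exceed your typical deploy duration.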
Have your load balancer drain connections to a web server. When it has no requests in progress, deploy to it. When that's done, move on to the next web server.
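The drain-then-deploy loop can be sketched like this, with a stubbed load-balancer client standing in for a real admin API (all names here are hypothetical; a real version would talk to something like HAProxy's runtime socket):

```python
import time

class StubLoadBalancer:
    """Stand-in for a real LB admin API."""
    def __init__(self, in_flight):
        self.in_flight = dict(in_flight)  # server -> open request count

    def drain(self, server):
        # Stop routing new requests to this server.
        print(f"draining {server}")

    def active_requests(self, server):
        # Simulate in-progress requests finishing over time.
        n = self.in_flight[server]
        if n > 0:
            self.in_flight[server] = n - 1
        return n

    def enable(self, server):
        print(f"re-enabling {server}")

def rolling_deploy(lb, servers, deploy, poll_interval=0.01):
    for server in servers:
        lb.drain(server)
        # Wait until the server has no requests in progress.
        while lb.active_requests(server) > 0:
            time.sleep(poll_interval)
        deploy(server)
        lb.enable(server)

deployed = []
lb = StubLoadBalancer({"web1": 3, "web2": 1})
rolling_deploy(lb, ["web1", "web2"], deployed.append)
print(deployed)  # ['web1', 'web2']
```

Each server is only ever deployed to while it is out of rotation, so no user request sees mixed code.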
Folks from Braintree gave a talk about how they did this. They'd queue up all the incoming requests in Redis, then replay them once the site was back up.
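The control flow of that queue-and-replay idea can be sketched roughly like this (just an illustration: a deque stands in for the Redis queue, and the class/function names are made up, not Braintree's actual design):

```python
from collections import deque

class PausableFrontend:
    """While 'paused' (deploying), buffer requests instead of
    passing them to the app; afterwards, replay them in order."""
    def __init__(self, handler):
        self.handler = handler   # the real application
        self.paused = False
        self.queue = deque()     # stand-in for a Redis list

    def handle(self, request):
        if self.paused:
            self.queue.append(request)
            return "queued"
        return self.handler(request)

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False
        results = []
        while self.queue:
            results.append(self.handler(self.queue.popleft()))
        return results

app = PausableFrontend(lambda req: f"handled {req}")
app.pause()                 # deploy starts
print(app.handle("a"))      # queued during the deploy -> "queued"
app.handle("b")
print(app.resume())         # deploy done: replay in order
```

The catch is that clients are still blocked waiting for their responses, so this only works if deploys stay well under typical request timeouts.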
I don't think that's quite what raziel2p wants. Won't that still allow the web server to receive and process the request, and just delay its response?