The problem with this vision of the future is limiting the cycles the free tier sops up, as you say. How does that work exactly? Unless your implementation is an interpreted language, I can't imagine a solution.
There is actually a pretty interesting solution in production today: you can use Vagrant and Docker to implement a containerized service system where 'priority' containers get resources in preference to non-priority ones. From the perspective of mainframe computing, this problem was solved pretty much as a minimum viable feature (at the time, mainframes had hard deadlines for things like getting billing jobs done before 6AM, and other batch jobs got 'best effort' sorts of treatment). We have so many more tools now than they did, and hardware support too.
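To make the 'priority' idea concrete, here is a rough sketch of the arithmetic behind proportional CPU shares, the scheme that Docker's `--cpu-shares` option exposes. The container names and weights are made up for illustration. Note that the split only applies when containers are actually competing for CPU; an idle machine lets the free tier use whatever is spare.

```python
def cpu_fractions(shares):
    """Fraction of CPU each container gets when all of them are busy.

    `shares` maps a container name to its relative weight
    (hypothetical names and values, chosen only for this example).
    """
    total = sum(shares.values())
    return {name: weight / total for name, weight in shares.items()}


# A paid-tier container weighted 4x heavier than a free-tier one.
fractions = cpu_fractions({"paid": 1024, "free": 256})
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%} of CPU under contention")
```

With these weights the paid container gets four fifths of the CPU whenever both are busy, which is the 'best effort' treatment for the free tier described above.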
Good points. Supposing you're referring to the setrlimit syscall (which is what ulimit sets from the shell), for example, do those limits apply per thread or per process? My understanding is they're per process. Then the math comes down to how many user processes the service can run per remote machine. A few hundred? That's a hard limit on the scalability of the service. Is a process really the ideal level of isolation?
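For what it's worth, here is a minimal sketch of the process-level case using Python's `resource` module on a Unix system. The helper name `limit_cpu` and the 1-second cap are my own choices for illustration. It only shows that a limit applied in a child process leaves the parent's limit alone; it doesn't settle the per-thread question.

```python
import resource
import subprocess
import sys


def limit_cpu(cpu_seconds):
    """Return a function that caps CPU time for the process that calls it."""
    def apply():
        soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
        # Only lower the soft limit; never ask for more than the hard limit.
        if hard == resource.RLIM_INFINITY:
            new_soft = cpu_seconds
        else:
            new_soft = min(cpu_seconds, hard)
        resource.setrlimit(resource.RLIMIT_CPU, (new_soft, hard))
    return apply


# The child simply reports the soft CPU limit it is running under.
CHILD_CODE = "import resource; print(resource.getrlimit(resource.RLIMIT_CPU)[0])"

parent_before = resource.getrlimit(resource.RLIMIT_CPU)
child = subprocess.run(
    [sys.executable, "-c", CHILD_CODE],
    preexec_fn=limit_cpu(1),  # runs in the child, between fork and exec
    capture_output=True,
    text=True,
)
parent_after = resource.getrlimit(resource.RLIMIT_CPU)
child_soft_limit = int(child.stdout.strip())

print("child soft CPU limit:", child_soft_limit)
print("parent limit unchanged:", parent_before == parent_after)
```

So each user job would need its own process to get its own cap this way, which is where the processes-per-machine math comes in.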