I suppose the cost of running the prediction service is made up for by the improved workload performance. How do you make that trade-off between the MIP solver budget and the resources spent training the model on the one hand, and workload throughput gains or latency reductions on the other?
Yes. In our case it's definitely worth it. Another interesting thing to consider is that moving from a generic kernel-level solution (Linux CFS) to ML-driven systems that depend on the actual workloads running across all of our clusters also changes how we debug and iterate on the software. We believe it's a net positive in our case.