Scaling pretraining affects RL sample efficiency (runrl.com)
1 point by ag8 6 months ago

Training Qwen to answer briefly yet intelligently using feedback control (runrl.com)
4 points by ag8 7 months ago

Launch HN: RunRL (YC X25) – Reinforcement learning as a service (runrl.com)
71 points by ag8 7 months ago | 22 comments

Generating the Funniest Joke with RL (runrl.com)
1 point by ag8 11 months ago

Why Run RL? How specialized models can outperform the biggest LLMs (runrl.com)
4 points by -_- 12 months ago