Hacker Times
emp17344
35 days ago
on:
Constraint Decay: The Fragility of LLM Agents in B...
RLVR doesn’t work for unverifiable tasks, so they won’t be able to effectively use tools to boost reliability for those tasks.
jeremyjh
34 days ago
Right, so you have to use RLHF. That is the economics problem I was referring to.