Hacker Timesnew | past | comments | ask | show | jobs | submitlogin

It was never really clear what the difference between the chat and responses APIs were. Anyone know the difference?


chat completions is stateless — you must provide the entire conversation history with each new message; openai stores nothing (at least nothing that the downstream product _can use_) beyond the life of the request.

responses api, by contrast, is stateful — only send the latest message, and openai stores the conversation history, while keeping track of other details on behalf of the calling app, like parallel tool call states.

but i would say that since chat completions has become an informal industry standard, the responses api feels like an attempt by openai to break away from that shared interface, because it is so easy to swap out providers with nothing more than a base url and a model id, to a paradigm which requires data migration as well as replacement infrastructure (containers for code execution, for example).


one additional difference between chat and responses is the number model turns a single api call can make. chat completions is a single turn api primitive -- which means it can talk to the model just once. responses is capable of making multiple model turns and tool calls in a single api call.

for example, you can give the responses api access to 3 tools: a vector store with some user memories (file_search), the shopify mcp server, and code_interpreter. you can then ask it to look up some user memories, find relevant items in the shopify mcp store, and then download them into a csv file. all of this can be done in a single api call that involves multiple model turns and tool calls.

p.s. - you can also use responses statelessly by setting store=false.


What are my choices for using a custom tool? Does it come down to: function calling (single turn), MCP (multi-turn via Responses)? What else are my choices?

Why would anyone want to use Responses statelessly? Just trying to understand.


i think the original intent of responses api was also to unify the realtime experiences into responses - is that accurate?


we expect responses and realtime to be our 2 core api primitives long term — responses for turn by turn interactions and realtime for models requiring low latency bidirectional streams to/from the apps/models.


thank you for the correction!


This is very enlightening. You're right then, it does seem to partially be a strategic moat-building move by OpenAI




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: