It was never really clear what the difference between the chat and responses API...

brittlewis12 · 2025-05-21T16:50:57 1747846257

chat completions is stateless — you must provide the entire conversation history with each new message; openai stores nothing (at least nothing that the downstream product _can use_) beyond the life of the request.

responses api, by contrast, is stateful — only send the latest message, and openai stores the conversation history, while keeping track of other details on behalf of the calling app, like parallel tool call states.

but i would say that since chat completions has become an informal industry standard, the responses api feels like an attempt by openai to break away from that shared interface, because it is so easy to swap out providers with nothing more than a base url and a model id, to a paradigm which requires data migration as well as replacement infrastructure (containers for code execution, for example).

nknj · 2025-05-21T17:20:40 1747848040

one additional difference between chat and responses is the number model turns a single api call can make. chat completions is a single turn api primitive -- which means it can talk to the model just once. responses is capable of making multiple model turns and tool calls in a single api call.

for example, you can give the responses api access to 3 tools: a vector store with some user memories (file_search), the shopify mcp server, and code_interpreter. you can then ask it to look up some user memories, find relevant items in the shopify mcp store, and then download them into a csv file. all of this can be done in a single api call that involves multiple model turns and tool calls.

p.s. - you can also use responses statelessly by setting store=false.

OutOfHere · 2025-05-21T17:48:59 1747849739

What are my choices for using a custom tool? Does it come down to: function calling (single turn), MCP (multi-turn via Responses)? What else are my choices?

Why would anyone want to use Responses statelessly? Just trying to understand.

swyx · 2025-05-21T17:55:22 1747850122

i think the original intent of responses api was also to unify the realtime experiences into responses - is that accurate?

nknj · 2025-05-22T04:05:28 1747886728

we expect responses and realtime to be our 2 core api primitives long term — responses for turn by turn interactions and realtime for models requiring low latency bidirectional streams to/from the apps/models.

swyx · 2025-05-22T17:28:44 1747934924

thank you for the correction!

andrewrn · 2025-05-24T18:05:40 1748109940

This is very enlightening. You're right then, it does seem to partially be a strategic moat-building move by OpenAI