New Normal
Preserves internal reasoning between calls to maintain context and improve accuracy. Instead of returning only a final answer, it returns tool calls and intermediate steps item by item, which makes execution order explicit and debugging easier. Hosted tools such as File Search, Code Interpreter, Web Search, Image Gen, and MCP run on OpenAI's internal infrastructure, reducing latency and round-trip costs. The chain of thought can be returned as encrypted reasoning items and preserved on the client, so the reasoning process continues securely across turns without server-side storage.
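A minimal sketch of carrying reasoning across turns by linking to the previous response. The helper name and the `o3` model choice are illustrative assumptions; with the official SDK the returned dict would be passed as `client.responses.create(**request)`.

```python
def build_followup_request(previous_response_id: str, user_input: str) -> dict:
    """Build kwargs for a follow-up Responses API call that links back
    to the previous turn instead of resending the whole history."""
    return {
        "model": "o3",  # illustrative; any reasoning model
        # Linking the previous response lets the server reuse the stored
        # reasoning items from that turn, avoiding context loss.
        "previous_response_id": previous_response_id,
        "input": user_input,
    }

request = build_followup_request("resp_abc123", "Continue the analysis.")
print(request["previous_response_id"])  # → resp_abc123
```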
In other words, Chat Completions stored only text strings; tool and session state such as Code Interpreter status, file handles, RAG indices, and web sessions was not preserved. In addition, a reasoning model like o3 may not maintain sufficient context from a single request alone. When using function calling, include either previous_response_id or all reasoning items in subsequent requests to prevent performance degradation. This applies to other situations as well, but it is especially critical after function calls.

Computer env
shell + container + orchestration + memory compaction
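As a rough sketch of wiring a hosted container into a Responses API request, the payload below attaches the Code Interpreter tool with an auto-provisioned container (the tool shape follows the published docs; the model name and prompt are placeholders, and with the official SDK this dict would be passed as `client.responses.create(**payload)`):

```python
# Illustrative request payload: the hosted Code Interpreter tool runs in
# a sandboxed container on OpenAI's infrastructure, so files and state
# persist inside the session rather than on the client.
payload = {
    "model": "gpt-5",  # placeholder model name
    "tools": [
        {
            "type": "code_interpreter",
            # "auto" asks the platform to create or reuse a container
            # for this conversation.
            "container": {"type": "auto"},
        }
    ],
    "input": "Plot a histogram of the uploaded CSV.",
}
print(payload["tools"][0]["type"])  # → code_interpreter
```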
From model to agent: Equipping the Responses API with a computer environment
How OpenAI built an agent runtime using the Responses API, shell tool, and hosted containers to run secure, scalable agents with files, tools, and state.
https://openai.com/index/equip-responses-api-computer-environment/

Responses API
OpenAI Platform
Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.
https://platform.openai.com/docs/quickstart?api-mode=responses

File input
https://platform.openai.com/docs/guides/pdf-files?api-mode=chat
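One way to supply a PDF is to inline it as a base64 data URL in an `input_file` content part, a sketch assuming the Responses-style message shape from the file-input guide (Chat Completions uses a slightly different part type); the path and prompt are placeholders:

```python
import base64

def pdf_message(path: str) -> dict:
    """Build a user message that inlines a PDF as a base64 data URL,
    paired with a text instruction. Path and prompt are placeholders."""
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode()
    return {
        "role": "user",
        "content": [
            {
                "type": "input_file",
                "filename": path,
                "file_data": f"data:application/pdf;base64,{data}",
            },
            {"type": "input_text", "text": "Summarize this document."},
        ],
    }
```

For large or reused files, uploading via the Files API and referencing a `file_id` avoids resending the bytes on every request.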

Why we built the Responses API
How the Responses API unlocks persistent reasoning, hosted tools, and multimodal workflows for GPT-5.
https://developers.openai.com/blog/responses-api/


Seonglae Cho