New Normal
Preserves internal reasoning between calls to maintain context and improve accuracy. Instead of returning only a final answer, it returns tool calls and intermediate steps item by item, which makes execution order explicit and debugging easier. Hosted tools such as File Search, Code Interpreter, Web Search, Image Gen, and MCP run on OpenAI's internal infrastructure, reducing latency and round-trip costs. The chain of thought can be returned as encrypted reasoning items and preserved on the client, so the reasoning process continues securely across turns without server-side storage.
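A minimal sketch of carrying reasoning across turns by linking to the previous response. The helper name and the `o3` model choice are illustrative assumptions; with the official SDK the returned dict would be passed as `client.responses.create(**request)`.

```python
def build_followup_request(previous_response_id: str, user_input: str) -> dict:
    """Build kwargs for a follow-up Responses API call that links back
    to the previous turn instead of resending the whole history."""
    return {
        "model": "o3",  # illustrative; any reasoning model
        # Linking the previous response lets the server reuse the stored
        # reasoning items from that turn, avoiding context loss.
        "previous_response_id": previous_response_id,
        "input": user_input,
    }

request = build_followup_request("resp_abc123", "Continue the analysis.")
print(request["previous_response_id"])  # → resp_abc123
```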
In other words, Chat Completions stored only text strings; tool and session state such as Code Interpreter status, file handles, RAG indices, and web sessions was not preserved. In addition, a reasoning model like o3 may not maintain sufficient context from a single request alone. When using function calling, include either previous_response_id or all reasoning items in subsequent requests to prevent performance degradation. This applies to other situations as well, but it is especially critical after function calls.

Computer env
shell + container + orchestration + memory compaction
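As a rough sketch of wiring a hosted container into a Responses API request, the payload below attaches the Code Interpreter tool with an auto-provisioned container (the tool shape follows the published docs; the model name and prompt are placeholders, and with the official SDK this dict would be passed as `client.responses.create(**payload)`):

```python
# Illustrative request payload: the hosted Code Interpreter tool runs in
# a sandboxed container on OpenAI's infrastructure, so files and state
# persist inside the session rather than on the client.
payload = {
    "model": "gpt-5",  # placeholder model name
    "tools": [
        {
            "type": "code_interpreter",
            # "auto" asks the platform to create or reuse a container
            # for this conversation.
            "container": {"type": "auto"},
        }
    ],
    "input": "Plot a histogram of the uploaded CSV.",
}
print(payload["tools"][0]["type"])  # → code_interpreter
```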
From model to agent: Equipping the Responses API with a computer environment
How OpenAI built an agent runtime using the Responses API, shell tool, and hosted containers to run secure, scalable agents with files, tools, and state.
https://openai.com/index/equip-responses-api-computer-environment/

Responses API
OpenAI Platform
Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.
https://platform.openai.com/docs/quickstart?api-mode=responses

File input
https://platform.openai.com/docs/guides/pdf-files?api-mode=chat
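One way to supply a PDF is to inline it as a base64 data URL in an `input_file` content part, a sketch assuming the Responses-style message shape from the file-input guide (Chat Completions uses a slightly different part type); the path and prompt are placeholders:

```python
import base64

def pdf_message(path: str) -> dict:
    """Build a user message that inlines a PDF as a base64 data URL,
    paired with a text instruction. Path and prompt are placeholders."""
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode()
    return {
        "role": "user",
        "content": [
            {
                "type": "input_file",
                "filename": path,
                "file_data": f"data:application/pdf;base64,{data}",
            },
            {"type": "input_text", "text": "Summarize this document."},
        ],
    }
```

For large or reused files, uploading via the Files API and referencing a `file_id` avoids resending the bytes on every request.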

Why we built the Responses API
How the Responses API unlocks persistent reasoning, hosted tools, and multimodal workflows for GPT-5.
https://developers.openai.com/blog/responses-api/


Seonglae Cho