The completions API endpoint received its final update in July 2023 and has a different interface than the newer chat completions endpoint. Instead of the input being a list of messages, the input is a freeform text string called a prompt.

JSON mode will not guarantee the output matches any specific schema, only that it is valid JSON and parses without errors.
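For illustration, here is a minimal sketch of both interfaces and of JSON mode using the official openai Python SDK; the model names and prompt text are placeholder assumptions, not recommendations.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Legacy completions endpoint: the input is a single freeform "prompt" string.
legacy = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # assumed legacy-capable model
    prompt="Write a one-line summary of prompt caching.",
)
print(legacy.choices[0].text)

# Chat completions endpoint: the input is a list of messages.
chat = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed chat model
    messages=[{"role": "user", "content": "Write a one-line summary of prompt caching."}],
)
print(chat.choices[0].message.content)

# JSON mode: guarantees syntactically valid JSON, not any particular schema.
json_mode = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Reply with a JSON object."},
        {"role": "user", "content": "List three fields a user profile might have."},
    ],
)
print(json_mode.choices[0].message.content)
```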
The chat completions format accounts for 97% of API GPT usage.

Chat Completions API
Predicted Outputs
Any tokens provided in predictions that are not part of the final completion will be charged at completion token rates. This means the more an output differs from the prediction, the more it will cost. Using Predicted Outputs can reduce processing time.
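A sketch of how a prediction is passed through the chat completions prediction parameter; the model name and the code snippet being edited are assumptions for illustration only.

```python
from openai import OpenAI

client = OpenAI()

# Existing code that we expect to appear mostly unchanged in the response.
existing_code = """
class User:
    first_name: str
    last_name: str
    username: str
"""

completion = client.chat.completions.create(
    model="gpt-4o",  # assumed; use a model that supports Predicted Outputs
    messages=[
        {"role": "user", "content": "Rename the username field to email. Return only the code."},
        {"role": "user", "content": existing_code},
    ],
    # The prediction is the text the completion is expected to resemble; any
    # predicted tokens that are rejected are still billed at completion rates.
    prediction={"type": "content", "content": existing_code},
)
print(completion.choices[0].message.content)
```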
Models
Prompt Caching for cost reduction and latency improvement (applied automatically)