Chat Completions
Reasoning Models
Configure chain-of-thought reasoning with reasoning_effort and the reasoning parameter.
Some models on DEVUP AI support extended chain-of-thought reasoning — the model “thinks through” a problem step by step before producing a final answer. By default, reasoning models produce a reasoning trace alongside the response. You can control this behavior with the reasoning_effort parameter.
Supported models
Reasoning is available on models that support chain-of-thought, including:
deepseek-ai/DeepSeek-R1
Check the model catalog for the latest list.
Controlling reasoning effort
Use reasoning_effort to control how much reasoning the model performs. Higher effort means deeper thinking but more output tokens and higher latency.
```python
from openai import OpenAI

client = OpenAI(
    api_key="$DEVUP_API_KEY",
    base_url="https://api.devupai.com/v1",
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    extra_body={"reasoning_effort": "high"},
)

print(response.choices[0].message.content)
```

Disabling reasoning
Set reasoning_effort to "none" to disable chain-of-thought entirely. The model will respond directly without a reasoning trace — faster and cheaper.
```python
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    extra_body={"reasoning_effort": "none"},
)
```

The reasoning parameter
For more granular control, use the reasoning object instead of reasoning_effort:
```python
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Solve this step by step: 15! / 13!"}],
    extra_body={
        "reasoning": {
            "effort": "medium",
            "enabled": True,
        }
    },
)
```

Setting "enabled": false is equivalent to reasoning_effort: "none".
When to use reasoning
| Use case | Effort level |
|---|---|
| Math, logic, and code problems | "high" (default for reasoning models) |
| Multi-step analysis | "medium" or "high" |
| Simple Q&A, translation, summarization | "none" |
| Cost-sensitive workloads | "none" or "low" |
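The table above can be sketched as a small dispatch helper. Note that the task-category names here are illustrative labels of our own, not API values; only the effort strings come from the API:

```python
# Illustrative mapping of task categories to reasoning_effort values,
# following the table above. Category names are our own invention.
EFFORT_BY_TASK = {
    "math": "high",
    "logic": "high",
    "code": "high",
    "analysis": "medium",
    "qa": "none",
    "translation": "none",
    "summarization": "none",
}

def pick_effort(task: str, cost_sensitive: bool = False) -> str:
    """Choose a reasoning_effort value for a task category."""
    effort = EFFORT_BY_TASK.get(task, "medium")
    # Cost-sensitive workloads cap the effort at "low".
    if cost_sensitive and effort not in ("none", "low"):
        return "low"
    return effort
```

The returned string can be passed directly as `extra_body={"reasoning_effort": pick_effort(...)}`.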
Supported parameters
| Parameter | Type | Description |
|---|---|---|
| reasoning_effort | string | Controls reasoning depth: "none", "low", "medium", "high". |
| reasoning | object | Fine-grained reasoning config. |
| reasoning.effort | string | Same values as reasoning_effort. |
| reasoning.enabled | boolean | Explicitly enable or disable reasoning. |
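As a sketch of how these parameters combine, the helper below builds an extra_body dict from the values in the table. It is not an official client helper, just an illustration of the two request shapes:

```python
VALID_EFFORTS = ("none", "low", "medium", "high")

def reasoning_body(effort=None, enabled=None):
    """Build an extra_body dict for the reasoning parameters above.

    Uses the flat reasoning_effort form when only an effort is given,
    and the reasoning object when enabled is set explicitly.
    """
    if effort is not None and effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}")
    if enabled is None:
        return {} if effort is None else {"reasoning_effort": effort}
    body = {"enabled": enabled}
    if effort is not None:
        body["effort"] = effort
    return {"reasoning": body}
```

For example, `reasoning_body(enabled=False)` produces the same disable-reasoning request body shown earlier with the reasoning object.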
Notes
- Streaming support: Reasoning models support streaming; the reasoning trace is returned in the reasoning_content delta field before the final answer begins.
- Token pricing: Tokens generated during the reasoning phase are billed as standard output tokens.
- Usage telemetry: The API response reports completion_tokens_details, which contains reasoning_tokens, so you can track how much the model "thought".
- Temperature constraints: Some reasoning models enforce a fixed temperature to maintain logical integrity, ignoring any custom temperature parameter you pass.
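A minimal streaming sketch, assuming the reasoning trace arrives as a reasoning_content attribute on each delta (as described above) while the answer arrives in content:

```python
def split_stream(deltas):
    """Accumulate a stream of deltas into (reasoning_trace, answer).

    Each delta may carry `reasoning_content` (the chain of thought)
    and/or `content` (the final answer); either may be None or absent.
    """
    reasoning, answer = [], []
    for delta in deltas:
        trace = getattr(delta, "reasoning_content", None)
        if trace:
            reasoning.append(trace)
        if getattr(delta, "content", None):
            answer.append(delta.content)
    return "".join(reasoning), "".join(answer)

# With the OpenAI client, the deltas come from a streaming call:
#
#   stream = client.chat.completions.create(
#       model="deepseek-ai/DeepSeek-R1",
#       messages=[{"role": "user", "content": "Solve 15! / 13!"}],
#       extra_body={"reasoning_effort": "medium"},
#       stream=True,
#   )
#   trace, answer = split_stream(c.choices[0].delta for c in stream)
```

Separating the trace this way lets you log or hide the chain of thought while showing only the final answer to users.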