
Reasoning Models

Configure chain-of-thought reasoning with reasoning_effort and the reasoning parameter.

Some models on DevUp AI support extended chain-of-thought reasoning: the model "thinks through" a problem step by step before producing a final answer. By default, reasoning models produce a reasoning trace alongside the response. You can control this behavior with the reasoning_effort parameter.

Supported models

Reasoning is available on models that support chain-of-thought, including:

  • deepseek-ai/DeepSeek-R1

Check the model catalog for the latest list.

Controlling reasoning effort

Use reasoning_effort to control how much reasoning the model performs. Higher effort means deeper thinking but more output tokens and higher latency.

from openai import OpenAI

client = OpenAI(
    api_key="$DEVUP_API_KEY",
    base_url="https://api.devupai.com/v1",
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    extra_body={"reasoning_effort": "high"},
)

print(response.choices[0].message.content)

Disabling reasoning

Set reasoning_effort to "none" to disable chain-of-thought entirely. The model responds directly, without a reasoning trace, which is faster and cheaper.

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    extra_body={"reasoning_effort": "none"},
)

The reasoning parameter

For more granular control, use the reasoning object instead of reasoning_effort:

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Solve this step by step: 15! / 13!"}],
    extra_body={
        "reasoning": {
            "effort": "medium",
            "enabled": True,
        }
    },
)

Setting "enabled": False is equivalent to "reasoning_effort": "none".
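The equivalence can be sketched by comparing the two request shapes directly; the payloads below mirror the extra_body values from the examples on this page, and the helper function is illustrative, not part of any SDK:

```python
# Sketch: the two ways of disabling reasoning carry the same meaning.
# Payload shapes follow the examples above; the helper is hypothetical.

flat = {"reasoning_effort": "none"}
nested = {"reasoning": {"effort": "none", "enabled": False}}

def reasoning_disabled(extra_body: dict) -> bool:
    """Return True if the request options turn chain-of-thought off."""
    if extra_body.get("reasoning_effort") == "none":
        return True
    reasoning = extra_body.get("reasoning", {})
    return reasoning.get("enabled") is False or reasoning.get("effort") == "none"

print(reasoning_disabled(flat), reasoning_disabled(nested))  # True True
```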

When to use reasoning

| Use case | Effort level |
| --- | --- |
| Math, logic, and code problems | "high" (default for reasoning models) |
| Multi-step analysis | "medium" or "high" |
| Simple Q&A, translation, summarization | "none" |
| Cost-sensitive workloads | "none" or "low" |
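As a rough sketch, the guidance in the table above can be encoded as a lookup; the task categories and the fallback default here are illustrative choices, not part of the API:

```python
# Illustrative mapping from task category to a reasoning_effort value,
# following the guidance in the table above. Category names are hypothetical.
EFFORT_BY_TASK = {
    "math": "high",
    "code": "high",
    "analysis": "medium",
    "qa": "none",
    "translation": "none",
    "summarization": "none",
}

def pick_effort(task: str, cost_sensitive: bool = False) -> str:
    # Fall back to "medium" for unlisted task types (an assumed default).
    effort = EFFORT_BY_TASK.get(task, "medium")
    # Cost-sensitive workloads prefer "none" or "low".
    if cost_sensitive and effort != "none":
        effort = "low"
    return effort

print(pick_effort("math"))                        # high
print(pick_effort("qa"))                          # none
print(pick_effort("math", cost_sensitive=True))   # low
```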

Supported parameters

| Parameter | Type | Description |
| --- | --- | --- |
| reasoning_effort | string | Controls reasoning depth: "none", "low", "medium", "high". |
| reasoning | object | Fine-grained reasoning config. |
| reasoning.effort | string | Same values as reasoning_effort. |
| reasoning.enabled | boolean | Explicitly enable or disable reasoning. |

Notes

  • Streaming Support: Reasoning models support streaming; the reasoning trace is returned in the reasoning_content delta block before the final answer begins.
  • Token Pricing: Tokens generated during the reasoning phase are billed as standard output tokens.
  • Usage Telemetry: The API response reports completion_tokens_details, which includes reasoning_tokens, so you can track how much the model "thought".
  • Temperature Constraints: Some reasoning models may enforce fixed temperatures to maintain logical integrity, ignoring any custom temperature parameter passed.
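The streaming note above can be sketched with a mock stream: the delta attribute names (reasoning_content, content) follow this page's description, and the chunks below are stand-ins for what client.chat.completions.create(..., stream=True) would yield:

```python
from types import SimpleNamespace

# Mock chunks standing in for a real streamed response; per the note above,
# reasoning_content deltas arrive before the final answer's content deltas.
def mock_stream():
    for text in ["Let me think. ", "2 + 2 is 4."]:
        delta = SimpleNamespace(reasoning_content=text, content=None)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
    for text in ["The answer ", "is 4."]:
        delta = SimpleNamespace(reasoning_content=None, content=text)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

reasoning, answer = [], []
for chunk in mock_stream():  # with a real client: for chunk in stream:
    delta = chunk.choices[0].delta
    if getattr(delta, "reasoning_content", None):
        reasoning.append(delta.reasoning_content)
    elif getattr(delta, "content", None):
        answer.append(delta.content)

print("reasoning:", "".join(reasoning))
print("answer:", "".join(answer))
```

The getattr guards make the loop safe against chunks that carry only one of the two fields, which is how interleaved reasoning and answer deltas typically arrive.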