Model Library
Browse and deploy state-of-the-art AI models through the DevUp Gateway.
DeepSeek-V3.1 Terminus is an update to DeepSeek V3.1 that keeps the model's original capabilities while addressing issues reported by users, including language consistency and agent capabilities, and further optimizes performance in coding and search agents. It is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes, and it extends the DeepSeek-V3 base with a two-phase long-context training process. Users can switch reasoning on or off with the reasoning enabled boolean.

The model improves tool use, code generation, and reasoning efficiency, achieving performance comparable to DeepSeek-R1 on difficult benchmarks while responding more quickly. It supports structured tool calling, code agents, and search agents, making it suitable for research, coding, and agentic workflows.
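As a rough illustration of the reasoning toggle mentioned above, the sketch below builds a chat request body with reasoning switched on or off. The payload shape (`"reasoning": {"enabled": ...}`), the model slug, and the helper name `build_chat_request` are assumptions made for illustration, not the documented DevUp Gateway API; the gateway docs remain the source for the real field names.

```python
import json

# Hypothetical model identifier; the real slug on the gateway may differ.
MODEL_ID = "deepseek/deepseek-v3.1-terminus"


def build_chat_request(prompt: str, reasoning_enabled: bool) -> dict:
    """Build a chat-completion style request body (assumed shape)."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        # Assumed location of the "reasoning enabled" boolean.
        "reasoning": {"enabled": reasoning_enabled},
    }


if __name__ == "__main__":
    # Print a non-thinking request body for inspection; nothing is sent.
    print(json.dumps(build_chat_request("1+1=?", reasoning_enabled=False), indent=2))
```

Keeping the toggle as a single boolean in the request body means the same prompt can be sent in either mode without changing the messages themselves.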

DeepSeek-V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Compared to the previous version, this upgrade brings improvements in multiple aspects:
DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long-context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases. The 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended by 3.3x to 209B tokens. Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats.
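The name UE8M0 suggests an unsigned scale with 8 exponent bits and no mantissa bits, so every representable scale is a power of two. The sketch below assumes the E8M0 encoding from the OCP microscaling formats specification (bias 127, code 255 reserved for NaN) and rounds scales up to the next power of two. Whether DeepSeek-V3.1 training uses exactly this rounding rule is an assumption, and the function names are illustrative.

```python
import math

E8M0_BIAS = 127      # exponent bias assumed from the OCP microscaling E8M0 type
E8M0_MAX_CODE = 254  # code 255 is reserved for NaN in that encoding


def encode_ue8m0(scale: float) -> int:
    """Round a positive scale up to a power of two and return its 8-bit code."""
    if not (math.isfinite(scale) and scale > 0):
        raise ValueError("scale must be a positive finite number")
    mantissa, exponent = math.frexp(scale)  # scale == mantissa * 2**exponent
    # frexp returns mantissa in [0.5, 1); exactly 0.5 means scale is 2**(exponent - 1).
    power = exponent - 1 if mantissa == 0.5 else exponent
    return max(0, min(E8M0_MAX_CODE, power + E8M0_BIAS))


def decode_ue8m0(code: int) -> float:
    """Return the power-of-two scale represented by an 8-bit code."""
    if not 0 <= code <= E8M0_MAX_CODE:
        raise ValueError("code must be in the range 0..254")
    return 2.0 ** (code - E8M0_BIAS)
```

Rounding up is one possible design choice: it keeps values divided by the scale inside the representable range, at the cost of some precision.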
| Model | #Total Params | #Activated Params | Context Length | Download |
|---|---|---|---|---|
| DeepSeek-V3.1-Base | 671B | 37B | 128K | HuggingFace / ModelScope |
| DeepSeek-V3.1 | 671B | 37B | 128K | HuggingFace / ModelScope |
The details of our chat template are described in tokenizer_config.json and assets/chat_template.jinja. Here is a brief description.
With the given prefix, DeepSeek V3.1 generates responses to queries in non-thinking mode. Unlike DeepSeek V3, it introduces an additional token `</think>`.
By concatenating the context and the prefix, we obtain the correct prompt for the query.
The prefix of thinking mode is similar to DeepSeek-R1.
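A minimal sketch of how the context and the mode-specific prefix might be concatenated for a single-turn query. The special-token spellings are copied from the tokenizer example output at the end of this section, and `build_single_turn_prompt` is a hypothetical helper; the authoritative template remains assets/chat_template.jinja.

```python
# Token spellings as they appear in this document's example output.
BOS = "<|begin_of_sentence|>"
USER = "<|User|>"
ASSISTANT = "<|Assistant|>"


def build_single_turn_prompt(system_prompt: str, query: str, thinking: bool) -> str:
    """Concatenate the context with the thinking or non-thinking prefix."""
    context = f"{BOS}{system_prompt}{USER}{query}{ASSISTANT}"
    # Thinking mode opens a reasoning block; non-thinking mode closes it immediately.
    prefix = "<think>" if thinking else "</think>"
    return context + prefix


if __name__ == "__main__":
    print(build_single_turn_prompt("You are a helpful assistant", "1+1=?", thinking=False))
```

In practice `tokenizer.apply_chat_template` should be preferred, since it also handles multi-turn history and tool calls.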
The multi-turn template is the same as the non-thinking multi-turn chat template. This means the thinking content from previous turns is dropped, but the `</think>` token is retained in every turn of the context.
Toolcall is supported in non-thinking mode. The format is:
where the tool_description is:
```
## Tools
You have access to the following tools:
### {tool_name1}
Description: {description}
Parameters: {json.dumps(parameters)}
IMPORTANT: ALWAYS adhere to this exact format for tool use:
<|tool_calls_begin|><|tool_call_begin|>tool_call_name<|tool_sep|>tool_call_arguments<|tool_call_end|>[{additional_tool_calls}]<|tool_calls_end|>
Where:
- `tool_call_name` must be an exact match to one of the available tools
- `tool_call_arguments` must be valid JSON that strictly follows the tool's Parameters Schema
- For multiple tool calls, chain them directly without separators or spaces
```

Code-Agent: We support various code agent frameworks. Please refer to the tool-call format above to create your own code agents. An example is shown in assets/code_agent_trajectory.html.
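The tool-call format above can be exercised with a small renderer and parser. This is a sketch under stated assumptions: the helper names `render_tool_description` and `parse_tool_calls` are hypothetical, the tool dictionaries are assumed to carry `name`, `description`, and `parameters` keys, and the exact blank-line layout of the real template may differ from the single-newline join used here.

```python
import json
import re

CALLS_BEGIN = "<|tool_calls_begin|>"
CALLS_END = "<|tool_calls_end|>"
# One call: <|tool_call_begin|>name<|tool_sep|>arguments<|tool_call_end|>
CALL_RE = re.compile(
    r"<\|tool_call_begin\|>(.*?)<\|tool_sep\|>(.*?)<\|tool_call_end\|>",
    re.DOTALL,
)


def render_tool_description(tools: list) -> str:
    """Render the tool list in the layout shown above (line breaks assumed)."""
    lines = ["## Tools", "You have access to the following tools:"]
    for tool in tools:
        lines.append(f"### {tool['name']}")
        lines.append(f"Description: {tool['description']}")
        lines.append(f"Parameters: {json.dumps(tool['parameters'])}")
    return "\n".join(lines)


def parse_tool_calls(text: str, allowed_names: set) -> list:
    """Extract (name, arguments) pairs from a model response."""
    start = text.find(CALLS_BEGIN)
    if start == -1:
        return []  # plain answer, no tool calls
    end = text.find(CALLS_END, start + len(CALLS_BEGIN))
    if end == -1:
        return []  # unterminated block; treat as no calls
    body = text[start + len(CALLS_BEGIN):end]
    calls = []
    for name, raw_args in CALL_RE.findall(body):
        if name not in allowed_names:
            raise ValueError(f"unknown tool: {name!r}")
        # json.loads raises ValueError on malformed arguments.
        calls.append((name, json.loads(raw_args)))
    return calls
```

Validating the tool name and decoding the arguments at parse time mirrors the two constraints listed in the template: exact name match and valid JSON arguments. Checking the arguments against each tool's parameter schema is left out of this sketch.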
Search-Agent: We designed a specific tool-call format for search in thinking mode to support search agents. For complex questions that require accessing external or up-to-date information, DeepSeek-V3.1 can leverage a user-provided search tool through a multi-turn tool-calling process. Please refer to assets/search_tool_trajectory.html and assets/search_python_tool_trajectory.html for the detailed template.
| Category | Benchmark (Metric) | DeepSeek V3.1-NonThinking | DeepSeek V3 0324 | DeepSeek V3.1-Thinking | DeepSeek R1 0528 |
|---|---|---|---|---|---|
| General | MMLU-Redux (EM) | 91.8 | 90.5 | 93.7 | 93.4 |
| General | MMLU-Pro (EM) | 83.7 | 81.2 | 84.8 | 85.0 |
| General | GPQA-Diamond (Pass@1) | 74.9 | 68.4 | 80.1 | 81.0 |
| General | Humanity's Last Exam (Pass@1) | - | - | 15.9 | 17.7 |
| Search Agent | BrowseComp | - | - | 30.0 | 8.9 |
| Search Agent | BrowseComp_zh | - | - | 49.2 | 35.7 |
| Search Agent | Humanity's Last Exam (Python + Search) | - | - | 29.8 | 24.8 |
| Search Agent | SimpleQA | - | - | 93.4 | 92.3 |
| Code | LiveCodeBench (2408-2505) (Pass@1) | 56.4 | 43.0 | 74.8 | 73.3 |
| Code | Codeforces-Div1 (Rating) | - | - | 2091 | 1930 |
| Code | Aider-Polyglot (Acc.) | 68.4 | 55.1 | 76.3 | 71.6 |
| Code Agent | SWE Verified (Agent mode) | 66.0 | 45.4 | - | 44.6 |
| Code Agent | SWE-bench Multilingual (Agent mode) | 54.5 | 29.3 | - | 30.5 |
| Code Agent | Terminal-bench (Terminus 1 framework) | 31.3 | 13.3 | - | 5.7 |
| Math | AIME 2024 (Pass@1) | 66.3 | 59.4 | 93.1 | 91.4 |
| Math | AIME 2025 (Pass@1) | 49.8 | 51.3 | 88.4 | 87.5 |
| Math | HMMT 2025 (Pass@1) | 33.5 | 29.2 | 84.2 | 79.4 |
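To make the thinking-mode comparison with DeepSeek R1 0528 easier to read, the sketch below computes per-benchmark differences for a few rows copied from the table above. The row selection and the helper name `score_gaps` are illustrative choices, not part of the original evaluation.

```python
# (DeepSeek V3.1-Thinking, DeepSeek R1 0528) scores copied from the table above.
SCORES = {
    "GPQA-Diamond": (80.1, 81.0),
    "LiveCodeBench": (74.8, 73.3),
    "AIME 2024": (93.1, 91.4),
    "AIME 2025": (88.4, 87.5),
    "HMMT 2025": (84.2, 79.4),
}


def score_gaps(scores: dict) -> dict:
    """Return V3.1-Thinking minus R1 for each benchmark, rounded to one decimal."""
    return {name: round(v31 - r1, 1) for name, (v31, r1) in scores.items()}


if __name__ == "__main__":
    for name, gap in score_gaps(SCORES).items():
        print(f"{name}: {gap:+.1f}")
```

A positive gap means V3.1-Thinking scored higher on that row; a negative gap means R1 0528 did.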
```python
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")

messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "<think>Hmm</think>I am DeepSeek"},
    {"role": "user", "content": "1+1=?"},
]

# Thinking mode: earlier reasoning ("Hmm") is dropped and the prompt ends with <think>.
tokenizer.apply_chat_template(messages, tokenize=False, thinking=True, add_generation_prompt=True)
# '<|begin_of_sentence|>You are a helpful assistant<|User|>Who are you?<|Assistant|></think>I am DeepSeek<|end_of_sentence|><|User|>1+1=?<|Assistant|><think>'

# Non-thinking mode: the prompt ends with </think>, so the model answers directly.
tokenizer.apply_chat_template(messages, tokenize=False, thinking=False, add_generation_prompt=True)
# '<|begin_of_sentence|>You are a helpful assistant<|User|>Who are you?<|Assistant|></think>I am DeepSeek<|end_of_sentence|><|User|>1+1=?<|Assistant|></think>'
```