Updated 2026-06-11

DeepSeek Thinking Mode and Tool Calls: use reasoning correctly in multi-turn apps

DeepSeek's official Thinking Mode guide is more opinionated than many quick snippets suggest. Thinking is enabled by default, some common sampling parameters stop mattering, and `reasoning_content` becomes a real protocol requirement when tool calls enter the conversation. If you ignore that rule, the resulting 400 error is usually an implementation bug, not a model problem.

1. Thinking mode is on by default

DeepSeek's official Thinking Mode guide says the toggle defaults to `enabled`. That matters because teams often assume reasoning is an optional premium mode they have to turn on manually.

The same guide says the default effort is `high` for regular requests and can automatically become `max` for some complex agent requests such as Claude Code or OpenCode. This is a strong signal that DeepSeek expects real agent workflows to lean on reasoning rather than bypass it.

Sources checked

2. Some classic sampling knobs stop mattering

DeepSeek's docs say thinking mode does not support `temperature`, `top_p`, `presence_penalty`, or `frequency_penalty`. Setting them will not raise an error, but they also will not change model behavior in thinking mode.

That is a subtle but important operations detail. Many teams think they are tuning thinking quality by changing these fields, when the effective control is really the effort level and whether thinking is enabled at all.

3. reasoning_content is not the same as final content

In thinking mode, DeepSeek returns chain-of-thought material in `reasoning_content` alongside the final answer in `content`. The docs then split multi-turn handling into two cases: no tool call versus tool call.

If there was no tool call between two user turns, the docs say prior `reasoning_content` does not need to be passed back and will be ignored if you do send it. That makes ordinary multi-turn chat simpler than some developers expect.

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=messages,
    reasoning_effort="high",
    extra_body={"thinking": {"type": "enabled"}},
)

reasoning_content = response.choices[0].message.reasoning_content
content = response.choices[0].message.content

4. Tool calls change the replay rule completely

The official guide is explicit here: if a turn performs tool calls, the intermediate assistant `reasoning_content` must be fully passed back to the API in all subsequent requests.

If you do not replay that reasoning context correctly, DeepSeek says the API will return a 400 error. This is one of the highest-signal debugging rules in the whole docs set because many agent loops silently drop intermediate reasoning while serializing tool state.

In practice, your DeepSeek tool-call adapter needs to store three things together: the assistant reasoning block, the tool invocation, and the tool result. Treat them as one transaction, not as loose fragments.

Sources checked

5. OpenAI-format versus Anthropic-format effort controls

DeepSeek documents different control shapes for each protocol. In OpenAI format, the docs use `extra_body.thinking` plus the top-level `reasoning_effort` field. In Anthropic format, the guide maps effort through `output_config.effort`.

That means teams should avoid copying one protocol's request body into another and assuming the same keys will carry over cleanly.

Thinking controls by protocol
ProtocolThinking toggleEffort control
OpenAI formatextra_body.thinkingreasoning_effort
Anthropic formatProtocol-level thinking supportoutput_config.effort

FAQ

Is DeepSeek thinking mode enabled by default?

Yes. The official Thinking Mode guide says the toggle defaults to enabled.

Does temperature work in thinking mode?

No in practice. DeepSeek says `temperature`, `top_p`, `presence_penalty`, and `frequency_penalty` are unsupported in thinking mode and have no effect.

When can I ignore reasoning_content?

If there was no tool call between user turns, the docs say prior `reasoning_content` does not need to be passed back and will be ignored if you send it.

When must I replay reasoning_content?

When the turn performed tool calls. DeepSeek says that reasoning content must be passed back in all subsequent requests.

Why am I getting a 400 after a tool call?

One common cause is that your app dropped or mangled the required `reasoning_content` when replaying the conversation after the tool step.

DeepSeek thinking mode is not just a switch for better answers. It is part of the protocol contract for multi-turn agent work. Once tool calls appear, `reasoning_content` becomes state you must preserve, and teams that ignore that rule will eventually debug a self-inflicted 400.

Related model comparisons

Continue from this guide into structured DeepSeek-first comparison pages with model tables, routing advice, and pricing context.