Updated 2026-07-01
DeepSeek reasoning model guide: remove `reasoning_content` from replayed inputs and stop tweaking knobs that do nothing
DeepSeek's official reasoning-model guide is more specific than many wrapper tutorials. It says `max_tokens` covers both the chain-of-thought (CoT) segment and the final answer, it names which familiar OpenAI-style parameters have no effect and which hard-fail, and it warns that sending `reasoning_content` back in later input messages triggers a 400 error. This page is therefore aimed at teams debugging DeepSeek reasoning flows, not at readers looking for a generic 'thinking model' explainer.
2. Some familiar sampling parameters do nothing here
DeepSeek explicitly lists `temperature`, `top_p`, `presence_penalty`, and `frequency_penalty` as unsupported for the reasoning model. The compatibility rule is subtle: setting those fields does not throw an error, but it also has no effect.
That matters because many teams keep tweaking those knobs while debugging output quality, then misread a no-op change as proof that the model is unstable. Under the official contract, those fields are not live controls on this reasoning route.
The same page is stricter about `logprobs` and `top_logprobs`: DeepSeek says those settings do trigger an error.
| Parameter | Official behavior | What to do |
|---|---|---|
| `temperature` / `top_p` | Accepted for compatibility but no effect | Remove them from reasoning templates |
| `presence_penalty` / `frequency_penalty` | Accepted for compatibility but no effect | Do not treat them as tuning levers |
| `logprobs` / `top_logprobs` | Rejected with an error | Disable them on reasoning routes |
| `max_tokens` | Counts CoT + final answer | Budget for reasoning, not just visible output |
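The table can be applied mechanically before a request leaves your code. The following is a minimal sketch, not part of any DeepSeek SDK: `sanitizeReasoningParams`, the two field lists, and the model name in the usage note are hypothetical names chosen for illustration. It removes the fields the table marks as inert or rejected and reports which ones it dropped.

```javascript
// Hypothetical helper; the field lists mirror the table above.
const INERT_FIELDS = ["temperature", "top_p", "presence_penalty", "frequency_penalty"];
const REJECTED_FIELDS = ["logprobs", "top_logprobs"];

// Returns a copy of `params` without inert or rejected fields,
// plus the names of the fields that were removed (useful for logging).
function sanitizeReasoningParams(params) {
  const cleaned = { ...params };
  const removed = [];
  for (const field of [...INERT_FIELDS, ...REJECTED_FIELDS]) {
    if (field in cleaned) {
      delete cleaned[field];
      removed.push(field);
    }
  }
  return { cleaned, removed };
}
```

Calling `sanitizeReasoningParams({ model: "example-reasoning-model", temperature: 0.2, logprobs: true })` would return a request without `temperature` and `logprobs`. Logging `removed` makes it visible when a shared template is still carrying knobs that this route ignores or rejects.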
3. Never send `reasoning_content` back in the next request
DeepSeek's official page is direct here: if the `reasoning_content` field appears in the sequence of input messages, the API returns a 400 error.
That means the replay rule is not 'store everything and send it back.' The safe pattern is narrower: keep the assistant's final `content` in history, but strip out `reasoning_content` before the next API request.
This is one of the easiest DeepSeek reasoning bugs to self-inflict because developers often mirror entire response objects back into their message store.
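If a message store already contains mirrored response objects, one option is to sanitize the history at send time. This is a minimal sketch under that assumption; `stripReasoningContent` is a hypothetical helper name, and the message shape (`role`, `content`, optional `reasoning_content`) is taken from the fields named in this guide.

```javascript
// Hypothetical helper: returns a new array in which no message
// carries a reasoning_content field. The input array is not mutated.
function stripReasoningContent(messages) {
  return messages.map((message) => {
    if (!("reasoning_content" in message)) return message;
    // Drop reasoning_content, keep every other field as-is.
    const { reasoning_content, ...rest } = message;
    return rest;
  });
}
```

Running this immediately before building the request payload keeps the stored history intact for logging while ensuring the outgoing messages contain only fields that are safe to replay.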
const assistantMessage = response.choices[0].message;
messages.push({
  role: "assistant",
  content: assistantMessage.content,
});
// Do not replay assistantMessage.reasoning_content in the next request.
4. Streaming separates reasoning tokens from answer tokens
DeepSeek's streaming example shows `delta.reasoning_content` arriving separately from ordinary `delta.content`. That is useful for observability and UI design because a client can collect internal reasoning tokens and visible answer tokens independently.
The important boundary is still the same as the non-streaming path: collect `reasoning_content` if you need it for logging or debugging, but do not concatenate it back into the next request payload.
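One way to keep the two token streams apart on the client is to accumulate them into separate buffers. The sketch below is illustrative only: `applyDelta` and `collectStream` are hypothetical names, and the chunk shape `chunk.choices[0].delta` is an assumption based on the OpenAI-style layout used elsewhere on this page.

```javascript
// Pure reducer: folds one streaming delta into the running state.
// Missing or null fields contribute nothing.
function applyDelta(state, delta) {
  return {
    reasoning: state.reasoning + (delta.reasoning_content ?? ""),
    answer: state.answer + (delta.content ?? ""),
  };
}

// Consumes any async iterable of chunks.
// Assumed chunk shape: { choices: [{ delta: { ... } }] }.
async function collectStream(stream) {
  let state = { reasoning: "", answer: "" };
  for await (const chunk of stream) {
    const delta = chunk.choices?.[0]?.delta ?? {};
    state = applyDelta(state, delta);
  }
  return state;
}
```

When the stream ends, only `state.answer` belongs in the next assistant message; `state.reasoning` can go to logs or a debug panel.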
If your current bug involves tool replay rather than plain reasoning replay, continue with `/guides/deepseek-thinking-mode-tool-calls` next.
5. Treat this as a protocol contract, not a prompt-style preference
The official reasoning-model guide is really a protocol document. It tells you which fields are inert, which ones are rejected, how much output the route can produce, and which response field must stay out of subsequent input messages.
That is why this page solves a different support problem from the broader thinking-mode or chat-history guides. The failure class here is not 'the model forgot context'; it is 'the caller violated the reasoning-route contract.'
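Because the failure class is a contract violation, it can be caught before the request is sent. The following is a rough preflight sketch, not an official validator: `findContractViolations` and its return shape are hypothetical, the `maxTokensCeiling` argument must be supplied by the caller from the current DeepSeek documentation, and flagging a field whenever it is present (regardless of its value) is a conservative assumption.

```javascript
// Hypothetical preflight check for the rules described in this guide.
// Returns a list of { level, issue } entries; an empty list means no findings.
function findContractViolations(request, maxTokensCeiling) {
  const findings = [];

  // reasoning_content in input messages leads to a 400 per the guide.
  (request.messages ?? []).forEach((message, index) => {
    if ("reasoning_content" in message) {
      findings.push({ level: "error", issue: `messages[${index}] contains reasoning_content` });
    }
  });

  // Fields the guide says are rejected with an error.
  for (const field of ["logprobs", "top_logprobs"]) {
    if (request[field] !== undefined) {
      findings.push({ level: "error", issue: `${field} is rejected on this route` });
    }
  }

  // Fields the guide says are accepted but inert.
  for (const field of ["temperature", "top_p", "presence_penalty", "frequency_penalty"]) {
    if (request[field] !== undefined) {
      findings.push({ level: "warning", issue: `${field} has no effect on this route` });
    }
  }

  // max_tokens covers CoT plus final answer; the ceiling is caller-supplied.
  if (typeof request.max_tokens === "number" && request.max_tokens > maxTokensCeiling) {
    findings.push({ level: "warning", issue: "max_tokens exceeds the configured ceiling" });
  }

  return findings;
}
```

Separating `error` from `warning` mirrors the distinction the guide draws between fields that hard-fail and fields that are silently ignored.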
FAQ
Why does DeepSeek return a 400 when I replay `reasoning_content`?
DeepSeek's official reasoning-model guide says `reasoning_content` must not appear in input messages. Strip it before the next request.
Does `temperature` work on DeepSeek's reasoning model?
No. The official page says `temperature` is accepted only for compatibility and has no effect on this route.
What is the DeepSeek reasoning-model `max_tokens` limit?
DeepSeek documents a default of 32K and a maximum of 64K, and that budget includes the CoT segment plus the final answer.
Do `logprobs` and `top_logprobs` work on the reasoning model?
No. DeepSeek says those settings will trigger an error on the reasoning-model route.
Does this page create a new purchasable plan on /pricing?
No. It is API behavior guidance only and does not imply any new stocked Coding Plan inventory.
The practical DeepSeek reasoning-model rule is simple: budget `max_tokens` for both CoT and answer, delete inert sampling knobs from the request, and never replay `reasoning_content` back into input messages.