Updated 2026-06-27

DeepSeek stateless chat history: replay the full conversation or your app will forget

One of the easiest DeepSeek API mistakes is assuming `/chat/completions` behaves like a server-hosted conversation thread. DeepSeek's official multi-round conversation guide says the opposite: the endpoint is stateless, so every later turn must resend the prior conversation history. That means many 'DeepSeek forgot the previous message' bugs are really caller-side history bugs. This page keeps the fix narrow, official, and directly tied to DeepSeek's own example.

1. What DeepSeek officially means by stateless

DeepSeek says the `/chat/completions` API does not record conversation context on the server. The model sees only the `messages` array you send on the current request.

That is the first production rule to internalize. If the second turn lacks the first turn and the first assistant reply, DeepSeek has nothing reliable to ground the follow-up answer on.
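
As a minimal sketch of what that means for the second request, the two payloads below differ only in how much history they carry. The assistant text is hand-written for illustration rather than real model output, and the model name is simply the one used in this page's sample.

```python
# Turn-two payloads: what the endpoint can see in each case.
first_user = {"role": "user", "content": "What's the highest mountain in the world?"}
# Hand-written stand-in for the round-one reply (illustrative, not real model output).
first_assistant = {
    "role": "assistant",
    "content": "Mount Everest is the highest mountain in the world.",
}
follow_up = {"role": "user", "content": "What is the second?"}

# Incomplete: only the new question is sent, so "the second" has no referent.
incomplete_payload = {"model": "deepseek-v4-pro", "messages": [follow_up]}

# Replayed: the earlier user turn and assistant reply travel with the follow-up.
replayed_payload = {
    "model": "deepseek-v4-pro",
    "messages": [first_user, first_assistant, follow_up],
}

for name, payload in [("incomplete", incomplete_payload), ("replayed", replayed_payload)]:
    print(name, [m["role"] for m in payload["messages"]])
# incomplete ['user']
# replayed ['user', 'assistant', 'user']
```

Nothing outside the `messages` list reaches the model, so the first payload cannot be repaired on the server side.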

2. The official replay order is stricter than many quick snippets

DeepSeek's sample does not just say 'keep history.' It shows the order: send the first user prompt, append the assistant output from round one, then append the next user turn before making round two.

That matters because some buggy clients keep only user prompts or rebuild the history from summaries. The official DeepSeek example keeps the original assistant reply in the actual `messages` array.

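from openai import OpenAI

# Client setup is not shown in the excerpt above; this assumes the OpenAI Python SDK
# pointed at DeepSeek's OpenAI-compatible base URL.
client = OpenAI(api_key="<DeepSeek API Key>", base_url="https://api.deepseek.com")

# Round 1: the history holds only the first user prompt.
messages = [{"role": "user", "content": "What's the highest mountain in the world?"}]
response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=messages,
)

# Keep the assistant reply from round 1, then add the next user turn.
messages.append(response.choices[0].message)
messages.append({"role": "user", "content": "What is the second?"})

# Round 2: the request carries the whole conversation so far.
response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=messages,
)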

3. Why follow-up turns look broken when history replay is incomplete

When teams say DeepSeek 'forgot' the prior answer, the failure is often deterministic: the app sent the new user turn but not the previous assistant response.

That makes the conversation under-specified. A follow-up like 'what is the second?' is only meaningful if the assistant's earlier Everest answer is still present in `messages`.

This is a different class of bug from reasoning-mode protocol errors or tool-call replay errors. The first thing to inspect here is the conversation payload, before anything model-side.
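
One way to catch this before the request goes out is a small payload check. The sketch below is a hypothetical helper (`find_history_gaps` is not part of any DeepSeek SDK) and it assumes plain dict messages in a simple user/assistant flow, with an optional system message. Tool-call conversations are out of scope, and adjacent user messages are flagged as suspicious rather than as something the API is known to reject.

```python
def find_history_gaps(messages):
    """Return human-readable problems found in a chat `messages` list.

    Hypothetical helper: checks only plain user/assistant ordering.
    Positions refer to the list after system messages are removed.
    """
    turns = [m for m in messages if m.get("role") != "system"]
    if not turns:
        return ["no user or assistant turns in payload"]

    problems = []
    if turns[0].get("role") != "user":
        problems.append("first non-system turn is not a user message")
    if turns[-1].get("role") != "user":
        problems.append("last turn is not a user message")

    # Two user messages in a row usually mean an assistant reply was dropped.
    for i, (prev, cur) in enumerate(zip(turns, turns[1:]), start=1):
        if prev.get("role") == "user" and cur.get("role") == "user":
            problems.append(
                f"user messages at positions {i - 1} and {i} are adjacent; "
                "the assistant reply between them is missing"
            )
    return problems


broken = [
    {"role": "user", "content": "What's the highest mountain in the world?"},
    {"role": "user", "content": "What is the second?"},
]
for problem in find_history_gaps(broken):
    print(problem)
```

Running a check like this in a test or a debug log makes the 'forgot the previous answer' report reproducible instead of anecdotal.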

4. Use a structured conversation builder instead of ad hoc string memory

The safest production pattern is to store each turn as a role-tagged message object, then append new user and assistant turns in order. Do not compress all history into one mutable free-text buffer unless you have a deliberate summarization strategy.

That approach also keeps your DeepSeek implementation closer to the official sample and makes payload inspection easier when a session behaves unexpectedly.

Safer DeepSeek multi-round state choices

| Approach | Operational quality | Why |
| --- | --- | --- |
| Role-tagged `messages` array | Best default | Matches DeepSeek's official sample and preserves user/assistant boundaries |
| Manual free-text transcript | Risky | Easy to drop assistant turns or blur role order |
| Selective summarization | Advanced only | Can save tokens, but must preserve the context the next turn truly needs |
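
The role-tagged approach can be sketched as a small builder. Everything here is illustrative: `Conversation`, `ask`, and the `send` callable are hypothetical names, and `send` is assumed to take the request dict and return the assistant's text. A stand-in `fake_send` replaces the real API call so the example runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical role-tagged history store; not part of any DeepSeek SDK."""

    messages: list = field(default_factory=list)

    def add_user(self, content: str) -> None:
        self.messages.append({"role": "user", "content": content})

    def add_assistant(self, content: str) -> None:
        self.messages.append({"role": "assistant", "content": content})

    def payload(self, model: str) -> dict:
        # Copy the list so later appends do not mutate an already-built request.
        return {"model": model, "messages": list(self.messages)}


def ask(conversation, question, send, model="deepseek-v4-pro"):
    """Append the question, send the full history, then store the reply."""
    conversation.add_user(question)
    reply = send(conversation.payload(model))
    conversation.add_assistant(reply)
    return reply


def fake_send(request):
    # Stand-in for a real API call: reports how many messages it received.
    return f"(reply after seeing {len(request['messages'])} messages)"


chat = Conversation()
ask(chat, "What's the highest mountain in the world?", fake_send)
ask(chat, "What is the second?", fake_send)
print([m["role"] for m in chat.messages])
# ['user', 'assistant', 'user', 'assistant']
```

With the OpenAI Python SDK configured for DeepSeek, `send` could plausibly be `lambda req: client.chat.completions.create(**req).choices[0].message.content`, although that wiring is an assumption and not something the official sample prescribes. Keeping the transport behind one callable also means the history logic can be tested without network access.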

5. Stateless replay does not rule out cost control

Replaying history does increase prompt size, but the correct response is not to omit essential turns blindly. Instead, measure which parts of the conversation must stay literal and where summarization is safe.
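
As one deliberate alternative to dropping turns blindly, the sketch below trims from the oldest end only, always keeps the newest turn, and never leaves an assistant reply without the user question that prompted it. `trim_history` and `max_chars` are hypothetical names, and character count is used as a crude stand-in for tokens; it is not DeepSeek's tokenizer. Anything trimmed is genuinely lost to the model, so this only suits conversations where old exchanges no longer matter.

```python
def trim_history(messages, max_chars):
    """Keep system messages plus the most recent turns within a rough size budget.

    Assumptions: messages are plain dicts with string content, and character
    count is only a rough proxy for prompt size.
    """
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    kept = []
    used = sum(len(m["content"]) for m in system)
    # Walk backwards so the newest turns are kept first.
    for message in reversed(turns):
        size = len(message["content"])
        # The newest turn is always kept, even if it alone exceeds the budget.
        if kept and used + size > max_chars:
            break
        kept.append(message)
        used += size
    kept.reverse()

    # Drop a leading assistant reply whose user question was cut off.
    while kept and kept[0]["role"] != "user":
        kept.pop(0)
    return system + kept


history = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "a" * 50},
    {"role": "assistant", "content": "b" * 50},
    {"role": "user", "content": "c" * 50},
    {"role": "assistant", "content": "d" * 50},
    {"role": "user", "content": "e" * 10},
]
print([m["role"] for m in trim_history(history, 130)])
# ['system', 'user', 'assistant', 'user']
```

A summarization step, if one is added later, would replace the trimmed turns with a shorter message instead of discarding them; that design is left open here.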

If your next concern is price and repeated-prefix reuse, continue with `/guides/deepseek-context-caching-hit-rules`. If the issue is reasoning-mode behavior with tools, continue with `/guides/deepseek-thinking-mode-tool-calls`.

6. Keep the DeepSeek-first boundary intact

This page is about the official DeepSeek API contract. It does not create a new product listing, and it does not mean any extra model is sold on `/pricing`.

The scope stays narrow: fix the API contract here, then use `/pricing` only if you need a DeepSeek key route that the site actually has in stock.

FAQ

Does DeepSeek store previous chat turns automatically on `/chat/completions`?

No. DeepSeek's official guide says the endpoint is stateless, so each request must resend the prior conversation history you want the model to see.

Should I include the previous assistant reply before the next user turn?

Yes. DeepSeek's own sample appends the assistant message from round one before adding the next user question for round two.

Why does DeepSeek seem to forget context in follow-up questions?

Usually because the app did not replay enough history in `messages`, especially the earlier assistant output that the follow-up depends on.

Can I summarize history instead of replaying every exact turn?

Sometimes, but that is an application design choice. The official DeepSeek baseline is full history replay, and summarization should only come after you know what context can be safely compressed.

Does this page mean DeepSeek sells a special chat-memory plan on `/pricing`?

No. This page explains API behavior only. Purchasable plans still depend on actual stocked inventory.

The practical DeepSeek multi-round rule is simple: treat `/chat/completions` as stateless, replay the full conversation you still need, and append the assistant's last reply before asking the next question. If a follow-up looks forgetful, inspect `messages` before blaming the model.
