Official2026-06-11

DeepSeek's official rate-limit docs now spell out account-level concurrency, user_id isolation, and keep-alive behavior

Checked on June 11, 2026: DeepSeek's official docs now give a cleaner production picture for V4 traffic management, with 500 Pro concurrency, 2500 Flash concurrency, explicit `user_id` isolation rules, and request keep-alive handling.

Accepted official-source monitoring note

Today's accepted item stays DeepSeek-first and uses current official DeepSeek documentation because the public X surface remains login-friction-heavy and did not expose a newer safely verifiable post than the already-published homepage anchor.

What we verified on June 11, 2026

  • DeepSeek's official Rate Limit & Isolation page sets account-level concurrency at 500 for deepseek-v4-pro and 2500 for deepseek-v4-flash.
  • The same official page now explains what user_id does: content-safety isolation, KV-cache isolation, and scheduling isolation under one account.
  • DeepSeek also documents protocol-specific user_id placement: extra_body.user_id for OpenAI-format calls and metadata.user_id for Anthropic-format calls.
  • The same page documents the request keep-alive behavior: non-streaming responses can emit empty lines, streaming responses can emit SSE keep-alive comments, and the server closes the connection if inference has not started after 10 minutes.
  • DeepSeek's English homepage still points its public social anchor to @deepseek_ai, which remains the safest official X confirmation even though the X page itself is not safely crawlable here.

Why this is publishable

This is not framed as a new product launch. It is a current official operations signal that matters for real DeepSeek deployments:

  1. It gives teams a trustworthy concurrency baseline instead of forcing them to infer limits from scattered SDK examples.
  2. It turns user_id from an obscure field into a documented production control for isolation and privacy-sensitive workloads.
  3. It gives the site a fresh official topic that is not a duplicate of the June 9 homepage-X anchor check or the June 10 Claude Code Web Search check.

Why this matters for DeepSeek-first SEO pages

  • API operations pages should explain account-level concurrency separately from per-user application logic.
  • Anthropic-format and OpenAI-format guides should show different user_id wiring, because the field lands in different request shapes.
  • Error-handling and timeout guides should mention keep-alive lines and the 10-minute pre-inference cutoff so teams do not misclassify healthy long polls as broken connections.

Rejected candidates today

  • The same April 24 homepage X anchor: still official, but already published on June 9 and therefore a duplicate-content risk.
  • The same Claude Code Web Search page: still official, but already published on June 10.
  • Status page uptime alone: official, but weaker and less actionable than the current Rate Limit & Isolation page.
  • Community posts about DeepSeek concurrency or gateway tuning: helpful discovery leads only, but weaker than the official docs page that now states the limits directly.

Editorial takeaway

The safest official DeepSeek story today is an API operations documentation check: DeepSeek's own docs now say exactly how much concurrency V4 Pro and V4 Flash support per account, how user_id should be used for isolation, and what keep-alive behavior client code should tolerate while a request is waiting to run.

Sources checked