DeepSeek's official docs currently pin the V4 contract at 1M context, 384K max output, per-1M token billing, and account-level concurrency caps of 2500 for Flash and 500 for Pro
Checked on June 28, 2026: DeepSeek's official Models & Pricing and Rate Limit pages currently show a 1M context window, 384K max output, per-1M token pricing, and account-level concurrency limits of 2500 for DeepSeek V4 Flash and 500 for DeepSeek V4 Pro, with 429s when that cap is exceeded.
Accepted official-source monitoring note
Today's accepted item stays DeepSeek-first and uses current official DeepSeek documentation because the official @deepseek_ai X timeline still could not be read reliably in this run. DeepSeek's English homepage continues to anchor the official X account through the V4 Preview banner, but the direct timeline did not expose a reliably crawlable current post, so the publish-safe choice is a first-party, docs-backed contract check.
What we verified on June 28, 2026
- DeepSeek's official Models & Pricing page currently shows a shared V4 surface with a 1M context length and 384K maximum output for both deepseek-v4-flash and deepseek-v4-pro.
- The same official page bills per 1M tokens and currently lists deepseek-v4-flash at $0.0028 input cache hit, $0.14 input cache miss, and $0.28 output, while deepseek-v4-pro is listed at $0.003625, $0.435, and $0.87 respectively.
- DeepSeek's official rate-limit page currently treats concurrency as an account-level cap, not a per-key loophole: deepseek-v4-flash is listed at 2500 concurrent requests and deepseek-v4-pro at 500.
- The official rate-limit contract is explicit about the failure mode: once the concurrency limit is exceeded, requests receive HTTP 429 rather than silently queueing forever.
- DeepSeek also documents a capacity-expansion path with no additional cost, but only through a business-needs request workflow. That is a request path, not a guarantee of automatic higher limits.
- The pricing page still warns that product prices may vary, so this item should be read as a current docs snapshot rather than a promise that rates never change.
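To make the per-1M billing concrete, here is a minimal Python sketch of a cost estimate built from the figures quoted above. The `Price` class, the `PRICES` table, and `estimate_cost` are hypothetical names for illustration, not part of any DeepSeek SDK, and the sketch assumes the listed prices are in US dollars and that cache-hit and cache-miss input tokens are billed separately. The numbers are a June 28, 2026 snapshot and may change.

```python
from dataclasses import dataclass

TOKENS_PER_UNIT = 1_000_000  # prices are quoted per 1M tokens


@dataclass(frozen=True)
class Price:
    cache_hit: float   # price per 1M input tokens on a cache hit
    cache_miss: float  # price per 1M input tokens on a cache miss
    output: float      # price per 1M output tokens


# Snapshot of the figures quoted in this item; not fetched live.
PRICES = {
    "deepseek-v4-flash": Price(0.0028, 0.14, 0.28),
    "deepseek-v4-pro": Price(0.003625, 0.435, 0.87),
}


def estimate_cost(model: str, hit_tokens: int, miss_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one workload from raw token counts (hypothetical helper)."""
    p = PRICES[model]
    total = (
        hit_tokens * p.cache_hit
        + miss_tokens * p.cache_miss
        + output_tokens * p.output
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Same workload priced on both models for a side-by-side comparison.
    for name in PRICES:
        cost = estimate_cost(name, hit_tokens=200_000, miss_tokens=50_000, output_tokens=10_000)
        print(f"{name}: {cost:.6f}")
```

Keeping the prices in one table makes it easy to update the snapshot when the official page changes.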
Why this is publishable
This is a current official DeepSeek docs contract check, not a rumor, repost, or inventory claim.
- It answers real buyer and implementation questions that affect both API budgeting and app throughput.
- It stays clearly inside official facts: prices, context, output cap, concurrency limits, and 429 behavior are all documented by DeepSeek directly.
- It avoids inventing any new stocked plan card, resale entitlement, or roadmap promise for /pricing.
- It stays distinct from the June 26 and June 27 items because the story is about the core V4 service contract, not an agent integration or chat-history behavior.
Why this matters for DeepSeek-first SEO pages
- Support pages can now target DeepSeek V4 pricing per million tokens using current first-party numbers instead of stale community screenshots.
- Buyer pages can explain why Flash and Pro differ on both cost and concurrency, which is a more useful comparison than generic 'cheap versus strong' copy.
- Scaling guides can answer what triggers a DeepSeek 429 and why multiple API keys under one account do not bypass the documented account-level cap.
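A scaling guide could illustrate the account-level cap with a client-side gate like the sketch below. It assumes one shared in-flight limit per account and model, however many API keys the caller rotates through, and it retries on HTTP 429 with exponential backoff. `AccountGate`, `ACCOUNT_CAPS`, and the `send` callable are hypothetical; `send` stands in for whatever HTTP client the application uses and is assumed to return the response status code. The backoff schedule is a common pattern, not something DeepSeek's docs prescribe.

```python
import random
import threading
import time
from typing import Callable

# Concurrency caps quoted in this item (snapshot, may change).
ACCOUNT_CAPS = {"deepseek-v4-flash": 2500, "deepseek-v4-pro": 500}


class AccountGate:
    """One shared in-flight limit for an account, regardless of API key."""

    def __init__(self, limit: int) -> None:
        self._slots = threading.BoundedSemaphore(limit)

    def call(
        self,
        send: Callable[[], int],
        max_retries: int = 5,
        base_delay: float = 0.5,
        sleep: Callable[[float], None] = time.sleep,
    ) -> int:
        """Run send() under the shared limit; back off and retry on 429."""
        for attempt in range(max_retries + 1):
            with self._slots:  # hold a slot only while the request is in flight
                status = send()
            if status != 429:
                return status
            if attempt < max_retries:
                # Exponential backoff plus jitter so retries do not synchronize.
                sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
        return 429  # retries exhausted; let the caller decide what to do


# Every worker and every API key on the account shares the same gate.
pro_gate = AccountGate(ACCOUNT_CAPS["deepseek-v4-pro"])
```

The slot is released before sleeping, so a request waiting out a backoff does not occupy capacity that another request could use.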
Rejected candidates today
- Official X timeline as the primary source: rejected for this run because, although the homepage still anchors DeepSeek's X presence, the direct X surface remained unreadable here.
- The homepage V4 Preview X anchor: official, but older and already represented in existing launch and migration coverage.
- The current agent-integration pages: official, but already represented heavily across the site's recent daily news set and therefore higher duplicate-content risk.
- The official change log: checked, but not stronger than the current pricing plus rate-limit contract for today's buyer and developer intent mix.
- Official GitHub, Hugging Face, and status surfaces: checked as backup official sources, but no stronger current DeepSeek update beat the pricing and concurrency contract for the single publish-safe news slot.
Editorial takeaway
The safest official DeepSeek story today is a core V4 service-contract check: the current first-party docs pin the DeepSeek V4 surface at 1M context, 384K max output, per-1M token billing, and account-level concurrency caps of 2500 for Flash and 500 for Pro, with HTTP 429 when callers go over the cap. That is a stronger current developer and buyer signal than recycling an older X anchor we still could not fully verify live.