Updated 2026-04-15
How to Use DeepSeek in Node.js — TypeScript, Streaming, Tool Use
DeepSeek V4 is OpenAI-compatible, so in Node.js you can reuse the official openai package without swapping libraries. This guide walks you through a production setup in TypeScript: environment wiring, streaming responses for a chat UI, tool use for agents, Next.js route handlers, and tactics to keep the token bill under control.
1. Install and configure
The official openai npm package works with DeepSeek via a baseURL override. Put your key in .env.local — never commit it, and never ship it to the browser.
npm install openai zod
echo "DEEPSEEK_API_KEY=sk-..." >> .env.local

2. First chat completion in TypeScript
Instantiate the client once and reuse it. Set baseURL to the DeepSeek endpoint and use model: "deepseek-chat" for general tasks.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.DEEPSEEK_API_KEY,
  baseURL: "https://api.deepseek.com/v1",
});

export async function ask(question: string): Promise<string> {
  const res = await client.chat.completions.create({
    model: "deepseek-chat",
    messages: [
      { role: "system", content: "You are a concise technical assistant." },
      { role: "user", content: question },
    ],
    temperature: 0.2,
  });
  return res.choices[0].message.content ?? "";
}

3. Streaming from a Next.js route handler
For a chat UI you want tokens to arrive as they are produced. The OpenAI SDK returns an async iterable when stream is true. You can pipe that directly into a Next.js Response as a ReadableStream.
// app/api/chat/route.ts
import OpenAI from "openai";

export const runtime = "nodejs";

const client = new OpenAI({
  apiKey: process.env.DEEPSEEK_API_KEY,
  baseURL: "https://api.deepseek.com/v1",
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const stream = await client.chat.completions.create({
    model: "deepseek-chat",
    messages,
    stream: true,
  });

  const encoder = new TextEncoder();
  const body = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        const delta = chunk.choices[0]?.delta?.content ?? "";
        if (delta) controller.enqueue(encoder.encode(delta));
      }
      controller.close();
    },
  });

  return new Response(body, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}

4. Function calling with Zod-validated arguments
DeepSeek V4 handles OpenAI-style tool use reliably enough to run real agents. Declare the tool schema, dispatch on the returned tool_call, and always validate the arguments — the model can and will occasionally hallucinate fields.
import { z } from "zod";

const GetWeatherArgs = z.object({ city: z.string().min(1) });

const tools = [{
  type: "function" as const,
  function: {
    name: "get_weather",
    description: "Get the current weather for a city",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
}];

const res = await client.chat.completions.create({
  model: "deepseek-chat",
  messages: [{ role: "user", content: "Weather in Tokyo?" }],
  tools,
});

const call = res.choices[0].message.tool_calls?.[0];
if (call?.function.name === "get_weather") {
  const args = GetWeatherArgs.parse(JSON.parse(call.function.arguments));
  // … execute tool, feed result back as a { role: "tool" } message …
}

5. Cost control for Node.js workloads
Input and output tokens are billed separately, with output ~2× the price of input. System prompt and chat history both count as input, so unbounded conversation state is the #1 cost leak.
Use max_tokens to cap runaway answers, summarise old turns on long chats, and cache retrieval chunks rather than regenerating them. If your workload is large, the discounted official keys listed at /pricing cut unit cost further without any code change.
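Both levers can be sketched in a few lines: bound the input by dropping the oldest turns once a rough token estimate exceeds a budget, and cap the output with max_tokens. The 4-characters-per-token heuristic and the budget numbers here are illustrative assumptions, not DeepSeek specifics:

```typescript
type Msg = { role: "system" | "user" | "assistant"; content: string };

// Rough token estimate: ~4 characters per token (a heuristic, not exact).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Keep the system prompt, then walk history newest-first,
// dropping the oldest turns once the budget is exhausted.
function trimHistory(messages: Msg[], budget = 3000): Msg[] {
  const [system, ...rest] = messages;
  const kept: Msg[] = [];
  let used = estimateTokens(system.content);
  for (const msg of [...rest].reverse()) {
    used += estimateTokens(msg.content);
    if (used > budget) break;
    kept.unshift(msg);
  }
  return [system, ...kept];
}

// Then cap the answer itself on the request:
// await client.chat.completions.create({
//   model: "deepseek-chat",
//   messages: trimHistory(history),
//   max_tokens: 512,
// });
```

Summarising the dropped turns into a single assistant message (instead of discarding them) keeps long-running chats coherent at the same cost.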
6. Production checklist
Never expose the API key to the browser — always proxy through a server route. Set a request timeout (AbortController) and retry 429 / 5xx with exponential backoff.
For structured output, combine response_format: { type: "json_object" } with a Zod schema parse on the server so you fail fast on malformed responses.
Log model, usage.prompt_tokens, usage.completion_tokens, and latency per request. A simple daily aggregate catches cost regressions before the invoice does.
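The timeout-and-retry advice above can be packaged as one wrapper. This is a minimal sketch, not a hardened implementation: the helper name, retry limits, and backoff schedule are illustrative, and the error-classification logic assumes the SDK attaches an HTTP status to thrown errors (the openai SDK does, via err.status):

```typescript
// Retry transient failures (429 / 5xx / timeouts) with exponential backoff,
// aborting any single attempt after `timeoutMs` via AbortController.
async function withRetry<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  { retries = 3, timeoutMs = 30_000 }: { retries?: number; timeoutMs?: number } = {},
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      return await fn(controller.signal);
    } catch (err: any) {
      const status = err?.status ?? 0;
      const retryable =
        status === 429 || status >= 500 || err?.name === "AbortError";
      if (!retryable || attempt >= retries) throw err;
      // 0.5s, 1s, 2s, … between attempts.
      await new Promise((r) => setTimeout(r, 2 ** attempt * 500));
    } finally {
      clearTimeout(timer);
    }
  }
}

// Usage: pass the signal through as a per-request option.
// const res = await withRetry((signal) =>
//   client.chat.completions.create({ model: "deepseek-chat", messages }, { signal }),
// );
```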
FAQ
Can I call DeepSeek directly from the browser?
No. Your API key would be exposed. Always call it from a Node.js server, serverless function, or Next.js route handler.
Does this work with Cloudflare Workers / Bun / Deno?
Yes. The openai SDK ships ESM builds and works on any runtime with fetch. Edge runtimes work, but prefer Node.js for long-lived streams.
Which model name for Node.js?
Same as everywhere else: deepseek-chat for the generalist, deepseek-reasoner for deeper reasoning.
Where do I find the cheapest API key?
/pricing lists discounted official DeepSeek keys — same API, lower price.
Can I use LangChain / LlamaIndex / Vercel AI SDK?
Yes. Any library that accepts an OpenAI-compatible baseURL can point at DeepSeek.
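The pattern behind every one of those integrations is the same two settings. As a sketch of the migration from an existing OpenAI setup (the gpt-4o model name and OPENAI_API_KEY variable stand in for whatever your current code uses):

```diff
 const client = new OpenAI({
-  apiKey: process.env.OPENAI_API_KEY,
+  apiKey: process.env.DEEPSEEK_API_KEY,
+  baseURL: "https://api.deepseek.com/v1",
 });

 const res = await client.chat.completions.create({
-  model: "gpt-4o",
+  model: "deepseek-chat",
   messages,
 });
```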
If you already ship Node.js or Next.js apps, adopting DeepSeek V4 is literally a two-line change: baseURL and model. You keep the SDK you know, pay a fraction of the token cost, and unlock tool use reliable enough for real agents.