Updated 2026-04-15
How to Use DeepSeek in Node.js — TypeScript, Streaming, Tool Use
DeepSeek V4 is OpenAI-compatible, so in Node.js you can reuse the official openai package without swapping libraries. This guide walks you through a production setup in TypeScript: environment wiring, streaming responses for a chat UI, tool use for agents, Next.js route handlers, and tactics to keep the token bill under control.
1. Install and configure
The official openai npm package works with DeepSeek via a baseURL override. Put your key in .env.local — never commit it, and never ship it to the browser.
npm install openai zod
echo "DEEPSEEK_API_KEY=sk-..." >> .env.local

2. First chat completion in TypeScript
Instantiate the client once and reuse it. Set baseURL to the DeepSeek endpoint and use model: "deepseek-chat" for general tasks.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.DEEPSEEK_API_KEY,
  baseURL: "https://api.deepseek.com/v1",
});

export async function ask(question: string): Promise<string> {
  const res = await client.chat.completions.create({
    model: "deepseek-chat",
    messages: [
      { role: "system", content: "You are a concise technical assistant." },
      { role: "user", content: question },
    ],
    temperature: 0.2,
  });
  return res.choices[0].message.content ?? "";
}

3. Streaming from a Next.js route handler
For a chat UI you want tokens to arrive as they are produced. The OpenAI SDK returns an async iterable when stream is true. You can pipe that directly into a Next.js Response as a ReadableStream.
// app/api/chat/route.ts
import OpenAI from "openai";

export const runtime = "nodejs";

const client = new OpenAI({
  apiKey: process.env.DEEPSEEK_API_KEY,
  baseURL: "https://api.deepseek.com/v1",
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const stream = await client.chat.completions.create({
    model: "deepseek-chat",
    messages,
    stream: true,
  });

  const encoder = new TextEncoder();
  const body = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        const delta = chunk.choices[0]?.delta?.content ?? "";
        if (delta) controller.enqueue(encoder.encode(delta));
      }
      controller.close();
    },
  });

  return new Response(body, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}

4. Function calling with Zod-validated arguments
DeepSeek V4 handles OpenAI-style tool use reliably enough to run real agents. Declare the tool schema, dispatch on the returned tool_call, and always validate the arguments — the model can and will occasionally hallucinate fields.
import { z } from "zod";

const GetWeatherArgs = z.object({ city: z.string().min(1) });

const tools = [{
  type: "function" as const,
  function: {
    name: "get_weather",
    description: "Get the current weather for a city",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
}];

const res = await client.chat.completions.create({
  model: "deepseek-chat",
  messages: [{ role: "user", content: "Weather in Tokyo?" }],
  tools,
});

const call = res.choices[0].message.tool_calls?.[0];
if (call?.function.name === "get_weather") {
  const args = GetWeatherArgs.parse(JSON.parse(call.function.arguments));
  // … execute tool, feed result back as a { role: "tool" } message …
}

5. Cost control for Node.js workloads
Input and output tokens are billed separately, with output ~2× the price of input. System prompt and chat history both count as input, so unbounded conversation state is the #1 cost leak.
Use max_tokens to cap runaway answers, summarise old turns on long chats, and cache retrieval chunks rather than regenerating them. If your workload is large, the discounted official keys listed at /pricing cut unit cost further without any code change.
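Both levers can be sketched in a few lines: bound the input by dropping the oldest turns once a rough token estimate exceeds a budget, and cap the output with max_tokens. The 4-characters-per-token heuristic and the budget numbers here are illustrative assumptions, not DeepSeek specifics:

```typescript
type Msg = { role: "system" | "user" | "assistant"; content: string };

// Rough token estimate: ~4 characters per token (a heuristic, not exact).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Keep the system prompt, then walk history newest-first,
// dropping the oldest turns once the budget is exhausted.
function trimHistory(messages: Msg[], budget = 3000): Msg[] {
  const [system, ...rest] = messages;
  const kept: Msg[] = [];
  let used = estimateTokens(system.content);
  for (const msg of [...rest].reverse()) {
    used += estimateTokens(msg.content);
    if (used > budget) break;
    kept.unshift(msg);
  }
  return [system, ...kept];
}

// Then cap the answer itself on the request:
// await client.chat.completions.create({
//   model: "deepseek-chat",
//   messages: trimHistory(history),
//   max_tokens: 512,
// });
```

Summarising the dropped turns into a single assistant message (instead of discarding them) keeps long-running chats coherent at the same cost.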
6. Production checklist
Never expose the API key to the browser — always proxy through a server route. Set a request timeout (AbortController) and retry 429 / 5xx with exponential backoff.
For structured output, combine response_format: { type: "json_object" } with a Zod schema parse on the server so you fail fast on malformed responses.
Log model, usage.prompt_tokens, usage.completion_tokens, and latency per request. A simple daily aggregate catches cost regressions before the invoice does.
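The timeout-and-retry advice above can be packaged as one wrapper. This is a minimal sketch, not a hardened implementation: the helper name, retry limits, and backoff schedule are illustrative, and the error-classification logic assumes the SDK attaches an HTTP status to thrown errors (the openai SDK does, via err.status):

```typescript
// Retry transient failures (429 / 5xx / timeouts) with exponential backoff,
// aborting any single attempt after `timeoutMs` via AbortController.
async function withRetry<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  { retries = 3, timeoutMs = 30_000 }: { retries?: number; timeoutMs?: number } = {},
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      return await fn(controller.signal);
    } catch (err: any) {
      const status = err?.status ?? 0;
      const retryable =
        status === 429 || status >= 500 || err?.name === "AbortError";
      if (!retryable || attempt >= retries) throw err;
      // 0.5s, 1s, 2s, … between attempts.
      await new Promise((r) => setTimeout(r, 2 ** attempt * 500));
    } finally {
      clearTimeout(timer);
    }
  }
}

// Usage: pass the signal through as a per-request option.
// const res = await withRetry((signal) =>
//   client.chat.completions.create({ model: "deepseek-chat", messages }, { signal }),
// );
```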
FAQ
Can I call DeepSeek directly from the browser?
No. Your API key would be exposed. Always call it from a Node.js server, serverless function, or Next.js route handler.
Does this work with Cloudflare Workers / Bun / Deno?
Yes. The openai SDK ships ESM builds and works on any runtime with fetch. Edge runtimes work, but prefer Node.js for long-lived streams.
Which model name for Node.js?
Same as everywhere else: deepseek-chat for the generalist, deepseek-reasoner for deeper reasoning.
Where do I find the cheapest API key?
/pricing lists discounted official DeepSeek keys — same API, lower price.
Can I use LangChain / LlamaIndex / Vercel AI SDK?
Yes. Any library that accepts an OpenAI-compatible baseURL can point at DeepSeek.
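The pattern behind every one of those integrations is the same two settings. As a sketch of the migration from an existing OpenAI setup (the gpt-4o model name and OPENAI_API_KEY variable stand in for whatever your current code uses):

```diff
 const client = new OpenAI({
-  apiKey: process.env.OPENAI_API_KEY,
+  apiKey: process.env.DEEPSEEK_API_KEY,
+  baseURL: "https://api.deepseek.com/v1",
 });

 const res = await client.chat.completions.create({
-  model: "gpt-4o",
+  model: "deepseek-chat",
   messages,
 });
```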
If you already ship Node.js or Next.js apps, adopting DeepSeek V4 is literally a two-line change: baseURL and model. You keep the SDK you know, pay a fraction of the token cost, and unlock tool use reliable enough for real agents.