# Chat API

The Chat API is the primary interface for interacting with DeepSeek's language models. It allows you to hold conversations, generate text, and perform a wide range of language tasks.
## Endpoint

```
POST https://api.deepseek.com/v1/chat/completions
```
## Basic Usage

### Simple Chat Request

```bash
curl -X POST "https://api.deepseek.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```
### Response

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 19,
    "total_tokens": 31
  }
}
```
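If you call the API through the OpenAI-compatible Python SDK (used in the examples later on this page), the same fields are exposed as attributes; a minimal sketch:

```python
from openai import OpenAI

# The API is OpenAI-compatible, so the OpenAI SDK can point at DeepSeek's base URL.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)

print(response.choices[0].message.content)  # the assistant reply
print(response.choices[0].finish_reason)    # e.g. "stop"
print(response.usage.total_tokens)          # prompt + completion tokens
```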
## Request Parameters

### Required Parameters

| Parameter | Type | Description |
|---|---|---|
| model | string | The model to use for completion |
| messages | array | List of messages in the conversation |
### Optional Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| temperature | number | 1.0 | Controls randomness (0.0 to 2.0) |
| max_tokens | integer | 4096 | Maximum number of tokens to generate |
| top_p | number | 1.0 | Nucleus sampling parameter |
| frequency_penalty | number | 0.0 | Penalizes tokens by how often they have appeared so far (-2.0 to 2.0) |
| presence_penalty | number | 0.0 | Penalizes tokens that have already appeared, encouraging new topics (-2.0 to 2.0) |
| stop | string/array | null | Sequences at which generation stops |
| stream | boolean | false | Enable streaming responses |
| logprobs | boolean | false | Return log probabilities of output tokens |
| top_logprobs | integer | null | Number of most likely tokens to return at each position |
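In the OpenAI-compatible SDKs these parameters map one-to-one onto keyword arguments; a quick sketch combining several of them:

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Name three planets."}],
    temperature=0.2,          # low randomness for a factual task
    max_tokens=100,           # cap the response length
    top_p=0.9,
    frequency_penalty=0.5,
    stop=["\n\n"]             # stop at the first blank line
)
print(response.choices[0].message.content)
```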
## Message Format

### Message Structure

```json
{
  "role": "user|assistant|system",
  "content": "Message content"
}
```
### Role Types
- system: Sets the behavior and context for the assistant
- user: Messages from the user
- assistant: Previous responses from the assistant
### Example Conversation

```json
{
  "model": "deepseek-chat",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant that explains complex topics simply."
    },
    {
      "role": "user",
      "content": "What is quantum computing?"
    },
    {
      "role": "assistant",
      "content": "Quantum computing is a type of computing that uses quantum mechanics..."
    },
    {
      "role": "user",
      "content": "Can you give me a simple analogy?"
    }
  ]
}
```
## Advanced Features

### Temperature Control

Control the creativity and randomness of responses:

```json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "Write a creative story"}
  ],
  "temperature": 1.5
}
```
- 0.0: Deterministic, focused responses
- 1.0: Balanced creativity and coherence
- 2.0: Highly creative, more random
### Token Limits

Set the maximum response length:

```json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "Summarize this in 50 words"}
  ],
  "max_tokens": 60
}
```
### Stop Sequences

Define custom stopping points:

```json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "List three items:"}
  ],
  "stop": ["\n4.", "END"]
}
```
## Streaming Responses

Enable real-time response streaming for a better user experience:

### Request

```json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "Tell me a story"}
  ],
  "stream": true
}
```
### Response Stream

```
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"deepseek-chat","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"deepseek-chat","choices":[{"index":0,"delta":{"content":"Once"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"deepseek-chat","choices":[{"index":0,"delta":{"content":" upon"},"finish_reason":null}]}

data: [DONE]
```
### Handling Streaming in Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
```
## Function Calling

Enable the model to call external functions:

### Function Definition

```json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "What's the weather like in New York?"}
  ],
  "functions": [
    {
      "name": "get_weather",
      "description": "Get current weather information",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "City name"
          },
          "unit": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"]
          }
        },
        "required": ["location"]
      }
    }
  ],
  "function_call": "auto"
}
```
### Function Call Response

```json
{
  "id": "chatcmpl-123",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "get_weather",
          "arguments": "{\"location\": \"New York\", \"unit\": \"fahrenheit\"}"
        }
      },
      "finish_reason": "function_call"
    }
  ]
}
```
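Putting the pieces together, here is a minimal sketch of the full round trip using the legacy functions format shown above; the get_weather implementation is a stand-in you would replace with a real lookup:

```python
import json

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

def get_weather(location, unit="fahrenheit"):
    # Stand-in for a real weather service call.
    return {"location": location, "temperature": 72, "unit": unit}

messages = [{"role": "user", "content": "What's the weather like in New York?"}]
functions = [{
    "name": "get_weather",
    "description": "Get current weather information",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["location"]
    }
}]

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages,
    functions=functions,
    function_call="auto"
)
message = response.choices[0].message

if message.function_call:
    # Run the requested function with the model-supplied arguments...
    args = json.loads(message.function_call.arguments)
    result = get_weather(**args)
    # ...then send the result back so the model can write the final answer.
    messages.append(message)
    messages.append({
        "role": "function",
        "name": message.function_call.name,
        "content": json.dumps(result)
    })
    final = client.chat.completions.create(model="deepseek-chat", messages=messages)
    print(final.choices[0].message.content)
```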
## JSON Mode

Force the model to respond with valid JSON:

```json
{
  "model": "deepseek-chat",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant designed to output JSON."
    },
    {
      "role": "user",
      "content": "Generate a user profile for John Doe"
    }
  ],
  "response_format": {"type": "json_object"}
}
```
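Since the content is then a valid JSON string, it can be parsed directly; a short sketch using the Python SDK:

```python
import json

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "Generate a user profile for John Doe"}
    ],
    response_format={"type": "json_object"}
)

# With JSON mode enabled, the content parses without cleanup.
profile = json.loads(response.choices[0].message.content)
print(profile)
```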
## Error Handling

### Common Error Responses

#### Rate Limit Exceeded

```json
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```
#### Invalid Model

```json
{
  "error": {
    "message": "Model not found",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
```
#### Token Limit Exceeded

```json
{
  "error": {
    "message": "Maximum context length exceeded",
    "type": "invalid_request_error",
    "code": "context_length_exceeded"
  }
}
```
### Error Handling in Code

```python
import openai
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

try:
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)
except openai.RateLimitError:
    print("Rate limit exceeded. Please wait and try again.")
except openai.BadRequestError as e:
    print(f"Invalid request: {e}")
except Exception as e:
    print(f"An error occurred: {e}")
```
## Best Practices

### Conversation Management

- Keep context relevant: Include only the necessary previous messages
- Use system messages: Set clear instructions for the assistant
- Manage token usage: Monitor and optimize token consumption
- Handle long conversations: Summarize or trim older turns (see the sketch after this list)
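A lightweight alternative to full summarization is to keep the system message and drop the oldest turns; a minimal sketch (the trim_history helper and its max_messages budget are illustrative):

```python
def trim_history(messages, max_messages=20):
    # Preserve any system messages, then keep only the most recent turns.
    # Tune max_messages to your model's context window and message sizes.
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_messages:]
```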
### Performance Optimization

- Use an appropriate temperature: Lower for factual tasks, higher for creative tasks
- Set a reasonable max_tokens: Avoid unnecessarily long responses
- Implement caching: Cache responses for repeated queries (see the sketch after this list)
- Use streaming: For a better user experience with long responses
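For the caching point above, a minimal in-memory sketch keyed on the request payload; this only makes sense for deterministic requests (temperature=0), and the cached_chat helper is illustrative rather than part of any SDK:

```python
import hashlib
import json

_cache = {}

def cached_chat(client, **kwargs):
    # Key the cache on the full request payload; identical requests hit the cache.
    # Only sensible for deterministic requests, e.g. temperature=0.
    key = hashlib.sha256(json.dumps(kwargs, sort_keys=True).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = client.chat.completions.create(**kwargs)
    return _cache[key]
```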
### Security Considerations
- Validate inputs: Sanitize user inputs before sending to API
- Implement rate limiting: Protect against abuse
- Monitor usage: Track API usage and costs
- Handle sensitive data: Avoid sending sensitive information
## Code Examples

### Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

def chat_with_deepseek(message, conversation_history=None):
    if conversation_history is None:
        conversation_history = []
    conversation_history.append({"role": "user", "content": message})
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=conversation_history,
        temperature=0.7,
        max_tokens=1000
    )
    assistant_message = response.choices[0].message.content
    conversation_history.append({"role": "assistant", "content": assistant_message})
    return assistant_message, conversation_history

# Usage
response, history = chat_with_deepseek("Hello, how are you?")
print(response)
```
### Node.js

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.deepseek.com/v1'
});

async function chatWithDeepSeek(message, conversationHistory = []) {
  conversationHistory.push({ role: 'user', content: message });
  try {
    const response = await client.chat.completions.create({
      model: 'deepseek-chat',
      messages: conversationHistory,
      temperature: 0.7,
      max_tokens: 1000
    });
    const assistantMessage = response.choices[0].message.content;
    conversationHistory.push({ role: 'assistant', content: assistantMessage });
    return { message: assistantMessage, history: conversationHistory };
  } catch (error) {
    console.error('Error:', error);
    throw error;
  }
}

// Usage
const result = await chatWithDeepSeek("Hello, how are you?");
console.log(result.message);
```