# Chat API

The Chat API is the primary interface for interacting with DeepSeek's language models. It allows you to hold conversations, generate text, and perform a wide range of language tasks.
## Endpoint

```
POST https://api.deepseek.com/v1/chat/completions
```
## Basic Usage

### Simple Chat Request

```bash
curl -X POST "https://api.deepseek.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```
### Response

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 19,
    "total_tokens": 31
  }
}
```
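If you call the API through the OpenAI-compatible Python SDK (used in the examples later on this page), the same fields are exposed as attributes; a minimal sketch:

```python
from openai import OpenAI

# The API is OpenAI-compatible, so the OpenAI SDK can point at DeepSeek's base URL.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello, how are you?"}]
)

print(response.choices[0].message.content)  # the assistant reply
print(response.choices[0].finish_reason)    # e.g. "stop"
print(response.usage.total_tokens)          # prompt + completion tokens
```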
## Request Parameters

### Required Parameters

| Parameter | Type | Description |
|---|---|---|
| model | string | The model to use for completion |
| messages | array | List of messages in the conversation |
### Optional Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| temperature | number | 1.0 | Controls randomness (0.0 to 2.0) |
| max_tokens | integer | 4096 | Maximum number of tokens to generate |
| top_p | number | 1.0 | Nucleus sampling parameter |
| frequency_penalty | number | 0.0 | Penalizes tokens by how often they have appeared so far (-2.0 to 2.0) |
| presence_penalty | number | 0.0 | Penalizes tokens that have already appeared, encouraging new topics (-2.0 to 2.0) |
| stop | string/array | null | Sequences at which generation stops |
| stream | boolean | false | Enable streaming responses |
| logprobs | boolean | false | Return log probabilities of output tokens |
| top_logprobs | integer | null | Number of most likely tokens to return at each position |
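In the OpenAI-compatible SDKs these parameters map one-to-one onto keyword arguments; a quick sketch combining several of them:

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Name three planets."}],
    temperature=0.2,          # low randomness for a factual task
    max_tokens=100,           # cap the response length
    top_p=0.9,
    frequency_penalty=0.5,
    stop=["\n\n"]             # stop at the first blank line
)
print(response.choices[0].message.content)
```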
## Message Format

### Message Structure

```json
{
  "role": "user|assistant|system",
  "content": "Message content"
}
```
### Role Types
- system: Sets the behavior and context for the assistant
- user: Messages from the user
- assistant: Previous responses from the assistant
### Example Conversation

```json
{
  "model": "deepseek-chat",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant that explains complex topics simply."
    },
    {
      "role": "user",
      "content": "What is quantum computing?"
    },
    {
      "role": "assistant",
      "content": "Quantum computing is a type of computing that uses quantum mechanics..."
    },
    {
      "role": "user",
      "content": "Can you give me a simple analogy?"
    }
  ]
}
```
## Advanced Features

### Temperature Control

Control the creativity and randomness of responses:

```json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "Write a creative story"}
  ],
  "temperature": 1.5
}
```
- 0.0: Deterministic, focused responses
- 1.0: Balanced creativity and coherence
- 2.0: Highly creative, more random
### Token Limits

Set the maximum response length:

```json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "Summarize this in 50 words"}
  ],
  "max_tokens": 60
}
```
### Stop Sequences

Define custom stopping points:

```json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "List three items:"}
  ],
  "stop": ["\n4.", "END"]
}
```
## Streaming Responses

Enable real-time response streaming for a better user experience:

### Request

```json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "Tell me a story"}
  ],
  "stream": true
}
```
### Response Stream

```
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"deepseek-chat","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"deepseek-chat","choices":[{"index":0,"delta":{"content":"Once"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"deepseek-chat","choices":[{"index":0,"delta":{"content":" upon"},"finish_reason":null}]}

data: [DONE]
```
### Handling Streaming in Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
```
## Function Calling

Enable the model to call external functions:

### Function Definition

```json
{
  "model": "deepseek-chat",
  "messages": [
    {"role": "user", "content": "What's the weather like in New York?"}
  ],
  "functions": [
    {
      "name": "get_weather",
      "description": "Get current weather information",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "City name"
          },
          "unit": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"]
          }
        },
        "required": ["location"]
      }
    }
  ],
  "function_call": "auto"
}
```
### Function Call Response

```json
{
  "id": "chatcmpl-123",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": null,
        "function_call": {
          "name": "get_weather",
          "arguments": "{\"location\": \"New York\", \"unit\": \"fahrenheit\"}"
        }
      },
      "finish_reason": "function_call"
    }
  ]
}
```
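Putting the pieces together, here is a minimal sketch of the full round trip using the legacy functions format shown above; the get_weather implementation is a stand-in you would replace with a real lookup:

```python
import json

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

def get_weather(location, unit="fahrenheit"):
    # Stand-in for a real weather service call.
    return {"location": location, "temperature": 72, "unit": unit}

messages = [{"role": "user", "content": "What's the weather like in New York?"}]
functions = [{
    "name": "get_weather",
    "description": "Get current weather information",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["location"]
    }
}]

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages,
    functions=functions,
    function_call="auto"
)
message = response.choices[0].message

if message.function_call:
    # Run the requested function with the model-supplied arguments...
    args = json.loads(message.function_call.arguments)
    result = get_weather(**args)
    # ...then send the result back so the model can write the final answer.
    messages.append(message)
    messages.append({
        "role": "function",
        "name": message.function_call.name,
        "content": json.dumps(result)
    })
    final = client.chat.completions.create(model="deepseek-chat", messages=messages)
    print(final.choices[0].message.content)
```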
## JSON Mode

Force the model to respond with valid JSON:

```json
{
  "model": "deepseek-chat",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant designed to output JSON."
    },
    {
      "role": "user",
      "content": "Generate a user profile for John Doe"
    }
  ],
  "response_format": {"type": "json_object"}
}
```
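Since the content is then a valid JSON string, it can be parsed directly; a short sketch using the Python SDK:

```python
import json

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "Generate a user profile for John Doe"}
    ],
    response_format={"type": "json_object"}
)

# With JSON mode enabled, the content parses without cleanup.
profile = json.loads(response.choices[0].message.content)
print(profile)
```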
## Error Handling

### Common Error Responses

#### Rate Limit Exceeded

```json
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```
#### Invalid Model

```json
{
  "error": {
    "message": "Model not found",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
```
#### Token Limit Exceeded

```json
{
  "error": {
    "message": "Maximum context length exceeded",
    "type": "invalid_request_error",
    "code": "context_length_exceeded"
  }
}
```
### Error Handling in Code

```python
import openai
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

try:
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)
except openai.RateLimitError:
    print("Rate limit exceeded. Please wait and try again.")
except openai.BadRequestError as e:
    print(f"Invalid request: {e}")
except Exception as e:
    print(f"An error occurred: {e}")
```
## Best Practices

### Conversation Management

- Keep context relevant: Include only the necessary previous messages
- Use system messages: Set clear instructions for the assistant
- Manage token usage: Monitor and optimize token consumption
- Handle long conversations: Summarize or trim older turns (see the sketch after this list)
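A lightweight alternative to full summarization is to keep the system message and drop the oldest turns; a minimal sketch (the trim_history helper and its max_messages budget are illustrative):

```python
def trim_history(messages, max_messages=20):
    # Preserve any system messages, then keep only the most recent turns.
    # Tune max_messages to your model's context window and message sizes.
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_messages:]
```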
### Performance Optimization

- Use an appropriate temperature: Lower for factual tasks, higher for creative tasks
- Set a reasonable max_tokens: Avoid unnecessarily long responses
- Implement caching: Cache responses for repeated queries (see the sketch after this list)
- Use streaming: For a better user experience with long responses
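For the caching point above, a minimal in-memory sketch keyed on the request payload; this only makes sense for deterministic requests (temperature=0), and the cached_chat helper is illustrative rather than part of any SDK:

```python
import hashlib
import json

_cache = {}

def cached_chat(client, **kwargs):
    # Key the cache on the full request payload; identical requests hit the cache.
    # Only sensible for deterministic requests, e.g. temperature=0.
    key = hashlib.sha256(json.dumps(kwargs, sort_keys=True).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = client.chat.completions.create(**kwargs)
    return _cache[key]
```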
### Security Considerations
- Validate inputs: Sanitize user inputs before sending to API
- Implement rate limiting: Protect against abuse
- Monitor usage: Track API usage and costs
- Handle sensitive data: Avoid sending sensitive information
## Code Examples

### Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.deepseek.com/v1"
)

def chat_with_deepseek(message, conversation_history=None):
    if conversation_history is None:
        conversation_history = []
    conversation_history.append({"role": "user", "content": message})
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=conversation_history,
        temperature=0.7,
        max_tokens=1000
    )
    assistant_message = response.choices[0].message.content
    conversation_history.append({"role": "assistant", "content": assistant_message})
    return assistant_message, conversation_history

# Usage
response, history = chat_with_deepseek("Hello, how are you?")
print(response)
```
### Node.js

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.deepseek.com/v1'
});

async function chatWithDeepSeek(message, conversationHistory = []) {
  conversationHistory.push({ role: 'user', content: message });
  try {
    const response = await client.chat.completions.create({
      model: 'deepseek-chat',
      messages: conversationHistory,
      temperature: 0.7,
      max_tokens: 1000
    });
    const assistantMessage = response.choices[0].message.content;
    conversationHistory.push({ role: 'assistant', content: assistantMessage });
    return { message: assistantMessage, history: conversationHistory };
  } catch (error) {
    console.error('Error:', error);
    throw error;
  }
}

// Usage
const result = await chatWithDeepSeek("Hello, how are you?");
console.log(result.message);
```