API FAQ
Frequently asked questions about the DeepSeek API.
General Questions
What is the DeepSeek API?
The DeepSeek API provides programmatic access to DeepSeek's advanced language models. You can use it to build applications that leverage AI for text generation, conversation, code generation, and more.
How do I get started?
- Sign up for a DeepSeek account
- Generate an API key from your dashboard
- Make your first API call using our SDKs or direct HTTP requests
- Explore our documentation and examples
What models are available?
Currently available models include:
- deepseek-chat: General-purpose conversational AI
- deepseek-coder: Specialized for code generation and programming tasks
- deepseek-math: Optimized for mathematical reasoning
Is there a free tier?
Yes, we offer a free tier with limited usage to help you get started. Check our pricing page for current limits and upgrade options.
Authentication & API Keys
How do I authenticate API requests?
Use your API key in the `Authorization` header:

```shell
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://api.deepseek.com/v1/chat/completions
```

Or use the `api-key` header:

```shell
curl -H "api-key: YOUR_API_KEY" \
  https://api.deepseek.com/v1/chat/completions
```
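The same call can be made from Python. This is a minimal sketch using the `requests` library; the endpoint and header format mirror the curl examples above, and the helper names are illustrative:

```python
import requests

def auth_headers(api_key):
    # Build the Authorization header expected by the API
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

def chat(api_key, messages, model="deepseek-chat"):
    # Send a chat completion request and return the parsed JSON response
    response = requests.post(
        "https://api.deepseek.com/v1/chat/completions",
        headers=auth_headers(api_key),
        json={"model": model, "messages": messages},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```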
Where do I find my API key?
- Log in to your DeepSeek account
- Go to the API Keys section in your dashboard
- Create a new key or copy an existing one
Can I regenerate my API key?
Yes, you can regenerate your API key at any time from your dashboard. Note that regenerating will invalidate the old key immediately.
How do I keep my API key secure?
- Never commit API keys to version control
- Use environment variables to store keys
- Rotate keys regularly
- Restrict key permissions when possible
- Monitor usage for unusual activity
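For example, loading the key from an environment variable rather than hard-coding it (the variable name `DEEPSEEK_API_KEY` is a convention, not a requirement):

```python
import os

def load_api_key():
    # Read the key from the environment; fail fast if it is missing
    key = os.environ.get("DEEPSEEK_API_KEY")
    if not key:
        raise RuntimeError("DEEPSEEK_API_KEY is not set")
    return key
```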
Rate Limits & Usage
What are the rate limits?
Rate limits vary by plan:
- Free tier: 10 requests per minute
- Pro tier: 100 requests per minute
- Enterprise: Custom limits
How do I handle rate limiting?
Implement exponential backoff with jitter when you receive a 429 status code:

```python
import random
import time

from openai import RateLimitError  # raised by the OpenAI-compatible SDK on 429

def make_request_with_backoff(func, max_retries=3):
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff plus random jitter to avoid thundering herds
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(wait_time)
```
How is usage calculated?
Usage is measured in tokens:
- Input tokens: Text you send to the API
- Output tokens: Text generated by the model
- Both count toward your usage quota
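Exact counts come from the model's tokenizer, but a rough rule of thumb (about 4 characters per token for English text; a heuristic, not the API's actual tokenizer) can help you budget ahead of time:

```python
def estimate_tokens(text):
    # Rough heuristic: ~4 characters per token for English text
    return max(1, len(text) // 4)

def estimate_message_tokens(messages):
    # Sum the estimate over every message's content
    return sum(estimate_tokens(m["content"]) for m in messages)
```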
Can I monitor my usage?
Yes, check your dashboard for real-time usage statistics, including:
- Total requests made
- Tokens consumed
- Rate limit status
- Billing information
Technical Questions
What's the maximum context length?
Context length varies by model:
- deepseek-chat: 32,768 tokens
- deepseek-coder: 16,384 tokens
- deepseek-math: 8,192 tokens
How do I handle long conversations?
For conversations that exceed context limits:
- Summarization: Summarize older messages
- Sliding window: Keep only recent messages
- Chunking: Break long inputs into smaller pieces
```python
def manage_conversation_length(messages, max_tokens=30000):
    # estimate_tokens is assumed to return a token count for the message list
    total_tokens = estimate_tokens(messages)
    if total_tokens > max_tokens:
        # Keep system messages plus the most recent exchanges,
        # excluding system messages from the tail to avoid duplicates
        system_msgs = [msg for msg in messages if msg["role"] == "system"]
        recent_msgs = [msg for msg in messages[-10:] if msg["role"] != "system"]
        return system_msgs + recent_msgs
    return messages
```
Can I use streaming responses?
Yes, set `stream=True` in your request:

```python
stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
How do I control response randomness?
Use the `temperature` parameter:
- 0.0: Deterministic, focused responses
- 1.0: Balanced creativity (default)
- 2.0: Highly creative, more random

```json
{
  "model": "deepseek-chat",
  "messages": [{"role": "user", "content": "Write a poem"}],
  "temperature": 1.5
}
```
What's the difference between temperature and top_p?
- Temperature: Rescales the probability distribution over all candidate tokens; higher values flatten it and increase randomness
- Top_p: Nucleus sampling; samples only from the smallest set of top tokens whose cumulative probability reaches p
Use temperature for general creativity control and top_p for more precise control over how much of the probability distribution is considered.
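For example, a request payload that tightens sampling to the top 90% of probability mass (the value 0.9 is illustrative):

```python
# Restrict sampling to the nucleus of tokens covering 90% of probability mass
payload = {
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Write a poem"}],
    "top_p": 0.9,
}
```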
Error Handling
Why am I getting a 401 error?
401 errors indicate authentication issues:
- Invalid or missing API key
- Expired API key
- Incorrect header format
Verify your API key and header format:

```shell
# Correct format
curl -H "Authorization: Bearer sk-..." \
  https://api.deepseek.com/v1/chat/completions
```
What does "context length exceeded" mean?
This error occurs when your input plus requested output tokens exceed the model's context limit. Solutions:
- Reduce input length
- Lower the max_tokens parameter
- Summarize conversation history
- Use a model with a larger context window
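One simple mitigation is to truncate the input to a token budget before sending it. This sketch uses a rough ~4-characters-per-token heuristic, not the model's actual tokenizer:

```python
def fit_to_context(text, max_input_tokens):
    # Truncate text to roughly max_input_tokens, assuming ~4 chars per token
    max_chars = max_input_tokens * 4
    if len(text) <= max_chars:
        return text
    return text[:max_chars]
```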
How do I handle network timeouts?
Implement timeouts and retries in your requests:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry_strategy = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
)
adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount("http://", adapter)
session.mount("https://", adapter)

# Pass an explicit timeout (in seconds) on each request
response = session.post(
    "https://api.deepseek.com/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"model": "deepseek-chat",
          "messages": [{"role": "user", "content": "Hello!"}]},
    timeout=30,
)
```
Integration & SDKs
Which SDKs are available?
Official SDKs:
- Python: `pip install openai`
- Node.js: `npm install openai`
- Go: Community maintained
- Java: Community maintained
Can I use the OpenAI SDK?
Yes! Our API is compatible with OpenAI's SDK. Just change the base URL:
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com/v1",
)
```
How do I integrate with my existing OpenAI code?
Minimal changes required:
- Update the base URL
- Replace your API key
- Use compatible model names
```python
# Before (OpenAI)
client = OpenAI(api_key="openai-key")

# After (DeepSeek)
client = OpenAI(
    api_key="deepseek-key",
    base_url="https://api.deepseek.com/v1",
)
```
Can I use curl or other HTTP clients?
Absolutely! Our API is RESTful and works with any HTTP client:
```shell
curl -X POST "https://api.deepseek.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
Performance & Optimization
How can I improve response speed?
- Use streaming: Get partial responses immediately
- Optimize prompts: Shorter, clearer prompts
- Reduce max_tokens: Limit response length
- Choose appropriate model: Use specialized models for specific tasks
Should I cache responses?
Yes, for repeated queries:
```python
from functools import lru_cache

@lru_cache(maxsize=100)
def cached_completion(prompt, model, temperature):
    # Strings and numbers are hashable, so lru_cache can key on them
    # directly; the function also receives the prompt it needs to send.
    # Your API call here
    pass

def get_completion(prompt, model="deepseek-chat", temperature=0.7):
    return cached_completion(prompt, model, temperature)
```
How do I optimize token usage?
- Be concise: Remove unnecessary words
- Use system messages: Set context once
- Implement conversation management: Summarize old messages
- Choose appropriate max_tokens: Don't over-allocate
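For instance, setting the context once in a system message and appending only short user turns keeps per-request token overhead low (message contents here are illustrative):

```python
# One system message establishes the assistant's behavior for every turn
messages = [
    {"role": "system", "content": "You are a concise coding assistant."},
]

def add_user_turn(messages, text):
    # Append only the new user input; the system context is not repeated
    messages.append({"role": "user", "content": text})
    return messages
```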
Billing & Pricing
How am I charged?
Billing is based on token usage:
- Input tokens: Text you send
- Output tokens: Text generated
- Different models have different rates
Can I set spending limits?
Yes, configure spending limits in your dashboard to prevent unexpected charges.
What happens if I exceed my quota?
Your requests will be rejected with a quota exceeded error until:
- Your quota resets (for free tier)
- You upgrade your plan
- You purchase additional credits
Do you offer volume discounts?
Yes, enterprise customers can get volume discounts. Contact our sales team for custom pricing.
Security & Privacy
Is my data secure?
Yes, we implement industry-standard security measures:
- Encryption in transit and at rest
- Regular security audits
- Compliance with data protection regulations
Do you store my API requests?
We may temporarily store requests for:
- Service improvement
- Abuse prevention
- Debugging purposes
Data retention policies are detailed in our privacy policy.
Can I use the API for sensitive data?
For sensitive data, consider:
- Using our enterprise deployment options
- Implementing additional encryption
- Reviewing our data processing agreements
Is the API GDPR compliant?
Yes, we are GDPR compliant. See our privacy policy for details on data processing and your rights.
Troubleshooting
My requests are slow. What can I do?
- Check your internet connection
- Try a different model
- Reduce input/output length
- Use streaming for better perceived performance
- Check our status page for service issues
I'm getting inconsistent responses. Why?
This is normal for AI models. To get more consistent responses:
- Lower the temperature parameter
- Use more specific prompts
- Set a seed parameter (if available)
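A sketch of request parameters biased toward reproducibility. Whether `seed` is honored depends on the model and endpoint; treat it as an assumption and check the API reference:

```python
# Parameters that reduce sampling randomness between identical requests
deterministic_params = {
    "model": "deepseek-chat",
    "temperature": 0.0,  # near-greedy, low-randomness sampling
    "seed": 42,          # only effective if the endpoint supports seeding
}

# Usage with an OpenAI-compatible client (sketch):
# client.chat.completions.create(messages=[...], **deterministic_params)
```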
How do I report bugs or issues?
- Check our documentation first
- Search existing issues in our community forum
- Contact support with detailed information:
- Request/response examples
- Error messages
- Steps to reproduce
Where can I get help?
- Documentation: Comprehensive guides and examples
- Community Forum: Ask questions and share solutions
- Support: Direct help for technical issues
- Discord: Real-time community chat
Best Practices
Prompt Engineering
- Be specific: Clear, detailed instructions
- Use examples: Show the desired format
- Set context: Use system messages effectively
- Iterate: Test and refine your prompts
Error Handling
- Implement retries: Handle transient errors
- Validate inputs: Check parameters before sending
- Log errors: Track issues for debugging
- Graceful degradation: Provide fallbacks
Performance
- Batch requests: When possible, combine multiple queries
- Use appropriate models: Match model to task
- Monitor usage: Track performance and costs
- Implement caching: Avoid redundant requests
Security
- Secure API keys: Use environment variables
- Validate inputs: Sanitize user inputs
- Monitor usage: Watch for unusual activity
- Regular rotation: Update keys periodically
Getting Help
Support
Can't find what you're looking for? Contact our support team for personalized assistance.