April 24, 2026 — DeepSeek officially released a preview of the DeepSeek-V4 series, open-sourcing the models and launching API services at the same time. The series includes two versions, DeepSeek-V4-Pro and DeepSeek-V4-Flash, which lead both domestic and open-source models in Agent capability, world knowledge, and reasoning performance, and mark the official arrival of generally accessible 1-million-token context.
Core Capabilities: 1M Context and Top-Tier Performance
DeepSeek-V4 introduces a novel attention mechanism with token-level compression, combined with DSA (DeepSeek Sparse Attention), to achieve globally leading long-context capability. Compared with traditional dense attention, V4 substantially reduces compute and memory requirements, making a 1M (one million) token context the standard for all official services.
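DSA's exact design is described in the technical report. As a rough, hypothetical illustration of the general top-k sparse-attention idea (not DeepSeek's actual implementation), each query can attend to only its k highest-scoring keys instead of the full sequence:

```python
import numpy as np

def sparse_attention(q, k, v, top_k):
    # Illustrative top-k sparse attention, NOT the real DSA algorithm.
    # q: (Tq, d), k/v: (Tk, d). Each query attends only to its top_k keys.
    scores = q @ k.T / np.sqrt(q.shape[-1])                    # (Tq, Tk)
    idx = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:] # (Tq, top_k)
    sel = np.take_along_axis(scores, idx, axis=-1)             # (Tq, top_k)
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                         # softmax over selected keys only
    return np.einsum('qk,qkd->qd', w, v[idx])                  # mix only the selected values
```

Because each query touches only `top_k` keys, the softmax and value-mixing work scale with `top_k` rather than with the full context length, which is where the compute and memory savings come from. This sketch covers only the sparse-selection half; the token-level compression mentioned above is additional machinery not shown here.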
DeepSeek-V4-Pro: Performance on Par with Top Closed-Source Models
- Significantly Enhanced Agent Capabilities: in Agentic Coding benchmarks, V4-Pro achieves the best performance among open-source models, with user experience surpassing Sonnet 4.5 and delivery quality approaching Opus 4.6 in non-thinking mode
- World Knowledge Leadership: significantly outperforms other open-source models, trailing only the top closed-source model, Gemini-Pro-3.1
- Exceptional Reasoning Performance: surpasses all publicly evaluated open-source models on mathematics, STEM, and competitive coding benchmarks
DeepSeek-V4-Flash: Faster and More Cost-Effective Option
V4-Flash has a slightly smaller store of world knowledge than the Pro version, but comparable reasoning capability. With fewer total and activated parameters, V4-Flash offers a faster and more economical API service, well suited to simpler task scenarios.
Dual-Chip Architecture Support: Ascend and NVIDIA in Parallel
Another highlight of DeepSeek-V4 is its comprehensive hardware compatibility. The models simultaneously support:
- Huawei Ascend: adapted for the Ascend 910 series and other mainstream domestic AI chips
- NVIDIA GPUs: full support for H-series, A100, L40S, and other mainstream GPU models
This dual-chip compatible design allows enterprises to flexibly choose based on their infrastructure and compliance needs, reducing the barriers to AI application deployment.
API Access
The DeepSeek-V4 API has been updated in step with the release, supporting both the OpenAI Chat Completions and Anthropic-compatible interface formats:
# V4-Pro
model: deepseek-v4-pro
# V4-Flash
model: deepseek-v4-flash
Important Notice: the legacy model names deepseek-chat and deepseek-reasoner will be deprecated on July 24, 2026. They currently map to V4-Flash non-thinking and thinking modes, respectively.
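To make the two interface formats concrete, here is a minimal sketch of request bodies in each style, plus the legacy-alias mapping from the deprecation notice. The endpoint URLs are assumptions modeled on DeepSeek's existing OpenAI-compatible and Anthropic-compatible APIs; confirm them against the official documentation before use.

```python
import json

# Endpoint URLs below are ASSUMPTIONS based on DeepSeek's current API layout;
# check the official docs before relying on them.
OPENAI_STYLE_URL = "https://api.deepseek.com/chat/completions"
ANTHROPIC_STYLE_URL = "https://api.deepseek.com/anthropic/v1/messages"

# OpenAI Chat Completions format
openai_body = {
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Explain sparse attention briefly."}],
}

# Anthropic Messages format (max_tokens is required in this format)
anthropic_body = {
    "model": "deepseek-v4-flash",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain sparse attention briefly."}],
}

# Legacy aliases, deprecated on July 24, 2026, and the model they map to today
legacy_aliases = {
    "deepseek-chat": "deepseek-v4-flash",      # non-thinking mode
    "deepseek-reasoner": "deepseek-v4-flash",  # thinking mode
}

print(json.dumps(openai_body, indent=2))
```

Existing integrations built on either SDK ecosystem should therefore only need the model name (and, for legacy aliases, the deprecation date) changed, not a new client library.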
Xi’an Boao Intelligent: Rapid Response, Enabling Enterprise Adaptation
As a leading AI enterprise in Northwest China, Xi’an Boao Intelligent Technology Co., Ltd. has already begun adaptation work for DeepSeek-V4. Our technical team provides the following services:
- Deployment and optimization of DeepSeek-V4 in Ascend chip environments
- Performance tuning of DeepSeek-V4 in NVIDIA GPU environments
- Seamless integration of DeepSeek-V4 with enterprises’ existing AI systems
- Enterprise-level Agent application development based on DeepSeek-V4
Xi’an Boao Intelligent always upholds the philosophy of “Turning Technology into Real Productivity,” helping enterprises quickly embrace cutting-edge AI capabilities. For more information, please contact our technical team.
References:
- DeepSeek-V4 Technical Report: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf
- DeepSeek API Documentation: https://api-docs.deepseek.com/