April 24, 2026 — DeepSeek officially released the preview version of the DeepSeek-V4 series, open-sourcing the models and launching API services at the same time. The series includes two versions, DeepSeek-V4-Pro and DeepSeek-V4-Flash, which lead both domestically and among open-source models in Agent capabilities, world knowledge, and reasoning performance, and mark the arrival of widely accessible 1-million-token context.

Core Capabilities: 1M Context and Top-Tier Performance

DeepSeek-V4 introduces a novel attention mechanism with token-level compression, combined with DSA (DeepSeek Sparse Attention), to achieve globally leading long-context capability. Compared with conventional attention, V4 significantly reduces compute and memory requirements, making a 1M (one-million-token) context window the standard for all official services.
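To illustrate the general idea behind sparse attention, here is a generic top-k sketch in which each query attends only to its highest-scoring keys instead of the full sequence. This is an illustration only: the actual DSA and token-compression mechanisms are not specified in this announcement, and the `top_k` selection below is an assumption for demonstration.

```python
import numpy as np

def sparse_attention(q, k, v, top_k=4):
    """Generic top-k sparse attention sketch: each query row attends
    only to its top_k highest-scoring keys; all other positions are
    masked out, so compute on the value side scales with top_k rather
    than with the full key length."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                      # (n_q, n_k) full score matrix
    # Indices of the top_k largest scores per query row.
    idx = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:]
    # Mask everything else to -inf so softmax assigns it zero weight.
    masked = np.full_like(scores, -np.inf)
    np.put_along_axis(masked, idx, np.take_along_axis(scores, idx, axis=-1), axis=-1)
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 16))    # 8 queries, head dim 16
k = rng.normal(size=(32, 16))   # 32 keys
v = rng.normal(size=(32, 16))
out = sparse_attention(q, k, v)
print(out.shape)  # (8, 16)
```

With `top_k=1` each output row is exactly one value row, which makes the sparsity easy to verify; real sparse-attention designs combine such selection with compression and hardware-aware kernels.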

DeepSeek-V4-Pro: Performance on Par with Top Closed-Source Models

  • Significantly Enhanced Agent Capabilities: In Agentic Coding benchmarks, V4-Pro achieves the best performance among open-source models, with a user experience superior to Sonnet 4.5 and delivery quality approaching Opus 4.6 in non-thinking mode
  • World Knowledge Leadership: Significantly outperforms other open-source models, trailing only the top closed-source model, Gemini-Pro-3.1
  • Exceptional Reasoning Performance: Surpasses all publicly evaluated open-source models on mathematics, STEM, and competitive coding benchmarks

DeepSeek-V4-Flash: Faster and More Cost-Effective Option

V4-Flash has a slightly smaller store of world knowledge than the Pro version but demonstrates comparable reasoning capability. With fewer total and activated parameters, V4-Flash provides faster and more economical API service, making it well suited to simpler task scenarios.

Dual-Chip Architecture Support: Ascend and NVIDIA in Parallel

Another highlight of DeepSeek-V4 is its comprehensive hardware compatibility. The models simultaneously support:

  • Huawei Ascend: Adapted for the Ascend 910 series and other mainstream domestic AI chips
  • NVIDIA GPUs: Full support for the H-series, A100, L40S, and other mainstream GPU models

This dual-chip compatible design allows enterprises to flexibly choose based on their infrastructure and compliance needs, reducing the barriers to AI application deployment.

API Access

The DeepSeek-V4 API has been updated in step with the release, supporting both the OpenAI ChatCompletions and Anthropic interface formats:

# V4-Pro
model: deepseek-v4-pro

# V4-Flash  
model: deepseek-v4-flash
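Because the API keeps the standard OpenAI ChatCompletions shape, switching between the two variants is a one-field change in the request body. A minimal sketch of the request payload (only the model names come from this announcement; the other fields are the standard ChatCompletions ones):

```python
import json

def chat_request(model, user_message, max_tokens=1024):
    """Build a standard ChatCompletions-style request body.
    The model names are from the V4 announcement; everything else
    follows the usual OpenAI-compatible schema."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }

# Switching between the two V4 variants changes only the model field:
pro = chat_request("deepseek-v4-pro", "Summarize this repository.")
flash = chat_request("deepseek-v4-flash", "Classify this support ticket.")
print(json.dumps(pro, ensure_ascii=False))
```

The same payload can be sent with any OpenAI-compatible client by pointing its base URL at the DeepSeek API endpoint.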

Important Notice: The legacy model names deepseek-chat and deepseek-reasoner will be deprecated on July 24, 2026. Until then, they map to V4-Flash non-thinking and thinking modes, respectively.
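For code still using the legacy names, migration before the deprecation date can be as simple as a name mapping. A hedged sketch (the mapping follows the notice above; how thinking mode is toggled on V4-Flash is not specified in this announcement):

```python
# Both legacy names map to V4-Flash per the deprecation notice; the
# thinking/non-thinking distinction must be handled separately, by
# whatever mechanism the V4 API exposes for it.
LEGACY_TO_V4 = {
    "deepseek-chat": "deepseek-v4-flash",      # was: non-thinking mode
    "deepseek-reasoner": "deepseek-v4-flash",  # was: thinking mode
}

def migrate_model_name(name):
    """Return the V4 replacement for a legacy model name, or the
    name unchanged if it is not a legacy one."""
    return LEGACY_TO_V4.get(name, name)

print(migrate_model_name("deepseek-chat"))  # deepseek-v4-flash
```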

Xi’an Boao Intelligent: Rapid Response, Enabling Enterprise Adaptation

As a leading AI enterprise in Northwest China, Xi’an Boao Intelligent Technology Co., Ltd. began adaptation work for DeepSeek-V4 immediately upon release. Our technical team provides the following services:

  • Deployment and optimization of DeepSeek-V4 in Ascend chip environments
  • Performance tuning of DeepSeek-V4 in NVIDIA GPU environments
  • Seamless integration of DeepSeek-V4 with enterprises’ existing AI systems
  • Enterprise-level Agent application development based on DeepSeek-V4

Xi’an Boao Intelligent upholds its philosophy of “turning technology into real productivity,” helping enterprises quickly adopt cutting-edge AI capabilities. For more information, please contact our technical team.

