Qwen3.5 from Alibaba: Next-Gen Open-Source Multimodal LLM with Native Agent Capabilities

Date: February 16, 2026
Category: Artificial Intelligence, Open Source LLMs, Multimodal AI
Reading Time: 8 Minutes


The open-source AI landscape witnessed a pivotal moment today. On February 16, 2026, the Qwen Team at Alibaba Cloud officially released Qwen3.5, comprising two new models: Qwen3.5-Plus and Qwen3.5-397B-A17B (the first open-weight version in the Qwen 3.5 series).

Qwen3.5-Plus is positioned as the latest large language model in the Qwen3.5 series, while Qwen3.5-397B-A17B is positioned as the flagship large language model in the open-source Qwen 3.5 series. Both models support text and multimodal tasks.

Moving beyond the industry's obsession with raw parameter scaling, Qwen3.5 represents a "humble leap" forward. It prioritizes architectural efficiency, native multimodal understanding, and massive-scale reinforcement learning to deliver a model that is accessible yet performs at the frontier level.

In this technical review, we explore the specifications, architecture, benchmark performance, and deployment strategies for Qwen3.5, helping developers understand why this model is a significant upgrade for AI agents.

Qwen3.5-397B-A17B – Key Highlights


🚀 First open-weight model of the Qwen3.5 series – now released!
🖼️ Native multimodal, built & trained for real-world agents
✨ Hybrid linear attention + sparse MoE + massive RL scaling
⚡ 8.6–19.0× faster decoding than Qwen3-Max
🌍 Supports 201 languages & dialects
📜 Apache 2.0 licensed – fully open

Architecture: Efficiency Through Innovation

The defining characteristic of Qwen3.5 is its ability to do more with less. While the headline parameter count is 397 billion, the model utilizes a sophisticated Sparse Mixture-of-Experts (MoE) architecture that activates only 17 billion parameters per forward pass.

This design allows Qwen3.5 to maintain the vast knowledge base of a 400B+ model while running with the inference latency and cost profile of a much smaller model.
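
The routing idea behind sparse MoE can be sketched in a few lines. This is a toy illustration, not Qwen3.5's actual router (the expert count, top-k value, and logits here are invented): each token's router scores are softmax-normalized and only the top-k experts execute, so compute scales with k rather than with the total expert count.

```python
import math

def route_token(router_logits, k=2):
    """Pick the top-k experts for one token and renormalize their weights.

    Toy sketch of sparse MoE routing: only k of len(router_logits) experts
    are activated per token, so per-token compute scales with k.
    """
    # Softmax over all expert logits.
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the k highest-probability experts, renormalized to sum to 1.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return {i: probs[i] / mass for i in top}  # expert index -> gate weight

# Example: 8 experts, 2 activated per token.
gates = route_token([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
print(gates)  # two experts selected; gate weights sum to 1
```

The same principle, scaled up, is how a 397B-parameter model can run a forward pass that touches only 17B parameters.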

Key Technical Specifications

According to the official technical report, the Qwen3.5-397B-A17B features:

  • Total Parameters: 397B (17B Activated)
  • Architecture: Hybrid Gated DeltaNet (Linear Attention) + MoE
  • Layer Structure: 60 layers, arranged as 15 blocks of [3× (Gated DeltaNet → MoE) → 1× (Gated Attention → MoE)]
  • Context Window: 262,144 tokens (native), extensible to 1,010,000 tokens
  • Vocabulary Size: 248,320 (Expanded for multilingual efficiency)
  • Hidden Dimension: 4096
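
The reported layer layout can be expanded programmatically to verify the arithmetic: 15 blocks × (3 linear-attention layers + 1 full-attention layer) = 60 layers. A small sketch (the layer labels are descriptive strings, not actual module names from the release):

```python
def qwen35_layout(blocks=15):
    """Expand the reported repeating pattern:
    3x (Gated DeltaNet -> MoE) followed by 1x (Gated Attention -> MoE)."""
    layers = []
    for _ in range(blocks):
        layers += ["GatedDeltaNet+MoE"] * 3  # linear-attention layers
        layers += ["GatedAttention+MoE"]     # one full-attention layer
    return layers

layout = qwen35_layout()
print(len(layout))                         # 60 layers total
print(layout.count("GatedAttention+MoE"))  # 15 full-attention layers
```

Note the 3:1 ratio: only a quarter of the layers pay the quadratic cost of full attention, which is where the long-context savings come from.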

The "Gated DeltaNet" Advantage

By integrating Gated Delta Networks, a form of linear attention, Qwen3.5 significantly optimizes memory usage. In standard 32k context scenarios, decoding throughput is 8.6x higher than the previous Qwen3-Max. For ultra-long context tasks (256k), throughput improves by up to 19x, with a reported 60% reduction in deployment VRAM usage.
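
The memory argument is easy to see with back-of-the-envelope numbers. In this hedged illustration (head counts, dimensions, and dtype sizes are invented for the example, not Qwen3.5's real configuration), a standard attention KV cache grows linearly with context length, while a linear-attention recurrent state stays constant regardless of context:

```python
def kv_cache_bytes(ctx_len, n_layers=60, n_kv_heads=8, head_dim=128, dtype_bytes=2):
    """Per-sequence KV cache for standard attention: grows with context length.
    Stores K and V (factor of 2) per token, per layer, per KV head."""
    return ctx_len * n_layers * n_kv_heads * head_dim * 2 * dtype_bytes

def linear_state_bytes(n_layers=60, n_heads=8, head_dim=128, dtype_bytes=2):
    """Per-sequence recurrent state for linear attention: fixed size,
    one head_dim x head_dim state matrix per head, independent of context."""
    return n_layers * n_heads * head_dim * head_dim * dtype_bytes

short = kv_cache_bytes(32_768)
long = kv_cache_bytes(262_144)
state = linear_state_bytes()
print(long / short)  # KV cache grows 8x when context grows 8x
print(state)         # linear-attention state is the same at any context length
```

With these illustrative numbers the 32k KV cache is already several gigabytes per sequence, while the linear-attention state is fixed at tens of megabytes, which is the intuition behind the reported VRAM reduction at long context.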

Native Multimodal Capabilities

Unlike previous generations that stitched vision encoders onto text models, Qwen3.5 is natively multimodal. It was trained from scratch on a massive dataset of interleaved text, image, and video tokens.

This "early fusion" approach allows Qwen3.5 to "see" the world more like a human does.

  • Video Understanding: With a 1M token context, the model can process and analyze up to 2 hours of continuous video.
  • Visual Coding: It can interpret hand-drawn UI sketches and generate functional frontend code directly.
  • Spatial Reasoning: The model demonstrates improved performance in robotics planning and spatial analysis tasks.

Benchmark Performance

Benchmark overview: Qwen3.5-397B-A17B posts strong results across reasoning, coding, agent, and multimodal evaluations.

The Qwen Team has released extensive comparison data. Qwen3.5 consistently achieves parity with, and often surpasses, proprietary frontier models.

Language & Reasoning

In pure logic and knowledge tasks, the efficient 17B activated parameters hold their own against massive dense models.

| Benchmark | Qwen3.5 (397B-A17B) | GPT-5.2 | Claude 4.5 Opus | Gemini-3 Pro |
|---|---|---|---|---|
| MMLU-Pro (Knowledge) | 87.8 | 87.4 | 89.5 | 89.8 |
| GPQA (STEM) | 88.4 | 92.4 | 87.0 | 91.9 |
| IFBench (Instruction) | 76.5 | 75.4 | 58.0 | 70.4 |
| LiveCodeBench v6 | 83.6 | 87.7 | 84.8 | 90.7 |

Vision-Language & Agents

This is where Qwen3.5 truly shines, particularly in agentic workflows and visual reasoning.

| Benchmark | Qwen3.5 (397B-A17B) | Qwen3-VL | Gemini-3 Pro |
|---|---|---|---|
| MathVision | 88.6 | 74.6 | 86.6 |
| RealWorldQA | 83.9 | 81.3 | 83.3 |
| OmniDocBench 1.5 | 90.8 | 88.5 | 88.5 |
| BFCL-V4 (General Agent) | 72.9 | 67.7 | 72.5 |

Note: Benchmark data sourced from the official Qwen3.5 release blog, February 2026.

Reinforcement Learning & Agents

Qwen3.5 has been fine-tuned using a scalable asynchronous Reinforcement Learning (RL) framework. The model was trained across millions of agent environments, learning to plan, use tools, and correct its own errors.

This makes Qwen3.5 highly effective for:

  1. Computer Control: Automating tasks across desktop and mobile operating systems (OSWorld).
  2. Web Research: Autonomously browsing, filtering, and summarizing complex topics.
  3. "Vibe Coding": Working seamlessly with IDE agents like Qwen Code to iterate on software projects using natural language.

Global Accessibility: 201 Languages

In a push for inclusivity, Qwen3.5 supports 201 languages and dialects. The vocabulary expansion to ~250k tokens improves encoding efficiency for low-resource languages by 10–60%, making it a truly global foundation model.

Qwen3.5 also performs strongly on core reasoning, coding, and agent benchmarks, with significantly reduced deployment costs and much higher inference efficiency than its predecessor. And because the 397B-A17B weights are Apache 2.0 licensed and can be downloaded and run locally, its cost-effectiveness is exceptional.

Deployment Guide

Developers can access Qwen3.5 immediately via open weights or managed APIs.

Open Source Deployment

The weights are available on Hugging Face and ModelScope. Due to the MoE architecture, using the latest versions of inference engines is recommended.

Using vLLM (Recommended for Production):

vllm serve Qwen/Qwen3.5-397B-A17B \
  --port 8000 \
  --tensor-parallel-size 8 \
  --max-model-len 262144 \
  --reasoning-parser qwen3

Using SGLang:

python -m sglang.launch_server \
  --model-path Qwen/Qwen3.5-397B-A17B \
  --tp-size 8 \
  --context-length 262144

Managed API (Qwen3.5-Plus)

For those preferring a managed solution, Qwen3.5-Plus is available on Alibaba Cloud Model Studio. It enables "Thinking Mode" by default and costs approximately 0.8 RMB per million tokens, roughly 1/18th the price of Gemini 3 Pro, making it highly cost-effective at scale.
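
The pricing claim is easy to sanity-check. A quick back-of-the-envelope calculation (the monthly token volume is an invented illustration; only the 0.8 RMB rate and the 1/18 ratio come from the release notes):

```python
QWEN_PLUS_RMB_PER_M = 0.8  # quoted price per million tokens
GEMINI_RATIO = 18          # "1/18th the price of Gemini 3 Pro"

# Implied Gemini 3 Pro price under the stated ratio.
gemini_rmb_per_m = QWEN_PLUS_RMB_PER_M * GEMINI_RATIO

# Illustrative workload: 500M tokens per month.
monthly_tokens_m = 500
qwen_cost = QWEN_PLUS_RMB_PER_M * monthly_tokens_m
gemini_cost = gemini_rmb_per_m * monthly_tokens_m

print(round(gemini_rmb_per_m, 2))         # implied price: 14.4 RMB per M tokens
print(round(gemini_cost - qwen_cost, 2))  # monthly savings at this volume
```

At this illustrative volume the difference is thousands of RMB per month, which is why the ratio matters more than the absolute per-token price.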

Conclusion

Qwen3.5 is more than just an upgrade; it is a validation of efficient, hybrid architectures. By delivering frontier-class intelligence with a 17B active parameter footprint, the Qwen Team has lowered the barrier to entry for advanced AI.

Whether you are building complex multimodal agents, analyzing long-form video, or deploying multilingual applications, Qwen3.5 offers a robust, open-source foundation. We look forward to seeing the innovations the community will build on top of this impressive release.

🔗 Dive in

GitHub: https://github.com/QwenLM/Qwen3.5
Chat: https://chat.qwen.ai
API: https://modelstudio.console.alibabacloud.com/ap-southeast-1/?tab=doc#/doc/?type=model&url=2840914_2&modelId=group-qwen3.5-plus
Qwen Code: https://github.com/QwenLM/qwen-code
Hugging Face: https://huggingface.co/collections/Qwen/qwen35
ModelScope: https://modelscope.cn/collections/Qwen/Qwen35
Blog: https://qwen.ai/blog?id=qwen3.5

For more details, visit the Official Qwen GitHub or the Hugging Face Collection.
