Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video input and produces text output, with a 1M token context window. This is an updated version of Qwen3.5 Plus with tiered pricing above 256K tokens.
Modalities
Input Price
25% off
$0.30per 1M
Output Price
25% off
$1.80per 1M
Context
1M
Weekly Tokens
3.72B
Released
Apr 27, 2026
Sample code and API for Qwen3.5 Plus 2026-04-20
OpenRouter normalizes requests and responses across providers for you.
1
Get your API key
Create an API key from your OpenRouter dashboard and set it as an environment variable:
2
Make your first request
Use qwen/qwen3.5-plus-20260420 with the OpenRouter API:
OpenRouter supports reasoning-enabled models that can show their step-by-step thinking process. Use the reasoning parameter in your request to enable reasoning, and access the reasoning_details array in the response to see the model's internal reasoning before the final answer. When continuing a conversation, preserve the complete reasoning_details when passing messages back to the model so it can continue reasoning from where it left off. Learn more about reasoning tokens.
In the examples below, the OpenRouter-specific headers are optional. Setting them allows your app to appear on the OpenRouter leaderboards.
Using third-party SDKs
For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.
3
Enable streaming
Add "stream": true to your request body to receive responses as server-sent events:
Endpoint
POSThttps://openrouter.ai/api/v1/chat/completions
AuthorizationBearer $OPENROUTER_API_KEY
Content-Typeapplication/json
HTTP-Refereroptional — your site URL, for rankings
X-Titleoptional — your site name, for rankings
Modelqwen/qwen3.5-plus-20260420
Parameters
Name
Type
Default
Description
reasoning
map
—
Controls reasoning behavior for models that support thinking tokens, including whether reasoning is enabled, the reasoning effort, maximum reasoning tokens, and whether reasoning is excluded from the response.
include_reasoning
boolean
—
Deprecated alias for reasoning.exclude.
max_tokens
integer
—
This sets the upper limit for the number of tokens the model can generate in response.
temperature
float
1
This setting influences the variety in the model's responses.
top_p
float
1
This setting limits the model's choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P.
seed
integer
—
If specified, the inferencing will sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
presence_penalty
float
0
Adjusts how often the model repeats specific tokens already used in the input.
response_format
map
—
Forces the model to produce specific output format.
tools
array
—
Tool calling parameter, following OpenAI's tool calling request shape.
tool_choice
string or object
—
Controls which (if any) tool is called by the model.
structured_outputs
boolean
—
If the model can return structured outputs using response_format json_schema.