NVIDIA: Nemotron 3 Super (free) – API Quickstart

Sample code and API for Nemotron 3 Super (free)

OpenRouter normalizes requests and responses across providers for you.

Get your API key

Create an API key from your OpenRouter dashboard and set it as an environment variable:

Make your first request

Use nvidia/nemotron-3-super-120b-a12b:free with the OpenRouter API:

OpenRouter supports reasoning-enabled models that can show their step-by-step thinking process. Use the reasoning parameter in your request to enable reasoning, and access the reasoning_details array in the response to see the model's internal reasoning before the final answer. When continuing a conversation, preserve the complete reasoning_details when passing messages back to the model so it can continue reasoning from where it left off. Learn more about reasoning tokens.

In the examples below, the OpenRouter-specific headers are optional. Setting them allows your app to appear on the OpenRouter leaderboards.

Using third-party SDKs

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.

Enable streaming

Add "stream": true to your request body to receive responses as server-sent events:

Endpoint

POSThttps://openrouter.ai/api/v1/chat/completions

AuthorizationBearer $OPENROUTER_API_KEY

Content-Typeapplication/json

HTTP-Refereroptional — your site URL, for rankings

X-Titleoptional — your site name, for rankings

Modelnvidia/nemotron-3-super-120b-a12b:free

Parameters

Name	Type	Default	Description
`reasoning`	map	—	Controls reasoning behavior for models that support thinking tokens, including whether reasoning is enabled, the reasoning effort, maximum reasoning tokens, and whether reasoning is excluded from the response.
`include_reasoning`	boolean	—	Deprecated alias for reasoning.exclude.
`temperature`	float	`1`	This setting influences the variety in the model's responses.
`max_tokens`	integer	—	This sets the upper limit for the number of tokens the model can generate in response.
`seed`	integer	—	If specified, the inferencing will sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
`top_p`	float	`1`	This setting limits the model's choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P.
`tools`	array	—	Tool calling parameter, following OpenAI's tool calling request shape.
`tool_choice`	string or object	—	Controls which (if any) tool is called by the model.
`structured_outputs`	boolean	—	If the model can return structured outputs using response_format json_schema.
`response_format`	map	—	Forces the model to produce specific output format.