Nemotron-3-Nano-30B-A3B

Pricing: 17.5 DZD input / 70 DZD output per 1M tokens

NVIDIA Nemotron 3 Nano is a small open reasoning model optimized for fast, cost-efficient inference in agentic and production workloads. Built on a hybrid architecture that combines Mixture-of-Experts (MoE) with Mamba-Transformer layers, it delivers strong multi-step reasoning, high token throughput, stable latency with predictable cost, and efficient deployment for agent-based systems. It is designed for real-world AI systems in which reasoning can generate significantly more tokens per prompt, and it reduces compute cost in that setting while maintaining strong reasoning quality.
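Because reasoning can add many output tokens per prompt, the per-request cost is usually dominated by the output price. The sketch below estimates that cost from the listed rates (17.5 DZD per 1M input tokens, 70 DZD per 1M output tokens). The function name and the sample token counts are hypothetical, and the sketch assumes that reasoning tokens are billed at the output rate, which this listing does not state.

```python
# Listed rates, in DZD per 1M tokens.
INPUT_DZD_PER_MILLION = 17.5
OUTPUT_DZD_PER_MILLION = 70.0


def estimate_cost_dzd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in DZD (hypothetical helper).

    Assumption: reasoning tokens are counted as output tokens.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_DZD_PER_MILLION
             + output_tokens * OUTPUT_DZD_PER_MILLION)
    return total / 1_000_000


# Illustrative request: 2,000 prompt tokens and 8,000 generated tokens.
cost = estimate_cost_dzd(2_000, 8_000)
print(f"Estimated cost: {cost:.3f} DZD")
```

With these sample numbers the output side accounts for 0.560 DZD of the estimate and the input side for 0.035 DZD, which shows why long reasoning traces matter more to the bill than long prompts.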

Public, fp4, Function, Project

Architecture: MoE

Context Window: 262K

Model Information