Bengaluru-based startup Sarvam AI unveiled two significant new language models at the India AI Impact Summit: Sarvam 30B, a 30-billion-parameter model, and a larger 105-billion-parameter model. Both were developed to strengthen India’s sovereign artificial intelligence capabilities, with Sarvam 30B built around an architecture that prioritizes efficiency and performance.
Co-founder Pratyush Kumar highlighted that the models were trained from scratch using a mixture-of-experts (MoE) architecture, an approach particularly well suited to reasoning and other complex tasks because it reduces inference costs. According to the company, the 30B model leads similar-sized models on thinking and reasoning benchmarks at both the 8K and 16K scales, and supports a 32,000-token context window.
The larger 105-billion-parameter model, designed for advanced reasoning and agent-based tasks, activates only 9 billion parameters per token and offers a 128,000-token context window. Kumar said this model outperforms global systems such as DeepSeek R1 and Gemini Flash on various benchmarks. Sarvam AI’s commitment to making AI accessible at population scale aligns with India’s initiative to develop foundational AI models for diverse public applications.
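The cost savings Kumar describes come from sparse activation: a router sends each token to only a few of the model’s experts, so most parameters sit idle on any given forward pass. The following is a minimal NumPy sketch of that idea, using toy dimensions and a top-1 router chosen purely for illustration; it is not Sarvam’s actual implementation or configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions for illustration only (not Sarvam's real configuration).
d_model = 8        # hidden size
n_experts = 4      # total experts in the layer
top_k = 1          # experts activated per token

# Each expert is a simple linear map; the router scores all experts.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    """Route each token to its top-k experts; only those experts run."""
    logits = x @ router                          # (tokens, n_experts)
    top = np.argsort(logits, axis=1)[:, -top_k:]  # indices of chosen experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = top[t]
        # Softmax over the selected experts' scores gives mixing weights.
        w = np.exp(logits[t, sel] - logits[t, sel].max())
        w /= w.sum()
        for weight, e in zip(w, sel):
            out[t] += weight * (x[t] @ experts[e])
    return out

tokens = rng.standard_normal((3, d_model))
y = moe_forward(tokens)
print(y.shape)  # each token touched only top_k of n_experts experts
```

Scaled up, this is why a 105-billion-parameter MoE model can run inference at roughly the cost of its active parameters (here, 9 billion) rather than its full parameter count.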
