The Hidden Dollar Drain Behind India’s AI Rush

India’s AI boom may be quietly creating its next major dollar outflow problem.

As startups and enterprises rush to embed generative AI and large language models into everything from customer support to internal workflows, a growing share of their technology spending is flowing overseas through AI inference bills.

AI inference refers to the process of using trained AI models to generate responses and execute tasks in real-world applications. Unlike traditional software, AI-native applications incur a fresh compute cost with every interaction, whether that is a chatbot query, a personalised result from a recommendation engine, or a step in an AI-powered workflow automation. These costs are typically measured in tokens, the basic units of data processed by AI models.

Even as India emerges as one of the world’s largest consumers of AI, much of the underlying economic value accrues abroad.

SaaS giant Zoho’s cofounder Sridhar Vembu recently likened the phenomenon to an ‘oil import bill’ for the AI era, warning that dependence on foreign AI compute infrastructure could become a vulnerability for countries like India. Zoho itself spends ‘a few million a year’ on AI model subscriptions, Vembu disclosed.

Inc42 dug deeper into the problem. At what stage of a startup’s lifecycle do AI inference costs become a material factor? How can startups mitigate these cost concerns? And should India have a plan to retain some of the revenue value in such AI inference calls?

Earning In Rupees, Spending In Dollars

At Gurugram-based beautytech startup Style Lounge, AI inference and model usage already account for nearly 8–12% of the company’s AI and cloud infrastructure costs, a share it expects could rise to as much as 25% as more customer journeys become AI-led.

The company, founded in 2024, relies on AWS for cloud and GPU infrastructure, alongside OpenAI APIs, image analysis models, and other AI tooling layers. Most serious AI workloads today are ultimately billed in dollars or tied to dollar-linked pricing, meaning that even if revenues are earned in rupees, the underlying cost base remains global.

“In classic SaaS, once the product is built, the marginal cost of serving one more customer is relatively low. But in AI-first SaaS, every skin-analysis flow, chatbot response, recommendation, automation trigger, or customer-care interaction can create a paid inference event. So the cost scales with usage,” said Deepak Gupta, cofounder, Style Lounge.

To be sure, inference cost refers to the compute expense incurred every time an AI model processes a prompt and generates a response. In AI-powered applications, every user interaction, API call, or automated workflow is a billable event, which means that costs rise with scale. These costs are typically measured through input tokens, output tokens, and, in self-hosted deployments, GPU compute time.
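The token-based billing described above can be sketched as a back-of-envelope calculation. The Python snippet below is a minimal illustration, not any vendor's actual pricing model: the `TokenPricing` class, the function names, the per-million-token prices, the token counts, and the exchange rate are all hypothetical placeholders chosen only to show how a dollar-denominated bill scales with usage for a company earning in rupees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical prices, quoted in USD per million tokens."""
    usd_per_million_input: float
    usd_per_million_output: float


def request_cost_usd(pricing: TokenPricing, input_tokens: int, output_tokens: int) -> float:
    """Cost of one inference call; input and output tokens are priced separately."""
    return (
        input_tokens * pricing.usd_per_million_input
        + output_tokens * pricing.usd_per_million_output
    ) / 1_000_000


def monthly_bill_inr(
    pricing: TokenPricing,
    requests_per_month: int,
    input_tokens: int,
    output_tokens: int,
    inr_per_usd: float,
) -> float:
    """Monthly bill in rupees, assuming every request has the same token counts."""
    per_request = request_cost_usd(pricing, input_tokens, output_tokens)
    return requests_per_month * per_request * inr_per_usd


# Illustrative numbers only; none of these figures come from the article.
pricing = TokenPricing(usd_per_million_input=1.0, usd_per_million_output=4.0)
bill = monthly_bill_inr(pricing, 1_000_000, 1_000, 500, inr_per_usd=85.0)
doubled = monthly_bill_inr(pricing, 2_000_000, 1_000, 500, inr_per_usd=85.0)
print(f"{bill:,.0f} INR per month; {doubled:,.0f} INR if usage doubles")
```

With these placeholder values, a million requests a month come to about 255,000 rupees, and the bill doubles when usage doubles. A weaker rupee raises the bill in the same proportion even if usage stays flat. Self-hosted deployments would replace the per-token terms with GPU compute time, which this sketch does not model.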

Like Style Lounge, voice AI startup Bolna believes the rupee-dollar imbalance is showing up directly in its cost base, prompting the company to evaluate its use of open source models.

“Foundation models, GPU capacity, most of our core tooling are all dollar-billed. We use Indian providers where quality and latency hold up, and we keep evaluating open source models we can self-host to bring more of the stack in-house. Over time, we want more of the cost base under our direct control. But the honest answer is that any serious AI stack today still routes meaningfully through dollar-denominated vendors,” said Prateek Sachan, the startup’s cofounder and chief technology officer.

The conversation around AI-driven dollar outflows also comes at a time when Prime Minister Narendra Modi has publicly urged restraint in non-essential imports and external spending amid global economic uncertainty and pressure on India’s import bill. 

While the comments were directed at energy consumption and discretionary imports, some founders and analysts believe AI infrastructure dependence could emerge as a similar long-term concern for India’s digital economy, especially if the country scales AI adoption without building enough domestic capabilities.

The compute layer (chips, foundation models, hyperscale clusters) is where the margin pools concentrate, and India is not competitive there yet, opine experts. The defensible layer is applied AI that includes products, workflows, vertical agents, data assets, and distribution. 

“That’s where Indian SaaS has already proven it can build globally. The question isn’t whether we own the compute stack, it’s whether we build enough high-margin applied AI companies on top of it to net out positive on the trade,” said Sparsh Gupta, CEO and cofounder of Wingify, a Delhi-based SaaS company that is majority owned by private equity giant Everstone Capital.

Globally, the AI inference market is expected to grow from $106 Bn in 2025 to over $520 Bn in 2034, according to market research platform Polaris. Home to the most advanced AI companies, North America holds close to 50% of the market. That said, the Asia Pacific region is growing at about 20% CAGR, with demand driven by digital infrastructure and modernisation.
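For context, the global forecast above implies its own compound annual growth rate, which can be checked with a short calculation. The snippet below is a rough sketch that assumes nine compounding years between 2025 and 2034 and takes $520 Bn as the end value, although the forecast says "over" that figure. The `implied_cagr` helper is a hypothetical name, and the result is derived arithmetic, not a number quoted by Polaris.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate that takes start_value to end_value over `years`."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the forecast cited above, in USD billions.
global_rate = implied_cagr(start_value=106, end_value=520, years=2034 - 2025)
print(f"Implied global CAGR: {global_rate:.1%}")
```

Under these assumptions the global figure works out to roughly 19% a year, close to the 20% quoted for the Asia Pacific region.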

Token Costs Plummet, But Scale Brings Burden

According to analyst firm Gartner, the cost of running inference on a 1-trillion parameter AI model could fall by more than 90% by 2030 compared to last year. 

The firm estimates that LLMs may become up to 100 times more cost-efficient than early-generation models from 2022, driven by advances in hardware, model architecture, edge inference, and chips specifically optimised for AI inference workloads.

But it comes with a caveat. Forecasts of sharply falling inference costs notwithstanding, enterprises may not see proportional savings as AI usage grows more sophisticated and compute-intensive. While lower token costs are expected to make basic AI features cheaper and more widely embedded across software products, emerging applications like agentic AI are likely to drive overall spending higher.

India is the world’s second-largest consumer AI market, as also noted by OpenAI cofounder Sam Altman, who recently said that Indian users have collectively generated more than one billion images using ChatGPT Images 2.0. Against this backdrop, AI inference spending will likely become a meaningful component of India’s technology import bill over time, with inference demand set to scale sharply as enterprise adoption grows.

India’s push to become an AI powerhouse is also creating an opportunity for the country to position itself as a global AI inference hub, with data centres emerging as critical infrastructure in the race. 

“Our studies estimate India could require ~7 GW of AI compute capacity by 2030, nearly 30x current levels, with inference accounting for ~90–95% of total demand,” said Sidhant Rastogi, president of tech consultancy firm Zinnov.

This is driving a new wave of investments into AI-ready data centres across India. Companies such as Yotta Data Services, CtrlS Datacenters, E2E Networks and Reliance Jio are ramping up GPU clusters and AI cloud offerings to capture growing demand from startups and enterprises looking to run inference workloads locally. Most recently, Uber partnered with Adani Group for its first data centre in India to support and scale its AI inference, machine learning, and real-time logistics operations.

Industry executives believe India has structural advantages, such as the oft-cited vast developer ecosystem, domestic AI demand, digital and internet infrastructure, and lower operating costs for LLM makers. Running inference closer to end users can also reduce latency, improve data residency compliance, and lower dependence on overseas compute infrastructure. But the question remains: will enterprises and startups show enough demand for local inference to compel AI infrastructure players to invest in overcoming the current hurdles?

[Edited by Nikhil Subramaniam]

The post The Hidden Dollar Drain Behind India’s AI Rush appeared first on Inc42 Media.

