Ant Group's Strategic Shift: Harnessing Local Chips to Innovate AI Cost-Efficiency

Ant Group is taking bold steps in the AI space by leveraging domestic chips to train its models. This shift aims not only to reduce operational costs but also to mitigate reliance on US technology, a move that’s making waves in the industry.

The company, which is part of the Alibaba family, has turned to domestic semiconductor suppliers, including its parent company Alibaba and Huawei Technologies. Ant Group has been training its large language models utilizing a technique called Mixture of Experts (MoE), and sources suggest that the performance of these models rivals those trained using Nvidia’s powerful H800 chips. While Nvidia chips are still in play for some tasks, Ant seems to be increasingly leaning towards alternatives like AMD and homegrown designs.

This development underscores Ant’s increasing involvement in the competition between Chinese and American tech firms as they seek effective and budget-friendly methods to train AI models. The move reflects a larger strategy among Chinese businesses to navigate around export restrictions that limit access to top-tier GPUs, such as Nvidia’s H800, which, while not cutting-edge, remains among the more robust options available to Chinese companies.

In a fascinating twist, Ant has shared research claiming that some of its models outperform those developed by industry giant Meta. Bloomberg News, which first covered this story, has yet to verify these claims independently. If true, this could signify a substantial advancement in China’s efforts to lessen its dependence on foreign hardware and cut down the costs associated with running AI applications.

The MoE framework breaks down tasks into smaller data subsets, allowing different components to handle various parts efficiently. Think of it like a soccer team, where each player has a specialty – coordinating these specialists leads to a more streamlined performance. Companies like Google and the startup DeepSeek have successfully harnessed this method, so it’s no surprise that Ant Group is following suit.
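The core idea can be sketched in a few lines of code. The snippet below is a deliberately tiny, illustrative MoE layer, not Ant's actual architecture: a gating network scores every expert for a given token, only the top-k experts run, and their outputs are mixed by the gate's softmax weights. The dimensions and weight initialization here are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions, far smaller than any real LLM (illustrative only).
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is stand-in for a feed-forward sub-network: one matrix here.
expert_weights = [rng.standard_normal((d_model, d_model)) * 0.1
                  for _ in range(n_experts)]
# The gating (router) network produces one score per expert per token.
gate_weights = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_layer(x):
    """Route token vector x to its top-k experts and mix their outputs."""
    logits = x @ gate_weights                 # one score per expert
    top = np.argsort(logits)[-top_k:]         # indices of the top-k experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                      # softmax over the chosen experts
    # Only the chosen experts execute -- this sparsity is the compute saving.
    return sum(p * (x @ expert_weights[i]) for p, i in zip(probs, top))

token = rng.standard_normal(d_model)
out = moe_layer(token)
print(out.shape)
```

Because only `top_k` of the `n_experts` sub-networks run per token, total parameter count can grow far faster than per-token compute, which is exactly why the technique appeals to teams constrained on GPU budget.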

Ant’s research is centered on dismantling the financial wall that high-performance GPUs often pose for smaller enterprises. The company’s paper even states its goal outright: training models “without premium GPUs.” You have to admire the ambition!

Interestingly, while Ant is diving into cost-saving approaches with MoE, Nvidia’s strategy is quite the opposite. CEO Jensen Huang insists that the demand for computing power remains high, even with the emergence of more efficient models. For Nvidia, the focus is on creating more potent GPUs, with thousands of cores and extensive memory resources.

According to Ant's analysis, training on a whopping trillion tokens—a fundamental unit in AI learning—costs about 6.35 million yuan (around $880,000) with standard top-tier hardware. However, by employing less expensive chips, Ant successfully lowered that figure to approximately 5.1 million yuan.
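Running the reported numbers through a quick back-of-the-envelope calculation shows what that difference amounts to. The figures below are the costs per trillion tokens as reported in the article; the percentage is derived here for illustration.

```python
# Reported cost per trillion training tokens (yuan).
baseline_yuan = 6_350_000   # standard top-tier hardware
optimized_yuan = 5_100_000  # cheaper chips with the MoE approach

savings = baseline_yuan - optimized_yuan
savings_pct = 100 * savings / baseline_yuan
print(f"Savings: {savings:,} yuan (~{savings_pct:.0f}%)")
```

That works out to roughly a 20% reduction per trillion tokens, a margin that compounds quickly at the scale modern LLM training runs operate at.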

What’s next for Ant? They intend to implement models like Ling-Plus and Ling-Lite in various industrial applications, such as healthcare and finance. This ambition aligns with their recent acquisition of Haodf.com, a prominent Chinese medical platform, accentuating their commitment to integrating AI solutions in the healthcare sector. Additionally, they run several AI services, including the virtual assistant Zhixiaobao and a financial advisory platform named Maxiaocai.

“Finding just one vulnerability to excel over the world champion can still mean victory, illustrating why real-world application is crucial,” remarked Robin Yu, Chief Technology Officer at Beijing-based Shengshang Tech.

To add to the excitement, Ant has made its models open source: Ling-Lite packs 16.8 billion parameters, while Ling-Plus boasts 290 billion. As a point of reference, estimates suggest the closed-source GPT-4.5 contains about 1.8 trillion parameters.

Nevertheless, challenges remain. Ant’s paper highlighted that minor modifications in either the hardware or model structure during training could lead to inconsistent results, with occasional spikes in error rates. Such hurdles are common in the ever-evolving landscape of AI development.

Stay tuned for more updates in the rapidly changing realm of AI, as it is clear that companies like Ant Group are not just adapting but actively reshaping the industry landscape.
