Ant Group is increasingly using Chinese-made semiconductors to train its artificial intelligence (AI) models, a shift aimed at cutting costs and reducing reliance on US technology, access to which has been curtailed by export controls. Sources indicate that the Alibaba-affiliated company has been using chips from domestic suppliers, including some connected to Alibaba itself and to Huawei Technologies, to train large language models with the Mixture of Experts (MoE) method. The results achieved with these local chips reportedly match those produced using Nvidia's H800 chips.
Although Ant still uses Nvidia chips for some projects, it is increasingly turning to alternatives from AMD and Chinese manufacturers for its newest models. This development deepens Ant's role in the AI competition between Chinese and American tech companies. With demand growing for affordable model training, Ant's experimentation with domestic hardware reflects a broader push among Chinese firms to work around export restrictions that limit access to advanced chips such as Nvidia's H800. According to a research paper published by Ant, some of its models even outperformed those created by Meta, although these claims have not been independently verified.
MoE models have gained attention because they split a large network into smaller specialized sub-models, or "experts," with a gating mechanism routing each input to only a few of them, so only a fraction of the model's parameters is active at a time and training becomes more efficient. The technique, used by companies such as Google and the startup DeepSeek, lets distinct experts handle individual components of a task. Ant Group has optimized its training methods to cut costs significantly: training on one trillion tokens traditionally required about 6.35 million yuan, but using lower-cost chips Ant reduced this to approximately 5.1 million yuan, a saving of roughly 20 percent. Ant intends to apply its models, named Ling-Plus and Ling-Lite, to various sectors, including healthcare and finance, following its acquisition of the Chinese online medical platform Haodf.com.
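To make the routing idea concrete, here is a minimal sketch of a sparsely gated MoE layer in NumPy. This is a toy illustration of the general technique, not Ant's implementation; all dimensions, weight names, and the top-2 routing choice are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_layer(x, expert_weights, gate_weights, top_k=2):
    """Toy sparsely gated MoE layer: a gating network scores all
    experts, but only the top_k experts actually process the input."""
    logits = x @ gate_weights                 # one routing score per expert
    top = np.argsort(logits)[-top_k:]         # indices of the chosen experts
    # Softmax over the selected experts' scores only.
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()
    # Combine only the chosen experts' outputs, weighted by the gate.
    return sum(wi * (x @ expert_weights[i]) for wi, i in zip(w, top))

# Hypothetical sizes: 4 experts, 8-dim input, 4-dim output.
d_in, d_out, n_experts = 8, 4, 4
experts = rng.standard_normal((n_experts, d_in, d_out))
gate = rng.standard_normal((d_in, n_experts))
x = rng.standard_normal(d_in)
y = moe_layer(x, experts, gate, top_k=2)
```

With top_k=2 of 4 experts, only half the expert parameters are touched per input; at the scale of models like Ling-Plus, this sparsity is what drives the training-cost savings described above.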
Nonetheless, training AI models remains delicate: even minor changes in hardware or model design can make performance unstable. Despite these challenges, Ant has made its models open source, with Ling-Lite at 16.8 billion parameters and Ling-Plus at 290 billion, a notable technical milestone amid ongoing competition in the AI field.