On 24 December 2024, the Wall Street Journal published an article on Chinese artificial intelligence (AI).
Despite US export controls on the advanced chips that are crucial to AI, Chinese AI developers are quickly catching up with their US counterparts.
Much of the WSJ article is about Moonshot AI, in which China Merchants China Direct Investments (CMCDI, 133 HK) owns a 1.3% stake. Alibaba and Tencent, the two tech giants, are both investors in Moonshot AI.
Moonshot AI, the developer of the Kimi chatbot, says it has developed a math-focused model with capabilities close to OpenAI's reasoning model o1.
Full article:
Published in The Wall Street Journal on 24 December 2024
SINGAPORE—Chinese startups show signs of catching up with America’s leading artificial-intelligence models more quickly than many in the industry had expected, despite the restrictions China faces in buying advanced chips.
DeepSeek, a startup funded by one of China’s most successful hedge-fund managers, released a preview version of its latest large language model in November. It said the program’s abilities compared favorably with OpenAI’s reasoning model called o1, which came out in preview form in September.
Other Chinese companies have made similar claims in recent weeks. Moonshot AI, a startup backed by Chinese internet giants Alibaba and Tencent, said it developed a model specializing in math with capabilities close to o1, while Alibaba said one of its own experimental research models outperformed the preview version of the U.S. model on math.
The companies haven’t published papers describing their models, and evaluating the claims is difficult because there isn’t a single agreed-upon test of an AI model’s abilities. Still, some U.S. specialists said they were impressed.
China is “catching up faster,” said Andrew Carr, a former fellow at OpenAI and currently an AI entrepreneur. He said DeepSeek researchers trying to replicate OpenAI’s reasoning model “figured it out within a few months, and frankly many of my colleagues are surprised by that.”
One test used for comparison is the American Invitational Mathematics Examination, which is designed to challenge the brightest high-school math students.
DeepSeek said its model bested OpenAI’s on the AIME. An experiment by The Wall Street Journal using 15 problems from this year’s AIME found that OpenAI’s o1 preview model got to the answers faster than DeepSeek, Moonshot and the experimental Alibaba model. In one word puzzle involving strategy in a hypothetical two-player game, the OpenAI program gave the answer in 10 seconds while DeepSeek took more than two minutes.
Getting the correct answer on the first try is still a feat because word problems often stump AI programs.
Chinese AI developers have faced U.S. restrictions on access to the world’s most advanced AI chips, including those from chip leader Nvidia, since 2022. The Biden administration in December again tightened export control rules.
But the developers have found workarounds.
At Moonshot, the startup backed by Alibaba and Tencent, founder Yang Zhilin has said the company is focusing on reinforcement learning, which mimics humans’ trial and error. The approach might use computing power less intensively in improving performance.
Since late last year, AI developers have increasingly been using a technique called “mixture of experts,” or MoE, in which an initial routing mechanism directs the problem to a specialized expert model like a head chef directing a spaghetti order to the kitchen’s Italian cook. This process also eases the demands on chips.
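To make the routing idea concrete, here is a toy sketch of a mixture-of-experts layer in plain Python. It is only an illustration of the general technique described above, not the design used by Tencent, DeepSeek or any other company in the article. The class name `ToyMoE`, its parameters and the random weights are all hypothetical. A small router scores each expert for a given input, only the top-scoring expert (or the top few) is actually run, and the results are blended using the router's weights. Because the remaining experts are skipped, less computation is needed per input.

```python
import math
import random


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyMoE:
    """Hypothetical toy layer: a router plus several small linear 'experts'."""

    def __init__(self, num_experts, dim, top_k=1, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one scoring vector per expert (random here, learned in practice).
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each expert is a dim x dim linear map (again random, for illustration).
        self.experts = [[[rng.gauss(0, 1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, x):
        """Return (expert_index, weight) pairs for the top_k experts."""
        probs = softmax(matvec(self.router, x))
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        chosen = ranked[:self.top_k]
        norm = sum(probs[i] for i in chosen)
        return [(i, probs[i] / norm) for i in chosen]

    def forward(self, x):
        """Run only the chosen experts and blend their outputs."""
        out = [0.0] * len(x)
        for idx, weight in self.route(x):
            expert_out = matvec(self.experts[idx], x)
            out = [o + weight * y for o, y in zip(out, expert_out)]
        return out


if __name__ == "__main__":
    layer = ToyMoE(num_experts=4, dim=3, top_k=1)
    token = [1.0, 0.5, -0.2]
    print("chosen expert(s):", layer.route(token))
    print("layer output:", layer.forward(token))
```

With `top_k=1`, only one of the four experts does any work for each input, which is the source of the savings in this sketch. Production systems add many details omitted here, such as training the router and balancing the load across experts.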
Tencent said its MoE model, released in November, delivered performance comparable to a Llama 3.1 model introduced in July by Facebook owner Meta Platforms. Researchers who reviewed papers published by the two companies said Tencent’s model was likely trained with around a 10th of the computing power Meta used.
DeepSeek started as the AI research unit of High-Flyer, a quantitative hedge-fund manager with $8 billion in assets that is known for leveraging AI to trade. In 2021, DeepSeek connected around 10,000 of Nvidia’s A100 chips to form a cluster for AI training, which it called Fire-Flyer 2.
In a paper published this August, DeepSeek said Fire-Flyer 2 achieved performance close to an Nvidia system containing similar chips, but the Chinese system cost less and consumed less energy. DeepSeek’s May paper on its MoE model, which incorporated a technique to process data more efficiently, was widely noted in the industry.
“One way China will get around export controls—building extremely good software and hardware training stacks using the hardware it can access,” Jack Clark, co-founder of AI startup Anthropic, wrote in his blog, referring to DeepSeek’s cluster. “Made in China will be a thing for AI models, same as electric cars, drones, and other technologies,” he wrote.
Many Chinese AI developers have found ways to access restricted Nvidia chips, including through trades with middlemen and overseas data centers.
Nonetheless, the lack of cutting-edge chips is painful to the Chinese startups, according to Chinese executives, and the gap is poised to widen. Nvidia customers are preparing to deploy its latest AI data-center chip, called Blackwell, at significant scale.
Elon Musk’s xAI has constructed a data center with 100,000 Nvidia chips and recently raised $5 billion to do more. Amazon Web Services plans to build a massive AI supercomputer with hundreds of thousands of its homegrown chips.
DeepSeek, which focuses on open-source models, emphasizes math and coding. Moonshot has gained popularity among Chinese consumers with its ChatGPT-like chatbot Kimi and is known for its ability to handle long-form text.
Chinese AI startups are currently valued at a fraction of U.S. companies such as OpenAI—which was recently valued at $157 billion—because financiers are unsure about their ability to monetize their advances. Fierce competition has led to a price war among AI model vendors.
Beijing-based Zhipu AI, which was valued at around $3 billion in its latest fundraising round this month, has pushed back its plan to go public as soon as the second half of 2025 after investment bankers told the company it was unlikely to get the valuation it wanted, people familiar with the matter said. Zhipu showcased its AI agent in late November and released a video-generating model similar to OpenAI’s Sora in July.
Howard Huang, a former AI-infrastructure executive at a Beijing-based AI-model company, compared the Chinese industry to people trying to dance while wearing shackles. “Focusing on what we have been good at is the only opportunity to survive, and probably to win,” he said.

