Earlier this year, the DeepSeek R1 model went viral on social media for its strong capabilities despite being built by a little-known Chinese firm. At the time, reports put the cost of training the model at around $5.58 million (~RM 23.46 million), far less than other AI models cost to train, and said it was trained on NVIDIA chips. As a result, NVIDIA shares fell for a while.
This morning, the actual cost of training DeepSeek R1 was revealed to be only $294,000 (~RM 1.24 million), using 512 NVIDIA H800 chips. The figure was disclosed in a research paper published in the journal Nature. Training R1 cost less because it relies on reinforcement learning, a trial-and-error technique.
The model earns points when it finds the correct answer on its own through trial and error. Other models learn to solve particular problems from data provided by humans, which takes more time and costs more.
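To make the trial-and-error idea concrete, here is a toy Python sketch of a learner that guesses answers, earns a point whenever a guess is correct, and gradually favours the guesses that have scored. It is only an illustration of reward-based learning in general, not DeepSeek's training method: the function name, the parameters (`rounds`, `epsilon`, `seed`) and the small arithmetic task are all assumptions made up for this example.

```python
import random


def train_by_trial_and_error(candidates, is_correct, rounds=2000, epsilon=0.2, seed=0):
    """Toy learner: try answers, score a point when right, prefer high scorers."""
    rng = random.Random(seed)
    points = {c: 0.0 for c in candidates}  # average reward seen for each answer
    tries = {c: 0 for c in candidates}     # how often each answer was tried

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: try a random answer, even one that has not scored yet.
            guess = rng.choice(candidates)
        else:
            # Exploit: repeat the answer with the best average reward so far.
            guess = max(candidates, key=lambda c: points[c])

        # The only feedback is a point for a correct answer; no human-written
        # solution is ever shown to the learner.
        reward = 1.0 if is_correct(guess) else 0.0
        tries[guess] += 1
        points[guess] += (reward - points[guess]) / tries[guess]

    return max(candidates, key=lambda c: points[c])


# Hypothetical task: find x such that x * 7 == 42, using rewards alone.
candidates = list(range(10))
best = train_by_trial_and_error(candidates, lambda x: x * 7 == 42)
print(best)
```

The learner is never told the answer; it only finds out whether each guess earned a point. Real systems apply the same principle to a large language model rather than a lookup table of ten guesses.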
However, the weakness of this model is that it is hard to follow how it reasons its way to an answer. Its reasoning is too long and complex for most people to understand. Most AI models, such as ChatGPT and Gemini, can explain their working, for example in mathematics, more concisely.