Chatbot users have grown used to waiting several seconds for an answer. For patient users this is not an issue, but when time is critical, every second counts. Mistral has launched its latest chatbot, Le Chat, which can generate answers at a rate of about 1,000 words per second (1,100 tokens/s).
This is much faster than Gemini 2.0 Flash, which manages 168 tokens/s, and ChatGPT 4o, which manages 115 tokens/s. The speed comes from running on the WSE-3, the world's largest and fastest processor chip, developed by Cerebras.
The chip is a single 215 x 215 mm piece of silicon (about 46,225 mm²) manufactured on a 5 nm process. Each WSE-3 has 4 trillion transistors, 900,000 cores and 44 GB of on-chip SRAM, which gives it 52 times more processor cores than the NVIDIA H100. Its peak processing speed is 125 petaFLOPs.
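
To put the throughput figures quoted above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the tokens/s numbers from this article; the 500-token response length is a hypothetical value chosen for illustration, and the calculation assumes a constant generation rate, ignoring network latency and the time to the first token.

```python
# Throughput figures quoted in the article, in tokens per second.
THROUGHPUT_TOKENS_PER_S = {
    "Le Chat": 1100,
    "Gemini 2.0 Flash": 168,
    "ChatGPT 4o": 115,
}

# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 500


def generation_time(tokens: int, tokens_per_s: float) -> float:
    """Seconds needed to emit `tokens` at a constant rate."""
    return tokens / tokens_per_s


def speedup(fast_rate: float, slow_rate: float) -> float:
    """How many times faster `fast_rate` is than `slow_rate`."""
    return fast_rate / slow_rate


if __name__ == "__main__":
    le_chat_rate = THROUGHPUT_TOKENS_PER_S["Le Chat"]
    for name, rate in THROUGHPUT_TOKENS_PER_S.items():
        seconds = generation_time(RESPONSE_TOKENS, rate)
        ratio = speedup(le_chat_rate, rate)
        print(f"{name}: {seconds:.2f} s for {RESPONSE_TOKENS} tokens "
              f"(Le Chat is {ratio:.1f}x as fast)")
```

Under these assumptions, a 500-token answer takes under half a second at Le Chat's quoted rate, compared with roughly three seconds for Gemini 2.0 Flash and more than four seconds for ChatGPT 4o.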
 
