AMD and Nvidia were trounced by Intel in critical AI tests, though with the help of a small startup. Does this mean game over for AMD and Nvidia? It's the latest twist in the AI hardware arms race, with the startup in question, Numenta, applying a novel approach to boost CPU performance.
AMD and Nvidia Trounced By Intel
Numenta has just demonstrated that Intel's Xeon CPUs can outperform both the best CPUs and the best GPUs on AI workloads simply by applying a novel approach to them.
Using a set of techniques branded under the Numenta Platform for Intelligent Computing (NuPIC) label, the startup has unlocked new levels of AI inference performance in conventional CPUs, as per Serve the Home.
The truly astonishing part is that it apparently outperforms GPUs and CPUs specifically designed for AI inference. For instance, Numenta took a workload for which Nvidia had reported performance figures on its A100 GPU and ran it on an augmented 48-core 4th-Gen Sapphire Rapids CPU. In all scenarios, the CPU beat Nvidia's chips on total throughput: it was 64 times faster than a 3rd-Gen Intel Xeon processor and 10 times faster than the A100 GPU.
Numenta’s Neuroscience-Inspired Approach to AI Workloads
Numenta, which is well known for its neuroscience-inspired approach to AI workloads, leans heavily on the idea of sparse computing, which mirrors how the brain forms connections between neurons.
Most CPUs and GPUs today are designed for dense computing, especially for AI, which is more brute force than the contextual way the brain works. Although sparsity is a trusted way to boost performance, CPUs on their own can't normally exploit it well. This is where Numenta steps in.
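The intuition behind sparsity can be sketched in a few lines. This is an illustrative example, not Numenta's actual method: when most weights in a layer are zero, a matrix-vector product only needs to touch the few nonzero entries, while the dense version grinds through every multiply-add.

```python
# Illustrative sketch (not Numenta's implementation): a sparse matrix-vector
# product that skips zero weights entirely.
import numpy as np

rng = np.random.default_rng(0)

def sparse_matvec(weights, x):
    """Multiply only by the nonzero weights, skipping the zeros."""
    out = np.zeros(weights.shape[0])
    rows, cols = np.nonzero(weights)   # locations of the few active connections
    for r, c in zip(rows, cols):
        out[r] += weights[r, c] * x[c]
    return out

# A weight matrix where ~90% of entries are zero, mimicking sparse connectivity
dense = rng.standard_normal((64, 64))
mask = rng.random((64, 64)) < 0.1      # keep only ~10% of connections
sparse_weights = dense * mask

x = rng.standard_normal(64)

# The sparse path performs ~10% of the multiply-adds of the dense path,
# yet produces the same result.
assert np.allclose(sparse_matvec(sparse_weights, x), sparse_weights @ x)
```

The catch, as the article notes, is that conventional hardware is built for the dense path: irregular memory access patterns like the loop above usually run slower on a CPU than the dense equivalent unless the software is carefully engineered around them.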
What the Startup Looks To Achieve With Its Approach
The startup aims to unlock the efficiency gains of sparse computing in AI models by applying its "secret sauce" to general-purpose CPUs rather than chips built specifically to handle AI-centric workloads.
Although its approach can work on both CPUs and GPUs, Numenta adopted Intel Xeon CPUs and applied Intel's Advanced Vector Extensions (AVX)-512 and Advanced Matrix Extensions (AMX), simply because Intel's chips were the most readily available at the time.
These are extensions to the x86 architecture: additional instruction sets that enable CPUs to perform more demanding operations.
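On Linux, you can check whether a given Xeon actually exposes these extensions, since the kernel lists supported instruction sets as flags in `/proc/cpuinfo`. The flag names below are the real kernel names (e.g. `avx512f` for AVX-512 Foundation, `amx_tile` for AMX), but whether your machine reports them depends entirely on the CPU; this is a generic sketch, not part of NuPIC.

```python
# Hedged sketch: detect AVX-512/AMX support from /proc/cpuinfo (Linux only).
from pathlib import Path

def supported_extensions(cpuinfo_text):
    """Return which AVX-512/AMX-related flags appear in a /proc/cpuinfo dump."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
    wanted = {"avx512f", "avx512_vnni", "amx_tile", "amx_int8", "amx_bf16"}
    return sorted(wanted & flags)

if __name__ == "__main__":
    info = Path("/proc/cpuinfo")
    if info.exists():
        print("AI-relevant extensions:", supported_extensions(info.read_text()))
```

On a 4th-Gen Sapphire Rapids part like the one in Numenta's test, the AMX flags should appear; on a 3rd-Gen Xeon they will not, which is part of why the generational gap in the benchmark is so large.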
How Numenta Delivers Its NuPIC Service
Numenta delivers its NuPIC service in Docker containers, so it can easily run on a company's own servers. If it works in practice, it would be an ideal way to repurpose CPUs already deployed in data centers for AI workloads, especially in light of the lengthy wait times for Nvidia's industry-leading A100 and H100 GPUs.