Movidius, the company behind early work on Google’s Project Tango and its own high-performance Myriad 2 vision processors, has announced a new VPU (Vision Processing Unit) dubbed Myriad X. The new core is a significant advance on previous designs, and Movidius claims it can process up to 1 trillion deep neural network (DNN) operations per second, 10 times the performance of the Myriad 2 architecture.
Part of the reason Movidius can hit these performance targets is an increase in the number of SHAVE (Streaming Hybrid Architecture Vector Engine) cores within the SoC. The Myriad 2 had 12 of these cores, as shown in the diagram below, while the Myriad X bumps that figure up to 16.
But the Myriad X doesn’t just extend the architecture of the Myriad family; it also implements custom hardware specifically intended for neural network inferencing workloads. The Myriad X also supports 4K hardware encoding, USB 3.1, and PCIe 3.0, all while remaining within the same 2W envelope as the Myriad 2. These advances are partly due to a foundry node shrink; the Myriad 2 was built on TSMC’s 28nm technology, while the Myriad X uses TSMC’s 16nm FFC node. It also has four additional MIPI lanes and supports up to 4Gbit of LPDDR4 memory in-package.
Movidius is marketing its hardware as instrumental to creating various “smart” devices, and claims that its DNN proficiency gives platforms integrating Movidius VPUs something more akin to human vision and human decision-making capabilities. That’s obviously a significant stretch, but we’re still in the early days of the AI “revolution” as variously discussed by companies like AMD, Intel, and Nvidia. If Movidius can deliver the performance it claims within a 2W power envelope, it would seem to threaten companies like Nvidia. GeForce GPUs may offer much higher performance, but they also consume significantly more power. You’ll never find a GTX 1080-equivalent in a smartphone (well, not until the GTX 1080 is utterly outdated, anyway).
It’s worth noting that there are significant differences between the inference workloads Movidius and Google’s TensorFlow project specialize in and the training workloads that Nvidia often talks up as a unique advantage of its GPUs. Training workloads are still the province of high-end GPU hardware. Inference workloads appear to be more amenable to low-power optimization and execution, at least judging by what we’ve seen to date.
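The training/inference split described above can be illustrated with a minimal NumPy sketch (the two-layer network, shapes, and learning rate here are purely illustrative, not anything from Movidius or TensorFlow): training repeatedly runs forward passes, gradient computations, and weight updates in floating point, while deployed inference is a single forward pass whose weights can be quantized to low-precision integers, the kind of arithmetic low-power accelerators favor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer network; in practice the weights come from training.
W1 = rng.standard_normal((8, 4)).astype(np.float32)
W2 = rng.standard_normal((4, 2)).astype(np.float32)

def forward(x, w1, w2):
    """Inference is just this: one forward pass per input."""
    h = np.maximum(x @ w1, 0.0)  # ReLU activation
    return h @ w2

# Training, by contrast, loops over forward pass -> loss -> gradients -> update.
# (Sketch of a single SGD step on the second layer only, squared-error loss.)
x = rng.standard_normal((1, 8)).astype(np.float32)
target = np.array([[1.0, 0.0]], dtype=np.float32)
h = np.maximum(x @ W1, 0.0)
y = h @ W2
grad_W2 = h.T @ (y - target)  # gradient of 0.5*||y - target||^2 w.r.t. W2
W2 -= 0.01 * grad_W2          # weight update

# Once trained, inference can trade precision for power: e.g. quantize
# the weights to 8-bit integers with a simple symmetric scale.
scale = np.abs(W2).max() / 127.0
W2_q = np.round(W2 / scale).astype(np.int8)
```

The point of the sketch is the asymmetry: the training step needs high-precision gradients and many iterations, while the inference path is a handful of multiply-accumulates that tolerate aggressive quantization, which is why it maps well onto a 2W part.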
Right now, however, we’re still long on future promises and short on real-world, practical applications for these technologies. Additional details and specifications on the Movidius Myriad X can be found here.