ARM SoCs Take Soft Roads to Neural Nets

Release time: 2017-06-30

Author: Ameya360

Source: EE Times

　　NXP is supporting inference jobs such as image recognition in software on its i.MX8 processor. It aims to extend its approach to natural-language processing later this year, claiming that dedicated hardware is not required in resource-constrained systems.

　　The chip vendor is following in the footsteps of its merger partner, Qualcomm. However, the mobile giant expects to eventually augment its code with dedicated hardware. Their shared IP partner, ARM, is developing neural networking libraries for its cores, although it declined an interview for this article.

　　NXP’s i.MX8 packs two GPU cores from Vivante, now part of Verisilicon. They use about 20 opcodes, originally geared for computer vision, that support multiply-accumulates and bit extraction and replacement.
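For readers unfamiliar with the operations named above, the following is a minimal Python sketch of what a multiply-accumulate and a bit-field extract/replace compute. It illustrates the arithmetic only; it is not the Vivante instruction set, and the function names (`mac`, `dot`, `bit_extract`, `bit_replace`) are hypothetical, chosen for this example.

```python
def mac(acc, a, b):
    """Multiply-accumulate: add the product a * b to a running sum."""
    return acc + a * b


def dot(xs, ws):
    """Dot product built from repeated multiply-accumulates.

    This is the core operation of both image filters and neural-net layers.
    """
    acc = 0
    for x, w in zip(xs, ws):
        acc = mac(acc, x, w)
    return acc


def bit_extract(value, offset, width):
    """Return the `width`-bit field of `value` starting at bit `offset`."""
    mask = (1 << width) - 1
    return (value >> offset) & mask


def bit_replace(value, field, offset, width):
    """Return `value` with the `width`-bit field at `offset` set to `field`."""
    mask = ((1 << width) - 1) << offset
    return (value & ~mask) | ((field << offset) & mask)


if __name__ == "__main__":
    print(dot([1, 2, 3], [4, 5, 6]))
    print(hex(bit_extract(0xABCD, 4, 8)))
    print(hex(bit_replace(0xABCD, 0x12, 4, 8)))
```

The same dot-product loop underlies a convolution in a vision pipeline and a weighted sum in a neural-net layer, which is consistent with Tewell's remark that "the math isn't much different."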

　　“Adding more and more hardware is not the way forward on the power budget of a 5-W SoC,” said Geoff Lees, NXP’s executive vice president for i.MX. “I would like to double the Flops, but we got the image processing acceleration we wanted for facial and gesture recognition and better voice accuracy.”

　　The software is now in use with NXP’s lead customers for image-recognition jobs. Meanwhile, Verisilicon and NXP are working on additional extensions to the GPU shader pipeline targeting natural-language processing. They hope to have the code available by the end of the year.

　　“Our VX extensions were not originally viewed as a neural network accelerator, but we found [that] they work extraordinarily well … the math isn’t much different,” said Thomas “Rick” Tewell, vice president of system solutions at Verisilicon.

　　The GPU cores come with OpenCL drivers. “No one has to touch the instruction extensions … people don’t want to get locked into an architecture or tool set; they want to train a set of engineers who are interchangeable,” Tewell said.

　　ARM is taking a similar approach with its ARM Compute Library, released in March to run neural net tasks on its Cortex-A and Mali cores.

　　“It doesn’t have a lot of features yet and only supports single-precision math — we’d prefer 8-bit — but I know ARM is working on it,” said a Baidu researcher working on its neural net benchmark. “It also lacks support for recurrent neural nets, but most libraries still lack this.”

　　Qualcomm, for its part, released its Snapdragon 820 Neural Processing Engine SDK earlier this year. It supports jobs run on the SoC’s CPU, GPU, and DSP and includes Hexagon DSP vector extensions to run 8-bit math for neural nets.
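To make the contrast between single-precision and 8-bit math concrete, here is a minimal Python sketch of symmetric linear quantization, one common way to run a neural-net dot product in 8-bit integers. It is an illustration under stated assumptions, not the scheme used by Qualcomm's SDK or ARM's library; the function names `quantize` and `quantized_dot` are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using one shared scale factor.

    Assumption: symmetric quantization, so the largest magnitude in
    `values` maps to +/-127 when num_bits is 8.
    """
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def quantized_dot(xs, ws):
    """Approximate a float dot product with integer multiply-accumulates."""
    qx, sx = quantize(xs)
    qw, sw = quantize(ws)
    acc = sum(a * b for a, b in zip(qx, qw))  # integer-only inner loop
    return acc * sx * sw  # a single float rescale at the end


if __name__ == "__main__":
    xs = [0.5, -1.0, 0.25]
    ws = [1.0, 0.5, -2.0]
    exact = sum(x * w for x, w in zip(xs, ws))
    print("float result:", exact)
    print("8-bit result:", quantized_dot(xs, ws))
```

The inner loop touches only small integers, which is the part that narrow vector units can accelerate; the cost is a small rounding error relative to the floating-point result.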

　　“Long-term, there could be a need for dedicated hardware,” said Gary Brotman, director of product management for commercial machine-learning products at Qualcomm. “We have work in the lab today but have not discussed a time-to-market.”

　　The code supports a variety of neural nets, including LSTMs often used for audio processing. Both NXP and Qualcomm execs said that it’s still early days for availability of good data sets to train models for natural-language processing. “Audio is the next frontier,” said Brotman.

Trade news

Arm Targets Laptop Performance

Arm announced a new mobile CPU core that it said can deliver performance within 10% of Intel’s latest Skylake chips. Analysts praised the architecture’s leap forward but said that they doubt Arm will take a significant share of today’s x86-based notebooks.

The Cortex-A76 arrives in tandem with new Mali G76 GPU and V76 video cores. All three are expected to appear in premium smartphone SoCs before the end of the year.

The A76 marks a full redesign for mobile systems, packing up to 2-Mbytes L2 cache, 4-Mbytes L3, and running at more than 3 GHz in a 7-nm node. It aims to deliver 90% of the Specint2006 performance of an Intel mobile Skylake chip with one-fourth the area and half the power, or roughly the same performance in thermally constrained systems.

“We’re looking to close the gap with Intel … this marks the first step in a new family, and it’s the biggest leap we’ve taken in our roadmap,” said Mike Filippo, an Arm fellow and lead architect for the A76.

Compared to an A72 core at 10 nm, a 7-nm A76 should deliver 35% more performance or use 40% less power. That’s a step up from 15% to 25% increases that Arm typically delivers with annual core upgrades. In its day, the A72 delivered about 75% of the performance of Intel’s mobile Broadwell processors.

The comparisons are based on CPUs running at similar frequencies. Arm acknowledged that Intel’s chips typically support higher frequencies than Arm’s cores. Although TSMC announced a 4-GHz A72 test chip, few SoC makers are expected to push their designs to such extreme speeds.

Arm is preparing a separate core for wired servers and networking gear. The A76 aims to expand Arm’s dominance in smartphones into laptops with 4+4 A76/A55 configurations sporting large caches.

“We think you’ll see meaningful volumes in laptops,” said Filippo, but some analysts disagree.

Arm-based notebooks lack differentiation, said Bob O’Donnell of Technalysis Research. They offer slightly less performance and about the same price as x86 systems. Although the Arm portables sport longer battery life and often build in cellular modems, O’Donnell doubts that those factors will sway many buyers.

That said, Asus, Hewlett Packard, and Lenovo announced Arm-based notebooks running Windows 10 on Qualcomm’s Snapdragon SoC. To date, Qualcomm has been the leading proponent of such designs.

[Figure: With its Cortex-A76, Arm removed performance bottlenecks and optimized features across its mobile core architecture. Images: Arm.]

With its focus on small, low-power cores, Arm will get more benefit from next-generation process technologies than rival Intel, traditionally focused on driving up data rates. Arm claims that the latest 7-nm nodes will only deliver 2% to 3% more speed than the 16-nm node.

“There hasn’t been much frequency benefit at all since 16 nm … wire speed hasn’t scaled for some time,” said Peter Greenhalgh, an Arm fellow and vice president of technology.

In graphics, the new Mali G76 is the latest high-end implementation of Arm’s Bifrost GPU architecture. It delivers at 7 nm an estimated 50% overall improvement compared to the existing G72 made in a 10-nm process.

The G76 can be configured with up to 20 shader cores and an L2 cache configurable from 512 Kbytes to 4 Mbytes. Each shader sports three execution engines.

Arm enhanced both the A76 CPU and G76 GPU for machine-learning tasks even though it is about to roll out its first AI-specific cores. The shotgun approach stems in part from Arm’s belief that it’s still early days for what’s likely to be a wide diversity of AI applications needing a variety of implementations.

Deep-learning tasks will run four times faster on the A76 and 2.7 times faster on the G76 compared to existing Arm cores. “We are enabling machine learning on everything … as the size of workloads grows, people will move some jobs to GPUs and CPUs for inline work,” said Alex Chalfin, a senior principal graphics architect for Arm.

In video, the Mali-V76 improves 4K performance and, running at 800 MHz, can decode a single 8K video stream at 60 frames/second. A next-generation design will support 8K60 encode.

The 8K support is initially geared for VR headsets displaying 4K video to each eye. 8K content is not expected to be generally available until 2020, when Japan streams the Summer Olympics in the format.

[Figure: Overall, Arm expects that the A76 will deliver a 35% performance boost over the existing A72 core.]

Overall, “each new core offers significant upgrades for premium smartphones … and Arm’s Dynamiq architecture makes it easier to drop one or two Cortex-A76s into a cluster with the little A55 cores to boost performance in mid-range phones as well,” said Mike Demler, analyst for the Linley Group.

“As for the VPU, Arm doesn’t have a display processor core yet to deliver 8K output, but I think there won’t be much of a market for that for a few more years,” he added.

Test chips have been taped out for all of the new cores using RTL that Arm shipped about a year ago. Production silicon from SoC customers is eventually expected to span 12-, 7-, and 5-nm nodes.
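As a rough worked example, Arm's stated target for the A76 (90% of the Specint2006 performance of a mobile Skylake chip at half the power and one-fourth the area) can be turned into efficiency ratios. The Python sketch below only divides the figures quoted in the article; it assumes the claims hold exactly as stated, and the function name `relative_efficiency` is hypothetical.

```python
def relative_efficiency(perf_ratio, power_ratio, area_ratio):
    """Convert relative performance, power, and area into efficiency ratios.

    All inputs are expressed relative to the baseline chip (baseline = 1.0).
    """
    return {
        "perf_per_watt": perf_ratio / power_ratio,
        "perf_per_area": perf_ratio / area_ratio,
    }


if __name__ == "__main__":
    # Figures as quoted: 90% of the performance, half the power,
    # one-fourth the area. These are vendor targets, not measurements.
    claim = relative_efficiency(perf_ratio=0.90, power_ratio=0.50, area_ratio=0.25)
    for name, value in claim.items():
        print(f"{name}: {value:.1f}x the baseline")
```

Taken at face value, the quoted figures work out to about 1.8 times the performance per watt and 3.6 times the performance per unit area of the Intel part.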

2018-06-01 00:00
