Baidu Accelerator Rises in AI

Published: 2018-07-09 00:00
Author: Ameya360
Source: Rick Merritt

China’s Baidu followed in Google’s footsteps this week, announcing it has developed its own deep learning accelerator, called Kunlun. The move adds yet another significant player to a long list in AI hardware, but details of the chip and when it will be used remain unclear.

Baidu will deploy Kunlun in its data centers to accelerate machine learning jobs for both its own applications and those of its cloud-computing customers. The services will compete with companies such as Wave Computing and SambaNova, which aim to sell business users appliances that run machine-learning tasks.

Kunlun delivers 260 tera-operations/second while consuming 100 watts, making it 30 times as powerful as Baidu’s prior FPGA-based accelerators. The chip is made in a 14nm Samsung process and consists of thousands of cores with an aggregate 512 GBytes/second of memory bandwidth.
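The quoted figures imply a rough efficiency number. The short Python sketch below works it out from the 260 tera-operations/second and 100 watts cited above. It assumes that "30 times as powerful" refers to raw throughput, which Baidu did not specify, so the implied FPGA figure is only an illustration.

```python
# Figures quoted in the article for Kunlun.
PEAK_TOPS = 260          # tera-operations per second
POWER_WATTS = 100        # watts
SPEEDUP_OVER_FPGA = 30   # claimed multiple over Baidu's FPGA-based accelerators


def tops_per_watt(tops, watts):
    """Return throughput per watt in tera-operations/second per watt."""
    return tops / watts


kunlun_efficiency = tops_per_watt(PEAK_TOPS, POWER_WATTS)

# Assumption: the 30x claim is about throughput, not efficiency or latency.
implied_fpga_tops = PEAK_TOPS / SPEEDUP_OVER_FPGA

print(f"Kunlun efficiency: {kunlun_efficiency:.1f} TOPS/W")
print(f"Implied FPGA throughput: {implied_fpga_tops:.1f} TOPS")
```

Because Baidu did not say what kind of operation it is counting, the per-watt figure is only comparable with chips rated on the same operation type and precision.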

Baidu did not disclose its architecture, but like Google’s Tensor Processing Unit, it probably consists of an array of multiply-accumulate units. The memory bandwidth likely comes from use of a 2.5D stack of logic and the equivalent of two HBM2 DRAM chips.
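The two-HBM2 estimate can be checked with simple arithmetic. The sketch below assumes the standard HBM2 figures of a 1,024-bit interface per stack running at up to 2 Gbits/second per pin. Those numbers come from the HBM2 specification, not from Baidu, and the actual Kunlun memory configuration is undisclosed.

```python
# Assumed HBM2 per-stack parameters (from the HBM2 standard, not from Baidu).
HBM2_INTERFACE_BITS = 1024   # data pins per stack
HBM2_PIN_RATE_GBPS = 2.0     # gigabits per second per pin (top HBM2 speed)

# Figure quoted in the article.
KUNLUN_BANDWIDTH_GBYTES = 512  # GBytes/second aggregate


def stack_bandwidth_gbytes(interface_bits, pin_rate_gbps):
    """Peak bandwidth of one memory stack in GBytes/second."""
    return interface_bits * pin_rate_gbps / 8  # 8 bits per byte


per_stack = stack_bandwidth_gbytes(HBM2_INTERFACE_BITS, HBM2_PIN_RATE_GBPS)
stacks_needed = KUNLUN_BANDWIDTH_GBYTES / per_stack

print(f"Per-stack bandwidth: {per_stack:.0f} GBytes/s")
print(f"Stacks needed for {KUNLUN_BANDWIDTH_GBYTES} GBytes/s: {stacks_needed:.0f}")
```

Under these assumptions, each stack supplies 256 GBytes/second, so two stacks match the quoted 512 GBytes/second.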

Kunlun will come in a version for training (the 818-300 chip) and one for less computationally intensive inference jobs (the 818-100). It is intended for use both in data centers and in edge devices such as self-driving cars. Baidu did not comment on when it will offer access to the chip as a service on its Web site or on its plans for merchant sales, if any, to third parties.

Baidu said the chip will support Baidu’s PaddlePaddle AI framework as well as “common open source deep learning algorithms.” It did not mention any support for the wide variety of other software frameworks. It is geared for the usual set of deep learning jobs including voice recognition, search ranking, natural language processing, autonomous driving and large-scale recommendations.

One of the few Western analysts at the Baidu Create event where Kunlun was announced on July 4 described the chip as “definitely interesting, but still [raising] lots of remaining questions.”

“My sense is that they will first leverage it in their data centers and offer it via an AI service that developers can tap into…in particular, it could get optimized for Baidu’s Apollo autonomous car platform,” said Bob O'Donnell, chief analyst at Technalysis Research LLC.

Based on raw specs, Kunlun is significantly more powerful than the second generation of Google’s TPU, which delivers 45 TFlops at 600 GB/s memory bandwidth. However, “you always have to be careful making comparisons, since Baidu apparently didn’t describe what its operations are,” said Mike Demler of The Linley Group.
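To make the raw-spec comparison concrete, the sketch below computes the throughput ratio and the operations available per byte of memory bandwidth for both chips, using only the figures quoted above. As Demler cautions, Baidu's "operations" and Google's floating-point operations may not be the same unit, so the ratio is indicative at best.

```python
# Raw figures quoted in the article. The units may not be directly comparable:
# Baidu cites unspecified "operations", Google cites floating-point operations.
chips = {
    "Kunlun": {"tera_ops": 260, "mem_gbytes_per_s": 512},
    "TPU v2": {"tera_ops": 45, "mem_gbytes_per_s": 600},
}


def ops_per_byte(tera_ops, mem_gbytes_per_s):
    """Peak operations per byte of memory traffic (tera = 1000 x giga)."""
    return tera_ops * 1000 / mem_gbytes_per_s


throughput_ratio = chips["Kunlun"]["tera_ops"] / chips["TPU v2"]["tera_ops"]
intensity = {name: ops_per_byte(**spec) for name, spec in chips.items()}

print(f"Raw throughput ratio (Kunlun / TPU v2): {throughput_ratio:.1f}x")
for name, value in intensity.items():
    print(f"{name}: {value:.0f} peak operations per byte of bandwidth")
```

A higher operations-per-byte figure means a workload must reuse each byte fetched from memory more often to keep the compute units busy, so peak throughput alone does not determine delivered performance.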

Baidu released a picture of a mock-up of its chip but no datasheet or availability date. (Image: Baidu)

Given it’s still early days for deep learning, Web giants such as Google and Baidu may use a mix of their own ASICs along with GPUs and FPGAs for some time, said Kevin Krewell of Tirias Research.

“In areas where algorithms are changing, it may still be important to use more programmable and flexible solutions like CPUs, GPUs, and FPGAs. But in other areas where the algorithms become more fixed, then ASICs can provide a more power-efficient solution,” said Krewell.

Kunlun is not Baidu’s first hardware initiative. Last year, it launched Duer, its own smart-speaker service, with OEM and silicon partners.

At the Beijing event this week, Baidu also announced an upgrade of its machine-learning service called Baidu Brain 3.0, supporting 110 APIs or SDKs including ones for face, video and natural language recognition. Users who pair the service with Baidu’s EasyDL tool to create computer vision models include one unnamed U.S. company, which is deploying it at checkout stands in more than 160 grocery stores to check for unpaid products on the bottom shelf of a shopping cart.
