    High Reliable and Performance Deep Learning Accelerator for ADAS and Autonomous Driving Systems

    Next-generation ADAS and autonomous driving (AD) systems will require accurate, high-speed recognition, judgment, and operation when deployed to market. Convolutional neural networks (CNNs) require large amounts of computation for pattern recognition, and as the number of installed sensors increases, higher CNN performance is required. However, because power consumption increases in proportion to performance, a heavy and expensive water-cooling system would be needed. The goal is therefore to achieve both high deep learning performance and power consumption low enough to allow a lightweight, cost-effective air-cooling system. From a practical point of view, the optimal target is a CNN performance of 60 TOPS at an efficiency of 10 TOPS/W per LSI device.

    CNN accelerator with high performance and power efficiency

    The performance/efficiency target for the CNN accelerator (CNNA) is 60 TOPS at 10 TOPS/W. In the implementation, this is realized with three identical accelerators rather than a single one. Each CNNA contains 13,824 MAC units and operates at 800 MHz, so the theoretical maximum performance of the three CNNAs is 66 TOPS. In addition, each CNNA is connected to a dedicated 2 MB scratchpad memory (SPM) through a 512-bit interconnect module. This increases the execution efficiency of the CNNA, reduces the amount of data transferred between the CNNA and external memory (DRAM) by about 90%, and saves the power consumed by the DRAM interface and interconnect. In measurements on a test chip, VGG16 achieved 32 TOPS at 6.1 TOPS/W, and a CNNA-optimized network (Network-A) achieved 60.6 TOPS at 13.8 TOPS/W.
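
    The 66 TOPS figure can be checked with a short calculation. The sketch below assumes that one MAC counts as two operations per cycle (one multiply and one add), a common convention that the article does not state explicitly. The function names are illustrative, and the power and utilization values are derived here from the reported numbers rather than quoted from the source.

```python
# Figures taken from the article.
MACS_PER_CNNA = 13_824
CLOCK_HZ = 800e6
NUM_CNNA = 3

# Assumption (not stated in the article): 1 MAC = 2 operations per cycle.
OPS_PER_MAC = 2


def peak_tops(macs: int, clock_hz: float, ops_per_mac: int = OPS_PER_MAC) -> float:
    """Theoretical peak throughput in TOPS for one accelerator."""
    return macs * ops_per_mac * clock_hz / 1e12


def implied_power_w(tops: float, tops_per_w: float) -> float:
    """Power implied by a throughput figure and an efficiency figure."""
    return tops / tops_per_w


per_cnna_tops = peak_tops(MACS_PER_CNNA, CLOCK_HZ)
total_tops = NUM_CNNA * per_cnna_tops

# Measured results reported for the test chip: (TOPS, TOPS/W).
measured = {"VGG16": (32.0, 6.1), "Network-A": (60.6, 13.8)}

if __name__ == "__main__":
    print(f"peak per CNNA: {per_cnna_tops:.2f} TOPS")
    print(f"peak for {NUM_CNNA} CNNAs: {total_tops:.2f} TOPS")
    for name, (tops, eff) in measured.items():
        print(
            f"{name}: {tops / total_tops:.0%} of peak, "
            f"about {implied_power_w(tops, eff):.1f} W implied"
        )
```

    Under this assumption each CNNA peaks at roughly 22.1 TOPS, which gives the quoted total of about 66 TOPS for three accelerators.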

    Safety mechanism for ASIL D tasks

    Next-generation ADAS and autonomous driving systems are required to achieve ASIL D functional safety, the strictest safety level of ISO 26262. Dual-core lockstep (DCLS) is one method that can satisfy the ASIL D metrics: a fault is detected by performing the same process on two redundant hardware units and comparing their outputs.

    The CNNA also requires hardware redundancy to meet the ASIL D metrics, but simply applying DCLS would require the large MAC compute unit to be duplicated. This is not practical because area and power consumption would increase significantly. To achieve the ASIL D metrics without adding redundant hardware, two CNNAs (CNNA1 and CNNA2) are dynamically configured by software to operate in lockstep during processing that requires safety.

    The CNNA is used both for image recognition on camera input (ASIL B) and for modeling the surrounding environment from the results of each sensor (ASIL D). Most of the execution time, however, is spent on the former, ASIL B image recognition. Therefore, by switching CNNA1 and CNNA2 to lockstep operation only during surrounding-environment modeling, ASIL D tasks can be supported without significantly compromising performance or power efficiency.
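
    The trade-off can be illustrated with a simple time-sharing model. This is only a sketch under stated assumptions: the two accelerators deliver twice the per-CNNA throughput when running independently and the throughput of one CNNA when running in lockstep, and the lockstep time fraction is a hypothetical parameter that the article does not quantify.

```python
def pair_useful_throughput(peak_per_cnna: float, lockstep_fraction: float) -> float:
    """Average useful throughput of CNNA1 + CNNA2 under a time-sharing model.

    lockstep_fraction is the (hypothetical) share of time spent in lockstep.
    In lockstep both units compute the same result, so only one unit's worth
    of work is useful; otherwise both units contribute independently.
    """
    if not 0.0 <= lockstep_fraction <= 1.0:
        raise ValueError("lockstep_fraction must be between 0 and 1")
    independent = (1.0 - lockstep_fraction) * 2 * peak_per_cnna
    lockstep = lockstep_fraction * peak_per_cnna
    return independent + lockstep


if __name__ == "__main__":
    # Illustrative values only; 22.0 approximates one CNNA's peak in TOPS.
    for fraction in (0.0, 0.1, 0.5, 1.0):
        print(fraction, pair_useful_throughput(22.0, fraction))
```

    In this model the loss grows linearly with the lockstep fraction, which is why restricting lockstep to the shorter ASIL D phase keeps the overall cost small compared with permanent DCLS.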

    The following is the lockstep operation of CNNA using lockstep DMAC (LDMAC).

    1. LDMAC loads the same data from DRAM into SPM1 and SPM2.
    2. CNNA1 and CNNA2 perform the same network processing.
    3. LDMAC reads the execution results from SPM1 and SPM2 and compares them. If they do not match, a fault is flagged. Only the result of CNNA1 is stored in DRAM.
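
    The three steps above can be modeled in software as follows. This is a behavioral sketch, not the hardware design: `run_lockstep`, `LockstepFault`, `toy_network`, and `flip_first` are hypothetical names, the scratchpads are plain Python lists, and the optional fault injector exists only to demonstrate the comparison step.

```python
from typing import Callable, List, Optional

Network = Callable[[List[int]], List[int]]


class LockstepFault(Exception):
    """Raised when the results of CNNA1 and CNNA2 do not match."""


def run_lockstep(
    dram_input: List[int],
    network: Network,
    inject_fault: Optional[Callable[[List[int]], List[int]]] = None,
) -> List[int]:
    # Step 1: LDMAC loads the same data from DRAM into SPM1 and SPM2.
    spm1 = list(dram_input)
    spm2 = list(dram_input)

    # Step 2: CNNA1 and CNNA2 perform the same network processing.
    result1 = network(spm1)
    result2 = network(spm2)
    if inject_fault is not None:
        # Test hook only: corrupt CNNA2's result to emulate a hardware fault.
        result2 = inject_fault(result2)

    # Step 3: LDMAC compares the two results; a mismatch is a fault.
    if result1 != result2:
        raise LockstepFault("CNNA1 and CNNA2 results differ")

    # Only the result of CNNA1 is written back to DRAM.
    return result1


def toy_network(data: List[int]) -> List[int]:
    """Stand-in for a real network: doubles every element."""
    return [2 * x for x in data]


def flip_first(result: List[int]) -> List[int]:
    """Emulated fault: flip the lowest bit of the first output value."""
    return [result[0] ^ 1] + result[1:]
```

    Calling `run_lockstep([1, 2, 3], toy_network)` returns the CNNA1 result, while passing `inject_fault=flip_first` raises `LockstepFault`.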

    Another important factor in achieving ASIL D is freedom from interference (FFI). An autonomous driving system runs a mix of tasks with different ASILs, and lower-ASIL tasks must not interfere with higher-ASIL tasks. As mentioned earlier, the CNNA is accessed by tasks at different ASILs, so the memory space used by each task must be kept separate.

    The mechanism for memory space isolation is implemented in the CNNA, the LDMAC, and the memory protection tables of the memory management unit (MMU). The context index of the currently running task is attached to each transaction issued by the CNNA and LDMAC. The MMU receives it and switches the context on a transaction-by-transaction basis.
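
    A minimal software model of this per-transaction check is sketched below. Everything in it is an assumption made for illustration: the class names, the table layout, the context index values, and the address ranges are hypothetical, and the real MMU tables are not described in the article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Region:
    base: int
    size: int

    def contains(self, addr: int, length: int) -> bool:
        """True if [addr, addr + length) lies entirely inside this region."""
        return self.base <= addr and addr + length <= self.base + self.size


@dataclass(frozen=True)
class Transaction:
    context_index: int  # tag attached by the CNNA or LDMAC for the running task
    addr: int
    length: int


class MemoryProtectionTable:
    """Hypothetical model: one list of permitted regions per context index."""

    def __init__(self, regions_by_context: Dict[int, List[Region]]):
        self._regions = regions_by_context

    def allows(self, txn: Transaction) -> bool:
        # The context is selected per transaction from its own tag, so
        # consecutive transactions from different tasks are checked
        # against different region lists. Unknown contexts get no access.
        regions = self._regions.get(txn.context_index, [])
        return any(r.contains(txn.addr, txn.length) for r in regions)


# Illustrative context indices and address ranges (not from the article).
CTX_ASIL_B = 0
CTX_ASIL_D = 1
table = MemoryProtectionTable({
    CTX_ASIL_B: [Region(base=0x1000_0000, size=0x0010_0000)],
    CTX_ASIL_D: [Region(base=0x2000_0000, size=0x0010_0000)],
})
```

    With this table, a transaction tagged `CTX_ASIL_B` that targets the ASIL D region is rejected, which is the isolation property FFI requires.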

    Renesas presented these achievements at the International Solid-State Circuits Conference 2021 (ISSCC 2021), which took place February 13 to 22, 2021. We will continue to develop and deploy in-vehicle LSIs based on this technology. We expect these to contribute to the realization of a safe and secure autonomous car society through the spread of ADAS and AD systems.

    Katsushige Matsubara
