What is Direct Memory Access (DMA)?

 Table of Contents

  1. How Does the CPU Work?
  2. What is DMA?
  3. How Does DMA Work?
  4. DMA in Depth
  5. Modes of DMA
  6. Benefits of DMA
  7. Conclusion

1. How Does the CPU Work?

The CPU is an essential part of an embedded system: it manages all arithmetic, logic, control, and I/O operations. Its main components are the Control Unit (CU), the Arithmetic & Logic Unit (ALU), and the registers.

CPU Architecture

 

The CU coordinates communication between the ALU and memory so that instructions can be executed and results stored. It includes the fetch unit, which fetches the instructions. The ALU, on the other hand, performs all the arithmetic and logical operations. When data must move between memory and the peripherals, the transfer is delayed if the CPU is busy with some other task, and system performance degrades as a result.

Bus Matrix 1

As shown above, the CPU and multiple peripherals connect to a bus matrix. In hardware, the CPU uses the bus interconnect (which comprises the data, control, and address buses) to load data from one of the peripherals and store it in memory, or vice versa. As a use case, suppose the UART is one of the peripherals and needs to transfer data from its data register to memory. The CPU must then be programmed to carry out this transfer from the UART data register to memory.
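
The CPU-driven transfer described above can be sketched in C. This is a minimal simulation rather than real driver code: the register layout `uart_regs_t`, the `UART_STATUS_RX_READY` flag, and the `uart_sim_t` helper are hypothetical stand-ins, since actual UART registers differ from device to device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UART register block; real layouts differ per device. */
typedef struct {
    volatile uint32_t status; /* bit 0: receive data register holds a byte */
    volatile uint32_t data;   /* received byte in the low 8 bits */
} uart_regs_t;

#define UART_STATUS_RX_READY (1u << 0)

/* Simulated hardware: a UART that receives the bytes in `src` one by one. */
typedef struct {
    uart_regs_t regs;
    const uint8_t *src;
    size_t remaining;
} uart_sim_t;

/* Stands in for the hardware receiving one byte from the line. */
static void uart_sim_step(uart_sim_t *u)
{
    if (!(u->regs.status & UART_STATUS_RX_READY) && u->remaining > 0) {
        u->regs.data = *u->src++;
        u->remaining--;
        u->regs.status |= UART_STATUS_RX_READY;
    }
}

/* CPU-driven copy: the CPU polls the flag, loads from the peripheral,
 * and stores to memory for every single byte. Returns the bytes copied. */
static size_t cpu_copy_from_uart(uart_sim_t *u, uint8_t *dst, size_t len)
{
    assert(u != NULL && dst != NULL);
    size_t copied = 0;
    while (copied < len) {
        uart_sim_step(u);
        if (!(u->regs.status & UART_STATUS_RX_READY)) {
            break; /* no more data in this simulation */
        }
        dst[copied++] = (uint8_t)u->regs.data;    /* load, then store */
        u->regs.status &= ~UART_STATUS_RX_READY;  /* model read-to-clear */
    }
    return copied;
}
```

Every byte costs the CPU a poll, a load, and a store, which is exactly the work a DMA controller takes over.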

Bus Matrix 2

2. What is DMA?

DMA, or Direct Memory Access, is a vital mechanism in embedded systems, designed to address the inefficiencies of data transfer between peripheral devices and memory. Traditionally, the CPU has been responsible for managing these transfers, creating a bottleneck that hampers overall system performance.

Embedded systems often contend with limited resources, and relying on the CPU for every data transfer is inefficient. DMA addresses this by giving peripheral devices, such as sensors or communication modules, direct access to the system's memory. This relieves the CPU of continuous involvement in data transfers, which frees resources and reduces processing overhead.

3. How Does DMA Work?

Peripherals can only generate asynchronous events (interrupts) to tell the CPU that data is ready; they cannot perform bus transactions themselves. To offload the CPU, another bus master is therefore needed, one that can perform these transactions without CPU intervention. The hardware IP that controls data transfer between memory and the peripherals without CPU involvement is known as the Direct Memory Access (DMA) controller. The DMA controller can access memory and the peripherals directly to perform the bus transactions. The CPU's load/store execution logic and other internal circuitry are active only while the CPU is executing, so power consumption is lower with DMA because the CPU can stay in sleep mode during the data transfer.

DMA

4. DMA in Depth

A System on Chip (SoC) is an integrated circuit that contains all the essential elements of an embedded system. Within an SoC, components are commonly interconnected using the Advanced Microcontroller Bus Architecture (AMBA), an open standard developed by ARM. AMBA uses a master-slave topology in which only a master can initiate communication. It connects and manages all the components, such as the CPU, memory, and peripherals like the UART. For on-chip communication, AMBA defines three buses: the Advanced High-performance Bus (AHB), the Advanced System Bus (ASB), and the Advanced Peripheral Bus (APB). The AHB serves high-frequency, high-performance components. The ASB is an alternative to the AHB with a more limited feature set. The APB serves low-bandwidth peripherals such as GPIOs, timers, and UARTs.

A master has the privilege of controlling the bus. An SoC can contain multiple masters, but only one master can access the bus at any given time. An arbiter resolves competing access requests in hardware. When a master wants to control the AHB, it sends a request signal to the arbiter, which grants access if the bus is available at that time. If multiple masters request access simultaneously, access goes to the master with the highest priority. Consequently, a request from another master can stall the CPU's access to the system bus for a few bus cycles when both try to reach the same destination at the same time.
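
The arbitration rule above can be illustrated with a small fixed-priority model. This is only a sketch under stated assumptions: the function `arbiter_grant`, the constant `NUM_MASTERS`, and the convention that a lower number means a higher priority are all hypothetical, and real arbiters may use other schemes such as round-robin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_MASTERS 4     /* hypothetical number of bus masters */
#define NO_GRANT    (-1)  /* returned when no master gets the bus */

/* Fixed-priority arbiter model.
 * request_mask: bit i set means master i is requesting the bus.
 * priority[i]:  lower value means higher priority (assumed convention).
 * bus_busy:     nonzero while another transaction owns the bus.
 * Returns the index of the granted master, or NO_GRANT. */
static int arbiter_grant(uint32_t request_mask,
                         const uint8_t priority[NUM_MASTERS],
                         int bus_busy)
{
    assert(priority != NULL);
    if (bus_busy) {
        return NO_GRANT; /* requests wait until the bus is free */
    }
    int winner = NO_GRANT;
    for (int i = 0; i < NUM_MASTERS; i++) {
        if (!(request_mask & (1u << i))) {
            continue; /* master i is not requesting */
        }
        if (winner == NO_GRANT || priority[i] < priority[winner]) {
            winner = i;
        }
    }
    return winner;
}
```

With priorities `{2, 0, 1, 3}`, simultaneous requests from masters 0 and 2 are resolved in favor of master 2, and master 0 (for instance the CPU) has to wait for a later cycle.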

The bus matrix routes accesses from masters to slaves, allowing several masters to operate concurrently and efficiently. The master buses are described below:

    • I-Bus: The Instruction Bus is used by the core to fetch instructions. The target of this bus is a memory containing code, such as Flash, Static RAM, or External Memory.
    • D-Bus: The Data Bus is used by the core for literal loads and debug access. The target of this bus is a memory containing code or data, such as Flash, Static RAM, or External Memory.
    • S-Bus: The System Bus is used to access data in a peripheral or in SRAM. It can also fetch instructions, but is slower than the I-Bus and D-Bus. It connects to all the slaves except Flash.
    • DMA Memory: This bus is used by the DMA to transfer data to and from memories.
    • DMA Peripheral: This bus is used by the DMA to access AHB peripherals or to perform memory-to-memory transfers. The targets of this bus are the AHB and APB peripherals and the data memories. Whether a given DMA controller's peripheral port can reach the memories depends on the system architecture: on STM32F4 devices, for example, only DMA2 can perform memory-to-memory transfers, because the DMA1 peripheral port is not connected to the bus matrix.
    • USB OTG HS Bus: This bus is used by the USB OTG to load/store data in memory.

Take the ARM Cortex-M4 architecture as an example, where the system consists of a 32-bit multilayer AHB bus matrix that interconnects multiple masters and slaves. The CPU instruction, data, and system buses, the DMAs, and the USB act as masters. Flash, SRAM, and all the peripherals connected to the AHB and APB buses act as slaves.

5. Modes of DMA

The DMA controller supports two transfer modes:

    1. Burst Transfer – The entire chunk of data is transferred continuously without interruption. The control of the bus is given back to the CPU when the entire chunk of data is transferred. The disadvantage of this transfer is that if the CPU needs any data from memory during this time, it needs to wait until the transfer is complete.  
    2. Split cycle transfer – The control of the bus is given back to the CPU periodically. The data transfer doesn’t happen continuously. The advantage is that the CPU doesn’t sit idle for a given amount of time, waiting for data from memory.  
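
The trade-off between the two modes can be made concrete with a small bus-cycle model. It is only a sketch under simplifying assumptions: one data unit moves per bus cycle, the CPU takes `cpu_slot` cycles each time the bus is handed back, and the names `bus_stats_t` and `simulate_transfer` are hypothetical. Setting `chunk` equal to `words` models a burst transfer; a smaller `chunk` models a split cycle transfer.

```c
#include <assert.h>

typedef struct {
    unsigned total_cycles; /* bus cycles until the DMA transfer finishes */
    unsigned max_cpu_wait; /* longest stretch the CPU is kept off the bus */
} bus_stats_t;

/* words:    data units the DMA must move (one per bus cycle, assumed)
 * chunk:    units moved before the bus is handed back to the CPU
 * cpu_slot: cycles the CPU uses each time it gets the bus back */
static bus_stats_t simulate_transfer(unsigned words, unsigned chunk,
                                     unsigned cpu_slot)
{
    assert(chunk > 0);
    bus_stats_t s = { 0, 0 };
    unsigned left = words;
    while (left > 0) {
        unsigned n = left < chunk ? left : chunk;
        s.total_cycles += n;            /* DMA owns the bus for n cycles */
        if (n > s.max_cpu_wait) {
            s.max_cpu_wait = n;
        }
        left -= n;
        if (left > 0) {
            s.total_cycles += cpu_slot; /* bus returned to the CPU */
        }
    }
    return s;
}
```

Under these assumptions, moving 64 units as one burst finishes in 64 cycles but can keep the CPU waiting for all 64, whereas chunks of 8 with 4-cycle CPU slots take 92 cycles in total while the CPU never waits more than 8.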

DMA can also be configured in one of three modes, according to the direction of the transfer:

    • Memory to Memory  
    • Peripheral to Memory  
    • Memory to Peripheral   
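
One way to picture the three modes is as a configuration record in which the direction decides which address advances. The sketch below is hypothetical (`dma_config_t`, `dma_make_config`, and the field names are not taken from any particular controller) and assumes the common arrangement in which a peripheral data register sits at a fixed address while memory addresses increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    DMA_MEM_TO_MEM,
    DMA_PERIPH_TO_MEM,
    DMA_MEM_TO_PERIPH
} dma_direction_t;

/* Hypothetical transfer configuration; real controllers expose
 * similar settings through device-specific registers. */
typedef struct {
    dma_direction_t dir;
    uintptr_t src_addr;
    uintptr_t dst_addr;
    bool src_increment; /* advance the source address after each unit */
    bool dst_increment; /* advance the destination address after each unit */
    uint16_t count;     /* number of data units to move */
} dma_config_t;

static dma_config_t dma_make_config(dma_direction_t dir, uintptr_t src,
                                    uintptr_t dst, uint16_t count)
{
    dma_config_t cfg = {
        .dir = dir, .src_addr = src, .dst_addr = dst,
        .src_increment = true, .dst_increment = true, .count = count
    };
    /* Assumption: a peripheral data register stays at one fixed address. */
    if (dir == DMA_PERIPH_TO_MEM) {
        cfg.src_increment = false;
    }
    if (dir == DMA_MEM_TO_PERIPH) {
        cfg.dst_increment = false;
    }
    return cfg;
}
```

For a peripheral-to-memory transfer, the source stays fixed on the data register while the destination walks through the buffer; memory-to-memory transfers advance both addresses.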

Memory Division

6. Benefits of DMA

  • Data Transfer Challenge: In computer systems, transferring data between peripheral devices (like hard drives, network cards) and memory traditionally involves the CPU.
  • CPU Bottleneck: The CPU's involvement in every data transfer task creates a bottleneck, limiting overall system efficiency.
  • Solution - DMA: DMA is a solution that allows peripherals to access the system's memory directly, bypassing the CPU for data transfers.
  • Offloading CPU: By offloading these data transfer tasks, DMA frees up the CPU to focus on other essential computations.
  • Improved Speeds: DMA improves data transfer speeds because the DMA controller moves the data in hardware, without the CPU executing instructions for every unit transferred.
  • Efficiency in I/O Operations: DMA is particularly beneficial in Input/Output operations, such as reading from or writing to storage devices or transmitting data over a network.
  • Parallel Processing: DMA lets transfers proceed in parallel with computation, so the CPU can handle other tasks without being tied up in every data transfer.
  • Enhanced System Performance: The result is an overall enhancement in system performance, especially noticeable in scenarios involving large data sets, multimedia applications, or heavy I/O operations.
  • Buffering Capability: DMA often incorporates buffering, temporarily storing data, which aids in smoother and more efficient data transfers.
  • Critical for Real-time Applications: In real-time applications like video streaming or audio playback, DMA ensures a steady flow of data without interruptions, contributing to a seamless user experience.

7. Conclusion

Consider an application in which an SPI-based temperature sensor is connected to the microcontroller and the firmware running on the CPU wants to read data from the sensor. The transfer can be done using DMA, without CPU intervention, so that the data is stored in memory without delay even when the CPU is busy with another high-priority task. The communication steps are as follows:

    • Once the SPI has data available in its data register, it sends a request to the CPU. The CPU signals the DMA to perform a read/write between the SPI and the memory.
    • The DMA checks with the SPI whether it is ready for the transfer; if it is, the DMA sends a DMA request to the CPU.
    • The CPU sends an acknowledge signal to the DMA and hands over control of the system bus. It also provides the starting address of the data and the data count.
    • The DMA receives one unit of data from the SPI, stores it in its data register, and then transfers it to the memory. The data count is decremented by 1 and the address register is incremented by 1.
    • When the data transfer is complete, the DMA sends an interrupt signal to the CPU. Upon receiving it, the CPU takes back control of the system bus.
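
The per-unit bookkeeping in the steps above (data register, address register, data count, completion interrupt) can be modelled in a few lines of C. This is a software simulation under stated assumptions, not a driver for any real controller: `dma_channel_t`, `dma_start`, and `dma_on_request` are hypothetical names, and the SPI data register is represented by an ordinary variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one DMA channel. */
typedef struct {
    const volatile uint8_t *periph_data_reg; /* fixed source: SPI data register */
    uint8_t *mem_addr;          /* address register (next destination) */
    size_t   count;             /* remaining data count */
    uint8_t  data_reg;          /* DMA internal data register */
    bool     transfer_complete; /* models the interrupt raised to the CPU */
} dma_channel_t;

/* The CPU provides the starting address and the data count. */
static void dma_start(dma_channel_t *ch, const volatile uint8_t *periph,
                      uint8_t *dst, size_t count)
{
    assert(ch != NULL && periph != NULL && dst != NULL);
    ch->periph_data_reg = periph;
    ch->mem_addr = dst;
    ch->count = count;
    ch->data_reg = 0;
    ch->transfer_complete = false;
}

/* Handles one request from the peripheral by moving one data unit.
 * Returns true while more units remain to be transferred. */
static bool dma_on_request(dma_channel_t *ch)
{
    assert(ch != NULL);
    if (ch->count == 0) {
        return false; /* nothing left to transfer */
    }
    ch->data_reg = *ch->periph_data_reg; /* read the unit from the SPI */
    *ch->mem_addr = ch->data_reg;        /* write it to memory */
    ch->mem_addr++;                      /* address register + 1 */
    ch->count--;                         /* data count - 1 */
    if (ch->count == 0) {
        ch->transfer_complete = true;    /* signal completion to the CPU */
    }
    return ch->count > 0;
}
```

In this model the CPU's only work is the call to `dma_start` and reacting to `transfer_complete`; each call to `dma_on_request` stands in for hardware activity that needs no CPU instructions.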

In conclusion, DMA makes fast data transfer between memory and peripherals possible. The CPU and the DMA can also operate concurrently, which improves performance. DMA also helps reduce power consumption when the CPU can remain in sleep mode during transfers.