Freescale i.MX SDMA tutorial (part I)

This post was written by eli on October 25, 2011
Posted Under: ARM,NXP (Freescale)

This is part I of a brief tutorial about the i.MX51’s SDMA core. The SDMA in other i.MX devices, e.g. i.MX25, i.MX53 and i.MX6, is the same, except for different register addresses and different chapter numbers in the Reference Manual.

Freescale’s Linux drivers for DMA also vary significantly across different kernel releases. It looks like they had two competing sets of code, and couldn’t make up their minds which one to publish.

This is by no means a replacement for reading the Reference Manual, but rather an introduction to make the landing softer. The division into parts goes as follows:

Part I: Introduction, addressing and the memory map (this page)
Part II: Contexts, Channels, Scripts and their execution
Part III: Events and Interrupts
Part IV: Running custom SDMA scripts in Linux

NOTE: For more information, in particular on SDMA for i.MX6 and i.MX7, there’s a follow-up post written by Jonah Petri.

Introduction

Behind all the nice words, the SDMA subsystem is just a small and simple RISC processor core, with its own private memory space and some specialized functional units. It works side-by-side with the main ARM processor (the application processor henceforth), and pretty much detached from it. Special registers allow the application processor to control the SDMA’s core, and special commands on the SDMA’s core allow it to access the application processor’s memory space and send it interrupts. But in their natural flow, the two don’t interact.

The underlying idea behind the SDMA core is that instead of hardwiring the DMA subsystem’s capabilities and possible behaviors, why not write small programs (scripts henceforth), which perform the necessary memory operations? By doing so, the possible DMA operations and variants are not predefined by the chip’s vendor; the classic DMA operations are still possible and available with vendor-supplied scripts, but the DMA subsystem can be literally programmed to do a lot of other things. Offloading RAID XOR calculations is an example of something that can be taken off the main processor, as the data is being copied from disk buffers to the peripherals with DMA.

Scripts are kicked off either by some internal event (say, some peripheral has data to offer) or directly by the main processor’s software (e.g. an offload memcpy). The SDMA processor’s instruction set is simple, all opcodes occupying exactly 16 bits in program memory. Its assembler can be acquired from Freescale, or you can download my mini-assembler, which is suitable for small projects (in part IV).

Chapter 52 in the Reference Manual is dedicated to the SDMA, but unfortunately it’s not easy reading. In the hope of clarifying a few things, I’ve written down the basics. Please keep in mind that the purpose of my own project was to perform memory-to-memory transfers triggered autonomously by an external device, so I’ve given very little attention to the built-in scripts and handling DMA from built-in peripherals.

Quirky memory issues

I wouldn’t usually start the presentation of a processor with its memory map and addressing, but in this case it’s necessary, as it’s a major source of confusion.

The SDMA core processor has its own memory space, which is completely detached from the application processor’s. There are two modes of access to the memory space: Instruction mode and data mode.

Instruction mode is used in the context of jumps, branches and when calling built-in subroutines which were written with program memory in mind. In this mode, the address points at a 16-bit word (which matches the size of an opcode), so the program counter is incremented by one after each instruction (except for jumps, of course).

Data mode is used when reading from the SDMA’s memory (e.g. loading registers) or writing to it. This should not be confused with the application processor’s memory (the one Linux sees, for example), which is not directly accessible by the SDMA core. In data mode, addressing works on 32-bit words, so incrementing the data mode address (by one) means moving forward four bytes.

Instruction mode and data mode addresses point at exactly the same physical memory space. It’s possible to write data to RAM in data mode, and then execute it as a script, the latter essentially reading from RAM in instruction mode. It’s important to note that different addresses will be used for each. This is best explained with a simple example:

Suppose that we want to run a routine (script) written by ourselves. To do so, it has to be copied into the internal RAM first. How to do that is explained in part IV, but let’s assume that we want to execute our script with a JMP instruction to 0x1800. This is 12 kB from the zero-address of the memory map, since the 0x1800 address is given in 16-bit quanta (2 bytes per address count). After the script is loaded in its correct place, we’ll be able to read the first instruction (as a piece of data) as follows: Set one of the SDMA processor’s registers to the value 0x0c00, and then load from the address pointed to by that register. The address, 0x0c00, is given in 32-bit quanta (4 bytes per address count), so it hits exactly the same place: 12 kB from the zero-address. And since we’re reading 32 bits, we’ll read the first instruction as well as the second at the same time.

Let’s say it loud and clear:

Instruction mode addresses are always double their data mode equivalents.
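
The arithmetic behind this rule can be sketched in a few lines of Python. This is only an illustration of the address conversion described above, not code that runs on the SDMA core; the function names are made up for this example.

```python
# Address arithmetic for the SDMA's two addressing modes.
# Function names are hypothetical, chosen for this illustration only.

INSTRUCTION_UNIT_BYTES = 2  # instruction mode counts 16-bit words
DATA_UNIT_BYTES = 4         # data mode counts 32-bit words


def instruction_to_byte_offset(addr):
    """Byte offset from the zero-address for an instruction mode address."""
    return addr * INSTRUCTION_UNIT_BYTES


def data_to_byte_offset(addr):
    """Byte offset from the zero-address for a data mode address."""
    return addr * DATA_UNIT_BYTES


def data_to_instruction(addr):
    """Instruction mode address of the first opcode in a 32-bit data word."""
    return addr * 2


def instruction_to_data(addr):
    """Return (data mode address, index of the opcode within that word).

    Index 0 is the first opcode of the 32-bit word, index 1 the second.
    """
    return addr // 2, addr % 2


# The example from the text: a script at instruction address 0x1800
print(hex(instruction_to_byte_offset(0x1800)))  # 0x3000, i.e. 12 kB
print(hex(data_to_byte_offset(0x0c00)))         # 0x3000 as well
print(instruction_to_data(0x1801))              # second opcode of word 0x0c00
```

Both 0x1800 (instruction mode) and 0x0c00 (data mode) land on byte offset 0x3000, which is the 12 kB mentioned above.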

As for endianness, the SDMA core thinks Big Endian all the way through. That means that when reading two assembly opcodes from memory in data mode, we get a 32-bit word, for which the first instruction is on bits [31:16] and the instruction following it on bits [15:0].
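
As a small sketch of this layout, here is how a 32-bit word read in data mode splits into its two opcodes, and how two opcodes pack into one word. The function names are hypothetical, and the values in the usage lines are arbitrary placeholders, not real SDMA opcodes.

```python
def split_opcodes(word):
    """Split a 32-bit data mode word into (first opcode, second opcode)."""
    word &= 0xFFFFFFFF
    first = (word >> 16) & 0xFFFF   # bits [31:16]: the instruction executed first
    second = word & 0xFFFF          # bits [15:0]: the instruction following it
    return first, second


def join_opcodes(first, second):
    """Pack two consecutive 16-bit opcodes into one 32-bit data mode word."""
    return ((first & 0xFFFF) << 16) | (second & 0xFFFF)


# Arbitrary placeholder values, only to show which half goes where
first, second = split_opcodes(0x12345678)
print(hex(first), hex(second))
print(hex(join_opcodes(first, second)))
```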

The memory map

Since we’re at it, and since the Reference Manual has this information spread all over, here’s a short outline of what’s mapped where, in data addresses.

0x0000-0x03ff: 4 kB of internal ROM with boot code and standard routines
0x0400-0x07ff: 4 kB of reserved space. No access at all should take place here
0x0800-0x0bff: 4 kB of internal RAM, containing the 32 channels’ contexts (each context is 32 words of 4 bytes each, when SMSZ is set in the CHN0ADDR register). More about this in part II. For the details, see Section 52.13.4 in the Reference Manual. When SMSZ is clear, this segment is 3 kB only (see 52.4.4).
0x0c00-0x0fff: 4 kB of internal RAM, free for end-user application scripts and data.
0x1000-0x6fff: Peripherals 1-6 memory space
0x7000-0x7fff: SDMA registers, as accessed directly by the SDMA core (as detailed in section 52.14 of the reference manual)
0x8000-0xffff: Peripherals 7-14 memory space (not accessible in program memory space)
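
The list above can also be written as a lookup table, which makes it easy to check the sizes. The following Python sketch only restates the map as given here; the names are made up for this example. The figure of 24 words per context with SMSZ clear is not stated above, but derived from the 3 kB figure (3072 bytes / 32 channels / 4 bytes).

```python
# The memory map from the list above, in data mode addresses (32-bit words).
MEMORY_MAP = [
    (0x0000, 0x03FF, "internal ROM"),
    (0x0400, 0x07FF, "reserved"),
    (0x0800, 0x0BFF, "internal RAM: channel contexts"),
    (0x0C00, 0x0FFF, "internal RAM: user scripts and data"),
    (0x1000, 0x6FFF, "peripherals 1-6"),
    (0x7000, 0x7FFF, "SDMA registers"),
    (0x8000, 0xFFFF, "peripherals 7-14"),
]


def region_of(data_addr):
    """Return the name of the region that a data mode address falls in."""
    for start, end, name in MEMORY_MAP:
        if start <= data_addr <= end:
            return name
    raise ValueError("address outside the 16-bit data address range")


def region_size_bytes(start, end):
    """Size in bytes of an inclusive data mode address range."""
    return (end - start + 1) * 4


def context_area_bytes(words_per_context, channels=32):
    """Bytes occupied by all channel contexts."""
    return channels * words_per_context * 4


print(region_of(0x0C00))                    # where user scripts go
print(region_size_bytes(0x0000, 0x03FF))    # 4096 bytes of ROM
print(context_area_bytes(32))               # 4096 bytes with SMSZ set
print(context_area_bytes(24))               # 3072 bytes with SMSZ clear (derived)
```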

Accessing peripherals through these two regions of peripheral memory space is the preferred way (unlike the implementation in the Linux drivers, which use SDMA scripts), as discussed in another post of mine.

And once again: The memory map above is given in data addresses. The memory map in program memory space is the same, only all addresses are doubled.
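
For example, doubling the boundaries of the ROM and RAM regions gives their program memory addresses. A minimal sketch, with a hypothetical helper name:

```python
def data_range_to_instruction_range(start, end):
    """Convert an inclusive data mode range to an inclusive instruction mode range.

    Each 32-bit data word holds two 16-bit opcodes, so the last data word
    covers instruction addresses end*2 and end*2 + 1.
    """
    return start * 2, end * 2 + 1


for name, start, end in [
    ("ROM", 0x0000, 0x03FF),
    ("contexts", 0x0800, 0x0BFF),
    ("user RAM", 0x0C00, 0x0FFF),
]:
    lo, hi = data_range_to_instruction_range(start, end)
    print(f"{name}: 0x{lo:04x}-0x{hi:04x}")
```

The user RAM region starts at instruction address 0x1800, which is the JMP target used in the example earlier.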

So much for part I. You may want to go on with Part II: Contexts, Channels, Scripts and their execution

Reader Comments

Hi Eli
I would like to utilise SDMA from the M4 on i.mx7 for the ecspi transfer. What would be the best way to achieve this?

Written By Yassin on September 13th, 2017 @ 07:33
