Freescale i.MX51 SDMA tutorial (part II)
This is part II of a brief tutorial about the i.MX51’s SDMA core. The SDMA for other i.MX devices, e.g. i.MX25, i.MX53 and i.MX6, is exactly the same, except for the registers’ addresses and the chapter numbers in the Reference Manual.
This is by no means a replacement for reading the Reference Manual, but rather an introduction to make the landing softer. The division into parts goes as follows:
- Part I: Introduction, addressing and the memory map
- Part II: Contexts, Channels, Scripts and their execution (this page)
- Part III: Events and Interrupts
- Part IV: Running custom SDMA scripts in Linux
Contexts and channels
The SDMA’s purpose is to service requests from hardware or from the application processor. In a way, it’s like a processor with no idle task, just interrupts. But the way the service is performed is different from interrupt handling.
Let’s assume that all scripts (those SDMA programs) are already present in the SDMA’s memory space. They may reside in the on-chip ROM or they’ve been loaded into RAM. How are they executed?
The answer lies in the contexts: Some of the SDMA’s RAM space is allocated for containing an array of structures. There are 32 such structures, each occupying 128 bytes (or 32 32-bit words), so all in all this block takes up 4 kB of memory (there’s a 96-byte variant as well, but we’ll leave it for now).
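The arithmetic above can be spelled out in a few lines of Python. This is only a sketch: the sizes come from the text, but CONTEXT_BASE is a made-up placeholder and context_offset() is a hypothetical helper, not something taken from the Reference Manual or any driver.

```python
# Sizes taken from the text: 32 channels, 32 words of 32 bits per context.
NUM_CHANNELS = 32
WORDS_PER_CONTEXT = 32
BYTES_PER_WORD = 4

CONTEXT_SIZE = WORDS_PER_CONTEXT * BYTES_PER_WORD   # 128 bytes
CONTEXT_AREA_SIZE = NUM_CHANNELS * CONTEXT_SIZE     # 4096 bytes = 4 kB

# Hypothetical placeholder, NOT a real address. Look up the actual base
# of the context area in the Reference Manual's memory map.
CONTEXT_BASE = 0x0000


def context_offset(channel: int, base: int = CONTEXT_BASE) -> int:
    """Return the byte offset of a channel's context structure."""
    if not 0 <= channel < NUM_CHANNELS:
        raise ValueError(f"channel must be in 0..{NUM_CHANNELS - 1}")
    return base + channel * CONTEXT_SIZE


if __name__ == "__main__":
    print(CONTEXT_SIZE, CONTEXT_AREA_SIZE)   # 128 4096
    print(hex(context_offset(5)))            # 0x280
```

Note that these are byte offsets. How they translate into the SDMA’s own addressing is the topic of part I.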
These structures do what their name implies: They contain the context of a certain execution thread. In other words, they contain everything that needs to be stored to resume execution at some point, as if it was never stopped. Since the SDMA core doesn’t have a stack, this information has to go to a fixed place. This includes the program counter, the registers and flags. Section 52.13.4 in the Reference Manual describes this structure in detail.
As mentioned, there’s an array of 32 of these structures. It means that the SDMA subsystem can maintain 32 contexts, or if you like, resemble a multitasking system with 32 independent threads. Or in SDMA terms: The SDMA core supports 32 DMA channels. This kinda connects with the common concept of DMA channels: Each channel has a certain purpose and particular flow.
The method to kick off a channel, so it will execute a certain script, is to write directly to the channel’s context structure, and then set up some flags to make it runnable. This is demonstrated in part IV. Since the context includes the program counter register, this controls where the execution starts. Other registers can be used to pass information to the script (that is, the SDMA “program”). What each register means upon such an invocation is up to the script’s API.
A script’s life cycle (scheduling)
So there are 32 contexts, one for each of the 32 channels. What makes a context load into the registers, so that its channel’s script executes? It’s time to talk about the scheduler. It’s described in painstaking detail in the Reference Manual, so let’s stick to the main points.
The scheduler’s main function is to decide which channel is the most eligible to spend time on the processor core. This decision is relevant only when the SDMA core isn’t running anything at all (a.k.a. “sleeping”) or when the currently running script voluntarily yields the processor. The SDMA core’s execution is non-preemptive, so the scheduler can’t force any script to stop running. In other words, if any script is (mistakenly) caught in an infinite loop, all DMA activity is as good as dead, most likely leading to a complete system hangup. Nothing can force a script to stop running (except for a reset or the debugger). Just a small thing to bear in mind when writing those scripts.
The SDMA core has a special instruction for yielding the processor, with the mnemonic “done”, which takes a parameter for choosing its variant. Two variants of this instruction have earned their own mnemonics, “yield” and “yieldge”. While “done” variant #3 (usually called just “done”) always yields the processor, the two others yield it only if there are other channels ready to execute with higher priority (or higher-or-equal priority for “yieldge”). But never mind the details. The overall picture is that the script runs until it issues a command saying “you must stop me now” (as in “done”) or “you may stop me now” (as in the two other variants).
Yielding only means that the registers are stored back into the context structure (with optimizations to speed this process up) and that another context may be loaded instead of it. Depending on which variant of “done” was used, plus some other factors, the scheduler may or may not reschedule the same channel automatically at a later time. That is, the context may be reloaded into the registers. So unless designed otherwise, the opcode directly after the “done” instruction will be executed at some later time. Hence a carefully written script never “ends”, it just gives up the processor until the next time the relevant channel is scheduled.
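To get a feeling for the difference between “you must stop me now” and “you may stop me now”, here’s a toy model of the cooperative scheduling in Python, with generators standing in for scripts. It’s a rough sketch and not how the hardware is implemented: all names are made up, a larger priority number is assumed to win, and a channel that says “done” is simply never triggered again.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Channel:
    number: int
    priority: int              # model assumption: larger number wins
    script: Iterator[str]      # generator standing in for an SDMA script
    runnable: bool = True


def script(chunks: int, may_stop: str) -> Iterator[str]:
    """Toy script: handle `chunks` pieces of work, then say "done"."""
    while True:                # a script never really "ends"
        for _ in range(chunks):
            yield may_stop     # "you may stop me now"
        yield "done"           # "you must stop me now"


def pick(channels: List[Channel],
         exclude: Optional[Channel] = None) -> Optional[Channel]:
    """Scheduler decision: the runnable channel with the highest priority."""
    ready = [c for c in channels if c.runnable and c is not exclude]
    return max(ready, key=lambda c: c.priority) if ready else None


def must_switch(current: Channel, request: str,
                channels: List[Channel]) -> bool:
    """Does this request make the running script give up the core?"""
    others = [c.priority for c in channels
              if c.runnable and c is not current]
    if request == "done":
        return True
    if request == "yield":
        return any(p > current.priority for p in others)
    if request == "yieldge":
        return any(p >= current.priority for p in others)
    raise ValueError(f"unknown request: {request}")


def run(channels: List[Channel]) -> List[Tuple[int, str]]:
    """Run until nothing is runnable; return (channel, request) pairs."""
    log = []
    current = pick(channels)
    while current is not None:
        request = next(current.script)   # nobody can interrupt this call
        log.append((current.number, request))
        if request == "done":
            current.runnable = False     # simplification: never re-triggered
        if must_switch(current, request, channels):
            current = pick(channels, exclude=current)
    return log


def demo(may_stop: str) -> List[Tuple[int, str]]:
    """Two equal-priority channels, two chunks of work each."""
    return run([Channel(1, 1, script(2, may_stop)),
                Channel(2, 1, script(2, may_stop))])


if __name__ == "__main__":
    print(demo("yield"))     # channel 1 finishes before channel 2 starts
    print(demo("yieldge"))   # the two channels take turns
```

With equal priorities, “yield” never hands over the core, so channel 1 runs all the way to its “done”. With “yieldge” the two channels alternate. And a script that never yields a request at all would hang run() forever, which is the model’s version of the infinite loop mentioned above.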
Channel eligibility
Now let’s look at what makes a channel eligible for execution. Leaving priority issues aside, let’s ask what makes a certain channel a candidate for having its context pushed into the SDMA core.
In some cases, the channel is set up to become eligible for execution without any further condition. This is the case for offload memory copy, for example. In other cases, the channel’s eligibility depends on some hardware event, typically some peripheral requesting service. The latter scenario resembles old-school interrupt handlers, only the interrupt isn’t serviced by the application processor, but wakes up a service thread (channel) in the SDMA core. And just as waking up a thread in a modern operating system doesn’t cause immediate execution, but rather sets some flag to make the thread eligible for a processor time slice, so it is with an SDMA channel wakeup: It’s just a flag telling the scheduler to push the channel’s context into the SDMA core when it sees fit.
The Reference Manual sums this up in section 52.4.3.5, saying that channel i is eligible to run if and only if the following expression is logical ‘1’:
(HE[i] or HO[i]) and (EP[i] or EO[i])
where HE[i], HO[i], EP[i], and EO[i] are flags belonging to the i’th channel. Let’s take them one by one:
- HE[i] stands for “Host Enable”, and is set and reset by the application processor by writing to registers. It’s also cleared by the “done” instruction, so it’s suitable for a scenario where the host kicks off a channel, and the script quits it.
- EP[i] stands for “Event Pending”, and is set when an external peripheral wants service (more about that mechanism later on). It’s cleared by one of the “done” variants, so this is the flag used when a peripheral kicks off a channel, and the script quits.
- HO[i] stands for “Host override”, and is controlled solely by a register written to by the application processor. Its purpose is to make the left-hand side of the expression always true, when we want the channel’s eligibility to be controlled by the peripheral only.
- EO[i] stands for “Event override”, and is like HO[i] in the way it’s handled. This flag is set when we want the channel’s eligibility to be controlled by the host only.
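The expression is simple enough to try out directly. Below is a small Python sketch of it, once for a single channel and once as bit masks covering all 32 channels. The function names are invented for this example, and the convention that bit i belongs to channel i is an assumption of the model.

```python
def eligible(he: bool, ho: bool, ep: bool, eo: bool) -> bool:
    """Eligibility of a single channel, straight from the expression."""
    return (he or ho) and (ep or eo)


def eligible_mask(he: int, ho: int, ep: int, eo: int) -> int:
    """Same expression for all 32 channels at once; bit i = channel i."""
    return (he | ho) & (ep | eo) & 0xFFFFFFFF


if __name__ == "__main__":
    # Host-controlled channel 3: EO is set, and the host has set HE.
    print(hex(eligible_mask(he=1 << 3, ho=0, ep=0, eo=1 << 3)))   # 0x8
    # Peripheral-controlled channel 5: HO is set, but no event yet.
    print(hex(eligible_mask(he=0, ho=1 << 5, ep=0, eo=0)))        # 0x0
```

Each override flag just forces “its” side of the AND to true, so the other side alone decides.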
There are four registers in the application processor’s memory space, which are used to alter these flags: STOP_STAT, HSTART, EVTOVR and HOSTOVR. They are outlined in sections 52.12.3.3-52.12.3.7 in the Reference Manual.
The full truth is that there’s also a DO[i] flag mentioned (controlled by the DSPOVR register), but it must be held ‘1’ on i.MX51 devices, so let’s ignore it.
So if our case is the application processor controlling the i’th SDMA channel for an offload operation, it sets EO[i], clears HO[i], and then sets HE[i] whenever it wants to have the script running. The script may clear HE[i] with a “done” instruction, or the application processor may clear it when appropriate. For example, the script can trigger an interrupt on the application processor, which then clears the flag (even though I can’t see when this would be the right way to do it).
In the case of channels being started by a peripheral, the application processor sets HO[i] and clears EO[i]. Certain events (as discussed next) set the EP[i] flag directly, and the script’s “done” instruction clears it.
Keep in mind that a script shouldn’t run continuously for long: It should execute “yield” instructions every now and then to give other channels a chance to use the SDMA core. Since neither HE[i] nor EP[i] is affected by yields, the channel remains eligible, and the script will keep running until it’s, well, done.
It’s possible to reset the SDMA core or force a reschedule with the SDMA’s RESET register, but that’s really something for emergencies (e.g. a runaway script).
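The two setups can be walked through with a small Python model of the four flags. Again, this is just a sketch: the class and function names are made up, and the flags are plain attributes rather than bits in real registers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChannelFlags:
    he: bool = False   # set by the host, cleared by "done"
    ho: bool = False   # host override
    ep: bool = False   # set by a peripheral event, cleared by "done"
    eo: bool = False   # event override

    def eligible(self) -> bool:
        return (self.he or self.ho) and (self.ep or self.eo)


def host_controlled_demo() -> List[bool]:
    ch = ChannelFlags(eo=True)      # setup: EO set, HO clear
    states = [ch.eligible()]        # idle, nothing to do
    ch.he = True                    # the host kicks off the channel
    states.append(ch.eligible())
    # The script yields here: no flag changes, still eligible.
    states.append(ch.eligible())
    ch.he = False                   # the script's "done" clears HE
    states.append(ch.eligible())
    return states


def peripheral_controlled_demo() -> List[bool]:
    ch = ChannelFlags(ho=True)      # setup: HO set, EO clear
    states = [ch.eligible()]        # idle, no event yet
    ch.ep = True                    # a peripheral requests service
    states.append(ch.eligible())
    # The script yields here: no flag changes, still eligible.
    states.append(ch.eligible())
    ch.ep = False                   # the script's "done" clears EP
    states.append(ch.eligible())
    return states


if __name__ == "__main__":
    print(host_controlled_demo())         # [False, True, True, False]
    print(peripheral_controlled_demo())   # [False, True, True, False]
```

Both cases follow the same pattern: one flag is pinned by an override during setup, and the other one goes up to start the script and comes down when the script is done.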
So much for part II. You may want to go on with Part III: Events and Interrupts