Motivation
On an embedded system, I have a device file /dev/xillybus_audio, which can be opened for read and/or write. One can write raw samples (signed 16 bit little endian, 48000 Hz, stereo) to this file, and they’re played on a “headphones out” plug. One can also read samples in the same format, which are captured from a “mic input” plug. Clean and simple. Now let’s use that as an ALSA sound interface. This is where it doesn’t get all that simple.
How about a kernel driver for that interface? Nice idea, but the stream interface is already there. Besides, this is useful for piping with programs etc.
Attempt I
To make a long story short: with /etc/asound.conf reading like this, playback works (but capturing doesn’t!).
pcm.xillybus {
type asym
playback.pcm {
type plug
slave {
pcm {
type file
file "/dev/xillybus_audio"
slave.pcm null
format raw
}
rate 48000
format s16_le
channels 2
}
}
capture.pcm {
type plug
slave {
pcm {
type file
file "/dev/null"
infile "/dev/xillybus_audio"
slave.pcm null
}
rate 48000
format s16_le
channels 2
}
}
}
Playback works with rates other than 48000 Hz (and other formats), because of the wrapping with the “plug” plugin.
# aplay -D "xillybus" rate8000.wav
Playing WAVE 'rate8000.wav' : Signed 16 bit Little Endian, Rate 8000 Hz, Stereo
Note that “file” (which defines the output file) must be defined, or arecord (or whatever program is used) quits with a segmentation fault. Not very polished.
Capturing doesn’t work at all, however. It just produces a file of silence, which grows way too fast. It has been said that the slave of the capturing device shouldn’t be null, and indeed this is probably the issue.
Diving into it
The problem seems to lie in the implementation of the file capture routine. Taken from alsa-lib-1.0.27.2/src/pcm/pcm_file.c:
static snd_pcm_sframes_t snd_pcm_file_readi(snd_pcm_t *pcm, void *buffer, snd_pcm_uframes_t size)
{
snd_pcm_file_t *file = pcm->private_data;
snd_pcm_channel_area_t areas[pcm->channels];
snd_pcm_sframes_t n;
n = snd_pcm_readi(file->gen.slave, buffer, size);
if (n <= 0)
return n;
if (file->ifd >= 0) {
n = read(file->ifd, buffer, n * pcm->frame_bits / 8);
if (n < 0)
return n;
return n * 8 / pcm->frame_bits;
}
snd_pcm_areas_from_buf(pcm, areas, buffer);
snd_pcm_file_add_frames(pcm, areas, 0, n);
return n;
}
This is the method, which the plugin exposes for reading samples. Note that it attempts to read the desired amount of samples from the slave first, and then attempts to fetch the same number of samples it got from the slave, from the file. This probably makes sense when reading from a plain file, because it would otherwise slurp the entire file in no-time. The slave is used as a data rate controller. Great.
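The byte and frame bookkeeping in snd_pcm_file_readi() can be sketched in isolation. The helper names below are made up for illustration (they are not alsa-lib functions), and the figure of 32 bits per frame assumes the s16_le stereo format used here (2 channels × 16 bits). Note how a short read() gets truncated down to whole frames.

```c
#include <stddef.h>

/* Hypothetical helpers, not alsa-lib API: they mirror the arithmetic
 * that snd_pcm_file_readi() applies around its read() call. */

/* Bytes to request from the file for a given number of frames. */
static size_t sketch_frames_to_bytes(size_t frames, unsigned frame_bits)
{
    return frames * frame_bits / 8;
}

/* Frames reported back for the number of bytes read() returned.
 * A partial frame at the end is silently dropped by the division. */
static long sketch_bytes_to_frames(long bytes, unsigned frame_bits)
{
    return bytes * 8 / (long)frame_bits;
}
```

With these, sketch_frames_to_bytes(1024, 32) gives 4096, which is what a 1024-frame request amounts to in bytes on the device file.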
Attempt II
To work around this, I changed /etc/asound.conf to this:
pcm.xillybus_raw {
type file
file "/dev/xillybus_audio"
slave.pcm null
format raw
}
pcm.xillybus_play {
type plug
slave {
pcm "xillybus_raw"
rate 48000
format s16_le
channels 2
}
}
pcm.xillybus {
type asym
playback.pcm "xillybus_play"
capture.pcm {
type plug
slave {
pcm {
type file
file "/dev/null"
infile "/dev/xillybus_audio"
slave.pcm "xillybus_play"
}
rate 48000
format s16_le
channels 2
}
}
}
This isn’t perfect either. When attempting
# arecord -D "xillybus" --rate 48000 --channels 2 --format s16_le try.wav
sound is indeed recorded into try.wav. The captured sound is echoed in the headphones (with a delay), so the output interface is now busy and noisy. But worst of all, this only works if the parameters are set exactly to the sound interface’s. So I could have read directly from /dev/xillybus_audio as well.
Changes in pcm_file.c
Based upon alsa-lib-1.0.25, the following functions were changed in pcm_file.c. The intention of these changes is to detach the I/O operations from the slave, which is null in the setting of Attempt I above.
static int snd_pcm_file_drop(snd_pcm_t *pcm)
{
return 0;
}
static int snd_pcm_file_drain(snd_pcm_t *pcm)
{
return 0;
}
static snd_pcm_sframes_t snd_pcm_file_readi(snd_pcm_t *pcm, void *buffer, snd_pcm_uframes_t size)
{
snd_pcm_file_t *file = pcm->private_data;
snd_pcm_channel_area_t areas[pcm->channels];
snd_pcm_sframes_t n;
n = read(file->ifd, buffer, size * pcm->frame_bits / 8);
if (n < 0)
return n;
return n * 8 / pcm->frame_bits;
}
static snd_pcm_sframes_t snd_pcm_file_readn(snd_pcm_t *pcm, void **bufs, snd_pcm_uframes_t size)
{
snd_pcm_file_t *file = pcm->private_data;
snd_pcm_channel_area_t areas[pcm->channels];
snd_pcm_sframes_t n;
SNDERR("DEBUG: Noninterleaved read not yet implemented.\n");
return 0; /* TODO: Noninterleaved read */
}
(these functions don’t appear one after the other in the source file)
Compiling to obtain libasound.so
After making the changes in pcm_file.c, compile natively on the embedded board with
# ./configure
# make -j 2
and then copy the result to the library directory:
# cp src/.libs/libasound.so.2.0.0 /usr/lib/arm-linux-gnueabihf/
This overwrites the previous file.
Plugin library issue
When attempting to use a sound interface with the new libasound, the following error occurs:
# aplay -D "xillybus" snip.wav
ALSA lib conf.c:3314:(snd_config_hooks_call) Cannot open shared library libasound_module_conf_pulse.so
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM xillybus
aplay: main:682: audio open error: No such file or directory
This is because the plugin loader looks in the wrong directories: It does look in /usr/lib/arm-linux-gnueabihf/, but not in /usr/lib/arm-linux-gnueabihf/alsa-lib/.
Trying configure parameters didn’t help:
# ./configure --libdir=/usr/lib/arm-linux-gnueabihf/
# ./configure --with-plugindir=/usr/lib/arm-linux-gnueabihf/alsa-lib/
The dirty solution was to create symbolic links to all files in alsa-lib/ that aren’t symbolic links themselves with
# cd /usr/lib/arm-linux-gnueabihf
# for i in `find alsa-lib/ -type f -a ! -type l` ; do ln -s "$i" ; done
Not the most elegant solution, but after spending a couple of hours on trying to figure this out, at least it works.
This is the list of files that were symlinked:
alsa-lib/libasound_module_pcm_speex.so
alsa-lib/libasound_module_ctl_oss.so
alsa-lib/libasound_module_ctl_pulse.so
alsa-lib/libasound_module_pcm_usb_stream.so
alsa-lib/libasound_module_pcm_pulse.so
alsa-lib/libasound_module_rate_samplerate.so
alsa-lib/libasound_module_ctl_bluetooth.so
alsa-lib/libasound_module_pcm_jack.so
alsa-lib/libasound_module_pcm_upmix.so
alsa-lib/libasound_module_pcm_bluetooth.so
alsa-lib/libasound_module_conf_pulse.so
alsa-lib/libasound_module_pcm_oss.so
alsa-lib/libasound_module_rate_speexrate.so
alsa-lib/libasound_module_ctl_arcam_av.so
alsa-lib/libasound_module_pcm_vdownmix.so
alsa-lib/smixer/smixer-sbase.so
alsa-lib/smixer/smixer-ac97.so
alsa-lib/smixer/smixer-hda.so
Current position
Both recording and playback work with the first asound.conf above (a.k.a. Attempt I), as long as the recording parameters match the interface’s. For recording, the parameters must be 48000 Hz, s16_le, but it’s fine to work in mono or stereo. If other parameters are attempted, a huge file which is filled with click sounds is created.
So
# arecord -D "xillybus" --rate 48000 --format s16_le --channels 2 good.wav
Recording WAVE 'good.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
makes a nice sound file, but
# arecord -D "xillybus" --format s16_le --channels 2 justclicks.wav
Recording WAVE 'justclicks.wav' : Signed 16 bit Little Endian, Rate 8000 Hz, Stereo
creates a huge file with just clicks. The intriguing thing about those failing just-click scenarios, is that snd_pcm_file_readi() is never called when this happens. It looks like something in the data flow goes wrong when a rate resampler is pushed into the system. When the rate is the same, snd_pcm_file_readi() has been observed to be called in a steady way.
Both of the following are OK:
# aplay -D "xillybus" good.wav
Playing WAVE 'good.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
# aplay -D "xillybus" rate8000.wav
Playing WAVE 'rate8000.wav' : Signed 16 bit Little Endian, Rate 8000 Hz, Stereo
Background
This is just things I wrote down while trying to make aplay play sound through a fake module-based Pulseaudio sink. Spoiler: I failed.
Spoiler II: Sometimes people ask me questions about posts I write. I don’t think I’ll answer any on this post. I’ll probably forget all about this five minutes from now.
So –
It all worked fine as long as there was only one of those sinks, because it was set to the default of Pulseaudio, and apparently aplay was ready to consider Pulseaudio as a sound card, playing sound to whatever the user configured as its sink (on Pulseaudio’s configuration interface). But what if I wanted two of those at the same time? Problem.
The following segment in /etc/pulse/default.pa indeed creates two sources and sinks in Pulseaudio’s environment:
load-module module-file-sink file=/dev/xillybus_audio rate=48000
load-module module-file-source file=/dev/xillybus_audio rate=48000
load-module module-file-sink file=/dev/xillybus_audio2 rate=48000
load-module module-file-source file=/dev/xillybus_audio2 rate=48000
Using “pacmd list-sources” and “pacmd list-sinks”, I have found Pulseaudio’s names for these, so that it’s possible to record and play with
parecord -d 'fifo_input' --file-format=wav --rate=44100 > junk.wav
or
parecord -d 'fifo_input.2' --file-format=wav --rate=44100 > junk.wav
and then play the sound back with
paplay -d 'fifo_output' junk.wav
or
paplay -d 'fifo_output.2' junk.wav
And there’s always the possibility to fake old-style /dev/dsp devices with padsp, and use them with whatever applications work with these.
ALSA random jots
/usr/share/alsa/alsa.conf defines the well-known PCM names, “hw:” included, as pcm.hw { … } and also lists the other files to look at. So it’s definitely the place to start looking for understanding those name conventions. Or maybe the “hw:” prefix is no more than a reference to the ALSA hw plugin…?
The device’s name is used in aplay when it calls snd_pcm_open(), which is an alsa-lib call. In other words, aplay doesn’t use Pulseaudio in this case.
snd_pcm_open() is part of alsa-lib, in src/pcm/pcm.c. This function calls snd_pcm_open_noupdate(), which in turn calls snd_config_search_definition() with the device name. If that returns with an error, we get
ALSA lib pcm.c:2217:(snd_pcm_open_noupdate) Unknown PCM fifo_input
arecord: main:682: audio open error: No such file or directory
snd_config_search_definition() is defined in src/conf.c. It strips off anything after a ‘:’ (if such exists) and calls snd_config_search_alias_hooks() with the stripped name. So the crucial question seems to be how to convince Pulseaudio to inject an alias to the file sink, as it does when it’s the default.
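The name-stripping step can be pictured with a few lines of C. This is only a sketch of the behavior described above, not the code in src/conf.c, and sketch_base_name() is a made-up name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not alsa-lib code: copy the part of `name` that
 * precedes the first ':' into `out` (at most outlen - 1 characters,
 * always NUL-terminated). Returns the number of characters copied. */
static size_t sketch_base_name(const char *name, char *out, size_t outlen)
{
    size_t len = strcspn(name, ":"); /* length up to ':' or end of string */

    if (outlen == 0)
        return 0;
    if (len >= outlen)
        len = outlen - 1;            /* truncate rather than overflow */
    memcpy(out, name, len);
    out[len] = '\0';
    return len;
}
```

For example, “front:CARD=Intel,DEV=0” is reduced to “front” before the lookup, while “default” goes through unchanged.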
It looks like snd_device_name_hint() is useful (supersedes the deprecated snd_names_list() in src/names.c). The following C code snippet (found here)
{
    char **hints;
    /* Enumerate sound devices */
    int err = snd_device_name_hint(-1, "pcm", (void***)&hints);
    if (err != 0)
        return; /* Error! Just return */
    char **n = hints;
    while (*n != NULL) {
        char *name = snd_device_name_get_hint(*n, "NAME");
        if (name != NULL) {
            if (0 != strcmp("null", name))
                printf("Name hint: %s\n", name);
            free(name); /* the returned string must be freed, "null" included */
        }
        n++;
    }
    snd_device_name_free_hint((void**)hints);
}
prints out (on my PC, note that the “hw” name isn’t listed).
Name hint: default
Name hint: front:CARD=Intel,DEV=0
Name hint: surround40:CARD=Intel,DEV=0
Name hint: surround41:CARD=Intel,DEV=0
Name hint: surround50:CARD=Intel,DEV=0
Name hint: surround51:CARD=Intel,DEV=0
Name hint: surround71:CARD=Intel,DEV=0
Name hint: iec958:CARD=Intel,DEV=0
Name hint: hdmi:CARD=HDMI
BTW, when Pulseaudio is shut down on a machine where only “default” is listed (no ALSA cards), “default” goes away, and there’s “pulse” instead.
This page talks about ALSA tweaking.
The format of the asound.conf file is described on this page. The C library reference, mentioning the naming convention is here.
Pulseaudio jot
Download pulseaudio’s source with
git clone git://anongit.freedesktop.org/pulseaudio/pulseaudio
An interesting function is pa_alsa_source_new() in src/modules/alsa/alsa-source.c.
Sometimes, there are JavaScript snippets for the sake of obfuscation (hmmm, including this site). This is code made complicated intentionally, to prevent web spiders from harvesting addresses or emails. But hey, what if I’m the one with the spider?
The simple, and somewhat dangerous solution, is to run the JavaScript code on a local interpreter. I found Google’s V8 project most suitable for this purpose. Download the sources from Google’s SVN:
$ svn checkout http://v8.googlecode.com/svn/trunk/ v8
Following the instructions for building with GYP, change directory to v8/ and download GYP (and other stuff, I suppose)
$ make dependencies
And build for the current platform:
$ time make -j 8 native
which fails, because warnings are treated as errors (on GCC 4.4.4). So this instead:
$ time make werror=no -j 8 native
This worked, and took 2.30 minutes on my computer. The outputs go to out/native, so
$ cd out/native/
$ ./d8
V8 version 3.26.12 [console: dumb]
d8> print("Hello, world");
Hello, world
undefined
Isn’t that sweet? It just executes the command.
Note that d8 always returns the value of the last operation, which is nice when all we want is to evaluate an obfuscated expression.
d8> os.system("date");
"Sun Apr 13 18:59:24 IDT 2014
"
Ayyeee! The interpreter allows shelling out! This means that running an alien script on our machine is extremely dangerous: If a spider is calling the interpreter with scripts that it retrieves from the web, one of them could easily contain code that attempts to run commands on the host’s computer, if it detects that the environment isn’t a browser. Protective measures aren’t simple. I don’t know of any way to safely prevent the interpreter from accessing its host’s capabilities, except for applying SELinux or (the weaker option) chroot jailing. Or maybe use Linux namespaces for lightweight virtualization.
Anyhow, there are other executables created as well, for example, “shell”:
$ ./shell
V8 version 3.26.12 [sample shell]
> print("Hello, world");
Hello, world
>
More info can be found on this page. For example, it’s possible to quit the shell with the quit(0) command.
So I compiled a Linux kernel with 8 threads in parallel on my Linux desktop machine, as I always do. The CPU worked extra hard as usual, but lately its temperature began to rise, ending up at 88°C. It looks like a clock gating mechanism kicked in to save the CPU.
But hey, this never happened in the past! Asking around a bit, I was advised to check if the fan is OK. Maybe the thermal paste went dry.
Opening the case and looking, I noticed that the heatsink was full of dust. More precisely, a lot of dust was stuck between the heatsink’s grill blades, obstructing the air flow. No air flow, no cooling. So I unsnapped the fan off the heatsink, took a vacuum cleaner, and removed all dust.
And my PC is like new now! The temperature goes from 30.0°C to no more than 44.0°C when I run that kernel compilation test (watching the temperature with “watch sensors” at shell prompt).
It was that simple.
Note to self: Vacuum the CPU’s heatsink every now and then.
And here’s what it looks like after two years, during which the computer has been on continuously:

And this is with the fan taken off. One can clearly see that the layer of dust disrupts the air flow.
A minute with the vacuum cleaner, and we have
Snap the fan back in place, and the computer is ready to go!

Useful for diffing two sets of filesystems, just to see where the changes are (and maybe catch a file that was accidentally copied in)
Symbolic links and other non-regular files are ignored. If they’ve changed, there is no alarm on these.
The script (the path to the directory to be scanned is given as an argument):
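#!/bin/bash
# The directory to scan is given as the first argument
[ -d "$1" ] || exit 1
cd "$1" || exit 1
# Regular files get their SHA1 sum; everything else gets a row of dashes
# of the same width, so that all names start at the same column.
find . | while IFS= read -r i; do
  if [ -f "$i" ] && [ ! -h "$i" ]; then
    sha1sum "$i"
  else
    echo "---------------------------------------- $i"
  fi
done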
To sort the output:
$ sort -k 1.41 list-of-files.txt > sorted.txt
The -k parameter causes sort to work from character 41 and on. Otherwise it would sort according to the SHA1 sums.
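For the record, the effect of -k 1.41 can be mimicked in C with a qsort() comparator that skips the first 40 characters of each line. This is a sketch that assumes a plain byte-wise comparison (as sort does with LC_ALL=C), and the function names are made up.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_SHA1_HEX_LEN 40 /* a SHA1 sum in hex takes 40 characters */

/* The part of the line from character 41 and on, or an empty string
 * if the line is too short to have one. */
static const char *sketch_sort_key(const char *line)
{
    return strlen(line) > SKETCH_SHA1_HEX_LEN ? line + SKETCH_SHA1_HEX_LEN : "";
}

/* qsort() comparator for an array of char * lines: compares the keys
 * only, so the SHA1 sums don't influence the order. */
static int sketch_cmp_lines(const void *a, const void *b)
{
    const char *la = *(const char *const *)a;
    const char *lb = *(const char *const *)b;

    return strcmp(sketch_sort_key(la), sketch_sort_key(lb));
}
```

Calling qsort(lines, count, sizeof lines[0], sketch_cmp_lines) on an array of the script’s output lines orders them by file name, whatever their SHA1 sums are.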
Unlike how I usually treat software tools I work with, my attitude towards U-boot is “if it works, never mind how and why”. Trying to understand the gory details of U-boot has never been very rewarding. Things work or break more or less randomly, depending on which git revision is checked out. Someone sent a patch fixing this and breaking that, and then someone else sent another patch.
So this is my story: After upgrading the Linux kernel from 3.3 to 3.12 (both having a strong Xilinx flavor, as the target is a Zynq board) while keeping an old version of U-boot, I ran through the drill that usually makes the kernel boot:
zynq-uboot> fatload mmc 0 0x8000 zImage
reading zImage
2797680 bytes read
zynq-uboot> fatload mmc 0 0x1000000 devicetree.dtb
reading devicetree.dtb
5827 bytes read
zynq-uboot> go 0x8000
## Starting application at 0x00008000 ...
Error: unrecognized/unsupported machine ID (r1 = 0x1fb5662c).
Available machine support:
ID (hex) NAME
ffffffff Generic DT based system
ffffffff Xilinx Zynq Platform
Please check your kernel config and/or bootloader.
So the kernel didn’t boot. Looking at the way I attempted to kick it off, one may wonder how it worked at all with kernel v3.3. But one can’t argue with the fact that it used to boot.
The first thing to understand about this error message, is that it’s fatally misleading. The real problem is that the device tree blob isn’t found by the kernel, so it reverts to looking for a machine ID in r1. And r1 just has some value. The error message comes from a piece of boot code that is never reached, if the device tree is found and used.
Now, let’s try to understand the logic behind the sequence of commands: The first command loaded the zImage into a known place in memory, 0x8000. One could ask why I didn’t use uImage. Well, why should I? zImage works, and that’s the classic way to boot a kernel.
The device tree blob is then loaded to address 0x1000000.
And then comes the fun part: U-boot just jumps to 0x8000, the beginning of the image. I think I recall that one can put zImage anywhere in memory, and it will take it from there.
But how does the kernel know that the device tree is at 0x1000000? Beats me. I suppose it’s hardcoded somewhere. But hey, it worked! At least on older kernels. And on U-boot 2012.10 and older (but not 2013.10).
For the newer kernel (say, 3.12), a completely different spell should be cast. Something like this (using U-Boot 2012.10 or 2013.10):
zynq-uboot> fatload mmc 0 0x3000000 uImage
reading uImage
3054976 bytes read
zynq-uboot> fatload mmc 0 0x2A00000 devicetree.dtb
reading devicetree.dtb
7863 bytes read
zynq-uboot> bootm 0x3000000 - 0x2A00000
## Booting kernel from Legacy Image at 03000000 ...
Image Name: Linux-3.12.0-1.3-xilinx
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 3054912 Bytes = 2.9 MiB
Load Address: 00008000
Entry Point: 00008000
Verifying Checksum ... OK
## Flattened Device Tree blob at 02a00000
Booting using the fdt blob at 0x02a00000
Loading Kernel Image ... OK
OK
Loading Device Tree to 1fb4f000, end 1fb53eb6 ... OK
Starting kernel ...
Uncompressing Linux... done, booting the kernel.
Booting Linux on physical CPU 0x0
[ ... kernel boots, bliss and happiness ... ]
OK, so I twisted it all around. All of a sudden, I use a uImage, rather than zImage. Why? Because bootm works with uImage, and bootz wasn’t supported by that specific U-boot version (or configuration, I’m not really sure).
The addresses I loaded into are different too, and I’m not sure it matters. What surely matters, is that bootm was explicitly given the address of the device tree blob, and therefore passed that through a register to the kernel. So in this case, it’s pretty obvious how the kernel finds what’s where.
Ah, but there’s a twist. The latter, bootm-based method didn’t work on the 3.3 kernel. In fact, I got a very similar error message when I tried that.
As I said in the beginning, I never cared to dive deep into U-boot. So all I’m saying is: if you encounter an error message like the one above, just try the other magic spell, and hope for the best.
And if someone knows more about what happened in the connection between U-boot and Linux somewhere in 2012, or has a link to some mailing list discussion where this was decided upon, please comment below. :)
Since the device tree is the new way to set up hardware devices on embedded platforms, I hoped that I could avoid the “platform” API for picking which driver is going to take control over what. But it looks like the /arch/arm disaster is here to stay for a while, so I need to at least understand how it works.
So for reference, here’s an example walkthrough of the SPI driver for i.MX51, declared and matched with a hardware device.
The idea is simple: The driver, which is enabled by .config (and hence the Makefile in its directory includes it for compilation) binds itself to a string during its initialization. On the other side, initialization code requests a device matching that string, and also supplies some information along with that. The example tells the story better.
The platform API is documented in the kernel tree’s Documentation/driver-model/platform.txt. There’s also a nice LWN article by Jonathan Corbet.
So let’s assume we have Freescale’s 3-stack board at hand. In arch/arm/mach-mx5/mx51_3stack.c, at the bottom, it says
MACHINE_START(MX51_3DS, "Freescale MX51 3-Stack Board")
.fixup = fixup_mxc_board,
.map_io = mx5_map_io,
.init_irq = mx5_init_irq,
.init_machine = mxc_board_init,
.timer = &mxc_timer,
MACHINE_END
mxc_board_init() is defined in the same file, which among many other calls goes
mxc_register_device(&mxcspi1_device, &mxcspi1_data);
with the extra info structure mxcspi1_data defined as
static struct mxc_spi_master mxcspi1_data = {
.maxchipselect = 4,
.spi_version = 23,
.chipselect_active = mx51_3ds_gpio_spi_chipselect_active,
.chipselect_inactive = mx51_3ds_gpio_spi_chipselect_inactive,
};
Now to the declaration of mxcspi1_device: In arch/arm/mach-mx5/devices.c we have
struct platform_device mxcspi1_device = {
.name = "mxc_spi",
.id = 0,
.num_resources = ARRAY_SIZE(mxcspi1_resources),
.resource = mxcspi1_resources,
.dev = {
.dma_mask = &spi_dma_mask,
.coherent_dma_mask = DMA_BIT_MASK(32),
},
};
and before that, in the same file there was:
static struct resource mxcspi1_resources[] = {
{
.start = CSPI1_BASE_ADDR,
.end = CSPI1_BASE_ADDR + SZ_4K - 1,
.flags = IORESOURCE_MEM,
},
{
.start = MXC_INT_CSPI1,
.end = MXC_INT_CSPI1,
.flags = IORESOURCE_IRQ,
},
{
.start = MXC_DMA_CSPI1_TX,
.end = MXC_DMA_CSPI1_TX,
.flags = IORESOURCE_DMA,
},
};
So that defines the magic driver string and the resources that are allocated to this device.
It’s worth noting that devices.c ends with
postcore_initcall(mxc_init_devices);
which causes a call to mxc_init_devices(), a function that messes up the addresses of the resources for some architectures. Just to add some confusion. Always watch out for those little traps!
Meanwhile, in drivers/spi/mxc_spi.c
static struct platform_driver mxc_spi_driver = {
.driver = {
.name = "mxc_spi",
.owner = THIS_MODULE,
},
.probe = mxc_spi_probe,
.remove = mxc_spi_remove,
.suspend = mxc_spi_suspend,
.resume = mxc_spi_resume,
};
followed by:
static int __init mxc_spi_init(void)
{
pr_debug("Registering the SPI Controller Driver\n");
return platform_driver_register(&mxc_spi_driver);
}
static void __exit mxc_spi_exit(void)
{
pr_debug("Unregistering the SPI Controller Driver\n");
platform_driver_unregister(&mxc_spi_driver);
}
subsys_initcall(mxc_spi_init);
module_exit(mxc_spi_exit);
So this is how the driver tells Linux that it’s responsible for devices marked with the “mxc_spi” string.
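In this example, the matching boils down to a string comparison between the device’s name and the driver’s name. The following user-space model sketches that idea. It is not kernel code, every sketch_* name is invented for the illustration, and the real platform bus does a lot more than this.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for struct platform_device / platform_driver. */
struct sketch_device {
    const char *name; /* e.g. "mxc_spi", set by the board's init code */
    int id;
};

struct sketch_driver {
    const char *name; /* the string the driver binds itself to */
    int (*probe)(const struct sketch_device *dev);
};

/* A device and a driver belong together if their names are equal. */
static int sketch_match(const struct sketch_device *dev,
                        const struct sketch_driver *drv)
{
    return strcmp(dev->name, drv->name) == 0;
}

/* Call the driver's probe function only if the names match.
 * Returns -1 when there is no match, otherwise probe's return value. */
static int sketch_bind(const struct sketch_device *dev,
                       const struct sketch_driver *drv)
{
    if (!sketch_match(dev, drv))
        return -1;
    return drv->probe(dev);
}

/* Dummy probe function: always reports success. */
static int sketch_probe(const struct sketch_device *dev)
{
    (void)dev;
    return 0;
}
```

A device declared with the name “mxc_spi” gets sketch_probe() called on it, and a device with any other name is left alone.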
As for some interaction with the device data (also in mxc_spi.c), there’s stuff like
mxc_platform_info = (struct mxc_spi_master *)pdev->dev.platform_data;
and
master_drv_data->res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
going on with
if (!request_mem_region(master_drv_data->res->start,
master_drv_data->res->end -
master_drv_data->res->start + 1, pdev->name)) { /* Ayee! */ }
and
if (pdev->dev.dma_mask == NULL) { /* No DMA for you! */ }
and it goes on…
This is my short saga about my not necessarily intelligent actions for reading a DocBook paper.
So I wanted to read some documentation from my Linux kernel sources. It happened to be in DocBook format.
In the kernel source’s root, I tried
$ make htmldocs
or I could have gone
$ make pdfdocs
Or mandocs. Or sgmldocs. Or psdocs.
But that builds only the DocBook templates in Documentation/DocBook/. I need those in Documentation/sound/alsa/DocBook!
I tried to copy the Makefile to the target directory, and change the assignment of DOCBOOKS to the files I wanted handled. But that didn’t work, because the Makefile couldn’t find the scripts/ subdirectory. Or as “make” put it:
make: *** No rule to make target `/scripts/kernel-doc', needed by `/alsa-driver-api.tmpl'. Stop.
OK, this was a bit too much. How about just going…
$ docbook2html writing-an-alsa-driver.tmpl
Works, creates a lot of scattered HTML files. Not so easy to read. Can’t I get it in a single document?
$ docbook2rtf writing-an-alsa-driver.tmpl
Huh? WTF? RTF? Yes, it’s apparently still alive. And hey, one can easily export it to PDF with OpenOffice!
This probably isn’t exactly the way the kernel hackers meant it to be done. On the other hand, it’s by far too much effort for just reading a document by and for the community…
I’m sure someone knows about the obvious way I missed. Comments below, please…
What this blob is all about
Running some home-cooked SDMA scripts on Freescale’s Linux 2.6.28 kernel on an i.MX25 processor, I’m puzzled by the fact that cache flushing with dma_map_single(…, DMA_TO_DEVICE) doesn’t hurt, but nothing breaks if the calls are removed either. On the other hand, attempting to remove cache invalidation calls, as in dma_map_single(…, DMA_FROM_DEVICE), does cause data corruption, as one would expect.
The de-facto lack of need for cache flushing could be explained by the small size of the cache: The sequence of events is typically preparing the data in the buffer, then some stuff in the middle, and only then is the SDMA script kicked off. If the cache lines are evicted naturally as a result of that “some stuff” activity, one gets away with not flushing the cache explicitly.
I’m by no means saying that cache flushing shouldn’t be done. On the contrary, I’m surprised that things don’t break when it’s removed.
So why doesn’t one get away with not invalidating the cache? In my tests, I saw 32-byte segments going wrong when I dropped the invalidation. That is, some segments, typically after a handful of successful data transactions of less than 1 kB of data.
Why does dropping the invalidation break things, and dropping the flushing doesn’t? As I said above, I’m still puzzled by this.
So I went down to the details of what these calls to dma_map_single() do. Spoiler: I didn’t find an explanation. At the end of the food chain, there are several MCR assembly instructions, as one should expect. Both flushing and invalidation apparently do something useful.
The rest of this post is the dissection of Linux’ kernel code in this respect.
The gory details
DMA mappings and sync functions practically wrap the dma_cache_maint() function, e.g. in arch/arm/include/asm/dma-mapping.h:
static inline dma_addr_t dma_map_single(struct device *dev, void *cpu_addr,
size_t size, enum dma_data_direction dir)
{
BUG_ON(!valid_dma_direction(dir));
if (!arch_is_coherent())
dma_cache_maint(cpu_addr, size, dir);
return virt_to_dma(dev, cpu_addr);
}
It was verified with disassembly that dma_map_single() was implemented with a call to dma_cache_maint().
This function can be found in arch/arm/mm/dma-mapping.c as follows
/*
* Make an area consistent for devices.
* Note: Drivers should NOT use this function directly, as it will break
* platforms with CONFIG_DMABOUNCE.
* Use the driver DMA support - see dma-mapping.h (dma_sync_*)
*/
void dma_cache_maint(const void *start, size_t size, int direction)
{
const void *end = start + size;
BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(end - 1));
switch (direction) {
case DMA_FROM_DEVICE: /* invalidate only */
dmac_inv_range(start, end);
outer_inv_range(__pa(start), __pa(end));
break;
case DMA_TO_DEVICE: /* writeback only */
dmac_clean_range(start, end);
outer_clean_range(__pa(start), __pa(end));
break;
case DMA_BIDIRECTIONAL: /* writeback and invalidate */
dmac_flush_range(start, end);
outer_flush_range(__pa(start), __pa(end));
break;
default:
BUG();
}
}
EXPORT_SYMBOL(dma_cache_maint);
The outer_* calls are defined as null functions in arch/arm/include/asm/cacheflush.h, since the CONFIG_OUTER_CACHE kernel configuration flag isn’t set.
The dmac_* macros are defined in arch/arm/include/asm/cacheflush.h as follows:
#define dmac_inv_range __glue(_CACHE,_dma_inv_range)
#define dmac_clean_range __glue(_CACHE,_dma_clean_range)
#define dmac_flush_range __glue(_CACHE,_dma_flush_range)
where __glue() simply glues the two strings together (see arch/arm/include/asm/glue.h) and _CACHE equals “arm926” for the i.MX25, so e.g. dmac_clean_range becomes arm926_dma_clean_range.
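For the pasting to pick up the value of _CACHE rather than the literal token, the macro needs two levels. Here is a stand-alone sketch of the same idea, with made-up SKETCH_* names and a dummy C function standing in for the real assembly routine. It is not a copy of glue.h.

```c
/* Inner level: pastes its two arguments as they are. */
#define SKETCH_GLUE2(name, fn) name##fn
/* Outer level: its arguments are macro-expanded before being handed
 * to the inner level, so SKETCH_CACHE turns into arm926 first. */
#define SKETCH_GLUE(name, fn) SKETCH_GLUE2(name, fn)

#define SKETCH_CACHE arm926 /* stands in for the kernel's _CACHE */

/* Dummy stand-in for the assembly routine of the same name. */
static int arm926_dma_clean_range(void)
{
    return 926;
}

/* Expands to arm926_dma_clean_range */
#define sketch_dmac_clean_range SKETCH_GLUE(SKETCH_CACHE, _dma_clean_range)
```

With a single level, the result would have been the nonexistent SKETCH_CACHE_dma_clean_range, because the ## operator prevents the expansion of its operands.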
These actual functions are implemented in assembler in arch/arm/mm/proc-arm926.S:
/*
* dma_inv_range(start, end)
*
* Invalidate (discard) the specified virtual address range.
* May not write back any entries. If 'start' or 'end'
* are not cache line aligned, those lines must be written
* back.
*
* - start - virtual start address
* - end - virtual end address
*
* (same as v4wb)
*/
ENTRY(arm926_dma_inv_range)
#ifndef CONFIG_CPU_DCACHE_WRITETHROUGH
tst r0, #CACHE_DLINESIZE - 1
mcrne p15, 0, r0, c7, c10, 1 @ clean D entry
tst r1, #CACHE_DLINESIZE - 1
mcrne p15, 0, r1, c7, c10, 1 @ clean D entry
#endif
bic r0, r0, #CACHE_DLINESIZE - 1
1: mcr p15, 0, r0, c7, c6, 1 @ invalidate D entry
add r0, r0, #CACHE_DLINESIZE
cmp r0, r1
blo 1b
mcr p15, 0, r0, c7, c10, 4 @ drain WB
mov pc, lr
/*
* dma_clean_range(start, end)
*
* Clean the specified virtual address range.
*
* - start - virtual start address
* - end - virtual end address
*
* (same as v4wb)
*/
ENTRY(arm926_dma_clean_range)
#ifndef CONFIG_CPU_DCACHE_WRITETHROUGH
bic r0, r0, #CACHE_DLINESIZE - 1
1: mcr p15, 0, r0, c7, c10, 1 @ clean D entry
add r0, r0, #CACHE_DLINESIZE
cmp r0, r1
blo 1b
#endif
mcr p15, 0, r0, c7, c10, 4 @ drain WB
mov pc, lr
/*
* dma_flush_range(start, end)
*
* Clean and invalidate the specified virtual address range.
*
* - start - virtual start address
* - end - virtual end address
*/
ENTRY(arm926_dma_flush_range)
bic r0, r0, #CACHE_DLINESIZE - 1
1:
#ifndef CONFIG_CPU_DCACHE_WRITETHROUGH
mcr p15, 0, r0, c7, c14, 1 @ clean+invalidate D entry
#else
mcr p15, 0, r0, c7, c6, 1 @ invalidate D entry
#endif
add r0, r0, #CACHE_DLINESIZE
cmp r0, r1
blo 1b
mcr p15, 0, r0, c7, c10, 4 @ drain WB
mov pc, lr
The CONFIG_CPU_DCACHE_WRITETHROUGH kernel configuration flag is not set, so there are no shortcuts.
Exactly the same snippet, only disassembled from the object file (using objdump -d):
000004d4 <arm926_dma_inv_range>:
4d4: e310001f tst r0, #31
4d8: 1e070f3a mcrne 15, 0, r0, cr7, cr10, {1}
4dc: e311001f tst r1, #31
4e0: 1e071f3a mcrne 15, 0, r1, cr7, cr10, {1}
4e4: e3c0001f bic r0, r0, #31
4e8: ee070f36 mcr 15, 0, r0, cr7, cr6, {1}
4ec: e2800020 add r0, r0, #32
4f0: e1500001 cmp r0, r1
4f4: 3afffffb bcc 4e8 <arm926_dma_inv_range+0x14>
4f8: ee070f9a mcr 15, 0, r0, cr7, cr10, {4}
4fc: e1a0f00e mov pc, lr
00000500 <arm926_dma_clean_range>:
500: e3c0001f bic r0, r0, #31
504: ee070f3a mcr 15, 0, r0, cr7, cr10, {1}
508: e2800020 add r0, r0, #32
50c: e1500001 cmp r0, r1
510: 3afffffb bcc 504 <arm926_dma_clean_range+0x4>
514: ee070f9a mcr 15, 0, r0, cr7, cr10, {4}
518: e1a0f00e mov pc, lr
0000051c <arm926_dma_flush_range>:
51c: e3c0001f bic r0, r0, #31
520: ee070f3e mcr 15, 0, r0, cr7, cr14, {1}
524: e2800020 add r0, r0, #32
528: e1500001 cmp r0, r1
52c: 3afffffb bcc 520 <arm926_dma_flush_range+0x4>
530: ee070f9a mcr 15, 0, r0, cr7, cr10, {4}
534: e1a0f00e mov pc, lr
So there’s actually little to learn from the disassembly. Or at all…
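What the assembly loops do with the address range can still be modelled in C. The sketch below assumes a 32-byte data cache line (the #31 and #32 constants in the disassembly), and its function names are made up. It counts how many cache lines a range touches, and tells whether the range starts or ends in the middle of a line, which is the case where arm926_dma_inv_range cleans the edge lines before invalidating.

```c
#include <stdint.h>

#define SKETCH_DLINESIZE 32u /* assumed D-cache line size in bytes */

/* Number of cache lines visited for the range [start, end). */
static unsigned sketch_lines_touched(uintptr_t start, uintptr_t end)
{
    unsigned n = 0;
    /* bic r0, r0, #31: round the start address down to a line boundary */
    uintptr_t addr = start & ~(uintptr_t)(SKETCH_DLINESIZE - 1);

    do {                  /* the loop body always runs at least once */
        n++;              /* one mcr per cache line */
        addr += SKETCH_DLINESIZE;
    } while (addr < end); /* cmp r0, r1 followed by blo (bcc) */

    return n;
}

/* Nonzero if either edge of the range falls inside a cache line
 * (the tst / mcrne pairs at the beginning of arm926_dma_inv_range). */
static int sketch_has_partial_line(uintptr_t start, uintptr_t end)
{
    return (start & (SKETCH_DLINESIZE - 1)) != 0 ||
           (end & (SKETCH_DLINESIZE - 1)) != 0;
}
```

A 32-byte line is also the size of the corrupted segments mentioned above, which is consistent with whole cache lines going stale when the invalidation is dropped.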
This is a simple and quick solution for those of us who want to run certain programs as a different user on the same desktop, for example running several user profiles of a browser at the same time. The main problem is usually that Pulseaudio doesn’t accept connections from a user other than the one logged in on the desktop.
It’s often suggested to go for a system mode Pulseaudio daemon, but judging from the developer’s own comments on this, and the friendly messages left in the system’s log when doing this, like
Jan 18 16:35:33 ocho pulseaudio[11158]: main.c: OK, so you are running PA in system mode. Please note that you most likely shouldn't be doing that.
Jan 18 16:35:33 ocho pulseaudio[11158]: main.c: If you do it nonetheless then it's your own fault if things don't work as expected.
Jan 18 16:35:33 ocho pulseaudio[11158]: main.c: Please read http://pulseaudio.org/wiki/WhatIsWrongWithSystemMode for an explanation why system mode is usually a bad idea.
Jan 18 16:35:33 ocho pulseaudio[11158]: module.c: module-hal-detect is deprecated: Please use module-udev-detect instead of module-hal-detect!
Jan 18 16:35:33 ocho pulseaudio[11158]: module-hal-detect-compat.c: We will now load module-udev-detect. Please make sure to remove module-hal-detect from your configuration
it’s probably not such a good idea. Plus, in my case, the sound card wasn’t detected in system-wide mode, probably because of some configuration issue, which I didn’t care much about working on. The bottom line is that the software’s authors don’t really want this to work.
Opening a TCP socket instead
The simple solution is given on this forum thread. This works well when there’s a specific user always logged on, and programs belonging to other dummy users are always run for specific purposes.
The idea behind this trick is to open a TCP port for native Pulseaudio communication, only it doesn’t require authentication, as long as the connection comes from 127.0.0.1, i.e. from the host itself. This opens the audio interface to any program running on the computer, including recording from the microphone. This makes no significant difference security-wise if the computer is accessed by a single user anyhow (possible spyware is likely to run with the logged in user ID anyhow, which has full access to audio either way).
This solution works on Fedora Core 12, but it’s probably the way to do it on any distribution released since 2009 or so.
Edit: It has been suggested in the comments below to use a UNIX socket instead of TCP. Haven’t tried it, but it seems like a better solution.
To do as the desktop’s user
So let’s get to the hands-on: First, copy /etc/pulse/default.pa into a file with the same name in the .pulse directory, that is
cp /etc/pulse/default.pa ~/.pulse/
And then edit the file, adding the following line at the end:
load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1
At this point, restart the pulseaudio daemon,
$ pulseaudio -k
$ pulseaudio -D
To do as the “fake” user
Now switch to the second user, and create a file named client.conf under that user’s .pulse subdirectory
$ echo "default-server = 127.0.0.1" > ~/.pulse/client.conf
Note that default.pa and client.conf are in completely different directories, each belonging to a different user!
Surprisingly enough, that’s it. Any program running as the second user now has sound access.