“Unsupported machine ID” after upgrading Linux kernel or U-boot

Unlike how I usually treat software tools I work with, my attitude towards U-boot is “if it works, never mind how and why”. Trying to understand the gory details of U-boot has never been very rewarding. Things work or break more or less randomly, depending on which git revision is checked out. Someone sent a patch fixing this and breaking that, and then someone else sent another patch.

So this is my story: After upgrading the Linux kernel from 3.3 to 3.12 (both having a strong Xilinx flavor, as the target is a Zynq board), while keeping an old version of U-boot, I ran through the drill that usually makes the kernel boot:

zynq-uboot> fatload mmc 0 0x8000 zImage
reading zImage

2797680 bytes read
zynq-uboot> fatload mmc 0 0x1000000 devicetree.dtb
reading devicetree.dtb

5827 bytes read
zynq-uboot> go 0x8000
## Starting application at 0x00008000 ...

Error: unrecognized/unsupported machine ID (r1 = 0x1fb5662c).

Available machine support:

ID (hex)        NAME
ffffffff        Generic DT based system
ffffffff        Xilinx Zynq Platform

Please check your kernel config and/or bootloader.

So the kernel didn’t boot. Looking at the way I attempted to kick it off, one may wonder how it worked at all with kernel v3.3. But one can’t argue with the fact that it used to boot.

The first thing to understand about this error message is that it's fatally misleading. The real problem is that the kernel doesn't find the device tree blob, so it falls back on looking for a machine ID in r1, which just happens to hold some arbitrary value. The error message comes from a piece of boot code that is never reached if the device tree is found and used.

Now, let’s try to understand the logic behind the sequence of commands: The first command loaded the zImage into a known place in memory, 0x8000. One could ask why I didn’t use uImage. Well, why should I? zImage works, and that’s the classic way to boot a kernel.

The device tree blob is then loaded to address 0x1000000.

And then comes the fun part: U-boot just jumps to 0x8000, the beginning of the image. I think I recall that one can put zImage anywhere in memory, and it will take it from there.

But how does the kernel know that the device tree is at 0x1000000? Beats me. I suppose it's hardcoded somewhere. But hey, it worked! At least on older kernels. And on U-boot 2012.10 and older (but not 2013.10).

For the newer kernel (say, 3.12), a completely different spell should be cast. Something like this (using U-Boot 2012.10 or 2013.10):

zynq-uboot> fatload mmc 0 0x3000000 uImage
reading uImage                                                                  

3054976 bytes read
zynq-uboot> fatload mmc 0 0x2A00000 devicetree.dtb
reading devicetree.dtb                                                          

7863 bytes read
zynq-uboot> bootm 0x3000000 - 0x2A00000
## Booting kernel from Legacy Image at 03000000 ...
   Image Name:   Linux-3.12.0-1.3-xilinx
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    3054912 Bytes = 2.9 MiB
   Load Address: 00008000
   Entry Point:  00008000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 02a00000
   Booting using the fdt blob at 0x02a00000
   Loading Kernel Image ... OK
OK
   Loading Device Tree to 1fb4f000, end 1fb53eb6 ... OK                         

Starting kernel ...                                                             

Uncompressing Linux... done, booting the kernel.
Booting Linux on physical CPU 0x0
[ ... kernel boots, bliss and happiness ... ]

OK, so I twisted it all around. All of a sudden, I use uImage rather than zImage. Why? Because bootm works with uImage, and bootz wasn't supported by that specific U-boot version (or configuration, I'm not really sure).

The addresses I loaded into are different too, and I'm not sure that matters. What surely matters is that bootm was explicitly given the address of the device tree blob, and therefore passed it on to the kernel in a register (r2, per the ARM boot convention). So in this case, it's pretty obvious how the kernel finds what's where.

Ah, but there’s a twist. The latter, bootm-based method didn’t work on the 3.3 kernel. In fact, I got a very similar error message when I tried that.

As I said in the beginning, I never cared to dive deep into U-boot. So all I'm saying is: if you encounter an error message like the one above, just try the other magic spell, and hope for the best.

And if someone knows more about what happened in the interface between U-boot and Linux somewhere in 2012, or has a link to some mailing list discussion where this was decided upon, please comment below. :)

Linux kernel platform device food chain example

Since the device tree is the new way to set up hardware devices on embedded platforms, I hoped that I could avoid the “platform” API for picking which driver is going to take control over what. But it looks like the /arch/arm disaster is here to stay for a while, so I need to at least understand how it works.

So for reference, here’s an example walkthrough of the SPI driver for i.MX51, declared and matched with a hardware device.

The idea is simple: The driver, which is enabled by .config (and hence included for compilation by the Makefile in its directory), binds itself to a string during its initialization. On the other side, the board initialization code registers a device matching that string, and also supplies some information along with it. The example tells the story better.

The platform API is documented in the kernel tree’s Documentation/driver-model/platform.txt. There’s also a nice LWN article by Jonathan Corbet.

So let’s assume we have Freescale’s 3-stack board at hand. In arch/arm/mach-mx5/mx51_3stack.c, at the bottom, it says

MACHINE_START(MX51_3DS, "Freescale MX51 3-Stack Board")
 .fixup = fixup_mxc_board,
 .map_io = mx5_map_io,
 .init_irq = mx5_init_irq,
 .init_machine = mxc_board_init,
 .timer = &mxc_timer,
MACHINE_END

mxc_board_init() is defined in the same file; among many other calls, it goes

mxc_register_device(&mxcspi1_device, &mxcspi1_data);

with the extra info structure mxcspi1_data defined as

static struct mxc_spi_master mxcspi1_data = {
 .maxchipselect = 4,
 .spi_version = 23,
 .chipselect_active = mx51_3ds_gpio_spi_chipselect_active,
 .chipselect_inactive = mx51_3ds_gpio_spi_chipselect_inactive,
};

Now to the declaration of mxcspi1_device: In arch/arm/mach-mx5/devices.c we have

struct platform_device mxcspi1_device = {
	.name = "mxc_spi",
	.id = 0,
	.num_resources = ARRAY_SIZE(mxcspi1_resources),
	.resource = mxcspi1_resources,
	.dev = {
		.dma_mask = &spi_dma_mask,
		.coherent_dma_mask = DMA_BIT_MASK(32),
	},
};

and before that, in the same file there was:

static struct resource mxcspi1_resources[] = {
	{
		.start = CSPI1_BASE_ADDR,
		.end = CSPI1_BASE_ADDR + SZ_4K - 1,
		.flags = IORESOURCE_MEM,
	},
	{
		.start = MXC_INT_CSPI1,
		.end = MXC_INT_CSPI1,
		.flags = IORESOURCE_IRQ,
	},
	{
		.start = MXC_DMA_CSPI1_TX,
		.end = MXC_DMA_CSPI1_TX,
		.flags = IORESOURCE_DMA,
	},
};

So that defines the magic driver string and the resources that are allocated to this device.

It’s worth noting that devices.c ends with

postcore_initcall(mxc_init_devices);

which causes a call to mxc_init_devices(), a function that messes up the addresses of the resources for some architectures. Just to add some confusion. Always watch out for those little traps!

Meanwhile, in drivers/spi/mxc_spi.c

static struct platform_driver mxc_spi_driver = {
	.driver = {
		   .name = "mxc_spi",
		   .owner = THIS_MODULE,
		   },
	.probe = mxc_spi_probe,
	.remove = mxc_spi_remove,
	.suspend = mxc_spi_suspend,
	.resume = mxc_spi_resume,
};

followed by:

static int __init mxc_spi_init(void)
{
	pr_debug("Registering the SPI Controller Driver\n");
	return platform_driver_register(&mxc_spi_driver);
}

static void __exit mxc_spi_exit(void)
{
	pr_debug("Unregistering the SPI Controller Driver\n");
	platform_driver_unregister(&mxc_spi_driver);
}

subsys_initcall(mxc_spi_init);
module_exit(mxc_spi_exit);

So this is how the driver tells Linux that it’s responsible for devices marked with the “mxc_spi” string.

As for some interaction with the device data (also in mxc_spi.c), there’s stuff like

mxc_platform_info = (struct mxc_spi_master *)pdev->dev.platform_data;

and

master_drv_data->res = platform_get_resource(pdev, IORESOURCE_MEM, 0);

going on with

if (!request_mem_region(master_drv_data->res->start,
			master_drv_data->res->end -
			master_drv_data->res->start + 1, pdev->name)) { /* Ayee! */ }

and

if (pdev->dev.dma_mask == NULL) { /* No DMA for you! */ }

and it goes on…
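
For what it's worth, platform_get_resource() itself is little more than a scan over that resource table: it returns the num-th resource whose flags match the requested type. Here's a simplified userspace sketch (the toy_* names and the example addresses are mine; the kernel's version compares the type bits a bit more carefully):

```c
#include <stddef.h>

/* Flag values as in the kernel's include/linux/ioport.h */
#define IORESOURCE_MEM 0x00000200
#define IORESOURCE_IRQ 0x00000400

struct toy_resource {
	unsigned long start, end, flags;
};

/* Return the num-th resource of the given type from the table, or NULL.
   This models what platform_get_resource(pdev, type, num) does with
   pdev->resource and pdev->num_resources. */
static struct toy_resource *toy_get_resource(struct toy_resource *res,
					     int count, unsigned long type,
					     unsigned int num)
{
	for (int i = 0; i < count; i++)
		if ((res[i].flags & type) && num-- == 0)
			return &res[i];
	return NULL;
}
```

So master_drv_data->res = platform_get_resource(pdev, IORESOURCE_MEM, 0) simply picks the first IORESOURCE_MEM entry of the mxcspi1_resources table shown above.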

Reading the DocBook files in Linux kernel’s documentation

This is my short saga about my not necessarily intelligent actions for reading a DocBook paper.

So I wanted to read some documentation from my Linux kernel sources. It happened to be in DocBook format.

In the kernel source’s root, I tried

$ make htmldocs

or I could have gone

$ make pdfdocs

Or mandocs. Or sgmldocs. Or psdocs.

But that builds only the DocBook templates in Documentation/DocBook/. I need those in Documentation/sound/alsa/DocBook!

I tried to copy the Makefile to the target directory, and change the assignment of DOCBOOKS to the files I wanted handled. But that didn’t work, because the Makefile couldn’t find the scripts/ subdirectory. Or as “make” put it:

make: *** No rule to make target `/scripts/kernel-doc', needed by `/alsa-driver-api.tmpl'.  Stop.

OK, this was a bit too much. How about just going…

$ docbook2html writing-an-alsa-driver.tmpl

Works, creates a lot of scattered HTML files. Not so easy to read. Can’t I get it in a single document?

$ docbook2rtf writing-an-alsa-driver.tmpl

Huh? WTF? RTF? Yes, it’s apparently still alive. And hey, one can easily export it to PDF with OpenOffice!

This probably isn’t exactly the way the kernel hackers meant it to be done. On the other hand, it’s far too much effort just for reading a document by and for the community…

I’m sure someone knows about the obvious way I missed. Comments below, please…

 

Cache coherency on i.MX25 running Linux

What this post is all about

Running some home-cooked SDMA scripts on Freescale’s Linux 2.6.28 kernel on an i.MX25 processor, I’m puzzled by the fact that cache flushing with dma_map_single(…, DMA_TO_DEVICE) doesn’t hurt, yet nothing breaks if the calls are removed. On the other hand, attempting to remove the cache invalidation calls, as in dma_map_single(…, DMA_FROM_DEVICE), does cause data corruption, as one would expect.

The de-facto lack of need for cache flushing could be explained by the small size of the cache: The sequence of events is typically to prepare the data in the buffer, then do some stuff in the middle, and only then kick off the SDMA script. If the dirty cache lines are evicted naturally as a result of that “some stuff” activity, one gets away with not flushing the cache explicitly.

I’m by no means saying that cache flushing shouldn’t be done. On the contrary, I’m surprised that things don’t break when it’s removed.

So why doesn’t one get away with skipping the cache invalidation? In my tests, when I dropped the invalidation, I saw 32-byte segments going wrong: some segments, typically after a handful of successful data transactions of less than 1 kB each.

Why does dropping the invalidation break things, while dropping the flushing doesn’t? As I said above, I’m still puzzled by this.

So I went down to the details of what these calls to dma_map_single() do. Spoiler: I didn’t find an explanation. At the end of the food chain there are several MCR assembly instructions, as one would expect, and both flushing and invalidation apparently do something useful.

The rest of this post is a dissection of the Linux kernel’s code in this respect.

The gory details

DMA mappings and sync functions practically wrap the dma_cache_maint() function, e.g. in arch/arm/include/asm/dma-mapping.h:

static inline dma_addr_t dma_map_single(struct device *dev, void *cpu_addr,
		size_t size, enum dma_data_direction dir)
{
	BUG_ON(!valid_dma_direction(dir));

	if (!arch_is_coherent())
		dma_cache_maint(cpu_addr, size, dir);

	return virt_to_dma(dev, cpu_addr);
}

It was verified with disassembly that dma_map_single() was implemented with a call to dma_cache_maint().

This function can be found in arch/arm/mm/dma-mapping.c as follows

/*
 * Make an area consistent for devices.
 * Note: Drivers should NOT use this function directly, as it will break
 * platforms with CONFIG_DMABOUNCE.
 * Use the driver DMA support - see dma-mapping.h (dma_sync_*)
 */
void dma_cache_maint(const void *start, size_t size, int direction)
{
	const void *end = start + size;

	BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(end - 1));

	switch (direction) {
	case DMA_FROM_DEVICE:		/* invalidate only */
		dmac_inv_range(start, end);
		outer_inv_range(__pa(start), __pa(end));
		break;
	case DMA_TO_DEVICE:		/* writeback only */
		dmac_clean_range(start, end);
		outer_clean_range(__pa(start), __pa(end));
		break;
	case DMA_BIDIRECTIONAL:		/* writeback and invalidate */
		dmac_flush_range(start, end);
		outer_flush_range(__pa(start), __pa(end));
		break;
	default:
		BUG();
	}
}
EXPORT_SYMBOL(dma_cache_maint);

The outer_* calls are defined as null functions in arch/arm/include/asm/cacheflush.h, since the CONFIG_OUTER_CACHE kernel configuration flag isn’t set.

The dmac_* macros are defined in arch/arm/include/asm/cacheflush.h as follows:

#define dmac_inv_range			__glue(_CACHE,_dma_inv_range)
#define dmac_clean_range		__glue(_CACHE,_dma_clean_range)
#define dmac_flush_range		__glue(_CACHE,_dma_flush_range)

where __glue() simply glues the two strings together (see arch/arm/include/asm/glue.h) and _CACHE equals “arm926” for the i.MX25, so e.g. dmac_clean_range becomes arm926_dma_clean_range.

These actual functions are implemented in assembler in arch/arm/mm/proc-arm926.S:

/*
 *	dma_inv_range(start, end)
 *
 *	Invalidate (discard) the specified virtual address range.
 *	May not write back any entries.  If 'start' or 'end'
 *	are not cache line aligned, those lines must be written
 *	back.
 *
 *	- start	- virtual start address
 *	- end	- virtual end address
 *
 * (same as v4wb)
 */
ENTRY(arm926_dma_inv_range)
#ifndef CONFIG_CPU_DCACHE_WRITETHROUGH
	tst	r0, #CACHE_DLINESIZE - 1
	mcrne	p15, 0, r0, c7, c10, 1		@ clean D entry
	tst	r1, #CACHE_DLINESIZE - 1
	mcrne	p15, 0, r1, c7, c10, 1		@ clean D entry
#endif
	bic	r0, r0, #CACHE_DLINESIZE - 1
1:	mcr	p15, 0, r0, c7, c6, 1		@ invalidate D entry
	add	r0, r0, #CACHE_DLINESIZE
	cmp	r0, r1
	blo	1b
	mcr	p15, 0, r0, c7, c10, 4		@ drain WB
	mov	pc, lr

/*
 *	dma_clean_range(start, end)
 *
 *	Clean the specified virtual address range.
 *
 *	- start	- virtual start address
 *	- end	- virtual end address
 *
 * (same as v4wb)
 */
ENTRY(arm926_dma_clean_range)
#ifndef CONFIG_CPU_DCACHE_WRITETHROUGH
	bic	r0, r0, #CACHE_DLINESIZE - 1
1:	mcr	p15, 0, r0, c7, c10, 1		@ clean D entry
	add	r0, r0, #CACHE_DLINESIZE
	cmp	r0, r1
	blo	1b
#endif
	mcr	p15, 0, r0, c7, c10, 4		@ drain WB
	mov	pc, lr

/*
 *	dma_flush_range(start, end)
 *
 *	Clean and invalidate the specified virtual address range.
 *
 *	- start	- virtual start address
 *	- end	- virtual end address
 */
ENTRY(arm926_dma_flush_range)
	bic	r0, r0, #CACHE_DLINESIZE - 1
1:
#ifndef CONFIG_CPU_DCACHE_WRITETHROUGH
	mcr	p15, 0, r0, c7, c14, 1		@ clean+invalidate D entry
#else
	mcr	p15, 0, r0, c7, c6, 1		@ invalidate D entry
#endif
	add	r0, r0, #CACHE_DLINESIZE
	cmp	r0, r1
	blo	1b
	mcr	p15, 0, r0, c7, c10, 4		@ drain WB
	mov	pc, lr

The CONFIG_CPU_DCACHE_WRITETHROUGH kernel configuration flag is not set, so there are no shortcuts.
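
In C terms, the address walk of arm926_dma_inv_range's loop boils down to the following (a userspace model of the address arithmetic only; the MCR cache operations themselves obviously can't be reproduced here):

```c
#define CACHE_DLINESIZE 32

/* Count how many D-cache lines the loop in arm926_dma_inv_range touches:
   round the start address down to a line boundary (the "bic" instruction),
   then step one line at a time while below "end" (the "blo" branch). */
static int lines_touched(unsigned long start, unsigned long end)
{
	int n = 0;

	for (unsigned long p = start & ~(unsigned long)(CACHE_DLINESIZE - 1);
	     p < end; p += CACHE_DLINESIZE)
		n++;	/* one "invalidate D entry" MCR per iteration */

	return n;
}
```

This also makes it clear why the corruption shows up in 32-byte segments: the cache operates on whole lines, so any part of a line can go stale only together with the other 31 bytes.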

Exactly the same snippet, only disassembled from the object file (using objdump -d):

000004d4 <arm926_dma_inv_range>:
 4d4:	e310001f 	tst	r0, #31
 4d8:	1e070f3a 	mcrne	15, 0, r0, cr7, cr10, {1}
 4dc:	e311001f 	tst	r1, #31
 4e0:	1e071f3a 	mcrne	15, 0, r1, cr7, cr10, {1}
 4e4:	e3c0001f 	bic	r0, r0, #31
 4e8:	ee070f36 	mcr	15, 0, r0, cr7, cr6, {1}
 4ec:	e2800020 	add	r0, r0, #32
 4f0:	e1500001 	cmp	r0, r1
 4f4:	3afffffb 	bcc	4e8 <arm926_dma_inv_range+0x14>
 4f8:	ee070f9a 	mcr	15, 0, r0, cr7, cr10, {4}
 4fc:	e1a0f00e 	mov	pc, lr

00000500 <arm926_dma_clean_range>:
 500:	e3c0001f 	bic	r0, r0, #31
 504:	ee070f3a 	mcr	15, 0, r0, cr7, cr10, {1}
 508:	e2800020 	add	r0, r0, #32
 50c:	e1500001 	cmp	r0, r1
 510:	3afffffb 	bcc	504 <arm926_dma_clean_range+0x4>
 514:	ee070f9a 	mcr	15, 0, r0, cr7, cr10, {4}
 518:	e1a0f00e 	mov	pc, lr

0000051c <arm926_dma_flush_range>:
 51c:	e3c0001f 	bic	r0, r0, #31
 520:	ee070f3e 	mcr	15, 0, r0, cr7, cr14, {1}
 524:	e2800020 	add	r0, r0, #32
 528:	e1500001 	cmp	r0, r1
 52c:	3afffffb 	bcc	520 <arm926_dma_flush_range+0x4>
 530:	ee070f9a 	mcr	15, 0, r0, cr7, cr10, {4}
 534:	e1a0f00e 	mov	pc, lr

So there’s actually little to learn from the disassembly. Or at all…

Pulseaudio for multiple users, without system-mode daemon

This is a simple and quick solution for those of us who want to run certain programs as a different user on the same desktop, for example running several user profiles of a browser at the same time. The main problem is usually that Pulseaudio doesn’t accept connections from a user other than the one logged in on the desktop.

It’s often suggested to go for a system-mode Pulseaudio daemon, but judging from the developer’s own comments on this, and the friendly messages left in the system log, like

Jan 18 16:35:33 ocho pulseaudio[11158]: main.c: OK, so you are running PA in system mode. Please note that you most likely shouldn't be doing that.
Jan 18 16:35:33 ocho pulseaudio[11158]: main.c: If you do it nonetheless then it's your own fault if things don't work as expected.
Jan 18 16:35:33 ocho pulseaudio[11158]: main.c: Please read http://pulseaudio.org/wiki/WhatIsWrongWithSystemMode for an explanation why system mode is usually a bad idea.
Jan 18 16:35:33 ocho pulseaudio[11158]: module.c: module-hal-detect is deprecated: Please use module-udev-detect instead of module-hal-detect!
Jan 18 16:35:33 ocho pulseaudio[11158]: module-hal-detect-compat.c: We will now load module-udev-detect. Please make sure to remove module-hal-detect from your configuration

it’s probably not such a good idea. Plus, in my case, the sound card wasn’t detected in system-wide mode, probably because of some configuration issue, which I didn’t care much about working on. The bottom line is that the software’s authors don’t really want this to work.

Opening a TCP socket instead

The simple solution is given on this forum thread. This works well when there’s a specific user always logged on, and programs belonging to other dummy users are always run for specific purposes.

The idea behind this trick is to open a TCP port for native Pulseaudio communication, which requires no authentication as long as the connection comes from 127.0.0.1, i.e. from the host itself. This opens the audio interface to any program running on the computer, including recording from the microphone. Security-wise, this makes no significant difference if the computer is used by a single person anyhow (possible spyware is likely to run with the logged-in user’s ID, which has full access to audio either way).

This solution works on Fedora Core 12, but it’s probably the way to do it on any distribution released since 2009 or so.

Edit: It has been suggested in the comments below to use a UNIX socket instead of TCP. Haven’t tried it, but it seems like a better solution.

To do as the desktop’s user

So let’s get to the hands-on: First, copy /etc/pulse/default.pa into a file with the same name in the .pulse directory, that is

cp /etc/pulse/default.pa ~/.pulse/

And then edit the file, adding the following line at the end:

load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1

At this point, restart the pulseaudio daemon:

$ pulseaudio -k
$ pulseaudio -D

To do as the “fake” user

Now switch to the second user, and create a file named client.conf under that user’s .pulse subdirectory:

$ echo "default-server = 127.0.0.1" > ~/.pulse/client.conf

Note that default.pa and client.conf are in completely different directories, each belonging to a different user!

Surprisingly enough, that’s it. Any program running as the second user now has sound access.

stmmaceth: NetworkManager fails to bring up a wired Ethernet NIC

The problem

In short: Running linux 3.8.0 on Altera’s Cyclone V SoC, NetworkManager doesn’t bring up the Ethernet port. It also makes false accusations such as

Jan  1 00:00:17 localhost NetworkManager[1206]: <info> (eth0): driver 'stmmaceth' does not support carrier detection.

and later on also says

Jan  1 00:00:17 localhost NetworkManager[1206]: <warn> (eth0): couldn't get carrier state: (-1) unknown
Jan  1 00:00:17 localhost NetworkManager[1206]: <info> (eth0): carrier now OFF (device state 20, deferring action for 4 seconds)

And asking more directly,

# nm-tool eth0
NetworkManager Tool

State: disconnected

- Device: eth0 -----------------------------------------------------------------
  Type:              Wired
  Driver:            stmmaceth
  State:             unavailable
  Default:           no
  HW Address:        96:A7:6F:4E:DD:6D

  Capabilities:

  Wired Properties
    Carrier:         off

All of this is, of course, incorrect, even though it’s not clear who’s to blame. But the driver detects the carrier all right:

# cat /sys/class/net/eth0/carrier
1

and as we shall see below, the ioctl() interface is also supported. Only it doesn’t work as NetworkManager expects it to.

Well, I bluffed a bit in proving that the carrier detection works. That’s explained later on.

So what went wrong?

Nothing like digging in the source code. In NetworkManager’s nm-device-ethernet.c, the function supports_ethtool_carrier_detect() goes

static gboolean
supports_ethtool_carrier_detect (NMDeviceEthernet *self)
{
	int fd;
	struct ifreq ifr;
	gboolean supports_ethtool = FALSE;
	struct ethtool_cmd edata;

	g_return_val_if_fail (self != NULL, FALSE);

	fd = socket (PF_INET, SOCK_DGRAM, 0);
	if (fd < 0) {
		nm_log_err (LOGD_HW, "couldn't open control socket.");
		return FALSE;
	}

	memset (&ifr, 0, sizeof (struct ifreq));
	strncpy (ifr.ifr_name, nm_device_get_iface (NM_DEVICE (self)), IFNAMSIZ);

	edata.cmd = ETHTOOL_GLINK;
	ifr.ifr_data = (char *) &edata;

	errno = 0;
	if (ioctl (fd, SIOCETHTOOL, &ifr) < 0) {
		nm_log_dbg (LOGD_HW | LOGD_ETHER, "SIOCETHTOOL failed: %d", errno);
		goto out;
	}

	supports_ethtool = TRUE;

out:
	close (fd);
	nm_log_dbg (LOGD_HW | LOGD_ETHER, "ethtool %s supported",
	            supports_ethtool ? "is" : "not");
	return supports_ethtool;
}

Obviously, this is the function that determines whether the port supports carrier detection. There is also a similar function for MII, supports_mii_carrier_detect(). A simple strace reveals what went wrong; with this driver, the log says

socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 17
ioctl(17, SIOCETHTOOL, 0x7e93bcdc)      = -1 EBUSY (Device or resource busy)
close(17)                               = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 17
ioctl(17, SIOCGMIIPHY, 0x7e93bcfc)      = -1 EINVAL (Invalid argument)
close(17)                               = 0
open("/proc/sys/net/ipv6/conf/eth0/accept_ra", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/proc/sys/net/ipv6/conf/eth0/use_tempaddr", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
gettimeofday({4101, 753554}, NULL)      = 0
send(6, "<30>Jan  1 01:08:21 NetworkManager[1701]: <info> (eth0): driver 'stmmaceth' does not support carrier detection.", 111, MSG_NOSIGNAL) = 111

so we can see that the attempt made in supports_ethtool_carrier_detect() failed with an EBUSY, and the one made by supports_mii_carrier_detect() failed as well, with an EINVAL. In other words, the ethtool ioctl() interface (which is loosely related to the ethtool utility) was recognized, but the driver said it was busy (a silly return code, as we shall see later), and the MII ioctl() interface was rejected altogether.

Since NetworkManager doesn’t support carrier detection based on Sysfs, the final conclusion is that there is no carrier detection.

But why did the driver answer EBUSY in the first place?

Some kernel digging

The relevant Linux kernel is 3.8.0.

ioctl() calls to network devices are handled by the dev_ioctl() function in net/core/dev.c (not in drivers/, and it was later on moved to dev_ioctl.c) as follows:

	case SIOCETHTOOL:
		dev_load(net, ifr.ifr_name);
		rtnl_lock();
		ret = dev_ethtool(net, &ifr);
		rtnl_unlock();
		if (!ret) {
			if (colon)
				*colon = ':';
			if (copy_to_user(arg, &ifr,
					 sizeof(struct ifreq)))
				ret = -EFAULT;
		}
		return ret;

Note that the ioctl() call is based upon the name of the interface as a string (e.g. “eth0”). The call to dev_load() hence loads a kernel module if the respective driver isn’t loaded yet. The dev_ethtool() function is in net/core/ethtool.c. This function first runs a few sanity and permission checks, and may return ENODEV, EFAULT or EPERM, depending on the mishap.

Most notably, it runs

	if (dev->ethtool_ops->begin) {
		rc = dev->ethtool_ops->begin(dev);
		if (rc  < 0)
			return rc;
	}

which in the case of stmmac is

static int stmmac_check_if_running(struct net_device *dev)
{
	if (!netif_running(dev))
		return -EBUSY;
	return 0;
}

netif_running(dev) is defined in include/linux/netdevice.h as follows:

static inline bool netif_running(const struct net_device *dev)
{
	return test_bit(__LINK_STATE_START, &dev->state);
}

This function returns true when the device is “up”, exactly in the sense of “ifconfig up”.

Say what?

NetworkManager made the SIOCETHTOOL ioctl() call before bringing up the eth0 interface, in order to check if it supports carrier detection. But since the interface wasn’t up (why should it be? NetworkManager didn’t bring it up), the driver’s sanity check (?) failed the ioctl() call with an EBUSY: netif_running() returned false, as the interface was down. So NetworkManager marked the interface as not supporting carrier detection, and took it up even so. This made the driver report that it had detected a carrier, but since NetworkManager didn’t expect that to happen, it started fooling around, and eventually didn’t bring up the interface properly (no DHCP, in particular).

As it turns out, netif_running(dev) returns zero, which is the reason the whole thing fails with an EBUSY.

Now let’s return to the Sysfs detection of the carrier. With the eth0 interface down, it goes like this

# cat /sys/class/net/eth0/carrier
cat: /sys/class/net/eth0/carrier: Invalid argument
# ifconfig eth0 up
# cat /sys/class/net/eth0/carrier
0
# cat /sys/class/net/eth0/carrier
1

The two successive carrier readings give different results because it takes a second or so before the carrier is detected. Nothing changed with the hardware in between (no cable was plugged in or out).

So NetworkManager was partly right: The driver doesn’t support carrier detection as long as the interface isn’t brought up.

Solution

The solution is surprisingly simple. Just make sure

ifconfig eth0 up

is executed before NetworkManager is launched. That’s it. Suddenly nm-tool sees a completely different interface:

# nm-tool eth0

NetworkManager Tool

State: connected (global)

- Device: eth0  [Wired connection 1] -------------------------------------------
  Type:              Wired
  Driver:            stmmaceth
  State:             connected
  Default:           yes
  HW Address:        9E:37:A8:56:CF:EC

  Capabilities:
    Carrier Detect:  yes
    Speed:           100 Mb/s

  Wired Properties
    Carrier:         on

  IPv4 Settings:
    Address:         10.1.1.242
    Prefix:          24 (255.255.255.0)
    Gateway:         10.1.1.3

    DNS:             10.2.0.1
    DNS:             10.2.0.2

Who should we blame here? Probably NetworkManager. Since it’s bringing up the interface anyhow, why not check for carrier detection support after the interface is up? I suppose the driver has its reasons for not cooperating while the interface is down.

Epilogue

Since I started with dissecting the kernel’s code, here’s what happens with the call to dev_ethtool() mentioned above, when it passes the “sanity check”. There’s a huge case statement, with the relevant part saying

	case ETHTOOL_GLINK:
		rc = ethtool_get_link(dev, useraddr);
		break;

the rc value is propagated up when this call finishes (after some possible other operations, which are probably not relevant).

And then we have, in the same file,

static int ethtool_get_link(struct net_device *dev, char __user *useraddr)
{
	struct ethtool_value edata = { .cmd = ETHTOOL_GLINK };

	if (!dev->ethtool_ops->get_link)
		return -EOPNOTSUPP;

	edata.data = netif_running(dev) && dev->ethtool_ops->get_link(dev);

	if (copy_to_user(useraddr, &edata, sizeof(edata)))
		return -EFAULT;
	return 0;
}

The ethtool_value structure is defined in include/uapi/linux/ethtool.h saying

struct ethtool_value {
	__u32	cmd;
	__u32	data;
};

Note that if netif_running(dev) returns false, zero is returned in the data field of the answer, but the call itself is successful (which actually makes sense). But this never happens with the current driver, as was seen above.

It’s fairly safe to assume that drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c handles the actual call, as it has

static const struct ethtool_ops stmmac_ethtool_ops = {
	.begin = stmmac_check_if_running,
... snip ...
	.get_link = ethtool_op_get_link,
... snip ...
};

but ethtool_op_get_link() is defined in net/core/ethtool.c (we’re running in circles…) saying simply

u32 ethtool_op_get_link(struct net_device *dev)
{
	return netif_carrier_ok(dev) ? 1 : 0;
}

which brings us to include/linux/netdevice.h, where it says

static inline bool netif_carrier_ok(const struct net_device *dev)
{
	return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
}

This raises the question of why the driver refuses to answer ETHTOOL_GLINK requests when the interface is down; the driver isn’t even involved in answering this request. But having attempted to modify the driver so that ETHTOOL_GLINK is let through even when the interface is down, I can say that it still confused NetworkManager. I didn’t get down to why, exactly.

High resolution images of the Sockit board


At times, it’s useful to have a high-resolution picture of the board in front of you. For example, finding the correct place to touch with a probe is easier when the point is first found on the computer screen.

These are two very detailed images of the Sockit board by Terasic and Arrow Electronics (and Altera), featuring a Cyclone V SoC FPGA.

The images below are small, and are just links to the bigger files. The USB plug that is connected is the OTG port (for connecting a keyboard or USB stick etc.)

And finally, here’s a short video clip showing what it looks like when powering on the board with Xillinux:

 

Using xargs to run commands from a bash batch file in parallel

Suppose that we have a file, batch-commands.sh, which consists of independent commands to be executed, one for each line. Now we want to run several of these in parallel.

xargs -P 8 -n 1 -d "\n" -a batch-commands.sh bash -c

With -P 8, up to eight processes run at any given time.

Tcl scripting: Which version of Quartus am I running?

The short answer is $quartus(version). Those familiar with Tcl will immediately tell that there’s a named array (hash), $quartus, containing a key “version”, which holds the full version string.

So, entering an interactive session,

$ quartus_sh -s
Info: *******************************************************************
Info: Running Quartus II 32-bit Shell
 Info: Version 13.0.1 Build 232 06/12/2013 Service Pack 1 SJ Web Edition
 Info: Copyright (C) 1991-2013 Altera Corporation. All rights reserved.
 Info: Your use of Altera Corporation's design tools, logic functions
 Info: and other software and tools, and its AMPP partner logic
 Info: functions, and any output files from any of the foregoing
 Info: (including device programming or simulation files), and any
 Info: associated documentation or information are expressly subject
 Info: to the terms and conditions of the Altera Program License
 Info: Subscription Agreement, Altera MegaCore Function License
 Info: Agreement, or other applicable license agreement, including,
 Info: without limitation, that your use is for the sole purpose of
 Info: programming logic devices manufactured by Altera and sold by
 Info: Altera or its authorized distributors.  Please refer to the
 Info: applicable agreement for further details.
 Info: Processing started: Mon Dec 23 16:08:47 2013
Info: *******************************************************************
Info: The Quartus II Shell supports all TCL commands in addition
Info: to Quartus II Tcl commands. All unrecognized commands are
Info: assumed to be external and are run using Tcl's "exec"
Info: command.
Info: - Type "exit" to exit.
Info: - Type "help" to view a list of Quartus II Tcl packages.
Info: - Type "help <package name>" to view a list of Tcl commands
Info:   available for the specified Quartus II Tcl package.
Info: - Type "help -tcl" to get an overview on Quartus II Tcl usages.
Info: *******************************************************************

one can get both the Quartus revision and the Tcl version:

tcl> puts $quartus(version)
Version 13.0.1 Build 232 06/12/2013 Service Pack 1 SJ Web Edition
tcl> info tclversion
8.5

A simple regular expression can be used to fetch a clean Quartus version number:

tcl> regexp {[\.0-9]+} $quartus(version) clean_number
1
tcl> puts $clean_number
13.0.1

The first command runs the regular expression on the full version string, and finds the first sequence consisting of digits and dots. The return value is “1”, because such a sequence was found. The third argument to regexp makes the interpreter put the matched string into the $clean_number variable, which is then printed by the second command.

To list all elements in the $quartus array,

tcl> foreach key [array names quartus] { puts "${key}=$quartus($key)" }
version_base=13.0
ip_rootpath=/path/to/13.0sp1/ip/
copyright=Copyright (C) 1991-2013 Altera Corporation
load_report_is_needed=0
advanced_use=0
nativelink_tclpath=/path/to/13.0sp1/quartus/common/tcl/internal/nativelink/
quartus_rootpath=/path/to/13.0sp1/quartus/
processing=0
tclpath=/path/to/13.0sp1/quartus/common/tcl/
ipc_mode=0
nameofexecutable=quartus_sh
tcl_console_mode=2
natural_bus_naming=1
eda_tclpath=/path/to/13.0sp1/quartus/common/tcl/internal/eda_utils/
settings=
internal_use=0
regtest_mode=0
package_table={ddr_timing_model quartus_sta hidden} {rpwq qacv hidden} [...]
eda_libpath=/path/to/13.0sp1/quartus/eda/
args=
ipc_sh=0
version=Version 13.0.1 Build 232 06/12/2013 Service Pack 1 SJ Web Edition
binpath=/path/to/13.0sp1/quartus/linux/
project=
is_report_loaded=0
available_packages=::quartus::external_memif_toolkit ::quartus::iptclgen ::quartus::project ::quartus::device ::quartus::partial_reconfiguration ::quartus::report ::quartus::misc ::quartus::rapid_recompile ::quartus::incremental_compilation ::quartus::flow ::quartus::systemconsol

package_table was snipped, as it was very long. I’ve also mangled the paths to Quartus’ installation into /path/to, again to keep the output short.

Cyclone V SoC: Two masters on a bus errs on ID signal width

While working on Xillinux’ port to Altera (the SocKit board, actually), I needed to connect two AXI masters: one for the VGA adapter, and one for the Xillybus IP core. Unlike Zynq, Altera’s HPS offers only one AXI slave port, so it’s up to Qsys to generate arbitration logic, implemented in the logic fabric, to connect these two masters to the HPS module.

But the interconnect’s details shouldn’t have bothered me, the user of Qsys. It was supposed to be a matter of connecting both masters to the same slave in Qsys’ graphical representation, and leaving the rest to the tools (Quartus 13.1 and 13.0sp1 in my case).

Only it went a little wrong. Besides, if you intend to use the WSTRB signals at all, you may want to avoid Altera’s master interconnect altogether. See below.

The generation failed as follows:

2013.12.14.17:22:33 Error: hps_0.f2h_axi_slave: width of ID signals (8) must be at least 9
2013.12.14.17:22:33 Info: merlin_domain_transform: After transform: 14 modules, 87 connections
2013.12.14.17:22:33 Info: merlin_router_transform: After transform: 28 modules, 129 connections
... snip ...
2013.12.14.17:22:34 Info: merlin_interrupt_mapper_transform: After transform: 62 modules, 201 connections
2013.12.14.17:22:38 Error: Generation stopped, 51 or more modules remaining
2013.12.14.17:22:38 Info: soc_system: Done "soc_system" with 23 modules, 1 files, 298125 bytes

Say what? The ID signals of masters on the AXI bus, which are connected to hps_0.f2h_axi_slave, should be 8 bits wide. Besides, where did the figure “9” come from?

Also, note that Qsys is complaining about the width of a signal it generated itself (the port to the module that instantiates the HPS).

A word about ID widths

The IDs on the AXI bus are intended to identify the master that initiated the transaction, for several purposes (e.g. to allow loose reordering of packets from different masters). The full ID on the internal AXI bus is 12 bits wide.

Consequently, the ID widths presented by an FPGA slave on the AXI bus (attached to the regular or lightweight bridge, it doesn’t matter) should be 12 bits.

When the FPGA is master, the ID width is 8 bits. Rationale: The ID is 12 bits in the main interconnect, but bit 11 is always zero and bits [2:0] are 3'b100 for all packets from the FPGA bridge, so only 8 bits are left for the FPGA logic to set. See table 6-6 in the Cyclone V Device Handbook vol. 3.
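As a sketch of that arithmetic (the exact bit positions are my reading of table 6-6, and the sample ID value is made up), the full 12-bit ID would be composed of a zero bit 11, the FPGA-assigned 8 bits in [10:3], and the fixed 3'b100 in [2:0]:

```shell
# Assumed layout (my interpretation, not verified on hardware):
# bit 11 = 0, bits [10:3] = FPGA master ID, bits [2:0] = 3'b100
# for everything arriving through the FPGA bridge.
fpga_id=0xA5                          # the 8 bits the FPGA gets to set
full_id=$(( (fpga_id << 3) | 0x4 ))   # compose the 12-bit interconnect ID
printf 'full ID = 0x%03x\n' "$full_id"
(( full_id < 0x800 )) && echo "bit 11 is indeed zero"
```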

The solution

The answer is that the “9” came from the width of the two masters’ ID signals, which was 8, as it should be. It seems that the arbitration logic, which was automatically inserted by Qsys, adds another bit to the ID field to distinguish between the two masters connected to it. So there are 9 bits. But the HPS can only offer 8 bits. Bummer.

Once the problem is understood, the solution is simple: reduce the masters’ ID signal width to, say, 4 bits. Qsys then requires 5 bits from the HPS module, well within the 8 it offers.

WSTRB lost by interconnect

After solving the problem described above, I combined two 64-bit masters into the HPS’ slave port (64 bits wide as well), and experienced data corruption. Some investigation revealed that the WSTRB signal wasn’t obeyed: if WSTRB[7:0] was 0xf0 on a single-beat burst, all 64 bits ended up written into SDRAM, instead of leaving bits [31:0] intact. It’s not clear whether this happened occasionally or all the time, nor whether it’s the only issue. I worked around this by connecting the write-related AXI signals directly to the HPS (the arbitration was needed only for the read signals), which solved the problem. Hence my conclusion that the interconnect was faulty.
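For reference, here is a sketch of the byte-lane semantics WSTRB is supposed to enforce, written as plain shell arithmetic rather than AXI logic (the data values are made up): each set bit in WSTRB enables one byte lane, and unstrobed bytes must keep their old value.

```shell
# Expand an 8-bit WSTRB into a 64-bit byte-lane mask: bit i of wstrb
# enables bits [8i+7:8i] of the data word.
wstrb=0xf0
mask=0
for i in 0 1 2 3 4 5 6 7; do
  if (( (wstrb >> i) & 1 )); then
    mask=$(( mask | (0xff << (8 * i)) ))
  fi
done

old=0x1111111111111111    # value already in SDRAM (arbitrary example)
new=0x2222222222222222    # write data of the single-beat burst

# Correct behavior: strobed bytes take the new data, the rest keep the old.
result=$(( (old & ~mask) | (new & mask) ))
printf '0x%016x\n' "$result"    # → 0x2222222211111111
```

With WSTRB = 0xf0, bits [31:0] must remain intact; the corruption I saw was as if the mask covered all eight lanes.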