“Unsupported machine ID” after upgrading Linux kernel or U-boot

Unlike how I usually treat software tools I work with, my attitude towards U-boot is “if it works, never mind how and why”. Trying to understand the gory details of U-boot has never been very rewarding. Things work or break more or less randomly, depending on which git revision is checked out. Someone sent a patch fixing this and breaking that, and then someone else sent another patch.

So this is my story: After upgrading the Linux kernel from 3.3 to 3.12 (both having a strong Xilinx flavor, as the target is a Zynq board), while keeping an old version of U-boot, I ran through the drill that usually makes the kernel boot:

zynq-uboot> fatload mmc 0 0x8000 zImage
reading zImage

2797680 bytes read
zynq-uboot> fatload mmc 0 0x1000000 devicetree.dtb
reading devicetree.dtb

5827 bytes read
zynq-uboot> go 0x8000
## Starting application at 0x00008000 ...

Error: unrecognized/unsupported machine ID (r1 = 0x1fb5662c).

Available machine support:

ID (hex)        NAME
ffffffff        Generic DT based system
ffffffff        Xilinx Zynq Platform

Please check your kernel config and/or bootloader.

So the kernel didn’t boot. Looking at the way I attempted to kick it off, one may wonder how it worked at all with kernel v3.3. But one can’t argue with the fact that it used to boot.

The first thing to understand about this error message is that it's fatally misleading. The real problem is that the kernel doesn't find the device tree blob, so it falls back on looking for a machine ID in r1, which just happens to hold some arbitrary value. The error message comes from a piece of boot code that is never reached if the device tree is found and used.

Now, let’s try to understand the logic behind the sequence of commands: The first command loaded the zImage into a known place in memory, 0x8000. One could ask why I didn’t use uImage. Well, why should I? zImage works, and that’s the classic way to boot a kernel.

The device tree blob is then loaded to address 0x1000000.

And then comes the fun part: U-boot just jumps to 0x8000, the beginning of the image. I think I recall that one can put zImage anywhere in memory, and it will take it from there.

But how does the kernel know that the device tree is at 0x1000000? Beats me. I suppose it's hardcoded somewhere. But hey, it worked! At least on older kernels. And on U-boot 2012.10 and older (but not 2013.10).

For the newer kernel (say, 3.12), a completely different spell should be cast. Something like this (using U-Boot 2012.10 or 2013.10):

zynq-uboot> fatload mmc 0 0x3000000 uImage
reading uImage                                                                  

3054976 bytes read
zynq-uboot> fatload mmc 0 0x2A00000 devicetree.dtb
reading devicetree.dtb                                                          

7863 bytes read
zynq-uboot> bootm 0x3000000 - 0x2A00000
## Booting kernel from Legacy Image at 03000000 ...
   Image Name:   Linux-3.12.0-1.3-xilinx
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    3054912 Bytes = 2.9 MiB
   Load Address: 00008000
   Entry Point:  00008000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 02a00000
   Booting using the fdt blob at 0x02a00000
   Loading Kernel Image ... OK
OK
   Loading Device Tree to 1fb4f000, end 1fb53eb6 ... OK                         

Starting kernel ...                                                             

Uncompressing Linux... done, booting the kernel.
Booting Linux on physical CPU 0x0
[ ... kernel boots, bliss and happiness ... ]

OK, so I twisted it all around. All of a sudden, I use uImage rather than zImage. Why? Because bootm works with uImage, and bootz wasn't supported by that specific U-boot version (or configuration, I'm not really sure).

The addresses I loaded into are different too, and I'm not sure that matters. What surely matters is that bootm was explicitly given the address of the device tree blob, and therefore passed it on to the kernel in a register (r2, per the ARM boot convention). So in this case, it's pretty obvious how the kernel finds what's where.

Ah, but there’s a twist. The latter, bootm-based method didn’t work on the 3.3 kernel. In fact, I got a very similar error message when I tried that.

As I said in the beginning, I never cared to dive deep into U-boot. So all I'm saying is: if you encounter an error message like the one above, just try the other magic spell, and hope for the best.

And if someone knows more about what happened in the interface between U-boot and Linux somewhere in 2012, or has a link to some mailing list discussion where this was decided upon, please comment below. :)

Linux kernel platform device food chain example

Since the device tree is the new way to set up hardware devices on embedded platforms, I hoped that I could avoid the “platform” API for picking which driver is going to take control over what. But it looks like the /arch/arm disaster is here to stay for a while, so I need to at least understand how it works.

So for reference, here’s an example walkthrough of the SPI driver for i.MX51, declared and matched with a hardware device.

The idea is simple: The driver, which is enabled by .config (and hence included for compilation by the Makefile in its directory), binds itself to a string during its initialization. On the other side, the board initialization code registers a device matching that string, and also supplies some information along with it. The example tells the story better.

The platform API is documented in the kernel tree’s Documentation/driver-model/platform.txt. There’s also a nice LWN article by Jonathan Corbet.

So let’s assume we have Freescale’s 3-stack board at hand. In arch/arm/mach-mx5/mx51_3stack.c, at the bottom, it says

MACHINE_START(MX51_3DS, "Freescale MX51 3-Stack Board")
 .fixup = fixup_mxc_board,
 .map_io = mx5_map_io,
 .init_irq = mx5_init_irq,
 .init_machine = mxc_board_init,
 .timer = &mxc_timer,
MACHINE_END

mxc_board_init() is defined in the same file; among many other calls, it goes

mxc_register_device(&mxcspi1_device, &mxcspi1_data);

with the extra info structure mxcspi1_data defined as

static struct mxc_spi_master mxcspi1_data = {
 .maxchipselect = 4,
 .spi_version = 23,
 .chipselect_active = mx51_3ds_gpio_spi_chipselect_active,
 .chipselect_inactive = mx51_3ds_gpio_spi_chipselect_inactive,
};

Now to the declaration of mxcspi1_device: In arch/arm/mach-mx5/devices.c we have

struct platform_device mxcspi1_device = {
	.name = "mxc_spi",
	.id = 0,
	.num_resources = ARRAY_SIZE(mxcspi1_resources),
	.resource = mxcspi1_resources,
	.dev = {
		.dma_mask = &spi_dma_mask,
		.coherent_dma_mask = DMA_BIT_MASK(32),
	},
};

and before that, in the same file there was:

static struct resource mxcspi1_resources[] = {
	{
		.start = CSPI1_BASE_ADDR,
		.end = CSPI1_BASE_ADDR + SZ_4K - 1,
		.flags = IORESOURCE_MEM,
	},
	{
		.start = MXC_INT_CSPI1,
		.end = MXC_INT_CSPI1,
		.flags = IORESOURCE_IRQ,
	},
	{
		.start = MXC_DMA_CSPI1_TX,
		.end = MXC_DMA_CSPI1_TX,
		.flags = IORESOURCE_DMA,
	},
};

So that defines the magic driver string and the resources that are allocated to this device.

It’s worth noting that devices.c ends with

postcore_initcall(mxc_init_devices);

which causes a call to mxc_init_devices(), a function that messes up the addresses of the resources for some architectures. Just to add some confusion. Always watch out for those little traps!

Meanwhile, in drivers/spi/mxc_spi.c

static struct platform_driver mxc_spi_driver = {
	.driver = {
		   .name = "mxc_spi",
		   .owner = THIS_MODULE,
		   },
	.probe = mxc_spi_probe,
	.remove = mxc_spi_remove,
	.suspend = mxc_spi_suspend,
	.resume = mxc_spi_resume,
};

followed by:

static int __init mxc_spi_init(void)
{
	pr_debug("Registering the SPI Controller Driver\n");
	return platform_driver_register(&mxc_spi_driver);
}

static void __exit mxc_spi_exit(void)
{
	pr_debug("Unregistering the SPI Controller Driver\n");
	platform_driver_unregister(&mxc_spi_driver);
}

subsys_initcall(mxc_spi_init);
module_exit(mxc_spi_exit);

So this is how the driver tells Linux that it’s responsible for devices marked with the “mxc_spi” string.

As for some interaction with the device data (also in mxc_spi.c), there’s stuff like

mxc_platform_info = (struct mxc_spi_master *)pdev->dev.platform_data;

and

master_drv_data->res = platform_get_resource(pdev, IORESOURCE_MEM, 0);

going on with

if (!request_mem_region(master_drv_data->res->start,
			master_drv_data->res->end -
			master_drv_data->res->start + 1, pdev->name)) { /* Ayee! */ }

and

if (pdev->dev.dma_mask == NULL) { /* No DMA for you! */ }

and it goes on…
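
For what it's worth, platform_get_resource() itself is little more than a scan over that resource table: it returns the num-th resource whose flags match the requested type. Here's a simplified userspace sketch (the toy_* names and the example addresses are mine; the kernel's version compares the type bits a bit more carefully):

```c
#include <stddef.h>

/* Flag values as in the kernel's include/linux/ioport.h */
#define IORESOURCE_MEM 0x00000200
#define IORESOURCE_IRQ 0x00000400

struct toy_resource {
	unsigned long start, end, flags;
};

/* Return the num-th resource of the given type from the table, or NULL.
   This models what platform_get_resource(pdev, type, num) does with
   pdev->resource and pdev->num_resources. */
static struct toy_resource *toy_get_resource(struct toy_resource *res,
					     int count, unsigned long type,
					     unsigned int num)
{
	for (int i = 0; i < count; i++)
		if ((res[i].flags & type) && num-- == 0)
			return &res[i];
	return NULL;
}
```

So master_drv_data->res = platform_get_resource(pdev, IORESOURCE_MEM, 0) simply picks the first IORESOURCE_MEM entry of the mxcspi1_resources table shown above.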

Reading the DocBook files in Linux kernel’s documentation

This is my short saga about my not necessarily intelligent actions for reading a DocBook paper.

So I wanted to read some documentation from my Linux kernel sources. It happened to be in DocBook format.

In the kernel source’s root, I tried

$ make htmldocs

or I could have gone

$ make pdfdocs

Or mandocs. Or sgmldocs. Or psdocs.

But that builds only the DocBook templates in Documentation/DocBook/. I need those in Documentation/sound/alsa/DocBook!

I tried to copy the Makefile to the target directory, and change the assignment of DOCBOOKS to the files I wanted handled. But that didn’t work, because the Makefile couldn’t find the scripts/ subdirectory. Or as “make” put it:

make: *** No rule to make target `/scripts/kernel-doc', needed by `/alsa-driver-api.tmpl'.  Stop.

OK, this was a bit too much. How about just going…

$ docbook2html writing-an-alsa-driver.tmpl

Works, creates a lot of scattered HTML files. Not so easy to read. Can’t I get it in a single document?

$ docbook2rtf writing-an-alsa-driver.tmpl

Huh? WTF? RTF? Yes, it’s apparently still alive. And hey, one can easily export it to PDF with OpenOffice!

This probably isn’t exactly the way the kernel hackers meant it to be done. On the other hand, it’s far too much effort just for reading a document by and for the community…

I’m sure someone knows about the obvious way I missed. Comments below, please…

 

Cache coherency on i.MX25 running Linux

What this post is all about

Running some home-cooked SDMA scripts on Freescale’s Linux 2.6.28 kernel on an i.MX25 processor, I’m puzzled by the fact that cache flushing with dma_map_single(…, DMA_TO_DEVICE) doesn’t hurt, yet nothing breaks if the calls are removed. On the other hand, attempting to remove the cache invalidation calls, as in dma_map_single(…, DMA_FROM_DEVICE), does cause data corruption, as one would expect.

The de-facto lack of need for cache flushing could be explained by the small size of the cache: The sequence of events is typically to prepare the data in the buffer, then do some stuff in the middle, and only then kick off the SDMA script. If the dirty cache lines are evicted naturally as a result of that “some stuff” activity, one gets away with not flushing the cache explicitly.

I’m by no means saying that cache flushing shouldn’t be done. On the contrary, I’m surprised that things don’t break when it’s removed.

So why doesn’t one get away with skipping the cache invalidation? In my tests, when I dropped the invalidation, I saw 32-byte segments going wrong: some segments, typically after a handful of successful data transactions of less than 1 kB each.

Why does dropping the invalidation break things, while dropping the flushing doesn’t? As I said above, I’m still puzzled by this.

So I went down to the details of what these calls to dma_map_single() do. Spoiler: I didn’t find an explanation. At the end of the food chain there are several MCR assembly instructions, as one would expect, and both flushing and invalidation apparently do something useful.

The rest of this post is a dissection of the Linux kernel’s code in this respect.

The gory details

DMA mappings and sync functions practically wrap the dma_cache_maint() function, e.g. in arch/arm/include/asm/dma-mapping.h:

static inline dma_addr_t dma_map_single(struct device *dev, void *cpu_addr,
		size_t size, enum dma_data_direction dir)
{
	BUG_ON(!valid_dma_direction(dir));

	if (!arch_is_coherent())
		dma_cache_maint(cpu_addr, size, dir);

	return virt_to_dma(dev, cpu_addr);
}

It was verified with disassembly that dma_map_single() was implemented with a call to dma_cache_maint().

This function can be found in arch/arm/mm/dma-mapping.c as follows

/*
 * Make an area consistent for devices.
 * Note: Drivers should NOT use this function directly, as it will break
 * platforms with CONFIG_DMABOUNCE.
 * Use the driver DMA support - see dma-mapping.h (dma_sync_*)
 */
void dma_cache_maint(const void *start, size_t size, int direction)
{
	const void *end = start + size;

	BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(end - 1));

	switch (direction) {
	case DMA_FROM_DEVICE:		/* invalidate only */
		dmac_inv_range(start, end);
		outer_inv_range(__pa(start), __pa(end));
		break;
	case DMA_TO_DEVICE:		/* writeback only */
		dmac_clean_range(start, end);
		outer_clean_range(__pa(start), __pa(end));
		break;
	case DMA_BIDIRECTIONAL:		/* writeback and invalidate */
		dmac_flush_range(start, end);
		outer_flush_range(__pa(start), __pa(end));
		break;
	default:
		BUG();
	}
}
EXPORT_SYMBOL(dma_cache_maint);

The outer_* calls are defined as null functions in arch/arm/include/asm/cacheflush.h, since the CONFIG_OUTER_CACHE kernel configuration flag isn’t set.

The dmac_* macros are defined in arch/arm/include/asm/cacheflush.h as follows:

#define dmac_inv_range			__glue(_CACHE,_dma_inv_range)
#define dmac_clean_range		__glue(_CACHE,_dma_clean_range)
#define dmac_flush_range		__glue(_CACHE,_dma_flush_range)

where __glue() simply glues the two strings together (see arch/arm/include/asm/glue.h) and _CACHE equals “arm926” for the i.MX25, so e.g. dmac_clean_range becomes arm926_dma_clean_range.

These actual functions are implemented in assembler in arch/arm/mm/proc-arm926.S:

/*
 *	dma_inv_range(start, end)
 *
 *	Invalidate (discard) the specified virtual address range.
 *	May not write back any entries.  If 'start' or 'end'
 *	are not cache line aligned, those lines must be written
 *	back.
 *
 *	- start	- virtual start address
 *	- end	- virtual end address
 *
 * (same as v4wb)
 */
ENTRY(arm926_dma_inv_range)
#ifndef CONFIG_CPU_DCACHE_WRITETHROUGH
	tst	r0, #CACHE_DLINESIZE - 1
	mcrne	p15, 0, r0, c7, c10, 1		@ clean D entry
	tst	r1, #CACHE_DLINESIZE - 1
	mcrne	p15, 0, r1, c7, c10, 1		@ clean D entry
#endif
	bic	r0, r0, #CACHE_DLINESIZE - 1
1:	mcr	p15, 0, r0, c7, c6, 1		@ invalidate D entry
	add	r0, r0, #CACHE_DLINESIZE
	cmp	r0, r1
	blo	1b
	mcr	p15, 0, r0, c7, c10, 4		@ drain WB
	mov	pc, lr

/*
 *	dma_clean_range(start, end)
 *
 *	Clean the specified virtual address range.
 *
 *	- start	- virtual start address
 *	- end	- virtual end address
 *
 * (same as v4wb)
 */
ENTRY(arm926_dma_clean_range)
#ifndef CONFIG_CPU_DCACHE_WRITETHROUGH
	bic	r0, r0, #CACHE_DLINESIZE - 1
1:	mcr	p15, 0, r0, c7, c10, 1		@ clean D entry
	add	r0, r0, #CACHE_DLINESIZE
	cmp	r0, r1
	blo	1b
#endif
	mcr	p15, 0, r0, c7, c10, 4		@ drain WB
	mov	pc, lr

/*
 *	dma_flush_range(start, end)
 *
 *	Clean and invalidate the specified virtual address range.
 *
 *	- start	- virtual start address
 *	- end	- virtual end address
 */
ENTRY(arm926_dma_flush_range)
	bic	r0, r0, #CACHE_DLINESIZE - 1
1:
#ifndef CONFIG_CPU_DCACHE_WRITETHROUGH
	mcr	p15, 0, r0, c7, c14, 1		@ clean+invalidate D entry
#else
	mcr	p15, 0, r0, c7, c6, 1		@ invalidate D entry
#endif
	add	r0, r0, #CACHE_DLINESIZE
	cmp	r0, r1
	blo	1b
	mcr	p15, 0, r0, c7, c10, 4		@ drain WB
	mov	pc, lr

The CONFIG_CPU_DCACHE_WRITETHROUGH kernel configuration flag is not set, so there are no shortcuts.
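
In C terms, the address walk of arm926_dma_inv_range's loop boils down to the following (a userspace model of the address arithmetic only; the MCR cache operations themselves obviously can't be reproduced here):

```c
#define CACHE_DLINESIZE 32

/* Count how many D-cache lines the loop in arm926_dma_inv_range touches:
   round the start address down to a line boundary (the "bic" instruction),
   then step one line at a time while below "end" (the "blo" branch). */
static int lines_touched(unsigned long start, unsigned long end)
{
	int n = 0;

	for (unsigned long p = start & ~(unsigned long)(CACHE_DLINESIZE - 1);
	     p < end; p += CACHE_DLINESIZE)
		n++;	/* one "invalidate D entry" MCR per iteration */

	return n;
}
```

This also makes it clear why the corruption shows up in 32-byte segments: the cache operates on whole lines, so any part of a line can go stale only together with the other 31 bytes.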

Exactly the same snippet, only disassembled from the object file (using objdump -d):

000004d4 <arm926_dma_inv_range>:
 4d4:	e310001f 	tst	r0, #31
 4d8:	1e070f3a 	mcrne	15, 0, r0, cr7, cr10, {1}
 4dc:	e311001f 	tst	r1, #31
 4e0:	1e071f3a 	mcrne	15, 0, r1, cr7, cr10, {1}
 4e4:	e3c0001f 	bic	r0, r0, #31
 4e8:	ee070f36 	mcr	15, 0, r0, cr7, cr6, {1}
 4ec:	e2800020 	add	r0, r0, #32
 4f0:	e1500001 	cmp	r0, r1
 4f4:	3afffffb 	bcc	4e8 <arm926_dma_inv_range+0x14>
 4f8:	ee070f9a 	mcr	15, 0, r0, cr7, cr10, {4}
 4fc:	e1a0f00e 	mov	pc, lr

00000500 <arm926_dma_clean_range>:
 500:	e3c0001f 	bic	r0, r0, #31
 504:	ee070f3a 	mcr	15, 0, r0, cr7, cr10, {1}
 508:	e2800020 	add	r0, r0, #32
 50c:	e1500001 	cmp	r0, r1
 510:	3afffffb 	bcc	504 <arm926_dma_clean_range+0x4>
 514:	ee070f9a 	mcr	15, 0, r0, cr7, cr10, {4}
 518:	e1a0f00e 	mov	pc, lr

0000051c <arm926_dma_flush_range>:
 51c:	e3c0001f 	bic	r0, r0, #31
 520:	ee070f3e 	mcr	15, 0, r0, cr7, cr14, {1}
 524:	e2800020 	add	r0, r0, #32
 528:	e1500001 	cmp	r0, r1
 52c:	3afffffb 	bcc	520 <arm926_dma_flush_range+0x4>
 530:	ee070f9a 	mcr	15, 0, r0, cr7, cr10, {4}
 534:	e1a0f00e 	mov	pc, lr

So there’s actually little to learn from the disassembly. Or at all…

Pulseaudio for multiple users, without system-mode daemon

This is a simple and quick solution for those of us who want to run certain programs as a different user on the same desktop, for example running several user profiles of a browser at the same time. The main problem is usually that Pulseaudio doesn’t accept connections from a user other than the one logged in on the desktop.

It’s often suggested to go for a system-mode Pulseaudio daemon, but judging from the developer’s own comments on this, and the friendly messages left in the system log, like

Jan 18 16:35:33 ocho pulseaudio[11158]: main.c: OK, so you are running PA in system mode. Please note that you most likely shouldn't be doing that.
Jan 18 16:35:33 ocho pulseaudio[11158]: main.c: If you do it nonetheless then it's your own fault if things don't work as expected.
Jan 18 16:35:33 ocho pulseaudio[11158]: main.c: Please read http://pulseaudio.org/wiki/WhatIsWrongWithSystemMode for an explanation why system mode is usually a bad idea.
Jan 18 16:35:33 ocho pulseaudio[11158]: module.c: module-hal-detect is deprecated: Please use module-udev-detect instead of module-hal-detect!
Jan 18 16:35:33 ocho pulseaudio[11158]: module-hal-detect-compat.c: We will now load module-udev-detect. Please make sure to remove module-hal-detect from your configuration

it’s probably not such a good idea. Plus, in my case, the sound card wasn’t detected in system-wide mode, probably because of some configuration issue, which I didn’t care much about working on. The bottom line is that the software’s authors don’t really want this to work.

Opening a TCP socket instead

The simple solution is given on this forum thread. This works well when there’s a specific user always logged on, and programs belonging to other dummy users are always run for specific purposes.

The idea behind this trick is to open a TCP port for native Pulseaudio communication, which requires no authentication as long as the connection comes from 127.0.0.1, i.e. from the host itself. This opens the audio interface to any program running on the computer, including recording from the microphone. Security-wise, this makes no significant difference if the computer is used by a single person anyhow (possible spyware is likely to run with the logged-in user’s ID, which has full access to audio either way).

This solution works on Fedora Core 12, but it’s probably the way to do it on any distribution released since 2009 or so.

Edit: It has been suggested in the comments below to use a UNIX socket instead of TCP. Haven’t tried it, but it seems like a better solution.

To do as the desktop’s user

So let’s get to the hands-on: First, copy /etc/pulse/default.pa into a file with the same name in the .pulse directory, that is

cp /etc/pulse/default.pa ~/.pulse/

And then edit the file, adding the following line at the end:

load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1

At this point, restart the pulseaudio daemon:

$ pulseaudio -k
$ pulseaudio -D

To do as the “fake” user

Now switch to the second user, and create a file named client.conf under that user’s .pulse subdirectory:

$ echo "default-server = 127.0.0.1" > ~/.pulse/client.conf

Note that default.pa and client.conf are in completely different directories, each belonging to a different user!

Surprisingly enough, that’s it. Any program running as the second user now has sound access.

stmmaceth: NetworkManager fails to bring up a wired Ethernet NIC

The problem

In short: Running linux 3.8.0 on Altera’s Cyclone V SoC, NetworkManager doesn’t bring up the Ethernet port. It also makes false accusations such as

Jan  1 00:00:17 localhost NetworkManager[1206]: <info> (eth0): driver 'stmmaceth' does not support carrier detection.

and later on also says

Jan  1 00:00:17 localhost NetworkManager[1206]: <warn> (eth0): couldn't get carrier state: (-1) unknown
Jan  1 00:00:17 localhost NetworkManager[1206]: <info> (eth0): carrier now OFF (device state 20, deferring action for 4 seconds)

And asking more directly,

# nm-tool eth0
NetworkManager Tool

State: disconnected

- Device: eth0 -----------------------------------------------------------------
  Type:              Wired
  Driver:            stmmaceth
  State:             unavailable
  Default:           no
  HW Address:        96:A7:6F:4E:DD:6D

  Capabilities:

  Wired Properties
    Carrier:         off

All of this is, of course, incorrect, even though it’s not clear who’s to blame. But the driver detects the carrier all right:

# cat /sys/class/net/eth0/carrier
1

and as we shall see below, the ioctl() interface is also supported. Only it doesn’t work as NetworkManager expects it to.

Well, I bluffed a bit in proving that the carrier detection works. That’s explained later on.

So what went wrong?

Nothing like digging in the source code. In NetworkManager’s nm-device-ethernet.c, the function supports_ethtool_carrier_detect() goes

static gboolean
supports_ethtool_carrier_detect (NMDeviceEthernet *self)
{
	int fd;
	struct ifreq ifr;
	gboolean supports_ethtool = FALSE;
	struct ethtool_cmd edata;

	g_return_val_if_fail (self != NULL, FALSE);

	fd = socket (PF_INET, SOCK_DGRAM, 0);
	if (fd < 0) {
		nm_log_err (LOGD_HW, "couldn't open control socket.");
		return FALSE;
	}

	memset (&ifr, 0, sizeof (struct ifreq));
	strncpy (ifr.ifr_name, nm_device_get_iface (NM_DEVICE (self)), IFNAMSIZ);

	edata.cmd = ETHTOOL_GLINK;
	ifr.ifr_data = (char *) &edata;

	errno = 0;
	if (ioctl (fd, SIOCETHTOOL, &ifr) < 0) {
		nm_log_dbg (LOGD_HW | LOGD_ETHER, "SIOCETHTOOL failed: %d", errno);
		goto out;
	}

	supports_ethtool = TRUE;

out:
	close (fd);
	nm_log_dbg (LOGD_HW | LOGD_ETHER, "ethtool %s supported",
	            supports_ethtool ? "is" : "not");
	return supports_ethtool;
}

Obviously, this is the function that determines whether the port supports carrier detection. There is also a similar function for MII, supports_mii_carrier_detect(). A simple strace reveals what went wrong; with this driver, the log says

socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 17
ioctl(17, SIOCETHTOOL, 0x7e93bcdc)      = -1 EBUSY (Device or resource busy)
close(17)                               = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 17
ioctl(17, SIOCGMIIPHY, 0x7e93bcfc)      = -1 EINVAL (Invalid argument)
close(17)                               = 0
open("/proc/sys/net/ipv6/conf/eth0/accept_ra", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/proc/sys/net/ipv6/conf/eth0/use_tempaddr", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
gettimeofday({4101, 753554}, NULL)      = 0
send(6, "<30>Jan  1 01:08:21 NetworkManager[1701]: <info> (eth0): driver 'stmmaceth' does not support carrier detection.", 111, MSG_NOSIGNAL) = 111

so we can see that the attempt made in supports_ethtool_carrier_detect() failed with an EBUSY, and the one made by supports_mii_carrier_detect() failed as well, with an EINVAL. In other words, the ethtool ioctl() interface (which is loosely related to the ethtool utility) was recognized, but the driver said it was busy (a silly return code, as we shall see later), and the MII ioctl() interface was rejected altogether.

Since NetworkManager doesn’t support carrier detection based on Sysfs, the final conclusion is that there is no carrier detection.

But why did the driver answer EBUSY in the first place?

Some kernel digging

The relevant Linux kernel is 3.8.0.

ioctl() calls to network devices are handled by the dev_ioctl() function in net/core/dev.c (not in drivers/, and it was later on moved to dev_ioctl.c) as follows:

	case SIOCETHTOOL:
		dev_load(net, ifr.ifr_name);
		rtnl_lock();
		ret = dev_ethtool(net, &ifr);
		rtnl_unlock();
		if (!ret) {
			if (colon)
				*colon = ':';
			if (copy_to_user(arg, &ifr,
					 sizeof(struct ifreq)))
				ret = -EFAULT;
		}
		return ret;

Note that the ioctl() call is based upon the name of the interface as a string (e.g. “eth0”). The call to dev_load() hence loads a kernel module if the respective driver isn’t loaded yet. The dev_ethtool() function is in net/core/ethtool.c. This function first runs a few sanity and permission checks, and may return ENODEV, EFAULT or EPERM, depending on the mishap.

Most notably, it runs

	if (dev->ethtool_ops->begin) {
		rc = dev->ethtool_ops->begin(dev);
		if (rc  < 0)
			return rc;
	}

which in the case of stmmac is

static int stmmac_check_if_running(struct net_device *dev)
{
	if (!netif_running(dev))
		return -EBUSY;
	return 0;
}

netif_running(dev) is defined in include/linux/netdevice.h as follows:

static inline bool netif_running(const struct net_device *dev)
{
	return test_bit(__LINK_STATE_START, &dev->state);
}

This function returns true when the device is “up”, exactly in the sense of “ifconfig up”.

Say what?

NetworkManager made the SIOCETHTOOL ioctl() call before bringing up the eth0 interface, in order to check if it supports carrier detection. But since the interface wasn’t up (why should it be? NetworkManager didn’t bring it up), the driver’s sanity check (?) failed the ioctl() call with an EBUSY: netif_running() returned false, as the interface was down. So NetworkManager marked the interface as not supporting carrier detection, and took it up even so. This made the driver report that it had detected a carrier, but since NetworkManager didn’t expect that to happen, it started fooling around, and eventually didn’t bring up the interface properly (no DHCP, in particular).

As it turns out, netif_running(dev) returns zero, which is the reason the whole thing fails with an EBUSY.

Now let’s return to the Sysfs detection of the carrier. With the eth0 interface down, it goes like this

# cat /sys/class/net/eth0/carrier
cat: /sys/class/net/eth0/carrier: Invalid argument
# ifconfig eth0 up
# cat /sys/class/net/eth0/carrier
0
# cat /sys/class/net/eth0/carrier
1

The two successive carrier readings give different results because it takes a second or so before the carrier is detected. Nothing changed with the hardware in between (no cable was plugged in or out).

So NetworkManager was partly right: The driver doesn’t support carrier detection as long as the interface isn’t brought up.

Solution

The solution is surprisingly simple. Just make sure

ifconfig eth0 up

is executed before NetworkManager is launched. That’s it. Suddenly nm-tool sees a completely different interface:

# nm-tool eth0

NetworkManager Tool

State: connected (global)

- Device: eth0  [Wired connection 1] -------------------------------------------
  Type:              Wired
  Driver:            stmmaceth
  State:             connected
  Default:           yes
  HW Address:        9E:37:A8:56:CF:EC

  Capabilities:
    Carrier Detect:  yes
    Speed:           100 Mb/s

  Wired Properties
    Carrier:         on

  IPv4 Settings:
    Address:         10.1.1.242
    Prefix:          24 (255.255.255.0)
    Gateway:         10.1.1.3

    DNS:             10.2.0.1
    DNS:             10.2.0.2

Who should we blame here? Probably NetworkManager. Since it’s bringing up the interface anyhow, why not check for carrier detection support after the interface is up? I suppose the driver has its reasons for not cooperating while the interface is down.

Epilogue

Since I started with dissecting the kernel’s code, here’s what happens with the call to dev_ethtool() mentioned above, when it passes the “sanity check”. There’s a huge case statement, with the relevant part saying

	case ETHTOOL_GLINK:
		rc = ethtool_get_link(dev, useraddr);
		break;

the rc value is propagated up when this call finishes (after some possible other operations, which are probably not relevant).

And then we have, in the same file,

static int ethtool_get_link(struct net_device *dev, char __user *useraddr)
{
	struct ethtool_value edata = { .cmd = ETHTOOL_GLINK };

	if (!dev->ethtool_ops->get_link)
		return -EOPNOTSUPP;

	edata.data = netif_running(dev) && dev->ethtool_ops->get_link(dev);

	if (copy_to_user(useraddr, &edata, sizeof(edata)))
		return -EFAULT;
	return 0;
}

The ethtool_value structure is defined in include/uapi/linux/ethtool.h saying

struct ethtool_value {
	__u32	cmd;
	__u32	data;
};

Note that if netif_running(dev) returns false, zero is returned in the data field of the answer, but the call itself is successful (which actually makes sense). But this never happens with the current driver, as was seen above.

It’s fairly safe to assume that drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c handles the actual call, as it has

static const struct ethtool_ops stmmac_ethtool_ops = {
	.begin = stmmac_check_if_running,
... snip ...
	.get_link = ethtool_op_get_link,
... snip ...
};

but ethtool_op_get_link() is defined in net/core/ethtool.c (we’re running in circles…) saying simply

u32 ethtool_op_get_link(struct net_device *dev)
{
	return netif_carrier_ok(dev) ? 1 : 0;
}

which brings us to include/linux/netdevice.h, where it says

static inline bool netif_carrier_ok(const struct net_device *dev)
{
	return !test_bit(__LINK_STATE_NOCARRIER, &dev->state);
}

This raises the question of why the driver refuses to answer ETHTOOL_GLINK requests when the interface is down; the driver isn’t even involved in answering this request. But having attempted to modify the driver so that ETHTOOL_GLINK is let through even when the interface is down, I can say that it still confused NetworkManager. I didn’t get down to why, exactly.

High resolution images of the Sockit board


At times, it’s useful to have a high-resolution picture of the board in front of you. For example, finding the correct place to touch with a probe is easier when the point is first found on the computer screen.

These are two very detailed images of the Sockit board by Terasic and Arrow Electronics (and Altera), featuring a Cyclone V SoC FPGA.

The images below are small, and are just links to the bigger files. The USB plug that is connected is the OTG port (for connecting a keyboard or USB stick etc.)

And finally, here’s a short video clip showing what it looks like when powering on the board with Xillinux:

 

Using xargs to run commands from a bash batch file in parallel

Suppose that we have a file, batch-commands.sh, which consists of independent commands to be executed, one for each line. Now we want to run several of these in parallel.

xargs -P 8 -n 1 -d "\n" -a batch-commands.sh bash -c

With -P 8, up to eight processes run at any given time.

Tcl scripting: Which version of Quartus am I running?

The short answer is $quartus(version). Those familiar with Tcl will immediately tell that there’s a named array (hash), $quartus, containing a key “version”, which holds the full version string.

So, entering an interactive session,

$ quartus_sh -s
Info: *******************************************************************
Info: Running Quartus II 32-bit Shell
 Info: Version 13.0.1 Build 232 06/12/2013 Service Pack 1 SJ Web Edition
 Info: Copyright (C) 1991-2013 Altera Corporation. All rights reserved.
 Info: Your use of Altera Corporation's design tools, logic functions
 Info: and other software and tools, and its AMPP partner logic
 Info: functions, and any output files from any of the foregoing
 Info: (including device programming or simulation files), and any
 Info: associated documentation or information are expressly subject
 Info: to the terms and conditions of the Altera Program License
 Info: Subscription Agreement, Altera MegaCore Function License
 Info: Agreement, or other applicable license agreement, including,
 Info: without limitation, that your use is for the sole purpose of
 Info: programming logic devices manufactured by Altera and sold by
 Info: Altera or its authorized distributors.  Please refer to the
 Info: applicable agreement for further details.
 Info: Processing started: Mon Dec 23 16:08:47 2013
Info: *******************************************************************
Info: The Quartus II Shell supports all TCL commands in addition
Info: to Quartus II Tcl commands. All unrecognized commands are
Info: assumed to be external and are run using Tcl's "exec"
Info: command.
Info: - Type "exit" to exit.
Info: - Type "help" to view a list of Quartus II Tcl packages.
Info: - Type "help <package name>" to view a list of Tcl commands
Info:   available for the specified Quartus II Tcl package.
Info: - Type "help -tcl" to get an overview on Quartus II Tcl usages.
Info: *******************************************************************

one can get both the Quartus revision and the Tcl version:

tcl> puts $quartus(version)
Version 13.0.1 Build 232 06/12/2013 Service Pack 1 SJ Web Edition
tcl> info tclversion
8.5

A simple regular expression can be used to fetch a clean Quartus version number:

tcl> regexp {[\.0-9]+} $quartus(version) clean_number
1
tcl> puts $clean_number
13.0.1

The first command runs the regular expression on the full version string, and finds the first sequence consisting of digits and dots. The return value is “1”, because such a sequence was found. The third argument to regexp makes the interpreter put the matched string into the $clean_number variable, which is then printed by the second command.

To list all elements in the $quartus array,

tcl> foreach key [array names quartus] { puts "${key}=$quartus($key)" }
version_base=13.0
ip_rootpath=/path/to/13.0sp1/ip/
copyright=Copyright (C) 1991-2013 Altera Corporation
load_report_is_needed=0
advanced_use=0
nativelink_tclpath=/path/to/13.0sp1/quartus/common/tcl/internal/nativelink/
quartus_rootpath=/path/to/13.0sp1/quartus/
processing=0
tclpath=/path/to/13.0sp1/quartus/common/tcl/
ipc_mode=0
nameofexecutable=quartus_sh
tcl_console_mode=2
natural_bus_naming=1
eda_tclpath=/path/to/13.0sp1/quartus/common/tcl/internal/eda_utils/
settings=
internal_use=0
regtest_mode=0
package_table={ddr_timing_model quartus_sta hidden} {rpwq qacv hidden} [...]
eda_libpath=/path/to/13.0sp1/quartus/eda/
args=
ipc_sh=0
version=Version 13.0.1 Build 232 06/12/2013 Service Pack 1 SJ Web Edition
binpath=/path/to/13.0sp1/quartus/linux/
project=
is_report_loaded=0
available_packages=::quartus::external_memif_toolkit ::quartus::iptclgen ::quartus::project ::quartus::device ::quartus::partial_reconfiguration ::quartus::report ::quartus::misc ::quartus::rapid_recompile ::quartus::incremental_compilation ::quartus::flow ::quartus::systemconsol

package_table was snipped, as it was very long. I’ve also mangled the paths to Quartus’ installation into /path/to, again to keep the output short.

Cyclone V SoC: Two masters on a bus errs on ID signal width

While working on Xillinux’ port to Altera (the SocKit board, actually), I needed to connect two AXI masters: one for the VGA adapter, and one for the Xillybus IP core. Unlike Zynq, Altera’s HPS offers only one AXI slave port, so it’s up to Qsys to generate arbitration logic, implemented in the logic fabric, to connect these two masters to the HPS module.

But the interconnect’s details shouldn’t have bothered me, the user of Qsys. It was supposed to be a matter of connecting both masters to the same slave in Qsys’ graphical representation, and leaving the rest to the tools (Quartus 13.1 and 13.0sp1 in my case).

Only it went a little wrong. Besides, if you intend to use the WSTRB signals at all, you may want to avoid Altera’s master interconnect altogether. See below.

The generation failed as follows:

2013.12.14.17:22:33 Error: hps_0.f2h_axi_slave: width of ID signals (8) must be at least 9
2013.12.14.17:22:33 Info: merlin_domain_transform: After transform: 14 modules, 87 connections
2013.12.14.17:22:33 Info: merlin_router_transform: After transform: 28 modules, 129 connections
... snip ...
2013.12.14.17:22:34 Info: merlin_interrupt_mapper_transform: After transform: 62 modules, 201 connections
2013.12.14.17:22:38 Error: Generation stopped, 51 or more modules remaining
2013.12.14.17:22:38 Info: soc_system: Done "soc_system" with 23 modules, 1 files, 298125 bytes

Say what? The ID signals of masters on the AXI bus, which are connected to hps_0.f2h_axi_slave, should be 8 bits wide. Besides, where did the figure “9” come from?

Also, note that Qsys is complaining about the width of a signal it generated itself (the port to the module that instantiates the HPS).

A word about ID widths

The IDs on the AXI bus are intended to identify the master that initiated the transaction, for several purposes (e.g. to allow loose reordering of packets from different masters). The full ID on the internal AXI bus is 12 bits wide.

Consequently, the ID widths presented by an FPGA slave on the AXI bus (attached to the regular or lightweight bridge, it doesn’t matter) should be 12 bits.

When the FPGA is master, the ID width is 8 bits. Rationale: The ID is 12 bits in the main interconnect, but bit 11 is always zero and bits [2:0] are 3'b100 for all packets from the FPGA bridge, so only 8 bits are left for the FPGA logic to set. See table 6-6 in the Cyclone V Device Handbook vol. 3.
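As a sketch of that arithmetic (the exact bit positions are my reading of table 6-6, and the sample ID value is made up), the full 12-bit ID would be composed of a zero bit 11, the FPGA-assigned 8 bits in [10:3], and the fixed 3'b100 in [2:0]:

```shell
# Assumed layout (my interpretation, not verified on hardware):
# bit 11 = 0, bits [10:3] = FPGA master ID, bits [2:0] = 3'b100
# for everything arriving through the FPGA bridge.
fpga_id=0xA5                          # the 8 bits the FPGA gets to set
full_id=$(( (fpga_id << 3) | 0x4 ))   # compose the 12-bit interconnect ID
printf 'full ID = 0x%03x\n' "$full_id"
(( full_id < 0x800 )) && echo "bit 11 is indeed zero"
```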

The solution

The answer is that the “9” came from the width of the two masters’ ID signals, which was 8, as it should be. It seems that the arbitration logic, which was automatically inserted by Qsys, adds another bit to the ID field to distinguish between the two masters connected to it. So there are 9 bits. But the HPS can only offer 8 bits. Bummer.

Once the problem is understood, the solution is simple: reduce the masters’ ID signal width to, say, 4 bits. Qsys then requires 5 bits from the HPS module, well within the 8 it offers.

WSTRB lost by interconnect

After solving the problem described above, I combined two 64-bit masters into the HPS’ slave port (64 bits wide as well), and experienced data corruption. Some investigation revealed that the WSTRB signal wasn’t obeyed: if WSTRB[7:0] was 0xf0 on a single-beat burst, all 64 bits ended up written into SDRAM, instead of leaving bits [31:0] intact. It’s not clear whether this happened occasionally or all the time, nor whether it’s the only issue. I worked around this by connecting the write-related AXI signals directly to the HPS (the arbitration was needed only for the read signals), which solved the problem. Hence my conclusion that the interconnect was faulty.
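For reference, here is a sketch of the byte-lane semantics WSTRB is supposed to enforce, written as plain shell arithmetic rather than AXI logic (the data values are made up): each set bit in WSTRB enables one byte lane, and unstrobed bytes must keep their old value.

```shell
# Expand an 8-bit WSTRB into a 64-bit byte-lane mask: bit i of wstrb
# enables bits [8i+7:8i] of the data word.
wstrb=0xf0
mask=0
for i in 0 1 2 3 4 5 6 7; do
  if (( (wstrb >> i) & 1 )); then
    mask=$(( mask | (0xff << (8 * i)) ))
  fi
done

old=0x1111111111111111    # value already in SDRAM (arbitrary example)
new=0x2222222222222222    # write data of the single-beat burst

# Correct behavior: strobed bytes take the new data, the rest keep the old.
result=$(( (old & ~mask) | (new & mask) ))
printf '0x%016x\n' "$result"    # → 0x2222222211111111
```

With WSTRB = 0xf0, bits [31:0] must remain intact; the corruption I saw was as if the mask covered all eight lanes.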