iowrite32(), writel() and memory barriers taken apart

Introduction

Needing to remove superfluous memory barriers from a Linux kernel device driver, I wondered what they actually do. The issue is discussed in painful detail in Documentation/memory-barriers.txt, but somehow it’s quite difficult to figure out whether they’re really needed, and where. Most drivers rely on successive iowrite32()s (or writel()s) arriving at the hardware in the same order they appear in the code, and this is backed up by the following clause in memory-barriers.txt:

Inside of the Linux kernel, I/O should be done through the appropriate accessor routines – such as inb() or writel() – which know how to make such accesses appropriately sequential. Whilst this, for the most part, renders the explicit use of memory barriers unnecessary, there are a couple of situations where they might be needed:

  1. On some systems, I/O stores are not strongly ordered across all CPUs, and so for _all_ general drivers locks should be used and mmiowb() must be issued prior to unlocking the critical section.
  2. If the accessor functions are used to refer to an I/O memory window with relaxed memory access properties, then _mandatory_ memory barriers are required to enforce ordering.

See Documentation/DocBook/deviceiobook.tmpl for more information.

So what they’re saying is that a memory barrier should be used before releasing a lock (spinlock? mutex? both? The examples show only a spinlock) and when prefetching is allowed by hardware.

Nice. Are they doing anything?

April 2020 update: I’ve written a new post on a similar topic. Also, on top of memory-barriers.txt mentioned above, there are some excellent explanations in the kernel tree’s tools/memory-model/Documentation/explanation.txt and tools/memory-model/Documentation/recipes.txt. These are relatively new (from v4.17, beginning of 2018).

May 2021 update: I’ve also written the parallel post for Windows device driver coding, which occasionally brings up Linux.

The practical take

Since I care most about x86 and ARM, I decided to figure out what the memory barriers actually do. The driver’s code should be formally correct, but in the end, if I remove a memory barrier and then test the driver — have I really made a difference? Have I really tested anything?

Ah, and in case you wonder why I didn’t check ioread32() and readl(): I don’t use them in my driver. Odd as it may sound.

The kernel sources in this post are ~3.12, but how often does anyone dare to touch those basic functions?

Spoiler

For the lazy ones, here are my conclusions:

  • On x86 platforms, iowrite32() and writel() are translated to just a “mov” into memory.
  • On ARM, the same functions translate into a full write synchronization barrier (stop execution until all previous writes are done), and then an “str” into memory.
  • On x86, the following functions translate into nothing: mmiowb(), smp_wmb() and smp_rmb(). wmb() and rmb() translate into “sfence” and “lfence” respectively.
  • On ARM, mmiowb() translates into nothing. The other barriers translate into sensible opcodes.

Trying memory barriers with iowrite32()

I wrote the following kernel module as minimodule.c. Obviously, it won’t do anything good except for being disassembled after compilation.

#include <linux/module.h>
#include <linux/slab.h>
#include <linux/io.h>

void try_iowrite32(void) {
  void __iomem *p = (void *) 0x12345678;

  iowrite32(0xabcd0001, p);
  iowrite32(0xabcd0001, p);
  iowrite32(0xabcd0002, p);
  mmiowb();
  iowrite32(0xabcd0003, p);
  wmb();
  iowrite32(0xabcd0004, p);
  rmb();
  iowrite32(0xabcd0005, p);
  smp_wmb();
  iowrite32(0xabcd0006, p);
  smp_rmb();
}

EXPORT_SYMBOL(try_iowrite32);

The idea: First repeat exactly the same write to see how that’s handled, and then add barriers to see what they turn into.

The related sources for iowrite32() on x86

I have to admit that I was surprised to find out that iowrite32() is a function in itself, as is shown later in the disassembly. My best understanding was that it’s just an alias for writel(), by virtue of a define statement. But since CONFIG_GENERIC_IOMAP is defined on my kernel, iowrite32() isn’t defined in include/asm-generic/io.h; there’s only a prototype for it in include/asm-generic/iomap.h. It’s defined as a function in lib/iomap.c as follows:

void iowrite32(u32 val, void __iomem *addr)
{
	IO_COND(addr, outl(val,port), writel(val, addr));
}

where IO_COND is previously defined in the same file as follows (the comment is in the sources):

/*
 * Ugly macros are a way of life.
 */
#define IO_COND(addr, is_pio, is_mmio) do {			\
	unsigned long port = (unsigned long __force)addr;	\
	if (port >= PIO_RESERVED) {				\
		is_mmio;					\
	} else if (port > PIO_OFFSET) {				\
		port &= PIO_MASK;				\
		is_pio;						\
	} else							\
		bad_io_access(port, #is_pio );			\
} while (0)

So there we have it. iowrite32() isn’t just an alias for writel(), but it checks the address and interprets it as port I/O if that makes sense.

To be sure, iowrite32() was disassembled as follows from the kernel’s object code (32-bit version):

0020f79f <iowrite32>:
  20f79f:       81 fa ff ff 03 00       cmp    $0x3ffff,%edx
  20f7a5:       89 d1                   mov    %edx,%ecx
  20f7a7:       76 03                   jbe    20f7ac <iowrite32+0xd>
  20f7a9:       89 02                   mov    %eax,(%edx)
  20f7ab:       c3                      ret
  20f7ac:       81 fa 00 00 01 00       cmp    $0x10000,%edx
  20f7b2:       76 08                   jbe    20f7bc <iowrite32+0x1d>
  20f7b4:       81 e2 ff ff 00 00       and    $0xffff,%edx
  20f7ba:       ef                      out    %eax,(%dx)
  20f7bb:       c3                      ret
  20f7bc:       ba f2 56 03 00          mov    $0x356f2,%edx
  20f7c1:       89 c8                   mov    %ecx,%eax
  20f7c3:       e9 41 fe ff ff          jmp    20f609 <bad_io_access>
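
To make the three-way decision concrete, here is a userspace sketch of the same test. The thresholds are read off the disassembly above (0x3ffff, 0x10000 and the 0xffff mask), which I assume correspond to PIO_RESERVED, PIO_OFFSET and PIO_MASK. The enum and the function name are made up for illustration and don't exist in the kernel.

```c
/* Thresholds as they appear in the disassembly above; assumed to be the
 * values of PIO_RESERVED, PIO_OFFSET and PIO_MASK in lib/iomap.c. */
#define PIO_OFFSET   0x10000UL
#define PIO_MASK     0x0ffffUL
#define PIO_RESERVED 0x40000UL

/* Hypothetical names, for illustration only. */
enum io_kind { IO_BAD, IO_PORT, IO_MMIO };

/* Mirror the decision IO_COND makes on the cookie passed to iowrite32().
 * For port I/O, *port receives the 16-bit port number. */
static enum io_kind classify_io_cookie(unsigned long addr, unsigned long *port)
{
	if (addr >= PIO_RESERVED)
		return IO_MMIO;			/* writel() path: a plain "mov" */
	if (addr > PIO_OFFSET) {
		*port = addr & PIO_MASK;	/* outl() path: an "out" */
		return IO_PORT;
	}
	return IO_BAD;				/* bad_io_access() path */
}
```

With these thresholds, the test address 0x12345678 used in minimodule.c falls into the memory-mapped branch, so only the plain “mov” at 20f7a9 would run for it.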

Results on x86_64

Compiled on Intel x86/64 bit:

$ objdump -d minimodule.ko

minimodule.ko:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <try_iowrite32>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	e8 00 00 00 00       	callq  9 <try_iowrite32+0x9>
   9:	be 78 56 34 12       	mov    $0x12345678,%esi
   e:	bf 01 00 cd ab       	mov    $0xabcd0001,%edi
  13:	e8 00 00 00 00       	callq  18 <try_iowrite32+0x18>
  18:	be 78 56 34 12       	mov    $0x12345678,%esi
  1d:	bf 01 00 cd ab       	mov    $0xabcd0001,%edi
  22:	e8 00 00 00 00       	callq  27 <try_iowrite32+0x27>
  27:	be 78 56 34 12       	mov    $0x12345678,%esi
  2c:	bf 02 00 cd ab       	mov    $0xabcd0002,%edi
  31:	e8 00 00 00 00       	callq  36 <try_iowrite32+0x36>
  36:	be 78 56 34 12       	mov    $0x12345678,%esi
  3b:	bf 03 00 cd ab       	mov    $0xabcd0003,%edi
  40:	e8 00 00 00 00       	callq  45 <try_iowrite32+0x45>
  45:	0f ae f8             	sfence
  48:	be 78 56 34 12       	mov    $0x12345678,%esi
  4d:	bf 04 00 cd ab       	mov    $0xabcd0004,%edi
  52:	e8 00 00 00 00       	callq  57 <try_iowrite32+0x57>
  57:	0f ae e8             	lfence
  5a:	be 78 56 34 12       	mov    $0x12345678,%esi
  5f:	bf 05 00 cd ab       	mov    $0xabcd0005,%edi
  64:	e8 00 00 00 00       	callq  69 <try_iowrite32+0x69>
  69:	be 78 56 34 12       	mov    $0x12345678,%esi
  6e:	bf 06 00 cd ab       	mov    $0xabcd0006,%edi
  73:	e8 00 00 00 00       	callq  78 <try_iowrite32+0x78>
  78:	c9                   	leaveq
  79:	c3                   	retq
	...

Those “callq” statements are modified upon linking. To resolve what these are calling, go

$ readelf -r minimodule.ko

Relocation section '.rela.text' at offset 0xa9b0 contains 8 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000005  002300000002 R_X86_64_PC32     0000000000000000 mcount - 4
000000000014  002000000002 R_X86_64_PC32     0000000000000000 iowrite32 - 4
000000000023  002000000002 R_X86_64_PC32     0000000000000000 iowrite32 - 4
000000000032  002000000002 R_X86_64_PC32     0000000000000000 iowrite32 - 4
000000000041  002000000002 R_X86_64_PC32     0000000000000000 iowrite32 - 4
000000000053  002000000002 R_X86_64_PC32     0000000000000000 iowrite32 - 4
000000000065  002000000002 R_X86_64_PC32     0000000000000000 iowrite32 - 4
000000000074  002000000002 R_X86_64_PC32     0000000000000000 iowrite32 - 4

(the output continues with relocation information for debug variables).

It’s quite easy to work this out: The “Offset” column tells us the offset in the object code. For example, a callq statement begins at 0x13, but the address to call starts at 0x14. The second entry in the relocation section points at offset 0x14, and says that the target is iowrite32().
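
The arithmetic behind this, including the “- 4” in the addend column, can be sketched in a few lines of C. This is only an illustration of how a PC-relative 32-bit relocation works out for a near call (opcode byte 0xe8 followed by a 4-byte displacement). The function names are made up, and the addresses used to exercise them are arbitrary.

```c
#include <stdint.h>

/* A near call on x86 is the opcode byte 0xe8 followed by a 32-bit
 * displacement, so the relocated field starts one byte into the
 * instruction. */
#define CALL_OPCODE_LEN 1
#define CALL_DISP_LEN   4

/* Offset of the call instruction that owns a relocation entry. */
static uint64_t call_start_from_reloc(uint64_t reloc_offset)
{
	return reloc_offset - CALL_OPCODE_LEN;
}

/* A PC32 relocation stores S + A - P: the symbol's address, plus the
 * addend, minus the address of the field being patched. */
static int32_t pc32_field(uint64_t sym, int64_t addend, uint64_t field_addr)
{
	return (int32_t)(uint32_t)(sym + (uint64_t)addend - field_addr);
}

/* What the CPU does with the field: the displacement is relative to the
 * next instruction, which begins right after the 4-byte field. */
static uint64_t call_target(uint64_t field_addr, int32_t disp)
{
	return field_addr + CALL_DISP_LEN + (uint64_t)(int64_t)disp;
}
```

The addend of -4 cancels out the four bytes between the start of the displacement field and the next instruction, so call_target(P, pc32_field(S, -4, P)) gives back S for any field address P.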

So from this output we learn that all callq’s are to iowrite32(), except the first one, which goes to mcount() (which is intended for kernel call tracing).

Now to conclusions: There are no memory barriers in the code, except those generated by wmb() and rmb(), which added sfence and lfence respectively. sfence is defined as

Performs a serializing operation on all store instructions that were issued prior the SFENCE instruction. This serializing operation guarantees that every store instruction that precedes in program order the SFENCE instruction is globally visible before any store instruction that follows the SFENCE instruction is globally visible. The SFENCE instruction is ordered with respect store instructions, other SFENCE instructions, any MFENCE instructions, and any serializing instructions (such as the CPUID instruction). It is not ordered with respect to load instructions or the LFENCE instruction.

and lfence as

Performs a serializing operation on all load-from-memory instructions that were issued prior the LFENCE instruction. This serializing operation guarantees that every load instruction that precedes in program order the LFENCE instruction is globally visible before any load instruction that follows the LFENCE instruction is globally visible. The LFENCE instruction is ordered with respect to load instructions, other LFENCE instructions, any MFENCE instructions, and any serializing instructions (such as the CPUID instruction). It is not ordered with respect to store instructions or the SFENCE instruction.

One can feel the Intel-headache just reading this.

Results on x86 (32 bit)

Compiling this against a 32-bit kernel, with a slightly different configuration:

$ objdump -d minimodule.ko

minimodule.ko:     file format elf32-i386

Disassembly of section .text:

00000000 <try_iowrite32>:
   0:	ba 78 56 34 12       	mov    $0x12345678,%edx
   5:	b8 01 00 cd ab       	mov    $0xabcd0001,%eax
   a:	e8 fc ff ff ff       	call   b <try_iowrite32+0xb>
   f:	ba 78 56 34 12       	mov    $0x12345678,%edx
  14:	b8 01 00 cd ab       	mov    $0xabcd0001,%eax
  19:	e8 fc ff ff ff       	call   1a <try_iowrite32+0x1a>
  1e:	ba 78 56 34 12       	mov    $0x12345678,%edx
  23:	b8 02 00 cd ab       	mov    $0xabcd0002,%eax
  28:	e8 fc ff ff ff       	call   29 <try_iowrite32+0x29>
  2d:	ba 78 56 34 12       	mov    $0x12345678,%edx
  32:	b8 03 00 cd ab       	mov    $0xabcd0003,%eax
  37:	e8 fc ff ff ff       	call   38 <try_iowrite32+0x38>
  3c:	f0 83 04 24 00       	lock addl $0x0,(%esp)
  41:	ba 78 56 34 12       	mov    $0x12345678,%edx
  46:	b8 04 00 cd ab       	mov    $0xabcd0004,%eax
  4b:	e8 fc ff ff ff       	call   4c <try_iowrite32+0x4c>
  50:	f0 83 04 24 00       	lock addl $0x0,(%esp)
  55:	ba 78 56 34 12       	mov    $0x12345678,%edx
  5a:	b8 05 00 cd ab       	mov    $0xabcd0005,%eax
  5f:	e8 fc ff ff ff       	call   60 <try_iowrite32+0x60>
  64:	ba 78 56 34 12       	mov    $0x12345678,%edx
  69:	b8 06 00 cd ab       	mov    $0xabcd0006,%eax
  6e:	e8 fc ff ff ff       	call   6f <try_iowrite32+0x6f>
  73:	f0 83 04 24 00       	lock addl $0x0,(%esp)
  78:	c3                   	ret
  79:	00 00                	add    %al,(%eax)
	...

Disassembly of section .altinstr_replacement:

00000000 <.altinstr_replacement>:
   0:	0f ae f8             	sfence
   3:	0f ae e8             	lfence
   6:	0f ae e8             	lfence

$ readelf -r minimodule.ko

Relocation section '.rel.text' at offset 0xc3e0 contains 7 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000000b  00002402 R_386_PC32        00000000   iowrite32
0000001a  00002402 R_386_PC32        00000000   iowrite32
00000029  00002402 R_386_PC32        00000000   iowrite32
00000038  00002402 R_386_PC32        00000000   iowrite32
0000004c  00002402 R_386_PC32        00000000   iowrite32
00000060  00002402 R_386_PC32        00000000   iowrite32
0000006f  00002402 R_386_PC32        00000000   iowrite32

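So it’s in essence the same, only the mcount() call in the beginning was skipped. The “lock addl $0x0,(%esp)” instructions are fallback barriers for old CPUs that lack the fence instructions; the kernel’s alternatives mechanism patches them at boot with the sfence and lfence listed in .altinstr_replacement. Note that there are three of them: with this configuration, the smp_rmb() after the last iowrite32() produced a barrier as well.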

The related sources for iowrite32() on ARM

These are the key excerpts from arch/arm/include/asm/io.h:

static inline void __raw_writel(u32 val, volatile void __iomem *addr)
{
	asm volatile("str %1, %0"
		     : "+Qo" (*(volatile u32 __force *)addr)
		     : "r" (val));
}
...
#define writel_relaxed(v,c)	__raw_writel((__force u32) cpu_to_le32(v),c)
...
#define writel(v,c)		({ __iowmb(); writel_relaxed(v,c); })
...
#define iowrite32(v,p)	({ __iowmb(); __raw_writel((__force __u32)cpu_to_le32(v), p); })

As for __iowmb(), it goes

/* IO barriers */
#ifdef CONFIG_ARM_DMA_MEM_BUFFERABLE
#include <asm/barrier.h>
#define __iormb()		rmb()
#define __iowmb()		wmb()
#else
#define __iormb()		do { } while (0)
#define __iowmb()		do { } while (0)
#endif

so it’s down to the configuration if __iowmb() does something. And to get the full picture, these are snips from arch/arm/include/asm/barrier.h:

#if __LINUX_ARM_ARCH__ >= 7
#define isb(option) __asm__ __volatile__ ("isb " #option : : : "memory")
#define dsb(option) __asm__ __volatile__ ("dsb " #option : : : "memory")
#define dmb(option) __asm__ __volatile__ ("dmb " #option : : : "memory")
...
#ifdef CONFIG_ARCH_HAS_BARRIERS
#include <mach/barriers.h>
#elif defined(CONFIG_ARM_DMA_MEM_BUFFERABLE) || defined(CONFIG_SMP)
#define mb()		do { dsb(); outer_sync(); } while (0)
#define rmb()		dsb()
#define wmb()		do { dsb(st); outer_sync(); } while (0)
#else
#define mb()		barrier()
#define rmb()		barrier()
#define wmb()		barrier()
#endif

Results on ARM

This is what the same module compiled for ARM Cortex A9, Little Endian gives (I’ve added extra newlines in the middle for clarity):

minimodule.o:     file format elf32-littlearm

Disassembly of section .text:

00000000 <try_iowrite32>:
   0:	e92d4038 	push	{r3, r4, r5, lr}

   4:	f57ff04e 	dsb	st
   8:	e59f2118 	ldr	r2, [pc, #280]	; 128 <try_iowrite32+0x128>
   c:	e1a04002 	mov	r4, r2
  10:	e5923018 	ldr	r3, [r2, #24]
  14:	e3530000 	cmp	r3, #0
  18:	0a000000 	beq	20 <try_iowrite32+0x20>
  1c:	e12fff33 	blx	r3
  20:	e59f3104 	ldr	r3, [pc, #260]	; 12c <try_iowrite32+0x12c>
  24:	e59f1104 	ldr	r1, [pc, #260]	; 130 <try_iowrite32+0x130>
  28:	e5831678 	str	r1, [r3, #1656]	; 0x678

  2c:	f57ff04e 	dsb	st
  30:	e5942018 	ldr	r2, [r4, #24]
  34:	e1a05001 	mov	r5, r1
  38:	e1a04003 	mov	r4, r3
  3c:	e3520000 	cmp	r2, #0
  40:	0a000000 	beq	48 <try_iowrite32+0x48>
  44:	e12fff32 	blx	r2
  48:	e5845678 	str	r5, [r4, #1656]	; 0x678

  4c:	f57ff04e 	dsb	st
  50:	e59f20d0 	ldr	r2, [pc, #208]	; 128 <try_iowrite32+0x128>
  54:	e1a04002 	mov	r4, r2
  58:	e5923018 	ldr	r3, [r2, #24]
  5c:	e3530000 	cmp	r3, #0
  60:	0a000000 	beq	68 <try_iowrite32+0x68>
  64:	e12fff33 	blx	r3
  68:	e59f30bc 	ldr	r3, [pc, #188]	; 12c <try_iowrite32+0x12c>
  6c:	e59f20c0 	ldr	r2, [pc, #192]	; 134 <try_iowrite32+0x134>
  70:	e5832678 	str	r2, [r3, #1656]	; 0x678

  74:	f57ff04e 	dsb	st
  78:	e5942018 	ldr	r2, [r4, #24]
  7c:	e1a04003 	mov	r4, r3
  80:	e3520000 	cmp	r2, #0
  84:	0a000000 	beq	8c <try_iowrite32+0x8c>
  88:	e12fff32 	blx	r2
  8c:	e59f30a4 	ldr	r3, [pc, #164]	; 138 <try_iowrite32+0x138>
  90:	e5843678 	str	r3, [r4, #1656]	; 0x678

  94:	f57ff04e 	dsb	st
  98:	e59f2088 	ldr	r2, [pc, #136]	; 128 <try_iowrite32+0x128>
  9c:	e1a04002 	mov	r4, r2
  a0:	e5923018 	ldr	r3, [r2, #24]
  a4:	e3530000 	cmp	r3, #0
  a8:	0a000000 	beq	b0 <try_iowrite32+0xb0>
  ac:	e12fff33 	blx	r3

  b0:	f57ff04e 	dsb	st
  b4:	e5943018 	ldr	r3, [r4, #24]
  b8:	e3530000 	cmp	r3, #0
  bc:	0a000000 	beq	c4 <try_iowrite32+0xc4>
  c0:	e12fff33 	blx	r3
  c4:	e59f3060 	ldr	r3, [pc, #96]	; 12c <try_iowrite32+0x12c>
  c8:	e59f206c 	ldr	r2, [pc, #108]	; 13c <try_iowrite32+0x13c>
  cc:	e5832678 	str	r2, [r3, #1656]	; 0x678
  d0:	f57ff04f 	dsb	sy

  d4:	f57ff04e 	dsb	st
  d8:	e59f1048 	ldr	r1, [pc, #72]	; 128 <try_iowrite32+0x128>
  dc:	e1a04003 	mov	r4, r3
  e0:	e1a05001 	mov	r5, r1
  e4:	e5912018 	ldr	r2, [r1, #24]
  e8:	e3520000 	cmp	r2, #0
  ec:	0a000000 	beq	f4 <try_iowrite32+0xf4>
  f0:	e12fff32 	blx	r2
  f4:	e59f3044 	ldr	r3, [pc, #68]	; 140 <try_iowrite32+0x140>
  f8:	e5843678 	str	r3, [r4, #1656]	; 0x678
  fc:	f57ff05a 	dmb	ishst

 100:	f57ff04e 	dsb	st
 104:	e5953018 	ldr	r3, [r5, #24]
 108:	e3530000 	cmp	r3, #0
 10c:	0a000000 	beq	114 <try_iowrite32+0x114>
 110:	e12fff33 	blx	r3
 114:	e59f3010 	ldr	r3, [pc, #16]	; 12c <try_iowrite32+0x12c>
 118:	e59f2024 	ldr	r2, [pc, #36]	; 144 <try_iowrite32+0x144>
 11c:	e5832678 	str	r2, [r3, #1656]	; 0x678
 120:	f57ff05b 	dmb	ish
 124:	e8bd8038 	pop	{r3, r4, r5, pc}
 128:	00000000 	.word	0x00000000
 12c:	12345000 	.word	0x12345000
 130:	abcd0001 	.word	0xabcd0001
 134:	abcd0002 	.word	0xabcd0002
 138:	abcd0003 	.word	0xabcd0003
 13c:	abcd0004 	.word	0xabcd0004
 140:	abcd0005 	.word	0xabcd0005
 144:	abcd0006 	.word	0xabcd0006

This was a lot of code (somehow that’s what you get with ARM). There are no calls to iowrite32(), so this is done inline for ARM (consistent with the sources).

This requires some translation from ARM opcodes to human language (taken from this page):

  • DSB SY — Data Synchronization Barrier: No instruction in program order after this instruction executes until all explicit memory accesses before this instruction complete, as well as all cache, branch predictor and TLB maintenance operations before this instruction complete.
  • DSB ST — Like DSB SY, but waits only for data writes to complete.
  • DMB ISHST — Data Memory Barrier, operation that waits only for stores to complete, and only to the inner shareable domain (whatever that “inner shareable domain” is).
  • DMB ISH — Data Memory Barrier, operation that waits only to the inner shareable domain.
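
For those who prefer to read the raw words in the listing: the barrier instructions differ only in their lowest byte. Below is a small sketch of a decoder, based on the four encodings that appear in the disassembly (f57ff04e, f57ff04f, f57ff05a and f57ff05b). The function names are made up, and the option table deliberately covers only the options seen here.

```c
#include <stdint.h>
#include <stddef.h>

/* In the 32-bit ARM encoding, the upper 24 bits of these barrier
 * instructions are fixed; bits [7:4] select the barrier type and
 * bits [3:0] hold the option. */
static const char *barrier_mnemonic(uint32_t insn)
{
	if ((insn & 0xffffff00u) != 0xf57ff000u)
		return NULL;		/* not a barrier instruction */

	switch ((insn >> 4) & 0xf) {
	case 0x4: return "dsb";
	case 0x5: return "dmb";
	default:  return NULL;		/* not covered by this sketch */
	}
}

/* Only the options that occur in the listing are decoded. */
static const char *barrier_option(uint32_t insn)
{
	switch (insn & 0xf) {
	case 0xf: return "sy";
	case 0xe: return "st";
	case 0xb: return "ish";
	case 0xa: return "ishst";
	default:  return NULL;
	}
}
```

Feeding it 0xf57ff05a, for example, yields “dmb” and “ishst”, matching what objdump printed.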

Now let’s decipher the assembly code, which is quite tangled. Luckily, it’s easy to spot the seven write operations as the seven “str” commands in the assembly code. It’s also easy to see that each iowrite32() starts with a “dsb st”, which forces waiting until previous writes have completed. So each iowrite32() spans from a “dsb st” to a “str”. This matches the definition of iowrite32() as __iowmb() and then __raw_writel(…).

The memory barriers are quite clear too:

  • wmb() becomes “dsb st”, the full synchronization barrier for writes (which is also issued automatically before each iowrite32).
  • rmb() becomes “dsb sy”, the full synchronization barrier for reads and writes
  • smp_wmb() becomes “dmb ishst”, the “inner shareable domain” memory barrier for writes
  • smp_rmb() becomes “dmb ish”, the “inner shareable domain” memory barrier for reads and writes

Now with writel()

So I thought it would be nice to repeat all this with writel(). Spoiler: Nothing thrilling happens here.

Module code (includes omitted):

void try_writel(void) {
  void __iomem *p = (void *) 0x12345678;

  writel(0xabcd0001, p);
  writel(0xabcd0001, p);
  writel(0xabcd0002, p);
  mmiowb();
  writel(0xabcd0003, p);
  wmb();
  writel(0xabcd0004, p);
  rmb();
  writel(0xabcd0005, p);
  smp_wmb();
  writel(0xabcd0006, p);
  smp_rmb();
}

EXPORT_SYMBOL(try_writel);

Assembly on 64-bit Intel:

minimodule.ko:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <try_writel>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	e8 00 00 00 00       	callq  9 <try_writel+0x9>
   9:	b8 01 00 cd ab       	mov    $0xabcd0001,%eax
   e:	89 04 25 78 56 34 12 	mov    %eax,0x12345678
  15:	89 04 25 78 56 34 12 	mov    %eax,0x12345678
  1c:	b8 02 00 cd ab       	mov    $0xabcd0002,%eax
  21:	89 04 25 78 56 34 12 	mov    %eax,0x12345678
  28:	b8 03 00 cd ab       	mov    $0xabcd0003,%eax
  2d:	89 04 25 78 56 34 12 	mov    %eax,0x12345678
  34:	0f ae f8             	sfence
  37:	b8 04 00 cd ab       	mov    $0xabcd0004,%eax
  3c:	89 04 25 78 56 34 12 	mov    %eax,0x12345678
  43:	0f ae e8             	lfence
  46:	b8 05 00 cd ab       	mov    $0xabcd0005,%eax
  4b:	89 04 25 78 56 34 12 	mov    %eax,0x12345678
  52:	b8 06 00 cd ab       	mov    $0xabcd0006,%eax
  57:	89 04 25 78 56 34 12 	mov    %eax,0x12345678
  5e:	c9                   	leaveq
  5f:	c3                   	retq

OK, so writel() just translated into a couple of inline “mov” opcodes. There’s even an optimization between the first and second move, so %eax isn’t set twice. Hi-tech, I’m telling you.
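
Note what was not optimized away: the second, identical store is still there. A userspace stand-in shows the same effect. This is only a sketch and not the kernel's implementation of writel(): it uses a plain C volatile store, and the names are made up for illustration.

```c
#include <stdint.h>

/* Userspace stand-in for writel(): a volatile 32-bit store. Because the
 * access is volatile, the compiler must emit every store, in order. */
static inline void sketch_writel(uint32_t val, volatile uint32_t *addr)
{
	*addr = val;
}

/* Same pattern as the beginning of try_writel(), but aimed at an
 * ordinary variable instead of a device register. Returns the value
 * left in the fake register. */
static uint32_t sketch_try_writel(void)
{
	volatile uint32_t fake_reg = 0;

	sketch_writel(0xabcd0001, &fake_reg);
	sketch_writel(0xabcd0001, &fake_reg);	/* repeated, yet not dropped */
	sketch_writel(0xabcd0002, &fake_reg);

	return fake_reg;
}
```

Compiling this with optimization and disassembling it should show three stores, just like the listing above. The value being written may be reused from a register, but the stores themselves must all be issued.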

And on 32-bit Intel:

minimodule.ko:     file format elf32-i386

Disassembly of section .text:

00000000 <try_writel>:
   0:	b8 01 00 cd ab       	mov    $0xabcd0001,%eax
   5:	a3 78 56 34 12       	mov    %eax,0x12345678
   a:	a3 78 56 34 12       	mov    %eax,0x12345678
   f:	b0 02                	mov    $0x2,%al
  11:	a3 78 56 34 12       	mov    %eax,0x12345678
  16:	b0 03                	mov    $0x3,%al
  18:	a3 78 56 34 12       	mov    %eax,0x12345678
  1d:	f0 83 04 24 00       	lock addl $0x0,(%esp)
  22:	b0 04                	mov    $0x4,%al
  24:	a3 78 56 34 12       	mov    %eax,0x12345678
  29:	f0 83 04 24 00       	lock addl $0x0,(%esp)
  2e:	b0 05                	mov    $0x5,%al
  30:	a3 78 56 34 12       	mov    %eax,0x12345678
  35:	b0 06                	mov    $0x6,%al
  37:	a3 78 56 34 12       	mov    %eax,0x12345678
  3c:	f0 83 04 24 00       	lock addl $0x0,(%esp)
  41:	c3                   	ret
	...

Disassembly of section .altinstr_replacement:

00000000 <.altinstr_replacement>:
   0:	0f ae f8             	sfence
   3:	0f ae e8             	lfence
   6:	0f ae e8             	lfence

And for ARM, it’s exactly the same code (to the byte), since iowrite32() and writel() expand to the same thing in the sources shown above. But I listed it here anyhow for those who don’t take my word for it:

minimodule.o:     file format elf32-littlearm

Disassembly of section .text:

00000000 <try_writel>:
   0:	e92d4038 	push	{r3, r4, r5, lr}
   4:	f57ff04e 	dsb	st
   8:	e59f2118 	ldr	r2, [pc, #280]	; 128 <try_writel+0x128>
   c:	e1a04002 	mov	r4, r2
  10:	e5923018 	ldr	r3, [r2, #24]
  14:	e3530000 	cmp	r3, #0
  18:	0a000000 	beq	20 <try_writel+0x20>
  1c:	e12fff33 	blx	r3
  20:	e59f3104 	ldr	r3, [pc, #260]	; 12c <try_writel+0x12c>
  24:	e59f1104 	ldr	r1, [pc, #260]	; 130 <try_writel+0x130>
  28:	e5831678 	str	r1, [r3, #1656]	; 0x678
  2c:	f57ff04e 	dsb	st
  30:	e5942018 	ldr	r2, [r4, #24]
  34:	e1a05001 	mov	r5, r1
  38:	e1a04003 	mov	r4, r3
  3c:	e3520000 	cmp	r2, #0
  40:	0a000000 	beq	48 <try_writel+0x48>
  44:	e12fff32 	blx	r2
  48:	e5845678 	str	r5, [r4, #1656]	; 0x678
  4c:	f57ff04e 	dsb	st
  50:	e59f20d0 	ldr	r2, [pc, #208]	; 128 <try_writel+0x128>
  54:	e1a04002 	mov	r4, r2
  58:	e5923018 	ldr	r3, [r2, #24]
  5c:	e3530000 	cmp	r3, #0
  60:	0a000000 	beq	68 <try_writel+0x68>
  64:	e12fff33 	blx	r3
  68:	e59f30bc 	ldr	r3, [pc, #188]	; 12c <try_writel+0x12c>
  6c:	e59f20c0 	ldr	r2, [pc, #192]	; 134 <try_writel+0x134>
  70:	e5832678 	str	r2, [r3, #1656]	; 0x678
  74:	f57ff04e 	dsb	st
  78:	e5942018 	ldr	r2, [r4, #24]
  7c:	e1a04003 	mov	r4, r3
  80:	e3520000 	cmp	r2, #0
  84:	0a000000 	beq	8c <try_writel+0x8c>
  88:	e12fff32 	blx	r2
  8c:	e59f30a4 	ldr	r3, [pc, #164]	; 138 <try_writel+0x138>
  90:	e5843678 	str	r3, [r4, #1656]	; 0x678
  94:	f57ff04e 	dsb	st
  98:	e59f2088 	ldr	r2, [pc, #136]	; 128 <try_writel+0x128>
  9c:	e1a04002 	mov	r4, r2
  a0:	e5923018 	ldr	r3, [r2, #24]
  a4:	e3530000 	cmp	r3, #0
  a8:	0a000000 	beq	b0 <try_writel+0xb0>
  ac:	e12fff33 	blx	r3
  b0:	f57ff04e 	dsb	st
  b4:	e5943018 	ldr	r3, [r4, #24]
  b8:	e3530000 	cmp	r3, #0
  bc:	0a000000 	beq	c4 <try_writel+0xc4>
  c0:	e12fff33 	blx	r3
  c4:	e59f3060 	ldr	r3, [pc, #96]	; 12c <try_writel+0x12c>
  c8:	e59f206c 	ldr	r2, [pc, #108]	; 13c <try_writel+0x13c>
  cc:	e5832678 	str	r2, [r3, #1656]	; 0x678
  d0:	f57ff04f 	dsb	sy
  d4:	f57ff04e 	dsb	st
  d8:	e59f1048 	ldr	r1, [pc, #72]	; 128 <try_writel+0x128>
  dc:	e1a04003 	mov	r4, r3
  e0:	e1a05001 	mov	r5, r1
  e4:	e5912018 	ldr	r2, [r1, #24]
  e8:	e3520000 	cmp	r2, #0
  ec:	0a000000 	beq	f4 <try_writel+0xf4>
  f0:	e12fff32 	blx	r2
  f4:	e59f3044 	ldr	r3, [pc, #68]	; 140 <try_writel+0x140>
  f8:	e5843678 	str	r3, [r4, #1656]	; 0x678
  fc:	f57ff05a 	dmb	ishst
 100:	f57ff04e 	dsb	st
 104:	e5953018 	ldr	r3, [r5, #24]
 108:	e3530000 	cmp	r3, #0
 10c:	0a000000 	beq	114 <try_writel+0x114>
 110:	e12fff33 	blx	r3
 114:	e59f3010 	ldr	r3, [pc, #16]	; 12c <try_writel+0x12c>
 118:	e59f2024 	ldr	r2, [pc, #36]	; 144 <try_writel+0x144>
 11c:	e5832678 	str	r2, [r3, #1656]	; 0x678
 120:	f57ff05b 	dmb	ish
 124:	e8bd8038 	pop	{r3, r4, r5, pc}
 128:	00000000 	.word	0x00000000
 12c:	12345000 	.word	0x12345000
 130:	abcd0001 	.word	0xabcd0001
 134:	abcd0002 	.word	0xabcd0002
 138:	abcd0003 	.word	0xabcd0003
 13c:	abcd0004 	.word	0xabcd0004
 140:	abcd0005 	.word	0xabcd0005
 144:	abcd0006 	.word	0xabcd0006

SOLVED: Lenovo Yoga 2 13″ with “hardware-disabled” Wifi

Overview

Having a Lenovo Yoga 2 13″ (non-pro) running Ubuntu 14.04.1, I couldn’t get Wireless LAN up and running, as the WLAN NIC appeared to be “hardware locked”. This is the summary of how I solved this issue. If you’re not interested in the gory details, you may jump right to the bottom, where I offer a replacement module that fixes it. At least for me.

Environment details: Distribution kernel 3.13.0-32-generic on an Intel i5-4210U CPU @ 1.70GHz. The Wifi device is an Intel Dual Band Wireless-AC 7260 (8086:08b1) connected to the PCIe bus, taken care of by the iwlwifi driver.

The problem

Laptops have a mechanism for working in “flight mode” which means turning off any device that could emit RF power, so that the airplane can crash for whatever different reason. Apparently, some laptops have a physical on-off switch to request this, but on Lenovo Yoga 13, the arrangement is to press a button on the keyboard with an airplane drawn on it. The one shared with F7.

It seems that on the Lenovo Yoga 13, the ACPI interface, which is responsible for reporting the state of the Wifi button, always reports that it’s in flight mode. So Linux turns off Wifi, and the desktop’s Gnome network applet says “Wi-Fi is disabled by hardware switch”.

In the dmesg log one can tell the problem with a line like

iwlwifi 0000:01:00.0: RF_KILL bit toggled to disable radio.

which is issued by the interrupt request handler defined in drivers/net/wireless/iwlwifi/pcie/rx.c, which responds to an interrupt from the device that informs the host that the hardware RF kill bit is set. So the iwlwifi module is not to blame here — it just responds to a request from the ACPI subsystem.

rfkill

The management of RF-related devices is handled by the rfkill subsystem. On my laptop, before solving the problem, a typical output went

$ rfkill list all
0: ideapad_wlan: Wireless LAN
        Soft blocked: yes
        Hard blocked: yes
1: ideapad_bluetooth: Bluetooth
        Soft blocked: no
        Hard blocked: yes
6: hci0: Bluetooth
        Soft blocked: no
        Hard blocked: no
7: phy1: Wireless LAN
        Soft blocked: yes
        Hard blocked: yes

So there are different entities that can be controlled with rfkill, each enumerated and assigned soft and hard blocks. Each of these relates to a directory in /sys/class/rfkill/. For example, the last device, “phy1”, enumerated as 7, corresponds to /sys/class/rfkill/rfkill7, where the “hard” and “soft” pseudo-files signify the status with “0” or “1” values.

The soft block can be changed by “rfkill unblock 0” or “rfkill unblock 7”, but this doesn’t really help with the hardware block. Both have to be “off” to use the device.

As can easily be seen from the rfkill list above, each of the physical devices is registered twice as an rfkill device: Once by its driver, and a second time by the ideapad_laptop driver. This will be used in the solution below.
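
The mapping from an rfkill index to its sysfs pseudo-files is simple enough to script. Here is a sketch in C of reading the “hard” or “soft” flag for a given index, assuming the /sys/class/rfkill/rfkillN layout described above. The function names are made up for illustration.

```c
#include <stdio.h>

/* Build "/sys/class/rfkill/rfkill<index>/<attr>" into buf.
 * Returns 0 on success, -1 if buf is too small. */
static int rfkill_attr_path(char *buf, size_t len, unsigned int index,
			    const char *attr)
{
	int n = snprintf(buf, len, "/sys/class/rfkill/rfkill%u/%s",
			 index, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read a "0"/"1" pseudo-file such as "hard" or "soft".
 * Returns 0 or 1, or -1 if the file can't be read or parsed. */
static int rfkill_read_flag(unsigned int index, const char *attr)
{
	char path[128];
	int value;
	FILE *f;

	if (rfkill_attr_path(path, sizeof(path), index, attr))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);

	return (value == 0 || value == 1) ? value : -1;
}
```

On the laptop above, before the fix, rfkill_read_flag(7, "hard") would be expected to return 1 for the phy1 device, in line with the “Hard blocked: yes” in the rfkill output.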

The ideapad_laptop module

The ideapad-laptop module is responsible for talking with the ACPI layer on machines that match “VPC2004” as a platform (as in /sys/devices/platform/VPC2004:00, or /sys/bus/acpi/devices/VPC2004:00, but doesn’t fit anything found in /sys/class/dmi/id/).

Blacklisting this module has been suggested for Yoga laptops all over the web. In particular this post suggests to insmod the module once with a hack that forces the Wifi on, and then blacklist it.

But by blacklisting ideapad-laptop, the computer loses some precious functionality, including disabling Wifi and the touchpad by pressing a button. So this is not an appealing solution.

Ideapad’s two debugfs output files go:

# cat /sys/kernel/debug/ideapad/cfg
cfg: 0x017DE014

Capability: Bluetooth Wireless Camera
Graphic:
# cat /sys/kernel/debug/ideapad/status
Backlight max:	16
Backlight now:	9
BL power value:	On
=====================
Radio status:	Off(0)
Wifi status:	Off(0)
BT status:	On(1)
3G status:	Off(0)
=====================
Touchpad status:Off(0)
Camera status:	On(1)

So the Radio and Wifi statuses, which are read from the ACPI registers, are off. This makes the ideapad_laptop module conclude that everything should go off.

The solution

In essence, the solution for the problem is to take the ideapad_laptop’s hands off the Wifi hardware, except for turning the hardware block off when it’s loaded. It consists of making the following changes in drivers/platform/x86/ideapad-laptop.c:

  • First, remove the driver’s rfkill registration. Somewhere at the beginning of the file, change
    #define IDEAPAD_RFKILL_DEV_NUM	(3)

    to

    #define IDEAPAD_RFKILL_DEV_NUM	(2)

    and in the definition of ideapad_rfk_data[], remove the line saying

    { "ideapad_wlan", CFG_WIFI_BIT, VPCCMD_W_WIFI, RFKILL_TYPE_WLAN }

    This prevents the driver from presenting an rfkill interface, so it keeps its hands off.

  • There is however a chance that the relevant bit in the ACPI layer already has the hardware block on. So let’s turn it off every time the driver loads. In ideapad_acpi_add(), after the call to ideapad_sync_rfk_state(), more or less, add the following two lines:
    pr_warn("Hack: Forcing WLAN hardware block off\n");
    write_ec_cmd(priv->adev->handle, VPCCMD_W_WIFI, 1);
  • And finally, solve a rather bizarre phenomenon: when reading the RF state with a VPCCMD_R_RF command, the Wifi interface is hardware blocked for some reason. Note that radio is always in off mode, so it’s a meaningless register on Yoga 2. This is handled in two places. First, empty ideapad_sync_rfk_state() completely, by turning it into
    static void ideapad_sync_rfk_state(struct ideapad_private *priv)
    {
    }

    This function reads VPCCMD_R_RF and calls rfkill_set_hw_state() accordingly, but on Yoga 2 it will always block everything, so what’s the point?
    Next, in debugfs_status_show() which prints out /sys/kernel/debug/ideapad/status, remove the following three lines:

    if (!read_ec_data(priv->adev->handle, VPCCMD_R_RF, &value))
      seq_printf(s, "Radio status:\t%s(%lu)\n",
        value ? "On" : "Off", value);

With these changes made, the Wifi works properly, regardless of whether it was previously reported as hardware blocked.

This can’t be submitted as a patch to the kernel, because presumably some laptops need the rfkill interface for Wifi through ideapad_laptop (or else, why was it put there in the first place?).

Also, maybe I should have done this for Bluetooth too? Don’t know. I don’t use Bluetooth right now, and the desktop applet seems to say all is fine with it anyhow.

Download the driver fix

For the lazy ones, I’ve prepared a little kit for compiling the relevant driver. I’ve taken the driver as it appears in kernel 3.16, more or less, and applied the changes above. And I then added a Makefile to make it compile easily. Since the kernel API changes rather rapidly, this will probably work well for kernels around 3.16 (that includes 3.13), and then you’ll have to apply the changes manually. If it isn’t fixed in the kernel itself by then.

Download it from here, unzip it, change directory, and compile it by typing “make”. This works only if you have the kernel headers and the gcc compiler installed, which is usually the case in recent distributions. So a session like this is expected:

$ make
make -C /lib/modules/3.13.0-32-generic/build SUBDIRS=/home/eli/yoga-wifi-fix modules
make[1]: Entering directory `/usr/src/linux-headers-3.13.0-32-generic'
  CC [M]  /home/eli/yoga-wifi-fix/ideapad-laptop.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /home/eli/yoga-wifi-fix/ideapad-laptop.mod.o
  LD [M]  /home/eli/yoga-wifi-fix/ideapad-laptop.ko
make[1]: Leaving directory `/usr/src/linux-headers-3.13.0-32-generic'

Then replace the module that the kernel uses with the fresh ideapad-laptop.ko. First, let’s figure out where it goes. The modinfo command helps here:

$ modinfo ideapad_laptop
filename:       /lib/modules/3.13.0-32-generic/kernel/drivers/platform/x86/ideapad-laptop.ko
license:        GPL
description:    IdeaPad ACPI Extras
author:         David Woodhouse <dwmw2@infradead.org>
srcversion:     BA339D663FA3B10105A1DC0
alias:          acpi*:VPC2004:*
depends:        sparse-keymap
vermagic:       3.13.0-32-generic SMP mod_unload modversions
parm:           no_bt_rfkill:No rfkill for bluetooth. (bool)

So the directory is now known (it’s the path on the “filename” line). This leaves us with copying the module into the right place:

$ sudo cp ideapad-laptop.ko /lib/modules/3.13.0-32-generic/kernel/drivers/platform/x86/

The new module takes effect on the next reboot. Or on the next insmod/modprobe, if you have the same allergy as myself regarding rebooting a Linux system.
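
The lookup of the target directory can be scripted. What follows is only a sketch: the function name module_dir_from_modinfo is made up for illustration, and all it does is parse the “filename:” line of modinfo’s output, as shown in the session above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the downloadable kit): read modinfo output
# on stdin and print the directory of the file named on the "filename:" line.
module_dir_from_modinfo() {
    awk '$1 == "filename:" { print $2; exit }' | sed 's:/[^/]*$::'
}
```

With this in place, the copy could go something like sudo cp ideapad-laptop.ko "$(modinfo ideapad_laptop | module_dir_from_modinfo)/", followed by sudo rmmod ideapad_laptop and sudo modprobe ideapad_laptop to avoid the reboot.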

Thunderbird / Linux: Re-sending a sent mail

The idea is to take a mail that has already been sent (and is hence in the “Sent” folder) and send it again with sendmail. Why? In my case, Thunderbird and sendmail connect to different relay servers, and the one used by Thunderbird 3.0.7 is blacklisted by the destination (I got a reject message).

It’s simple: Find the message in Thunderbird’s “Sent” folder, and save it as an .eml file, say, Trying.eml.

Possibly edit the file, and remove the first three lines (even though there’s probably no problem leaving them there):

X-Mozilla-Status: 0001
X-Mozilla-Status2: 00800000
X-Mozilla-Keys:

Possibly add yourself as a Bcc: after the From: line with

Bcc: Myself <myself@example.com>

And then send the message with

$ sendmail -t < Trying.eml

The -t flag tells sendmail to take the recipients’ addresses from the message’s headers (To:, Cc: and Bcc:), which is usually what we want.
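
The two optional editing steps can also be done with a filter instead of a text editor. This is a rough sketch under my own assumptions: prepare_eml is a made-up name, it touches only the header part of the message (everything before the first empty line), and the Bcc address is whatever is passed as the first argument.

```shell
#!/bin/bash
# Sketch: filter an .eml file from stdin to stdout. Drops the X-Mozilla-*
# header lines and, if an address is given as $1, adds a Bcc: line right
# after the From: header. The message body is passed through untouched.
prepare_eml() {
    local bcc="${1:-}"
    awk -v bcc="$bcc" '
        BEGIN { in_headers = 1 }
        in_headers && /^X-Mozilla-(Status|Status2|Keys):/ { next }
        in_headers && /^$/ { in_headers = 0 }
        { print }
        in_headers && bcc != "" && /^From:/ { print "Bcc: " bcc }
    '
}
```

A possible use is prepare_eml 'Myself <myself@example.com>' < Trying.eml | sendmail -t, which leaves the saved file as it was.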

i.MX: SDMA not working? Strange things happen? Maybe it’s all about power management.

I ran into a weird problem while attempting to enable SDMA for UARTs on an i.MX53 processor running Freescale’s 2.6.35.3 Linux kernel: To begin with, the UART would only transmit 48 bytes, which is probably a result of only one watermark event arriving (the initial kickoff filled the UART’s FIFO with 32 bytes, and then one SDMA event occurred when the FIFO reached 16 bytes’ fill, so another 16 bytes were sent).

So it seemed like the SDMA core missed the UART’s watermark events. Closer experiments with my own test scripts revealed a variety of weird behaviors, including what appeared to be preemption of the SDMA script’s process, even though the reference manual is quite clear about it: Context switching of SDMA scripts is voluntary. And still, the flow of data on the UART’s tx lines stopped for 5-6 ms periods at random, even when I ran a busy-wait loop in the SDMA script, polling the “not full” flag of the UART’s transmission FIFO.

So it looked like something stopped the SDMA script from running in the middle of the loop (which included no “yield” nor “done” command). Or maybe a completely different issue? Maybe the peripheral bus wasn’t completely coherent? Anything seemed possible at some point.

As the title implies, the problem was power management, and poor settings of the SDMA’s behavior during low power modes.

It goes like this: Every time the Linux kernel’s scheduler has no process to run, it executes a WFI ARM instruction, halting the processor until an interrupt arrives (from a peripheral or just the scheduler’s tick clock). But before doing that, the kernel calls an architecture-dependent function, arch_idle(), which possibly shuts down or slows down clocks in order to increase power savings.

The kernel I used didn’t configure the SDMA’s behavior in the low-power WAIT mode correctly, causing it to halt and miss events while the processor was in this mode. The word is that to overcome this, the CCM_CCGR bits for SDMA clocks should be set to 11 (bits 31-30 in CCM_CCGR4). There is probably also a need to enable aips_tz1_clk to keep the SDMA and aips_tz1 clocks running. But since the application I worked on didn’t have any power restrictions, I decided to avoid these power mode switches altogether.

This was done by editing arch/arm/mach-mx5/system.c in the kernel tree, where it said:

void arch_idle(void)
{
 if (likely(!mxc_jtag_enabled)) {
   if (ddr_clk == NULL)
     ddr_clk = clk_get(NULL, "ddr_clk");
   if (gpc_dvfs_clk == NULL)
     gpc_dvfs_clk = clk_get(NULL, "gpc_dvfs_clk");
   /* gpc clock is needed for SRPG */
   clk_enable(gpc_dvfs_clk);
   mxc_cpu_lp_set(arch_idle_mode);

and delete the last line in the listing above — the call to mxc_cpu_lp_set(), which changes the processor’s power mode.

This solved the SDMA problem for me.

As a matter of fact, I would suggest commenting out this line during the development phase of any i.MX-based system, and restoring it once everything works. True, this shouldn’t be an issue if the clocks are properly configured. But if they’re not, something will fail, and the natural tendency is to focus on the drivers of the failing functionality, rather than to look for power management issues.

When the power reduction function is re-enabled at some later point, it’s quite evident what the problem is, if something fails then. So even if the target product is battery-driven, do yourself a favor, and drop that line in system.c until you’re finished struggling with other things.
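
As for the alternative fix mentioned above (setting bits 31-30 of CCM_CCGR4 to 11), the bit arithmetic is simple enough to sketch. Note that this is just a calculation of the register’s value: The function name is made up, nothing is written to hardware, and I haven’t tested that fix myself.

```shell
#!/bin/bash
# Sketch only: given the current value of CCM_CCGR4, print the value with
# bits 31-30 set to binary 11. This does not touch the hardware; the actual
# register write belongs in kernel code.
ccgr4_with_sdma_bits() {
    local current=$(( $1 ))
    local mask=$(( 3 << 30 ))    # bits 31-30, i.e. 0xC0000000
    printf '0x%08X\n' $(( (current | mask) & 0xFFFFFFFF ))
}
```

For example, ccgr4_with_sdma_bits 0x3FFF0000 prints 0xFFFF0000: Only the two uppermost bits change, and all other clock gating fields are left as they were.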

Simple GPIO on Zybo using command-line on Linux

Running Xillinux on the Zybo board, this is how I toggled a GPIO pin from a plain one-liner bash script in Linux. The same technique can be used for other Zynq-7000 boards (Zedboard in particular) to easily control GPIO pins.

First, I looked up which GPIO pin it is. The pin assignments can be found in the FPGA bundle, in xillydemo.ucf (or in xillydemo.sdc, if Vivado was used to build the project).

So I chose to connect to PMOD header JB, first pin, and the PMOD’s GND.

In the UCF file there’s a line saying

## Pmod Header JB
NET PS_GPIO[32]       LOC=T20 | IOSTANDARD=LVCMOS33; #IO_L15P_T2_DQS_34

and its counterpart in the SDC file is

## Pmod Header JB
set_property -dict "PACKAGE_PIN T20 IOSTANDARD LVCMOS33" [get_ports "PS_GPIO[32]"]

So it’s quite clear-cut that the PS_GPIO[32] signal is connected to PMOD B. It doesn’t hurt to take a look at the board’s schematics as well, if you’re comfortable with those drawings, and see that the Zynq device’s pin T20 indeed goes to PMOD B, and to which pin.

Hooked up as shown in this pic:

The offset between PS_GPIO numbers and those designated by Linux is 54. So this pin is found as number 32+54=86.

Hence

# echo 86 > /sys/class/gpio/export
# echo out > /sys/class/gpio/gpio86/direction

And then poor man’s oscillator:

# while [ 1 ] ; do echo 1 > /sys/class/gpio/gpio86/value ; echo 0 > /sys/class/gpio/gpio86/value ; done

This runs at a staggering 2.9 kHz. Pretty impressive for the slowest form of programming one can think about.
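
The same steps as reusable shell functions, in case more than one pin is involved. This is a sketch: The function names are my own invention, and the optional argument for the sysfs directory exists only so that the functions can be tried out on a scratch directory.

```shell
#!/bin/bash
# Offset between PS_GPIO indices and Linux GPIO numbers, as stated above.
PS_GPIO_OFFSET=54

# Translate a PS_GPIO index (e.g. 32) into the Linux GPIO number (e.g. 86).
ps_gpio_to_linux() {
    echo $(( $1 + PS_GPIO_OFFSET ))
}

# Export a pin (unless already exported) and set it as an output. The second
# argument, the sysfs root, is a hypothetical knob for trying this out.
gpio_make_output() {
    local num="$1" root="${2:-/sys/class/gpio}"
    [ -d "$root/gpio$num" ] || echo "$num" > "$root/export"
    echo out > "$root/gpio$num/direction"
}

# Drive the pin high and then low, a given number of times.
gpio_pulse() {
    local num="$1" count="$2" root="${3:-/sys/class/gpio}" i=0
    while [ "$i" -lt "$count" ]; do
        echo 1 > "$root/gpio$num/value"
        echo 0 > "$root/gpio$num/value"
        i=$(( i + 1 ))
    done
}
```

So as root, something like pin=$(ps_gpio_to_linux 32) followed by gpio_make_output $pin and gpio_pulse $pin 1000 would produce a burst of 1000 pulses on the same PMOD pin.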

Manually installing launcher icons for Xilinx tools on a Gnome desktop

So I installed Vivado on my Centos 6.5 64-bit Linux machine, and even though it promised to install icons on my desktop, it didn’t. This is how I installed them manually. There is surely a simpler way, as the special launch bash scripts I created must be somewhere. But I didn’t bother looking.

So it consists of generating four files, all in all, as follows.

First, as root, create these two files, and make them executable by all:

/usr/local/bin/run-vivado as follows:

#!/bin/bash
. /opt/Xilinx/Vivado/2014.1/settings64.sh
vivado &

And /usr/local/bin/run-sdk:

#!/bin/bash
. /opt/Xilinx/SDK/2014.1/settings64.sh
xsdk &

The path to Xilinx’ installation is /opt/Xilinx, of course. Adjust this to where your installation was made, and you should pick the settings32.sh file if you’re running on a 32-bit machine.

And next, we have the launchers, both to be placed in the Desktop directory of the ordinary user who should have these on the desktop.

The file named “Vivado 2014.1.desktop” goes

[Desktop Entry]
Version=1.0
Type=Application
Terminal=false
Icon=/opt/Xilinx/Vivado/2014.1/doc/images/vivado_logo.ico
Name[en_US]=Vivado 2014.1
Exec=/usr/local/bin/run-vivado
Path=/home/myself/vivado-outputs/
Name=Vivado 2014.1
StartupNotify=true

and “Xilinx SDK.desktop” is

[Desktop Entry]
Version=1.0
Type=Application
Terminal=false
Icon=/opt/Xilinx/SDK/2014.1/data/sdk/images/sdk_logo.ico
Name[en_US]=Xilinx SDK
Exec=/usr/local/bin/run-sdk
Name=Xilinx SDK
StartupNotify=true

Note the StartupNotify assignment: It’s what makes the mouse pointer turn into “busy” when the program is launched, until the splash window appears. It’s important for Vivado in particular, which takes some time to start up.

Also, the Path assignment in the Vivado launcher sets the directory at which Vivado runs, which should be changed to a directory that exists, and is a convenient place to dump all log files that Vivado generates.

A list of possible assignments in desktop launchers can be found on this page.
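
Since the two launchers differ only in a few fields, they can be generated by a small function. This is a sketch with a made-up name, make_launcher, which prints the same set of assignments as in the files above. The Path assignment is added only if a fourth argument is given.

```shell
#!/bin/bash
# Sketch: print a desktop launcher to stdout. Arguments: name, command, icon
# and, optionally, a working directory (emitted as Path= only when given).
make_launcher() {
    local name="$1" exec_cmd="$2" icon="$3" workdir="${4:-}"
    printf '%s\n' \
        '[Desktop Entry]' \
        'Version=1.0' \
        'Type=Application' \
        'Terminal=false' \
        "Icon=$icon" \
        "Name[en_US]=$name" \
        "Exec=$exec_cmd" \
        "Name=$name" \
        'StartupNotify=true'
    if [ -n "$workdir" ]; then
        printf 'Path=%s\n' "$workdir"
    fi
}
```

For example, make_launcher "Xilinx SDK" /usr/local/bin/run-sdk /opt/Xilinx/SDK/2014.1/data/sdk/images/sdk_logo.ico > "$HOME/Desktop/Xilinx SDK.desktop" writes the second launcher.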

Booting Vivado / EDK mixed FSBL on Zynq-7000

Background

This is yet another war story about making the FSBL boot on a Zynq processor.

I had prepared an FSBL for a certain target using SDK 14.6, and then someone needed it in a Vivado package, using the SDK attached to Vivado 2014.1. In a perfect world, I would have exported the system’s configuration from XPS 14.6 to Vivado as an XML file, and generated the FSBL there. But experience shows that nothing really guarantees that the processor’s configuration will be adopted correctly in Vivado. As a matter of fact, I’ve seen that Vivado imports some parameters, and others are ignored.

But hey, I could just copy the existing FSBL source files to a new workspace in the target SDK? After all, it’s just C code!

This is in fact possible, going File > Import… > General > Existing Projects into Workspace. Then navigate to the path of the original project’s workspace. And don’t forget marking “Copy projects into workspace” so that the old one can be moved or deleted. A popup will allow selecting which projects to import, and it’s done!

Well, not. Selecting the three projects in an FSBL source set (fsbl, fsbl_bsp and system_hw_platform) will indeed create a fresh FSBL project, but it fails compiling (saying that it can’t find libxilffs as required by the -lxilffs or something like that).

To work around this, I imported only the system_hw_platform project, and generated the FSBL project in Vivado’s SDK, as usual: File > New > Application Project. Set the name to “fsbl”, make sure that the underlying hardware project is system_hw_platform. Click “Next” and pick “Zynq FSBL” as the template.

This makes sense, because the FSBL project relies on the C sources that were generated when XPS exported the project to SDK. So the hardware configuration remains correct, and the FSBL is new. No reason why this shouldn’t work, in theory.

The project compiled right away, and an fsbl.elf was ready for mixing into a boot.bin file.

Hurray! Not. It didn’t boot.

Despair not

The immediate measure for these cases is compiling the FSBL with the -DFSBL_DEBUG compilation parameter (which defines the FSBL_DEBUG compilation variable, turning on debug messages). With some luck, something informative will show up on the serial console, even if it appeared dead before.

I was one of those lucky bas#$%*s. I got:

PS7_INIT_FAIL : PS7 initialization successful
FSBL Status = 0xA012

Hmmm… That sounds like a mixed-up error message. It failed because it was successful? Well, in fact, the message itself represents the confusion causing the problem.

The FSBL status 0xA012 is returned when the call to ps7_init() fails in main.c. Or more precisely, when the returned value isn’t FSBL_PS7_INIT_SUCCESS. By the way, the FSBL generated by SDK 14.6 doesn’t even bother to check the return value of ps7_init(), but that’s irrelevant here.

Anyhow, note that ps7_init() is defined in the system_hw_platform, which consists of sources generated by XPS 14.6, but called by the FSBL, which was generated by Vivado.

This is a bit delicate, because ps7_init() returns PS7_INIT_SUCCESS when successful (see ps7_init.c), which happens to be defined in ps7_init.h as

#define PS7_INIT_SUCCESS   (0)    // 0 is success in good old C

and non-zero values meaning failure. This is the classic UNIX convention.

For some reason, this is what one finds in fsbl.h:

#ifdef NEW_PS7_ERR_CODE
#define FSBL_PS7_INIT_SUCCESS	PS7_INIT_SUCCESS
#else
#define FSBL_PS7_INIT_SUCCESS	(1)
#endif

In short: FSBL_PS7_INIT_SUCCESS=1, PS7_INIT_SUCCESS=0. A problem indeed.

So this is a direct consequence of mixing an old hardware project with a new FSBL. They changed the error code values somewhere in the middle.

Solution

The clean way to fix this is defining NEW_PS7_ERR_CODE during compilation. The less clean method is to just remove this #ifdef statement and leave it as

#define FSBL_PS7_INIT_SUCCESS	PS7_INIT_SUCCESS

And with this, the FSBL booted correctly and all was well.
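
To tell in advance whether a given hardware project needs this flag, one can look at what its ps7_init.h says. The following is a sketch with made-up function names, which assumes that the definition looks like the one quoted above (the value in the third field, possibly in parentheses).

```shell
#!/bin/bash
# Sketch: read a ps7_init.h-style header on stdin and print the value that
# PS7_INIT_SUCCESS is defined to, with the parentheses stripped.
ps7_success_value() {
    awk '$1 == "#define" && $2 == "PS7_INIT_SUCCESS" {
             gsub(/[()]/, "", $3); print $3; exit
         }'
}

# Print the extra compiler flag if the header uses 0 for success, and print
# nothing otherwise.
fsbl_extra_cflags() {
    if [ "$(ps7_success_value)" = "0" ]; then
        echo "-DNEW_PS7_ERR_CODE"
    fi
}
```

Running fsbl_extra_cflags < ps7_init.h on the hardware project’s header prints -DNEW_PS7_ERR_CODE when success is defined as zero, which is the case that failed for me.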

I know that getting the FSBL to boot is a recurring problem. Please don’t turn to me for help if your board doesn’t boot — there’s no secret trick, just good old debugging that takes time and effort.

Executing user-space programs from a different Linux distro

While trying to run executables from one ARM-based distribution on another, they failed to run, even before trying to load any libraries. The ARM architectures were compatible (armhf in both cases) so it wasn’t like I was trying to run an Intel binary on an ARM. I could always cross-compile from sources, but copying binaries is much easier…

I’ll demonstrate this issue with the “ls” program. Of course I tried to adopt something more worthy.

It was just like (where the current directory’s “ls” is the binary belonging to the other distro)

# ./ls
-bash: ./ls: No such file or directory

or sometimes (depends on the distribution) it says

$ ./ls
-sh: ./ls: not found

or when attempting to run with bash:

$ bash ./ls
./ls: ./ls: cannot execute binary file

Attempting to set LD_DEBUG=all was pointless, because the error was earlier on. Strace gave an idea:

$ strace ./ls
execve("./ls", ["./ls"], [/* 13 vars */]) = -1 ENOENT (No such file or directory)
dup(2)                                  = 3
fcntl64(3, F_GETFL)                     = 0x2 (flags O_RDWR)
fstat64(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aac9000
_llseek(3, 0, 0x7efca940, SEEK_CUR)     = -1 ESPIPE (Illegal seek)
write(3, "strace: exec: No such file or di"..., 40strace: exec: No such file or directory
) = 40
close(3)                                = 0
munmap(0x2aac9000, 4096)                = 0
exit_group(1)                           = ?

So execve() returns ENOENT even though the file exists. Which means, in this case, that the file is there but the kernel refuses to run it.

The reason

The crucial difference between the alien “ls” and the native one is where they expect to find their loader:

$ readelf -l /bin/ls

Elf file type is EXEC (Executable file)
Entry point 0xcb84
There are 7 program headers, starting at offset 52

Program Headers:
 Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
 EXIDX          0x093b4c 0x0009bb4c 0x0009bb4c 0x00110 0x00110 R   0x4
 PHDR           0x000034 0x00008034 0x00008034 0x000e0 0x000e0 R E 0x4
 INTERP         0x000114 0x00008114 0x00008114 0x00013 0x00013 R   0x1
 [Requesting program interpreter: /lib/ld-linux.so.3]
 LOAD           0x000000 0x00008000 0x00008000 0x93c60 0x93c60 R E 0x8000
 LOAD           0x094000 0x000a4000 0x000a4000 0x007bd 0x02a88 RW  0x8000
 DYNAMIC        0x09400c 0x000a400c 0x000a400c 0x000f0 0x000f0 RW  0x4
 GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

 Section to Segment mapping:
 Segment Sections...
 00     .ARM.exidx
 01    
 02     .interp
 03     .interp .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .ARM.extab .ARM.exidx .eh_frame
 04     .init_array .fini_array .jcr .dynamic .got .data .bss
 05     .dynamic
 06    
$ readelf -l ./ls

Elf file type is EXEC (Executable file)
Entry point 0xb6d9
There are 9 program headers, starting at offset 52

Program Headers:
 Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
 EXIDX          0x00fce8 0x00017ce8 0x00017ce8 0x00030 0x00030 R   0x4
 PHDR           0x000034 0x00008034 0x00008034 0x00120 0x00120 R E 0x4
 INTERP         0x000154 0x00008154 0x00008154 0x00027 0x00027 R   0x1
 [Requesting program interpreter: /lib/arm-linux-gnueabihf/ld-linux.so.3]
 LOAD           0x000000 0x00008000 0x00008000 0x0fd1c 0x0fd1c R E 0x8000
 LOAD           0x00fee4 0x0001fee4 0x0001fee4 0x003e4 0x01050 RW  0x8000
 DYNAMIC        0x00fef0 0x0001fef0 0x0001fef0 0x00110 0x00110 RW  0x4
 NOTE           0x00017c 0x0000817c 0x0000817c 0x00044 0x00044 R   0x4
 GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
 GNU_RELRO      0x00fee4 0x0001fee4 0x0001fee4 0x0011c 0x0011c R   0x1

 Section to Segment mapping:
 Segment Sections...
 00     .ARM.exidx
 01    
 02     .interp
 03     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .ARM.exidx .eh_frame
 04     .init_array .fini_array .jcr .dynamic .got .data .bss
 05     .dynamic
 06     .note.ABI-tag .note.gnu.build-id
 07    
 08     .init_array .fini_array .jcr .dynamic

Aha! When the native “ls” is executed, the kernel loads /lib/ld-linux.so.3 which in turn executes the required executable. When the alien “ls” was attempted, the kernel went for /lib/arm-linux-gnueabihf/ld-linux.so.3, couldn’t find it and returned “no such file”. It actually means that it didn’t find the interpreter binary (i.e. the glibc dynamic library loader).

The Solution

Create a symlink from where the executable expects the loader to where it actually is. In this case

# mkdir /lib/arm-linux-gnueabihf
# cd /lib/arm-linux-gnueabihf
# ln -s /lib/ld-linux.so.3

It’s of course quite likely that some library binaries will need to be copied along with the executable. LD_DEBUG or ldd may be helpful here, as well as “readelf -d” if there’s no ldd.
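
The check and the fix can be put into two small functions. This is a sketch, with function names of my own: The first one just picks out the “Requesting program interpreter” line from readelf’s output, and the second creates the symlink (and the directory for it, if necessary).

```shell
#!/bin/bash
# Read "readelf -l" output on stdin and print the requested program
# interpreter, i.e. the loader that the executable expects to find.
requested_interp() {
    sed -n 's/.*\[Requesting program interpreter: \(.*\)].*/\1/p'
}

# Create a symlink at the path that the alien binary expects, pointing at
# the loader that actually exists. Both arguments are full paths.
link_loader() {
    local expected="$1" actual="$2"
    mkdir -p "$(dirname "$expected")" && ln -s "$actual" "$expected"
}
```

So, as root, link_loader "$(readelf -l ./ls | requested_interp)" /lib/ld-linux.so.3 does the same as the three commands above.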

Changing the dynamic linker when compiling

Sometimes it’s possible to go the other way around: Tell gcc to pick a certain dynamic linker.

But first, to see which loader a program compiled with gcc will expect, add the -v flag in the compilation command, e.g.

$ gcc -v -O3 -Wall tryexec.c -o tryexec

and look for the -dynamic-linker flag in COLLECT_GCC_OPTIONS (could be, for example, /lib64/ld-linux-x86-64.so.2).

To change the choice of linker, pass an argument to the linker through gcc with the -Wl flag:

$ gcc -O3 -Wl,-I/lib/ld-linux.so.3 -Wall tryexec.c -o tryexec

What comes after the comma of the -Wl flag goes to the linker, so -Wl,-I/lib/ld-linux.so.3 passes “-I/lib/ld-linux.so.3” to ld, which does the job.

Those using Eclipse (Xilinx SDK included) can add the flag in the project C/C++ Build Settings > Tool Settings > ARM Linux gcc linker > Miscellaneous > Linker Flags (write e.g. “-Wl,-I/lib/myloader.so”, without the quotes, in the text box).

Wifi Access Point on my desktop with USB dongles

Introduction

These are my rather messy notes as I set up a wireless access point on my desktop (Fedora 12) running a home-compiled 3.12.20 Linux kernel. Somewhere below (see “Rubbish starts here”) I’ve added things that I tried out but led nowhere. Beware.

I began with two USB dongles, 8188EU and 8192CU. I got 8188EU up and running with Realtek’s hostapd and driver, but only for the 2.4 GHz band. So I bought a RaLink-based dual-band USB dongle, and ran it with the kernel’s built-in driver and an updated version of hostapd (it’s hardware neutral however). If you want it, search eBay for “300m USB Wifi dual band”. It should look like this, and cost some $15 or so:

Dual band Wifi USB dongle

This dongle is what I ended up using. You may skip to “Dual-band dongle” below if you don’t care about the other things I tried out before I chose this one.

The purpose is a manual setup for occasional use. There are plenty of similar writeouts, like this one.

It’s very easy to get mixed up with all those do-this-do-that howtos, and forget one simple fact: A wireless NIC is just another Ethernet card that happens not to have a cable. The authentication of a wireless link takes place with plain Ethernet packets, and once the two sides agree on talking with each other, it’s back to two Ethernet cards with a cross cable.

To make a machine serve as an access point, the NIC must support Master mode, and there must be software running that plays the role of authenticating clients and setting up encryption. But at the end of the day, that’s all there is to it. Linux’ daemon for doing this is hostapd.

The swiss army knives are “iw”, “iwconfig” and “iwlist”. Try “iw help” in particular.

In short

  1. Plug in device — driver autoloads
  2. Bring up the device with ifconfig (assign an IP address)
  3. Switch regulation region, if the 5 GHz band is required (and the device reports old and over-restrictive regulation rules):
    # iw reg set GD
  4. Restart dhcpd, so that it listens for requests on wlan0
  5. Start hostapd
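
Steps 2 to 5 can be wrapped in a function, which is convenient for occasional use. This is a sketch: The names run, start_ap and DRY_RUN are my own, the netmask is fixed to the one I use below, and setting DRY_RUN just prints the commands instead of running them.

```shell
#!/bin/bash
# Run a command, or just print it when DRY_RUN is set (DRY_RUN is a
# hypothetical knob added for this sketch).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 2-5 of the list above. Arguments: interface, IP address and,
# optionally, a regulatory domain code (step 3 is skipped if it's omitted).
start_ap() {
    local iface="$1" addr="$2" region="${3:-}"
    run ifconfig "$iface" "$addr" netmask 255.255.255.0 || return 1
    if [ -n "$region" ]; then
        run iw reg set "$region" || return 1
    fi
    run service dhcpd restart || return 1
    run service hostapd start
}
```

As root, start_ap wlan0 10.10.0.1 GD runs the whole sequence, and DRY_RUN=1 start_ap wlan0 10.10.0.1 GD only shows what would have been executed.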

Realtek vs. community

There are two completely different takes on getting the Wifi working. One is to use the tools that are maintained by the community: The hostapd that arrives along with distributions, and the drivers compiled in the kernel. Well, as of June 2014, that’s not a go with Realtek’s USB Wifi dongles.

The thing is that the typical distribution hostapd expects to find the kernel’s native interface, which is implemented in the cfg80211 and mac80211 kernel modules. These modules are supposed to talk with the low-level hardware drivers. Very structured and nice. Only hi-tech companies don’t always play ball with the kernel community.

Realtek, in this case, chose to compile together everything, including the higher level frontend source code, and make a single kernel module of that. Kinda makes sense when all you need is a single driver for your specific hardware (a bit like static linking of a program), but not when that hardware is just one of many to be supported.

For example, the kernel’s 8192CU driver (appears as rtl8192cu on lsmod with ~79kB) relies on the kernel’s low-level modules (which are mac80211, cfg80211, rtl8192c_common, rtl_usb, rtlwifi), but the Realtek driver has everything in a single module, which appears as 8192cu and takes ~526kB.

Now to hostapd: The distribution’s versions are geared towards the kernel’s native interface (“driver=nl80211”) with some partial support for Realtek’s drivers (“driver=rtl871x”), so all in all, if you use Realtek’s kernel drivers, use their hostapd as well.

My chosen solution (well, no-other-choice solution) was to compile the Realtek’s kernel modules and hostapd. With slight variations.

So first is a summary of commands when things finally work, and then the battle field (compilations from sources etc.).

ifconfig

This is necessary for the already running DHCP daemon to answer requests from wireless clients. This ifconfig command is also the moment at which the firmware is loaded (and not when the driver loads, as one could expect).

Important: Remember that routing rules apply like with any Ethernet card, so don’t pick an IP address space that is already accounted for in the access point’s routing table. Making that mistake will not just make pings fail, but the access point will also ignore ARP requests (see below).

# ifconfig wlan0 10.10.0.1 netmask 255.255.255.0
# service dhcpd restart

Starting hostapd

# service hostapd start

or running in the foreground, with a lot of debug output

# hostapd -dd /etc/hostapd/hostapd.conf

Note that when hostapd is running in the foreground and is stopped with CTRL-C, unplugging and replugging the device may be necessary before re-attempting to work with it.

What happens if you pick a bad IP address

For some reason, I had the silly idea that since my internal LAN’s subnet is 10.1.0.0/16, I should assign my wlan0 card the address 10.1.1.123, so it will natively belong to the LAN. What I didn’t realize was that another NIC is already assigned for handling 10.1.0.0/16, so wlan0 will never get packets routed to it.

Even worse, the wireless adapter will not answer ARP requests, which kinda makes sense: The wireless adapter “knows” that it can’t work with the IP address it has, so it might as well not announce any IP connectivity. The interesting thing was that ping requests were ignored completely as well. It’s not like the replies went out on the NIC to which the IP subnet belongs. There was no reply packet at all. Which again makes sense, because ping replies are not supposed to go out on another NIC. That could potentially confuse someone into thinking that the link is OK (in case there was a way for the reply to reach the requester).
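
Whether a candidate address clashes with a subnet that is already routed is a matter of simple bit arithmetic, which can be sketched in shell. The function names are made up, and the subnets to check against must be taken from the routing table by hand (as shown by “ip route”, for example).

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address into a single integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 falls inside network $2, given as a.b.c.d/prefix.
in_subnet() {
    local addr net bits mask
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits="${2#*/}"
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

With my LAN’s subnet, in_subnet 10.1.1.123 10.1.0.0/16 succeeds, which is exactly the clash described above, while in_subnet 10.10.0.1 10.1.0.0/16 fails, so 10.10.0.1 is safe to assign to wlan0.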

Below is the description of the problem as I saw it before I solved it, kept here just in case someone is stuck in the same situation.

At this point, I can connect to the Access Point from Windows XP (even with a client having poor WPA support) as well as Linux with seemingly no problem. But there’s no real internet access. The reason seems to be, that the USB dongle doesn’t seem to be connected with its IP protocol layer. Ethernet packets go through well, as can be seen in sniff dumps on both sides, and the client manages to acquire an address with DHCP, because it depends only on plain MAC packets.

Despite setting an address with ifconfig (or “ip address add” for that matter), the dongle doesn’t respond to ARP requests asking for the address it has, and doesn’t respond to pings.

ARP packets are sent properly from the dongle (acting as AP) and the responses from the client arrive fine as well (when asking for the address of the client’s Wifi NIC as well as another wired Ethernet NIC, both are answered).

# arping -I wlan0 10.1.1.166
ARPING 10.1.1.166 from 10.1.1.123 wlan0
Unicast reply from 10.1.1.166 [00:0E:2E:40:5B:11]  48.329ms
Unicast reply from 10.1.1.166 [00:0E:2E:40:5B:11]  80.612ms
Unicast reply from 10.1.1.166 [00:0E:2E:40:5B:11]  104.531ms

but not the other way around (from the client):

# arping -I wlan0 10.1.1.123

(nothing happens)

Now, if the access point sends a gratuitous ARP to the client:

# arping -A -I wlan0 10.1.1.123

the client can send ping packets to the access point. These ICMP packets appear in the sniff dump of wlan0 on both sides, but the access point doesn’t reply. The same goes for pinging the broadcast address: The packets were seen in the access point’s sniff dumps with an all-0xff MAC address, but with no response:

# ping -b 10.1.255.255

This is not a firewall issue. The problem remains with the firewall taken down. Both USB dongles have this same problem.

Compiling Realtek’s driver for RTL8188EU

Possible reason why this is necessary: The USB device is V2.0 according to the package, and the newer version contains firmware. Anyhow,

$ git clone https://github.com/lwfinger/rtl8188eu.git

A plain “make” compiled the code cleanly on kernel 3.12.20 (using commit ID 63fe7cda86c2830d66335026efde7472c10bc5c2). Copy firmware (also in Git bundle):

# cp rtl8188eufw.bin /lib/firmware/rtlwifi/

(well, I ended up doing “make install”. After removing the existing driver from the staging subdirectory).

Compiling Realtek’s driver for RTL8192CU

Following this guide, I went to Realtek’s site and downloaded something like RTL8188C_8192C_USB_linux_v4.0.2_9000.20130911.zip (ZIP??!), and then untarred wpa_supplicant_hostapd-0.8_rtw_r7475.20130812.tar.gz.

Tried to compile from this zip file (under “driver”). Compilation failed against my kernel (3.12) on the change of the “create_proc_entry” API. So instead, I went for

$ git clone https://github.com/pvaret/rtl8192cu-fixes.git

and compiled cleanly from commit ID f0dfbb46a891820b27942ba3e213af83f2452957.

Compiling and running Realtek’s hostapd

From the zip file that I downloaded from Realtek, I went to the hostapd subdirectory in wpa_supplicant_hostapd/, and typed “make”. It compiled cleanly, and generated the “hostapd” and “hostapd_cli” executables. Yey.

And that actually worked! Note that the rtl871x driver is picked even though the “driver=” isn’t assigned at all in hostapd.conf.

# hostapd -d /etc/hostapd/hostapd.conf
random: Trying to read entropy from /dev/random
Configuration file: /etc/hostapd/hostapd.conf
ctrl_interface_group=0
eapol_version=1
drv->ifindex=35
l2_sock_recv==l2_sock_xmit=0x0x1203be0
BSS count 1, BSSID mask 00:00:00:00:00:00 (0 bits)
Completing interface initialization
Mode: IEEE 802.11g  Channel: 4  Frequency: 2427 MHz
RATE[0] rate=10 flags=0x1
RATE[1] rate=20 flags=0x1
RATE[2] rate=55 flags=0x1
RATE[3] rate=110 flags=0x1
RATE[4] rate=60 flags=0x0
RATE[5] rate=90 flags=0x0
RATE[6] rate=120 flags=0x0
RATE[7] rate=180 flags=0x0
RATE[8] rate=240 flags=0x0
RATE[9] rate=360 flags=0x0
RATE[10] rate=480 flags=0x0
RATE[11] rate=540 flags=0x0
Flushing old station entries
Deauthenticate all stations
+rtl871x_sta_deauth_ops, ff:ff:ff:ff:ff:ff is deauth, reason=2
rtl871x_set_key_ops
rtl871x_set_key_ops
rtl871x_set_key_ops
rtl871x_set_key_ops
Using interface wlan0 with hwaddr c0:4a:00:18:ef:21 and ssid 'ocho'
Deriving WPA PSK based on passphrase
SSID - hexdump_ascii(len=4):
 6f 63 68 6f                                       ocho           
PSK (ASCII passphrase) - hexdump_ascii(len=9): [REMOVED]
PSK (from passphrase) - hexdump(len=32): [REMOVED]
rtl871x_set_wps_assoc_resp_ie
rtl871x_set_wps_beacon_ie
rtl871x_set_wps_probe_resp_ie
urandom: Got 20/20 bytes from /dev/urandom
GMK - hexdump(len=32): [REMOVED]
Key Counter - hexdump(len=32): [REMOVED]
WPA: group state machine entering state GTK_INIT (VLAN-ID 0)
GTK - hexdump(len=32): [REMOVED]
WPA: group state machine entering state SETKEYSDONE (VLAN-ID 0)
rtl871x_set_key_ops
rtl871x_set_beacon_ops
rtl871x_set_hidden_ssid ignore_broadcast_ssid:0, ocho,4
rtl871x_set_acl
wlan0: Setup of interface done.

But with WPA authentication enabled, I got a lot of

hostapd: wlan0: STA 00:0e:2e:40:5b:94 IEEE 802.11: associated
hostapd: wlan0: STA 00:0e:2e:40:5b:94 IEEE 802.11: deauthenticated due to local deauth request
hostapd: wlan0: STA 00:0e:2e:40:5b:94 IEEE 802.11: disassociated

It was also evident sniffing wlan0 that EAPOL WPA key (254) frames were sent to the client, but they didn’t get answered, which is probably the reason for the whole thing, as mentioned on this page.

The solution was to restrict the protocol to version 1 with

eapol_version=1

in hostapd.conf. This problem occurred only when I used the RT2500 utility on the Windows laptop. Using Windows XP’s native wireless selection tool connected well either way.
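
For orientation, this is roughly what a minimal hostapd.conf along these lines could look like, written as a function that prints it. It’s a sketch, not my actual file: The channel and mode are taken from the debug output above, but the WPA-related lines (wpa, wpa_key_mgmt, rsn_pairwise) are my assumption of a common WPA2 setup, and there is deliberately no “driver=” line, as noted above.

```shell
#!/bin/bash
# Sketch: print a minimal hostapd.conf to stdout. Arguments: interface,
# SSID and passphrase. The WPA settings are an assumed, typical WPA2 setup.
print_hostapd_conf() {
    local iface="$1" ssid="$2" passphrase="$3"
    cat <<EOF
interface=$iface
ssid=$ssid
hw_mode=g
channel=4
wpa=2
wpa_key_mgmt=WPA-PSK
rsn_pairwise=CCMP
wpa_passphrase=$passphrase
eapol_version=1
EOF
}
```

Something like print_hostapd_conf wlan0 ocho 'my secret' > /etc/hostapd/hostapd.conf would then create the file. Clients with poor WPA support may need other settings for the wpa and pairwise lines.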

8192CU is single band. Really.

I tried to work with the 8192CU dongle, because it supposedly supports the 5 GHz band as well, and the 2.4 GHz band is heavily crowded. I don’t know why I got the impression that it’s dual-band. Anyhow, install the driver module,

# cp 8192cu.ko /lib/modules/$(uname -r)/kernel/drivers/net/wireless/
# depmod -a

and also blacklist the kernel’s native driver by adding the following lines to /etc/modprobe.d/blacklist.conf

# Native Wifi drivers not usable as access points
blacklist rtl8192cu
blacklist rtl8192c_common

To see the list of channels:

$ iwlist wlan0 freq

Darn, only 2.4 GHz! It even says so on Realtek’s site: “Complete 802.11n MIMO solution for 2.4GHz band” and “Single-Band 11n (2x2) WLAN USB Dongle”.

Besides, the signal it transmits appears to be really lousy. I got a really bad link quality (but hey, this is a cheapo dongle from Ebay).

Compiling hostapd from the sources

First, install libnl-devel, which is required for compiling hostapd:

# yum install libnl-devel

Download the sources as explained on hostapd’s main page, copy the default config file and compile:

$ git clone git://w1.fi/srv/git/hostap.git
$ cd hostap/hostapd
$ git checkout hostap_2_2
$ cp defconfig .config
$ make

Dual-band dongle

Plugged a MediaTek (formerly RaLink) RT5572-based no-brand dongle (0x148f/0x5572) into my computer with kernel 3.12. It was detected right away. “iw list” gave a long answer, so I reverted to the original hostapd and picked driver=nl80211. The driver handling it was rt2800usb, along with its dependencies: rt2800lib, rt2x00usb, rt2x00lib, mac80211 and cfg80211.

The Linux drivers on MediaTek’s site were last updated in 2010, supporting kernel 2.4.0, but the rt2800usb driver seems to be maintained properly with occasional patches. So it looks like the kernel’s built-in driver is the best choice. The RT5572 was added in March 2013 to kernel 3.10.

I attempted to run hostapd, and it said

# hostapd -dd /etc/hostapd/hostapd.conf
Configuration file: /etc/hostapd/hostapd.conf
ctrl_interface_group=0
eapol_version=1
ioctl[SIOCSIFFLAGS]: No such file or directory
nl80211 driver initialization failed.
wlan1: Unable to setup interface.
rmdir[ctrl_interface]: No such file or directory

That wasn’t very helpful, but looking at the system log was:

ieee80211 phy0: rt2x00lib_request_firmware: Info - Loading firmware file 'rt2870.bin'
ieee80211 phy0: rt2x00lib_request_firmware: Error - Failed to request Firmware

Ah, yes. A firmware file. Taken from the Linux Firmware Git repo,

# cp rt2870.bin /lib/firmware/

(note that it does NOT go into the rtlwifi subdirectory. This is RaLink, not Realtek).

At which point I got a lot of output from hostapd -dd, but it ended with

Could not set DTIM period for kernel driver

This seems to be a hostapd issue (I ran 0.6.9), as the driver is stable. Compiling hostapd-2.2 solved this (see just above), and the dongle works nicely as an access point.

Access point at 5 GHz

The whole point with this dual-band dongle was to run the access point at 5 GHz, and avoid all the noise from my neighbors. But alas, when requesting a 5 GHz channel, hostapd -dd says, somewhere in the middle:

channel [40] (157) is disabled for use in AP mode, flags: 0x1
wlan1: IEEE 802.11 Configured channel (157) not found from the channel list of current mode (2) IEEE 802.11a
wlan1: IEEE 802.11 Hardware does not support configured channel
Could not select hw_mode and channel. (-3)
wlan1: interface state UNINITIALIZED->DISABLED
wlan1: AP-DISABLED
wlan1: Unable to setup interface.

Hmmm… I failed twice here. The frequency isn’t allowed in Israel, and the 5 GHz band is blocked altogether.

Indeed,

$ iw list
Wiphy phy2
 Band 1:
 Capabilities: 0x2f2
 [...]
 Frequencies:
 * 2412 MHz [1] (20.0 dBm)
 * 2417 MHz [2] (20.0 dBm)
 * 2422 MHz [3] (20.0 dBm)
 * 2427 MHz [4] (20.0 dBm)
 * 2432 MHz [5] (20.0 dBm)
 * 2437 MHz [6] (20.0 dBm)
 * 2442 MHz [7] (20.0 dBm)
 * 2447 MHz [8] (20.0 dBm)
 * 2452 MHz [9] (20.0 dBm)
 * 2457 MHz [10] (20.0 dBm)
 * 2462 MHz [11] (20.0 dBm)
 * 2467 MHz [12] (20.0 dBm)
 * 2472 MHz [13] (20.0 dBm)
 * 2484 MHz [14] (disabled)
 Bitrates (non-HT):
 * 1.0 Mbps
 * 2.0 Mbps (short preamble supported)
 * 5.5 Mbps (short preamble supported)
 * 11.0 Mbps (short preamble supported)
 * 6.0 Mbps
 * 9.0 Mbps
 * 12.0 Mbps
 * 18.0 Mbps
 * 24.0 Mbps
 * 36.0 Mbps
 * 48.0 Mbps
 * 54.0 Mbps
 Band 2:
 Capabilities: 0x2f2
 HT20/HT40
 [...]
 Frequencies:
 * 5180 MHz [36] (disabled)
 * 5190 MHz [38] (disabled)
 * 5200 MHz [40] (disabled)
 * 5210 MHz [42] (disabled)
 * 5220 MHz [44] (disabled)
 * 5230 MHz [46] (disabled)
 * 5240 MHz [48] (disabled)
 * 5250 MHz [50] (disabled)
 * 5260 MHz [52] (disabled)
 * 5270 MHz [54] (disabled)
 * 5280 MHz [56] (disabled)
 * 5290 MHz [58] (disabled)
 * 5300 MHz [60] (disabled)
 * 5310 MHz [62] (disabled)
 * 5320 MHz [64] (disabled)
 * 5500 MHz [100] (disabled)
 * 5510 MHz [102] (disabled)
 * 5520 MHz [104] (disabled)
 * 5530 MHz [106] (disabled)
 * 5540 MHz [108] (disabled)
 * 5550 MHz [110] (disabled)
 * 5560 MHz [112] (disabled)
 * 5570 MHz [114] (disabled)
 * 5580 MHz [116] (disabled)
 * 5590 MHz [118] (disabled)
 * 5600 MHz [120] (disabled)
 * 5610 MHz [122] (disabled)
 * 5620 MHz [124] (disabled)
 * 5630 MHz [126] (disabled)
 * 5640 MHz [128] (disabled)
 * 5650 MHz [130] (disabled)
 * 5660 MHz [132] (disabled)
 * 5670 MHz [134] (disabled)
 * 5680 MHz [136] (disabled)
 * 5690 MHz [138] (disabled)
 * 5700 MHz [140] (disabled)
 * 5745 MHz [149] (disabled)
 * 5755 MHz [151] (disabled)
 * 5765 MHz [153] (disabled)
 * 5775 MHz [155] (disabled)
 * 5785 MHz [157] (disabled)
 * 5795 MHz [159] (disabled)
 * 5805 MHz [161] (disabled)
 * 5825 MHz [165] (disabled)
 * 4920 MHz [-16] (disabled)
 * 4940 MHz [-12] (disabled)
 * 4960 MHz [-8] (disabled)
 * 4980 MHz [-4] (disabled)
 Bitrates (non-HT):
 * 6.0 Mbps
 * 9.0 Mbps
 * 12.0 Mbps
 * 18.0 Mbps
 * 24.0 Mbps
 * 36.0 Mbps
 * 48.0 Mbps
 * 54.0 Mbps
 [...]

Are you kidding me? Disabled? Well, no wonder. The kernel thinks 5 GHz is disallowed in Israel:

$ iw reg get
country IL:
 (2402 - 2482 @ 40), (N/A, 20)

Where did it get that from? A peek at dmesg reveals the answer:

cfg80211: Calling CRDA to update world regulatory domain
cfg80211: World regulatory domain updated:
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
usb 5-1.4: reset full-speed USB device number 9 using uhci_hcd
ieee80211 phy0: rt2x00_set_rt: Info - RT chipset 5592, rev 0222 detected
ieee80211 phy0: rt2x00_set_rf: Info - RF chipset 000f detected
ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
usbcore: registered new interface driver rt2800usb
cfg80211: Calling CRDA for country: IL
cfg80211: Regulatory domain changed to country: IL
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm)

The thing is that according to Israel’s local regulations, the lower 5 GHz band is allowed for indoor use. My initial choice of channel 157 is probably illegal in Israel (see Wikipedia’s list). But hey, some channels are still open on the 5 GHz band! It’s also interesting to note that some of the 5 GHz channels that are banned for Wifi are allowed for amateur radio (also see this and this).

As the regulations for each country are taken from some ROM on the hardware device itself, they’re probably outdated.

The ugly solution is to switch the regulatory country. For example, Grenada has a relatively relaxed setting:

# iw reg set GD

A full list of these country codes can be found here. “BO” (for Bolivia) is also worth a try.
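
To check what a change of regulatory domain actually opened up, the frequency list printed by “iw list” can be filtered. The following is only a sketch: enabled_5ghz is a name made up for this example, and it assumes the line format shown in the listing above (“* 5180 MHz [36] (disabled)”), which may differ between iw versions.

```shell
# enabled_5ghz: hypothetical helper, not part of iw.
# Reads "iw list"-style text on stdin and prints the frequency (MHz) and
# channel of every entry at 4900 MHz or above that isn't marked "disabled".
enabled_5ghz() {
    awk '$1 == "*" && $3 == "MHz" && $2 >= 4900 && $0 !~ /disabled/ { print $2, $4 }'
}

# Typical use (needs a wireless device with an nl80211 driver):
#   iw list | enabled_5ghz
```

An empty output means that the 5 GHz band is still blocked altogether, as it was with the IL setting.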

Now the responsibility is on me to pick a legal frequency. For example, any channel between 36 and 48.
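
Since hostapd.conf takes a channel number while “iw list” and the regulatory rules speak in MHz, a small conversion helps when picking a channel. This sketch uses the usual numbering rule (2407 + 5n MHz for channels 1 to 13, 5000 + 5n MHz on the 5 GHz band), which agrees with the listing above. chan_to_freq is a made-up name, and the 4.9 GHz channels with negative numbers are deliberately left out.

```shell
# chan_to_freq: hypothetical helper, prints the center frequency in MHz
# of a Wi-Fi channel number, or fails for channels it doesn't know about.
chan_to_freq() {
    ch=$1
    if [ "$ch" -ge 1 ] && [ "$ch" -le 13 ]; then
        echo $((2407 + 5 * ch))      # 2.4 GHz band, 5 MHz spacing
    elif [ "$ch" -eq 14 ]; then
        echo 2484                    # channel 14 is a special case
    elif [ "$ch" -ge 36 ] && [ "$ch" -le 165 ]; then
        echo $((5000 + 5 * ch))      # 5 GHz band
    else
        echo "unsupported channel: $ch" >&2
        return 1
    fi
}
```

For example, “chan_to_freq 36” prints 5180, which falls inside the 5170 - 5250 MHz range of the world regulatory domain in the dmesg output above, and “chan_to_freq 157” prints 5785.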


Rubbish starts here

From this point on, it’s just random stuff that I tried out, and didn’t lead anywhere. But since I write as I work, why delete it? Maybe it helps someone as is.

Plugging in a TL-WN725N before switching to Realtek’s drivers

usb 2-2.2: Product: 802.11n NIC
usb 2-2.2: Manufacturer: Realtek
usb 2-2.2: SerialNumber: 00E04C0001
r8188eu: module is from the staging directory, the quality is unknown, you have been warned.
Chip Version Info: CHIP_8188E_Normal_Chip_TSMC_D_CUT_1T1R_RomVer(0)
usbcore: registered new interface driver r8188eu

Check if it’s ready to be an access point:

# iwconfig wlan0 mode master
# iwconfig wlan0
wlan0     unassociated  Nickname:"<WIFI@REALTEK>"
 Mode:Master  Frequency=2.412 GHz  Access Point: Not-Associated  
 Sensitivity:0/0 
 Retry:off   RTS thr:off   Fragment thr:off
 Encryption key:off
 Power Management:off
 Link Quality:0  Signal level:0  Noise level:0
 Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
 Tx excessive retries:0  Invalid misc:0   Missed beacon:0

OK, so it is. :)

But this doesn’t seem very good:

# iw list
nl80211 not found.

And here comes a bit of nonsense that was fixed by compiling software from sources, as shown below.

Fixed with

# modprobe mac80211

Installing the access point daemon:

# yum install hostapd

Running manually for a test:

 

# hostapd -dd /etc/hostapd/hostapd.conf
Configuration file: /etc/hostapd/hostapd.conf
ctrl_interface_group=10 (from group name 'wheel')
nl80211 not found.
nl80211 driver initialization failed.
wlan0: Unable to setup interface.

Tried the second dongle (the one I bought cheap from Ebay)

usb 2-2.2: New USB device found, idVendor=0bda, idProduct=8176
usb 2-2.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 2-2.2: Product: 802.11n WLAN Adapter
usb 2-2.2: Manufacturer: Realtek
usb 2-2.2: SerialNumber: 00e04c000001
rtl8192cu: Chip version 0x10
rtl8192cu: MAC address: 00:13:ef:40:08:98
rtl8192cu: Board Type 0
rtl_usb: rx_max_size 15360, rx_urb_num 8, in_ep 1
rtl8192cu: Loading firmware rtlwifi/rtl8192cufw_TMSC.bin
usbcore: registered new interface driver rtl8192cu
rtlwifi: Loading alternative firmware rtlwifi/rtl8192cufw.bin
rtlwifi: Firmware rtlwifi/rtl8192cufw_TMSC.bin not available

OK, OK, take the firmware!

# mkdir /lib/firmware/rtlwifi
# cp rtl8192cufw.bin /lib/firmware/rtlwifi/

Unplug-replug. This one went much better:

usb 2-2.2: New USB device found, idVendor=0bda, idProduct=8176
usb 2-2.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 2-2.2: Product: 802.11n WLAN Adapter
usb 2-2.2: Manufacturer: Realtek
usb 2-2.2: SerialNumber: 00e04c000001
rtl8192cu: Chip version 0x10
rtl8192cu: MAC address: 00:13:ef:40:08:98
rtl8192cu: Board Type 0
rtl_usb: rx_max_size 15360, rx_urb_num 8, in_ep 1
rtl8192cu: Loading firmware rtlwifi/rtl8192cufw_TMSC.bin
rtlwifi: Loading alternative firmware rtlwifi/rtl8192cufw.bin
ieee80211 phy1: Selected rate control algorithm 'rtl_rc'
rtlwifi: wireless switch is on
cfg80211: Calling CRDA for country: IL
cfg80211: Regulatory domain changed to country: IL
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2482000 KHz @ 40000 KHz), (N/A, 2000 mBm)

but

# hostapd /etc/hostapd/hostapd.conf
ioctl[SIOCSIFFLAGS]: Unknown error 132
nl80211 driver initialization failed.
rmdir[ctrl_interface]: No such file or directory

Newer hostapd

Stole the binaries from Fedora 20, including a set of necessary libraries, and created a chroot for that as follows:

# chroot . /hostapd -d /hostapd.conf

With the Ebay dongle, the AP was visible from my laptop, but I failed to connect. Nothing appears on sniffing wlan1, and strace shows nothing happens during these connection attempts, so the conclusion must be that the problem is with the dongle.

So I found the first firmware the driver was checking for,

usb 2-2.3: New USB device found, idVendor=0bda, idProduct=8176
usb 2-2.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 2-2.3: Product: 802.11n WLAN Adapter
usb 2-2.3: Manufacturer: Realtek
usb 2-2.3: SerialNumber: 00e04c000001
rtl8192cu: Chip version 0x10
rtl8192cu: MAC address: 00:13:ef:40:08:98
rtl8192cu: Board Type 0
rtl_usb: rx_max_size 15360, rx_urb_num 8, in_ep 1
rtl8192cu: Loading firmware rtlwifi/rtl8192cufw_TMSC.bin
ieee80211 phy7: Selected rate control algorithm 'rtl_rc'
rtlwifi: wireless switch is on
rtl8192cu: MAC auto ON okay!
rtl8192cu: Tx queue select: 0x05

Didn’t make any difference.

Creating a bridge

This is the really manual route, based upon this page.

Basically,

# brctl addbr br0
# brctl setfd br0 0
# brctl addif br0 eth0
# brctl addif br0 wlan0
# ifconfig br0 10.1.1.123 netmask 255.255.255.0
# ifconfig br0 up

The second command sets the forward delay to zero, to prevent problems on the first connection, as mentioned on this page.

One can take a look at the status with

# brctl show
bridge name    bridge id        STP enabled    interfaces
br0        8000.00241dd37e38    no        eth0
                                          wlan0

To remove the bridge:

# ifconfig br0 down
# brctl delbr br0
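
On systems where brctl and ifconfig aren’t installed, the same setup can probably be done with iproute2’s ip command. This is a sketch that wasn’t part of the experiments above, so treat it as an assumption: bridge_up, bridge_down and run are names invented here, and DRY_RUN is just a convenience for printing the commands instead of running them.

```shell
# run: execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# bridge_up BRIDGE ADDR/PREFIX IFACE...
# Create the bridge with a zero forward delay, enslave the interfaces,
# assign the address and bring the bridge up.
bridge_up() {
    br=$1; addr=$2; shift 2
    run ip link add name "$br" type bridge forward_delay 0 || return 1
    for ifc in "$@"; do
        run ip link set dev "$ifc" master "$br" || return 1
    done
    run ip addr add "$addr" dev "$br" || return 1
    run ip link set dev "$br" up
}

# bridge_down BRIDGE
# Take the bridge down and remove it.
bridge_down() {
    run ip link set dev "$1" down
    run ip link delete "$1" type bridge
}

# Usage, as root:
#   bridge_up br0 10.1.1.123/24 eth0 wlan0
#   bridge_down br0
```

Setting DRY_RUN=1 before calling bridge_up prints the five ip commands without touching the network, which is a cheap way to review them first.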

Plain-text mail from Thunderbird (under Linux)

Introduction

I’ve been annoyed for quite a while by Thunderbird’s strong inclination towards HTML mail. To the extent that if I don’t really, really verify that a mail goes out in plain text, it’s probably going to slip out in HTML. This is bad in particular when sending mails to Linux-related mailing lists. They don’t like it. And the truth is that I’m not very fond of HTML mails either, but I usually don’t care.

There’s an add-on for this, Outgoing Message Format, but I run a version of Thunderbird that is too old for that, and trying to fool Thunderbird into installing it by changing the add-on’s version requirement field ended up with an add-on that does nothing.

Upgrading was not an attractive direction: If I’m happy with a tool except for one thing, I’ll fix that thing. Upgrading tends to fix that thing but create a new problem. On a good day.

It turned out to be extremely difficult to convince Thunderbird to stop doing that. My notes from the attempts are below.

Note to self: To find the entire hack history, search your “Sent” box for “Thunderbird plain text hacks” in the subject.

Remove the HTML composition capability completely

This method makes it impossible for a certain mail identity to compose HTML mails. Go to Preferences > General > Config Editor… and agree to be careful.

mail.identity.id1.compose_html: Set from true to false.

In internal JavaScript code, these preferences are fetched with getPref() commands.

Fixing Thunderbird from within

After wasting a lot of time on this, I reached the conclusion that the problem was that quite a few components in Thunderbird’s script environment push the HTML format for various reasons. These are apparently ugly hacks that solved a problem for someone in the far past, and remained there because no one noticed them or understood exactly what they do, possibly including whoever wrote them in the first place.

The solution was a counter-hack. Basically, hide the relevant menu’s IDs from other scripts and set the default to “Plain text”. This requires opening a JAR, making a few fixes in a couple of files, and packing it up again.

So let’s get to it. In a fresh directory,

$ jar xf /usr/lib64/thunderbird-3.0/chrome/messenger.jar

and edit ./content/messenger/messengercompose/messengercompose.xul, in the part saying

<menu id="outputFormatMenu" label="&outputFormatMenu.label;" accesskey="&outputFormatMenu.accesskey;" oncommand="OutputFormatMenuSelect(event.target)">
 <menupopup id="outputFormatMenuPopup">
 <menuitem type="radio" name="output_format" label="&autoFormatCmd.label;" accesskey="&autoFormatCmd.accesskey;" id="format_auto" checked="true"/>
 <menuitem type="radio" name="output_format" label="&plainTextFormatCmd.label;" accesskey="&plainTextFormatCmd.accesskey;" id="format_plain"/>
 <menuitem type="radio" name="output_format" label="&htmlFormatCmd.label;" accesskey="&htmlFormatCmd.accesskey;" id="format_html"/>
 <menuitem type="radio" name="output_format" label="&bothFormatCmd.label;" accesskey="&bothFormatCmd.accesskey;" id="format_both"/>
 </menupopup>
 </menu>

The idea is to hide the elements from any script, except the one that responds to changes in this menu. Also, change the default from “Auto detect” to “plain text”. After the change we have

<menu id="my_outputFormatMenu" label="&outputFormatMenu.label;" accesskey="&outputFormatMenu.accesskey;" oncommand="OutputFormatMenuSelect(event.target)">
 <menupopup id="outputFormatMenuPopup">
 <menuitem type="radio" name="output_format" label="&autoFormatCmd.label;" accesskey="&autoFormatCmd.accesskey;" id="my_format_auto"/>
 <menuitem type="radio" name="output_format" label="&plainTextFormatCmd.label;" accesskey="&plainTextFormatCmd.accesskey;" id="my_format_plain" checked="true"/>
 <menuitem type="radio" name="output_format" label="&htmlFormatCmd.label;" accesskey="&htmlFormatCmd.accesskey;" id="my_format_html"/>
 <menuitem type="radio" name="output_format" label="&bothFormatCmd.label;" accesskey="&bothFormatCmd.accesskey;" id="my_format_both"/>
 </menupopup>
 </menu>

Note the “my_” prefixes on the IDs, and that the “checked” attribute has moved.

This leaves a few changes in the only script that should deal with this, ./content/messenger/messengercompose/MsgComposeCommands.js:

In ComposeStartup(),

document.getElementById("outputFormatMenu").setAttribute("hidden", true);

is replaced with

document.getElementById("my_outputFormatMenu").setAttribute("hidden", true);

and likewise, in OutputFormatMenuSelect()

if (msgCompFields)
 switch (target.getAttribute('id'))
 {
 case "format_auto":  gSendFormat = nsIMsgCompSendFormat.AskUser;     break;
 case "format_plain": gSendFormat = nsIMsgCompSendFormat.PlainText;   break;
 case "format_html":  gSendFormat = nsIMsgCompSendFormat.HTML;        break;
 case "format_both":  gSendFormat = nsIMsgCompSendFormat.Both;        break;
 }

is replaced with

if (msgCompFields)
 switch (target.getAttribute('id'))
 {
 case "my_format_auto":  gSendFormat = nsIMsgCompSendFormat.AskUser;     break;
 case "my_format_plain": gSendFormat = nsIMsgCompSendFormat.PlainText;   break;
 case "my_format_html":  gSendFormat = nsIMsgCompSendFormat.HTML;        break;
 case "my_format_both":  gSendFormat = nsIMsgCompSendFormat.Both;        break;
 }

Finally remove a single line that fiddles with the default (harmless now, but why leave it there…). In the definition of gComposeRecyclingListener, remove this line

document.getElementById("format_auto").setAttribute("checked", "true");

And that’s it for the edits. Now repackage the Jar archive:

$ jar cf messenger.jar content

Close Thunderbird, overwrite the original Jar file with the amended one (make a backup copy first, of course) and restart Thunderbird.
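
The backup-and-overwrite step can be wrapped in a few lines of shell. A sketch only: install_jar is a hypothetical name, and the path in the usage line is the one from the jar xf command above, which may well be different on another installation.

```shell
# install_jar ORIGINAL NEW
# Keep a one-time backup of ORIGINAL as ORIGINAL.orig, then overwrite it
# with NEW. An existing backup is never overwritten, so running this
# repeatedly still leaves the pristine file in place.
install_jar() {
    orig=$1; new=$2
    [ -f "$orig" ] && [ -f "$new" ] || { echo "missing file" >&2; return 1; }
    [ -e "$orig.orig" ] || cp -p "$orig" "$orig.orig" || return 1
    cp "$new" "$orig"
}

# Usage, as root and with Thunderbird closed:
#   install_jar /usr/lib64/thunderbird-3.0/chrome/messenger.jar ./messenger.jar
```

Refusing to overwrite an existing backup matters when several attempts are needed: otherwise the second run would replace the pristine copy with the first hacked version.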

I should add that there are several reasons to be surprised that this is enough. For example, while working on this, I noted that there are several direct calls to OutputFormatMenuSelect() that attempt to fake a click on one of the HTML-enabling radio buttons. In practice, plain text messages are generated even though these calls aren’t addressed directly.

Other stuff

During the process of figuring out how to solve this issue, I found a few tricks that may be useful in the future. So here they are.

Open all jars you can find

$ find /usr/lib64/thunderbird-3.0/ -iname '*.jar' | while read i ; do ( mkdir "${i##*/}" && cd "${i##*/}" && jar xf "$i" ; ) done

This extracts each jar into a directory under the current working directory, named after the jar file (including the .jar suffix).

Set the default HTML format

mail.default_html_action: Set from 3 to 1. Seems not to have a significant effect.

Enabling the dump() command

dump() is used in internal JavaScript code to produce debug messages, which are printed to stdout. This requires running Thunderbird from the command line.

In the Config Editor mentioned above, add the boolean browser.dom.window.dump.enabled and set it to true. Otherwise nothing is printed.

Creating stack traces

function DumpTrace()
{
 var err = new Error();

 dump("\nStack trace:\n" + err.stack + "\n\n");
}

The stack trace is pretty ugly, and contains a DumpTrace() too, but it’s good enough to find out why a certain function is called.