my tech blog

Reading the firmware ROM from a Renesas uPD720202 USB 3.0 Host Controller using Linux

Pretty much as a side note, I should mention that the firmware should and can be loaded with a Windows utility named K2024FWUP1.exe. Get it from wherever you can, and verify it isn’t dirty with

$ shasum K2024FWUP1.exe
c9414cb825af79f5d87bd9772e10e87633fbf125  K2024FWUP1.exe
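
The comparison with the expected digest can also be scripted. This is a minimal sketch, assuming that coreutils’ sha1sum is available (it prints the same digest as shasum does by default); check_sha1 is a made-up helper name.

```bash
#!/bin/bash
# Compare a file's SHA-1 digest with an expected value.
# check_sha1 is a hypothetical helper name, not part of any tool.
check_sha1() {
  local file=$1 expected=$2 actual
  actual=$(sha1sum "$file" | cut -d ' ' -f 1)
  if [ "$actual" = "$expected" ]; then
    echo "OK"
  else
    echo "MISMATCH"
  fi
}

# Example, using the digest quoted above:
# check_sha1 K2024FWUP1.exe c9414cb825af79f5d87bd9772e10e87633fbf125
```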

If the firmware isn’t loaded, Windows’ Device Manager will say that the device can’t be started, and the Linux kernel will complain with

pci 0000:06:00.0: xHCI HW not ready after 5 sec (HC bug?) status = 0x1801

[...]

xhci_hcd 0000:06:00.0: can't setup: -110
xhci_hcd 0000:06:00.0: USB bus 3 deregistered
xhci_hcd 0000:06:00.0: init 0000:06:00.0 fail, -110
xhci_hcd: probe of 0000:06:00.0 failed with error -110

Now to the Linux part. This is just the series of commands I used to read from the firmware ROM of a Renesas USB controller detected as:

# lspci -s 06:00
06:00.0 USB controller: Renesas Technology Corp. uPD720202 USB 3.0 Host Controller (rev 02)

The point was to check if the ROM was erased (it was). I followed the instructions in the “μPD720201/μPD720202 User’s Manual: Hardware” (R19UH0078EJ0600, Rev.6.00), section 6.

Check if ROM exists:

# setpci -s 06:00.0 f6.w
8000

Bit 15=1, so yes, ROM exists. Check type and parameter:

# setpci -s 06:00.0 ec.l
00c22210
# setpci -s 06:00.0 f0.l
00000500

OK, according to table 6-1 of the Hardware User Manual, it’s a MX25L5121E.

Write magic word to DATA0:

# setpci -s 06:00.0 f8.l=53524F4D

Set “External ROM Access Enable”:

# setpci -s 06:00.0 f6.w=8001

Check “Result Code”:

# setpci -s 06:00.0 f6.w
8001

Indeed, bits 6:4 are zero — no result yet, as required for this stage in the Guide.

Now set Get DATA0 and Get DATA1, and check that they have been cleared:

# setpci -s 06:00.0 f6.w=8c01
# setpci -s 06:00.0 f6.w
8001
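
For keeping track of the values read from and written to offset f6, a small decoder can help. This is only a sketch: decode_f6 is a made-up name, and the bit positions are the ones implied by the values used in this post (bit 15 for ROM exists, bits 11 and 10 for Get DATA1 and Get DATA0, bits 6:4 for the result code, bit 0 for the access enable).

```bash
#!/bin/bash
# Split a 16-bit value from offset f6 into the fields used in this post.
# decode_f6 is a hypothetical helper; the bit positions are inferred from
# the values written above (8001, 8401, 8801, 8c01).
decode_f6() {
  local v=$(( 0x$1 ))
  echo "rom_exists=$(( (v >> 15) & 1 ))" \
       "get_data1=$(( (v >> 11) & 1 ))" \
       "get_data0=$(( (v >> 10) & 1 ))" \
       "result_code=$(( (v >> 4) & 7 ))" \
       "access_enable=$(( v & 1 ))"
}

decode_f6 8c01
```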

Get first piece of data from DATA0:

# setpci -s 06:00.0 f8.l
ffffffff

The ROM appears to be erased… Set Get DATA0 again, and read DATA1 (this is really what the Guide says)

# setpci -s 06:00.0 f6.w=8401
# setpci -s 06:00.0 fc.l
ffffffff

Yet another erased word. And now the other way around: Set Get DATA1 and read DATA0 again:

# setpci -s 06:00.0 f6.w=8801
# setpci -s 06:00.0 f8.l
ffffffff

And the other way around again…

# setpci -s 06:00.0 f6.w=8401
# setpci -s 06:00.0 fc.l
ffffffff

When done, clear “External ROM Access Enable”

# setpci -s 06:00.0 f6.w=8000

This rewinds the next set of operations to the beginning of the ROM, as I’ve seen by trying it out, even though the Guide wasn’t so clear about it. So if the sequence shown above is run again, it reads from the beginning of the ROM again.

Again, with the ROM loaded with firmware

# setpci -s 06:00.0 f6.w
8000
# setpci -s 06:00.0 f8.l=53524F4D
# setpci -s 06:00.0 f6.w=8001
# setpci -s 06:00.0 f6.w
8001
# setpci -s 06:00.0 f6.w=8c01
# setpci -s 06:00.0 f6.w
8001
# setpci -s 06:00.0 f8.l
7da655aa
# setpci -s 06:00.0 f6.w=8401
# setpci -s 06:00.0 fc.l
00f60014
# setpci -s 06:00.0 f6.w=8801
# setpci -s 06:00.0 f8.l
004c010c
# setpci -s 06:00.0 f6.w=8401
# setpci -s 06:00.0 fc.l
2ffc015c
# setpci -s 06:00.0 f6.w=8801
# setpci -s 06:00.0 f8.l
0008315c
# setpci -s 06:00.0 f6.w=8401
# setpci -s 06:00.0 fc.l
1a5c2024
# setpci -s 06:00.0 f6.w=8000

I stopped after a few words, of course. Note that the first word is indeed the correct signature.
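
To read more than a few words, the alternating pattern above can be generated rather than typed. The following is a sketch that only prints the setpci commands, so they can be reviewed before being run as root. rom_read_commands is a made-up name, and the sketch assumes, as in the session above, that each Get request has completed by the time the next command runs (the check reads of f6.w are left out).

```bash
#!/bin/bash
# Print the setpci commands for reading the first N 32-bit words of the
# ROM, following the sequence shown above. Nothing is executed.
# rom_read_commands is a hypothetical helper name.
rom_read_commands() {
  local dev=$1 count=$2 i
  echo "setpci -s $dev f8.l=53524F4D"     # magic word to DATA0
  echo "setpci -s $dev f6.w=8001"         # External ROM Access Enable
  echo "setpci -s $dev f6.w=8c01"         # Get DATA0 + Get DATA1
  for (( i = 0; i < count; i++ )); do
    if (( i > 0 )); then
      # Refill the register that was read in the previous round
      if (( i % 2 == 1 )); then
        echo "setpci -s $dev f6.w=8401"   # Get DATA0
      else
        echo "setpci -s $dev f6.w=8801"   # Get DATA1
      fi
    fi
    if (( i % 2 == 0 )); then
      echo "setpci -s $dev f8.l"          # read DATA0
    else
      echo "setpci -s $dev fc.l"          # read DATA1
    fi
  done
  echo "setpci -s $dev f6.w=8000"         # clear the access enable bit
}

rom_read_commands 06:00.0 6
```

With a count of 6, this prints the same writes and data reads as the session with the loaded ROM above.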

Posted Under: Linux,USB
This post was written by eli on November 3, 2015 Comments (5)

Cursor control characters in a bash script

To control the cursor’s position with a plain bash “echo” command, use the fact that bash’s $’something’ quoting interprets that something more or less like a C string with escape sequences. So the ESC character, having ASCII code 0x1b, can be generated with $’\x1b’. $’\e’ is also OK, by the way.

There are plenty of sources for TTY commands, for example this and this.

So, to jump to the upper-left corner of the screen, just go

$ echo -n $'\x1b'[H

Alternatively, one can use echo’s -e flag, which is the method chosen in /etc/init.d/functions to produce color-changing escape characters. So the “home” sequence could likewise be

$ echo -en \\033[H

As easy as that.
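
The same sequence takes an optional row and column (ESC [ row ; column H), so jumping to an arbitrary position is just as simple. A minimal sketch, with cursor_to as a made-up helper name, and printf used instead of echo so that the escape is written the same way regardless of echo’s flags:

```bash
#!/bin/bash
# Move the cursor to a given row and column (both counted from 1), using
# the same ESC [ ... H sequence as the "home" example above.
# cursor_to is a hypothetical helper name.
cursor_to() {
  printf '\033[%d;%dH' "$1" "$2"
}

# Example: write a word at row 5, column 10
# cursor_to 5 10; echo "Hello"
```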

Posted Under: Linux,Software
This post was written by eli on October 23, 2015 Comments (0)

Using Linux’ setpci to program an EEPROM attached to a PLX / Avago PCIe switch

Introduction

These are my notes as I programmed an Atmel AT25128 EEPROM, attached to a PEX 8606 PCIe switch, using PCIe configuration-space writes only (that is, no I2C / SMBus cable). This is frankly quite redundant, as Avago supplies software tools for doing this.

In fact, in order to get their tools, register at Avago’s site, then make the extra registration at PLX Tech’s site. None of these registrations require signing an NDA. At PLX Tech’s site, pick SDK -> PEX at the bottom of the list of devices to get documentation for, and download the PLX SDK. Among others, this suite includes the PEX Device Editor, which is quite a useful tool regardless of switches, as it gives a convenient tree view of the bus. The Device Editor, as well as other tools, allow programming the EEPROM from the host, with or without an I2C cable.

There are also other tools in the SDK that do the same thing, PLXMon in particular. If you have an Aardvark I2C to USB cable, the PLXMon tool allows reading and writing to the EEPROM through I2C. And there’s a command line interface, probably for all functionality. So really, this post is for those who want to get down to the gory details.

All said below will probably work with the entire PEX 86xx family, and possibly with other Avago devices as well. The Data Book is your friend.

The EEPROM format

The organization of data in the EEPROM is outlined in the Data Book, but to keep it short and concise: It’s a sequence of bytes, consisting of a concatenation of the following words, all represented in Little Endian format:

1. The signature, always 0x5a, occupying one byte
2. A zero (0x00), occupying one byte
3. The number of bytes of payload data to come, given as a 16-bit word (two bytes). Or equivalently, the number of registers to be written to, multiplied by 6.
4. The address of the register to be written to, divided by 4, and ORed with the port number, left shifted by 10 bits. See the data book for how NT ports are addressed. This field occupies 16 bits (two bytes). Or to put it in C’ish:
```
unsigned short addr_field = (reg_addr >> 2) | (port << 10);
```
5. The data to be written: 32 bits (four bytes)

Items #4 and #5 are repeated for each register write. There is no alignment, so when this stream is organized in 32-bit words, it becomes somewhat inconvenient.

And as the Data Book keeps saying all over the place: If the Debug Control register (at 0x1dc) is written to, it has to be the first entry (occupying bytes 4 to 9 in the stream). Its address representation in the byte stream is 0x0077, for example (or more precisely, the byte 0x77 followed by 0x00).
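
Since the unaligned byte stream is awkward to pack into 32-bit words by hand, here is a sketch of a helper that does the packing. It takes triplets of register address, port number and value, and prints the words in the form that setpci accepts. eeprom_words is a made-up name, and padding the last word with ff bytes is my assumption (it mimics erased EEPROM cells).

```bash
#!/bin/bash
# Build the EEPROM byte stream and print it as 32-bit little-endian words.
# Arguments come in triplets: register address, port number, 32-bit value.
# eeprom_words is a hypothetical helper; padding the last word with ff
# bytes is an assumption (it mimics an erased EEPROM).
eeprom_words() {
  local -a bytes
  local len=$(( ($# / 3) * 6 )) field val i
  # Signature, zero byte, and payload length (low byte first)
  bytes=( 5a 00 "$(printf '%02x' $(( len & 0xff )))" \
          "$(printf '%02x' $(( (len >> 8) & 0xff )))" )
  while [ $# -ge 3 ]; do
    field=$(( ($1 >> 2) | ($2 << 10) ))
    val=$(( $3 ))
    bytes+=( "$(printf '%02x' $(( field & 0xff )))" \
             "$(printf '%02x' $(( (field >> 8) & 0xff )))" )
    for i in 0 8 16 24; do
      bytes+=( "$(printf '%02x' $(( (val >> i) & 0xff )))" )
    done
    shift 3
  done
  # Pad to a multiple of four bytes
  while (( ${#bytes[@]} % 4 != 0 )); do bytes+=( ff ); done
  # Each output word shows its four bytes with the first byte rightmost
  for (( i = 0; i < ${#bytes[@]}; i += 4 )); do
    echo "${bytes[i+3]}${bytes[i+2]}${bytes[i+1]}${bytes[i]}"
  done
}

eeprom_words 0x204 0 0x000100ff
```

For a single write of 0x000100ff to register 0x204 on port 0, this prints 0006005a, 00ff0081 and ffff0001.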

Accessing configuration space registers

Given the following PCI bus setting:

02:00.0 PCI bridge: PLX Technology, Inc. Unknown device 8606 (rev ba)
03:01.0 PCI bridge: PLX Technology, Inc. Unknown device 8606 (rev ba)
03:05.0 PCI bridge: PLX Technology, Inc. Unknown device 8606 (rev ba)
03:07.0 PCI bridge: PLX Technology, Inc. Unknown device 8606 (rev ba)
03:09.0 PCI bridge: PLX Technology, Inc. Unknown device 8606 (rev ba)

In particular note that the switch’ upstream port 0 is at 02:00.0.

Reading from the Serial EEPROM Buffer register at 264h (as root, of course):

# setpci -s 02:00.0 264.l
00000000

The -s 02:00.0 part selects the device by its bus position (see above).

Note that all arguments as well as return values are given in hexadecimal. An 0x prefix is allowed, but it’s redundant.

Making a dry-run of writing to this register, and verifying nothing happened:

# setpci -Dv -s 02:00.0 264.l=12345678
02:00.0:264 12345678
# setpci -s 02:00.0 0x264.l
00000000

Now let’s write for real:

# setpci -s 02:00.0 264.l=12345678
# setpci -s 02:00.0 264.l
12345678

(Yey, it worked)

Reading from the EEPROM

Reading four bytes from the EEPROM at address 0:

# setpci -s 02:00.0 260.l=00a06000
# setpci -s 02:00.0 264.l
0012005a

The “a0″ part above sets the address width explicitly to 2 bytes on each operation. There may be some confusion otherwise, in particular if the device wasn’t detected properly at bringup. The “60″ part means “read”.

Just checking the value of the status register after this:

# setpci -s 02:00.0 260.l
00816000

Same, but read from EEPROM address 4. The 13 LSBs of the register are used as bits [14:2] of the EEPROM’s byte address, so the address is given in units of 4 bytes. It’s also possible to access higher addresses (see the respective Data Book).

# setpci -s 02:00.0 260.l=00a06001
# setpci -s 02:00.0 264.l
0008c03a

Or, to put it in a simple Bash script (this one reads the first 16 DWords, i.e. 64 bytes, from the EEPROM of the switch located at the bus address given as the argument to the script; see example below):

#!/bin/bash

DEVICE=$1

for ((i=0; i<16; i++)); do
  setpci -s $DEVICE 260.l=`printf '%08x' $((i+0xa06000))`
  usleep 100000
  setpci -s $DEVICE 264.l
done

Rather than checking the status bit for the read to be finished, the script waits 100 ms. Quick and dirty solution, but works.

Note: usleep is deprecated as a command-line utility. Instead, odds are that “sleep 0.1″ replaces “usleep 100000″. Yes, sleep takes non-integer arguments in non-ancient UNIXes.

Writing to the EEPROM

Important: Writing to the EEPROM, in particular the first word, can make the switch ignore the EEPROM or load faulty data into the registers. On some boards, the EEPROM is essential for the detection of the switch by the host and its enumeration. Consequently, writing junk to the EEPROM can make it impossible to rectify this through the PCIe interface. This can render the PCIe switch useless, unless this is fixed with I2C access.

Before starting to write, the EEPROM’s write enable latch needs to be set. This is done once for each write as follows, regardless of the desired target address:

# setpci -s 02:00.0 260.l=00a0c000

Now we’ll write 0xdeadbeef to the first 4 bytes of the EEPROM.

# setpci -s 02:00.0 264.l=deadbeef
# setpci -s 02:00.0 260.l=00a04000

If another address is desired, add the address in bytes, divided by 4, to 00a04000 above. The write enable latch command is the same (no change in the lower bits is required).
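
The arithmetic for the value written to register 260h can be wrapped in a small function. This is a sketch based only on the three base values used in this post (read, write, and write enable); eeprom_cmd is a made-up name, and the range check follows from the 13-bit address field.

```bash
#!/bin/bash
# Compute the value to write to register 260h for a given operation and
# EEPROM byte address. The base values are the ones used in this post;
# eeprom_cmd is a hypothetical helper name.
eeprom_cmd() {
  local op=$1 addr=$(( $2 )) base
  case $op in
    read)  base=0xa06000 ;;
    write) base=0xa04000 ;;
    wren)  base=0xa0c000; addr=0 ;;   # write enable takes no address
    *) echo "unknown operation: $op" >&2; return 1 ;;
  esac
  if (( addr % 4 != 0 || addr / 4 >= 8192 )); then
    echo "address must be a multiple of 4 below 32768" >&2
    return 1
  fi
  printf '%08x\n' $(( base + addr / 4 ))
}

eeprom_cmd read 4
```

So something like setpci -s 02:00.0 260.l=$(eeprom_cmd read 4) is the same as writing 00a06001 directly.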

Here’s an example of the sequence for writing to bytes 4-7 of the EEPROM (all three lines are always required)

# setpci -s 02:00.0 260.l=00a0c000
# setpci -s 02:00.0 264.l=010d0077 # Just any value goes
# setpci -s 02:00.0 260.l=00a04001

Or making a script of this, which writes the arguments from address 0 and on (for those who like to make big mistakes…)

#!/bin/bash

numargs=$#
DEVICE=$1

shift

for ((i=0; i<(numargs-1); i++)); do
  setpci -s $DEVICE 260.l=00a0c000
  setpci -s $DEVICE 264.l=$1
  setpci -s $DEVICE 260.l=`printf '%08x' $((i+0xa04000))`
  usleep 100000
  shift
done

Again, usleep can be replaced with a plain sleep with a non-integer argument. See above.

Example of using these scripts

# ./writeeeprom.sh 02:00.0 0006005a 00ff0081 ffff0001
# ./readeeprom.sh 02:00.0
0006005a
00ff0081
ffff0001
ffffffff
ffffffff
ffffffff
ffffffff
ffffffff
ffffffff
ffffffff
ffffffff
ffffffff
ffffffff
ffffffff
ffffffff
ffffffff
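
To make sense of a dump like this, the words can be turned back into a list of register writes. The following sketch follows the EEPROM format described earlier; decode_eeprom is a made-up name, and the words are expected in lowercase hex, as setpci prints them.

```bash
#!/bin/bash
# Decode 32-bit words read from the EEPROM into register writes.
# decode_eeprom is a hypothetical helper; its arguments are the words as
# printed by setpci (lowercase hex, eight digits each).
decode_eeprom() {
  local -a bytes
  local w len i field value
  for w in "$@"; do
    # Each word holds four stream bytes, the first one in its lowest byte
    bytes+=( "${w:6:2}" "${w:4:2}" "${w:2:2}" "${w:0:2}" )
  done
  if [ "${bytes[0]}" != "5a" ]; then
    echo "bad signature" >&2
    return 1
  fi
  len=$(( 0x${bytes[3]}${bytes[2]} ))
  # Six bytes per entry: 16-bit address field, then 32 bits of data
  for (( i = 4; i + 6 <= 4 + len; i += 6 )); do
    field=$(( 0x${bytes[i+1]}${bytes[i]} ))
    value="${bytes[i+5]}${bytes[i+4]}${bytes[i+3]}${bytes[i+2]}"
    printf 'port=%d reg=0x%03x value=0x%s\n' \
      $(( field >> 10 )) $(( (field & 0x3ff) << 2 )) "$value"
  done
}

decode_eeprom 0006005a 00ff0081 ffff0001
```

For the three words written in the example above, this prints port=0 reg=0x204 value=0x000100ff.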

When the EEPROM gets messed up

It’s more than possible that the switch becomes unreachable to the host as a result of messing up the EEPROM’s content, for example by changing the upstream port setting. A simple way out, if a blank EEPROM is good enough for talking with the switch, is to force the EEPROM to go undetected by e.g. short-circuiting the EEPROM’s SO pin (pin number 2 on AT25128) to ground with a 33 Ohm resistor or so. This prevents the data from being loaded, but the commands above will nevertheless work, so the content can be altered. Yet another “dirty, but works” solution.

Posted Under: Linux,PCI express,Software
This post was written by eli on October 21, 2015 Comments (3)

Moving a Windows 7-installed hard disk to a new computer

This has been documented elsewhere, but it’s important enough to have a note about here.

In short, before switching to new hardware, it’s essential to prepare the installation for it, or a 0x0000007b blue screen will occur on the new hardware.

The trick is to run sysprep.exe (under windows\system32\sysprep\) before the transition. Have “Generalize” checked, and choose “shutdown” at the end of the operation (“Shutdown Options”).

Once the computer shuts down, move the hard disk to the new computer. Windows should boot smoothly, and start a series of installation stages, including feeding the license key and language settings. Also, an account needs to be created. This account can be deleted afterwards, as the old account is kept. Quite silly, as a matter of fact.

Posted Under: Microsoft
This post was written by eli on October 20, 2015 Comments (0)

Linux kernel hack for calming down a flood of PCIe AER messages

While working on a project involving a custom PCIe interface, Linux’ message log became flooded with messages like

pcieport 0000:00:1c.6:   device [8086:a116] error status/mask=00001081/00002000
pcieport 0000:00:1c.6:    [ 0] Receiver Error
pcieport 0000:00:1c.6:    [ 7] Bad DLLP
pcieport 0000:00:1c.6:    [12] Replay Timer Timeout
pcieport 0000:00:1c.6:   Error of this Agent(00e6) is reported first
pcieport 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0200(Transmitter ID)
pcieport 0000:02:00.0:   device [10b5:8606] error status/mask=00003000/00002000
pcieport 0000:02:00.0:    [12] Replay Timer Timeout
pcieport 0000:00:1c.6: AER: Corrected error received: id=00e6
pcieport 0000:00:1c.6: can't find device of ID00e6
pcieport 0000:00:1c.6: AER: Corrected error received: id=00e6
pcieport 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0200(Transmitter ID)

And before long, some 400 MB of log messages accumulated in /var/log/messages. In this context, they are merely informative AER (Advanced Error Reporting) messages, telling me that errors have occurred in the link between the computer’s PCIe controller and the PCIe switch on the custom board. But all of these errors were correctable (presumably with retransmits) so from a functional standpoint, the hardware worked.
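
The bracketed numbers in these messages are bit positions in the status word. Judging by the log above, only bits that aren’t set in the mask get a line of their own (status 00003000 with mask 00002000 yields [12] only). A sketch for doing this by hand, with unmasked_bits as a made-up name:

```bash
#!/bin/bash
# List the bit numbers that are set in an AER status word and not masked.
# unmasked_bits is a hypothetical helper; both arguments are hex strings
# as they appear after "error status/mask=" in the log.
unmasked_bits() {
  local v=$(( 0x$1 & ~0x$2 )) bit out=""
  for (( bit = 0; bit < 32; bit++ )); do
    if (( (v >> bit) & 1 )); then out+="$bit "; fi
  done
  echo "${out% }"
}

unmasked_bits 00001081 00002000
unmasked_bits 00003000 00002000
```

The first call prints 0 7 12 (Receiver Error, Bad DLLP and Replay Timer Timeout in the log above), and the second prints 12.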

Advanced Error Reporting and its Linux driver were explained in OLS 2007 (pdf).

Had it not been for these messages, I could have been misled to think that all was fine, even though there’s a method to tell, which I’ve dedicated an earlier post to. So they’re precious, but they flood the system logs, and even worse, the system was so busy handling them that the boot was slowed down, and sometimes the boot process got stuck in the middle.

At first I thought that it would be enough to just turn off the logging of these messages, but it seems like the flood of interrupts was the problem.

So one way out is to disable the handler of AER altogether: Use the pci=noaer kernel parameter on boot, or disable the CONFIG_PCIEAER kernel configuration flag, and recompile the kernel. This removes the piece of code that configures the computer’s root port to send interrupts if and when an AER message arrives, but that way I won’t be alerted that a problem exists.

So I went for hacking the kernel code. In an early attempt, I went for not producing error messages for each event, but to keep it down to no more than 5 per second. It worked in the sense that the log wasn’t flooded, but didn’t solve the problem of a slow or impossible boot. As mentioned earlier, the core problem seems to be a bombardment of interrupts.

So the hack that eventually did the job for me tells the root port to stop generating interrupts after 100 kernel messages have been produced. That’s enough to inform me that there’s a problem, and give me an idea of where it is, but it stops soon enough to let the system live.

The only file I modified was drivers/pci/pcie/aer/aerdrv_errprint.c on a 4.2.0 Linux kernel. In retrospect, I could have done it more elegantly. But hey, now that it works, why should I care…?

It goes like this: I defined a static variable, countdown, and initialized it to 100. Before a message is produced, a piece of code like this runs:

	if (!countdown--)
		aer_enough_is_enough(dev);

aer_enough_is_enough() is merely a copy of aerdrv.c’s aer_disable_rootport(), which is defined as static there, and requires an uncomfortable argument. It would have made more sense to make aer_disable_rootport() a wrapper of another function, which could have been used both by aerdrv.c and my little hack. That would have been much more elegant.

Instead, I copied two additional static functions that are required by aer_disable_rootport() into aerdrv_errprint.c, and ended up with an ugly hack that solves the problem.
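
A note on the countdown test: because of the post-decrement, it’s true exactly once, on the call that finds the counter at zero. The counter then goes negative, so the interrupts aren’t turned off again and again. This little model of the logic (in bash rather than C, with first_trigger as a made-up name) shows on which call the condition is met:

```bash
#!/bin/bash
# Model of the "if (!countdown--)" test: report on which calls it fires.
# first_trigger is a hypothetical helper; $1 is the initial counter value,
# $2 the number of simulated calls.
first_trigger() {
  local countdown=$1 calls=$2 call fired=""
  for (( call = 1; call <= calls; call++ )); do
    # Post-decrement: compare the old value, then decrement
    if (( countdown-- == 0 )); then fired+="$call "; fi
  done
  echo "${fired% }"
}

first_trigger 100 1000
```

This prints 101: the first 100 calls pass, and the 101st one triggers the shutdown.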

With all due shame, here are the changes in patch format. It’s not intended to apply on your kernel as is. It’s more intended to be a guideline to how to get it done. And by all means, take a look at aerdrv.c’s relevant functions, and see if they’re different, by any chance.

From b007850486167288ea4c6c6a1bf30ddd1a299f24 Mon Sep 17 00:00:00 2001
From: Eli Billauer <my-mail@gmail.com>
Date: Sat, 17 Oct 2015 07:37:19 +0300
Subject: [PATCH] PCIe AER handler: Turn off interrupts from root port after 100 messages

---
 drivers/pci/pcie/aer/aerdrv_errprint.c |   78 ++++++++++++++++++++++++++++++++
 1 files changed, 78 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
index 167fe41..31a8572 100644
--- a/drivers/pci/pcie/aer/aerdrv_errprint.c
+++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
@@ -20,6 +20,7 @@
 #include <linux/pm.h>
 #include <linux/suspend.h>
 #include <linux/cper.h>
+#include <linux/pcieport_if.h>

 #include "aerdrv.h"
 #include <ras/ras_event.h>
@@ -129,6 +130,74 @@ static const char *aer_agent_string[] = {
 	"Transmitter ID"
 };

+/* Two functions copied from aerdrv.c, to prevent name space pollution */
+
+static int set_device_error_reporting(struct pci_dev *dev, void *data)
+{
+	bool enable = *((bool *)data);
+	int type = pci_pcie_type(dev);
+
+	if ((type == PCI_EXP_TYPE_ROOT_PORT) ||
+	    (type == PCI_EXP_TYPE_UPSTREAM) ||
+	    (type == PCI_EXP_TYPE_DOWNSTREAM)) {
+		if (enable)
+			pci_enable_pcie_error_reporting(dev);
+		else
+			pci_disable_pcie_error_reporting(dev);
+	}
+
+	if (enable)
+		pcie_set_ecrc_checking(dev);
+
+	return 0;
+}
+
+/**
+ * set_downstream_devices_error_reporting - enable/disable the error reporting  bits on the root port and its downstream ports.
+ * @dev: pointer to root port's pci_dev data structure
+ * @enable: true = enable error reporting, false = disable error reporting.
+ */
+static void set_downstream_devices_error_reporting(struct pci_dev *dev,
+						   bool enable)
+{
+	set_device_error_reporting(dev, &enable);
+
+	if (!dev->subordinate)
+		return;
+	pci_walk_bus(dev->subordinate, set_device_error_reporting, &enable);
+}
+
+/* Allow 100 messages, and then stop it. Since the print functions are called
+   from a work queue, it's safe to call anything, aer_disable_rootport()
+   included. */
+
+static int countdown = 100;
+
+/* aer_enough_is_enough() is a copy of aer_disable_rootport(), only the
+   latter requires to get the aer_rpc structure from the pci_dev structure,
+   and then uses it to get the pci_dev structure. So enough with that too.
+*/
+
+static void aer_enough_is_enough(struct pci_dev *pdev)
+{
+	u32 reg32;
+	int pos;
+
+	dev_err(&pdev->dev, "Exceeded limit of AER errors to report. Turning off Root Port interrupts.\n");
+
+	set_downstream_devices_error_reporting(pdev, false);
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR);
+	/* Disable Root's interrupt in response to error messages */
+	pci_read_config_dword(pdev, pos + PCI_ERR_ROOT_COMMAND, &reg32);
+	reg32 &= ~ROOT_PORT_INTR_ON_MESG_MASK;
+	pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_COMMAND, reg32);
+
+	/* Clear Root's error status reg */
+	pci_read_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, &reg32);
+	pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, reg32);
+}
+
 static void __print_tlp_header(struct pci_dev *dev,
 			       struct aer_header_log_regs *t)
 {
@@ -168,6 +237,9 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
 	int layer, agent;
 	int id = ((dev->bus->number << 8) | dev->devfn);

+	if (!countdown--)
+		aer_enough_is_enough(dev);
+
 	if (!info->status) {
 		dev_err(&dev->dev, "PCIe Bus Error: severity=%s, type=Unaccessible, id=%04x(Unregistered Agent ID)\n",
 			aer_error_severity_string[info->severity], id);
@@ -200,6 +272,9 @@ out:

 void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
 {
+	if (!countdown--)
+		aer_enough_is_enough(dev);
+
 	dev_info(&dev->dev, "AER: %s%s error received: id=%04x\n",
 		info->multi_error_valid ? "Multiple " : "",
 		aer_error_severity_string[info->severity], info->id);
@@ -226,6 +301,9 @@ void cper_print_aer(struct pci_dev *dev, int cper_severity,
 	u32 status, mask;
 	const char **status_strs;

+	if (!countdown--)
+		aer_enough_is_enough(dev);
+
 	aer_severity = cper_severity_to_aer(cper_severity);

 	if (aer_severity == AER_CORRECTABLE) {
--
1.7.2.3

And again — it’s given as a patch, but really, it’s not intended for application as is. If you need to do this yourself, read through the patch, understand what it does, and make the changes with respect to your own kernel. Or your system may just hang.

Posted Under: Linux,Linux kernel,PCI express
This post was written by eli on October 19, 2015 Comments (6)

syslogd notes

A few jots on playing with the system logger (the one that writes to /var/log/messages) on an ancient CentOS 5.5.

First, check the version: It says

Oct  6 15:12:06 diskless syslogd 1.4.1: restart.

So it’s a quite old revision of syslogd, unfortunately. There are no filter conditions to rely on.

The relevant configuration file is /etc/syslog.conf. First, one may divert the kernel’s log messages from /var/log/messages to /var/log/kernel-junk by changing

*.info;mail.none;authpriv.none;cron.none                /var/log/messages

to

*.info;mail.none;authpriv.none;cron.none;kern.none              /var/log/messages

kern.*                                                          /var/log/kernel-junk

Or, alternatively, keep kernel warnings and up in /var/log/messages, and send all kernel messages to kernel-junk (with lazy flushing):

*.info;mail.none;authpriv.none;cron.none;kern.none;kern.warn		/var/log/messages

kern.*							-/var/log/kernel-junk

The trick is that kern.none disables all kernel messages to /var/log/messages. The following kern.warn turns warnings and up back on. kernel-junk gets everything.

Posted Under: Linux,Linux kernel
This post was written by eli on October 17, 2015 Comments (0)

Hexdump notes

General notes

For a plain byte-by-byte hex dump,

$ hexdump -C

To dump a limited number of bytes, use the -n flag:

$ hexdump -C -n 64 /dev/urandom
00000000  9c 72 b0 43 da 6e 27 2f  f9 f1 34 06 60 d5 71 ad  |.r.C.n'/..4.`.q.|
00000010  cc 07 89 02 f7 f9 5f 85  f6 ba a5 24 cc 9f 2d d5  |......_....$..-.|
00000020  6d da 5b 91 a6 23 d4 94  51 1d 96 a7 5c 34 1a 48  |m.[..#..Q...\4.H|
00000030  6e 13 d4 3a 54 5d c5 c4  7b 1e f3 7b 6f 84 af 8b  |n..:T]..{..{o...|
00000040

And possibly add the -v flag so that repeated lines are printed out explicitly

$ hexdump -C -n 64 /dev/zero
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000040
$ hexdump -C -v -n 64 /dev/zero
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040

Hexdump scripting

Hexdump has a somewhat weird one-liner scripting syntax. It consists of the -e flag(s) followed by a string, which must be enclosed in single quotes. Within this string, there may be several double-quoted parts containing formatting info. Probably, the only way to really figure this out is trying some examples.

Everything in the expression runs as a loop.
n/m (n and m are integers) means: apply the format that immediately follows n times, consuming m bytes each time.
If there is more than one -e, each of them processes the same input data.
%08_ax is the data offset in hex. Also try “%10_ad: ” for decimal position.
Anything not interpreted is printed (a bit like printf). That includes, of course, “\n”.
For editing hex data, ghex can be handy

Scripting examples

Print out the input as 32-bit hex integers, one per line:

$ hexdump -v -e '1/4 "%08x " "\n"'

Same, but as 32-bit decimal numbers:

$ hexdump -v -e '1/4 "%08d " "\n"'

Dump mouse raw motion data, three bytes per line, each as a hex number:

$ hexdump -v -e '3/1 "%02x " "\n"' /dev/input/mice

Like “hexdump -C”, only explicitly:

$ hexdump -e '"%08_ax " 16/1 "%02x "' -e '" |" 16/1 "%_p" "|\n"'

The manpage offers a lot more detail on this.
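
A convenient way to experiment with format strings is to feed hexdump with a few known bytes through printf. A small sketch, reusing the three-bytes-per-line format from the mouse example (as_triplets is a made-up name):

```bash
#!/bin/bash
# Feed known bytes to hexdump to see how a format string consumes them.
# as_triplets is a hypothetical helper wrapping the mouse example's format.
as_triplets() {
  hexdump -v -e '3/1 "%02x " "\n"'
}

# Example: six bytes in, two lines of three hex numbers out
# printf 'ABCDEF' | as_triplets
```

The six bytes come out as 41 42 43 on the first line and 44 45 46 on the second.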

Posted Under: Linux,Software
This post was written by eli on October 7, 2015 Comments (0)

Linux kernel compilation jots

General

These are a few random notes to self regarding kernel compilation.

The preferred vanilla kernel repo to use is Linux Stable:

$ git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/

It’s often a good idea to pick a kernel version that was released a while ago, but with a high sub-subversion number. That way, it has been tested properly, and it also includes fixes for the bugs that were discovered down the road.

See another post of mine for avoiding “+” being added to the kernel’s version number.

Targeting i386

I compiled a kernel on a x86_64 machine, targeting an i386. Kind-of cross-compilation, but with no need for a cross compiler.

Remember to update the (extra) version number in the Makefile, but only with [-+.a-z0-9]+ characters, or else there will be trouble with creating .deb packages (see below). Don’t forget adding a dash (“-”) at the beginning of EXTRAVERSION, or else it will be glued to the SUBLEVEL.

Also remember that there’s always

$ make help

and it’s very useful.

After copying a known config file to .config:

$ make ARCH=i386 oldconfig
$ time make ARCH=i386 -j 12 bzImage modules && echo Success

A lazier version is to use the “olddefconfig” and then the “bindeb-pkg” make targets instead of those above (see below). The bonus is that everything gets neatly packaged, and most of what follows becomes unnecessary.

And as root (hey, the ARCH parameter wasn’t required!):

# make modules_install INSTALL_MOD_PATH=/path/to/

(this installs into /path/to/lib/modules/{version number}, so don’t write the “/lib/modules” part)

Remember to update the symbolic links to the source directory if necessary.

Note that it’s possible to set the kernel version directly from the make command, overriding the one given in the Makefile. For example, to match the currently running version:

$ make KERNELVERSION=`uname -r` ARCH=i386 -j 8 bzImage modules && echo Success

Be sure to check in include/generated/utsrelease.h, possibly while the kernel is compiling, that you got it right. In particular, there may be a “+” sign added.
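
This check is easy to script. A sketch, assuming that the file consists of a single #define UTS_RELEASE line with the version string in double quotes (show_utsrelease is a made-up name):

```bash
#!/bin/bash
# Print the release string from a utsrelease.h file, and say whether it
# carries a trailing "+". show_utsrelease is a hypothetical helper name.
show_utsrelease() {
  local file=${1:-include/generated/utsrelease.h} rel
  rel=$(sed -n 's/^#define UTS_RELEASE "\(.*\)"$/\1/p' "$file")
  case $rel in
    *+) echo "$rel (ends with +)" ;;
    *)  echo "$rel" ;;
  esac
}

# Example, from the top of the kernel tree:
# show_utsrelease
```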

A depmod was required on the running machine as follows (after booting with the kernel, without modules loaded), even though a depmod ran on modules_install:

# depmod -a

When hacking on the kernel sources, it can be useful to go something like

$ make ARCH=i386 SUBDIRS=drivers/pci/pcie/

in order to compile just a certain subdirectory (like “I didn’t do anything stupid, did I?”).

But no: SUBDIRS is deprecated. Use the “M=” alternative for modules, even though SUBDIRS catches the built-in objects as well (in case they were played with too).

And it’s also possible to add the known targets, such as

$ make ARCH=i386 SUBDIRS=drivers/pci/pcie/ clean

for cleaning up before compiling etc.

Installing on a Debian-based distribution

That includes Ubuntu, Linux Mint and all distributions where “apt” and “dpkg” are used to manage packages.

Instead of fiddling with all the files, just create .deb packages and install them. It’s so easy, and the files get the right names and locations without any hassle.

The command is simply

$ time make bindeb-pkg && echo Success

after a successful kernel compilation. I’ve tried to do this along with the compilation (of v6.8.12), either by adding bindeb-pkg to the targets or by requesting this target only. In both cases, the build failed after a few minutes, and I had little motivation to figure out why. The only downside is that the compilation number of the used kernel becomes #2 instead of #1, but that’s really petty.

Note that if the kernel is assigned an EXTRAVERSION in the Makefile, it must not contain any uppercase characters nor underscores. In fact, it has to match the regular expression [-+.a-z0-9]* or else an illegal Debian package name will be created, and the finale of the build will fail.
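
It takes a second to test a candidate EXTRAVERSION against this regular expression before starting a long build. A sketch, with check_extraversion as a made-up name:

```bash
#!/bin/bash
# Check a proposed EXTRAVERSION string against the regular expression
# quoted above. check_extraversion is a hypothetical helper name.
check_extraversion() {
  local LC_ALL=C   # make a-z mean exactly the lowercase ASCII letters
  if [[ $1 =~ ^[-+.a-z0-9]*$ ]]; then
    echo "valid"
  else
    echo "invalid"
  fi
}

check_extraversion -myserver
check_extraversion -MyServer
```

The first call prints valid, the second invalid.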

It takes about 12-20 minutes, and it appears to be stuck on the way, but in the end the following files are created in the Linux kernel tree’s parent directory. For a 6.8.12-myserver kernel targeting amd64, these are the files created:

linux-image-6.8.12-myserver_6.8.12-myserver-2_amd64.deb: The related files in /boot + modules in /lib/modules/
linux-headers-6.8.12-myserver_6.8.12-myserver-2_amd64.deb: The headers for compiling modules
linux-image-6.8.12-myserver-dbg_6.8.12-myserver-2_amd64.deb: Files apparently for debugging, a lot of them in /usr/lib/debug/lib/modules/
linux-libc-dev_6.8.12-myserver-2_amd64.deb: Header files for compiling user-space interface with the kernel, under /usr/include/

Installing these first two packages with “dpkg -i” does what I consider having the kernel installed on the machine: The kernel image in /boot, the kernel modules and the headers. It’s really that simple.

Creating headers for module compilation (non-Debian machine)

This is probably a somewhat off-beat way to create the files for /usr/src/ so that kernel modules can be compiled against the running kernel. The idea is to create a .deb file for the binary of the kernel, which necessarily includes the headers, and then fetch the desired parts from that. Maybe there’s a more straightforward way, but I don’t do this often enough to look for it.

Ah, and “make headers_install” is not the answer. It installs the headers used by user-space programs, not for compiling modules. Neither is “make modules_prepare”, which allows compiling the C sources against the kernel tree, but then the MODPOST stage in the compilation fails because Module.symvers is missing.

So start with creating the Debian package files with bindeb-pkg, as mentioned above.

To extract the .deb file that contains the compilation headers:

$ ar x linux-headers-5.15.0_5.15.0-1_amd64.deb

Yes, that’s “ar”, not “tar”. That produces three files, among others the data.tar.xz file. First, verify what it contains:

$ tar -tJf data.tar.xz | less

and once you’ve convinced yourself that it’s OK to untar this in the target’s root directory, become root, and go

# tar -C / -xJvf data.tar.xz

The -C flag makes tar change directory to / before extracting. Also note that this will update the “build” symlink in /lib/modules as well.

For a simple installation, do this for the linux-headers and linux-image packages.

Creating an initramfs file (when necessary)

For a non-running kernel, something like (needs to run as root, or it fails)

# update-initramfs -v -c -k 4.14.0-test -b .
update-initramfs: Generating ./initrd.img-4.14.0-test

(-v for verbose, not really necessary)

In theory, this should have worked as well, but it doesn’t enable the prompt for encrypted root filesystem, which is why I need the initramfs to begin with.

$ mkinitramfs -o initrd.img-4.14.0-test 4.14.0-test

Anyway, update-initramfs got me a 285 MB file, which doesn’t fit into the boot/ directory. The one that came with the distro was 28 MB.

On the other hand, if I do the same thing with the currently running kernel, the output gets small and neat. So maybe because the new kernel is much newer, and maybe because initramfs always copies a lot of modules, and not just in use, when it’s not from the running kernel.

It didn’t help bluffing mkinitramfs by renaming the directories in /lib/modules/ and run mkinitramfs as if it was on the current kernel. Exactly the same file size resulted.

So I opened the initramfs image manually (copying from myself), from within an empty directory with

$ zcat ../initrd.img-4.14.0-test | cpio -i -d -H newc --no-absolute-filenames

and looked for the large files. The directories lib/modules/4.14.0-test/kernel/drivers/{gpu,net,scsi} took ~620 MB together. So I removed these three, navigated to the root of the initramfs tree and compressed it back again:

$ find . -print0 | cpio --null -ov --format=newc | gzip -9 > ../smaller-initramfs.img

which shrank the image to 94 MB, which is small enough. The missing modules will load once the real root filesystem is mounted, so modules that aren’t necessary for boot can be deleted this way.
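The “looked for the large files” step can be done with du. This is only a sketch of one possible way: largest_dirs is a made-up helper name, and the default of ten entries is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: show the biggest directories under an unpacked initramfs tree.
# largest_dirs is a hypothetical helper name.
largest_dirs() {
    tree=$1           # root of the unpacked tree
    count=${2:-10}    # number of entries to show (arbitrary default)
    # du -k prints the cumulative size of each directory in 1024-byte
    # units; sort numerically, largest first
    du -k "$tree" | sort -rn | head -n "$count"
}

# Usage, from the root of the unpacked image:
#   largest_dirs . 20
```

Since the sizes are cumulative, the top entries are always the parent directories. The interesting ones are the deepest directories that still show a large number.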

Trying to obtain a smaller initramfs image with

# update-initramfs -u -b .

when the new kernel is running brought me back a 285 MB image. It’s probably a matter of the new kernel’s size. It might be necessary to write a script that removes from the initramfs’ /lib/modules any module that isn’t loaded when the kernel is up. But it’s not worth the effort at the moment.
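For the record, such a script could start out as something like the following. It’s a sketch only, and it just lists candidates rather than deleting anything. The function names are made up, and it assumes that the name in the first column of /proc/modules matches the module’s file name with hyphens turned into underscores. A module that is needed during boot but isn’t loaded when the list is taken would be listed too, so the output needs a human eye.

```shell
#!/bin/sh
# Sketch: list module files in an unpacked initramfs tree that are not
# currently loaded. Prints candidates only; deletes nothing.
# module_name and list_unloaded_modules are hypothetical helper names.

# Turn a module file path into the name format used in /proc/modules:
# strip the directory and the .ko (or .ko.xz, .ko.gz etc.) suffix, and
# replace '-' with '_'.
module_name() {
    basename "$1" | sed -e 's/\.ko\(\.[a-z0-9]*\)\{0,1\}$//' -e 's/-/_/g'
}

# $1: root of the unpacked initramfs
# $2: list of loaded modules (default: /proc/modules)
list_unloaded_modules() {
    tree=$1
    loaded=${2:-/proc/modules}
    find "$tree/lib/modules" -type f -name '*.ko*' | while read -r f; do
        name=$(module_name "$f")
        # The first column of /proc/modules is the module name
        if ! awk -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$loaded"; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage, from the root of the unpacked image:
#   list_unloaded_modules . > /tmp/candidates.txt
```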

Compiler versions

Since the target computer’s compiler version is really old, I got this when trying to compile a module on it:

$ make
make -C /lib/modules/5.15.0/build M=/home/eli/kernelmodule modules
make[1]: Entering directory '/usr/src/linux-headers-5.15.0'
warning: the compiler differs from the one used to build the kernel
  The kernel was built by: gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
  You are using:           gcc (Debian 4.9.2-10+deb8u2) 4.9.2
  CC [M]  /home/eli/kernelmodule/themodule.o
gcc: error: unrecognized command line option ‘-mrecord-mcount’

So I said, OK, let’s compile the kernel on the target machine (or in a root jail with a similar environment), but I got:

***
*** Compiler is too old.
***   Your GCC version:    4.9.2
***   Minimum GCC version: 5.1.0
***

So supporting old systems with new kernels isn’t all that easy. My solution is to compile the modules on the new machine, but nevertheless use the kernel headers just generated. They are still useful in the long run.
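To find out in advance whether a machine’s compiler passes this check, the version numbers can be compared in a script. A minimal sketch, assuming that GNU sort with the -V (version sort) option is available. The function name version_ge is made up, and 5.1.0 is just the minimum quoted in the error message above.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B.
# Hypothetical helper; relies on "sort -V" (version sort), which is a
# GNU coreutils extension.
version_ge() {
    # The smaller of the two versions sorts first; A >= B exactly when
    # that smaller one is B
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage with the numbers from the messages above:
#   version_ge 4.9.2 5.1.0 || echo "Compiler is too old"
#   version_ge 7.4.0 5.1.0 && echo "Compiler is OK"
```

A plain string comparison won’t do here, because it would rank 10.2.0 below 5.1.0.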

Posted Under: Linux,Linux kernel
This post was written by eli on October 1, 2015 Comments (0)

Instead of using “rm -rf”

The slightly safer alternative is

$ rm --one-file-system -vrf delme-junk/

There are two additional flags:

The -v flag causes “rm” to display the files as it deletes them. This gives the user a chance to stop the process if something goes completely wrong. Not as good as thinking before making the mistake, but much better than understanding it in hindsight.
The --one-file-system flag prevents deleting files from possibly mounted sub-filesystems. For example, when deleting a directory tree that contains bind mounts (a chroot jail, for instance) after forgetting to unmount them beforehand.

This is no substitute for thinking before typing, of course. ;)

Also, renaming the directory to a name that clearly means it should be deleted is helpful. In particular because the operation is stored in the command history, with the potential of being re-run accidentally. One may, however, exempt the command from being stored by putting a space as the first character of the command line, if bash is configured accordingly (HISTCONTROL includes ignorespace or ignoreboth).
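Checking for forgotten mounts before deleting is also possible. The sketch below lists the mount points beneath a directory, based on the second column of /proc/self/mounts. The function name mounts_under is made up, the directory must be given as an absolute path, and paths containing spaces aren’t handled (they appear escaped in that file).

```shell
#!/bin/sh
# Sketch: list mount points that live beneath a given directory.
# mounts_under is a hypothetical helper name.
# $1: absolute path of the directory about to be deleted
# $2: mount table to read (default: /proc/self/mounts)
mounts_under() {
    dir=${1%/}    # drop a trailing slash, if any
    # The second column is the mount point; match only paths strictly
    # below the directory
    awk -v d="$dir" 'index($2, d "/") == 1 { print $2 }' "${2:-/proc/self/mounts}"
}

# Usage (the path is an example only):
#   mounts_under /home/eli/delme-junk
```

No output means nothing is mounted under the directory.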

Posted Under: Linux,Software
This post was written by eli on September 12, 2015 Comments (0)

Linux: Yet Another Google Chrome “Aw, snap” solved.

“Aw, snap” in Google Chrome happens when a process (or thread?) involved with Chrome dies unexpectedly. I got quite a few of those, and learned to live with them for about a year, as I couldn’t figure out what caused them. It was clear that it had to do with Adobe Flash somehow, and that it happened in certain sites, in certain situations. For example, Facebook’s messenger page always crashed. For those pages, I diverted to using the slower Firefox.

Adobe Flash continued to work fine in many other sites however.

This problem started after upgrading the kernel on Fedora Core 12 to v3.12.20, which I compiled myself. Google Chrome is version 27.0.1453.93. All of these can be upgraded in theory, but given all kinds of dependencies, the only way was to upgrade Linux completely. And I wasn’t ready to mess up a stable computer that does a lot of other things just to get rid of an annoying issue with Chrome.

For some reason, I couldn’t get a crash report from Chrome. I managed to enable reporting, but no report was ever generated.

The clue was there all along. These log entries kept appearing in /var/log/messages every time I launched Chrome:

Aug  7 13:01:15 kernel: audit_printk_skb: 16 callbacks suppressed
Aug  7 13:01:15 kernel: type=1326 audit(1438941675.119:56738): ses=1 pid=15006 comm="chrome" sig=0 syscall=20 compat=1 ip=0xf2df8430 code=0x50000
Aug  7 13:01:15 kernel: type=1326 audit(1438941675.297:56739): ses=1 pid=15030 comm="chrome" sig=0 syscall=5 compat=1 ip=0xf2d80430 code=0x50000
Aug  7 13:01:15 kernel: type=1326 audit(1438941675.297:56740): ses=1 pid=15030 comm="chrome" sig=0 syscall=33 compat=1 ip=0xf2d80430 code=0x50000
Aug  7 13:01:15 kernel: type=1326 audit(1438941675.297:56741): ses=1 pid=15030 comm="chrome" sig=0 syscall=5 compat=1 ip=0xf2d98044 code=0x50000
Aug  7 13:01:15 kernel: type=1326 audit(1438941675.297:56742): ses=1 pid=15030 comm="chrome" sig=0 syscall=85 compat=1 ip=0xf2d80430 code=0x50000
Aug  7 13:01:15 kernel: type=1326 audit(1438941675.298:56743): ses=1 pid=15030 comm="chrome" sig=0 syscall=195 compat=1 ip=0xf2d80430 code=0x50000
Aug  7 13:01:15 kernel: type=1326 audit(1438941675.298:56744): ses=1 pid=15030 comm="chrome" sig=0 syscall=195 compat=1 ip=0xf2d80430 code=0x50000
Aug  7 13:01:15 kernel: type=1326 audit(1438941675.298:56745): ses=1 pid=15030 comm="chrome" sig=0 syscall=195 compat=1 ip=0xf2d80430 code=0x50000
Aug  7 13:01:15 kernel: type=1326 audit(1438941675.298:56746): ses=1 pid=15030 comm="chrome" sig=0 syscall=195 compat=1 ip=0xf2d80430 code=0x50000
Aug  7 13:01:15 kernel: type=1326 audit(1438941675.298:56747): ses=1 pid=15030 comm="chrome" sig=0 syscall=195 compat=1 ip=0xf2d80430 code=0x50000

Googling a bit on “audit chrome type=1326” I found this page making the connection with seccomp, and this page suggesting the solution in the comments.

Now, that made sense. Seccomp, in its original strict mode, is a mechanism in Linux to cut off a process irreversibly from the outer world, so it can only read() from and write() to already open file descriptors (supposedly going to pipes), terminate gracefully with exit(), or use sigreturn(). It’s a neat security mechanism for not-so-trusted code that only needs to compute stuff. Codecs, for example. And Google Chrome uses this mechanism with Flash.

Maybe this explains why no crash report was generated: The process that crashed was jailed, so it couldn’t open the crash report file.
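To get an idea of which system calls are involved, the type=1326 lines can be summarized by syscall number. This is a sketch only: seccomp_syscall_counts is a made-up name, and it assumes the log format shown above. Note the compat=1 in those lines, which suggests that the numbers should be looked up in the 32-bit syscall table.

```shell
#!/bin/sh
# Sketch: count seccomp audit entries per syscall number.
# seccomp_syscall_counts is a hypothetical helper name.
# Reads log lines on stdin, prints "<count> syscall=<number>" lines,
# most frequent first.
seccomp_syscall_counts() {
    awk '/type=1326/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^syscall=/)
                     count[$i]++
         }
         END {
             for (k in count)
                 print count[k], k
         }' | sort -rn
}

# Usage:
#   seccomp_syscall_counts < /var/log/messages
```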

To fix this, I invoke Google Chrome with

$ google-chrome --disable-seccomp-filter-sandbox

And no more “Aw snaps”.

As the title implies, this solved my problem on a very certain machine with a very certain setting. There are millions of other reasons.

Posted Under: Linux,Linux kernel,Software
This post was written by eli on August 7, 2015 Comments (2)
