Google Chrome: Stop that nagging on updates

I have Google Chrome installed on a Linux machine at /opt/google as root, so the browser can’t update itself automatically. Instead, it complains with this pop-up every time the browser is started:

Annoying Google Chrome update popup

What I really like about this pop-up is the “you’re missing out” part. I get the same thing from the silly image gallery app on my Google Pixel phone.  This is Google trying to play on my (not so existent) FOMO.

It has been suggested to add the –simulate-outdated-no-au argument to the command line that executes Chrome. This works indeed. The common suggestion is however to do that on the shortcut that executes the browser. But that won’t cover the case when I run the browser from a shell. Something I do, every now and then. Don’t ask.

So a more sledge hammer solution is to edit the wrapper script:

$ which google-chrome
/usr/bin/google-chrome

So edit this file (as root), and change the last line from

exec -a "$0" "$HERE/chrome" "$@"

to

exec -a "$0" "$HERE/chrome" --simulate-outdated-no-au='Tue, 31 Dec 2099' "$@"

What does this mean, then? Well, according to the list of Google Chrome switches, this switch “simulates that current version is outdated and auto-update is off”. The date is referred to in the source’s upgrade_detector_impl.cc. Look there if you want to figure out why this works (I didn’t bother, actually).

Notes on Bluetooth on Linux internals

Introduction

These are notes I made while trying to make my Sony WH-CH510 bluetooth headphones work properly with my Linux Mint 19 machine. It’s evident that an upgrade of the whole OS would have fixed the problem, but I have my opinion on upgrades.

The problem: It takes a long time for the headphones to connect (around 20-30 seconds), and once that happens, the headphones are not automatically chosen by Pulseaudio as the output device. There are hacky solutions to the second part of the problems, but I have this thing about wanting to solve a problem properly.

The twist is that there’s no problem at all with Sony’s WH-CH500 headphones, neither with a pair of junky earbuds, which are labeled Y30.

So after trying quite a few quick fixes, I decided to get to the bottom of the problem. The good news is that I found out the reason for the problem. The bad news is that I didn’t find any direct solution for it.

So my solution was to upgrade the bluetooth daemon, and I did it in a rather exotic way: I basically installed a Linux Mint 21.1 (“Victoria”) pretty much in the same way as explained in this post. Now bluetoothd version 5.64 runs instead of Linux Mint’s 19 version 5.48 of this daemon, and everything works like a charm. But it’s a rather tangled solution if you don’t have this parallel Linux distribution installed for other reasons.

The rest of this post is a collection of things I tried before upgrading the daemon.

Bluetooth is handled by the kernel, which presents an hci device (typically hci0). The heavy lifting of implementing the protocol is done by bluetoothd ( /usr/lib/bluetooth/bluetoothd on my machine, started by the bluetooth systemd service). Try, for example,

$ hciconfig

Sniffing

To capture bluetooth communication, there are two primary option: The bluetooth0 interface in Wireshark or the btmon command line utility. Wireshark is more comprehensive as always, however btmon is actually easier to work with, because all crucial information is concentrated in a text file. The eternal tradeoff between GUI and text. Possibly, use both.

I’m going to show excerpts from btmon dumps below, obtained with e.g.

$ sudo stdbuf -oL btmon | tee btmon.txt

This could have been just “sudo btmon”, but using stdbuf and tee, there’s also data printed out to console in real time (stdbuf removes unnecessary buffering).

I skip a lot of entries, because there are, well, a lot of them. Hopefully I didn’t throw away anything relevant.

With btmon, “>” means incoming to host, and “<” means outgoing from host (the device is at the left side, which is a bit odd).

I’ve never carried out a bluetooth-related project, so all my comments on btmon’s output below are no more than hopefully intelligent guesses. I don’t pretend to be acquainted with any of the related protocols.

The connection

This is WH-CH510′s connection request (the earbud’s request was exactly the same):

> HCI Event: Connect Request (0x04) plen 10                #11 [hci0] 10.725859
        Address: 30:53:C1:11:40:2D (OUI 30-53-C1)
        Class: 0x240404
          Major class: Audio/Video (headset, speaker, stereo, video, vcr)
          Minor class: Wearable Headset Device
          Rendering (Printing, Speaker)
          Audio (Speaker, Microphone, Headset)
        Link type: ACL (0x01)
< HCI Command: Accept Connection R.. (0x01|0x0009) plen 7  #12 [hci0] 10.725914
        Address: 30:53:C1:11:40:2D (OUI 30-53-C1)
        Role: Master (0x00)

And after a few packet exchanges, this appears in the dump output (three times, not clear why):

@ MGMT Event: Device Connected (0x000b) plen 28       {0x0001} [hci0] 11.256871
        BR/EDR Address: 30:53:C1:11:40:2D (OUI 30-53-C1)
        Flags: 0x00000000
        Data length: 15
        Name (complete): WH-CH510
        Class: 0x240404
          Major class: Audio/Video (headset, speaker, stereo, video, vcr)
          Minor class: Wearable Headset Device
          Rendering (Printing, Speaker)
          Audio (Speaker, Microphone, Headset)

Cheap earbud’s service negotiation

Unlike USB, the device is in control: The requests information, and the host responds. The device chooses how to set up the connection, and the host follows suit. And it’s also the device that possibly messes up.

For the Y30 earbuds, the attribute request / response session went as follows.

First, the device checks if the host supports Handsfree Audio Gateway (0x111f), by asking for attributes, but the host doesn’t have any of it:

> ACL Data RX: Handle 17 flags 0x02 dlen 24                #42 [hci0] 14.317596
      Channel: 64 len 20 [PSM 1 mode 0] {chan 0}
      SDP: Service Search Attribute Request (0x06) tid 1 len 15
        Search pattern: [len 6]
          Sequence (6) with 3 bytes [16 extra bits] len 6
            UUID (3) with 2 bytes [0 extra bits] len 3
              Handsfree Audio Gateway (0x111f)
        Max record count: 512
        Attribute list: [len 6]
          Sequence (6) with 3 bytes [16 extra bits] len 6
            Unsigned Integer (1) with 2 bytes [0 extra bits] len 3
              0x0004
        Continuation state: 0
< ACL Data TX: Handle 17 flags 0x00 dlen 14                #43 [hci0] 14.317758
      Channel: 64 len 10 [PSM 1 mode 0] {chan 0}
      SDP: Service Search Attribute Response (0x07) tid 1 len 5
        Attribute bytes: 2
        Continuation state: 0

So the device tries AVDTP (0x0019) instead:

> ACL Data RX: Handle 17 flags 0x02 dlen 24                #45 [hci0] 14.322578
      Channel: 64 len 20 [PSM 1 mode 0] {chan 0}
      SDP: Service Search Attribute Request (0x06) tid 2 len 15
        Search pattern: [len 6]
          Sequence (6) with 3 bytes [16 extra bits] len 6
            UUID (3) with 2 bytes [0 extra bits] len 3
              AVDTP (0x0019)
        Max record count: 512
        Attribute list: [len 6]
          Sequence (6) with 3 bytes [16 extra bits] len 6
            Unsigned Integer (1) with 2 bytes [0 extra bits] len 3
              0x0004
        Continuation state: 0
< ACL Data TX: Handle 17 flags 0x00 dlen 60                #46 [hci0] 14.322737
      Channel: 64 len 56 [PSM 1 mode 0] {chan 0}
      SDP: Service Search Attribute Response (0x07) tid 2 len 51
        Attribute bytes: 48
          Attribute list: [len 21] {position 0}
            Attribute: Protocol Descriptor List (0x0004) [len 2]
              Sequence (6) with 6 bytes [8 extra bits] len 8
                UUID (3) with 2 bytes [0 extra bits] len 3
                  L2CAP (0x0100)
                Unsigned Integer (1) with 2 bytes [0 extra bits] len 3
                  0x0019
              Sequence (6) with 6 bytes [8 extra bits] len 8
                UUID (3) with 2 bytes [0 extra bits] len 3
                  AVDTP (0x0019)
                Unsigned Integer (1) with 2 bytes [0 extra bits] len 3
                  0x0103
          Attribute list: [len 21] {position 1}
            Attribute: Protocol Descriptor List (0x0004) [len 2]
              Sequence (6) with 6 bytes [8 extra bits] len 8
                UUID (3) with 2 bytes [0 extra bits] len 3
                  L2CAP (0x0100)
                Unsigned Integer (1) with 2 bytes [0 extra bits] len 3
                  0x0019
              Sequence (6) with 6 bytes [8 extra bits] len 8
                UUID (3) with 2 bytes [0 extra bits] len 3
                  AVDTP (0x0019)
                Unsigned Integer (1) with 2 bytes [0 extra bits] len 3
                  0x0103
        Continuation state: 0

And yes, the host supports it.

AVDTP is (according to Wikipedia) used by the advanced audio distribution (A2DP) profile to stream music to stereo headsets over an L2CAP channel intended for video distribution profile in the Bluetooth transmission.

So it goes on with connecting channel 65 to service number 0x0019 (PSM stands for Protocol Service Multiplexor).

> ACL Data RX: Handle 17 flags 0x02 dlen 12                #51 [hci0] 14.358888
      L2CAP: Connection Request (0x02) ident 3 len 4
        PSM: 25 (0x0019)
        Source CID: 65
< ACL Data TX: Handle 17 flags 0x00 dlen 16                #52 [hci0] 14.358945
      L2CAP: Connection Response (0x03) ident 3 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection pending (0x0001)
        Status: Authorization pending (0x0002)
< ACL Data TX: Handle 17 flags 0x00 dlen 16                #53 [hci0] 14.359202
      L2CAP: Connection Response (0x03) ident 3 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 17 flags 0x00 dlen 12                #54 [hci0] 14.359220
      L2CAP: Configure Request (0x04) ident 3 len 4
        Destination CID: 65
        Flags: 0x0000
AVDTP (0x0019)> HCI Event: Number of Completed Packets (0x13) plen 5     #55 [hci0] 14.361709
        Num handles: 1
        Handle: 17
        Count: 1

This is followed by a discovery phase:

@ MGMT Command: Start Discovery (0x0023) plen 1       {0x0001} [hci0] 14.362330
        Address type: 0x07
          BR/EDR
          LE Public
          LE Random

In the packet exchange that follows, the device obtains the capabilities of the host, and the audio connection is set up.

Sony WH-CH510′s service negotiation

The WH-CH510 takes another approach: It issues a Service Search Request for the Headset AG (0x1112). AG means Audio Gateway.

> ACL Data RX: Handle 20 flags 0x02 dlen 17                #48 [hci0] 11.736388
      Channel: 64 len 13 [PSM 1 mode 0] {chan 0}
      SDP: Service Search Request (0x02) tid 1 len 8
        Search pattern: [len 5]
          Sequence (6) with 3 bytes [8 extra bits] len 5
            UUID (3) with 2 bytes [0 extra bits] len 3
              Headset AG (0x1112)
        Max record count: 68
        Continuation state: 0
< ACL Data TX: Handle 20 flags 0x00 dlen 18                #49 [hci0] 11.736611
      Channel: 64 len 14 [PSM 1 mode 0] {chan 0}
      SDP: Service Search Response (0x03) tid 1 len 9
        Total record count: 1
        Current record count: 1
        Record handle: 0x1000c
        Continuation state: 0

And yes, it’s supported and given the handle 0x1000c. Using this handle, the device asks for this service’s attributes:

> ACL Data RX: Handle 20 flags 0x02 dlen 23                #51 [hci0] 11.741392
      Channel: 64 len 19 [PSM 1 mode 0] {chan 0}
      SDP: Service Attribute Request (0x04) tid 2 len 14
        Record handle: 0x1000c
        Max attribute bytes: 277
        Attribute list: [len 7]
          Sequence (6) with 5 bytes [8 extra bits] len 7
            Unsigned Integer (1) with 4 bytes [0 extra bits] len 5
              0x0000ffff
        Continuation state: 0
< ACL Data TX: Handle 20 flags 0x00 dlen 97                #52 [hci0] 11.741610
      Channel: 64 len 93 [PSM 1 mode 0] {chan 0}
      SDP: Service Attribute Response (0x05) tid 2 len 88
        Attribute bytes: 85
          Attribute list: [len 83] {position 0}
            Attribute: Service Record Handle (0x0000) [len 2]
              0x0001000c
            Attribute: Service Class ID List (0x0001) [len 2]
              UUID (3) with 2 bytes [0 extra bits] len 3
                Headset AG (0x1112)
              UUID (3) with 2 bytes [0 extra bits] len 3
                Generic Audio (0x1203)
            Attribute: Protocol Descriptor List (0x0004) [len 2]
              Sequence (6) with 3 bytes [8 extra bits] len 5
                UUID (3) with 2 bytes [0 extra bits] len 3
                  L2CAP (0x0100)
              Sequence (6) with 5 bytes [8 extra bits] len 7
                UUID (3) with 2 bytes [0 extra bits] len 3
                  RFCOMM (0x0003)
                Unsigned Integer (1) with 1 byte [0 extra bits] len 2
                  0x0c
            Attribute: Browse Group List (0x0005) [len 2]
              UUID (3) with 2 bytes [0 extra bits] len 3
                Public Browse Root (0x1002)
            Attribute: Bluetooth Profile Descriptor List (0x0009) [len 2]
              Sequence (6) with 6 bytes [8 extra bits] len 8
                UUID (3) with 2 bytes [0 extra bits] len 3
                  Headset (0x1108)
                Unsigned Integer (1) with 2 bytes [0 extra bits] len 3
                  0x0102
            Attribute: Unknown (0x0100) [len 2]
              Headset Voice gateway [len 21]
        Continuation state: 0

The device chooses to connect channel 65 to the RFCOMM service:

> ACL Data RX: Handle 20 flags 0x02 dlen 12                #62 [hci0] 12.337691
      L2CAP: Connection Request (0x02) ident 6 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 20 flags 0x00 dlen 16                #63 [hci0] 12.337747
      L2CAP: Connection Response (0x03) ident 6 len 8
        Destination CID: 64
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)

After some configuration packets, the discovery begins:

@ MGMT Event: Discovering (0x0013) plen 2             {0x0001} [hci0] 14.686897
        Address type: 0x07
          BR/EDR
          LE Public
          LE Random
        Discovery: Disabled (0x00)
> ACL Data RX: Handle 20 flags 0x02 dlen 18                #91 [hci0] 15.848944
      Channel: 64 len 14 [PSM 3 mode 0] {chan 0}
      RFCOMM: Unnumbered Info with Header Check (UIH) (0xef)
         Address: 0x63 cr 1 dlci 0x18
         Control: 0xef poll/final 0
         Length: 10
         FCS: 0x0e
        41 54 2b 43 49 4e 44 3d 3f 0d 0e                 AT+CIND=?..
> ACL Data RX: Handle 20 flags 0x02 dlen 17                #92 [hci0] 18.849015
      Channel: 64 len 13 [PSM 3 mode 0] {chan 0}
      RFCOMM: Unnumbered Info with Header Check (UIH) (0xef)
         Address: 0x63 cr 1 dlci 0x18
         Control: 0xef poll/final 0
         Length: 9
         FCS: 0x0e
        41 54 2b 43 49 4e 44 3f 0d 0e                    AT+CIND?..

This discovery isn’t all that successful: The device sends AT commands, but the host doesn’t respond to them. 4 seconds of futile attempts.

After some useless back and forth, the device tries again with the same Service Search Request, to which it receives the same answer, and so it goes on.

> ACL Data RX: Handle 20 flags 0x02 dlen 17               #108 [hci0] 20.717682
      Channel: 65 len 13 [PSM 1 mode 0] {chan 1}
      SDP: Service Search Request (0x02) tid 3 len 8
        Search pattern: [len 5]
          Sequence (6) with 3 bytes [8 extra bits] len 5
            UUID (3) with 2 bytes [0 extra bits] len 3
              Headset AG (0x1112)
        Max record count: 68
        Continuation state: 0

[ ... ]

The device disconnects the futile channel 65:

> ACL Data RX: Handle 20 flags 0x02 dlen 12               #114 [hci0] 20.818859
      L2CAP: Disconnection Request (0x06) ident 10 len 4
        Destination CID: 65
        Source CID: 64
< ACL Data TX: Handle 20 flags 0x00 dlen 12               #115 [hci0] 20.818906
      L2CAP: Disconnection Response (0x07) ident 10 len 4
        Destination CID: 65
        Source CID: 64

And then it tries another few AT commands on this same channel, despite having disconnected it. It didn’t reconnect it, so probably disconnection doesn’t mean what I think it does…?

> ACL Data RX: Handle 20 flags 0x02 dlen 24               #118 [hci0] 22.225305
      Channel: 64 len 20 [PSM 3 mode 0] {chan 0}
      RFCOMM: Unnumbered Info with Header Check (UIH) (0xef)
         Address: 0x63 cr 1 dlci 0x18
         Control: 0xef poll/final 0
         Length: 16
         FCS: 0x0e
        41 54 2b 43 4d 45 52 3d 33 2c 30 2c 30 2c 31 0d  AT+CMER=3,0,0,1.
        0e                                               .
> ACL Data RX: Handle 20 flags 0x02 dlen 18               #119 [hci0] 25.285306
      Channel: 64 len 14 [PSM 3 mode 0] {chan 0}
      RFCOMM: Unnumbered Info with Header Check (UIH) (0xef)
         Address: 0x63 cr 1 dlci 0x18
         Control: 0xef poll/final 0
         Length: 10
         FCS: 0x0e
        41 54 2b 43 43 57 41 3d 31 0d 0e                 AT+CCWA=1..

Needless to say, this was futile as well.

And then, out of the blue, the device requests to connect to AVDTP by choosing the service at address 0x0019.

> ACL Data RX: Handle 20 flags 0x02 dlen 12               #125 [hci0] 29.875346
      L2CAP: Connection Request (0x02) ident 11 len 4
        PSM: 25 (0x0019)
        Source CID: 64
< ACL Data TX: Handle 20 flags 0x00 dlen 16               #126 [hci0] 29.875431
      L2CAP: Connection Response (0x03) ident 11 len 8
        Destination CID: 65
        Source CID: 64
        Result: Connection pending (0x0001)
        Status: Authorization pending (0x0002)
< ACL Data TX: Handle 20 flags 0x00 dlen 16               #128 [hci0] 29.875677
      L2CAP: Connection Response (0x03) ident 11 len 8
        Destination CID: 65
        Source CID: 64
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)

It’s not clear where the device got the idea to do this: As far as I can tell, the device wasn’t informed by the host about the existence of this service, at least not directly.

From this point, the negotiation goes well, and an audio device is set up. This allows selecting the headphones in the Sound Settings. Something that is done automatically with the old headphones and the junky earbuds.

So what is the problem? Maybe that the host doesn’t answer the AT commands. This is maybe solved in later versions of bluetoothd or Linux distributions in general. It’s quite possible that my distribution is too old for these newer headphones. Upgrade or perish.

Or go the other way: Downgrade. I suppose the solution would be to disable the Headset AG profile (0x1112), so that bluetoothd refuses to play ball when this is requested. This would speed up the fallback to A2DP, I hope, and possibly solve the problem.

I’ve tried hard to find a way to make bluetoothd refuse to the Headset AG profile, but in vain (so far?). sdptool has the option to disable a service, however it’s deprecated, and bluez 5 won’t talk with it. The updated method to tickle bluetoothd is through Dbus. Not sure if it has an API for turning off a service.

Unfortunately, I have no idea how to do this except for compiling bluetoothd from its sources and remove that option altogether. Actually, changing the UUID is enough to make it unusable.

I tried that, but it didn’t work all that well. More on that below.

Now to some extra stuff I randomly found out while working on this.

bluetootctl

This utility talks with bluetoothd through Dbus.

Doesn’t require root (when run from a terminal window on the computer’s desktop):

$ bluetoothctl
[NEW] Controller 9C:BB:CC:DD:EE:FF compname [default]
[NEW] Device 78:2E:D4:D9:62:C1 Y30
[NEW] Device 00:18:09:76:27:29 WH-CH500
[NEW] Device 30:53:C1:11:40:2D WH-CH510
Agent registered
[bluetooth]# show
Controller 9C:BB:CC:DD:EE:FF (public)
	Name: compname
	Alias: compname
	Class: 0x001c0104
	Powered: yes
	Discoverable: yes
	Pairable: yes
	UUID: Headset AG                (00001112-0000-1000-8000-00805f9b34fb)
	UUID: Generic Attribute Profile (00001801-0000-1000-8000-00805f9b34fb)
	UUID: A/V Remote Control        (0000110e-0000-1000-8000-00805f9b34fb)
	UUID: OBEX File Transfer        (00001106-0000-1000-8000-00805f9b34fb)
	UUID: Generic Access Profile    (00001800-0000-1000-8000-00805f9b34fb)
	UUID: OBEX Object Push          (00001105-0000-1000-8000-00805f9b34fb)
	UUID: PnP Information           (00001200-0000-1000-8000-00805f9b34fb)
	UUID: IrMC Sync                 (00001104-0000-1000-8000-00805f9b34fb)
	UUID: A/V Remote Control Target (0000110c-0000-1000-8000-00805f9b34fb)
	UUID: Audio Source              (0000110a-0000-1000-8000-00805f9b34fb)
	UUID: Audio Sink                (0000110b-0000-1000-8000-00805f9b34fb)
	UUID: Vendor specific           (00005005-0000-1000-8000-0002ee000001)
	UUID: Message Notification Se.. (00001133-0000-1000-8000-00805f9b34fb)
	UUID: Phonebook Access Server   (0000112f-0000-1000-8000-00805f9b34fb)
	UUID: Message Access Server     (00001132-0000-1000-8000-00805f9b34fb)
	UUID: Headset                   (00001108-0000-1000-8000-00805f9b34fb)
	Modalias: usb:v1D6Bp0246d0530
	Discovering: no

This utility spits out a lot of information by itself when the daemon is restarted with e.g. “systemctl restart bluetooth”. There’s also output when a device is connected and disconnected.

An interesting feature of bluetootctl is the submenus, in particular “gatt” and “advertise”. Maybe the former allows deregistering UUIDs.

Try

[bluetooth]# menu gatt

and when done, go back to original menu:

[bluetooth]# back

btmgmt

btmgmt talks with the kernel directly through an AF_BLUETOOTH raw network socket. I considered this tool because it has the rm-uuid command, which is supposed to remove a UUID.

$ sudo btmgmt
[mgmt]# rm-uuid 00001112-0000-1000-8000-00805f9b34fb
Remove UUID succeeded. Class 0x1c0104

This is reverted when the bluetooth service is restarted. But it doesn’t seem to have any effect on the interface anyhow. The UUID keeps appearing in bluetoothctl’s “show” and the service is advertised and used. “clr-uuids” apparently removes all UUIDs, but this has no real effect.

It seems like the effective UUIDs are kept in bluetoothd. btmgmt changes the UUIDs in the kernel. See “Remove UUID Command” in mgmt-api.txt.

btmgmt also gets very active when bluetoothd is restarted and other events occur.

Talking with bluetoothd directly through DBus

See my notes on DBus on this post. Getting a property:

$ dbus-send --system --dest=org.bluez --print-reply /org/bluez/hci0 org.freedesktop.DBus.Properties.Get string:org.bluez.Adapter1 string:Address
method return time=1685778877.065819 sender=:1.4 -> destination=:1.4934 serial=82 reply_serial=2
   variant       string "9C:BB:CC:DD:EE:FF"

Let’s break this down. Bluetoothd’ Dbus API is published in its source’s doc/ subdirectory. The Address property of the adapter is documented in adapter-api.txt. So:

  • –print-reply: We want the reply to this query, of course.
  • –system: Bluetoothd is connected to the system DBus.
  • –dest=org.bluez: Taken from “Service” in adapter-api.txt.
  • /org/bluez/hci0: Taken from “Object path” in adapter-api.txt.
  • org.freedesktop.DBus.Properties.Get: Every dbus-send command needs a method. In this case fetching a property’s value. So use the generic method for fetching a property.
  • string:org.bluez.Adapter1: Taken from “Interface” in adapter-api.txt.
  • string:Address: This is the name of the property, as listed in adapter-api.txt.

Likewise, I can fetch the UUIDs:

$ dbus-send --print-reply --system --dest=org.bluez /org/bluez/hci0 org.freedesktop.DBus.Properties.Get string:org.bluez.Adapter1 string:UUIDs
method return time=1685779625.693856 sender=:1.4 -> destination=:1.5017 serial=89 reply_serial=2
   variant       array [
         string "00001112-0000-1000-8000-00805f9b34fb"
         string "00001801-0000-1000-8000-00805f9b34fb"
         string "0000110e-0000-1000-8000-00805f9b34fb"
         string "00001106-0000-1000-8000-00805f9b34fb"
         string "00001800-0000-1000-8000-00805f9b34fb"
         string "00001105-0000-1000-8000-00805f9b34fb"
         string "00001200-0000-1000-8000-00805f9b34fb"
         string "0000110c-0000-1000-8000-00805f9b34fb"
         string "00001104-0000-1000-8000-00805f9b34fb"
         string "0000110a-0000-1000-8000-00805f9b34fb"
         string "0000110b-0000-1000-8000-00805f9b34fb"
         string "00005005-0000-1000-8000-0002ee000001"
         string "00001133-0000-1000-8000-00805f9b34fb"
         string "0000112f-0000-1000-8000-00805f9b34fb"
         string "00001132-0000-1000-8000-00805f9b34fb"
         string "00001108-0000-1000-8000-00805f9b34fb"
      ]

So this is how bluetoothctl got these values. Unfortunately, this property is read-only according to adapter-api.txt, so it can’t be manipulated with a Set method.

It’s of course possible to run methods that are published in bluetoothd’s DBus API, but I didn’t find anything related to disabling services.

Hacking the source

Download the sources for bluez 5.48-0ubuntu3.1 as bluez_5.48.orig.tar.xz (which is the version running on Mint 19).

In lib/sdp.h, change 0x1112 to 0xeb12. Same in lib/uuid.h and in src/profile.c.

Then in the source’s root directory, go:

$ ./configure && echo Success

and then just

$ make && echo Success

On my machine, there was a need to install libical, to make configure work, i.e.

$ sudo apt-get install libical-dev

And then replace /usr/lib/bluetooth/bluetoothd with the compiled version in src/. Keep a copy of the old executable, of course.

That didn’t work at all. The old UUID kept appearing in bluetoothctl’s output for “show”. There was a change, however: The WH-CH510 headphones refused to connect to the host, and reverted to pairing. At least I did something, I thought. But as it turned out, these headphones refused to connect to the host even after going back to the original bluetooth daemon (but had no problem with my cellphone). Y30 had no problems, as usual.

Resetting the headphone by pressing the power button and “-” button for 7 seconds didn’t help either. What eventually did the trick was to remove /usr/lib/bluetooth/bluetoothd, go “systemctl restart bluetooth”, which failed of course. Then return bluetoothd to its place, which worked, as expected. And then everything was back to normal again.

Extra reading

  • A Debian wiki page with several interesting examples (in particular how to use dbus-send to talk with bluetoothd)
  • This page contains a lot of information about using command line
  • The list of tools that come with the bluez package
  • Some information about configuration files.

Random jots

  • The bluetooth service is defined in /lib/systemd/system/bluetooth.service, which runs /usr/lib/bluetooth/bluetoothd.
  • It’s also possible to run the daemon directly, by typing “/usr/lib/bluetooth/bluetoothd” at shell prompt. The daemon runs in the foreground in this case. Add the “-n” flag to get the logging information on the console as well.
  • bluetoothd stores persistent information (the list and information about already discovered devices in particular) in /var/lib/bluetooth. This directory can be removed

It’s possible to stop the Bluetooth daemon with

# systemctl stop bluetooth

however clicking on desktop’s bluetooth applet restarts it automatically, unless it is running by other means (e.g. started manually with a command).

The “bluetoothctl” command won’t do this, but rather wait for a connection. It’s actually OK to turn off the bluetooth deamon while this utility is running.

After upgrading bluetoothd as mentioned above, I noticed a lot of log messages of this sort every 16 seconds or so:

kernel: Bluetooth: hci0: last event is not cmd complete (0x0f)

But looking at old logs, it’s evident that these existed in the past as well. So they’re not related to my manipulations with the Bluetooth daemon. In fact, these messages appear while the Cinnamon Bluetooth applet’s utility window is displayed. Close this window, and the messages stop. This is a kernel bug that was fixed in kernel v5.0.

Linux Mint + Cinnamon: Volume buttons stop to work suddenly

The nuisance

After trying to fix another problem with Pulseaudio, and messing up a bit with its configuration files, the option to control the volume by virtue of keyboard buttons suddenly stopped to work. The cursor would blink briefly when pressing the button, but nothing else happened.

I tried all the fairly normal things, including to restart the Pulseaudio daemon, to restart Cinnamon (with Alt-F2 + “r”), restart the session by changing the user (to the same user), disconnect and reconnect the (USB) keyboard, and a whole lot of other things.

It took me quite some time to fix this, so here’s what worked for me this time, along with some debug info.

The machine was a Linux Mint 19 with Cinnamon.

At an earlier stage, even the sound applet was gone from its desktop panel. To fix that, I right-clicked the panel, picked Troubleshoot > Looking Glass, and picked the Extension tab. There’s an applet listed called “Sound”. So I right-clicked it, and chose “Reload Code”. That got the applet back to the panel, and it worked flawlessly. But then I was stuck with the volume button problem.

The fix

What I actually did:

  • Kill the process named csd-keyboard (probably irrelevant)
  • Kill the process named csd-media-keys
  • Restart pulseaudio with “pulseaudio -k”

This was the output of journalctl as a result of these:

May 31 12:17:03 cinnamon-session[2115]: WARNING: t+1132491.55749s: Application 'cinnamon-settings-daemon-keyboard.desktop' killed by signal 15
May 31 12:17:17 cinnamon-session[2115]: WARNING: t+1132505.78238s: Application 'cinnamon-settings-daemon-media-keys.desktop' killed by signal 15
May 31 12:17:17 rtkit-daemon[2284]: Successfully made thread 4304 of process 4304 (n/a) owned by '1010' high priority at nice level -11.
May 31 12:17:17 rtkit-daemon[2284]: Supervising 8 threads of 4 processes of 1 users.
May 31 12:17:17 pulseaudio[4304]: [pulseaudio] pid.c: Daemon already running.
May 31 12:18:04 bluetoothd[12009]: Endpoint unregistered: sender=:1.120210 path=/MediaEndpoint/A2DPSource
May 31 12:18:04 bluetoothd[12009]: Endpoint unregistered: sender=:1.120210 path=/MediaEndpoint/A2DPSink
May 31 12:18:04 rtkit-daemon[2284]: Successfully made thread 4345 of process 4345 (n/a) owned by '1010' high priority at nice level -11.
May 31 12:18:04 rtkit-daemon[2284]: Supervising 4 threads of 4 processes of 1 users.
May 31 12:18:04 pulseaudio[4345]: [pulseaudio] sink.c: Default and alternate sample rates are the same.
May 31 12:18:04 rtkit-daemon[2284]: Supervising 3 threads of 3 processes of 1 users.
May 31 12:18:04 rtkit-daemon[2284]: Successfully made thread 4347 of process 4345 (n/a) owned by '1010' RT at priority 5.
May 31 12:18:04 rtkit-daemon[2284]: Supervising 4 threads of 3 processes of 1 users.
May 31 12:18:04 rtkit-daemon[2284]: Supervising 4 threads of 3 processes of 1 users.
May 31 12:18:04 rtkit-daemon[2284]: Successfully made thread 4348 of process 4345 (n/a) owned by '1010' RT at priority 5.
May 31 12:18:04 rtkit-daemon[2284]: Supervising 5 threads of 3 processes of 1 users.
May 31 12:18:04 rtkit-daemon[2284]: Supervising 5 threads of 3 processes of 1 users.
May 31 12:18:04 rtkit-daemon[2284]: Successfully made thread 4349 of process 4345 (n/a) owned by '1010' RT at priority 5.
May 31 12:18:04 rtkit-daemon[2284]: Supervising 6 threads of 3 processes of 1 users.
May 31 12:18:05 rtkit-daemon[2284]: Supervising 6 threads of 3 processes of 1 users.
May 31 12:18:05 rtkit-daemon[2284]: Successfully made thread 4350 of process 4345 (n/a) owned by '1010' RT at priority 5.
May 31 12:18:05 rtkit-daemon[2284]: Supervising 7 threads of 3 processes of 1 users.
May 31 12:18:05 bluetoothd[12009]: Endpoint registered: sender=:1.120482 path=/MediaEndpoint/A2DPSource
May 31 12:18:05 bluetoothd[12009]: Endpoint registered: sender=:1.120482 path=/MediaEndpoint/A2DPSink
May 31 12:18:05 pulseaudio[4345]: [pulseaudio] backend-ofono.c: Failed to register as a handsfree audio agent with ofono: org.freedesktop.DBus.Error.ServiceUnknown: The name org.ofono was not provided by any .service files
May 31 12:18:05 rtkit-daemon[2284]: Successfully made thread 4352 of process 4352 (n/a) owned by '1010' high priority at nice level -11.
May 31 12:18:05 rtkit-daemon[2284]: Supervising 8 threads of 4 processes of 1 users.
May 31 12:18:05 pulseaudio[4352]: [pulseaudio] pid.c: Daemon already running.
May 31 12:18:58 gnome-keyring-daemon[2204]: asked to register item /org/freedesktop/secrets/collection/login/9, but it's already registered

I suppose that killing csd-media-keys was the part that really did the trick, because that caused a request to start Pulseaudio.

Why it worked (presumably)

The cinnamon-settings-daemon is the component responsible for passing the keyboard request to Pulseaudio. Apparently, messing up with Pulseaudio caused one of its components, csd-media-keys, to mess up as well. Killing it caused cinnamon-settings-daemon to restart it automatically. From fresh.

It’s fine to kill the csd-* processes. Nothing dramatic happens. The killed process is just restarted. But I haven’t tried to kill cinnamon-settings-daemon itself. I believe it will restart Cinnamon completely, something I wanted to avoid.

Dbus activity

I’ve already written a post mentioning how to monitor the Dbus. So here’s a different take on the same topic.

The difference is quite evident when running dbus-monitor (not as root) as follows in order to monitor the activity for the current session:

$ dbus-monitor

This is the output as a result of pressing a volume up button, when it works correctly (i.e. after fixing the problem):

method call time=1685525392.173838 sender=:1.6531 -> destination=:1.6616 serial=1431 path=/org/cinnamon/SettingsDaemon/KeybindingHandler; interface=org.cinnamon.SettingsDaemon.KeybindingHandler; member=HandleKeybinding
   uint32 2
method call time=1685525392.174967 sender=:1.6616 -> destination=org.freedesktop.DBus serial=566 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=AddMatch
   string "type='signal',interface='ca.desrt.dconf.Writer',path='/ca/desrt/dconf/Writer/user',arg0path='/org/cinnamon/desktop/sound/'"
method return time=1685525392.175012 sender=org.freedesktop.DBus -> destination=:1.6616 serial=382 reply_serial=566
method call time=1685525392.175026 sender=:1.6616 -> destination=:1.6531 serial=567 path=/org/Cinnamon; interface=org.Cinnamon; member=ShowOSD
   array [
      dict entry(
         string "icon"
         variant             string "audio-volume-medium-symbolic"
      )
      dict entry(
         string "level"
         variant             int32 34
      )
   ]
method call time=1685525392.175170 sender=:1.6616 -> destination=org.freedesktop.DBus serial=568 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=RemoveMatch
   string "type='signal',interface='ca.desrt.dconf.Writer',path='/ca/desrt/dconf/Writer/user',arg0path='/org/cinnamon/desktop/sound/'"
method return time=1685525392.175189 sender=org.freedesktop.DBus -> destination=:1.6616 serial=383 reply_serial=568
method call time=1685525392.175198 sender=:1.6616 -> destination=org.freedesktop.DBus serial=569 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=AddMatch
   string "type='signal',interface='ca.desrt.dconf.Writer',path='/ca/desrt/dconf/Writer/user',arg0path='/org/cinnamon/desktop/sound/'"
method return time=1685525392.175216 sender=org.freedesktop.DBus -> destination=:1.6616 serial=384 reply_serial=569
method call time=1685525392.176330 sender=:1.6618 -> destination=org.freedesktop.DBus serial=26 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=RequestName
   string "org.freedesktop.ReserveDevice1.Audio2"
   uint32 5
signal time=1685525392.176383 sender=org.freedesktop.DBus -> destination=(null destination) serial=50 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=NameOwnerChanged
   string "org.freedesktop.ReserveDevice1.Audio2"
   string ""
   string ":1.6618"
signal time=1685525392.176411 sender=org.freedesktop.DBus -> destination=:1.6618 serial=51 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=NameAcquired
   string "org.freedesktop.ReserveDevice1.Audio2"
method return time=1685525392.176427 sender=org.freedesktop.DBus -> destination=:1.6618 serial=52 reply_serial=26
   uint32 1
method return time=1685525392.184853 sender=:1.6616 -> destination=:1.6531 serial=570 reply_serial=1431
method call time=1685525392.184915 sender=:1.6616 -> destination=org.freedesktop.DBus serial=571 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=RemoveMatch
   string "type='signal',interface='ca.desrt.dconf.Writer',path='/ca/desrt/dconf/Writer/user',arg0path='/org/cinnamon/desktop/sound/'"
method return time=1685525392.184937 sender=org.freedesktop.DBus -> destination=:1.6616 serial=385 reply_serial=571
method return time=1685525392.190757 sender=:1.6531 -> destination=:1.6616 serial=1432 reply_serial=567

Unfortunately, I didn’t save the output before the fix. But it was exactly two rows of output. As far as I recall, the first and last row from this dump were present when it didn’t work (i.e. the row involving KeybindingHandler’s request, and the return method that closed it).

The bus addresses can be listed with the following command:

$ busctl --user

In this specific session the output was:

NAME                                                       PID PROCESS         USER             CONNECTION    UNIT                      SESSION    DESCRIPTION
:1.0                                                      2103 systemd         eli              :1.0          user@1010.service         -          -
:1.1014                                                  45251 pxgsettings     eli              :1.1014       session-c1.scope          c1         -
:1.1015                                                  45277 scp-dbus-servic eli              :1.1015       user@1010.service         -          -
:1.1016                                                  45277 scp-dbus-servic eli              :1.1016       user@1010.service         -          -
:1.12                                                     2213 csd-a11y-keyboa eli              :1.12         session-c1.scope          c1         -
:1.1234                                                  53331 nemo            eli              :1.1234       session-c1.scope          c1         -
:1.13                                                     2221 csd-a11y-settin eli              :1.13         session-c1.scope          c1         -
:1.131                                                   16564 klauncher       eli              :1.131        session-c1.scope          c1         -
:1.16                                                     2232 csd-cursor      eli              :1.16         session-c1.scope          c1         -
:1.17                                                     2225 csd-color       eli              :1.17         session-c1.scope          c1         -
:1.18                                                     2238 csd-clipboard   eli              :1.18         session-c1.scope          c1         -
:1.19                                                     2242 csd-orientation eli              :1.19         session-c1.scope          c1         -
:1.20                                                     2233 csd-wacom       eli              :1.20         session-c1.scope          c1         -
:1.21                                                     2245 csd-xsettings   eli              :1.21         session-c1.scope          c1         -
:1.22                                                     2248 csd-mouse       eli              :1.22         session-c1.scope          c1         -
:1.24                                                     2256 csd-print-notif eli              :1.24         session-c1.scope          c1         -
:1.26                                                     2258 csd-background  eli              :1.26         session-c1.scope          c1         -
:1.27                                                     2260 gvfsd           eli              :1.27         user@1010.service         -          -
:1.28                                                     2267 csd-housekeepin eli              :1.28         session-c1.scope          c1         -
:1.29                                                     2274 csd-screensaver eli              :1.29         session-c1.scope          c1         -
:1.30                                                     2270 csd-power       eli              :1.30         session-c1.scope          c1         -
:1.31                                                     2299 dconf-service   eli              :1.31         user@1010.service         -          -
:1.32                                                     2285 csd-automount   eli              :1.32         session-c1.scope          c1         -
:1.329                                                   22590 xreader         eli              :1.329        session-c1.scope          c1         -
:1.33                                                     2306 gvfsd-fuse      eli              :1.33         user@1010.service         -          -
:1.330                                                   22596 xreaderd        eli              :1.330        user@1010.service         -          -
:1.332                                                   22606 WebKitNetworkPr eli              :1.332        session-c1.scope          c1         -
:1.35                                                     2313 csd-printer     eli              :1.35         session-c1.scope          c1         -
:1.37                                                     2349 gvfs-udisks2-vo eli              :1.37         user@1010.service         -          -
:1.38                                                     2384 gvfs-gphoto2-vo eli              :1.38         user@1010.service         -          -
:1.39                                                     2277 csd-xrandr      eli              :1.39         session-c1.scope          c1         -
:1.40                                                     2410 gvfs-goa-volume eli              :1.40         user@1010.service         -          -
:1.41                                                     2414 goa-daemon      eli              :1.41         user@1010.service         -          -
:1.42                                                     2427 goa-identity-se eli              :1.42         user@1010.service         -          -
:1.43                                                     2432 gvfs-mtp-volume eli              :1.43         user@1010.service         -          -
:1.44                                                     2436 gvfs-afc-volume eli              :1.44         user@1010.service         -          -
:1.448                                                   31116 gvfsd-http      eli              :1.448        user@1010.service         -          -
:1.457                                                   31367 gvfsd-network   eli              :1.457        user@1010.service         -          -
:1.46                                                     2472 polkit-gnome-au eli              :1.46         session-c1.scope          c1         -
:1.463                                                   31401 gvfsd-dnssd     eli              :1.463        user@1010.service         -          -
:1.47                                                     2481 cinnamon-killer eli              :1.47         session-c1.scope          c1         -
:1.48                                                     2478 nemo-desktop    eli              :1.48         session-c1.scope          c1         -
:1.49                                                     2480 nm-applet       eli              :1.49         session-c1.scope          c1         -
:1.5                                                      2115 cinnamon-sessio eli              :1.5          session-c1.scope          c1         -
:1.50                                                     2474 blueberry-obex- eli              :1.50         session-c1.scope          c1         -
:1.51                                                     2474 blueberry-obex- eli              :1.51         session-c1.scope          c1         -
:1.52                                                     2499 obexd           eli              :1.52         user@1010.service         -          -
:1.54                                                     2506 gvfsd-trash     eli              :1.54         user@1010.service         -          -
:1.55                                                     2512 gvfsd-metadata  eli              :1.55         user@1010.service         -          -
:1.56                                                     2535 python2         eli              :1.56         session-c1.scope          c1         -
:1.5653                                                  62813 chrome          eli              :1.5653       session-c1.scope          c1         -
:1.5654                                                  62813 chrome          eli              :1.5654       session-c1.scope          c1         -
:1.5655                                                  62813 chrome          eli              :1.5655       session-c1.scope          c1         -
:1.5656                                                  62813 chrome          eli              :1.5656       session-c1.scope          c1         -
:1.5661                                                  62813 chrome          eli              :1.5661       session-c1.scope          c1         -
:1.57                                                     2534 cinnamon-screen eli              :1.57         session-c1.scope          c1         -
:1.58                                                     2574 gnome-terminal  eli              :1.58         session-c1.scope          c1         -
:1.5834                                                  52878 xreader         eli              :1.5834       session-c1.scope          c1         -
:1.5836                                                  52891 WebKitNetworkPr eli              :1.5836       session-c1.scope          c1         -
:1.5842                                                  54509 xreader         eli              :1.5842       session-c1.scope          c1         -
:1.5843                                                  54522 WebKitNetworkPr eli              :1.5843       session-c1.scope          c1         -
:1.59                                                     2574 gnome-terminal  eli              :1.59         session-c1.scope          c1         -
:1.6                                                      2115 cinnamon-sessio eli              :1.6          session-c1.scope          c1         -
:1.60                                                     2579 gconfd-2        eli              :1.60         user@1010.service         -          -
:1.61                                                     2574 gnome-terminal  eli              :1.61         session-c1.scope          c1         -
:1.6105                                                  47745 xreader         eli              :1.6105       session-c1.scope          c1         -
:1.6107                                                  47757 WebKitNetworkPr eli              :1.6107       session-c1.scope          c1         -
:1.6477                                                  31538 thunderbird-bin eli              :1.6477       session-c1.scope          c1         -
:1.6478                                                  31538 thunderbird-bin eli              :1.6478       session-c1.scope          c1         -
:1.6531                                                   2451 cinnamon        eli              :1.6531       session-c1.scope          c1         -
:1.6562                                                  69100 csd-sound       eli              :1.6562       session-c1.scope          c1         -
:1.6592                                                   1641 firefox-bin     eli              :1.6592       session-c1.scope          c1         -
:1.6593                                                   1641 firefox-bin     eli              :1.6593       session-c1.scope          c1         -
:1.6594                                                   1700 Web Content     eli              :1.6594       session-c1.scope          c1         -
:1.6595                                                   1760 WebExtensions   eli              :1.6595       session-c1.scope          c1         -
:1.6596                                                   1824 Web Content     eli              :1.6596       session-c1.scope          c1         -
:1.6597                                                   1876 Web Content     eli              :1.6597       session-c1.scope          c1         -
:1.6598                                                   1975 cinnamon-settin eli              :1.6598       session-c1.scope          c1         -
:1.6599                                                   1975 cinnamon-settin eli              :1.6599       session-c1.scope          c1         -
:1.6608                                                   2640 Web Content     eli              :1.6608       session-c1.scope          c1         -
:1.6609                                                   3399 Web Content     eli              :1.6609       session-c1.scope          c1         -
:1.6615                                                   4286 csd-keyboard    eli              :1.6615       session-c1.scope          c1         -
:1.6616                                                   4296 csd-media-keys  eli              :1.6616       session-c1.scope          c1         -
:1.6617                                                   4296 csd-media-keys  eli              :1.6617       session-c1.scope          c1         -
:1.6618                                                   4345 pulseaudio      eli              :1.6618       session-c1.scope          c1         -
:1.6621                                                   4692 xed             eli              :1.6621       session-c1.scope          c1         -
:1.6625                                                   5225 busctl          eli              :1.6625       session-c1.scope          c1         -
:1.67                                                     2860 applet.py       eli              :1.67         session-c1.scope          c1         -
:1.7                                                      2189 at-spi-bus-laun eli              :1.7          user@1010.service         -          -
:1.73                                                     3537 gnome-system-mo eli              :1.73         session-c1.scope          c1         -
:1.8                                                      2196 at-spi2-registr eli              :1.8          user@1010.service         -          -
:1.9                                                      2204 gnome-keyring-d eli              :1.9          session-c1.scope          c1         -
:1.92                                                     8884 xdg-desktop-por eli              :1.92         user@1010.service         -          -
:1.93                                                     8888 xdg-document-po eli              :1.93         user@1010.service         -          -
:1.94                                                     8891 xdg-permission- eli              :1.94         user@1010.service         -          -
:1.95                                                     8902 pxgsettings     eli              :1.95         user@1010.service         -          -
:1.96                                                     8906 xdg-desktop-por eli              :1.96         user@1010.service         -          -
ca.desrt.dconf                                            2299 dconf-service   eli              :1.31         user@1010.service         -          -
ca.desrt.dconf-editor                                        - -               -                (activatable) -                         -
org.Cinnamon                                              2451 cinnamon        eli              :1.6531       session-c1.scope          c1         -
org.Cinnamon.HotplugSniffer                                  - -               -                (activatable) -                         -
org.Cinnamon.LookingGlass                                 2451 cinnamon        eli              :1.6531       session-c1.scope          c1         -
org.Cinnamon.Melange                                         - -               -                (activatable) -                         -
org.Cinnamon.Slideshow                                       - -               -                (activatable) -                         -
org.Nemo                                                 53331 nemo            eli              :1.1234       session-c1.scope          c1         -
org.NemoDesktop                                           2478 nemo-desktop    eli              :1.48         session-c1.scope          c1         -
org.PulseAudio1                                           4345 pulseaudio      eli              :1.6618       session-c1.scope          c1         -
org.a11y.Bus                                              2189 at-spi-bus-laun eli              :1.7          user@1010.service         -          -
org.bluez.obex                                            2499 obexd           eli              :1.52         user@1010.service         -          -
org.cinnamon.ScreenSaver                                  2534 cinnamon-screen eli              :1.57         session-c1.scope          c1         -
org.cinnamon.SettingsDaemon.KeybindingHandler             4296 csd-media-keys  eli              :1.6616       session-c1.scope          c1         -

So it’s quite clear that the players here were cinnamon’s main socket, csd-media-keys and pulseaudio.

Random insights on lead-acid battery theory

Overview

This is a collection of insights regarding lead-acid batteries, in particular those sealed batteries that are used in UPSes. Had it not been for my failed attempts to get a decent battery for my UPS, none of this would have been written. It was only when I asked myself why the UPS doesn’t maintain power nearly as long as I expected that I started to learn more about these batteries.

Batteries is far away from anything I do for a living. This is definitely not my expertise. Please keep that in mind. But as it turns out, there’s a lot to know about them.

Well, I think I’ll take that back: After a lot of playing with these batteries, my conclusion is that knowledge doesn’t help much. Run a test, see how long the battery holds up your computer, and that’s it. There doesn’t seem to be anything smarter to do.

There’s probably only one important thing to know about them: They should be fresh. That’s hardly new, and yet, that’s probably the bottom line. Lead-acid batteries discharge by themselves over time, and when they reach a certain level of discharge, they get damaged. Also, the discharge rate depends a lot on temperature. So a battery that has been stored in a hot place somewhere in a distant country for over a year is probably no good candidate for your UPS. The manufacturer’s timestamp is everything: It starts to seem to me that most batteries were more or less the same on the day that they left the factory. The difference is how much damage they caught during storage.

For a concise technical background, I recommend reading Power Sonic’s Technical Manual.

So without further ado, let’s go through my findings, in more or less random order.

A 7AH battery doesn’t really give 7AH

With lead-acid batteries, 7AH doesn’t meas 7A for 1 hour. Or 3.5A for 2 hours. Or any other combination, except for 0.35A for 20 hours. More about that below.

The amount of charge (and energy) that a lead-acid battery supplies until it’s discharged depends dramatically on the discharging current. The capacity printed on the battery is given for a 20-hours discharge, or using the jargon, 0.05C. That “C” is 7 taken from the 7AH figure, so 7AH are obtained if the discharge current is 0.35A. For larger currents, expect much less energy out of the battery.

This phenomenon is approximated by Peukert’s Law.

For example, my specific case: The load is 70W (at 142VA) according to the UPS itself. I’ll assume that the low power factor thing can be ignored, i.e. that the fact that the VA figure is twice the consumed power makes no difference. This low power factor is natural to switching power supplies, as they draw more current when the voltage is low, so their behavior is far from a plain resistor (unless specifically compensated to mitigate this effect). I’ll also assume that the UPS is 100% efficient on its voltage conversion, which is complete rubbish, but for the heck of it.

So for two 12V batteries in series it goes 70W/24V =~ 2.9A, which is about 0.4C (2.9 / 7 =~ 0.4). A ballpark figure can be taken from Figure 4 in Power Sonic’s Technical Manual, showing that the voltage starts to drop after about an hour, and reaches the critical value somewhere after an hour and a half. Note that I have different batteries.

Also from Table 2 of the same Manual, we have that the actual capacity of a 7AH battery, when drained with a 4.34A current, is 4.34AH (one hour). The current is higher than 2.9A, but given that the UPS isn’t really 100% efficient, it’s likely that the real discharge current is closer to 4A than to 2.9A. So that explains why the UPS said 1:12 hours when I updated the battery replacement date.

Many battery manufacturers supply datasheets with Discharge Characteristics Curves. Even better, if there are tables with Constant Current Discharge Characteristics (also referred to as F.V / Time specification). See Ritar’s datasheet for RT1270, for example. The numbers inside this table are the constant currents for each discharging scenario. Each row is the final voltage (F.V) per cell that is reached in this scenario, and each column represents the time it takes to reach this voltage.

It’s not clear at which voltage the UPS decides to stop discharging. Even though 1.60V per cell seems to be a recurring number in datasheets for the game over.  And on the other hand, I’ve repeatedly seen 0.35A (i.e. 0.05C) paired with 1.75V and 20 hours discharge in those F.V / Time tables. So it’s not really clear. Anyhow, finding a number that is similar to the intended current in the table can give an idea on how long the battery is supposed to last.

Obviously, different batteries behave differently on higher currents. I couldn’t find data on my “Bulls Power” batteries. So maybe they could meet the 7AH specification for a 20-hours discharge, and then perform really poorly with higher, real-life currents. No way to know for sure.

So what is the state of the battery after it has been discharged with less than 7AH? It’s still charged, as a matter of fact. A rapid discharge, which yields less than the full capacity is healthy for the battery. It’s when the full capacity is utilized with a slow discharge pattern that the battery can be damaged if it stays in that state for a long while.

Charging a lead-acid battery

There are in fact several ways to charge a lead acid battery (for example, this), but the common way to charge a battery is to feed it with a current of 0.3C (2.1A for a 7AH battery) until the voltage rises to above a threshold. Or to set a constant voltage (with a current limit) and charge until the current goes below a certain level. Then comes the second phase, which is the slow charging to 100%.

When the battery is full, a floating voltage should be applied to the battery. Each battery specifies its own preferred floating voltage, but they all land at around 13.5-13.8V. This is the correct way to maintain a fully charged battery: A small charging current compensates for the battery’s inherent discharge (typically around 0.001C, i.e. 7 mA for a 7AH battery), and the battery remains fully charged in a way that is best for its health.

Battery manufacturers usually state something like 10% discharge for 90 days or so. This corresponds to about 0.000046C, or ~0.3 mA for a 7AH battery. But in this scenario, the battery discharges against its own voltage, and not the elevated floating voltage (does it matter?).

Knowing the battery’s charge level

How does one estimate how much energy a lead-acid battery has? The answer is unpleasant, yet simple: There is really no way to measure it from the battery electronically. After reading quite some material on the subject, that became evident to me: There are plenty of papers describing exotic algorithms for estimating a battery’s health and charge level, and their abundance and variety proves that there’s really no way to tell, except for draining it.

Maybe the first question should have been: What is consumed when the battery gets discharged? For one, sulfuric acid reacts with the lead plates and become lead sulfate. When there’s no acid left, the battery is done. On the other hand, the lead plates themselves might deteriorate in the process. This can happen relatively early if the lead plates are partly damaged because of previous use (or abuse).

There is one way to measure the charge level that is considered reliable, which is measuring the open circuit voltage (OCV) after the battery has been disconnected for a while (some say a few hours, battery manufacturers typically require 24 hours). Letting the battery rest allows it to reach a chemical equilibrium, at which point the voltage reflects its charge level. This is surely true for a fresh battery, because the OCV reflects the specific gravity of the electrolyte, that is the concentration of the sulfuric acid.

As for batteries with some history, the picture is less clear, and I haven’t managed to figure out if the OCV voltages remain the same, and if the voltage vs. charge percentage relate to the original charge capacity, or the one that is available after the battery is worn out. The OCV measurementt covers the charge level in the sense of how much acid there’s left to consume, but will the lead plates hold that long?

For example, Power Sonic claims that the OVC goes from 1.94V/cell to 2.16V/cell for 0% to 100% charge respectively. As a 12-volt battery has 6 cells, this corresponds to 11.64V to 12.96V. These figures are quite similar to those presented by another manufacturer.

But what does 100% charge mean? 7AH or as much as is left when the battery has worked for some time? My anecdotal measurement of the batteries I took out from the UPS was 12.99V after letting them rest. In other words, they presented a OCV voltage corresponding to 100% charge, even though they had much less than 7AH.

So how does a UPS estimate the remaining runtime? Well, the simple way is to let the battery run out once, and there you have a number. Clearly, Smart UPS uses this method.

Are there any alternatives? In theory, the UPS could let the battery rest for 24 hours, and measure its OCV. This is possible, because most of the time the UPS doesn’t need the battery. But even my anecdotal measurement shows that a 100% charge-like reading doesn’t mean much.

For other types of batteries (Li-ion in particular), measuring the current on the battery, in and out (Coulomb Counting), gives an idea on how much charge it contains. This doesn’t work with lead acid batteries, because the recommended way to maintain a standby battery, is to continuously float charge it. That means holding a constant voltage (say, 2.25V per cell, that is 13.5V for a 12V battery, or 27V on a battery pair, as in SmartUPS 750).

As this voltage is higher than the OCV at rest, this causes a small trickle current (said to be about 0.001C), which compensates for the battery’s self discharge. Even if it overcharges the battery slightly, the gases that are released are recycled internally in a sealed battery, so there’s no damage.

Hence the recommended strategy for charging a lead-acid battery is to charge it quickly as long as its voltage / current pair indicates that it’s far from being fully charged, and then apply a constant, known and safe voltage. This allows it to charge completely slowly, and then maintain the charge without any risk for overcharging. Odds are that this is what the UPS does.

But that makes Coulomb Counting impossible: During the float charge phase (that is, virtually all the time) the current may and may not actually charge the battery.

Measuring the discharging curve

Being quite frustrated by bad batteries, I decided to go for a more direct approach: Instead of taking my computer down and pushing the battery into the UPS each time, I thought it would be easier and more informative to run a discharging test: I used my old Mustek 600 UPS to charge the battery, and then discharged it through a (cheap) 55W light bulb for a car’s headlight (which is intended for 12V of course). With this, I ran a classic lab experiment, recording voltage and current as a function of time. Like a school lab.

The first step is to charge the battery of course. I’ve discussed this topic briefly above, but what about my Mustek’s UPS? It takes a very easy and slow approach: With a partly discharged battery, it starts with an initial charging current of 0.4–0.35 A. This current goes slowly down as the battery’s voltage rises. The voltage goes up very slowly from 12.8V and eventually stabilizes at a voltage around 13.5-13.6V (depends on the battery). Charging a battery to a decent level takes 7 hours, but I went for a 24 hours charge before running a discharging test. Just to make sure the battery was really fully charged.

By the way, the Mustek UPS charges even when not powered on (but connected to power). When disconnected from power, no charging occurs.

Now to the interesting part: The discharging of the batteries. I made several tests, and they all yielded the same results, more or less. In all of these, I made accurate voltage measurements (with a Fluke) as a function of time. This is the plot of the last experiment I did, with a logarithmic x-axis, each ‘x’ mark on the plot is a measurement:

Discharging a lead acid battery at 0.55C, voltage vs. time

This experiment was done on one of the Bulls Power batteries.

I measured the current throughout the session: The light bulb drew 4.00 A in the beginning (corresponding to ~48W), and the currents remained above 3.8A during the first 20 minutes. So this can be considered a 0.55C discharge.

The first thing to note is that the graph fits the discharge curve for Ritar’s RT1270 for 0.55C during the first 18-19 minutes: The voltage goes down gradually, and drops about 0.04V/cell during the first 10 minutes. On the next 8-9 minutes, the voltage drop is about 0.05V/cell, also following Ritar’s curve.

But then it’s quickly downhill, strongly diverting from the expected graph. The battery collapses quickly at this point. I haven’t found a single reference to this behavior, but I suppose that it’s a result of aging. And/or deep discharge during storage.

Is this consistent with the fact that this battery held for 9 minutes in the UPS, along with one that behaves roughly the same? Maybe. The UPS reported a power consumption of 105W in that situation, so if I’ll assume 90% power efficiency, we have a current of 105 * (1 / 0.9) / 24 = 4.86A ≈ 0.7C. That’s somewhere between the graph for 0.55C and 1C. So with the steeper discharging curve of this case, it’s actually possible that the collapse began much earlier.

Other random takeaways:

  • With this battery, it doesn’t matter so much at which voltage the UPS powers itself off. The last part’s slope is so steep, so it’s a matter of 30 seconds this way or another.
  • A 55W light bulb is blinding and heating. Be sure to have it fixed in a way that prevents it from heating the battery and also make sure  that it’s literally out of sight, and that no visual contact will be needed with it as long as the experiment runs.
  • Check the crocodiles’ or easy-hook’s resistance. In particular crocodiles can have a high an unstable resistance (0.1Ω is high in this context, because it causes a voltage drop of 0.4V).
  • As it says in many information sources, lead batteries recover after a discharging session after some time of rest: After a day, I connected the battery to the bulb, and it began discharging at 11.8V, and falling rapidly.

Discharging curves, round #2

The saga went on, and I tell it in detail in the other post. The quick summary is: Being unhappy with the Bull batteries, I bought a pair of batteries from a company named Afik, and put them in the UPS. The results weren’t really impressing with these either, so I decided to replace the Afik batteries with something else. Eventually, I ended up with a pair of Vega Power batteries.

Unfortunately, I didn’t make any measurements on the Afik batteries before putting them in the UPS, mainly because I thought they would work, and end of story. So as things turned out, I obtained the discharging curve from the new Vega Power batteries before putting them in the UPS, and then I did the same thing with the Afik batteries only after they had been in the UPS for a few months.

This time I created an automatic test setting that allowed two readings per second, so the curves are more accurate and detailed.

Note to self: See new-computer/ups/lead acid batteries/discharge-tests-2/.

I used the same light 55W light bulb, which drew 4.0A at the beginning of the discharging test, and then went down slowly to about 3.5A towards the end of the test. More or less constant current, that is.

The colors of the curves in the graphs below represent which test they belong to:

  • Red: Vega Power battery #1.
  • Blue: Vega Power battery #2, first test.
  • Magenta: Vega Power battery #2, second test.
  • Green: Afik battery #1.
  • Cyan: Afik battery #2.

Evidently, I obtained a curve from Vega Power battery #2 twice: Once after charging for ~24 hours (from as it came from the shop) with the Mustek UPS, and the second time after a total of ~48 hours from the state of the previous discharging session. The graphs below might imply that I got these two measurements swapped, but no, there’s no mistake about this.

Vega Power battery #1 was charged during 26 hours (from its initial state from the shop), and then 18 hours with a brief pause between the two charging sessions.

So to the first graph. Time in minutes on the X axis, and voltage per cell on the Y axis.

Discharge curves of lead-acid batteries, linear time scale

The curves begin one minute after discharging starts, and stop when the battery reached 10.0V (with 6 cells in the battery, that’s 1.67V / cell). I’ll show more details below about the first minute. As for choosing 1.67V as the end for discharging: The graphs for charging current around 0.55C often go this low in datasheets, so I suppose this shouldn’t damage the battery.

The first and obvious thing one can see is that the Afik batteries held shorter than the Vega Power batteries: 55 minutes vs. 80-87 minutes. That’s consistent with the fact that the Vega Power lasted longer in the UPS.

But as I’ve written in that other post, the UPS started to panic-beeping after 7:30 minutes with the Afik batteries, and then kept power up for another 15 minutes. Why did that happen? There’s nothing in the green nor cyan curves that offers a clue for the reason to the early panic. On the other hand, it’s evident that these batteries lost voltage quite rapidly after about 40 minutes, possibly going into “deep discharge”. That said, they didn’t collapse nearly as quickly as the Bull Power batteries. So there are definitely different levels of junk.

As for the Vega Power batteries, they performed surprisingly well on these tests. The shortest run was 80 minutes. Taking a gross average of 3.75A as the current, that means that the battery gave 5AH. Quite impressive. Also, according to Ritar’s datasheet for RT1270, that battery will reach 1.85V/cell after one hour if a current of 3.321A is drawn from it. Vega Power got to that region after 70 minutes with a higher current.

So the question is why the UPS started panicking after 28 minutes and cut the power very soon after that. Indeed, the current that is drawn from the batteries was higher inside the UPS. But why that sudden cut? Does it have to do with the curves’ bending at around 1.96V?

In fact, the reason that I repeated the first discharging test on Vega Power #2 (in blue) was that sharp bend. To my surprise, the bend went away on the second test (in magenta), but the battery reached the 1.67V point 7 minutes earlier (after a 48 hours charging). Why? Don’t know. Lead acid batteries are not supposed to have any memory.

In the next graph, I’ve removed the curves for the Afik batteries, and left only Vega Power’s curves. In addition, I’ve shifted the positions of these curves in the X axis. Namely, I delayed Vega #1′s curve by 9 minutes, and Vega #2′s first test’s curve 2:15 minutes. And we get this:

Discharge curves of lead-acid batteries, aligned, linear time scale

So it’s evident that both batteries ran according to an accurate discharging pattern for at least 45 minutes, and then things happened. Is this the start of “deep discharge”? Does this mean that discharging the batteries below 1.95V is a bad idea? Maybe this is what my UPS thought, and accordingly stopped after 28 minutes (which could be the same point, as it probably drew more current than this test).

Doesn’t the part where all three curves overlap look a little linear? I’ll come back to that below.

It also looks like the shifting I made in the X axis compensates for differences in the initial charging level. If that’s true, the charging levels of Vega Power #2 were nearly the same in both tests (slightly more charge after the second, 48-hours, charging session). But if so, why did this battery reach 1.67V faster on the second round?

And why was Vega Power #1 significantly less charged after a super-long charging session? Maybe an issue with the Mustek UPS?

The third graph is the same as the first, but with logarithmic X-axis.

Discharge curves of lead-acid batteries, logarithmic time scale

The interesting thing about this representation is that it’s comparable with the curves that appear in some batteries’ datasheets. It’s quite evident that these curves are similar to those that are usually given for 0.55C.

The bends of the blue and red curves are more evident on this graph. Recall that they were obtained during the first tests of each of the Vega Power batteries. The pretty curve in magenta is the second run of Vega Power #2.

Finally, I discharged Afik battery #2 down to 1V, which is considered a damaging thing to do. But who cares, I have no plans for this battery anyhow. So here’s the curve:

Discharge curves of a lead-acid battery, including deep discharge

Note that this is the same test and same curve as shown above, only that I display it in full here.

Transient responses

So far I’ve been looking at the curves during the actual discharging phase. But some possibly interesting things happen before and after that.

So first, what happens when the discharging begins:

Discharge curves of lead-acid batteries, initial response

Surprise, surprise, there’s an undershoot. Why? I’m tired of trying to even speculate. But it’s definitely a thing. And by the way, the initial voltages (i.e. with zero current) have no meaning in particular (?), because the batteries didn’t have enough time to stabilize after charging.

Note that the X-axis is given in seconds here. Not minutes.

And here’s a close-up on the same graph, just to make the point that the undershoot occurred on all curves:

Discharge curves of lead-acid batteries, initial response, zoomed in

The next thing to look at is what happens when the discharging process is stopped. I did this by disconnecting the light bulb, so the current went down abruptly from 3.5A to zero. This occurs at time zero of the following graph.

Recovery from discharging of lead-acid batteries

Recall that Afik #2 (cyan) went down as low as 1V/cell, and yet it recovered quite similarly Afik #1, which didn’t go through this minor trauma.

It’s a well-known fact that lead-acid batteries recover after a high-current discharge and resume their capability to provide energy, even if they aren’t charged inbetween. This graph shows this phenomenon.

It’s worth looking at the first couple of minutes of the recovery. This is just a zoom-in of the previous graph (and with time axis shown in seconds instead).

Recovery from discharging of lead-acid batteries, zoomed in

Aside from the immediate jump of the voltage when the light bulb is disconnected, there’s a smooth and yet fast rise in the voltage.

Another thing that is evident from this graph, is how much faster the Afik batteries went down towards the 1.67V level, compared with the Vega Power batteries (this refers to the time period before zero).

Is a DC test legit?

This test is made with a DC current, but the UPS discharges the battery with an alternating current: As the UPS feeds the computer with an AC voltage, the delivered power as a function of time is a more or less a sinus function (if the UPS produces a sinus wave). At any time, the energy that is delivered to the computer is drawn from the battery. The UPS barely stores any energy in its own circuits. So the current from the battery is also a sinus, more or less.

So does a test with a DC current reflect the situation with an alternating current? Is it good enough to rely on the average power, and derive a DC current from that? According to Okazaki et al. (“Influence of superimposed alternating current on capacity and cycle life for lead-acid batteries“) the rippled current doesn’t change the battery’s capacity at all. They tested a current of I=I0(1+sinωt) at 0.1Hz to 4 kHz and found only negligible differences (1%). So the DC current test seems to be accurate.

Linear discharge curve?

I’m going back to Vega Power’s second graphs, where I aligned three graphs by shifting them in the X axis. Let’s look at this graph again, but now with a dash-dotted line which represents the first-order linear regression of the segment between 20 and 40 minutes:

Discharge curves of lead-acid batteries, linear segment shown

So there’s no doubt that the Vega Power batteries’ voltage went down linearly as a function of time for a while. Maybe it was the divergence from this linear descent that made the UPS panic after 28 minutes (once again, the UPS drew a higher current).

Looking at Afik’s curves, they don’t appear linear at any stage. Maybe that’s why the UPS became nervous very soon, but then nothing dramatic happened, so it kept running.

What about the Bulls Power battery? I didn’t show the graph with linear time scale above, but here it is:

Discharge curves of lead-acid batteries, linear segment shown

Once again, the linear regression is shown with a dash-dotted line. There are fewer measurement points, but it’s quite clear that the voltage descends linearly for a while.

The slopes of these dash-dotted lines are -2.09 mV/minute for the Vega Power batteries, and -2.86 mV/minute for Bulls Power.

What’s the explanation for this linear behavior? Reiterating that lead-acid batteries is not my expertise, I’ll give it a shot:

At the first stage of discharging, the voltage curves seem to be governed mainly by the concentration of sulfuric acid close to the lead plates. As the acid is consumed by the discharging process, the supply of new acid depends on motion of ions between the two plates. A high current means a quick rate of consumption that works against the motion of ions, some of which depends on plain diffusion. So the result is a lower concentration of acid, hence a lower voltage. This, I suppose, explains why the curves go down faster for higher currents, and why the battery recovers quickly afterwards.

But here’s the thing: When the current is constant, the gradient of the sulfuric acid’s concentration stabilizes inside the cell after a few minutes. The motion of ions gets into a stable pace. However, as the current flows, sulfuric acid is consumed. This brings down the concentration of acid overall, even though the differences (gradients) remain the same. The current is constant, and the sulfuric acid is consumed against electrical charge that passes through the battery, so we have a linear slope.

At some point, the curve breaks off from this pattern, and begins diving faster. A possible reason is that the lead plates begin to lose effective surface area (blocked by lead sulfate?), so the current needs to go through a smaller surface. This is equivalent to a larger current, with the due implications on voltage drop.

This seems to be what is often referred to as “deep discharge”. This is a term that is loosely used with reference to drawing too much charge from a battery. Everyone seems to agree that this phase reduces the battery’s capacity on the following cycles, but when does it start? Is it when the curve diverges from the linear pattern? Or when the voltage gets to the level that the manufacturer suggests avoiding?

Academic material on the topic is also scarce (for example, this paper), but it seems like this phase occurs when the lead plates start to take their impact. As evident from the graphs shown above, this phase is rather unpredictable. Even though lead-acid batteries are said not to have a memory effect, that seems to be true until it comes to the lead plates themselves. They remember.

It looks like Smart UPS takes the linear slope approach. Or maybe it looks at the slope of the voltage as a function of time. As long as sulfuric acid is consumed, it’s OK, but when the lead plates begin to deteriorate, that’s time to beep like crazy. This strategy makes a lot of sense for a long battery life, but less beneficial when power outages are rare, so who cares if the battery gets a small hit every few months. But all this is my speculation on how the UPS makes its choices.

Afterthoughts

So what have I learned from all this messing around with the batteries? Not a whole lot of relevant information, I’m afraid. Invented in 1859, lead-acid batteries are easy to manufacture and easy to use, but there doesn’t seem to be a simple theory that accurately describes their behavior. As my own anecdotal experiments show, even the same battery won’t repeat its own discharging curve twice.

Maybe it was because I tried low-quality batteries (or are they?) and inaccurate apparatus for charging the batteries (the Mustek UPS). On the other hand, it’s difficult to find written material that describes the behavior of these batteries, even under lab conditions.

The state of car batteries is often tested by measuring the “cranking current”. This is the huge current when the car is started. This is a crude and indirect measurement of the lead plates’ resistance, which I suppose gives some indication of the battery’s condition. Does it ensure that your battery will hold for the next 12 months? Good luck with that.

Startup idea

The main conclusion from all this rambling is that the only way to know how long a battery will supply power is to fully discharge it. That is, down to a reasonable level that doesn’t harm it. On the other hand, a UPS should be able to give an accurate estimate of how long time it can hold up the computer, so that its user can decide when it’s time to replace the battery or batteries. Plus, the UPS needs to know this for the purpose of shutting down the computer gently a few minutes before it’s game over.

Smart UPS has a feature called “calibration” which does exactly that: It puts the computer on the batteries and discharges them until they reach the lowest acceptable level, and then resumes normal operation. The problem is that if the main power goes out at that moment, there’s nothing to hold up the computer with. So this has to be initiated by the user every now and then.

Solution: Make a UPS with say, four batteries, each of which is connected independently to the UPS’ power conversion circuitry. This way, the UPS can initiate a full discharge test on one of these batteries while the others remain fully charged. At the worst moment, the overall capacity will stand at 75%.

These discharge tests can be initiated automatically by the UPS for each of the four batteries every couple of weeks or so. This way, the UPS can give an accurate number for how long the UPS will hold up power.

This allows the user to decide when it’s time to replace the batteries. It also makes it possible to replace the batteries gradually, without turning off power. And the UPS can also inform the user if the new battery was worth the money.

Will this startup work? I doubt it. The vast majority of people just buy a UPS and believe it will do the work. They realize their mistake when the power goes out for more than two minutes, and buy a new UPS instead. Replacing lead-acid batteries separately? Come on.

Empty statements in Verilog’s if-else are OK

A reminder to self: As the title implies, it’s OK to have an empty statement in Verilog. This is useful in order to make all the following clauses active only if the first condition isn’t met. This is commonly seen in C, but I wasn’t sure if it’s OK to do this in Verilog too. So it is.

So this is fine, for example:

always @(posedge clk)
  if (relax)
     ; // Do nothing.
  else if (this_condition)
    do_this <= do_this + 1;
  else if (that_condition)
    do_that <= do_that + 1;

This is backed up by the IEEE standard 1364-2001 (standard Verilog), which says in Syntax 9-4:

conditional_statement ::=
  if ( expression )
     statement_or_null [ else statement_or_null ]
     | if_else_if_statement

And then defines in section A.6.4:

statement ::=
{ attribute_instance } blocking_assignment ;
  | { attribute_instance } case_statement
  | { attribute_instance } conditional_statement
  | { attribute_instance } disable_statement
  | { attribute_instance } event_trigger
  | { attribute_instance } loop_statement
  | { attribute_instance } nonblocking_assignment ;
  | { attribute_instance } par_block
  | { attribute_instance } procedural_continuous_assignments ;
  | { attribute_instance } procedural_timing_control_statement
  | { attribute_instance } seq_block
  | { attribute_instance } system_task_enable
  | { attribute_instance } task_enable
  | { attribute_instance } wait_statement

statement_or_null ::=
  statement | { attribute_instance } ;

Those attribute_instance are irrelevant, in particular as they are optional.

udev, the “authorized” attribute and other failed attempts to ban a bogus USB keyboard

Introduction

This is a spin-off post about failing attempts to fix the problem with a webcam’s keyboard buttons. Namely, that the a shaky physical connections caused the USB device to go on and off the bus rapidly, and consequently crash X windows. The background story is in this post.

There is really nothing to learn from this post regarding how to accomplish something. The only reason I don’t trash this is that there’s some possibly useful information about udev.

What I tried to do

There is a possibility to ban a USB device from being accessed by Linux, by virtue of the “authorized” attribute. Something like this.

# cd /sys/devices/pci0000:00/0000:00:14.0/usb2/2-5/
# echo 0 > authorized
^C^Z
# echo 1 > authorized
bash: echo: write error: Invalid argument

The ^C^Z after the first command is not a mistake. The first command got stuck for several seconds.

And this can be done with udev rules as well.

But surprisingly enough, there doesn’t seem to be a way to avoid the generation of the /dev/input/event* file without ignoring the USB device completely. It’s possible to delete it early enough, but that doesn’t really help, it turns out.

ATTRS{authorized} can be set to 0 only for the entire USB device. There is no such parameter for a udev event with the “input” subsystem.

Some udev queries

While trying to figure out the ATTRS{authorized} thing, these are my little play-arounds. Nothing really useful here:

$ sudo udevadm monitor --udev --property

I got

UDEV  [5662716.427855] add      /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1 (usb)
ACTION=add
BUSNUM=001
DEVNAME=/dev/bus/usb/001/098
DEVNUM=098
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1
DEVTYPE=usb_device
DRIVER=usb
ID_BUS=usb
ID_MODEL=USB2.0_PC_CAMERA
ID_MODEL_ENC=USB2.0\x20PC\x20CAMERA
ID_MODEL_ID=2311
ID_REVISION=0100
ID_SERIAL=Generic_USB2.0_PC_CAMERA
ID_USB_INTERFACES=:0e0100:0e0200:
ID_VENDOR=Generic
ID_VENDOR_ENC=Generic
ID_VENDOR_FROM_DATABASE=GEMBIRD
ID_VENDOR_ID=1908
MAJOR=189
MINOR=97
PRODUCT=1908/2311/100
SEQNUM=24413
SUBSYSTEM=usb
TYPE=239/2/1
USEC_INITIALIZED=5662716427506

UDEV  [5662716.430744] add      /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.1 (usb)
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.1
DEVTYPE=usb_interface
DRIVER=uvcvideo
ID_USB_CLASS_FROM_DATABASE=Miscellaneous Device
ID_USB_PROTOCOL_FROM_DATABASE=Interface Association
ID_VENDOR_FROM_DATABASE=GEMBIRD
INTERFACE=14/2/0
MODALIAS=usb:v1908p2311d0100dcEFdsc02dp01ic0Eisc02ip00in01
PRODUCT=1908/2311/100
SEQNUM=24420
SUBSYSTEM=usb
TYPE=239/2/1
USEC_INITIALIZED=5662716430425

UDEV  [5662716.430935] add      /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0 (usb)
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0
DEVTYPE=usb_interface
DRIVER=uvcvideo
ID_USB_CLASS_FROM_DATABASE=Miscellaneous Device
ID_USB_PROTOCOL_FROM_DATABASE=Interface Association
ID_VENDOR_FROM_DATABASE=GEMBIRD
INTERFACE=14/1/0
MODALIAS=usb:v1908p2311d0100dcEFdsc02dp01ic0Eisc01ip00in00
PRODUCT=1908/2311/100
SEQNUM=24414
SUBSYSTEM=usb
TYPE=239/2/1
USEC_INITIALIZED=5662716430396

UDEV  [5662716.433265] add      /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/media5 (media)
ACTION=add
DEVNAME=/dev/media5
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/media5
MAJOR=509
MINOR=5
SEQNUM=24416
SUBSYSTEM=media
USEC_INITIALIZED=5662716433110

UDEV  [5662716.435400] bind     /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.1 (usb)
ACTION=bind
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.1
DEVTYPE=usb_interface
DRIVER=uvcvideo
ID_USB_CLASS_FROM_DATABASE=Miscellaneous Device
ID_USB_PROTOCOL_FROM_DATABASE=Interface Association
ID_VENDOR_FROM_DATABASE=GEMBIRD
INTERFACE=14/2/0
MODALIAS=usb:v1908p2311d0100dcEFdsc02dp01ic0Eisc02ip00in01
PRODUCT=1908/2311/100
SEQNUM=24421
SUBSYSTEM=usb
TYPE=239/2/1
USEC_INITIALIZED=5662716430425

UDEV  [5662716.436539] add      /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/video4linux/video0 (video4linux)
ACTION=add
COLORD_DEVICE=1
COLORD_KIND=camera
DEVLINKS=/dev/v4l/by-id/usb-Generic_USB2.0_PC_CAMERA-video-index0 /dev/v4l/by-path/pci-0000:00:14.0-usb-0:5.2.1:1.0-video-index0
DEVNAME=/dev/video0
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/video4linux/video0
ID_BUS=usb
ID_FOR_SEAT=video4linux-pci-0000_00_14_0-usb-0_5_2_1_1_0
ID_MODEL=USB2.0_PC_CAMERA
ID_MODEL_ENC=USB2.0\x20PC\x20CAMERA
ID_MODEL_ID=2311
ID_PATH=pci-0000:00:14.0-usb-0:5.2.1:1.0
ID_PATH_TAG=pci-0000_00_14_0-usb-0_5_2_1_1_0
ID_REVISION=0100
ID_SERIAL=Generic_USB2.0_PC_CAMERA
ID_TYPE=video
ID_USB_DRIVER=uvcvideo
ID_USB_INTERFACES=:0e0100:0e0200:
ID_USB_INTERFACE_NUM=00
ID_V4L_CAPABILITIES=:capture:
ID_V4L_PRODUCT=USB2.0 PC CAMERA: USB2.0 PC CAM
ID_V4L_VERSION=2
ID_VENDOR=Generic
ID_VENDOR_ENC=Generic
ID_VENDOR_ID=1908
MAJOR=81
MINOR=0
SEQNUM=24415
SUBSYSTEM=video4linux
TAGS=:seat:uaccess:
USEC_INITIALIZED=5662716436054

UDEV  [5662716.436956] add      /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121 (input)
ACTION=add
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121
EV=3
ID_BUS=usb
ID_FOR_SEAT=input-pci-0000_00_14_0-usb-0_5_2_1_1_0
ID_INPUT=1
ID_INPUT_KEY=1
ID_MODEL=USB2.0_PC_CAMERA
ID_MODEL_ENC=USB2.0\x20PC\x20CAMERA
ID_MODEL_ID=2311
ID_PATH=pci-0000:00:14.0-usb-0:5.2.1:1.0
ID_PATH_TAG=pci-0000_00_14_0-usb-0_5_2_1_1_0
ID_REVISION=0100
ID_SERIAL=Generic_USB2.0_PC_CAMERA
ID_TYPE=video
ID_USB_DRIVER=uvcvideo
ID_USB_INTERFACES=:0e0100:0e0200:
ID_USB_INTERFACE_NUM=00
ID_VENDOR=Generic
ID_VENDOR_ENC=Generic
ID_VENDOR_ID=1908
KEY=100000 0 0 0
MODALIAS=input:b0003v1908p2311e0100-e0,1,kD4,ramlsfw
NAME="USB2.0 PC CAMERA: USB2.0 PC CAM"
PHYS="usb-0000:00:14.0-5.2.1/button"
PRODUCT=3/1908/2311/100
PROP=0
SEQNUM=24417
SUBSYSTEM=input
TAGS=:seat:
USEC_INITIALIZED=5662716436500

UDEV  [5662716.591160] add      /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121/event22 (input)
ACTION=add
BACKSPACE=guess
DEVLINKS=/dev/input/by-path/pci-0000:00:14.0-usb-0:5.2.1:1.0-event /dev/input/by-id/usb-Generic_USB2.0_PC_CAMERA-event-if00
DEVNAME=/dev/input/event22
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121/event22
ID_BUS=usb
ID_INPUT=1
ID_INPUT_KEY=1
ID_MODEL=USB2.0_PC_CAMERA
ID_MODEL_ENC=USB2.0\x20PC\x20CAMERA
ID_MODEL_ID=2311
ID_PATH=pci-0000:00:14.0-usb-0:5.2.1:1.0
ID_PATH_TAG=pci-0000_00_14_0-usb-0_5_2_1_1_0
ID_REVISION=0100
ID_SERIAL=Generic_USB2.0_PC_CAMERA
ID_TYPE=video
ID_USB_DRIVER=uvcvideo
ID_USB_INTERFACES=:0e0100:0e0200:
ID_USB_INTERFACE_NUM=00
ID_VENDOR=Generic
ID_VENDOR_ENC=Generic
ID_VENDOR_ID=1908
LIBINPUT_DEVICE_GROUP=3/1908/2311:usb-0000:00:14.0-5.2
MAJOR=13
MINOR=86
SEQNUM=24418
SUBSYSTEM=input
TAGS=:power-switch:
USEC_INITIALIZED=5662716590816
XKBLAYOUT=us,il
XKBMODEL=pc105
XKBOPTIONS=grp:alt_shift_toggle,grp_led:scroll
XKBVARIANT=,

UDEV  [5662716.593390] bind     /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0 (usb)
ACTION=bind
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0
DEVTYPE=usb_interface
DRIVER=uvcvideo
ID_USB_CLASS_FROM_DATABASE=Miscellaneous Device
ID_USB_PROTOCOL_FROM_DATABASE=Interface Association
ID_VENDOR_FROM_DATABASE=GEMBIRD
INTERFACE=14/1/0
MODALIAS=usb:v1908p2311d0100dcEFdsc02dp01ic0Eisc01ip00in00
PRODUCT=1908/2311/100
SEQNUM=24419
SUBSYSTEM=usb
TYPE=239/2/1
USEC_INITIALIZED=5662716430396

UDEV  [5662716.595836] bind     /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1 (usb)
ACTION=bind
BUSNUM=001
DEVNAME=/dev/bus/usb/001/098
DEVNUM=098
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1
DEVTYPE=usb_device
DRIVER=usb
ID_BUS=usb
ID_MODEL=USB2.0_PC_CAMERA
ID_MODEL_ENC=USB2.0\x20PC\x20CAMERA
ID_MODEL_ID=2311
ID_REVISION=0100
ID_SERIAL=Generic_USB2.0_PC_CAMERA
ID_USB_INTERFACES=:0e0100:0e0200:
ID_VENDOR=Generic
ID_VENDOR_ENC=Generic
ID_VENDOR_FROM_DATABASE=GEMBIRD
ID_VENDOR_ID=1908
MAJOR=189
MINOR=97
PRODUCT=1908/2311/100
SEQNUM=24422
SUBSYSTEM=usb
TYPE=239/2/1
USEC_INITIALIZED=5662716427506

So the device I want to avoid was /dev/input/event22 this time. What’s its attributes?

$ sudo udevadm info -a -n /dev/input/event22 

Udevadm info starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121/event22':
    KERNEL=="event22"
    SUBSYSTEM=="input"
    DRIVER==""

  looking at parent device '/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121':
    KERNELS=="input121"
    SUBSYSTEMS=="input"
    DRIVERS==""
    ATTRS{name}=="USB2.0 PC CAMERA: USB2.0 PC CAM"
    ATTRS{phys}=="usb-0000:00:14.0-5.2.1/button"
    ATTRS{properties}=="0"
    ATTRS{uniq}==""

  looking at parent device '/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0':
    KERNELS=="1-5.2.1:1.0"
    SUBSYSTEMS=="usb"
    DRIVERS=="uvcvideo"
    ATTRS{authorized}=="1"
    ATTRS{bAlternateSetting}==" 0"
    ATTRS{bInterfaceClass}=="0e"
    ATTRS{bInterfaceNumber}=="00"
    ATTRS{bInterfaceProtocol}=="00"
    ATTRS{bInterfaceSubClass}=="01"
    ATTRS{bNumEndpoints}=="01"
    ATTRS{iad_bFirstInterface}=="00"
    ATTRS{iad_bFunctionClass}=="0e"
    ATTRS{iad_bFunctionProtocol}=="00"
    ATTRS{iad_bFunctionSubClass}=="03"
    ATTRS{iad_bInterfaceCount}=="02"
    ATTRS{interface}=="USB2.0 PC CAMERA"
    ATTRS{supports_autosuspend}=="1"

  looking at parent device '/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1':
    KERNELS=="1-5.2.1"
    SUBSYSTEMS=="usb"
    DRIVERS=="usb"
    ATTRS{authorized}=="1"
    ATTRS{avoid_reset_quirk}=="0"
    ATTRS{bConfigurationValue}=="1"
    ATTRS{bDeviceClass}=="ef"
    ATTRS{bDeviceProtocol}=="01"
    ATTRS{bDeviceSubClass}=="02"
    ATTRS{bMaxPacketSize0}=="64"
    ATTRS{bMaxPower}=="256mA"
    ATTRS{bNumConfigurations}=="1"
    ATTRS{bNumInterfaces}==" 2"
    ATTRS{bcdDevice}=="0100"
    ATTRS{bmAttributes}=="80"
    ATTRS{busnum}=="1"
    ATTRS{configuration}==""
    ATTRS{devnum}=="98"
    ATTRS{devpath}=="5.2.1"
    ATTRS{idProduct}=="2311"
    ATTRS{idVendor}=="1908"
    ATTRS{ltm_capable}=="no"
    ATTRS{manufacturer}=="Generic"
    ATTRS{maxchild}=="0"
    ATTRS{product}=="USB2.0 PC CAMERA"
    ATTRS{quirks}=="0x0"
    ATTRS{removable}=="unknown"
    ATTRS{speed}=="480"
    ATTRS{urbnum}=="16"
    ATTRS{version}==" 2.00"

  looking at parent device '/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2':
    KERNELS=="1-5.2"
    SUBSYSTEMS=="usb"
    DRIVERS=="usb"
    ATTRS{authorized}=="1"
    ATTRS{avoid_reset_quirk}=="0"
    ATTRS{bConfigurationValue}=="1"
    ATTRS{bDeviceClass}=="09"
    ATTRS{bDeviceProtocol}=="01"
    ATTRS{bDeviceSubClass}=="00"
    ATTRS{bMaxPacketSize0}=="64"
    ATTRS{bMaxPower}=="100mA"
    ATTRS{bNumConfigurations}=="1"
    ATTRS{bNumInterfaces}==" 1"
    ATTRS{bcdDevice}=="0100"
    ATTRS{bmAttributes}=="e0"
    ATTRS{busnum}=="1"
    ATTRS{configuration}==""
    ATTRS{devnum}=="75"
    ATTRS{devpath}=="5.2"
    ATTRS{idProduct}=="7250"
    ATTRS{idVendor}=="214b"
    ATTRS{ltm_capable}=="no"
    ATTRS{maxchild}=="4"
    ATTRS{product}=="USB2.0 HUB"
    ATTRS{quirks}=="0x0"
    ATTRS{removable}=="unknown"
    ATTRS{speed}=="480"
    ATTRS{urbnum}=="409"
    ATTRS{version}==" 2.00"

  looking at parent device '/devices/pci0000:00/0000:00:14.0/usb1/1-5':
    KERNELS=="1-5"
    SUBSYSTEMS=="usb"
    DRIVERS=="usb"
    ATTRS{authorized}=="1"
    ATTRS{avoid_reset_quirk}=="0"
    ATTRS{bConfigurationValue}=="1"
    ATTRS{bDeviceClass}=="09"
    ATTRS{bDeviceProtocol}=="02"
    ATTRS{bDeviceSubClass}=="00"
    ATTRS{bMaxPacketSize0}=="64"
    ATTRS{bMaxPower}=="0mA"
    ATTRS{bNumConfigurations}=="1"
    ATTRS{bNumInterfaces}==" 1"
    ATTRS{bcdDevice}=="0123"
    ATTRS{bmAttributes}=="e0"
    ATTRS{busnum}=="1"
    ATTRS{configuration}==""
    ATTRS{devnum}=="73"
    ATTRS{devpath}=="5"
    ATTRS{idProduct}=="5411"
    ATTRS{idVendor}=="0bda"
    ATTRS{ltm_capable}=="no"
    ATTRS{manufacturer}=="Generic"
    ATTRS{maxchild}=="4"
    ATTRS{product}=="4-Port USB 2.0 Hub"
    ATTRS{quirks}=="0x0"
    ATTRS{removable}=="removable"
    ATTRS{speed}=="480"
    ATTRS{urbnum}=="69"
    ATTRS{version}==" 2.10"

  looking at parent device '/devices/pci0000:00/0000:00:14.0/usb1':
    KERNELS=="usb1"
    SUBSYSTEMS=="usb"
    DRIVERS=="usb"
    ATTRS{authorized}=="1"
    ATTRS{authorized_default}=="1"
    ATTRS{avoid_reset_quirk}=="0"
    ATTRS{bConfigurationValue}=="1"
    ATTRS{bDeviceClass}=="09"
    ATTRS{bDeviceProtocol}=="01"
    ATTRS{bDeviceSubClass}=="00"
    ATTRS{bMaxPacketSize0}=="64"
    ATTRS{bMaxPower}=="0mA"
    ATTRS{bNumConfigurations}=="1"
    ATTRS{bNumInterfaces}==" 1"
    ATTRS{bcdDevice}=="0415"
    ATTRS{bmAttributes}=="e0"
    ATTRS{busnum}=="1"
    ATTRS{configuration}==""
    ATTRS{devnum}=="1"
    ATTRS{devpath}=="0"
    ATTRS{idProduct}=="0002"
    ATTRS{idVendor}=="1d6b"
    ATTRS{interface_authorized_default}=="1"
    ATTRS{ltm_capable}=="no"
    ATTRS{manufacturer}=="Linux 4.15.0-20-generic xhci-hcd"
    ATTRS{maxchild}=="16"
    ATTRS{product}=="xHCI Host Controller"
    ATTRS{quirks}=="0x0"
    ATTRS{removable}=="unknown"
    ATTRS{serial}=="0000:00:14.0"
    ATTRS{speed}=="480"
    ATTRS{urbnum}=="454"
    ATTRS{version}==" 2.00"

  looking at parent device '/devices/pci0000:00/0000:00:14.0':
    KERNELS=="0000:00:14.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="xhci_hcd"
    ATTRS{broken_parity_status}=="0"
    ATTRS{class}=="0x0c0330"
    ATTRS{consistent_dma_mask_bits}=="64"
    ATTRS{d3cold_allowed}=="1"
    ATTRS{dbc}=="disabled"
    ATTRS{device}=="0xa2af"
    ATTRS{dma_mask_bits}=="64"
    ATTRS{driver_override}=="(null)"
    ATTRS{enable}=="1"
    ATTRS{irq}=="33"
    ATTRS{local_cpulist}=="0-11"
    ATTRS{local_cpus}=="0,00000000,00000fff"
    ATTRS{msi_bus}=="1"
    ATTRS{numa_node}=="0"
    ATTRS{revision}=="0x00"
    ATTRS{subsystem_device}=="0x5007"
    ATTRS{subsystem_vendor}=="0x1458"
    ATTRS{vendor}=="0x8086"

  looking at parent device '/devices/pci0000:00':
    KERNELS=="pci0000:00"
    SUBSYSTEMS==""
    DRIVERS==""

And what udev rules are currently in effect for this? Note that this doesn’t require root, and nothing really happens to the system:

$ udevadm test -a add $(udevadm info -q path -n /dev/input/event22)calling: test
version 237
This program is for debugging only, it does not run any program
specified by a RUN key. It may show incorrect results, because
some values may be different, or not available at a simulation run.

Load module index
Parsed configuration file /etc/systemd/network/eth1.link
Skipping empty file: /etc/systemd/network/99-default.link
Created link configuration context.

[ ... reading a lot of files ... ]

rules contain 393216 bytes tokens (32768 * 12 bytes), 39371 bytes strings
25632 strings (220044 bytes), 22252 de-duplicated (184054 bytes), 3381 trie nodes used
GROUP 104 /lib/udev/rules.d/50-udev-default.rules:29
IMPORT builtin 'hwdb' /lib/udev/rules.d/60-evdev.rules:8
IMPORT builtin 'hwdb' returned non-zero
IMPORT builtin 'hwdb' /lib/udev/rules.d/60-evdev.rules:17
IMPORT builtin 'hwdb' returned non-zero
IMPORT builtin 'hwdb' /lib/udev/rules.d/60-evdev.rules:21
IMPORT builtin 'hwdb' returned non-zero
IMPORT builtin 'input_id' /lib/udev/rules.d/60-input-id.rules:5
capabilities/ev raw kernel attribute: 3
capabilities/abs raw kernel attribute: 0
capabilities/rel raw kernel attribute: 0
capabilities/key raw kernel attribute: 100000 0 0 0
properties raw kernel attribute: 0
test_key: checking bit block 0 for any keys; found=0
test_key: checking bit block 64 for any keys; found=0
test_key: checking bit block 128 for any keys; found=0
test_key: checking bit block 192 for any keys; found=1
IMPORT builtin 'hwdb' /lib/udev/rules.d/60-input-id.rules:6
IMPORT builtin 'hwdb' returned non-zero
IMPORT builtin 'usb_id' /lib/udev/rules.d/60-persistent-input.rules:11
/sys/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0: if_class 14 protocol 0
LINK 'input/by-id/usb-Generic_USB2.0_PC_CAMERA-event-if00' /lib/udev/rules.d/60-persistent-input.rules:32
IMPORT builtin 'path_id' /lib/udev/rules.d/60-persistent-input.rules:35
LINK 'input/by-path/pci-0000:00:14.0-usb-0:5.2.1:1.0-event' /lib/udev/rules.d/60-persistent-input.rules:40
PROGRAM 'libinput-device-group /sys/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121/event22' /lib/udev/rules.d/80-libinput-device-groups.rules:7
starting 'libinput-device-group /sys/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121/event22'
'libinput-device-group /sys/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121/event22'(out) '3/1908/2311:usb-0000:00:14.0-5.2'
Process 'libinput-device-group /sys/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121/event22' succeeded.
IMPORT builtin 'hwdb' /lib/udev/rules.d/90-libinput-model-quirks.rules:46
IMPORT builtin 'hwdb' returned non-zero
IMPORT builtin 'hwdb' /lib/udev/rules.d/90-libinput-model-quirks.rules:50
IMPORT builtin 'hwdb' returned non-zero
handling device node '/dev/input/event22', devnum=c13:86, mode=0660, uid=0, gid=104
preserve permissions /dev/input/event22, 020660, uid=0, gid=104
preserve already existing symlink '/dev/char/13:86' to '../input/event22'
found 'c13:86' claiming '/run/udev/links/\x2finput\x2fby-id\x2fusb-Generic_USB2.0_PC_CAMERA-event-if00'
found 'c13:85' claiming '/run/udev/links/\x2finput\x2fby-id\x2fusb-Generic_USB2.0_PC_CAMERA-event-if00'
found 'c13:84' claiming '/run/udev/links/\x2finput\x2fby-id\x2fusb-Generic_USB2.0_PC_CAMERA-event-if00'
found 'c13:83' claiming '/run/udev/links/\x2finput\x2fby-id\x2fusb-Generic_USB2.0_PC_CAMERA-event-if00'
creating link '/dev/input/by-id/usb-Generic_USB2.0_PC_CAMERA-event-if00' to '/dev/input/event22'
preserve already existing symlink '/dev/input/by-id/usb-Generic_USB2.0_PC_CAMERA-event-if00' to '../event22'
found 'c13:86' claiming '/run/udev/links/\x2finput\x2fby-path\x2fpci-0000:00:14.0-usb-0:5.2.1:1.0-event'
found 'c13:85' claiming '/run/udev/links/\x2finput\x2fby-path\x2fpci-0000:00:14.0-usb-0:5.2.1:1.0-event'
found 'c13:84' claiming '/run/udev/links/\x2finput\x2fby-path\x2fpci-0000:00:14.0-usb-0:5.2.1:1.0-event'
found 'c13:83' claiming '/run/udev/links/\x2finput\x2fby-path\x2fpci-0000:00:14.0-usb-0:5.2.1:1.0-event'
creating link '/dev/input/by-path/pci-0000:00:14.0-usb-0:5.2.1:1.0-event' to '/dev/input/event22'
preserve already existing symlink '/dev/input/by-path/pci-0000:00:14.0-usb-0:5.2.1:1.0-event' to '../event22'
ACTION=add
BACKSPACE=guess
DEVLINKS=/dev/input/by-path/pci-0000:00:14.0-usb-0:5.2.1:1.0-event /dev/input/by-id/usb-Generic_USB2.0_PC_CAMERA-event-if00
DEVNAME=/dev/input/event22
DEVPATH=/devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5.2/1-5.2.1/1-5.2.1:1.0/input/input121/event22
ID_BUS=usb
ID_INPUT=1
ID_INPUT_KEY=1
ID_MODEL=USB2.0_PC_CAMERA
ID_MODEL_ENC=USB2.0\x20PC\x20CAMERA
ID_MODEL_ID=2311
ID_PATH=pci-0000:00:14.0-usb-0:5.2.1:1.0
ID_PATH_TAG=pci-0000_00_14_0-usb-0_5_2_1_1_0
ID_REVISION=0100
ID_SERIAL=Generic_USB2.0_PC_CAMERA
ID_TYPE=video
ID_USB_DRIVER=uvcvideo
ID_USB_INTERFACES=:0e0100:0e0200:
ID_USB_INTERFACE_NUM=00
ID_VENDOR=Generic
ID_VENDOR_ENC=Generic
ID_VENDOR_ID=1908
LIBINPUT_DEVICE_GROUP=3/1908/2311:usb-0000:00:14.0-5.2
MAJOR=13
MINOR=86
SUBSYSTEM=input
TAGS=:power-switch:
USEC_INITIALIZED=5662716590816
XKBLAYOUT=us,il
XKBMODEL=pc105
XKBOPTIONS=grp:alt_shift_toggle,grp_led:scroll
XKBVARIANT=,
Unload module index
Unloaded link configuration context.

Other failed attempts

I tried the following:

# Rule for disabling bogus keyboard on webcam. It causes X-Windows to
# crash if it goes on and off too much

SUBSYSTEM=="input", ENV{ID_VENDOR_ID}=="1908", ENV{ID_MODEL_ID}=="2311", MODE:="000"
SUBSYSTEM=="input", ATTRS{name}=="USB2.0 PC CAMERA:*", ENV{LIBINPUT_IGNORE_DEVICE}="1"

(the := assignment makes this assignment final).

However none of these two rules managed to stop X from reacting.

Setting the mode to 000 made the device file inaccessible, but yet it was registered. As for the second rule, it doesn’t help, because it indeed set LIBINPUT_IGNORE_DEVICE correctly, but for the wrong udev event. That’s because the udev event that triggers libinput is based upon that the KERNEL attribute is event[0-9]*, which is executed earlier (see 80-libinput-device-groups.rules), but ATTRS{name} isn’t defined for that specific udev event (see output of udevadm info above).

I also tried RUN+=”/bin/rm /dev/input/event%n”, and that indeed removed the device node, but X still reacted, and complained with “libinput: USB2.0 PC CAMERA: USB2.0 PC CAM: Failed to create a device for /dev/input/event28″. Because it was indeed deleted.

But since it appears like X.org accesses keyboards through libinput, maybe use the example for ignoring a device, as given on this page, even though it’s quite similar to what I’ve already attempted?

So I saved this file as /etc/udev/rules.d/79-no-camera-keyboard.rules:

# Make libinput ignore webcam's button as a keyboard. As a result there's
# no event to X-Windows

ACTION=="add|change", KERNEL=="event[0-9]*", \
   ENV{ID_VENDOR_ID}=="1908", \
   ENV{ID_MODEL_ID}=="2311", \
   ENV{LIBINPUT_IGNORE_DEVICE}="1"

And then reload:

# udevadm control --reload

but that didn’t make any apparent difference (I verified that the rule was matched).

And that’s all, folks. Recall that I didn’t promise a happy end.

Perl: “$” doesn’t really mean end of string

Who ate my newline?

It’s 2023, Perl is ranked below COBOL, but I still consider it as my loyal working horse. But even the most loyal horse will give you a grand kick in the bottom every now and then.

So let’s jump to the problematic code:

#!/usr/bin/perl
use warnings;
use strict;

my $str = ".\n\n";

my $nonn = qr/[ \t]|(?<!\n)\n(?!\n)/;

my ($pre, $match, $post) = ($str =~ /^($nonn*)(.*?)($nonn*)$/s);

print "pre = \"$pre\"\n";
print "match = \"$match\"\n";
print "post = \"$post\"\n";

print "This doesn't add up!\n"
  unless ($str eq "$pre$match$post");

For now, never mind what I tried to do here. Let’s just note that $nonn doesn’t capture anything: Those two expressions with parentheses are a lookbehind and a lookahead, and hence don’t capture.

So now let’s look at

my ($pre, $match, $post) = ($str =~ /^($nonn*)(.*?)($nonn*)$/s);

This is an enclosure between ^ and $, and everything in the middle is captured into three matches. So no matter what, the concatenation of these three matches should equal $str, shouldn’t it? Let’s give it a test run:

$ ./try.pl
pre = ""
match = ".
"
post = ""
This doesn't add up!

So $pre and $post are empty. OK, fine. Hence $match should equal $str, which is “.\n\n”. But I see only one newline. Where’s the other one?

RTFM

The one thing that I really like about Perl, is that even when it plays a dirty trick, the answer is in the plain manual. As in “man perlre”, where it says, black on white in the description of $:

Match the end of the string (or before newline at the end of the string; or before any newline if /m is used)

So there we have it. “$” can also consider the character before the last newline as the end. Note that “$” itself will not match the last newline, so even if there’s a capture on the “$” itself, as in “($)”, that last newline is still not captured. It’s a Perl quirk. One of those things that make Perl do exactly what you really want, except for when you’re surgical about it.

I’ve been using Perl a lot for 20 years, and I wasn’t aware that “$” could match anything but the end of the string (let alone the “/m” modifier).

So that’s what happened above: $ considered the character before the last newline to be the end, and one newline went up in smoke.

Use \z instead

The second thing that I really like about Perl, is that even when it’s quirky, there’s always a simple solution. The same “man perlre” also says:

\z     Match only at end of string

Simple, isn’t it? From now on and until the end of time, always use \z if you really mean the end of string. Like, character-wise. And if I change “$” to “\z” in the code above, I get:

my ($pre, $match, $post) = ($str =~ /^($nonn*)(.*?)($nonn*)\z/s);

and the test run gives:

$ ./try.pl
pre = ""
match = ".

"
post = ""

The working horse is back on track again.

What I really wanted to do

Since I messed up with this regex, I should maybe explain what it does:

my $nonn = qr/[ \t]|(?<!\n)\n(?!\n)/;

First, let’s note that $nonn only matches one character (or none): It’s either a plain space, a tab or a newline. But what’s the mess with the newline?

The “(?<!\n)\n(?!\n)” part says this: Match a \n character that isn’t preceded by a \n, and isn’t followed by a \n. Or in other words, match a newline only if it isn’t part of a sequence of newlines. Only if it’s one, isolated \n.

No double \n. Or for short, “nonn”.

I needed this for a script that handles multiple newlines later on (in LaTeX, a double newline means a new paragraph, that’s the reason).

And it actually worked. The “\n\n” part in the string wasn’t matched into neither $pre nor $post. But the (.*?), which attempts to match as little as possible, sold off the last newline to $. Tricky stuff.

Writing about timing and timing constraints: Kind-of behind the scenes

Oh my goodness

I’ll say this from the start: This is me babbling. You have been warned. Don’t expect anything coherent in this post. Not that I necessarily keep things very tidy in my other posts, but this one is clearly me typing at full speed, not trying to be organized or anything.

I’ll buy you a pizza if you figure out the real motivation behind this post. Well, actually I won’t. At most, I’ll answer with an emoji with a pizza. But you’re welcome to try your luck anyhow. If you bother to read this, that is. Which I’m not sure is the best way to spend your time.

So these are my thoughts after publishing a series of pages about FPGA timing and timing constraints (which begins with this page). It’s not only a relatively long series of pages, but it also involves the translation of the pages into Chinese, Japanese and Korean. I know none of these languages, so I used Google Translate. I’ve already written a post about the challenges and solutions with automatic translation of technical documents.

The translations are a huge experiment. It’s not clear at all if people who speak these languages will find them appealing. There’s clearly a readability problem with an automatic translation, no matter how hard I work to make the text digestible for an AI machine.

People who speak English and one of these languages have expressed little enthusiasm when looking at the results (to say the least). But then it boils down to the choices of people who only have that East Asian language. They will surely have a laugh or curse when reading encountering the hick-ups of automatic translations, but do they have a better alternative? Time will tell. Or more precisely, the logs will tell.

I’ve never written a behind-the-scenes post about other things that I’ve written. Maybe I do this because it’s the first time in ages that I can write English exactly the way I think, and not keeping restricting myself to write in a way that works well with the translation machinery. One of the main limitations with automatic translation to CJK languages is that the sentences must be short and concise. The ordering of words is very different in those languages, so if I use a word like “it” or “which” that refers back to something mentioned earlier in the sentence, the translation machinery is forced to decide what it referred to, and name it explicitly in the translation. And quite often, that goes wrong. When the translation is to another European language, or even Hebrew, it’s possible to retain the ambiguity in the translation, and leave it to the intelligence of the reader.

So if this post feels like a huge ramble, yes, that’s exactly what it is. Me feeling free at the keyboard. The liberation from thinking how each and every word may be mistranslated. I mean, even a work like “generation” in the sense of creating something can be translated into like generation gap. Or if I say “that’s a special case”, will that become something about a bag? I’m really tired of translating and back-translating and cutting down Chinese into single characters in order to ensure that when I write “segment” I don’t get something related to marketing.

Let’s talk about timing

I’ve been thinking about writing a guide to FPGA timing and timing constraints for quite a while. It’s one of the most important topics, if not the most important topic in the field, and somehow it seems like the explanations about this field are either very theoretic, or they’re so specific that it’s difficult to understand the whole point. It was clear to me that I needed to begin with the basics of the theory (and translate it, of course). I was however quite surprised myself how little theory there was to convey. The entire thing is about setup time, hold time and clock-to-output. That’s it. That’s the whole story.

It was quite clear where to take it from there: To discuss the most commonly used timing constraint, that is to say the period constraint. And make that international, once again.

But how do you talk about this simple constraint? I mean, the normal explanation is just “this is what you should write”. How do you explain what it really means?

So the approach that I went for throughout the series of pages was to explain it through the timing reports. It’s a bit of a tedious way, but the idea was that the correct way to understand timing constraints, and in fact one of the most essential capabilities for working with timing, is to read the timing reports. Hence I went for explaining the timing report.

Another alternative could have been to jump into explaining the meaning of the timing constraint as a clock object. This is a more technical approach, and it’s necessary to go through it at some stage. But I deferred that to later.

To PLL or not to PLL

And by the way, that’s a title that I really wanted somewhere in those pages, but it was clearly going to be lost in translation. So all you guys from East Asia out there, if you won’t read the translated pages, just know I sacrificed proper English in favor of you getting something decent to read. Kindly appreciate it.

Anyhow, it turned out that I couldn’t finish the entire topic of the period constraint in one page, so there’s a second page on the same topic, this time for the sake of explaining how a PLL changes the picture. And that brings up the whole story of clock domain crossing between related clocks. Plus I also filled in the part about hold time.

Hold time is a neglected topic regarding timing constraints. Probably because the timing constraints rarely fail on that, and still: The tools deliberately add delays to the paths in order to meet hold timing, and by doing so, they put an obstacle before themselves in the quest to attain the setup timing. It’s quite amazing to see how a simple routed net suddenly gets a delay of several ns, just because the hold timing required so. In particular when crossing clock domains between related, but unaligned clocks.

And that’s another expression I couldn’t use on those pages: Crossing clock domains. It always says “clock domain crossing”, and it’s protected against translation. Otherwise, I get crucifixion in the translation. And the protection against translation works only on nouns, not verbs. Did I just divert from the topic? Maybe I need a closure?

Timing closure

When is the right time to talk about it? Timing closure is often the reason we even start think about timing (ghosts in the FPGA is another common reason). So it’s definitely an important topic, at least to the impatient reader. But shouldn’t I put it in the end? After all, it’s the ultimate surgical operation that is done once you understand it all.

So no. I went with the people on this one. It’s the ultimate topic, yes, but I also need to realize that some people will read one page or two and then give up. So better get to the important thing ASAP. Actually, I also gave away a checklist kind of page (with trans-la-tions). Maybe that’s the biggest contradiction I made in those pages. I said all the time that timing is like a divine science, and don’t be lazy about it. Be deep and profound, learn it as an art, and then I come up with a checklist. Not only that, I repeat that crime at the end of the page series. And I do that in three additional languages, too.

To my defense (or is it?), the repeated crime had a practical and silly reason. I had that page already written. More or less. So it was a bit of low-hanging fruit. And my rule (in this blog and otherwise) has always been not to throw away food. If you have some random pieces of cheese in your fridge, make a quiche. If there is mold on the cheese, you just make a different kind of quiche.

So that last page in the series is not 100% coherent with those preceding it, but it kinda summarizes the point. Besides, I already mentioned that the whole series of pages is one big experiment…?

Tachles

That’s a word in Jiddish, literally taken from the Hebrew word תכלית. Actually, it’s the pronunciation of that word in the mouth of European Jews. The meaning of the word is “purpose”, the reason for doing something, but more in the direction of the intention, what result you want to achieve etc.

So after talking about timing closure for two pages, I finally got to the pudding: Actually talk about how to write timing constraint. But not so fast! There’s still one tedious phase to go through, namely learn how to select the logic that we want to be specific about.

This whole thing with Tcl commands, which are used in SDC format (that is, by Vivado, Quartus and several other tools) gives a certain level of compatibility. But as it turns out, there’s Synopsys’ set of capabilities, which seems to have been adopted in full by Vivado, and then there’s Quartus’ take on the same topic, which is same-same but completely different. So the simple commands work the same, but when it comes to more complicated stuff, Quartus diverges, and not for the better. In particular, Quartus lacks the “-filter” option that allows more complicated expressions for defining whether to include a logic element (actually, the object representing it) in a timing constraint. So one is limited to playing with wildcards, and that, well, sucks.

So in short, this was the point I had to admit that Vivado is superior, with a margin. I also began to suspect that Vivado is Synopsys in disguise, but that’s just a wild speculation.

It’s only after that, that I could begin to really talk about commands like set_max_delay, set_min_delay and set_false_path. That’s what I call tachles. I finally got to the point, after rambling tons.

I think that the most important point about set_false_path is that it has maximal priority. One would usually consider priority to be an advanced topic.

Ah, and here comes another messy, out-of-nowhere comment about this series of pages. The idea wasn’t to write a full and comprehensive guide that includes everything that there is possible to know. That’s what the official docs are for. The idea was to cover the things that I personally use and feel that I must be aware of, and hence the convey the minimal toolbox that one needs to carry.

Return from subroutine: I didn’t want to cover an advanced topic like priorities between timing constraints, because who really needs that? It’s quite clear that those specific timing constraints are stronger than the period constraint, and except for that, nobody should mess up with more complicated combinations of contradicting constraints.

But set_false_path is scary: It wins everything else, and it disables the timing check of the relevant paths silently. It’s literally a killer. So make a small mistake with that constraint, and your design is cooked. So I spent quite a few words on warnings about that.

Eight o’clock rock

When I first started working with FPGAs, my mentors were from the ASIC industry. For some reason, which I’m still to find out, ASIC guys like to everything with one clock, if possible. In the FPGA world, it’s more a like clock being given away like they were air. So the whole issue of timing constraints needs to be addressed in the context of crossing clock domains.

And here’s another word that I avoided all the time: Addressed. Because who knows what that will be translated into, when it comes to Chinese, Japanese and Korean.

The alternative to a lot of clocks, is multi-cycle paths, of course: This is what the ASIC guys do in order to stick to one clock for the entire design. And in order to take advantage of the relatively relaxed timing of logic that depends on a clock enable, there are timing constraints. Which is a risky business. I won’t repeat the explanation here, but let’s say that it’s generally not recommended.

Who you wanna talk to?

One of the things that I actually planned and carried out as planned (well, not really as planned) was to put the discussion about I/O constraints last. This went against the principle of important things first, but since I/O constraints are technically just a private case of constraints on internal paths (that is, paths that start and end inside the FPGA), it made a lot of sense to put this part last.

Now read this last sentence again, and realize that there’s nothing even close to that length in the pages I wrote on timing constraint. No chance a sentence like that would have been translated correctly.

So where were I? Oh yes, timing constraint for I/O. But soon enough I realized that I can’t talk about timing constraints for I/O before discussing system synchronous vs. source synchronous vs. asynchronous I/O. But it’s not that simple. One can’t just throw these terms in the air and hope for good. So all of the sudden I found myself with a small בלת”ם (Read “baltam”, meaning “unplanned” in Hebrew), which wasn’t so small.

That’s because source-synchronous I/O means both inputs and outputs, and these are implemented in different ways. So there was no choice (or was there?) to sit down and explain how to do that. How can you discuss the possibilities for timing constraints of a source-synchronous input without explaining how the I/O is implemented. There are quite a few possibilities, after all.

Source-synchronous outputs are actually a much simpler story, because it just means to generate all signals with registers.

On top of all this mess, there’s another topic which is important both for the timing of I/O and for ensuring that the design behaves in a consistent way: IOB registers.

But oops. IOB is a Xilinx word. Oddly enough, it doesn’t seem like there’s a standard term for the idea of having a flip-flop close to the I/O pin. It’s such a common thing, and yet there’s no common term. I think. So I used the term I’m used to, IOB. It’s quite apparent that I’m a Xilinx guy. I’ve worked a bit with Altera (should I call them AMD and Intel FPGA by now? It takes time to get used to that). I nevertheless tried to be vendor-neutral in those pages. They were supposed to be about all FPGAs, not only the two that were leading back in 202x, just before being acquired by two giants, leaving an unknown future ahead.

But that was quite impossible, because timing constraints are usually written in SDC, and SDC was invented by Synopsys, and Vivado are the closest that I know of to the full and accurate interpretation of the constraints, as well as the Tcl environment. That was yet another long sentence that I would never write for translation. But I actually expect that all tools will end up supporting Synopsys’ full vocabulary, so maybe a few years from now, these posts will really be vendor-neutral, and the only thing people won’t understand is what Vivado and Xilinx is, because some other FPGA vendor that we’ve never heard of has taken control of the market, while “AMD FPGA” put all they had on datacenter acceleration and ended up closing the FPGA department because, well, because.

Did I diverge again? Definitely. All I wanted to say is that I also wrote a page about IOB registers, and needless to say, that page is available in three additional languages, just like the rest.

Phew

I really wonder if anyone will read through this post from top to bottom. If you haven’t, and just scrolled down to the bottom to see if this goes anywhere, I can’t blame you. Truth to be said, this post went nowhere, and neither was it intended to do so.

Any guesses about the reason I wrote this anyhow. I still promise to send a pizza emoji in response to correct answers. Even though all answers are incorrect by the nature of this foolishness.

A few posts on FPGA on my other site

I’ve been a bit silent on this blog for a while, but that’s only because I’ve been busy writing on my spin-off site lately.

So here are a few posts over there which are pretty much related to what I do on this blog. These pages are also translated to Chinese, Japanese and Korean, so this is what those links in parentheses are about.

About FIFOs:

About clock domains:

Using git send-email with Gmail + OAUTH2, but without subscribing to cloud services

Introduction

There is a widespread belief, that in order to use git send-email with Gmail, there’s a need to subscribe to Google Cloud services and obtain some credentials. Or that a two-factor authentication (2fa) is required.

This is not the case, however. If Thunderbird can manage to fetch and send emails through Google’s mail servers (as well as other OAUTH2 authenticated mail services), there’s no reason why a utility won’t be able to do the same.

The subscription to Google’s services is indeed required if the communication with Google’s server must be done without human supervision. That’s the whole point with API keys. If a human is around when the mail is dispatched, there’s no need for any special measures. And it’s quite obvious that there’s a responsive human around when a patch is being submitted.

What is actually needed, is a client ID and a client secret, and these are indeed obtained by registering to Google’s cloud service (this explains how). But here’s the thing: Someone at Mozilla has already obtained these, and hardcoded them into Thunderbird itself. So there’s no problem using these to access Gmail with another mail client. It seems like many believe that the client ID and secret must be related to the mail account to access, and therefore each and every one has to obtain their own pair. That’s a mistake that has made a lot of people angry for nothing.

This post describes how to use git send-email without any further involvement with Google, except for having a Gmail account. The same method surely applies for other mail service providers that rely on OAUTH2, but I haven’t gotten into that. It should be quite easy to apply the same idea to other services as well however.

For this to work, Thunderbird must be configured to access the same email account. This doesn’t mean that you actually have to use Thunderbird for your mail exchange. It’s actually enough to configure the Gmail server as an outgoing mail server for the relevant account. In other words, you don’t even need to fetch mails from the server with Thunderbird.

The point is to make Thunderbird set up the OAUTH2 session, and then fetch the relevant piece of credentials from it. And take it from there with Google’s servers. Thunderbird is a good candidate for taking care of the session’s setup, because the whole idea with OAUTH2 is that the user / password session (plus possible additional authentication challenges) is done with a browser. Since Thunderbird is Firefox in disguise, it integrates the browser session well into its general flow.

If you want to use another piece of software to maintain the OAUTH2 session, that’s most likely possible, given that you can get its refresh token. This will also require obtaining its client ID and client secret. Odds are that it can be found somewhere in that software’s sources, exactly as I found it for Thunderbird. Or look at the https connection it runs to get an access token (which isn’t all that easy, encryption and that).

Outline of solution

All below relates to Linux Mint 19, Thunderbird 91.10.0, git version 2.17.1, Perl 5.26 and msmtp 1.8.14. But except for Thunderbird and msmtp, I don’t think the versions are going to matter.

It’s highly recommended to read through my blog post on OAUTH2, in particular the section called “The authentication handshake in a nutshell”. You’re going to need to know the difference between an access token and a refresh token sooner or later.

So the first obstacle is the fact that git send-email relies on the system’s sendmail to send out the emails. That utility doesn’t support OAUTH2 at the time of writing this. So instead, I used msmtp, which is a drop-in replacement for sendmail, plus it supports OAUTH2 (since version 1.8.13).

msmtp identifies itself to the server by sending it an access token in the SMTP session (see a dump of a sample session below). This access token is short-lived (3600 seconds from Google as of writing this), so it can’t be fetched from Thunderbird just like that. In particular because most of the time Thunderbird doesn’t have it.

What Thunderbird does have is a refresh token. It’s a completely automatic task to ask Google’s server for the access token with the refresh token at hand. It’s also an easy task (once you’ve figured out how to do it, that is). It’s also easy to get the refresh token from Thunderbird, exactly in the same way as getting a saved password. In fact, Thunderbird treats the refresh token as a password.

msmtp allows executing an arbitrary program in order to get the password or the access token. So I wrote a Perl script (oauth2-helper.pl) that reads the refresh token from a file and gets an access token from Google’s server. This is how msmtp manages to authenticate itself.

So everything relies on this refresh token. In principle, it can change every time it’s used. In practice, as of today, Google’s servers don’t change it. It seems like the refresh token is automatically replaced every six months, but even if that’s true today, it may change.

But that doesn’t matter so much. All that is necessary is that the refresh token is correct once. If the refresh token goes out of sync with Google’s server, a simple user / password session rectifies this. And as of now, than virtually never happens.

So let’s get to the hands-on part.

Install msmtp

Odds are that your distribution offers msmtp, so it can be installed with something like

# apt install msmtp

Note however that the version needs to be at least 1.8.13, which wasn’t my case (Linux Mint 19). So I installed it from the sources. To do that, first install the TLS library, if it’s not installed already (as root):

# apt install gnutls-dev

Then clone the git repository, compile and install:

$ GIT_SSL_NO_VERIFY=true git clone http://git.marlam.de/git/msmtp.git
$ cd msmtp
$ git checkout msmtp-1.8.14
$ autoreconf -i
$ ./configure
$ make && echo Success
$ sudo make install

The installation goes to /usr/local/bin and other /usr/local/ paths, as one would expect.

I checked out version 1.8.14 because later versions failed to compile on my Linux Mint 19. OAUTH2 support was added in 1.8.13, and judging by the commit messages it hasn’t been changed since, except for commit 1f3f4bfd098, which is “Send XOAUTH2 in two lines, required by Microsoft servers”. Possibly cherry-pick this commit (I didn’t).

Once everything has been set up as described below, it’s possible to send an email with

$ msmtp -v -t < ~/email.eml

The -v flag is used only for debugging, and it prints out the entire SMTP session.

The -t flag tells msmtp to fetch the recipients from the mail’s own headers. Otherwise, the recipients need to be listed in the command line, just like sendmail. Without this flag or recipients, msmtp just replies with

msmtp: no recipients found

The -t flag isn’t necessary with git send-email, because it explicitly lists the recipients in the command line.

The oauth2-helper.pl script

As mentioned above, Thunderbird has the refresh token, but msmtp needs an access token. So the script that talks with Google’s server and grabs the access token can be downloaded from its Github repo. Save it, with execution permission to /usr/local/bin/oauth2-helper.pl (or whatever, but this is what I assume in the configurations below).

Some Perl libraries may be required to run this script. On a Debian-based system, the packages’ names are  probably something like libhttp-message-perl, libwww-perl and libjson-perl.

It’s written to access Google’s token server, but can be modified easily to access a different service provider by changing the parameters at its beginning. For other email providers, check if it happens to be listed in OAuth2Providers.jsm. I don’t know how well it will work with those other providers, though.

The script reads the refresh token from ~/.oauth2_reftoken as a plain file containing the blob only. There’s an inherent security risk of having this token stored like this, but it’s basically the same risk as the fact that it can be obtained from Thunderbird’s credential files. The difference is the amount of security by obscurity. Anyhow, the reference token isn’t your password, and it can’t be derived from it. Either way, make sure that this file has a 0600 or 0400 permission, if you’re running on a multi-user computer.

The script caches the access token in ~/.oauth2_acctoken, with an expiration timestamp. As of today, it means that the script talks with the Google’s server once in 60 minutes at most.

Setting up config files

So with msmtp installed and the script downloaded into /usr/local/bin/oauth2-helper.pl, all that is left is configuration files.

First, create ~/.msmtprc as follows (put your Gmail username instead of mail.username, of course):

account default
host smtp.gmail.com
port 587
tls on
tls_starttls on
auth xoauth2
user mail.username
passwordeval /usr/local/bin/oauth2-helper.pl
from mail.username@gmail.com

And then change the [sendemail] section in ~/.gitconfig to

[sendemail]
        smtpServer = /usr/local/bin/msmtp

That’s it. Only that single line. It’s however possible to use smtpServerOption in the .gitconfig to add various flags. So for example, to get the entire SMTP session shown while sending the email, it should say:

[sendemail]
        smtpServer = /usr/local/bin/msmtp
        smtpServerOption = -v

But really, don’t, unless there’s a problem sending mails.

Other than that, don’t keep old settings. For example, there should not be a “from=” entry in .gitconfig. Having such causes a “From:” header to be added into the mail body (so it’s visible to the reader of the mail). This header is created when there is a difference between the “From” that is generated by git send-email (which is taken from the “from=” entry) and the patch’ author, as it appears in the patch’ “From” header. The purpose of this in-body header is to tell “git am” who the real author is (i.e. not the sender of the patch). So this extra header won’t appear in the commit, but it nevertheless makes the sender of the message look somewhat clueless.

So in short, no old junk.

Sending a patch

Unless it’s the first time, I suggest just trying to send the patch to your own email address, and see if it works. There’s a good chance that the refresh token from the previous time will still be good, so it will just work, and no point hassling more.

Actually, it’s fine to try like this even on the first time, because the Perl script will fail to grab the access token and then tell you what to do to fix it, namely:

  • Make sure that Thunderbird has access to the mail account itself, possibly by attempting to send an email through Gmail’s server.
  • Go to Thunderbird’s Preferences > Privacy & Security and click on Saved Passwords. Look for the account, where the Provider start with oauth://. Right-click that line and choose “Copy Password”.
  • Create or open ~/.oauth2_reftoken, and paste the blob into that file, so it contains only that string. No need to be uptight with newlines and whitespaces: They are ignored.

And then go, as usual:

$ git send-email --to 'my@test.mail' 0001-my.patch

I’ve added the output of a successful session (with the -v flag) below.

Room for improvements

It would have been nicer to fetch the refresh token automatically from Thunderbird’s credentials store (that is from logins.json, based upon the decryption key that is kept in key4.db), but the available scripts for that are written in Python. And to me Python is equal to “will cause trouble sooner or later”. Anyhow, this tutorial describes the mechanism (in the part about Firefox).

Besides, it could have been even nicer if the script was completely standalone, and didn’t depend on Thunderbird at all. That requires doing the whole dance with the browser, something I have no motivation to get into.

A successful session

This is what it looks like when a patch is properly sent, with the smtpServerOption = -v line in .gitignore (so msmtp produces verbose output):

Send this email? ([y]es|[n]o|[q]uit|[a]ll): y
ignoring system configuration file /usr/local/etc/msmtprc: No such file or directory
loaded user configuration file /home/eli/.msmtprc
falling back to default account
Fetching access token based upon refresh token in /home/eli/.oauth2_reftoken...
using account default from /home/eli/.msmtprc
host = smtp.gmail.com
port = 587
source ip = (not set)
proxy host = (not set)
proxy port = 0
socket = (not set)
timeout = off
protocol = smtp
domain = localhost
auth = XOAUTH2
user = mail.username
password = *
passwordeval = /usr/local/bin/oauth2-helper.pl
ntlmdomain = (not set)
tls = on
tls_starttls = on
tls_trust_file = system
tls_crl_file = (not set)
tls_fingerprint = (not set)
tls_key_file = (not set)
tls_cert_file = (not set)
tls_certcheck = on
tls_min_dh_prime_bits = (not set)
tls_priorities = (not set)
tls_host_override = (not set)
auto_from = off
maildomain = (not set)
from = mail.username@gmail.com
set_from_header = auto
set_date_header = auto
remove_bcc_headers = on
undisclosed_recipients = off
dsn_notify = (not set)
dsn_return = (not set)
logfile = (not set)
logfile_time_format = (not set)
syslog = (not set)
aliases = (not set)
reading recipients from the command line
<-- 220 smtp.gmail.com ESMTP m8-20020a7bcb88000000b003c6d21a19a0sm3316430wmi.29 - gsmtp
--> EHLO localhost
<-- 250-smtp.gmail.com at your service, [109.186.183.118]
<-- 250-SIZE 35882577
<-- 250-8BITMIME
<-- 250-STARTTLS
<-- 250-ENHANCEDSTATUSCODES
<-- 250-PIPELINING
<-- 250-CHUNKING
<-- 250 SMTPUTF8
--> STARTTLS
<-- 220 2.0.0 Ready to start TLS
TLS session parameters:
    (TLS1.2)-(ECDHE-ECDSA-SECP256R1)-(CHACHA20-POLY1305)
TLS certificate information:
    Subject:
        CN=smtp.gmail.com
    Issuer:
        C=US,O=Google Trust Services LLC,CN=GTS CA 1C3
    Validity:
        Activation time: Mon 26 Sep 2022 11:22:04 AM IDT
        Expiration time: Mon 19 Dec 2022 10:22:03 AM IST
    Fingerprints:
        SHA256: 53:F3:CA:1D:37:F2:1F:ED:2C:67:40:A2:A2:29:C2:C8:E8:AF:9E:60:7A:01:92:EC:F0:2A:11:E8:37:A5:88:F3
        SHA1 (deprecated): D4:69:6E:59:2D:75:43:59:02:74:25:67:E7:57:40:E0:28:43:A8:62
--> EHLO localhost
<-- 250-smtp.gmail.com at your service, [109.186.183.118]
<-- 250-SIZE 35882577
<-- 250-8BITMIME
<-- 250-AUTH LOGIN PLAIN XOAUTH2 PLAIN-CLIENTTOKEN OAUTHBEARER XOAUTH
<-- 250-ENHANCEDSTATUSCODES
<-- 250-PIPELINING
<-- 250-CHUNKING
<-- 250 SMTPUTF8
--> AUTH XOAUTH2 dXNlcj1lbGkuYmlsbGF1ZXIBYXV0aD1CZWFyZXIgeWEyOS5hMEFhNHhyWE1GM1gtOTJMVWNidjE4MFdVOBROENRcUdSbk5KaUFSY0VSckVaXzdzbDlHMTNpdFIyUTk0NjlKWG45aHVGLQVRBU0FSTVXJpSjRqMjBLcWh6WU9GekxlcU5BYVpFNUU4WXRhNjdLUXpCRm1HRDg3dFgzeHJ4amNPTnRVTkZFVWdESXhsUlcxOFhVT0pqQ1hPSlFwZlNGUUVqRHZMOWw4RExkTjlKZlNbGRTazNNbFNMNjVfQWFDZ1lLVVF2Y0luOWNSSUEwMTY2AQE=
<-- 235 2.7.0 Accepted
--> MAIL FROM:<mail.username@gmail.com>
--> RCPT TO:<test@mail.com>
--> RCPT TO:<mail.username@gmail.com>
--> DATA
<-- 250 2.1.0 OK m8-20020a7bcb88000000b003c6d21a19a0sm3316430wmi.29 - gsmtp
<-- 250 2.1.5 OK m8-20020a7bcb88000000b003c6d21a19a0sm3316430wmi.29 - gsmtp
<-- 250 2.1.5 OK m8-20020a7bcb88000000b003c6d21a19a0sm3316430wmi.29 - gsmtp
<-- 354  Go ahead m8-20020a7bcb88000000b003c6d21a19a0sm3316430wmi.29 - gsmtp
--> From: Eli Billauer <mail.username@gmail.com>
--> To: test@mail.com
--> Cc: Eli Billauer <mail.username@gmail.com>
--> Subject: [PATCH v8] Gosh! Why don't you apply this patch already!
--> Date: Sun, 30 Oct 2022 07:01:14 +0200
--> Message-Id: <20221030050114.49299-1-mail.username@gmail.com>
--> X-Mailer: git-send-email 2.17.1
--> 

[ ... email body comes here ... ]

--> --
--> 2.17.1
-->
--> .
<-- 250 2.0.0 OK  1667106108 m8-20020a7bcb88000000b003c6d21a19a0sm3316430wmi.29 - gsmtp
--> QUIT
<-- 221 2.0.0 closing connection m8-20020a7bcb88000000b003c6d21a19a0sm3316430wmi.29 - gsmtp
OK. Log says:
Sendmail: /usr/local/bin/msmtp -v -i test@mail.com mail.username@gmail.com
From: Eli Billauer <mail.username@gmail.com>
To: test@mail.com
Cc: Eli Billauer <mail.username@gmail.com>
Subject: [PATCH v8] Gosh! Why don't you apply this patch already!
Date: Sun, 30 Oct 2022 07:01:14 +0200
Message-Id: <20221030050114.49299-1-mail.username@gmail.com>
X-Mailer: git-send-email 2.17.1

Result: OK

Ah, and the fact that the access token can be copied from here is of course meaningless, as it has expired long ago.

Thunderbird debug notes

These are some random notes I made while digging in Thunderbird’s guts to find out what’s going on.

So this is Thunderbird’s official git repo. Not that I used it.

To get logging info from Thunderbird: Based upon this page, go to Thunderbird’s preferences > General and click the Config Editor button. Set mailnews.oauth.loglevel to All (was Warn). Same with mailnews.smtp.loglevel. Then open the Error Console with Ctrl+Shift+J.

The cute thing about these logs is that the access code is written in the log. So it’s possible to skip the Perl script, and use the access code from Thunderbird’s log. Really inconvenient, but possible.

The OAuth2 token requests is implemented in Oauth2.jsm. It’s possible to make a breakpoint in this module by through Tools > Developer Tools > Developer Toolbox, and once it opens (after requesting permission for external connection), go to the debugger.

Find Oauth2.jsm in the sources pane to the left (of the Debugger tab), under resource:// modules > sessionstore. Add a breakpoint in requestAccessToken() so that the clientID and consumerSecret properties can be revealed.

Sending a patch from Thunderbird directly

This is a really bad idea. But if you have Thunderbird, and need to send a patch right now, this is a quick, dirty and somewhat dangerous procedure for doing that.

Why is it dangerous? Because at some point, it’s easy to pick “Send now” instead of “Send later”, and boom, a junk patch is mailed to the whole world.

The problem with Thunderbird is that it makes small changes into the patch’ body. So to work around this, there’s a really silly procedure. I used it once, and I’m not proud of that.

So here we go.

First, a very simple script that outputs the patch mail into a file. Say that I called it dumpit (should be executable, of course):

#!/bin/bash

cat > /home/eli/Desktop/git-send-email.eml

Then change ~/.gitconfig, so it reads something like this in the [sendemail] section:

[sendemail]
        from = mail.username@gmail.com
        smtpServer = /home/eli/Desktop/dumpit

So basically it uses the silly script as a mail server, and the content goes out to a plain file.

Then run git send-email as usual. The result is a git-send-email.eml as a file.

And now comes the part of making Thunderbird send it.

  • Close Thunderbird. All windows.
  • Change directory to where Thunderbird keeps its profile files, to under Mail/Local Folders
  • Remove “Unsent Messages” and “Unsent Messages.msf”
  • Open Thunderbird again
  • Inside Thunderbird, go to Hamburger Icon > File > Open > Saved Message… and select git-send-email.eml. The email message should appear.
  • Right-Click somewhere in the message’s body, and pick Edit as New Message…
  • Don’t send this message as is! It’s completely messed up. In particular, there are some indentations in the patch itself, which renders it useless.
  • Instead, pick File > Send Later.
  • Once again, close Thunderbird. All windows.
  • Remove “Unsent Messages.msf” (only)
  • Edit “Unsent Messages” as follows: Everything under the “Content-Transfer-Encoding: 7bit” part is the mail’s body. So remove the “From:” line after it, and paste the email’s body from git-send-email.eml instead.
  • Note that there are normally two blank lines after the mail’s body. Retain them.
  • Open Thunderbird again. Verify that those indentations are away.
  • Look at the mail inside Outbox, and verify that it’s OK now. These are the three things to look for in particular:
    • The “From:” part at the beginning of the message is gone.
    • At the end of the message, there’s a “–” and git’s version number. These should be in separate lines.
    • Look at the mail’s source. The “+” and “-” signs of the diffs must not be indented.
  • If all is fine, right-click Outbox, and pick “Send unsent messages”. And hope for good.

Are you sure you want to do this?