Linux: Permanent graphics mode (resolution) on Cinnamon

The goal

Quite simple: Set a fixed graphics mode on the computer screen.

More precisely, make Cinnamon (version 3.2.6) on Linux Mint 18.1 (Serena) show the desktop with a predefined resolution, no matter what happens. Spoiler: I failed. But I got close enough for practical purposes, and collected a lot of knowledge while trying. So here it is.

The reason I need this: On the machine mentioned here, I have two screens connected through an HDMI splitter, so the monitor identification is somewhat shaky, and it’s not clear which piece of info the computer gets each time. To make it even trickier, the graphics mode I need is only listed in the EDID information submitted by one of the monitors. In other words: More often than not, the computer doesn’t know it’s allowed to use the mode I want it to use.

This situation meets the somewhat arrogant “I know what’s best, I never fail” attitude often seen in graphics software. There is more than one automatic mechanism for changing the resolution to “what is correct”, so just changing the resolution with xrandr doesn’t cut it. The underlying mechanisms seem to change frequently from one version to another, and having them documented is probably too much to ask for. It seems like there are some race conditions between the different utilities that have a say on this matter, which is possibly the reason for the problem I tried to solve in this post.

For clarity: EDID is a chunk of data that is typically stored on a small flash memory on the monitor. This data is fetched through I2C wires that are part of a DVI / HDMI / VGA connector when the monitor is plugged in. This is how the computer knows not only the commercial name of the monitor, but also which graphics modes it supports and prefers.

How Cinnamon selects the resolution to use

So — the first question is: How does (my very specific) Cinnamon determine which screen resolution is the “correct” one?

This is a journey into the realm of mystery and uncertainty, but it seems like the rationale is to remember previously connected monitors, along with a separate user-selected graphics mode for each.

So the steps are probably something like:

  • Grab the list of allowed resolution modes, as presented by xrandr, for the relevant monitor (through libxrandr?). This is typically the set of modes listed in the monitor’s EDID information, but it’s possible to add modes as well (see below).
  • If there’s a user logged in, look up .config/monitors.xml in that user’s home directory. If there’s an entry matching the monitor’s product identification, apply the resolution selected there. This file is changed by Cinnamon’s Display settings utility (among others, I guess), and represents the user’s preferences.
  • There’s possibly also a global default monitors.xml at /etc/gnome-settings-daemon/xrandr/. I don’t have such a file, and it’s not clear whether it would be in effect had it existed. I haven’t tried this one.
  • If there’s no matching (or adequate?) mode setting in monitors.xml (or no user is logged in), choose the preferred mode, as pointed at by xrandr.

One way or another, monitors.xml only lists width, height and rate for each graphics mode, without the timing details that are required to actually run it. So if the resolution requested in monitors.xml isn’t listed by xrandr, there is no way to request it, as crucial information is missing. This isn’t supposed to ever happen, since the utility that sets the user’s preferences isn’t supposed to select a mode that the monitor doesn’t support. But if it does, the logical thing would be to ignore the resolution in monitors.xml, and go on with the monitor’s preferred mode. In reality, it appears that this causes the blank screen that I’ve mentioned in this post.

The automatic setting of the resolution seems to take place when some kind of X session starts (at the login screen and after the user logs in), as well as when a new monitor is hotplugged. Setting a monitor’s mode with xrandr sometimes seems to trigger an automatic setting as well. Having tried to set the resolution with xrandr a few times, I’ve seen it sometimes revert to the automatic setting, and sometimes stay with the one I set. Go figure.

How I got it done

Since there are all kinds of ghosts in the system that insist on “fixing” the display resolution, I might as well play along. So the trick is as follows:

  • Edit ~/.config/monitors.xml (manually), setting the resolution for all monitors listed to the one I want.
  • Make sure that the desired graphics mode, along with its timing parameters, is listed by xrandr, even if the monitor didn’t mention it in its EDID info.

The first step is relatively easy. The entries in the XML file look like this:
      <output name="HDMI3">
          <vendor>SNY</vendor>
          <product>0x0801</product>
          <serial>0x01010101</serial>
          <width>1360</width>
          <height>768</height>
          <rate>60</rate>
          <x>0</x>
          <y>0</y>
          <rotation>normal</rotation>
          <reflect_x>no</reflect_x>
          <reflect_y>no</reflect_y>
          <primary>yes</primary>
      </output>

This is after editing the file. I needed 1360 x 768 @60 Hz, as shown above. So just set the width, height and rate tags in the XML file for all entries. No matter what monitor the system thinks it sees, the “user preference” is the same.
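If there are many entries, a quick-and-dirty sed run can do the editing. This is only a sketch: it blindly assumes the simple tag layout shown above (no attributes, no extra spaces inside the tags), so back the file up first and check the result.

$ cp ~/.config/monitors.xml ~/.config/monitors.xml.bak
$ sed -i -e 's|<width>[0-9]*</width>|<width>1360</width>|g' \
         -e 's|<height>[0-9]*</height>|<height>768</height>|g' \
         -e 's|<rate>[0-9.]*</rate>|<rate>60</rate>|g' ~/.config/monitors.xml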

Now making sure that the mode exists: Add something like the following as /etc/X11/Xsession.d/10add-xrandr-mode (owned by root, not executable, no shebang):

xrandr -d :0 --newmode "hdmi_splitter" 85.5 1360 1424 1536 1792 768 771 777 795 +hsync +vsync
xrandr -d :0 --addmode HDMI3 hdmi_splitter
xrandr -d :0 --output HDMI3 --mode hdmi_splitter

Needless to say (?), this relates to the specific graphics mode.

This file is executed every time X is started (which is also when the xrandr mode list is reset, hence the need to re-add the mode). All it does is make sure that the relevant output port (HDMI3) knows how to display 1360 x 768. Note that the name of the mode has no particular significance, and that the frame rate isn’t given explicitly, but is calculated by the tools. I got these figures from an xrandr readout with the desired monitor connected directly. See the full listing at the end of this post; it’s the first entry there.
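In other words, that readout (copied here from the listing at the end of this post) can be pulled with something like this; the pixel clock and the h/v start / end / total figures map to the --newmode arguments, in that order:

$ xrandr --verbose | grep -A 2 '1360x768 '
  1360x768 (0x4b) 85.500MHz +HSync +VSync *current +preferred
        h: width  1360 start 1424 end 1536 total 1792 skew    0 clock  47.71KHz
        v: height  768 start  771 end  777 total  795           clock  60.02Hz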

The third command actually switches the display to the desired mode. It can actually be removed, because it’s overridden very soon anyhow. Nevertheless, it shows the command that can be used manually on the console, given the two earlier commands (not that this should be needed, since the mode is selected automatically, fingers crossed).

That’s it. Except for occasional glitches (getting full control of this was too much to expect), the two actions mentioned above are enough to get the mode I wanted. Not the “no matter what” I wanted, but close enough.

As for the -d :0 flag, it’s required in remote sessions and scripts. Alternatively, start with an

$ export DISPLAY=:0

Using cvt to obtain the timing parameters (not!)

It’s suggested on some websites to obtain the timing parameters with something like

$ cvt 1360 768 60
# 1360x768 59.80 Hz (CVT) hsync: 47.72 kHz; pclk: 84.75 MHz
Modeline "1360x768_60.00"   84.75  1360 1432 1568 1776  768 771 781 798 -hsync +vsync

I tried this, and the monitor didn’t sync on the signal. It would indeed take a pretty lousy monitor to miss a DVI signal, and still.

Note the small differences between the timing parameters; that’s probably the reason for this failure. So when the real parameters can be obtained, use them. There is no secret catch-all formula that fits all graphics modes; the CVT formula works on a good day.

Hands off, Cinnamon’s daemon!

My original idea was to turn off all automatic graphics mode setting mechanisms, and stay with a single xrandr command, running from /etc/X11/Xsession.d/ or something. It was a great idea, but it didn’t work: I saw a momentary switch to the mode I wanted, and then it changed to something else. I could have added some kind of daemon of my own, that waits a bit and then changes the mode with xrandr, but that’s just adding another daemon to wrestle with the others.

So this didn’t really help, but I’ll leave it here anyway, in case someone wants to change the display mode without having some daemon change it back. Note that according to this page, using gsettings as shown below works only on Cinnamon versions before 3.4, after which the procedure is different (I haven’t tried it, however): Copy /etc/xdg/autostart/cinnamon-settings-daemon-xrandr.desktop to $HOME/.config/autostart, then append the line Hidden=true to the copied file.

In short, YMMV. Here’s how I did it on my system (and then found it’s not good enough, as mentioned above).

Resolution mode settings made with xrandr will be sporadically overridden by cinnamon-settings-daemon, which has a lot of plugins running for different housekeeping tasks. One of them is to keep X-Window’s display resolution in sync with .config/monitors.xml. So disable it.

Following my own post, this is typically the setting for the said plugin:

$ gsettings list-recursively org.gnome.settings-daemon.plugins.xrandr
org.gnome.settings-daemon.plugins.xrandr active true
org.gnome.settings-daemon.plugins.xrandr priority 0
org.gnome.settings-daemon.plugins.xrandr default-monitors-setup 'follow-lid'
org.gnome.settings-daemon.plugins.xrandr default-configuration-file '/etc/gnome-settings-daemon/xrandr/monitors.xml'

So turn it off:

$ gsettings set org.gnome.settings-daemon.plugins.xrandr active false

and then check again with the list-recursively command above.

xrandr output: The full list of modes

Just for reference, these are the modes given by xrandr for the monitor I did all this for:

$ xrandr -d :0 --verbose

[ ... ]

  1360x768 (0x4b) 85.500MHz +HSync +VSync *current +preferred
        h: width  1360 start 1424 end 1536 total 1792 skew    0 clock  47.71KHz
        v: height  768 start  771 end  777 total  795           clock  60.02Hz
  1920x1080i (0x10b) 74.250MHz -HSync -VSync Interlace
        h: width  1920 start 2008 end 2052 total 2200 skew    0 clock  33.75KHz
        v: height 1080 start 1084 end 1094 total 1125           clock  60.00Hz
  1920x1080i (0x10c) 74.250MHz +HSync +VSync Interlace
        h: width  1920 start 2008 end 2052 total 2200 skew    0 clock  33.75KHz
        v: height 1080 start 1084 end 1094 total 1125           clock  60.00Hz
  1920x1080i (0x10d) 74.250MHz +HSync +VSync Interlace
        h: width  1920 start 2448 end 2492 total 2640 skew    0 clock  28.12KHz
        v: height 1080 start 1084 end 1094 total 1125           clock  50.00Hz
  1920x1080i (0x10e) 74.176MHz +HSync +VSync Interlace
        h: width  1920 start 2008 end 2052 total 2200 skew    0 clock  33.72KHz
        v: height 1080 start 1084 end 1094 total 1125           clock  59.94Hz
  1280x720 (0x10f) 74.250MHz -HSync -VSync
        h: width  1280 start 1390 end 1430 total 1650 skew    0 clock  45.00KHz
        v: height  720 start  725 end  730 total  750           clock  60.00Hz
  1280x720 (0x110) 74.250MHz +HSync +VSync
        h: width  1280 start 1390 end 1430 total 1650 skew    0 clock  45.00KHz
        v: height  720 start  725 end  730 total  750           clock  60.00Hz
  1280x720 (0x111) 74.250MHz +HSync +VSync
        h: width  1280 start 1720 end 1760 total 1980 skew    0 clock  37.50KHz
        v: height  720 start  725 end  730 total  750           clock  50.00Hz
  1280x720 (0x112) 74.176MHz +HSync +VSync
        h: width  1280 start 1390 end 1430 total 1650 skew    0 clock  44.96KHz
        v: height  720 start  725 end  730 total  750           clock  59.94Hz
  1024x768 (0x113) 65.000MHz -HSync -VSync
        h: width  1024 start 1048 end 1184 total 1344 skew    0 clock  48.36KHz
        v: height  768 start  771 end  777 total  806           clock  60.00Hz
  800x600 (0x114) 40.000MHz +HSync +VSync
        h: width   800 start  840 end  968 total 1056 skew    0 clock  37.88KHz
        v: height  600 start  601 end  605 total  628           clock  60.32Hz
  720x576 (0x115) 27.000MHz -HSync -VSync
        h: width   720 start  732 end  796 total  864 skew    0 clock  31.25KHz
        v: height  576 start  581 end  586 total  625           clock  50.00Hz
  720x576i (0x116) 13.500MHz -HSync -VSync Interlace
        h: width   720 start  732 end  795 total  864 skew    0 clock  15.62KHz
        v: height  576 start  580 end  586 total  625           clock  50.00Hz
  720x480 (0x117) 27.027MHz -HSync -VSync
        h: width   720 start  736 end  798 total  858 skew    0 clock  31.50KHz
        v: height  480 start  489 end  495 total  525           clock  60.00Hz
  720x480 (0x118) 27.000MHz -HSync -VSync
        h: width   720 start  736 end  798 total  858 skew    0 clock  31.47KHz
        v: height  480 start  489 end  495 total  525           clock  59.94Hz
  720x480i (0x119) 13.514MHz -HSync -VSync Interlace
        h: width   720 start  739 end  801 total  858 skew    0 clock  15.75KHz
        v: height  480 start  488 end  494 total  525           clock  60.00Hz
  720x480i (0x11a) 13.500MHz -HSync -VSync Interlace
        h: width   720 start  739 end  801 total  858 skew    0 clock  15.73KHz
        v: height  480 start  488 end  494 total  525           clock  59.94Hz
  640x480 (0x11b) 25.200MHz -HSync -VSync
        h: width   640 start  656 end  752 total  800 skew    0 clock  31.50KHz
        v: height  480 start  490 end  492 total  525           clock  60.00Hz
  640x480 (0x11c) 25.175MHz -HSync -VSync
        h: width   640 start  656 end  752 total  800 skew    0 clock  31.47KHz
        v: height  480 start  490 end  492 total  525           clock  59.94Hz

The vast majority are standard VESA modes.

Making a snapshot of a full Ubuntu / Mint repository on the local disk

What’s that good for?

This isn’t about maintaining a local repository that mirrors its original, following along with its changes. The idea is to avoid upgrading a lot of packages every time I want to install a new one with apt. Maybe I should mention that I don’t allow automatic upgrades on my machine? Exactly like I don’t leave my car keys with the mechanic, so he can make any fixes he thinks would make my car better. Every day.

Before I get into the technical details, I’ll have my say on the culture of upgrading everything all the time. Just in case someone with influence on the matter reads this. Or maybe someone ready to maintain a non-updating mirror…?

The way packaging is done today, each package requires the latest-latest dependencies just because they happen to exist, not because they’re needed. I mean, forcing an unnecessary upgrade of other packages is fine, because how could upgrading be wrong? Or go wrong?

But upgrading is good…?

Most people believe upgrading software is generally good. Personally, I don’t. Every now and then, an upgrade breaks something that worked, and even seemingly harmless upgrades of minor pieces of software can force me into a session of debugging my system. It may very well be that the upgrade rectified something that was wrong before. But this way or another, my computer worked before, and after the upgrade it didn’t. As has already been said:

If it ain’t broke, don’t fix it.

Upgrades sometimes involve security fixes. Staying with old versions is considered neglecting your security. This may be true when the computer is a server or a multiuser machine, and strangers are allowed to do this and that with it. However, when it comes to single-user desktops that are properly set up (plus a firewall), the risk of an upfront security exploit is rather minimal. I always ask people when they last heard about a personal Linux desktop being compromised by virtue of a vulnerability, and nobody can come up with such a case.

I’ve also discussed this issue with several guys who are responsible for major Linux systems, for which a downtime means real damage, and stability is important. The typical conversation contains an apology for using old distributions and old software, and them reassuring me that they understand that upgrading is important. It’s just that in their specific case they have to stick to a certain, old Linux distribution to keep the system running continuously.

So it’s a risk management question: the risk of having the computer messed up by an upgrade (with probability converging to 1 as time goes by) vs. the probability of the desktop being attacked (probability unknown, as no such event is known to me). Given that I fix significant security issues by other means, as they occur.

So my decision is clear, and here’s how to do it.

apt-mirror

All said below relates to Linux Mint 19, but most likely applies to a wide range of Debian-based distros.

apt-mirror is a cleverly written Perl script that mirrors selected Debian repositories into the local disk. It’s one of those utilities that simply do the job with the practical, real-life details taken care of correctly. In the proper Perl spirit, in short.

In essence:

  • Install apt-mirror with a plain apt command.
  • Change the ownership of the directory to which the packages go (given as base_path in the config file) to apt-mirror.
  • Don’t set up the cron job, as we’re not into having it updated (possibly delete /etc/cron.d/apt-mirror). These steps are sketched as commands right after this list.
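Spelled out as commands, roughly (the /opt/apt-mirror base_path matches the examples further down; substitute your own):

# apt install apt-mirror
# chown -R apt-mirror /opt/apt-mirror
# rm /etc/cron.d/apt-mirror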

Set up mirror.list

The default /etc/apt/mirror.list is generally fine, with nthreads set to 20 by default, which is OK.

You may want to set base_path in /etc/apt/mirror.list to something else than the default.

Then copy all repositories listed in /etc/apt/sources.list.d into mirror.list. This is just copying the lines beginning with “deb” as is.
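A quick way to print the lines to copy (assuming the repositories are listed there, as described above):

$ grep -h '^deb ' /etc/apt/sources.list.d/*.list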

Well, probably not quite as is: If you’re running on a 64-bit machine (is there anyone who isn’t?), set /etc/apt/mirror.list to download packages for both amd64 and i386. This grows the disk consumption from 140 GB to 193 GB (YMMV), but sometimes these i386 packages are handy.

For this to work, each line must appear twice. So if the original “deb” line said

deb http://packages.linuxmint.com tara main upstream import backport

these two should appear in mirror.list:

deb-i386 http://packages.linuxmint.com tara main upstream import backport
deb-amd64 http://packages.linuxmint.com tara main upstream import backport

Otherwise, apt-mirror downloads only the packages for the current arch. One can also add a deb-src line for the mirrored repository, if desired.
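Putting the pieces together, a minimal mirror.list might look something like this (a sketch only, with a single repository for brevity; the “clean” line is explained further down):

set base_path    /opt/apt-mirror
set nthreads     20

deb-amd64 http://packages.linuxmint.com tara main upstream import backport
deb-i386  http://packages.linuxmint.com tara main upstream import backport

clean http://packages.linuxmint.com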

Running apt-mirror

Run apt-mirror as root with

# su - apt-mirror -c apt-mirror

The cool part about running it this way is that if you hit CTRL-C (and you will want to: apt-mirror runs forever on the first attempt, downloading ~200 GiB or so), the child processes (a lot of wgets) are killed gracefully as well.

How it works: First it does some crunching of the repositories’ metadata. After not too long time, it generates 20 processes, each for downloading a list of URLs:

wget --no-cache --limit-rate=100m -t 5 -r -N -l inf -o /opt/apt-mirror/var/archive-log.0 -i /opt/apt-mirror/var/archive-urls.0

all of which run in /opt/apt-mirror/mirror, which is the target directory for the files.

After spawning these, apt-mirror just waits. The output shown on the console (like “[20]…”) is the number of processes still running.

Configure apt to use local repositories only

First of all, move the “mirror” subdirectory away to some other place, so it’s out of sight for apt-mirror. No more updates. For example, into /var/local-apt-repo. I also suggest changing its owner to root at this stage:

# chown -R root:root local-apt-repo/

So a line in /etc/apt/sources.list.d (or /etc/apt/sources.list) saying

deb http://packages.linuxmint.com tara main upstream import backport

changes into (if the repository is kept in /var/local-apt-repo/)

deb file:///var/local-apt-repo/packages.linuxmint.com tara main upstream import backport

and make apt aware of the change:

# apt clean
# apt update

Verify that only local files are accessed (it prints out the paths) and that there are no errors. Those who opted not to download the i386 repositories will get a lot of error messages at the end, like

E: Failed to fetch file:/var/local-apt-repo/packages.linuxmint.com/dists/tara/main/binary-i386/Packages  File not found - /var/local-apt-repo/packages.linuxmint.com/dists/tara/main/binary-i386/Packages (2: No such file or directory)

I suppose it’s harmless, with the end result being simply that there are no i386 packages to work with, but I don’t really know, as I went for downloading packages for both archs.

And then comes the last session of upgrades. At a convenient time for tackling possible upgrade side effects, go

# apt list --upgradable

and then try to upgrade packages in small chunks, so that the changes each make can be tracked (in particular if you have a git repo on /etc, like myself), with

# apt install --only-upgrade package-name

For convenience, package-name may include wildcards with * (use single quotes, or escape it with a backslash, i.e. \*).
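For example (the package pattern here is just for illustration):

# apt install --only-upgrade 'firefox*'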

After all this is done, fix whatever broke because of all upgrades. In my case I lost graphics acceleration on my NVidia card, solved by manually reinstalling the drivers as originally downloaded from the vendor. Just to remind me why I’m doing all this.

If apt-file is also installed (it’s a good idea to have it), this is also a good time to go

# apt-file update

How and why the local repo is self-contained

A snapshot isn’t worth much if it can’t be relied upon indefinitely. The fact that the download process typically takes a few days, most likely with several interruptions in the middle, doesn’t contribute to the feeling of reassurance. I ran my downloads at night only, for example (hey, I want a decent internet connection during the day).

This is solved in a surprisingly simple manner: The Packages files contain the list of files required to constitute a self-contained repository. apt-mirror first downloads these index files, then it downloads the package files they require, and only then does it update the Packages files in the mirror. At all times, all required files are in place.

This is why it’s safe to stop apt-mirror in the middle: Even though the running wget processes will leave some files half-downloaded, these will be fixed on the next run, since apt-mirror compares the size of the file on disk with the size declared in the respective Packages file (in its need_update() function). So all files listed in the Packages files must exist and be of the correct size; this is apt-mirror’s notion of a file being in place.

One could also compare the SHA sums of each file in the entire repo. I haven’t found such a utility, and I’m not sure it’s worth the effort.
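If one does want to, something along these lines should do. It’s a sketch only: it verifies every .deb listed in a single Packages file against its declared SHA256 sum, run from the repository’s root directory (the one containing dists and pool), and the Packages file path below is just an example. If your mirror only keeps Packages.gz, pipe it through zcat first.

awk '/^Filename:/ { f = $2 }
     /^SHA256:/   { print $2 "  " f }' dists/tara/main/binary-amd64/Packages \
  | sha256sum --check --quiet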

It would have been somewhat reassuring to run apt-mirror again after its completion, and see that it downloads nothing. It seems like that doesn’t happen: I ended up downloading one archive file of 596 MiB each time (or so apt-mirror said), but then going

$ find /opt/apt-mirror/ -iname \*.deb -cmin -3

found no files. So this was probably only metadata being downloaded (indeed, dropping the *.deb filter listed a lot of files).

Reducing wasted disk space

A side effect of the way apt-mirror works is that outdated packages remain in the repository: When apt-mirror is re-run, it makes sure that all files listed in the current Packages files are downloaded. When a package is updated, a new package file is enlisted, and the old one just vanishes from the Packages file. But apt-mirror doesn’t delete it, because while the repository is being updated, the old Packages file is still in effect.

Also, in a real-life mirror scenario, someone could be in the middle of an installation which is based upon several files. So the unnecessary files can only be deleted after the Packages files have been updated (i.e. when apt-mirror finishes) plus the maximal time one could imagine an installation to take. Actually, in a continuously updating web mirror situation, removing a package file will break things for end-users until they run “apt update”. So a real mirror with happy end users should probably not delete files all that often.

Anyhow, apt-mirror creates a clean.sh script in the var/ subdirectory, which deletes all files that aren’t required by the current set of Packages files. Execute it to get rid of those, when the time is right. Note that the script changes directory to the absolute path to which the mirror was downloaded (so watch out if you eventually move that directory).

# su - apt-mirror -c /opt/apt-mirror/var/clean.sh

For this script to be generated, add “clean” lines in mirror.list, like

clean http://packages.linuxmint.com

If there are several “deb” lines for the same host, one “clean” line like the above covers them all.

Another waste of disk space is that security.ubuntu.com contains a lot of packages that are already in other repositories. As this entire repo takes 40 GB (30 GB for amd64 alone), it’s unfortunate. One possibility would be to write a script that scans the directories for identical files (based upon SHA sums) and removes all but one copy, replacing the others with symbolic links. Or maybe get this info from the Packages files. Or, like I did, not bother at all.
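A sketch of the idea, and nothing more: list identical .deb files (by SHA256) across the whole mirror, and print the “ln” commands that would replace each duplicate with a symlink to the first copy found. Review the output before piping it into a shell; the path is the one used above.

find /var/local-apt-repo -name '*.deb' -exec sha256sum {} + | sort |
  awk '$1 == prev { printf "ln -sf %s %s\n", keeper, $2; next }
       { prev = $1; keeper = $2 }'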

apt / dpkg: Ignore error in post-install script

You have been warned

This post shows how to cripple the installation of a Debian package, and make the system think it went through OK. This might very well bring your system’s behavior towards the exotic, unless you know perfectly what you’re doing.

In some few cases, like the one shown below, it might actually be a good idea.

Introduction

Sometimes the post-installation script of Debian packages fails for a good reason. So good that one wants to ignore the failure, and mark the package as installed, so its dependencies are in place. And so apt stops nagging about it.

On my Linux Mint 19 machine, this happened to me with grub-efi-amd64: Its installation involves updating something in /boot, which is mounted read-only, exactly for that reason: The system boots perfectly, so why fiddle with it?

And indeed, it’s not fully installed (note the iF part):

$ dpkg -l grub-efi-amd64
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                                  Version                         Architecture                    Description
+++-=====================================================-===============================-===============================-================================================================================================================
iF  grub-efi-amd64                                        2.02-2ubuntu8.13                amd64                           GRand Unified Bootloader, version 2 (EFI-AMD64 version)

There must be an easy way around it…?

OK, so how about giving it a push?

# dpkg --configure --force-all grub-efi-amd64
Setting up grub-efi-amd64 (2.02-2ubuntu8.13) ...
Installing for x86_64-efi platform.
grub-install: error: cannot delete `/boot/grub/x86_64-efi/lsacpi.mod': Read-only file system.
Failed: grub-install --target=x86_64-efi
WARNING: Bootloader is not properly installed, system may not be bootable
cp: cannot create regular file '/boot/grub/unicode.pf2': Read-only file system
dpkg: error processing package grub-efi-amd64 (--configure):
 installed grub-efi-amd64 package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
 grub-efi-amd64

Not only did it not work, but this error message now appears every time I try installing anything with apt: It always attempts to finish that installation. And fails.

After quite some googling, I’m convinced that there’s no way to tell apt or dpkg to ignore a failed post-installation. Not with a million warnings and confirmations. Nothing. The post installation must succeed, by hook or by crook. The packaging machinery simply won’t register a package as fully installed otherwise.

Maybe apt-mark?

Ehm, the truth is I wasn’t aware of this utility when I went through this. So maybe apt-mark can be used to mark the relevant package as installed, and that’s it. Will update if I run into a similar issue again.

The ugly fix

So that leaves us with the crook. Luckily, the script in question is in a known place, waiting to be edited:

# vi /var/lib/dpkg/info/grub-efi-amd64.postinst

For any other package, just replace the grub-efi-amd64 with what you have.

And just add an “exit 0” as shown below. This makes the script return with success, without doing anything. You may want to examine what it would have done, possibly perform some of the operations manually, etc. But anyhow, it’s just this:

#!/bin/bash
set -e

exit 0;

[ ... ]

And then try again:

# dpkg --configure grub-efi-amd64
Setting up grub-efi-amd64 (2.02-2ubuntu8.13) ...

Like a charm, of course. And now the package is happily installed:

$ dpkg -l grub-efi-amd64
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=================================
ii  grub-efi-amd64 2.02-2ubuntu amd64        GRand Unified Bootloader, version

And don’t forget to remove that edit afterwards. Or possibly it might cause issues in the future…?

Linux: Command-line utilities for obtaining information

There are many ways to ask a Linux machine how it’s doing. I’ve collected a few of them, mostly for my own reference. I guess I’ll add more items as I run across new ones.

General Info

  • inxi -Fxxxz (neat output, but makes the system send me security “password required” alert mails because of attempts to execute hddtemp).
  • hwinfo
  • lshw
  • Temperature and fans: sensors
  • hostnamectl (host name, kernel version, Distribution etc.)
  • dmidecode: Lots of info on and by the BIOS (from computer model and make down to the exact part numbers of the installed DIMM memories, etc.)

Status

  • Logs: journalctl and dmesg
  • systemctl, with all its variants
  • Network: ifconfig
  • Wifi: iwconfig
  • Bluetooth: hciconfig
  • CPU: lscpu
  • PCI bus: lspci
  • USB: lsusb
  • RAID: mdadm --detail /dev/md0

Operating system

  • List open files: lsof
  • Block devices and partitions: blkid and lsblk
  • List namespaces: lsns
  • List loaded kernel modules: lsmod
  • List locks: lslocks

Linux: Atheros QCA6174’s Bluetooth disappearing after reboot

When Bluetooth goes poof

Having rebooted my computer after a few months of continuous operation, I suddenly failed to use my Bluetooth headphones. It took some time to figure out that the problem wasn’t with the Cinnamon 3.8.9 fancy stuff, nor with the DBus interface, both of which produced error messages. There was simply no Bluetooth device in the system to talk to.

Prior to this mishap, my Atheros QCA6174 had worked flawlessly and reliably for several months, both as a Wifi adapter and a Bluetooth adapter.

For the record, I have a Linux Mint 19 Tara machine with a 4.15.0-20-generic kernel on an X299 AORUS Gaming 7 motherboard, running in 64-bit mode, of course.

I’ll jump to the spoiler: If you happen to have a Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter on your machine, never just reboot the machine: Shut down the computer completely, and disconnect mains power for a minute or so with the power supply’s switch. Just powering off the computer the fine way isn’t enough. The device probably continues to get power from the motherboard when the computer is off, by virtue of its own power control.

Powering off the computer this way is what solved it for me. However there are also some rumors on the web, which I can’t confirm, about Bluetooth coming back to life after loading Windows on the same computer. Or turning Bluetooth off and on again with the BIOS. My guess is that due to a bug, the chip sometimes needs some kind of tickle on shutdown or when starting, or Bluetooth is lost. Something that is worked around with a hush-hush fix in the driver for Windows, but the Linux driver doesn’t do the same.

This post goes down to the gory details, partly for the sake of quick diagnostics in the future, and partly because Bluetooth tends to be a mystery thing. So I’m trying to give an idea on what’s going on.

Why it’s confusing

Here’s the thing: The Qualcomm QCA6174 connects to the motherboard as a PCIe device for its Wifi adapter, and to the USB bus as a Bluetooth device. Sounds weird, but that’s the way it is. Bluetooth has been wonky for 20 years, and that’s probably its destiny.

So there are wires going from the QCA6174 to the PCIe bus, and other wires from the same device going to one of the ports of the motherboard’s USB root hub (see details below). On my specific motherboard, the Bluetooth interface of the QCA6174 is connected to port 13 of the root hub, which also serves the motherboard’s physical USB connectors at the back of the computer. So while designing the board, they wired some of the D+/D- pairs to the physical ports at the back, and one of those pairs goes to the QCA6174. I’m saying this over and over again, because it’s so counterintuitive.

Counterintuitive, but it seems like it’s quite common. Intel’s AC 7260 Wifi / Bluetooth combo seems to do exactly the same thing.

As a PCIe device, the QCA6174 has Vendor / Product IDs 168c:003e. As a USB device, it’s 0cf3:e300. Confusing? It won’t surprise me if the Wifi and Bluetooth interfaces are two independent units on the same chip, which happen to share an antenna.

Apparently, when the QCA6174 has a bad day, the PCIe interface wakes up properly, and USB doesn’t. The result is that the Wifi works fine, but the Bluetooth is absent.

To add some confusion, the kernel source’s drivers/net/wireless/ath/ath10k/usb.c matches USB device 13b1:0042, which is indeed a Linksys device (the comment in the code says Linksys WUSB6100M). Not clear why it’s there.

On the other hand, drivers/bluetooth/btusb.c matches a whole range of Atheros USB devices, among them 0cf3:e300, calling it “QCA ROME” in the comments. So it’s the btusb module that takes care of the QCA6174’s Bluetooth interface, not anything in ath/ath10k. Cute, isn’t it?

What it looks like when it works

When trying to figure out what’s wrong, it helps to know what things look like when they’re OK. So below is a lot of info that was collected when I had the Bluetooth up and running.

When it failed, everything looked exactly the same with regard to the device’s PCIe interface, but there was absolutely nothing related to USB and Bluetooth: No entry for the device in lsusb, hciconfig nor rfkill, as shown below.
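A quick way to check all three at once (the 0cf3:e300 ID is the one my adapter reports, as listed below; complete silence from all three commands means the Bluetooth function is gone):

$ lsusb -d 0cf3:e300; hciconfig; rfkill list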

Kernel log output on behalf of the device, as connected to the PCIe bus. Note that the exact same logs appeared when the Bluetooth device was absent. Exactly-exactly. Down to the single character, I’ve compared them. So this isn’t really relevant, but anyhow:

[    0.126428] pci 0000:03:00.0: [168c:003e] type 00 class 0x028000
[    0.126456] pci 0000:03:00.0: reg 0x10: [mem 0x92800000-0x929fffff 64bit]
[    0.126555] pci 0000:03:00.0: PME# supported from D0 D3hot D3cold

[ ... ]

[   17.616738] ath10k_pci 0000:03:00.0: enabling device (0000 -> 0002)
[   17.617514] ath10k_pci 0000:03:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0

[ ... ]

[   17.915091] ath10k_pci 0000:03:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:03:00.0.bin failed with error -2
[   17.915109] ath10k_pci 0000:03:00.0: Direct firmware load for ath10k/cal-pci-0000:03:00.0.bin failed with error -2
[   17.926172] ath10k_pci 0000:03:00.0: qca6174 hw3.2 target 0x05030000 chip_id 0x00340aff sub 1a56:1535
[   17.926173] ath10k_pci 0000:03:00.0: kconfig debug 0 debugfs 1 tracing 1 dfs 0 testmode 0
[   17.926505] ath10k_pci 0000:03:00.0: firmware ver WLAN.RM.4.4.1-00124-QCARMSWPZ-1 api 6 features wowlan,ignore-otp crc32 d8fe1bac
[   18.078191] ath10k_pci 0000:03:00.0: board_file api 2 bmi_id N/A crc32 506ce037

[ ... ]

[   18.642461] ath10k_pci 0000:03:00.0: Unknown eventid: 3
[   18.658195] ath10k_pci 0000:03:00.0: Unknown eventid: 118809
[   18.661096] ath10k_pci 0000:03:00.0: Unknown eventid: 90118
[   18.661772] ath10k_pci 0000:03:00.0: htt-ver 3.56 wmi-op 4 htt-op 3 cal otp max-sta 32 raw 0 hwcrypto 1

Note that two attempts to load firmware failed, but apparently the third went OK. Don’t let these error messages mislead you: The kernel messages in this respect were the same when the Bluetooth appeared and when it didn’t.

The “Unknown eventid” may appear more than once.

Its entry with plain lspci (unrelated entries removed):

$ lspci
03:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32)

And now to the parts that were missing completely when the Bluetooth device didn’t appear: The logs on behalf of the device, connected to the USB bus:

[    3.764868] usb 1-13: new full-speed USB device number 12 using xhci_hcd
[    3.913930] usb 1-13: New USB device found, idVendor=0cf3, idProduct=e300
[    3.915610] usb 1-13: New USB device strings: Mfr=0, Product=0, SerialNumber=0

Plain lsusb:

$ lsusb
[ ... ]
Bus 001 Device 012: ID 0cf3:e300 Atheros Communications, Inc.
[ ... ]

lsusb, tree view (a lot of irrelevant stuff excluded):

$ lsusb -t
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/16p, 480M
    |__ Port 13: Dev 12, If 1, Class=Wireless, Driver=btusb, 12M
    |__ Port 13: Dev 12, If 0, Class=Wireless, Driver=btusb, 12M

More in detail for the device:

# lsusb -v -d 0cf3:e300

Bus 001 Device 012: ID 0cf3:e300 Atheros Communications, Inc.
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.01
  bDeviceClass          224 Wireless
  bDeviceSubClass         1 Radio Frequency
  bDeviceProtocol         1 Bluetooth
  bMaxPacketSize0        64
  idVendor           0x0cf3 Atheros Communications, Inc.
  idProduct          0xe300
[ ... ]

Several modules appear in lsmod, but the point is that there are dependencies on bluetooth module, in particular the btusb module. Irrelevant modules deleted from list below:

$ lsmod
rfcomm                 77824  16
bnep                   20480  2
btusb                  45056  0
btrtl                  16384  1 btusb
btbcm                  16384  1 btusb
btintel                16384  1 btusb
bluetooth             548864  43 btrtl,btintel,bnep,btbcm,rfcomm,btusb

How to check if a Bluetooth device is present

There is no device file for a Bluetooth interface, exactly as there’s none for a network interface. Just as there’s eth0 for Ethernet, there’s hci0 for Bluetooth.

hciconfig grabs the info by opening a socket. As seen in the relevant strace:

socket(PF_BLUETOOTH, SOCK_RAW, 1)

So this is what it looked like with the Bluetooth device present (without it, hciconfig simply prints nothing):

$ hciconfig
hci0:	Type: Primary  Bus: USB
	BD Address: xx:xx:xx:xx:xx:xx  ACL MTU: 1024:8  SCO MTU: 50:8
	UP RUNNING PSCAN ISCAN
	RX bytes:1386 acl:0 sco:0 events:94 errors:0
	TX bytes:5494 acl:0 sco:0 commands:94 errors:0

Real hex numbers appear instead of the xx’s above. Use hciconfig -a for more verbose output.
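The presence of the interface should also be visible in sysfs, without any Bluetooth userspace tools (the hci0 entry below is what appears on my system when the device is present):

$ ls /sys/class/bluetooth
hci0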

The device also appears in the rfkill list, where it shouldn’t be blocked:

$ rfkill list
0: hci0: Bluetooth
	Soft blocked: no
	Hard blocked: no
1: phy0: Wireless LAN
	Soft blocked: no
	Hard blocked: no

These two show that the kernel supplies a Bluetooth device to the higher software levels. If Bluetooth doesn’t work, there are other reasons…

VIA VL805 USB 3.0 PCIe adapter: Forget about Linux

TL;DR

Bought an Orico PCIe adapter for USB 3.0 (model PVU3-5O2I) for testing a USB device I’m developing. It has the VL805 chipset (1106:3483), which isn’t xHCI compliant. So it works only with the vendor’s own drivers for Windows, which you’ll have to struggle a bit to install.

Attempt with Linux

Note that the device is detected by its class (xHCI), and not by its Vendor / Product IDs.

The following was found in the kernel log while booting:

[    0.227014] pci 0000:03:00.0: [1106:3483] type 00 class 0x0c0330
[    0.227042] pci 0000:03:00.0: reg 0x10: [mem 0xdf000000-0xdf000fff 64bit]
[    0.227104] pci 0000:03:00.0: PME# supported from D0 D1 D2 D3hot D3cold
[    0.227182] pci 0000:03:00.0: System wakeup disabled by ACPI

and

[    0.325254] pci 0000:03:00.0: xHCI HW did not halt within 16000 usec status = 0x14

and then

[    1.474178] xhci_hcd 0000:03:00.0: xHCI Host Controller
[    1.474421] xhci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 3
[    1.505919] xhci_hcd 0000:03:00.0: Host not halted after 16000 microseconds.
[    1.506066] xhci_hcd 0000:03:00.0: can't setup: -110
[    1.506241] xhci_hcd 0000:03:00.0: USB bus 3 deregistered
[    1.506494] xhci_hcd 0000:03:00.0: init 0000:03:00.0 fail, -110
[    1.506640] xhci_hcd: probe of 0000:03:00.0 failed with error -110

The error message comes from xhci_halt() defined in drivers/usb/host/xhci.c, and doesn’t seem to indicate anything special, except that the hardware doesn’t behave as expected.

Update firmware, maybe?

The idea was to try updating the firmware on the card. Maybe that will help?

So I downloaded the driver from the manufacturer and the firmware fix tool from Station Drivers.

Ran the firmware fix tool before installing the driver on Windows. It went smoothly. Then recycled power completely and booted Linux again (the instructions require that). Exactly the same error as above.

Went for Windows again, ran the firmware update tool, and this time read the firmware revision. It was indeed 013704, as it should be. So this doesn’t help.

Install driver on Windows 10

Checking in the Device Manager, the card was found as an xHCI controller, but with a “device cannot start (Code 10)” error. In other words, Windows’ xHCI driver didn’t like it either.

Attempted installation of the driver. Failed with “Sorry, the install wizard can’t find the proper component for the current platform. Please press OK to terminate the install Wizard”. What it actually means is that the installation software (just downloaded from the hardware vendor) hasn’t heard about Windows 10, and could therefore not find an appropriate driver.

So I found the directory to which the files were extracted, somewhere under C:\Users\{myuser}\AppData\Local\Temp\is-VJVK5.tmp, and copied the USetup directory from there. Then I selected xhcdrv.inf for driver installation. It’s intended for Windows 7, but it so happens that drivers for Windows 7 and Windows 10 are generally the same. It’s the installer that was unnecessarily fussy.

After installing this driver, a “VIA USB eXtensible Host Controller” entry appeared in the USB devices list of the Device Manager, and it said it works properly.

After a reboot, there was an “xHCI Root Hub 0” under “Other Devices” in the Device Manager, with the error message “The drivers for this device are not installed”. The driver for it was available under the same USetup directory (ViaHub3.inf).

This added “VIA USB 2 Hub” and “VIA USB 3 Root Hub” to the list of USB devices, and believe it or not, the card started working.

Bottom line: It does work with its own very special drivers for Windows, with a very broken setup procedure.

ImageMagick convert to scale jpgs

Instead of using my scanner, I put my cell phone on some kind of stand, and shot a lot of paper documents (voice activation is a blessing). But then the files are unnecessarily large. Don’t need all that resolution. So

$ for i in * ; do convert "$i" -scale '33%' -quality 75 "smaller/scan_$i" ; done

And the files are 100-200k each with enough resolution to see the fine print.

USB 3.0 device compliance test notes

Introduction

While implementing Xillybus’ USB 3.0 general-purpose IP core for FPGAs, I found the USB Implementers Forum’s compliance tool handy, yet somewhat quirky, for verifying that I got things right. It was USB3CV version 2.1.12.1, running on Windows 10 @ 32 bit. The 64-bit version works the same (I’ve tested it as well).

A GPLed open-source version for Linux is something one could have wished for (and would probably improve things considerably), but that’s probably too much to expect when Microsoft is all over the USB standard.

These are my notes as I went along.

Obtaining and installing

Download USB3CV from this page. Installation went smooth on Windows 10 @32 bits (and 64 bits as well).

As suggested by the utility’s author, disable UAC completely on the system by invoking regedit and setting EnableLUA to 0 under the following path: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System. Then restart Windows. This is a matter of convenience, and not something to do on anything but an internal computer intended for testing only, as it’s a security hole.

A usbif.json file can be obtained from USB-IF for use by USB3CV. It’s only used for looking up the Vendor ID, so there’s no problem working without it. There will be two red lines in the log, but the relevant tests pass without this file as well.

Installing on Windows 7 (32 bit) failed on installing the Microsoft Visual C++ 2017 Redistributable (vc_redist.x86.exe) prerequisite. Despite removing everything installed on that computer (under Programs and Features), the problem remained — it’s probably because SP1 isn’t installed on that machine.

The hijack

When invoked, the USB3CV program replaces the original xHCI driver in Windows’ I/O stack with one it uses for testing, and restores the original one when exiting. This means that all USB devices connected to that USB controller (possibly all USB devices on a motherboard) become effectively disconnected. That includes USB 2.0 and USB 1.1 devices. To work around this, either use a good old PS/2 keyboard and mouse, employ a remote session, or plug in an extra USB board (possibly as low as USB 1.1) and plug the USB mouse and keyboard into it. Otherwise, well, no mouse and keyboard on a Windows system.

Also, be sure to close USB3CV before shutting down Windows, or it may not have enough time to bring the original driver back. See more notes on this issue below.

USB3CV is kind enough to prompt for which USB controller to take over, and it’s also forgiving if you accidentally knock out your own USB mouse and keyboard (a second dialog box requires confirming the takeover with a mouse click; if that times out, the takeover is reverted).

The immediate difference when plugging in a device when USB3CV is running (and has taken over the relevant controller) is that nothing happens — there is no enumeration nor descriptor fetching, as there would usually be. This happens only later on, when requesting tests, in which case the controller is scanned for devices. Several times, actually.

Another significant difference is that the test xHCI driver doesn’t have these small workaround features that a usual xHCI driver has for getting unstuck from protocol bug deadlocks, and it doesn’t attempt to smooth protocol errors. Which makes sense: A regular xHCI driver’s goal is to make the device work. The test driver is there to expose errors, so it should get stuck when things are done wrong. Hence it may expose bugs that were smoothed out when the device was connected in a regular manner to a computer. For example, a bug in the device’s LTSSM implementation may be smoothed out by a regular driver by issuing a warm reset and starting all over again, without any log message, but USB3CV’s driver will just fail.

So if weird stuff happens with USB3CV, check your own implementation, and don’t look for bugs in USB3CV. Reject the immediate “but it worked before” instinct to blame something else than your own design.

Hands on

Double-click the USB 3 Gen X CV icon. A dialog box with a list of USB controller(s) opens. Select the one that the device is attached to. All other USB devices on that controller, of all speeds, will be disconnected from the computer. Then the “Command Verifier” asks you to verify this choice. If you’ve just disabled your own mouse and keyboard, you can’t click “Continue”, so that dialog box times out and the hijacked USB controller is released.

Then the main window opens. Select “Chapter 9 Tests (USB 3 Gen X devices)”, Compliance Test (or Debug for individual tests), and click Run.

This is when USB3CV tries to find devices on the hijacked USB controller (there’s complete silence on the wires until then). Sometimes this fails for no apparent reason; see below. A dialog box asking to select the device to work with appears, followed by three dialog boxes asking for the number of retimers. If you don’t know what that’s about, you don’t have any, so select zero on all three.

When tests fail, the error messages may be misleading. For example, a problem with the device’s LTSSM made the GetDescriptor() test fail, spitting out the paragraphs in the spec that weren’t met, but on the wires there was no related SETUP packet sent (because the link wasn’t up, as it turned out eventually). However, the SET_SEL test went through OK nevertheless. So it can be really confusing, and easily mistaken for a bug in USB3CV.

Even worse, it seems like a test failure can lead to all kinds of unexpected and unrelated errors in the tests that follow.

The tool also complains when the device declares itself as USB 3.0 in the device descriptor rather than USB 3.2, considering that a test failure. What about USB 3.0 devices that don’t support anything related to SuperSpeedPlus? Why should they even mention USB 3.2?

When things get stuck

If weird things happen (in particular, if the device isn’t found by USB3CV, or anything else looks off), re-run USB3CV and exit it, so that the original xHCI driver is brought back upon exit. That’s what USB3CV expects to find on invocation, and it doesn’t work properly otherwise. So just run the tool and exit immediately, and then run it again.

It’s better to start USB3CV with the device already plugged in. Moving it to another plug while USB3CV is running often helps.

Crashes

Unfortunately, USB3CV crashes quite a lot (mostly in relation with test failures, in particular failing tests related to low power states). The “Command Verifier Tool has stopped working” dialog box may appear. A rather ironic workaround seems to work: Clicking “Abort” as the tests run (towards the end, actually), and then clicking “No”, a bit before the end of the tests, in the dialog box asking if you really want to abort (so the test isn’t aborted, after all). Sometimes the enumeration test is done, sometimes it isn’t (and fails with some ugly error), so maybe that’s related.

Sometimes USB3CV just gets stuck in a test, and attempting to abort the test doesn’t help. Closing the USB3CV window brings up “Wait for a stack switch”, after which Windows crashes with a “Your PC ran into a problem and needs to restart” message. Which probably means that it’s OK to recycle power (no other possibility left). Windows suggests searching online for “WDF_VIOLATION” too.

Whether the power recycle took place or not, USB3CV didn’t have the opportunity to return the original xHCI driver as it usually does upon a normal exit. Therefore, be sure to invoke and exit USB3CV as mentioned above to get the system back to its original state.

Linux: Writing a Windows MBR instead of grub

On a test computer, I use /dev/sda1 to contain whatever operating system I need for the moment. At some point, I installed Linux Mint 19.1 properly on that partition, and then I wanted to return to Windows 10. After writing the Windows 10 image to /dev/sda1, I got a message from grub saying it didn’t detect the filesystem.

Hmmm… So the MBR was overwritten by GRUB, and now I need to make it Windowish again. One can use Microsoft’s rescue disk-on-key, or go for the quick hack: Download ms-sys, compile it with a plain “make”, don’t bother to install, and just run it from the bin/ directory:

# ./ms-sys -7 /dev/sda

and Windows 10 boots like a charm.

Plumbing notes (yes, really plumbing)

Introduction

FPGA and Linux and all that hi-tech stuff is nice, but nothing compares to the self pride of getting a simple plumbing job done right. So this time it was about installing a pressure gauge under a bathroom sink, between the water outlet for the faucet’s cold water and the faucet itself.

Pressure gauge and tee

No need to tell me that the reading is valid only if the taps are all closed. Anyhow, this is a simple task if you happen to have a Tee fitting that happens to match the stuff that is supposed to connect to it. If not, go find adapters. Or another Tee. If you’re a plumber, you probably have a large box of stuff to try. As I’m not (the pipelines I usually deal with are digital register pipelines), the trick is to define the exact parts needed, and find them on AliExpress or eBay. Or maybe even at the hardware store. The latter option turned out pretty difficult: these parts are cheap, the motivation to help is accordingly low, and if I can’t define exactly what I need, it’s a lost battle. So I ordered the stuff from AliExpress eventually. And I got it right.

So here are the notes to myself for the next time I’ll need to do something similar.

The standards

Spoiler: In Israel, everything follows BSP. A former British colony, after all.

Plumbing is a local thing, performed by local people, depending on their local hardware stores, with a “give me that thingy” kind of communication. It’s therefore quite difficult to find exact definitions for plumbing fittings. It goes by “see if it fits”.

For threaded fittings, terms like ½” and ¾” are often used, but they refer to nothing that can be measured on the piece of metal itself. These figures used to indicate the inner diameter of the pipe, but that doesn’t hold anymore. So if you want to measure a fitting and tell what it’s called in the market, you need to go to the standard tables.

And here comes the real fun. There are mainly two standards for pipe sizes, which detail the dimensions of the pipes and threads. It seems like the most common fittings in Israel (and Europe) follow the British Standard Pipe (BSP) standard, but there are also the American National Standard Pipe Thread standards (often referred to as Nominal Pipe Size, NPS, or National Pipe Taper / Thread, NPT).

Sometimes IPS (Iron Pipe Size) is mentioned, but it usually means NPS.

The two standards are incompatible, despite similar terminology and measures. In particular, the thread pitch doesn’t match between the two standards. But there’s also the thread form: American goes with Sellers, which has sharp edges, and British with Whitworth thread form, which is a sine wave shape. Not that I can tell the difference just by looking. So if you try to mix British with American fittings, screwing will probably be difficult, and it won’t hold pressure well, if at all.

Either way, for historical reasons, the inch figure used in these standards matches none of the measured diameters of the pipe: neither the inner nor the outer. The OD (outer diameter) is the easiest to measure, and it’s the one to compare against the standard tables.

So the main headache is BSP vs. NPT (or NPS, IPS, MIP, FIP and all the other abbreviations meaning “American”).

And then we have this thing with DN sizes. For example, DN15 means ½”, and DN20 means ¾”. Even though these were coined for the American standard (and are listed on Wikipedia’s page for NPS), it seems like they’re also used in the context of BSP. So if a product is listed with a DN number, it probably says nothing about BSP vs. NPT.

Actual measurements

This is what I measured on my own stuff.

  • The tap for my washing machine is a ¾” according to the manual, I measured 0.97″ outer diameter (1.05″ per standard). Apparently washing machines go BSP.
  • A typical shower head has a ½” fitting, not clear if American or British (for this size, it seems like NPS and BSP are roughly the same).
  • My supply stop valves (those wall taps for bathroom faucets) are 3/8″ (measured 0.64″ outer diameter). In Israel, these wall-mounted angle valves are called “NIL taps (ברז ניל)”, which most likely refers to the German company NIL, and therefore conforms to BSP.
  • The pressure gauge is a ¼” (measured outer diameter 0.5″).

Iron and brass

It’s pretty well known, that if brass fittings are used on iron pipes, or if these two metals are mixed in any other way, the iron will corrode rapidly (within a few years), as they work together as a battery. So the material is crucial.

In Israel, all valves, taps and faucets are made of chrome- or nickel-plated brass, and are therefore OK for use with brass Tees and adapters.

“Teflon” tape (or PTFE)

When there’s no rubber ring sealing, teflon tape is applied on the thread. It works better when there’s a hard end to the screwing, as the force of the end works on the thread and the teflon applied to it. But it can work well otherwise.

It’s 20 turns around, or it won’t seal. Apply it evenly on the thread with slight tension. The direction is the same as the turning direction while fitting, i.e. the circular motion should tighten the “teflon” even more (and not unwind it). Don’t cover the first few threads (at the end of the pipe), for easier fitting. If the fitting torque is easy all the way, it’s not going to seal.