systemd random jots
As systemd seems to be here to stay (or at least I hope so), this is a post of random notes to self that I jot down as I explore it. It will probably grow with time, and become a mixture of basic issues and rather advanced stuff.
Also see my post on systemd services as cronjobs, which also discusses templates and some other finer details.
Useful references
- man systemd.service and man systemd.unit as well as others. Really. These are the best sources, it turns out.
- The excellent Systemd for Admins series (with several relevant and specific topics).
- The primer for systemd: Basic concepts explained.
- Red Hat’s guide to creating custom targets (and daemons)
- The FAQ (with actually useful info!)
- On the Network Target (and how to run a target only when the network is up)
- man systemd.special for a list of built-in targets, their meaning and recommended use
- man systemd.timer
- man systemd.time for how to express time events with systemd
- systemd.kill on how systemd kills services
- systemd.exec
systemctl is the name of the game
Forget “service”, “telinit” and “initctl”. “systemctl” is the new Swiss Army knife for starting, stopping, enabling and disabling services, as well as obtaining information on how services are doing. And it’s really useful.
To get an idea of what runs on the system and which unit triggered it, go
# systemctl status
Note that “systemctl status” lists, among others, all processes initiated by each login session for each user, which makes it an extremely useful variant of “ps”.
And for just a list of all active units (services included):
# systemctl
Ask about a specific service:
# systemctl status ssh
● ssh.service - OpenBSD Secure Shell server
Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2017-12-01 10:37:21 IST; 1h 17min ago
Main PID: 1018 (sshd)
CGroup: /system.slice/ssh.service
└─1018 /usr/sbin/sshd -D
Dec 01 12:26:26 machine sshd[2841]: Accepted publickey for eli from 192.168.1.12 port 45220 ssh2: RSA SHA256:xxx
Dec 01 12:26:26 machine sshd[2841]: pam_unix(sshd:session): session opened for user eli by (uid=0)
Show the service unit’s file (note that the file name in effect appears as a comment in the first row):
$ systemctl cat ssh
# /lib/systemd/system/ssh.service
[Unit]
Description=OpenBSD Secure Shell server
After=network.target auditd.service
ConditionPathExists=!/etc/ssh/sshd_not_to_be_run

[Service]
EnvironmentFile=-/etc/default/ssh
ExecStartPre=/usr/sbin/sshd -t
ExecStart=/usr/sbin/sshd -D $SSHD_OPTS
ExecReload=/usr/sbin/sshd -t
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
RestartPreventExitStatus=255
Type=notify
RuntimeDirectory=sshd
RuntimeDirectoryMode=0755

[Install]
WantedBy=multi-user.target
Alias=sshd.service
Really, isn’t it sweet?
There’s also systemctl show for an extensive printout of all assignments, explicit and implicit.
Turning off a service: Find it with the “systemctl status” command above (or just “systemctl”), and then disable it. The example below is a service that doesn’t appear in the status, because it’s an LSB service:
# systemctl disable tvheadend
tvheadend.service is not a native service, redirecting to systemd-sysv-install
Executing /lib/systemd/systemd-sysv-install disable tvheadend
insserv: warning: current start runlevel(s) (empty) of script `tvheadend' overrides LSB defaults (2 3 4 5).
insserv: warning: current stop runlevel(s) (0 1 2 3 4 5 6) of script `tvheadend' overrides LSB defaults (0 1 6).
And “enable” for enabling a service.
Needless to say, services can be started, stopped and restarted with “systemctl X service” where X is either start, stop or restart.
For a list of all services (including inactive ones):
$ systemctl --all
and then there’s a whole range of systemctl list-this-and-that, which are really useful. For example (try them out!):
$ systemctl list-dependencies
$ systemctl list-timers
$ systemctl list-unit-files
$ systemctl list-sockets
No more fishing in /var/log/syslog
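/var/log/syslog is still there, but forget about it: journalctl is the way to read logs. And it doesn’t require root privileges (membership in a group such as adm or systemd-journal is enough for reading the system journal), which is reason enough.
To get the log messages since the current boot: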
$ journalctl -b
(that alone justifies using the utility).
There’s also the -u flag to see the logs from a specific (systemd) unit (systemctl status gives that as well), -g for grep, and the -f (follow) flag as in tail -f.
A fast shutdown
Maybe the most annoying thing about systemd is that if some process gets stuck, the shutdown waits for it forever. That is, three minutes typically. To fix this, edit both /etc/systemd/system.conf and /etc/systemd/user.conf so that they say
DefaultTimeoutStopSec=5s
DefaultTimeoutAbortSec=5s
This typically requires uncommenting the assignment for DefaultTimeoutStopSec and adding the latter. The result of this setting is a reduction of the delay to 10 seconds (these two add up).
A reboot is required for this to take effect.
What makes a systemd service run (on boot)
- It’s enabled, which means that there’s a symbolic link from some /etc/systemd/*/*.wants directory to the unit file. In the example below, it’s a .path file, so it activates a watch on the path specified, but if it’s a service, it’s kicked off at boot
# systemctl enable foo.path
Created symlink from /etc/systemd/system/paths.target.wants/foo.path to /etc/systemd/system/foo.path
- In the unit file symlinked to, there’s an [Install] section, which says when it should be kicked off with the WantedBy directive. More precisely, which target or service should be active. Once again, for a plain .service unit file, this kicks off the service, and for e.g. a .path file, this starts the monitoring. See man systemd.special for a list of built-in targets; custom targets can be created as well, of course. The most common target for “run me at boot” is multi-user.target. Dependency on services is expressed by using the service name with a .service suffix instead.
- By default, unit files such as .path files kick off the .service unit with the same non-suffix part (this can be changed with the Unit= directive in the file, but why?)
See /etc/systemd/system/multi-user.target.wants for a list of services that are activated on boot. In particular note that not all are symlinks to .service unit files.
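As a minimal sketch of the points above, this is roughly what a foo.path / foo.service pair could look like. Only the foo.path unit name comes from the example above; the watched directory and the script are made-up names for illustration.

```ini
# --- /etc/systemd/system/foo.path ---
[Unit]
Description=Watch a directory for incoming files (illustrative)

[Path]
# Hypothetical directory; the unit triggers when it becomes non-empty
DirectoryNotEmpty=/var/spool/foo
# No Unit= directive here, so foo.service is the unit that gets activated

[Install]
# "systemctl enable foo.path" creates the symlink in paths.target.wants
WantedBy=paths.target

# --- /etc/systemd/system/foo.service ---
[Unit]
Description=Handle files found by foo.path (illustrative)

[Service]
Type=oneshot
# Hypothetical script
ExecStart=/usr/local/bin/handle-foo
```

Note that foo.service has no [Install] section in this sketch: it isn’t enabled by itself, but started by foo.path when the condition is met.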
General memo jots
- Always run the systemctl daemon-reload command after creating new unit files or modifying existing unit files. Otherwise, the systemctl start or systemctl enable commands could fail due to a mismatch between states of systemd and actual service unit files on disk.
- Services are best run in the foreground. Unlike classic UNIX services, there’s no point in daemonizing. All processes belonging to the relevant service are enclosed in a Cgroup anyhow, and systemd handles the daemonizing for Type=simple services. In a clean and uniform manner.
- Unit files’ suffixes indicate their type. When the non-suffix parts of two files are the same, this indicates a functional relationship. For example, systemd-tmpfiles-clean.timer says when to launch systemd-tmpfiles-clean.service. Likewise, systemd-ask-password-console.path gives the path to be watched, and systemd-ask-password-console.service is the service to fire off.
- After= doesn’t imply a dependency, and Requires= doesn’t imply the order of starting services. Both are needed if one service depends on the other running when it starts.
- The Type= directive’s main influence is determining when the service is active, i.e. when other services that depend on it can be launched.
- There’s also “loginctl” which lists the current users and their sessions
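To illustrate the notes on After=, Requires= and Type= above, here is a sketch of a service that needs another service to be running when it starts. The unit names myapp.service and mydb.service, the binary and its --foreground flag are all hypothetical.

```ini
# /etc/systemd/system/myapp.service (hypothetical unit)
[Unit]
Description=Illustrative service that needs a database
# Requires= pulls in the other unit when this one is started
Requires=mydb.service
# After= only sets the order; without it, both would be started in parallel
After=mydb.service

[Service]
# The process stays in the foreground. With Type=simple, the unit counts
# as started as soon as the process has been launched
Type=simple
ExecStart=/usr/local/bin/myapp --foreground

[Install]
WantedBy=multi-user.target
```

How long systemd waits before considering mydb.service active depends on that unit’s own Type= setting, which is why the choice of Type= matters mostly to the units that depend on it.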
Where to find unit files
The configuration files are considered in the following order (later overrules earlier):
- /lib/systemd/system — where installation scripts write to
- /run/systemd/system — runtime files
- /etc/systemd/system — per-system user preferences
Per-user files can be found in ~/.config/systemd/user and possibly also in ~/.local/share/systemd/user.
Keeping the service under control
The control on services is quite impressive. Both container virtualization and systemd use Cgroups, so there’s a bit of container flavor to this whole thing.
From man systemd.exec:
- User= to run as a certain user. This also sets the group information of the user, so there’s no need to use Group= in addition to this.
- WorkingDirectory= for setting the cwd of the service (there’s also RootDirectory= for a chroot).
- NoNewPrivileges= to prevent privileges elevation. The easy and efficient way, according to the man page.
- SecureBits= set to noroot, to prevent the process from gaining root. More fine-grained than NoNewPrivileges.
- A whole lot of Limit*= assignments for limiting resource usage
- OOMScoreAdjust= for making it less or more eligible for the OOM killer
- ProtectSystem= and ProtectHome= for preventing access to certain directories from the processes in the control group.
- ReadWritePaths=, ReadOnlyPaths=, InaccessiblePaths= are more fine-grained in choosing in which directories the service is allowed to do what.
- PrivateTmp= for creating private /tmp and /var/tmp for the service.
- Environment= (and possibly EnvironmentFile=) for setting environment variables
- StandardInput= allows feeding the process’ stdin with data from a file (or other sources, e.g. ttys, sockets and more), or with literal data from the unit file, with StandardInputText= or StandardInputData=
- StandardOutput= and StandardError= define where the respective outputs go. Default to the system journal.
It’s also possible to create runtime directories that are removed when the service terminates (e.g. RuntimeDirectory=).
There is a whole range of other sandboxing options, including disabling networking (leaving lo only). It’s also possible to restrict system calls.
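Put together, a [Service] section using several of these directives might look something like the sketch below. The user, paths, binary and environment variable are invented for the sake of the example, and the values are arbitrary. Not all of these options exist in older systemd versions, so check man systemd.exec on the target system.

```ini
[Service]
# Hypothetical user, directory, binary and variable
User=myservice
WorkingDirectory=/var/lib/myservice
ExecStart=/usr/local/bin/myservice
Environment=MYSERVICE_MODE=production

# Privileges and resources
NoNewPrivileges=yes
LimitNOFILE=1024
# Positive values make the service a more likely victim of the OOM killer
OOMScoreAdjust=500

# File system restrictions
ProtectSystem=strict
ProtectHome=yes
PrivateTmp=yes
ReadWritePaths=/var/lib/myservice

# Creates /run/myservice, which is removed when the service stops
RuntimeDirectory=myservice

# The service sees only a private loopback interface
PrivateNetwork=yes
```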
KillMode
By default, KillMode=control-group, so all processes in the group are killed with the signal specified in KillSignal (defaults to SIGTERM). systemd then sends a SIGKILL to whatever is left after TimeoutStopSec seconds, assuming that SendSIGKILL=yes, which is the default and definitely recommended setting (see man systemd.kill).
Setting KillMode=mixed is like control-group, but the initial SIGTERM is only sent to the main process. This is useful if it catches this signal and shuts down the other processes nicely. And if it doesn’t, the big hammer goes on all processes after TimeoutStopSec.
I’m not clear on what happens if SIGKILL doesn’t really kill some process (due to e.g. being stuck in an uninterruptible sleep). I guess the service would be considered stopped anyhow, but it appears like this isn’t documented.
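A sketch of how these settings could be combined for a service whose main process cleans up after its children. The binary name is hypothetical and the timeout is an arbitrary choice.

```ini
[Service]
# Hypothetical main process that catches SIGTERM and stops its children
ExecStart=/usr/local/bin/supervisor
# SIGTERM goes to the main process only; a later SIGKILL goes to the whole group
KillMode=mixed
KillSignal=SIGTERM
# Grace period before the SIGKILL is sent
TimeoutStopSec=10s
SendSIGKILL=yes
```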
User Systemd?
User-mode systemd is in principle the same as the mainstream services, with the main difference that they are intended to run with a specific user ID, and while that user is logged in. So the original idea behind this concept is to have certain processes running in the background while this user has a session (i.e. is logged in) and turn them off when this user logs out. This is an excellent page on the matter.
But if the service is supposed to run regardless of whether the user is logged in or not, it’s typically wiser to make it a regular systemd service, and set the User= assignment in the unit file to select the relevant user for execution. The only advantage with User Systemd is if that user needs the capability to make changes in the service unit files, and doesn’t have root on the computer. So I opted out of this.
It’s important to note however that the processes generated by systemd don’t belong to a session, and they don’t have the environment variables set by .bashrc or anything of that sort. They run independently. Their only relation with the user logging in is whether they’re alive or not, and even that isn’t always true (see notes on lingering below).
Another important thing is that WantedBy (in the service unit file’s Install section) should be set to default.target, and not to multi-user.target as with a system-wide service. And that makes sense: The latter target is something related to the entire system.
And then there’s “lingering”, which means that the user service runs even when the user isn’t logged in. Effectively, the service turns into a regular service, kicked off on boot (if enabled), just with user privileges and with the definition files put in a user directory. To do this, go
# loginctl enable-linger username
This makes the login manager kick off the services as soon as it’s started — that is, at boot.
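For reference, a minimal user unit along these lines might look as follows. The unit name and the executable are made up for the example.

```ini
# ~/.config/systemd/user/mytask.service (hypothetical name)
[Unit]
Description=Illustrative per-user background task

[Service]
# %h expands to the user's home directory
ExecStart=%h/bin/mytask
Restart=on-failure

[Install]
# The user manager's default.target, not multi-user.target
WantedBy=default.target
```

Such a unit is handled with systemctl --user (e.g. systemctl --user enable mytask.service), run as the user in question and not as root.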
Enabling console autologin on tty1 and ttyPS0
Following this page and as explained on this page, add /etc/systemd/system/getty@tty1.service.d/autologin.conf (after creating the getty@tty1.service.d directory) as follows:
[Service]
ExecStart=
ExecStart=-/sbin/agetty -a root --noclear %I $TERM
Note that the filename autologin.conf has no significance. It’s the suffix and the directory it resides in that matter.
The idea is to override the ExecStart parameter given in /lib/systemd/system/getty@.service template unit, which reads
ExecStart=-/sbin/agetty --noclear %I $TERM
but otherwise have it running the same. The reason for two ExecStart lines is that the empty assignment clears the existing assignment (or otherwise it would have been added on top of it), and the second sets the command.
Note the %I substitute parameter, which stands for the instance of the current tty.
The leading dash means that the exit value of the command is ignored, and may be nonzero without the unit considered as failed (see man systemd.service).
This can’t be repeated with ttyPS0, because systemd goes another way for setting up the serial console: At an early stage in the boot, systemd-getty-generator automatically sets up a service, serial-getty@ttyPS0.service, which is an instance of the /lib/systemd/system/serial-getty@.service template unit.
# systemctl status
[ ... ]
│ ├─system-serial\x2dgetty.slice
│ │ └─serial-getty@ttyPS0.service
│ │ └─2337 /sbin/agetty --keep-baud 115200 38400 9600 ttyPS0 vt220
So the solution is adding /etc/systemd/system/serial-getty@ttyPS0.service.d/autologin.conf saying
[Service]
ExecStart=
ExecStart=-/sbin/agetty -a root --keep-baud 115200,38400,9600 %I $TERM
Disable renaming of Ethernet interfaces
Ditch those tedious Ethernet interface names (e.g. enp3s0, enp0s31f6, wlp2s0, really, come on), and bring back the good old eth0, eth1 etc. Note that the kernel still assigns the good old interface names. It’s systemd that renames them. So the trick is simple: Mask the default naming policy, which causes this to happen:
# ln -s /dev/null /etc/systemd/network/99-default.link
(try man systemd.link for more info, and there’s a lot, including how to change the MAC address)
And update the initramfs:
# update-initramfs -u
It won’t work without updating the initramfs, because the network interface names are set way before the root filesystem is mounted. The files are copied into the initramfs’ /lib/systemd/network/ directory (note that it’s /lib, even though they’re saved in /etc on the root filesystem. Which makes sense, because on the initramfs there’s no point with the /lib vs. /etc distinction).
Also, renaming can’t be used to achieve persistent allocation of ethN names (so says the manpage of systemd.link) because of a race with the kernel’s name assignments. As of kernel v4.15, that is.
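For those who do want a custom name, a .link file might look something like the sketch below, going by man systemd.link. The file name, the interface name lan0 and both MAC addresses are placeholders. The chosen name is deliberately not of the ethN form, because of the race mentioned above.

```ini
# /etc/systemd/network/10-lan0.link (hypothetical file name)
[Match]
# Placeholder for the hardware address of the card to match
MACAddress=00:11:22:33:44:55

[Link]
# A name outside the kernel's ethN namespace
Name=lan0
# Optionally assign a different (placeholder) MAC address
MACAddress=02:00:00:00:00:01
```

Like the symlink above, such a file presumably takes effect only after updating the initramfs.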
Resetting failed services
To get the system out of the “degraded” state due to an irrelevant failed service, for example:
● lvm2-pvscan@253:9.service loaded failed failed LVM2 PV scan on device 253:9
Just go (as root)
# systemctl reset-failed
and the failed service disappears, returning the overall status to “running” again (assuming there’s no real problem).
Ditching NetworkManager
I can’t say I was very fond of it ever (not only because of the capital letters in its name), and systemd can now do its job. Odds are that NetworkManager will become history in a matter of a few years, so better give it the boot now:
# systemctl disable NetworkManager
# systemctl disable NetworkManager-wait-online
A word of warning: The GUI for handling Ethernet connection requires NetworkManager, so for better or for worse, it won’t work anymore.
I also ditched ModemManager, as I have a solution for my ADSL modem. But I’m not sure about others.
And enable systemd’s cutie instead
# systemctl enable systemd-networkd
You might want to get rid of the service that waits for networking to be “online”:
# systemctl mask systemd-networkd-wait-online.service
This service runs /lib/systemd/systemd-networkd-wait-online, which is supposed to wait for at least one Ethernet card to be configured. In practice, it waited until its 120 seconds timeout expired, and then reported a failure. As a result, some services that depend on network-online.target were kicked off only after these two minutes, and the overall system’s status was marked as “degraded”.
What does “online” mean, and why is it important? Good question, discussed here. My conclusion: As any contemporary Linux system should be able to tolerate hotplugging of its NICs, there’s no need to wait. Handle them as they appear.
A simple definition file for a NIC can be, for example, /etc/systemd/network/20-eth0.network
[Match]
MACAddress=1c:1b:0d:45:0f:eb

[Network]
DHCP=yes
Note that I detected the card by its MAC address. It’s also possible to select it by its interface name (as shown by ifconfig), and by other methods. But this is safest.
DHCP is “no” by default. Replace it with a line saying e.g.
Address=10.10.10.10/16
for a static IP address. Go “man systemd.network” for the whole set of options. They cover basically everything needed.
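A sketch of what the static variant of the file could look like, this time matching by interface name. The gateway and DNS addresses are placeholders, not taken from a real setup.

```ini
# /etc/systemd/network/20-eth0.network
[Match]
# Matching by interface name instead of MAC address
Name=eth0

[Network]
Address=10.10.10.10/16
# Placeholder gateway and DNS server addresses
Gateway=10.10.0.1
DNS=10.10.0.1
```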
After making changes to such a file (or adding one), go
# systemctl restart systemd-networkd
to make them effective. There’s no need to update the initramfs. In fact, .network files aren’t copied into it (unlike .link files, as said above).
As for responding to hotplugging events of network devices, there’s this post.