Systemd services as cronjobs: No process runs away

This post was written by eli on January 18, 2021
Posted Under: Linux, perl, Server admin, systemd

But why?

A cronjob typically consists of a single utility which we’re pretty confident about. Even if it takes quite some time to complete (updatedb, for example), there’s always a simple story: a single task to complete, with a known beginning and end.

If the task involves a shell script that calls a few utilities, that feeling of control fades. It’s therefore reassuring to know that everything can be cleaned up neatly by simply stopping a service. Systemd is good at that, since all processes involved in the service are kept in a separate cgroup. So when the service is stopped, all processes that were possibly generated eventually get a SIGKILL, typically 90 seconds after the request to stop the service, unless they terminate voluntarily in response to the initial SIGTERM.

Advantage number two is that systemd allows a series of restrictions on what the cronjob is capable of doing, thanks to the cgroup arrangement. This doesn’t fall far short of the possibilities of container virtualization, and it’s done with pretty simple assignments in the unit file: making certain directories inaccessible or read-only, setting up private temporary directories, disallowing external network connections, limiting the set of allowed syscalls, and of course limiting the resources consumed by the service. They’re called Control Groups for a reason.
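
For example, a few such restrictions in the [Service] section could look like this. This is just a sketch with made-up paths and limits, and some of these directives require a systemd considerably newer than 229:

[Service]
# Mount /usr and /boot (and with "full", /etc too) read-only for this service
ProtectSystem=full
# Give the service private /tmp and /var/tmp directories
PrivateTmp=true
# Cut off network access completely
PrivateNetwork=true
# Hide one directory from the service, expose another read-only
InaccessiblePaths=/home/secret
ReadOnlyPaths=/etc
# Allow only the syscalls that common services need
SystemCallFilter=@system-service
# Cap memory consumption and the number of tasks (processes/threads)
MemoryMax=512M
TasksMax=16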

There’s also the RuntimeMaxSec parameter in the service unit file, which is the maximal wall clock time the service is allowed to run. The service is terminated and put in failure state if this time is exceeded. This is however supported from systemd version 229 and later, so check with “systemctl --version”.
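
For example, a sketch of making systemd give up on the job after an hour:

[Service]
RuntimeMaxSec=3600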

My original idea was to use a systemd timer to kick off the job, and let RuntimeMaxSec make sure it would get cleaned up if it ran too long (i.e. got stuck somehow). But because the server in question ran a rather old version of systemd, I went for one cron entry for starting the service and another one for stopping it, with a certain time difference between them. In hindsight, cron turned out to be neater for kicking off the jobs, because I had multiple variants of them running at different times, so one single file enclosed them all.
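
A sketch of the two crontab entries (in /etc/cron.d style, with made-up times and a made-up instance name):

# Kick off the job at 02:00, and stop whatever is left of it at 03:00
0 2 * * * root /bin/systemctl start 'cronjob-test@myjob'
0 3 * * * root /bin/systemctl stop 'cronjob-test@*'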

The main practical difference is that if a service reaches RuntimeMaxSec, it’s terminated with a failed status. The cron solution stops the service without this. I guess there’s a systemctl way to achieve the failed status, if that’s really important.

As a side note, I have a separate post on Firejail, which is yet another possibility to use cgroups for controlling what processes do.

Timer basics

The idea is simple: A service can be started as a result of a timer event. That’s all that timer units do.

Timer units are configured like any systemd units (man systemd.unit) but have a .timer suffix and a dedicated [Timer] section. By convention, the timer unit named foo.timer activates the service foo.service, unless specified differently with the Unit= assignment (useful for generating confusion).

Units that are already running when the timer event occurs are not restarted, but are left to keep running. Exactly like systemctl start would do.

For a cronjob-style timer, use OnCalendar= to specify the times. See man systemd.time for the format. Note that AccuracySec= should be set as well, to control how much systemd is allowed to play with the exact time of execution; otherwise systemd’s behavior might be confusing.
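
For example, a minimal timer unit, say nightly-job.timer (a made-up name), which activates nightly-job.service every night at 02:00:

[Unit]
Description=Nightly job timer

[Timer]
OnCalendar=*-*-* 02:00:00
AccuracySec=1min

[Install]
WantedBy=timers.target

Unlike the services discussed below, the timer unit itself is enabled with “systemctl enable”, so it’s activated on boot.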

To see all active timers, go

$ systemctl list-timers

The unit file

As usual, the unit file (e.g. /etc/systemd/system/cronjob-test@.service) is short and concise:

[Unit]
Description=Cronjob test service

[Service]
ExecStart=/home/eli/shellout/utils/shellout.pl "%I"
Type=simple
User=eli
WorkingDirectory=/home/eli/shellout/utils
KillMode=mixed
NoNewPrivileges=true

This is a simple service, meaning that systemd expects the process launched by ExecStart to run in the foreground.

Note however that the service unit’s file name has a “@” character and that %I is used to choose what to run, based upon the unescaped instance name (see man systemd.unit). This turns the unit file into a template, and allows choosing an arbitrary command (the shellout.pl script is explained below) with something like (really, this works)

# systemctl start cronjob-test@'echo "Hello, world"'

This might seem dangerous, but recall that root privileges are required to start the service, and you get a plain-user process (possibly with no ability to escalate privileges) in return. Not the big jackpot.

For stopping the service, exactly the same service specifier string is required. But it’s also possible to stop all instances of a service with

# systemctl stop 'cronjob-test@*'

How neat is that?

A few comments on this:

  • The service should not be systemd-wise enabled (i.e. no “systemctl enable”) — that’s what you do to get it started on boot or following some kind of event. This is not the case here, as the whole point is to start the service directly by a timer or crond.
  • Accordingly, the service unit file does not have an [Install] section.
  • A side effect of this is that the service may not appear in the list made by “systemctl” (without any arguments) unless it currently has processes running on its behalf (or possibly if it’s in the failed state). Simple logic: It’s not loaded unless it has a cgroup allocated, and the cgroup is removed along with the last process. But it may appear anyhow under some conditions.
  • ExecStart must have a full path (i.e. not relative) even if the WorkingDirectory is set. In particular, it can’t be ./something.
  • A “systemctl start” on a service that is marked as failed will be started anyhow (i.e. the fact that it’s marked failed doesn’t prevent that). Quite obvious, but I tested it to be sure.
  • Also, a “systemctl start” causes the execution of ExecStart if and only if there’s no cgroup for the service, which is equivalent to not having a process running on its behalf.
  • KillMode is set to “mixed”, which sends a SIGTERM only to the process that is launched directly when the service is stopped. The SIGKILL 90 seconds later, if any, is however sent to all processes. The default is to send the SIGTERM to all processes in the cgroup when stopping.
  • NoNewPrivileges is a little paranoid thing: When no process has any reason to change its privileges or user IDs, block this possibility. This mitigates damage, should the job be successfully attacked in some way. But I ended up not using it, as it makes running sendmail fail (sendmail has some setuid thing to allow access to the mail spooler).

Stopping

There is no log entry for a service of simple type that terminates with a success status. Even though it’s stopped in the sense that it has no allocated cgroup and “systemctl start” behaves as if it was stopped, a successful termination is silent. Not sure if I like this, but that’s the way it is.

When the process doesn’t respond to SIGTERM:

Jan 16 19:13:03 systemd[1]: Stopping Cronjob test service...
Jan 16 19:14:33 systemd[1]: cronjob-test.service stop-sigterm timed out. Killing.
Jan 16 19:14:33 systemd[1]: cronjob-test.service: main process exited, code=killed, status=9/KILL
Jan 16 19:14:33 systemd[1]: Stopped Cronjob test service.
Jan 16 19:14:33 systemd[1]: Unit cronjob-test.service entered failed state.

So there’s always “Stopping” first and then “Stopped”. And if there are processes in the control group 90 seconds after “Stopping”, SIGKILL is sent, and the service gets a “failed” status. Not being able to quit properly is a failure.

A “systemctl stop” on a service that is already stopped is legit: The systemctl utility returns silently with a success status, and a “Stopped” message appears in the log without anything actually taking place. Neither does the service’s status change, so if it was considered failed before, so it remains. And if the target to stop was a group of instances (e.g. systemctl stop ‘cronjob-test@*’) and there were no instances to stop, there isn’t even a log message about it.

Same logic with “Starting” and “Started”: A superfluous “systemctl start” does nothing except for a “Started” log message, and the utility is silent, returning success.

Capturing the output

By default, the output (stdout and stderr) of the processes is logged in the journal. This is usually pretty convenient; however, I wanted the good old cronjob behavior: An email is sent unless the job is completely silent and exits with a success status (actually, crond doesn’t care about the exit status, but I wanted this too).

This concept doesn’t fit systemd’s spirit: You don’t start sending mails each time a service has something to say. One could use OnFailure= for activating another service that calls home when the service gets into a failure status (which includes a non-success termination of the main process), but that mail won’t tell me the output. To achieve this, I wrote a Perl script. So there’s one extra process, but who cares, systemd kills ’em all in the end anyhow.
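
For reference, the OnFailure hookup would look something like this in the service’s [Unit] section. Here, notify-failure@.service is a hypothetical template unit that would send the mail, and %n expands to the full name of the failing unit:

[Unit]
Description=Cronjob test service
OnFailure=notify-failure@%n.service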

Here it comes (I called it shellout.pl):

#!/usr/bin/perl

use strict;
use warnings;

# Parameters for sending mail to report errors
my $sender = 'eli';
my $recipient = 'eli';
my $sendmail = "/usr/sbin/sendmail -i -f$sender";

my $cmd = shift;
my $start = time();

my $output = '';

my $catcher = sub { finish("Received signal."); };

$SIG{HUP} = $catcher;
$SIG{TERM} = $catcher;
$SIG{INT} = $catcher;
$SIG{QUIT} = $catcher;

my $pid = open (my $fh, '-|');

finish("Failed to fork: $!")
  unless (defined $pid);

if (!$pid) { # Child process
  # Redirect stderr to stdout for child processes as well
  open (STDERR, ">&STDOUT");

  exec($cmd) or die("Failed to exec $cmd: $!\n");
}

# Parent
while (defined (my $l = <$fh>)) {
  $output .= $l;
}

close $fh
 or finish("Error: $! $?");

finish("Execution successful, but output was generated.")
 if (length $output);

exit 0; # Happy end

sub finish {
  my ($msg) = @_;

  my $elapsed = time() - $start;

  $msg .= "\n\nOutput generated:\n\n$output\n"
    if (length $output);

  open (my $fh, '|-', "$sendmail $recipient")
    or die("Failed to run sendmail: $!\n"); # die, not finish(), to avoid endless recursion

  print $fh <<"END";
From: Shellout script <$sender>
Subject: systemd cron job issue
To: $recipient

The script with command \"$cmd\" ran $elapsed seconds.

$msg
END

  close $fh
    or die("Failed to send email: $! $?\n");

  $SIG{TERM} = sub { }; # Not sure this matters
  kill -15, $$; # Kill entire process group

  exit(1);
}

First, let’s pay attention to

open (STDERR, ">&STDOUT");

which makes sure standard error is redirected to standard output. This is inherited by child processes, which is exactly the point.

The script catches the signals (SIGTERM in particular, which is systemd’s first hint that it’s time to pack and leave) and sends a SIGTERM to all other processes in turn. This is combined with KillMode being set to “mixed” in the service unit file, so that only shellout.pl gets the signal, and not the other processes.

The rationale is that if all processes get the signal at once, it may (theoretically?) turn out that the child process terminates before the script has reacted to the signal it got itself, so it would fail to report that the reason for the termination was a signal, rather than the child terminating on its own. This could miss a situation where the child process got stuck and said nothing when being killed.

Note that the script kills all processes in the process group just before quitting due to a signal it got, or when the invoked process terminates and there was output. Before doing so, it sets the signal handler to a NOP, to avoid an endless loop, since the script’s process will get it as well (?). This NOP thing appears to be unnecessary, but better safe than sorry.

Also note that the while loop quits when there’s nothing more to read from <$fh>. This means that if the child process forks and then terminates, the while loop will continue: unless the forked process closes its standard output, it keeps the write end of the pipe open, so the script never sees an EOF. The first child process will remain a zombie until the forked process is done. Only then will it be reaped by virtue of the close $fh. This machinery is not intended for fork() sorcery.
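
To illustrate this point, here’s a hypothetical demo (not part of shellout.pl): the parent keeps reading even after its direct child has exited, because the grandchild still holds the write end of the pipe open.

#!/usr/bin/perl
use strict;
use warnings;

# Forking open: the child's stdout is connected to $fh
my $pid = open(my $fh, '-|') // die("Failed to fork: $!");

if (!$pid) { # Child process
  if (fork()) {
    exit 0; # The child exits immediately, remaining a zombie for a while
  }
  # The grandchild inherited the write end of the pipe, keeping it open
  sleep 5;
  print "grandchild finished\n";
  exit 0;
}

# Parent: this sees EOF only when the grandchild exits, about 5 seconds later
print while <$fh>;
close $fh; # Reaps the zombie child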

I took a different approach in another post of mine, where the idea was to fork explicitly and modify the child’s attributes. Another post discusses timing out a child process in general.

Summary

Yes, cronjobs are much simpler. But in the long run, it’s a good idea to acquire the ability to run cronjobs as services for the sake of keeping the system clean from runaway processes.
