Apache 2.4: RewriteRule with [NE] causing 500 Internal Server Error

This is the weirdest thing. With an Apache 2.4.10 on Linux Debian 8 (yes, old), and a relatively simple mod_rewrite rule in .htaccess going

RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule (.*) https://www.mysite.com/$1 [R=301,L,NE]

This is really nothing special. Just pushing users to the www host name, if they were lazy typing it.

This works almost perfectly, but then I have this thing about trying odd stuff. So I get an internal server error when I request the page https://mysite.com/%01, but not mysite.com/%1 and not mysite.com/%20. In other words, I get a server error when there’s a legal percent-escape of a character in the %01-%1f range. Not with %00, and not with %20 and up, and not when there’s just one digit. It’s really a matter of a proper percent escape.

This %-escape doesn’t have to be in the beginning of the string. https://mysite.com/mything%01.html is equally offensive. But it appears like a “?” neutralizes the problem — there is no server error if a “?” appeared before the offending %-escape. Recall that characters after a “?” are normally not escaped.

And of course, there is no problem accessing https://www.mysite.com/mything%01.html (getting a 404, but fine), as the rule doesn’t come to effect.

The internal server error leaves no traces in neither the error log nor the access log. So there’s no hint there. Neither in the journalctl log. No process dies either.

The problem is clearly the NE flag (marked red above), telling mod_rewrite not to escape the the string it rewrites. Why this is an issue is beyond me. I could try asking in forums and such, but I have hunch on the chances getting an answer.

Exactly the same behavior is observed when this rule is in an Apache configuration file (in <Virtualhost> context).

So I just dropped the NE flag. Actually, this is a note to self explaining why I did it.

One could argue that it’s pretty harmless to get an error on really weird URL requests, however often a serious vulnerability begins with something that just went a little wrong. So I prefer to stay away.

If anyone has insights on this matter, please comment below. Maybe it has been silently fixed in later versions of Apache?

Systemd services as cronjobs: No process runs away

But why?

Cronjobs typically consists of a single utility which we’re pretty confident about. Even if it takes quite some time to complete (updatedb, for example), there’s always a simple story, a single task to complete with a known beginning and end.

If the task involves a shell script that calls a few utilities, that feeling of control fades. It’s therefore reassuring to know that everything can be cleaned up neatly by simple stopping a service. Systemd is good at that, since all processes that are involved in the service are kept in a separate cgroup. So when the service is stopped, all processes that were possibly generated eventually get a SIGKILL, typically 90 seconds after the request to stop the service, unless they terminated voluntarily in response to the initial SIGTERM.

Advantage number two is that the systemd allows for a series of capabilities to limit what the cronjob is capable of doing, thanks to the cgroup arrangement. This doesn’t fall very short from the possibilities of container virtualization, with pretty simple assignments in the unit file. This includes making certain directories inaccessible or accessible for read-only, setting up temporary directories, disallow external network connection, limit the set of allowed syscalls, and of course limit the amount of resources that are consumed by the service. They’re called Control Groups for a reason.

There’s also the RuntimeMaxSec parameter in the service unit file, which is the maximal wall clock time the service is allowed to run. The service is terminated and put in failure state if this time is exceeded. This is however supported from systemd version 229 and later, so check with “systemctl –version”.

My original idea was to use systemd timers to kick off the job, and let RuntimeMaxSec make sure it would get cleaned up if it ran too long (i.e. got stuck somehow). But because the server in question ran a rather old version of systemd, I went for a cron entry for starting the service and another one for stopping it, with a certain time difference between them. In hindsight, cron turned to be neater for kicking off the jobs, because I had multiple variants of them in different times. So one single file enclosed all.

The main practical difference is that if a service reaches RuntimeMaxSec, it’s terminated with a failed status. The cron solution stops the service without this. I guess there’s a systemctl way to achieve the failed status, if that’s really important.

As a side note, I have a separate post on Firejail, which is yet another possibility to use cgroups for controlling what processes do.

Timer basics

The idea is simple: A service can be started as a result of a timer event. That’s all that timer units do.

Timer units are configured like any systemd units (man systemd.unit) but have a .timer suffix and a dedicated [Timer] section. By convention, the timer unit named foo.timer activates the service foo.service, unless specified differently with the Unit= assignment (useful for generating confusion).

Units that are already running when the timer event occurs are not restarted, but are left to keep running. Exactly like systemctl start would do.

For an cronjob-style timer, use OnCalendar= to specify the times. See man systemd.time for the format. Note that AccuracySec= should be set too to control how much systemd can play with the exact time of execution, or systemd’s behavior might be confusing.

The unit file

As usual, the unit file (e.g. /etc/systemd/system/cronjob-test@.service) is short and concise:

[Unit]
Description=Cronjob test service

[Service]
ExecStart=/home/eli/shellout/utils/shellout.pl "%I"
Type=simple
User=eli
WorkingDirectory=/home/eli/shellout/utils
KillMode=mixed
NoNewPrivileges=true

This is a simple service, meaning that systemd expects the process launched by ExecStart to run in the foreground.

Note however that the service unit’s file name has a “@” character and that %I is used to choose what to run, based upon the unescaped instance name (see main systemd.unit). This turns the unit file into a template, and allows choosing an arbitrary command (the shellout.pl script is explained below) with something like (really, this works)

# systemctl start cronjob-test@'echo "Hello, world"'

This might seems dangerous, but recall that root privileges are required to start the service, and you get a plain-user process (possibly with no ability to escalate privileges) in return. Not the big jackpot.

For stopping the service, exactly the same service specifier string is required. But it’s also possible to stop all instances of a service with

# systemctl stop 'cronjob-test@*'

How neat is that?

A few comments on this:

  • The service should not be systemd-wise enabled (i.e. no “systemctl enable”) — that’s what you do to get it started on boot or following some kind of event. This is not the case, as the whole point is to start the service directly by a timer or crond.
  • Accordingly, the service unit file does not have an [Install] section.
  • A side effect of this is that the service may not appear in the list made by “systemctl” (without any arguments) unless it has processes running on its behalf currently running (or possibly if it’s in the failed state). Simple logic: It’s not loaded unless it has a cgroup allocated, and the cgroup is removed along with the last process. But it may appear anyhow under some conditions.
  • ExecStart must have a full path (i.e. not relative) even if the WorkingDirectory is set. In particular, it can’t be ./something.
  • A “systemctl start” on a service that is marked as failed will be started anyhow (i.e. the fact that it’s marked failed doesn’t prevent that). Quite obvious, but I tested it to be sure.
  • Also, a “systemctl start” causes the execution of ExecStart if and only if there’s no cgroup for it, which is equivalent to not having a process running on its behalf
  • KillMode is set to “mixed” which sends a SIGTERM only to the process that is launched directly when the service is stopped. The SIGKILL 90 seconds later, if any, is sent to all processes however. The default is to give all processes in the cgroup the SIGTERM when stopping.
  • NoNewPrivileges is a little paranoid thing: When no process has any reason to change its privileges or user IDs, block this possibility. This mitigates damage, should the job be successfully attacked in some way. But I ended up not using it, as running sendmail fails (it has some setuid thing to allow access to the mail spooler).

Stopping

There is no log entry for a service of simple type that terminates with a success status. Even though it’s stopped in the sense that it has no allocated cgroup and “systemctl start” behaves as if it was stopped, a successful termination is silent. Not sure if I like this, but that’s the way it is.

When the process doesn’t respond to SIGTERM:

Jan 16 19:13:03 systemd[1]: Stopping Cronjob test service...
Jan 16 19:14:33 systemd[1]: cronjob-test.service stop-sigterm timed out. Killing.
Jan 16 19:14:33 systemd[1]: cronjob-test.service: main process exited, code=killed, status=9/KILL
Jan 16 19:14:33 systemd[1]: Stopped Cronjob test service.
Jan 16 19:14:33 systemd[1]: Unit cronjob-test.service entered failed state.

So there’s always “Stopping” first and then “Stopped”. And if there are processes in the control group 90 seconds after “Stopping”, SIGKILL is sent, and the service gets a “failed” status. Not being able to quit properly is a failure.

A “systemctl stop” on a service that is already stopped is legit: The systemctl utility returns silently with a success status, and a “Stopped” message appears in the log without anything actually taking place. Neither does the service’s status change, so if it was considered failed before, so it remains. And if the target to stop was a group if instances (e.g. systemctl stop ‘cronjob-test@*’) and there were no instances to stop, there’s even not a log message on that.

Same logic with “Starting” and “Started”: A superfluous “systemctl start” does nothing except for a “Started” log message, and the utility is silent, returning success.

Capturing the output

By default, the output (stdout and stderr) of the processes is logged in the journal. This is usually pretty convenient, however I wanted the good old cronjob behavior: An email is sent unless the job is completely silent and exits with a success status (actually, crond doesn’t care, but I wanted this too).

This concept doesn’t fit systemd’s spirit: You don’t start sending mails each time a service has something to say. One could use OnFailure for activating another service that calls home when the service gets into a failure status (which includes a non-success termination of the main process), but that mail won’t tell me the output. To achieve this, I wrote a Perl script. So there’s one extra process, but who cares, systemd kills’em all in the end anyhow.

Here it comes (I called it shellout.pl):

#!/usr/bin/perl

use strict;
use warnings;

# Parameters for sending mail to report errors
my $sender = 'eli';
my $recipient = 'eli';
my $sendmail = "/usr/sbin/sendmail -i -f$sender";

my $cmd = shift;
my $start = time();

my $output = '';

my $catcher = sub { finish("Received signal."); };

$SIG{HUP} = $catcher;
$SIG{TERM} = $catcher;
$SIG{INT} = $catcher;
$SIG{QUIT} = $catcher;

# Redirect stderr to stdout for child processes as well
open (STDERR, ">&STDOUT");

open (my $fh, '-|', $cmd) or
  finish("Failed to fork: $!");

while (defined (my $l = <$fh>)) {
  $output .= $l;
}

close $fh
  or finish("Error: $! $?");

finish("Execution successful, but output was generated.")
  if (length $output);

exit 0; # Happy end

sub finish {
  my ($msg) = @_;

  my $elapsed = time() - $start;

  $msg .= "\n\nOutput generated:\n\n$output\n"
    if (length $output);

  open (my $fh, '|-', "$sendmail $recipient") or
    finish("Failed to run sendmail: $!");

  print $fh <<"END";
From: Shellout script <$sender>
Subject: systemd cron job issue
To: $recipient

The script with command \"$cmd\" ran $elapsed seconds.

$msg
END

  close $fh
    or die("Failed to send email: $! $?\n");

  $SIG{TERM} = sub { }; # Not sure this matters
  kill -15, $$; # Kill entire process group

  exit(1);
}

First, let’s pay attention to

open (STDERR, ">&STDOUT");

which makes sure standard error is redirected to standard output. This is inherited by child processes, which is exactly the point.

The script catches the signals (SIGTERM in particular, which is systemd’s first hint that it’s time to pack and leave) and sends a SIGTERM to all other processes in turn. This is combined with KillMode being set to “mixed” in the service unit file, so that only shellout.pl gets the signal, and not the other processes.

The rationale is that if all processes get the signal at once, it may (theoretically?) turn out that the child process terminates before the script reacted to the signal it got itself, so it will fail to report that the reason for the termination was a signal, as opposed to the termination of the child. This could miss a situation where the child process got stuck and said nothing when being killed.

Note that the script kills all processes in the process group just before quitting due to a signal it got, or when the invoked process terminates and there was output. Before doing so, it sets the signal handler to a NOP, to avoid an endless loop, since the script’s process will get it as well (?). This NOP thing appears to be unnecessary, but better safe than sorry.

Also note that the while loop quits when there’s nothing more in <$fh>. This means that if the child process forks and then terminates, the while loop will continue, because unless the forked process closed its output file handles, it will keep the reference count of the script’s stdin above zero. The first child process will remain as a zombie until the forked process is done. Only then will it be reaped by virtue of the close $fh. This machinery is not intended for fork() sorcery.

I took a different approach in another post of mine, where the idea was to fork explicitly and modify the child’s attributes. Another post discusses timing out a child process in general.

Summary

Yes, cronjobs are much simpler. But in the long run, it’s a good idea to acquire the ability to run cronjobs as services for the sake of keeping the system clean from runaway processes.

Perl + DBI: Measuring the time of database queries

It’s often desired to know how much wall clock time the SQL queries take. As with Perl, there’s more than one way to do it. This is a simple way, which involves overriding DBI’s execute() method, so it measures the time and issues a warn() with basic caller info and the time in milliseconds.

The same thing can be done with any other method, of course. Go “man DBI” and look for “Subclassing the DBI” for the full story.

So the first thing is to define the RootClass property on the DBI object generation, so that MySubDBI is the root class. Something like

$dbh = DBI->connect( "DBI:mysql::localhost", "thedatabase", "thepassword",
		     { RaiseError => 1, AutoCommit => 1, PrintError => 0,
		       RootClass => 'MySubDBI',
		       Taint => 1});

and then, somewhere, the class needs to be defined. This can be in a separate module .pm file, but also at the end of the same file as the code for the DBI->connect:

package MySubDBI;
use strict;

use DBI;
use vars qw(@ISA);
@ISA = qw(DBI);

package MySubDBI::db;
use vars qw(@ISA);
@ISA = qw(DBI::db);

# Empty, yet necessary.

package MySubDBI::st;
use Time::HiRes qw(gettimeofday tv_interval);
use vars qw(@ISA);
@ISA = qw(DBI::st);

sub execute {
  my ($sth, @args) = @_;
  my $tic = [gettimeofday];
  my $res = $sth->SUPER::execute(@args);
  my $exectime = int(tv_interval($tic)*1000);

  my ($package0, $file0, $line0, $subroutine0) = caller(0);
  my ($package1, $file1, $line1, $subroutine1) = caller(1);

  warn("execute() call from $subroutine1 (line $line0) took $exectime ms\n");
  return $res;
}

1;

The code can be smarter, of course. For example, issue a warning only if the query time exceeds a certain limit.

MySQL: LIKE vs. REGEXP() performance testing

Introduction

At some point I needed to choose between using LIKE or REGEXP() for a not so simple string match. Without going into the details, the matching string contains a lot of wildcard segments, and while both would have done the job, I thought maybe REGEXP() would benefit from some extra information about the wildcard parts. It’s not that I cared that ‘%’ would match characters it shouldn’t, but I wanted so save some backtracking by telling the matching engine not to match just anything. Save some CPU, I thought. Spoiler: It was a nice thought, but no.

So I ran a few performance tests on a sample table:

CREATE TABLE product_info (
	cpe_name BLOB NOT NULL,
	title BLOB NOT NULL,
	PRIMARY KEY (cpe_name(511))
) engine = MyISAM;

Note that cpe_name is the primary key, and is hence a unique index.

I should also mention that the match patterns I use are for testing are practically useless for CPE name matching because they don’t handle escaped “:” characters properly. Just in case you know what a CPE is and you’re here for that. In short, these are just performance tests.

I did this on MySQL server version: 5.1.47, Source distribution. There are newer versions around, I know. Maybe they do better.

The art of silly queries

So there’s about more than half a million entries:

mysql> SELECT COUNT(*) FROM product_info;
+----------+
| COUNT(*) |
+----------+
|   588094 |
+----------+
1 row in set (0.00 sec)

Let’s ask it another way:

mysql> SELECT COUNT(*) FROM product_info WHERE cpe_name LIKE '%';
+----------+
| COUNT(*) |
+----------+
|   588094 |
+----------+
1 row in set (0.08 sec)

Say what? LIKE ‘%’ is always true for a non-NULL BLOB. MySQL didn’t optimize this simple thing, and actually checked every single entry?

mysql> EXPLAIN SELECT COUNT(*) FROM product_info;
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT COUNT(*) FROM product_info WHERE cpe_name LIKE '%';
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | product_info | ALL  | NULL          | NULL | NULL    | NULL | 588094 | Using where |
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

Apparently it did. Other silly stuff:

mysql> SELECT COUNT(*) FROM product_info WHERE NOT cpe_name IS NULL;
+----------+
| COUNT(*) |
+----------+
|   588094 |
+----------+
1 row in set (0.08 sec)

mysql> EXPLAIN SELECT COUNT(*) FROM product_info WHERE NOT cpe_name IS NULL;
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | product_info | ALL  | PRIMARY       | NULL | NULL    | NULL | 588094 | Using where |
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

Silly or what? The column is defined as “NOT NULL”. What is there to check? So maybe the idea is that if I make stupid queries, MySQL responds with stupid behavior. Well, not really:

mysql> SELECT COUNT(*) FROM product_info WHERE 1=1;
+----------+
| COUNT(*) |
+----------+
|   588094 |
+----------+
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT COUNT(*) FROM product_info WHERE 1=1;
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

It’s more like MySQL makes optimizations only if they’re really obvious. MySQL != gcc.

LIKE vs REGEXP

So it turns out that LIKE is really fast, and it seems to take advantage of the index:

mysql> SELECT cpe_name FROM product_info WHERE cpe_name LIKE 'cpe:2.3:a:hummingbird:cyberdocs:-%';
+-------------------------------------------------+
| cpe_name                                        |
+-------------------------------------------------+
| cpe:2.3:a:hummingbird:cyberdocs:-:*:*:*:*:*:*:* |
+-------------------------------------------------+
1 row in set (0.00 sec)

The same query, only with a regex:

mysql> SELECT cpe_name FROM product_info WHERE cpe_name REGEXP '^cpe:2.3:a:hummingbird:cyberdocs:-.*';
+-------------------------------------------------+
| cpe_name                                        |
+-------------------------------------------------+
| cpe:2.3:a:hummingbird:cyberdocs:-:*:*:*:*:*:*:* |
+-------------------------------------------------+
1 row in set (0.21 sec)

Recall that indexing means that the rows are sorted. Because the initial part of the string is fixed, it’s possible to narrow down the number of rows to match with an index lookup, and then work on those that match that initial part. This was taken advantage of with the LIKE match, but apparently not with REGEXP():

mysql> EXPLAIN SELECT cpe_name FROM product_info WHERE cpe_name LIKE 'cpe:2.3:a:hummingbird:cyberdocs:-%';
+----+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table        | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | product_info | range | PRIMARY       | PRIMARY | 513     | NULL |    1 | Using where |
+----+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT cpe_name FROM product_info WHERE cpe_name REGEXP '^cpe:2.3:a:hummingbird:cyberdocs:-.*';
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | product_info | ALL  | NULL          | NULL | NULL    | NULL | 588094 | Using where |
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

The conclusion is quite clear: For heavy duty matching, don’t use REGEXP() if LIKE can do the job. In particular if a lot of rows can be ruled out by virtue of the first characters in the string.

Making it harder for LIKE

Let’s warm up a bit:

mysql> SELECT cpe_name FROM product_info WHERE cpe_name LIKE 'cpe:2.3:a:humming%:cyberdocs:-%';
+-------------------------------------------------+
| cpe_name                                        |
+-------------------------------------------------+
| cpe:2.3:a:hummingbird:cyberdocs:-:*:*:*:*:*:*:* |
+-------------------------------------------------+
1 row in set (0.00 sec)

It was quicker than measurable, mainly because there was little to do:

mysql> EXPLAIN SELECT cpe_name FROM product_info WHERE cpe_name LIKE 'cpe:2.3:a:humming%:cyberdocs:-%';
+----+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table        | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | product_info | range | PRIMARY       | PRIMARY | 513     | NULL |   80 | Using where |
+----+-------------+--------------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)

As expected, the index was used for the initial part of the string, and that left MySQL with 80 rows to actually do the matching (in this specific case). So it was quick.

But what if the wildcard is at the beginning of the string?

mysql> SELECT cpe_name FROM product_info WHERE cpe_name LIKE '%cpe:2.3:a:hummingbird:cyberdocs:-%';
+-------------------------------------------------+
| cpe_name                                        |
+-------------------------------------------------+
| cpe:2.3:a:hummingbird:cyberdocs:-:*:*:*:*:*:*:* |
+-------------------------------------------------+
1 row in set (0.10 sec)

So that took some considerable time, however still less than the REXEXP() case. The index didn’t help in this case, it seems:

mysql> EXPLAIN SELECT cpe_name FROM product_info WHERE cpe_name LIKE '%cpe:2.3:a:hummingbird:cyberdocs:-%';
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | product_info | ALL  | NULL          | NULL | NULL    | NULL | 588094 | Using where |
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

So apparently, the pattern was applied to all rows. It was still faster than REGEXP() making a very simple match. So the latter seems not to be optimized in MySQL.

And to wrap this up, an expression full with wildcards. Similar to the one I thought maybe REXEXP would do better:

mysql> SELECT COUNT(*) FROM product_info WHERE cpe_name LIKE 'cpe:2.3:a:%:%:%:-:%:%:%:%:%:%';
+----------+
| COUNT(*) |
+----------+
|    13205 |
+----------+
1 row in set (0.11 sec)

mysql> EXPLAIN SELECT COUNT(*) FROM product_info WHERE cpe_name LIKE 'cpe:2.3:a:%:%:%:-:%:%:%:%:%:%';
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | product_info | ALL  | PRIMARY       | NULL | NULL    | NULL | 588094 | Using where |
+----+-------------+--------------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

So what happened here is that the initial part of the string didn’t help much, and the match seems to have been done on all rows. It took more or less the same time as the much simpler match pattern above.

Conclusion

There seems to be two main conclusions from this little set of experiments: The first one isn’t surprising: The most important factor is how many rows are being accessed, not so much what is done with them. And the second is that MySQL does some sensible optimizations when LIKE is used, in particular it narrows down the number of rows with the index, when possible. Something it won’t do with REGEXP(), even if with an “^” at the beginning of the regex.

MySQL: Enforcing long unique BLOB / TEXT columns with UNIQUE INDEX

Introduction

When creating indexes to TEXT / BLOB columns (and their variants), it’s required to specify how many characters the index should cover. In MySQL’s docs it’s usually suggested to keep them short in for better performance. There’s also a limit on the number of characters, which varies from one database engine to another, going from a few hundreds to a few thousands characters.

However it’s not unusual to use UNIQUE INDEX for making the database enforce the uniqueness of a field in the table. ON DUPLICATE KEY, INSERT IGNORE, UPDATE IGNORE and REPLACE can then be used to gracefully keep things tidy.

But does UNIQUE INDEX mean that the entire field remains unique, or is only the part covered by the index checked? Spoiler: Only the part covered by the index. In other words, MySQL’s uniqueness enforcement may be too strict.

Almost needless to say, if the index isn’t UNIQUE, it’s just a performance issue: If the number of covered characters is small, the database will more often fetch and discard the data of a row that turns out not to match the search criteria. But it doesn’t change the logic behavior. With UNIQUE INDEX, the number of characters does.

A simple test follows.

Trying the UNIQUE INDEX

I created a table with MySQL 5.1.47, with an index covering only 5 chars. Deliberately very short.

CREATE TABLE try (
	id INT UNSIGNED NOT NULL AUTO_INCREMENT,
	message BLOB NOT NULL,
	PRIMARY KEY (id),
	UNIQUE INDEX (message(5))
) engine = MyISAM;

Which ends up with this:

mysql> DESCRIBE try;
+---------+------------------+------+-----+---------+----------------+
| Field   | Type             | Null | Key | Default | Extra          |
+---------+------------------+------+-----+---------+----------------+
| id      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| message | blob             | NO   | UNI | NULL    |                |
+---------+------------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

Inserting the first row:

mysql> INSERT INTO try(message) VALUES('Hello, world');
Query OK, 1 row affected (0.00 sec)

And now trying a second:

mysql> INSERT INTO try(message) VALUES('Hello there');
ERROR 1062 (23000): Duplicate entry 'Hello' for key 'message'

That’s it. It just looked at the first five chars. Trying a difference within this region:

mysql> INSERT INTO try(message) VALUES('Hell there');
Query OK, 1 row affected (0.00 sec)

No wonder, that worked.

mysql> SELECT * FROM try;
+----+--------------+
| id | message      |
+----+--------------+
|  1 | Hello, world |
|  2 | Hell there   |
+----+--------------+
2 rows in set (0.00 sec)

Handling large TEXT/BLOB

And that leaves the question: How is it possible to ensure uniqueness on large chunks of text or binary data? One solution I can think of is to add a column for a hash (say SHA1), and let the application calculate that hash for each row it inserts, and insert it along with it. And make the UNIQUE INDEX on the hash, not the text. Something like

CREATE TABLE try2 (
	id INT UNSIGNED NOT NULL AUTO_INCREMENT,
	message BLOB NOT NULL,
	hash TINYBLOB NOT NULL,
	PRIMARY KEY (id),
	UNIQUE INDEX (hash(40))
) engine = MyISAM;

But wait. MySQL supports hashing functions. Why not use them instead? Well, the problem is that if I want an INSERT statement push the data and its hash in one go they query becomes a bit nasty. What came to my mind is:

mysql> INSERT INTO try2(message, hash) (SELECT t.x, SHA1(t.x) FROM (SELECT 'Hello there' AS x) AS t);
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

Can it get more concise than this? Suggestions are welcome. The double SELECT is required because I want the string literal to be mentioned once.

Isn’t it easier to let the application calculate the SHA1, and send it to the server by value? It’s a matter of taste, I guess.

Anyhow, trying again with exactly the same:

mysql> INSERT INTO try2(message, hash) (SELECT t.x, SHA1(t.x) FROM (SELECT 'Hello there' AS x) AS t);
ERROR 1062 (23000): Duplicate entry '726c76553e1a3fdea29134f36e6af2ea05ec5cce' for key 'hash'

and with something slightly different:

mysql> INSERT INTO try2(message, hash) (SELECT t.x, SHA1(t.x) FROM (SELECT 'Hello there!' AS x) AS t);
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

So yep, it works:

mysql> SELECT * FROM try2;
+----+--------------+------------------------------------------+
| id | message      | hash                                     |
+----+--------------+------------------------------------------+
|  1 | Hello there  | 726c76553e1a3fdea29134f36e6af2ea05ec5cce |
|  2 | Hello there! | 6b19cb3790b6da8f7c34b4d8895d78a56d078624 |
+----+--------------+------------------------------------------+
2 rows in set (0.00 sec)

Once again, even though the example I showed demonstrates how to make MySQL calculate the hash, I would do it in the application.

Decoding email’s quoted-printable with Perl

To make it short, the command at shell prompt is

$ perl -MMIME::QuotedPrint -e 'local $/; $x=<>; print decode_qp($x)' < quoted.txt > unquoted.html

and I needed this to extract an HTML segment of an email.

Pretty-printing data structures with Perl with Data::Dumper

As Perl allows for complicated data structures by virtue of references to hashes and arrays, it’s often useful to look into what’s going on there. In my case, it was on the output of a large JSON parse.

So to make a long story short, if $tree is a reference to the data structure, go

use Data::Dumper;
$Data::Dumper::Terse = 1;
$Data::Dumper::Purity = 1;
$Data::Dumper::Sortkeys = 1;
print Dumper $tree;

Noted how many flags? Data::Dumper isn’t always ideal for pretty-printing (there are a few alternatives) but it wins mainly because it’s part of the commonly installed Perl libraries. One of its drawbacks, which is also its advantage, is that its output is Perl code that reconstructs the data structure. Which means that it fusses with accuracy, in particular if the data structure contains blessed references.

So I definitely miss a Data::Dumper::Hooman or something of that sort.

And even more annoyingly, if it meets a complicated value more than one (e.g. a blessed refs), it puts a reference to the first appearance of the same value in the following times. Which is efficient, maybe, but doesn’t help for reading by a human.

So to the flags.

The Sortkeys flag is recommended for human reading (as well as diffs) for obvious reasons.

The Terse flag makes sure that values are dumped as literals and not referenced. For this to work, the Purity flag was also necessary in my case, or the Terse flag was simply ignored silently. The latter adds a lot of assignments at the end of the dump to fix inaccuracies.

The problem was that one of the fields was blessed as JSON::PP::Boolean, so the dump read

'acInsufInfo' => bless( do{\(my $o = 0)}, 'JSON::PP::Boolean' ),

and then it was referenced over and over again. With the Purity flag, the references appeared at the end of the dump, to correct the inaccurate (non-blessed) assignment before. It seems like without it, Dumper refused to respect the Terse flag for these, because it would break the concept that the dump can be executed to reconstruct the original.

Closing note: It’s quite unusual that mainstream Perl libraries behave in a quirky way that actually needs explanation.

Random notes about Gimp 2.10

This is just a few notes to self on this tool.

  • Version 2.10 is a big deal: It’s the first one to use GPU acceleration for scaling and rotation, as well as deforming of different types. No more small previews, no more sluggish response to scaling. No need to scale down images for working, because it was too slow.
  • The default theme (dark) is a joke. It’s elegant, but it’s difficult to see the differences between dark grey and lighter grey, which is the difference between a layer lock being enabled or not. I went for “system”, which got me Mint colors and nice icon spacing. Doesn’t look very elegant, but hey, I now know what I’m doing.
  • The default icons are a joke as well. Very sleek, but the colors are gone. Once again, it’s elegance vs. functionality. There’s a “color” option for the icon. Looks like a children’s toolbox now, but it’s much easier to find the correct tool.

Linked layers

First and foremost: Layer groups is most likely the preferred solution. In particular, I used to link between a layer with a layer mask and its duplicate with the same layer mask applied in order to resize / rotate the layer along its layer mask. This isn’t necessary anymore: Just put the layer with a layer mask inside a group, and manipulate the group instead of the layer. This is quicker than this duplicate, apply and link method, but there’s one feature still lacking: The ability to crop the layer to which the mask has been applied, and manipulate a smaller area.

Linked layers doesn’t seem to work on 2.10.14 — this seems to be a bug: Say, when linking two layers and scaling / rotating one of them, nothing happens with the other.

Solution / workaround: There are two options I know of:

  • Don’t use linked layers: Put the relevant layers in a layer group, and do the scaling / rotation on the group. But that doesn’t work well for manipulating a small layer in order to apply the changes on a larger one.
  • Link between a layer group (possibly with one layer) and the layers that are affected by it. This really looks like a bug workaround: Linking layers to a layer group will make the changes to the layer group apply to all other layers. But only if the manipulations are made to the layer group. Which is more or less fine, because the normal way to work is to manipulate some specific layer, and the others follow. So put that layer in a group of its own, and link the others with the group.

FPGA + USB 3.0: Cypress EZ-USB FX3 or XillyUSB?

Introduction

As the title implies, this post compares two solutions for connecting an FPGA to a host via USB 3.0: Cypress’ FX3 chipset, which has been around since around 2010, and the XillyUSB IP core, which was released in November 2020.

Cypress has been acquired by Infineon, but I’ll stick with Cypress. It’s not clear if the products are going to be re-branded (like Intel did with Altera, for example).

Since I’m openly biased towards XillyUSB, let’s be fair enough and start with its disadvantages. The first and obvious one is how long it’s been around compared with the FX3. Another thing is that XillyUSB won’t fall back to USB 2.0 if a USB 3.0 link fails to establish. This fallback option is important in particular because computer’s USB 3.x ports are sometimes of low quality, so even though the user expected to benefit from USB 3.x speed, the possibility to plug the device into a non-USB 3.x port can save the day.

This is however relevant only for applications that are still useful with USB 2.0, e.g. hard disk, USB sticks and Ethernet adapters — these still work, but do benefit from a faster connection when possible. If the application inherently needs payload speeds above 25 MBytes/s, it’s USB 3.0 or perish.

Thirdly, XillyUSB requires an FPGA with an MGT supporting 5 Gb/s. Low-cost FPGAs don’t. But from a BOM cost point of view, odds are that upgrading the FPGA costs less than adding the FX3 device along with its supporting components.

Finally, a not completely related comment: USB is good for hotpluggable, temporary connections. If a fixed link is required between an FPGA and some kind of computer, PCIe is most likely a better choice, possibly using Xillybus’ IP core for PCIe. Compared with USB 2.0, it might sound like a scary option, and PCIe isn’t always supported by embedded devices. But if USB 3.x is an option, odds are that PCIe is too. And a better one, unless hotplugging is a must.

FX3: Another device, another processor, another API and SDK

XillyUSB is an IP core, and hence resides side-by-side with the application logic on the FPGA. It requires a small number of pins for its functionality: Two differential wire pairs to the USB connector, and an additional pair of wires to a low-jitter reference clock. A few GPIO LEDs are recommended for status indications, but are not mandatory. The chances for mistakes in the PCB design are therefore relatively slim.

By contrast, using the FX3 requires following a 30+ pages hardware design application note (Cypress’ AN70707) to ensure proper operation of that device. As for FPGA pin consumption, a minimum of 40 pins is required to attain 400 MB/s of data exchange through a slave FIFO (e.g. 200 MB/s in each direction, half the link capacity), since the parallel data clock is limited to 100 MHz.

It doesn’t end there: The FX3 contains an ARM9 processor for which firmware must be developed. This firmware may produce USB traffic by itself, or configure the device to expose a slave FIFO interface for streaming data from and to the FPGA. This way or another, code for the ARM processor needs to be developed in order to carry out the desired configuration, at a minimum.

This is done with Cypress’ SDK and based upon coding examples, but there’s no way around this extra firmware task, which requires detailed knowledge on how the device works. For example, to turn off the FX3′s LPM capability (which is a good idea in general), the CyU3PUsbLPMDisable() API function should be called. And there are many more of this sort.

Interface with application logic in the FPGA

XillyUSB follows Xillybus’ paradigm regarding interface with application logic: There’s a standard synchronous FIFO between the application logic and the XillyUSB IP core for each data stream, and the application logic uses it mindlessly: For an FPGA-to-host stream, the application logic just pushes the data into the FIFO (checking that it’s not full), knowing it will reach the host in a timely manner. For the opposite direction, it reads from the FIFO when it’s non-empty.

In other words, the application logic interfaces with these FIFOs like FPGA designers are used to, for the sake of streaming data between different functional modules in a design. There is no special attention required because the destination or source of the data is a USB data link.

The FX3′s slave FIFO interface may sound innocent, but it’s a parallel data and control signal interface, allowing the FPGA to issue read and write commands on buffers inside the FX3. This requires developing logic for a controller that interfaces with the slave FIFO interface: Selection of the FX3 buffer to work with, sense its full or empty status (depending on the direction) and transfer data with this synchronous interface. If more than one data stream is required between the FPGA and the host, this controller also needs to perform scheduling and multiplexing. State machines, buffering of data, arbitration, the whole thing.

Even though a controller of this sort may seem trivial, it’s often this type of logic that is exposed to corner cases regarding flow of data: The typical randomness of data availability on one side and the ability to receive it on the other, creates scenarios that are difficult to predict, simulate and test. Obtaining a bulletproof controller of this sort is therefore often significantly more difficult than designing one for a demo.

When working with XillyUSB (or any other Xillybus IP core), the multiplexing is done inside the IP core: Designed, tested and fine polished once and for all. And this opens for another advantage: Making changes to the data stream setting, and adding streams to an existing design is simple and doesn’t jeopardize the stability of the already existing logic. Thanks to Xillybus’ IP Core Factory, this only requires some simple operations on the website and downloading the new IP core. Its deployment in the FPGA design merely consists of replacing files, making trivial changes in the HDL following a template, and adding a standard FPGA FIFO for the new stream. Nothing else in the logic design changes, so there are no side effects.

Host software design

The FX3′s scope in the project is to present a USB device. The driver has to be written more or less from scratch. So the host software, whether as a kernel driver or a libusb user-space implementation, must be written with USB transfers as the main building block. For a reasonable data rate (or else why USB 3.0?), the software design must be asynchronous: Requests are queued for submission, and completer functions are called when these requests are completed. The simple wait-until-done method doesn’t work, because this leads to long time gaps of no communication on the USB link. Aside from the obvious impact on bandwidth utilization, this is likely to cause overflows or underflows in the FPGA’s buffers.

With XillyUSB (and once again, with other Xillybus IP cores too), a single, catch-all driver presents pipe-like device files. Plain command-line utilities like “cat” and “dd” can be used to implement reliable and practical data acquisition and playback. The XillyUSB IP core and the dedicated driver use the transfer-based USB protocol for creating an abstraction of a simple, UNIX-like data stream.

FPGA application logic: USB transfers or continuous data?

The USB specification was written with well-defined transfers in mind. The underlying idea was that the host allocates a buffer and queues a data transfer request, related to a certain USB endpoint, to or from that buffer. For continuous communication, several transfers can be queued. Yet, there are data buffers of fixed size, each waiting for its turn.

Some data sinks and sources are naturally organized in defined chunks of data, and fit USB’s concept well. From a software design’s point of view, it’s simpler to comprehend a mechanism that relies on fixed-sized buffers, requests and fulfillments.

But then, what is natural in an FPGA design? In most applications, continuous, non-packeted data is the common way. Even video applications, where there’s a clear boundary between frames, are usually implemented with regular FIFOs between the internal logic block. With XillyUSB, this is the way the data flows: FIFOs on the FPGA and pipe-like device files on the host side.

With FX3, on the other hand, the USB machinery needs direct attention. For example: When transmitting data towards the host, FX3′s slave FIFO interface requires asserting PKTEND# in order to commit the data to the host, which may also issue a zero-length packet instead. This complication is necessary to maintain USB’s concept of a transfer: Sending a USB DATA packet shorter than the maximal allowed length tells the host that the transfer is finished, even if the buffer that was allocated for the transfer isn’t filled. Therefore, the FX3 can’t just send whatever data it has in the buffer because it has nothing better to do. Doing so would terminate the transfer, which can mean something in the protocol between the driver and its device.

But then, if the transfer request buffer’s size isn’t a multiple of the maximal USB DATA packet size (1024 bytes for USB 3.0), PKTEND# must be asserted before this buffer fills, or a USB protocol error occurs, as the device sends more data than can be stored. The USB protocol doesn’t allow the leftovers to be stored in the next queued transfer’s buffer, and it’s not even clear if such transfer is queued.

If this example wasn’t clear because of too much new terminology, no problem, that was exactly the point: The USB machinery one needs to be aware of.

Physical link diagnostics

As a USB device can be connected to a wide range of USB host controllers, on various motherboards, through a wide range of USB cables, the quality of the bitstream link may vary. On a good day it’s completely error-free, but sometimes it’s a complete mess.

Low-level errors don’t necessarily cause immediate problems, and sometimes the visible problems don’t look like a low-level link issue. The USB protocol is designed to keep the show running to the extent possible (retransmits and whatnot), so what appears to be occasional problems with a USB device could actually be a bad link all the time, with random clusters of mishaps that make the problem become visible, every now and then.

Monitoring the link’s health is therefore beneficial, both in a lab situation, but nevertheless in a product. The application software can collect error event information, and warn the user that even though all seems well, it’s advisable to try a different USB port or cable. Sometimes, that’s all it takes.

XillyUSB provides a simple means for telling something is wrong. There’s an output from the IP core, intended for a plain LED that flashes briefly for each error event that is detected. There are more detailed LEDs as well. Also, the XillyUSB driver creates a dedicated device file, from which diagnostic data can be read with a simple file operation. This diagnostic data chunk mainly consists of event counters for different error situations, which can be viewed with a utility that is downloaded along with XillyUSB’s driver for Linux. Likewise, a simple routine in an application suite can perform this monitoring for the sake of informing users about a problematic hardware setting.

Cypress’ FX3 does provide some error information of this sort, however this is exposed to the ARM processor inside the device itself. The SDK supplies functions such as CyU3PUsbInitEventLog() for enabling event logging and CyU3PUsbGetErrorCounts() for obtaining error count, but it’s the duty of the ARM’s firmware to transfer this data to the host. And then some kind of driver and utility are needed on the host as well.

The documentation for error counting is somewhat minimal, but looking at the definition of LNK_PHY_ERROR_CONF in the EZ-USB FX3 Technical Reference Manual helps.

Bugs and Errata

As always when evaluating a component for use, it’s suggested to read through the errata section in FX3′s datasheet. In particular, there’s a known problem causing errors in payload data towards the host, for which there is no planned fix. It occurs when a Zero Length Packet is followed by data “very quickly”, i.e. within a microframe of 125μs.

So first, 125μs isn’t “very quickly” in USB 3.0 terms. It’s the time corresponding to 62.5 kBytes of raw bandwidth of the link, which is a few dozens of DATA IN packets. Second, a zero length packet is something that is sent to finish a USB transfer. One can avoid it in some situations, but not in others. For example, if the transfer’s length is a multiple of 1024 bytes, the only way to finish it explicitly is with a zero length packet. The said errata requires not sending any data for 125 μs after such event, or there will be data errors.

This doesn’t just make the controller more complicated, but there’s a significant bandwidth penalty.

It may not be worth much saying that XillyUSB doesn’t have any bug of this sort, as it has been extensively tested with randomized data sources and sinks. It’s in fact quite odd that Cypress obviously didn’t perform tests of this sort (or they would have caught that bug easily).

The crucial difference is however that bugs in an IP core can be fixed and deployed quickly. There is no new silicon device to release, and no need to replace a physical device on the PCB.

No design is born perfect. The question is to what extent the issues that arise are fixed.

Creating a tarball for distribution (without user/group information)

A tarball is the common way to convey several files on UNIX systems. But because tar was originally intended for backup, it stores not only the permission information, but also the owner and group of each file. Try listing the content of a tarball with e.g.

$ tar -tzvf thestuff.tar.gz

Note the “v” flag that goes along with the flag for listing, “t”: It causes tar to print out ownership and permission information.

This doesn’t matter much if the tarball is extracted as a non-root user on the other end, because tar doesn’t set the user and group ID in that case: The extracted files get the uid/gid of the process that extracted them.

However if user at the other end extract the tarball as root, the original uid/gid is assigned, which may turn out confusing.

To avoid this, tell tar to assign user root to all files in the archive. This makes no difference if the archive is extracted by a non-root user, but sets the ownership to root if extracted by root. In fact, it sets the ownership to the extracting user in both cases, which is what one would expect.

So this is the command to use to create an old-school .tar.gz tarball:

$ tar --owner=0 --group=0 -czf thestuff.tar.gz thestuff

Note that you don’t have to be root to do this. You’re just creating a plain file with your own ownership. It’s extracting these file as root that requires root permissions (if so desired).