Solved: netcat (nc) doesn’t terminate at end of transmission
Introduction
I often use netcat to transmit chunks of data between two Linux machines. I usually go something like
$ pv backup-image.iso | nc -l 1234
on one machine, and then maybe
# nc 10.1.2.3 1234 > /dev/sdb1
on the other. This is an example of using another machine to write data onto a USB disk-on-key, because writing to any /dev/sdX on my main computer scares me too much.
But it doesn’t quit when it finishes
So after a couple of hours of operation the transfer has obviously finished, but with certain pairs of computers, neither side quits the netcat program. So it’s not clear whether the very last piece of data was fully written to the target.
Immediate thing to try
If you’re stuck like this after a long netcat transmission, this might save you: Press CTRL-D on the console of the computer receiving the data. If you’re lucky, this releases netcat on both sides.
Why this happens
This was written in July 2018, reflecting the netcat versions I’ve come across.
netcat opens a bidirectional TCP link, passing one side’s stdin to the other side’s stdout and vice versa. When netcat is faced with an EOF on its standard input, it may or may not close the sending direction of its TCP connection, depending on which version of netcat it is. If it does close the sending direction, a TCP FIN packet is sent to the other side. The netcat program on the other side receives this FIN packet, and may or may not quit, once again depending on which version of netcat it is. If it does quit, it returns a FIN packet, closing the connection altogether.
So we have two “may or may not”, leading to four possibilities of how it all behaves.
The example above works if (and only if) the sending side sends a FIN when it sees EOF on its stdin, and that causes the other side’s netcat to quit, closing the TCP connection completely (sending a FIN packet back), which causes the first netcat to quit as well. And all is good.
Well, no. Formally, this is actually wrong behavior: Considering netcat to be a bidirectional link (this isn’t used a lot, but still), closing one direction shouldn’t cause the closing of the other. Maybe there’s data waiting for transmission in the opposite direction. It’s perfectly legal, and quite commonplace, to transmit data on a half-closed TCP link.
This is probably why recent revisions of netcat will not quit on receiving a FIN packet alone, but only when no more data can flow in either direction: after receiving both a FIN on the incoming TCP stream and an EOF on its stdin.
Also, recent netcat revisions ignore the EOF on their stdin until the FIN arrives, unless the -q flag is given (which is not supported by earlier versions, but neither is it needed there). This causes a potential deadlock: Even if both sides have received an EOF, neither will quit, because neither has sent the FIN packet. The -q flag solves this.
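To make the FIN mechanics above concrete, here is a minimal Python sketch over a loopback TCP connection. It is an illustration only, not netcat's actual code: it assumes that shutting down the write half of a socket is roughly what a netcat that honors EOF on stdin does, and the function name half_close_demo is made up for this example.

```python
import socket


def half_close_demo():
    """Show that a TCP link still carries data one way after a FIN the other way."""
    # Listen on an ephemeral loopback port (port 0 lets the kernel pick one).
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        sender = socket.create_connection(listener.getsockname())
        receiver, _ = listener.accept()

    with sender, receiver:
        sender.sendall(b"payload")
        # Close only the sending direction: the kernel emits a FIN here.
        sender.shutdown(socket.SHUT_WR)

        received = b""
        while True:
            chunk = receiver.recv(4096)
            if not chunk:  # an empty read means the peer's FIN has arrived
                break
            received += chunk

        # The opposite direction is still open, so a reply can be sent.
        receiver.sendall(b"ack")
        reply = sender.recv(4096)
    return received, reply


if __name__ == "__main__":
    print(half_close_demo())
```

The receiving end sees the FIN as an empty read, yet can still send data back. Whether a given netcat quits at that point or keeps the link open is exactly the version-dependent choice discussed above.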
Does it matter which side is listening?
I haven’t read the sources (of which revision should I read?), but after quite some experiments I got the impression that the behavior is symmetric: Client and server behave exactly the same way. Doesn’t matter which side was listening and which was initiating the TCP connection.
So what to do
Since there are two different paradigms of netcat out there, there’s no catch-all solution. For each pair of machines, test your netcat pair before starting a heavy operation. Possibly on a short file, possibly by typing data on the console. Be sure both sides quit at the end.
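The "be sure both sides quit" check can be scripted. The following is a rough Python sketch, not a tool that ships with any netcat: the helper name quits_within and the file name in the usage comment are hypothetical, and the nc arguments are placeholders for whatever the machines in question need.

```python
import subprocess
import sys


def quits_within(cmd, seconds, stdin_data=b""):
    """Run cmd, feed it stdin_data followed by EOF, and report whether it exits in time."""
    try:
        # run() writes the input, closes stdin (EOF), then waits for the process.
        subprocess.run(cmd, input=stdin_data, stdout=subprocess.DEVNULL, timeout=seconds)
    except subprocess.TimeoutExpired:
        # run() has already killed the child when the timeout expires.
        return False
    return True


if __name__ == "__main__":
    # Usage (hypothetical file name): python3 quit_check.py nc -l 1234
    command = sys.argv[1:]
    if command:
        ok = quits_within(command, seconds=30, stdin_data=b"short test payload\n")
        print("exited" if ok else "still running after 30 s, killed")
```

Running this for the sending command while the receiving command runs on the other machine gives a quick yes/no answer on the sender before committing to a transfer that takes hours. The receiving side still needs to be watched separately.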
One thing that helps is to change the receiving part to e.g.
# nc 10.1.2.3 1234 > /dev/sdb1 < /dev/null
/dev/null supplies an EOF right away. Older netcats will send a FIN on the TCP link immediately on its establishment, so if there’s an old netcat on the other side, both sides quit right away, possibly before any data is sent. But if there are old netcats on both sides, you’re probably not bothered by this issue at all.
Newer netcats do nothing because of this immediate EOF (they ignore it until a FIN arrives), but it allows them to quit when the other side does.
Another thing to do is to add the -q flag on the sending netcat (supported only by the newer netcats). For example:
$ pv backup-image.iso | nc -q 0 -l 1234
The “-q 0” part tells netcat to “quit” immediately after receiving an EOF (or so says the man page). My own anecdotal experiment shows that “-q 0” doesn’t make netcat quit at all, but just makes it send the FIN packet when an EOF arrives. In other words, “-q 0” means “send a FIN packet when EOF arrives on stdin”. Something old netcats do anyhow.
This is good enough to get out of the deadlock mentioned above: When the data stream ends, the sending part sends a FIN because of the “-q 0” flag. The receiving part now has an EOF by virtue of /dev/null, and a FIN from the sending part, so it quits, sending a FIN back. Now the first side has an EOF and a FIN, and quits as well.
Note that the “-q 0” flag is more important than the /dev/null trick: If the receiving side has quit, we know all data has been flushed. It therefore doesn’t matter so much if a CTRL-C is needed to release the sending side. It doesn’t matter, but it doesn’t add a feeling of confidence when the transmission is really important.
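When the transmission is really important, one way to gain some confidence regardless of how netcat exited is to compare a hash of the source file with a hash of the same number of bytes read back from the target. The Python sketch below assumes that approach. The function name sha256_prefix is made up for this example, and the length has to be given explicitly for a device like /dev/sdb1, since the device is typically larger than the image.

```python
import hashlib
import os
import sys


def sha256_prefix(path, length, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the first `length` bytes of `path`."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                raise ValueError(f"{path} is shorter than {length} bytes")
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


if __name__ == "__main__" and len(sys.argv) >= 2:
    # Usage: python3 script.py PATH [LENGTH]
    # Without LENGTH the size of a regular file is used (not meaningful for devices).
    path = sys.argv[1]
    length = int(sys.argv[2]) if len(sys.argv) > 2 else os.path.getsize(path)
    print(length, sha256_prefix(path, length))
```

Run it on the image file on the sending machine, then on the target device with the printed length on the receiving machine, and compare the two digests. A read-back made right after the write may be served from the page cache rather than from the medium itself, so it is probably worth running sync and re-plugging the device first.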
And this brings me back to what I began with: Each pair of computers needs a test before attempting something heavy. Sadly.