<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>my tech blog &#187; Virtualization</title>
	<atom:link href="http://billauer.co.il/blog/category/virtualization/feed/" rel="self" type="application/rss+xml" />
	<link>http://billauer.co.il/blog</link>
	<description>Anything I found worthy to write down.</description>
	<lastBuildDate>Sun, 19 Sep 2021 10:43:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.2</generator>
		<item>
		<title>Using firejail to throttle network bandwidth for wget and such</title>
		<link>http://billauer.co.il/blog/2021/08/firejail-network-bandwidth-limit/</link>
		<comments>http://billauer.co.il/blog/2021/08/firejail-network-bandwidth-limit/#comments</comments>
		<pubDate>Sun, 15 Aug 2021 13:39:17 +0000</pubDate>
		<dc:creator>eli</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Server admin]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://billauer.co.il/blog/?p=6382</guid>
		<description><![CDATA[Introduction Occasionally, I download / upload huge files, and it kills my internet connection for plain browsing. I don&#8217;t want to halt the download or suspend it, but merely calm it down a bit, temporarily, for doing other stuff. And then let it hog as much as it want again. There are many ways to [...]]]></description>
			<content:encoded><![CDATA[<h3>Introduction</h3>
<p>Occasionally, I download / upload huge files, and it kills my internet connection for plain browsing. I don&#8217;t want to halt the download or suspend it, but merely calm it down a bit, temporarily, for doing other stuff. And then let it hog as much as it wants again.</p>
<p>There are many ways to do this, and I went for firejail. I suggest reading <a title="Firejail: Putting a program in its own little container" href="http://billauer.co.il/blog/2020/06/firejail-cgroups/" target="_blank">this post of mine</a> as well on this tool.</p>
<p>Firejail gives you a shell prompt, which runs inside a mini-container, like those cheap virtual hosting services. Then run wget or youtube-dl as you wish from that shell.</p>
<p>It has access to practically everything on the computer, but the network interface is controlled. Since firejail is based on cgroups, all processes and subprocesses are collectively subject to the network bandwidth limit.</p>
<p>Using firejail requires setting up a bridge network interface. This is a bit of container hocus-pocus, and is necessary to get control over the network data flow. But it&#8217;s simple, and it can be done once (until the next reboot, unless the bridge is configured permanently, something I don&#8217;t bother with).</p>
<h3>Setting up a bridge interface</h3>
<p>Remember: Do this once, and just don&#8217;t remove the interface when done with it.</p>
<p>You might need to</p>
<pre># <strong>apt install bridge-utils</strong></pre>
<p>So first, set up a new bridge device (as root):</p>
<pre># <strong>brctl addbr hog0</strong></pre>
<p>and give it an IP address that doesn&#8217;t collide with anything else on the system. Otherwise, it really doesn&#8217;t matter which one:</p>
<pre># <strong>ifconfig hog0 10.22.1.1/24</strong></pre>
<p>What&#8217;s going to happen is that there will be a network interface named eth0 inside the container, which will behave as if it was connected to a real Ethernet card named hog0 on the computer. Hence the container has access to everything that is covered by the routing table (by means of IP forwarding), and is also subject to the firewall rules. With my specific firewall setting, it prevents some access, but ppp0 isn&#8217;t blocked, so who cares.</p>
<p>To remove the bridge (no real reason to do it):</p>
<pre># <strong>brctl delbr hog0</strong></pre>
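<p>As a side note, and not something from firejail&#8217;s docs: brctl and ifconfig come from packages that newer distributions tend to drop, and as far as I can tell, the same bridge can be set up with iproute2 alone (same hog0 name and address as above, run as root):</p>

```shell
# Equivalent bridge setup with iproute2 instead of brctl + ifconfig:
ip link add name hog0 type bridge
ip addr add 10.22.1.1/24 dev hog0
ip link set hog0 up
```

<p>And ip link del hog0 removes it.</p>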
<h3>Running the container</h3>
<p>Launch a shell with firejail (I called it &#8220;nethog&#8221; in this example):</p>
<pre>$ <strong>firejail --net=hog0 --noprofile --name=nethog</strong></pre>
<p>This starts a new shell, for which the bandwidth limit is applied. Run wget or whatever from here.</p>
<p>Note that despite the --noprofile flag, there are still some directories that are read-only and some that are temporary as well. It&#8217;s done in a sensible way, though, so odds are that it won&#8217;t cause any issues. Running &#8220;df&#8221; inside the container gives an idea of what is mounted and how, and it looks scarier than the actual situation.</p>
<p>But <strong>be sure to check that the files that are downloaded are visible outside the container</strong>.</p>
<p>From another shell prompt, <strong>outside the container</strong>, run something like this (it <strong>doesn&#8217;t</strong> require root):</p>
<pre>$ <strong>firejail --bandwidth=nethog set hog0 800 75</strong>
Removing bandwith limit
Configuring interface eth0
Download speed  6400kbps
Upload speed  600kbps
cleaning limits
configuring tc ingress
configuring tc egress
</pre>
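<p>A word about the units, by the way: judging by the output above, the two figures handed to &#8220;set&#8221; (800 and 75 in this example) are taken as kilobytes per second, and firejail reports them back multiplied by 8, in kilobits per second. Just the trivial conversion:</p>

```shell
# The arguments to "set" appear to be in KB/s; firejail's output is in kbps:
DOWN_KBS=800
UP_KBS=75
echo "$(( DOWN_KBS * 8 ))kbps down, $(( UP_KBS * 8 ))kbps up"
# 6400kbps down, 600kbps up -- matching the output above
```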
<p>To drop the bandwidth limit:</p>
<pre>$ <strong>firejail --bandwidth=nethog clear hog0</strong></pre>
<p>And get the status (saying, among others, how many packets have been dropped):</p>
<pre>$ <strong>firejail --bandwidth=nethog status</strong></pre>
<p>Notes:</p>
<ul>
<li>The &#8220;eth0&#8221; mentioned in firejail&#8217;s output blob relates to the interface name <strong>inside</strong> the container. So the &#8220;real&#8221; eth0 remains untouched.</li>
<li>The actual download speed is slightly slower than the configured limit.</li>
<li>New processes can join the existing group with firejail --join, as well as from firetools.</li>
<li>Several containers may use the same bridge (hog0 in the example above), in which case each has its own independent bandwidth setting. Note that the commands configuring the bandwidth limits mention both the container&#8217;s name and the bridge.</li>
</ul>
<h3>Working with browsers</h3>
<p>When starting a browser from within a container, pay attention to whether it really started a new process. Using firetools can help.</p>
<p>If Google Chrome says &#8220;Created new window in existing browser session&#8221;, it <strong>didn&#8217;t</strong> start a new process inside the container, in which case the window isn&#8217;t subject to the bandwidth limitation.</p>
<p>So close all windows of Chrome before kicking off a new one. Alternatively, this can be worked around by starting the container with:</p>
<pre>$ firejail --net=hog0 --noprofile <strong>--private</strong> --name=nethog</pre>
<p>The --private flag creates, among other things, a new <strong>volatile</strong> home directory, so Chrome doesn&#8217;t detect that it&#8217;s already running. Because I use some other disk mounts for the large partitions on my computer, it&#8217;s still possible to download stuff to them from within the container.</p>
<p>But extra care is required with this, and regardless, the new browser doesn&#8217;t remember passwords and such from the private container.</p>
]]></content:encoded>
			<wfw:commentRss>http://billauer.co.il/blog/2021/08/firejail-network-bandwidth-limit/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Firejail: Putting a program in its own little container</title>
		<link>http://billauer.co.il/blog/2020/06/firejail-cgroups/</link>
		<comments>http://billauer.co.il/blog/2020/06/firejail-cgroups/#comments</comments>
		<pubDate>Thu, 11 Jun 2020 03:39:11 +0000</pubDate>
		<dc:creator>eli</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Server admin]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://billauer.co.il/blog/?p=6049</guid>
		<description><![CDATA[Introduction Firejail is a lightweight security utility which ties the hands of running processes, somewhat like Apparmor and SELinux. However it takes the mission towards Linux kernel&#8217;s cgroups and namespaces. It&#8217;s in fact a bit of a container-style virtualization utility, which creates sandboxes for running specific programs: Instead of a container for an entire operating [...]]]></description>
			<content:encoded><![CDATA[<h3>Introduction</h3>
<p>Firejail is a lightweight security utility which ties the hands of running processes, somewhat like Apparmor and SELinux. However, it tackles this mission with the Linux kernel&#8217;s cgroups and namespaces. It&#8217;s in fact a bit of a container-style virtualization utility, which creates sandboxes for running specific programs: Instead of a container for an entire operating system, it makes one for each application (i.e. the main process and its children). Rather than disallowing access to files and directories by virtue of permissions, it simply makes sure they aren&#8217;t visible to the processes. Same goes for networking.</p>
<p>By virtue of cgroups, several security restrictions are also put in place, whether desired or not. Certain syscalls can be prevented, etc. But at the end of the day, think container virtualization. A sandbox is created, and everything happens inside it. It&#8217;s also easy to add processes to an existing sandbox (in particular, to start a new shell). Not to mention the joy of shutting down a sandbox, that is, killing all processes inside it.</p>
<p>While the main use of Firejail is to protect the file system from access and tampering by malicious or infected software, it also allows more or less everything that a container-style virtual machine does: Control of network traffic (volume, dedicated firewall, which physical interfaces are exposed) as well as activity (how many subprocesses, CPU and memory utilization, etc.). And like a virtual machine, it also allows collecting statistics on resource usage.</p>
<p>Plus spoofing the host name, restricting access to sound devices, X11 capabilities and a whole range of stuff.</p>
<p>And here&#8217;s the nice thing: It <strong>doesn&#8217;t require root</strong> privileges to run. Sort of. The firejail executable is run with setuid.</p>
<p>It&#8217;s however important to note that firejail <strong>doesn&#8217;t create a stand-alone container</strong>. Rather, it mixes and matches files from the real file system and overrides selected parts of the directory tree with temporary mounts. Or overlays. Or whiteouts.</p>
<p>In fact, compared with the accurate rules of a firewall, its behavior is quite loose and inaccurate. For a newbie, it&#8217;s a bit difficult to predict exactly what kind of sandbox it will set up given this or that setting. It throws all kinds of files of its own into the temporary directories it creates, which is very helpful for getting things up and running quickly, but it doesn&#8217;t give a feeling of control.</p>
<p>Generally speaking, everything that isn&#8217;t explicitly handled by blacklisting or whitelisting (see below) is accessible in the sandbox just like outside it. In particular, it&#8217;s the user&#8217;s responsibility to hide away all those system-specific mounted filesystems (do you call them /mnt/storage?). If desired, of course.</p>
<p><strong>Major disclaimer:</strong> <strong>This post is not authoritative</strong> in any way, and contains my jots as I get to know the beast. In particular, I may mislead you to think something is protected even though it&#8217;s not. You&#8217;re responsible for your own decisions.</p>
<p>The examples below are with firejail version 0.9.52 on Linux Mint 19.</p>
<h3>Install</h3>
<pre># apt install firejail
# apt install firetools</pre>
<p>By all means, go</p>
<pre>$ man firejail</pre>
<p>after installation. It&#8217;s also worth looking at /etc/firejail/ to get an idea of what protection measures are typically used.</p>
<h3>Key commands</h3>
<p>Launch FireTools, a GUI front end:</p>
<pre>$ firetools &amp;</pre>
<p>And the &#8220;Tools&#8221; part has a nice listing of running sandboxes (right-click the ugly thing that comes up).</p>
<p>Now some command line examples. I name the sandboxes in these examples, but I&#8217;m not sure it&#8217;s worth bothering.</p>
<p>List existing sandboxes (or use FireTools, right-click the panel and choose Tools):</p>
<pre>$ firejail --list</pre>
<p>Assign a name to a sandbox when creating it:</p>
<pre>$ firejail --name=mysandbox firefox</pre>
<p>Shut down a sandbox (kill all its processes, and clean up):</p>
<pre>$ firejail --shutdown=mysandbox</pre>
<p>If a name wasn&#8217;t assigned, the PID given in the list can be used instead.</p>
<p>Disallow the root user in the sandbox:</p>
<pre>$ firejail --noroot</pre>
<p>Create an overlay filesystem (mounts read/write, but changes are kept elsewhere):</p>
<pre>$ firejail --overlay firefox</pre>
<p>There&#8217;s also --overlay-tmpfs for a volatile (tmpfs-backed) overlay, as well as --overlay-clean to clean the persistent overlays, which are stored in $HOME/.firejail.</p>
<p>To create a completely new home directory (and /root) as temporary filesystems (private browsing style), so they are volatile:</p>
<pre>$ firejail --private firefox</pre>
<p>Better still,</p>
<pre>$ firejail --private=/path/to/extra-homedir firefox</pre>
<p>This uses the directory in the given path as a <strong>persistent</strong> home directory (some basic files are added automatically). This path can be anywhere in the filesystem, even in parts that are otherwise hidden (i.e. blacklisted) to the sandbox. So this is probably the most appealing choice in most scenarios.</p>
<p>Don&#8217;t get too excited, though: Other mounted filesystems remain unprotected (at different levels). This just protects the home directory.</p>
<p>By default, a whole bunch of security rules are loaded when firejail is invoked. To start the container without this:</p>
<pre>$ firejail --noprofile</pre>
<p>A profile can be selected with the --profile=filename flag.</p>
<h3>Writing a profile</h3>
<p>If you really want to have a sandbox that protects your computer with relation to a specific piece of software, you&#8217;ll probably have to write your own profile. It&#8217;s no big deal, except that it&#8217;s a bit of trial and error.</p>
<p>First read the manpage:</p>
<pre>$ man firejail-profile</pre>
<p>It&#8217;s easiest to start from a template: Launch FireTools from a shell, right-click the ugly thing that comes up, pick &#8220;Configuration Wizard&#8221;, and create a custom security profile for one of the listed applications &#8212; the one that most resembles the one the profile is made for.</p>
<p>Then launch the application from FireTools. The takeaway is that it writes out the configuration file to the console. Start with that.</p>
<h3>Whitelisting and blacklisting</h3>
<p>First and foremost: Always run a</p>
<pre>$ df -h</pre>
<p>inside the sandbox to get an idea of what is mounted. Blacklist anything that isn&#8217;t necessary. Doing so to entire mounts removes the related mount from the df -h list, which makes it easier to spot things that shouldn&#8217;t be there.</p>
<p>It&#8217;s also a good idea to start a sample bash session with the sandbox, and get into the File Manager in the Firetool&#8217;s &#8220;Tools&#8221; section for each sandbox.</p>
<p>But then, what is whitelisting and blacklisting, exactly? These two terms are used all over the docs, somehow assuming we know what they mean. So I&#8217;ll try to nail it down.</p>
<p>Whitelisting isn&#8217;t anywhere near what one would think it is: By whitelisting certain files and/or directories, the original files/directories appear in the sandbox <strong>but all other files in their vicinity are invisible</strong>. Also, changes in the same vicinity are temporary to the sandbox session. The idea seems to be that if files and/or directories are whitelisted, everything else close to it should be out of sight.</p>
<p>Or as put in the man page:</p>
<blockquote><p>A temporary file system is mounted on the top directory, and the whitelisted files are mount-binded inside.  Modifications to whitelisted  files are persistent, everything else is discarded when the sandbox is closed. The top directory could be user home, /dev, /media, /mnt, /opt, /srv, /var, and /tmp.</p></blockquote>
<p>So for example, if any file or directory in the home directory is whitelisted, the entire home directory becomes overridden by an almost empty home directory plus the specifically whitelisted items. For example, from my own home directory (which is populated with a lot of files):</p>
<pre>$ firejail --noprofile --whitelist=/home/eli/this-directory
Parent pid 31560, child pid 31561
Child process initialized in 37.31 ms

$ find .
.
./.config
./.config/pulse
./.config/pulse/client.conf
./this-directory
./this-directory/this-file.txt
./.Xauthority
./.bashrc</pre>
<p>So there&#8217;s just a few temporary files that firejail was kind enough to add for convenience. Changes made in this-directory/ are persistent since it&#8217;s bind-mounted into the temporary directory, but <strong>everything else is temporary.</strong></p>
<p>Quite unfortunately, it&#8217;s not possible to whitelist a directory outside the specific list of hierarchies (unless bind mounting is used, but that requires root). So if the important stuff is on some /hugedisk, only a bind mount will help (or is this the punishment for not putting it as /mnt/hugedisk?).</p>
<p>But note that the --private= flag allows setting the home directory to anywhere on the filesystem (even inside a blacklisted region). This ad-hoc home directory is persistent, so it&#8217;s not like whitelisting, but even better in some scenarios.</p>
<p>Alternatively, it&#8217;s possible to blacklist everything but a certain part of a mount. That&#8217;s a bit tricky, because if a new directory appears after the rules are set, it remains unprotected. I&#8217;ll explain why below.</p>
<p>Or if that makes sense, make the entire directory tree read-only, with only a selected part read-write. That&#8217;s fine if there&#8217;s no issue with data leaking, just the possibility of malware sabotage.</p>
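<p>For example, something in this spirit (my own sketch, with a made-up &#8220;downloads&#8221; directory): everything under the home directory read-only, with a single directory left writable:</p>

```shell
# Sketch: home directory read-only, one subdirectory left writable.
# /home/eli/downloads is a hypothetical path.
firejail --noprofile --read-only=/home/eli --read-write=/home/eli/downloads bash
```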
<p>So now to blacklisting: Firejail implements blacklisting by mounting an empty, read-only-by-root file or directory on top of the original file. And indeed,</p>
<pre>$ firejail --blacklist=delme.txt
Reading profile /etc/firejail/default.profile
Reading profile /etc/firejail/disable-common.inc
Reading profile /etc/firejail/disable-passwdmgr.inc
Reading profile /etc/firejail/disable-programs.inc

** Note: you can use --noprofile to disable default.profile **

Parent pid 30288, child pid 30289
Child process initialized in 57.75 ms
$ ls -l
<span style="color: #888888;"><em>[ ... ]</em></span>
-r--------  1 nobody nogroup     0 Jun  9 22:12 delme.txt
<span style="color: #888888;"><em>[ ... ]</em></span>
$ less delme.txt
delme.txt: Permission denied</pre>
<p>There are --noblacklist and --nowhitelist flags as well. However, these merely cancel future or automatic black- or whitelistings. In particular, one can&#8217;t blacklist a directory and whitelist a subdirectory. It would have been very convenient, but since the parent directory is overridden with a whiteout directory, there is no access to the subdirectory. So each and every subdirectory must be blacklisted separately with a script or something, and even then, if a new subdirectory pops up, it&#8217;s not protected at all.</p>
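<p>The &#8220;script or something&#8221; could be as simple as this (my own sketch, all directory names made up): collect a --blacklist flag for every entry except the one that should stay visible:</p>

```shell
# Build a --blacklist flag for every entry under a directory, except
# the one that should remain visible. All names here are made up;
# a temporary directory stands in for something like /hugedisk.
TOP=$(mktemp -d)
mkdir "$TOP/keep" "$TOP/secret1" "$TOP/secret2"
FLAGS=""
for d in "$TOP"/*; do
    [ "$d" = "$TOP/keep" ] && continue
    FLAGS="$FLAGS --blacklist=$d"
done
echo "$FLAGS"
# and then something like: firejail --noprofile $FLAGS bash
```

<p>And as said, a directory created after this list was built won&#8217;t be covered.</p>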
<p>There&#8217;s also a --read-only flag that allows setting certain paths and files as read-only. There&#8217;s --read-write too, of course. When a directory or file is whitelisted, it must be flagged read-only separately if so desired (see man firejail).</p>
<h3>Mini-strace</h3>
<p>Trace all processes in the sandbox (in particular accesses to files and network). Much easier than using strace, when all we want is &#8220;which files are accessed?&#8221;</p>
<pre>$ firejail --trace</pre>
<p>And then just run any program to see what files and network sockets it accesses. And things of that sort.</p>
]]></content:encoded>
			<wfw:commentRss>http://billauer.co.il/blog/2020/06/firejail-cgroups/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MySQL, OOM killer, overcommitting and other memory related issues</title>
		<link>http://billauer.co.il/blog/2019/10/mysqld-killed-oom/</link>
		<comments>http://billauer.co.il/blog/2019/10/mysqld-killed-oom/#comments</comments>
		<pubDate>Sun, 13 Oct 2019 17:21:46 +0000</pubDate>
		<dc:creator>eli</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Server admin]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://billauer.co.il/blog/?p=5900</guid>
		<description><![CDATA[It started with an error message This post is a bit of a coredump of myself attempting to resolve a sudden web server failure. And even more important, understand why it happened (check on that) and try avoiding it from happening in the future (not as lucky there). I&#8217;ve noticed that there are many threads [...]]]></description>
			<content:encoded><![CDATA[<h3>It started with an error message</h3>
<p>This post is a bit of a coredump of myself attempting to resolve a sudden web server failure. And even more importantly, understand why it happened (check on that) and try to keep it from happening again (not as lucky there).</p>
<p>I&#8217;ve noticed that there are many threads on the Internet about why mysqld died suddenly, so to make a long story short: mysqld has the exact profile that the OOM killer is looking for: Lots of resident RAM, and it&#8217;s not a system process. Apache gets killed every now and then for the same reason.</p>
<p>This post relates to a VPS-hosted Debian 8, kernel 3.10.0, x86_64. The MySQL server is 5.5.62-0+deb8u1 (Debian).</p>
<p>As always, it started with a mail notification from some cronjob complaining about something. Soon enough it was evident that the MySQL server was down. And as usual, the deeper I investigated this issue, the more I realized that this was just the tip of the iceberg (the kind that doesn&#8217;t melt due to global warming).</p>
<h3>The crash</h3>
<p>So first, it was clear that the MySQL server had restarted itself a couple of days before the disaster:</p>
<pre>191007  9:25:17 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
191007  9:25:17 [Note] Plugin 'FEDERATED' is disabled.
191007  9:25:17 InnoDB: The InnoDB memory heap is disabled
191007  9:25:17 InnoDB: Mutexes and rw_locks use GCC atomic builtins
191007  9:25:17 InnoDB: Compressed tables use zlib 1.2.8
191007  9:25:17 InnoDB: Using Linux native AIO
191007  9:25:17 InnoDB: Initializing buffer pool, size = 128.0M
191007  9:25:17 InnoDB: Completed initialization of buffer pool
191007  9:25:17 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
191007  9:25:17  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
191007  9:25:19  InnoDB: Waiting for the background threads to start
191007  9:25:20 InnoDB: 5.5.62 started; log sequence number 1427184442
191007  9:25:20 [Note] Server hostname (bind-address): '127.0.0.1'; port: 3306
191007  9:25:20 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
191007  9:25:20 [Note] Server socket created on IP: '127.0.0.1'.
191007  9:25:21 [Note] Event Scheduler: Loaded 0 events
191007  9:25:21 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.62-0+deb8u1'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Debian)
191007  9:25:28 [ERROR] /usr/sbin/mysqld: Table './mydb/wp_options' is marked as crashed and should be repaired
191007  9:25:28 [Warning] Checking table:   './mydb/wp_options'
191007  9:25:28 [ERROR] /usr/sbin/mysqld: Table './mydb/wp_posts' is marked as crashed and should be repaired
191007  9:25:28 [Warning] Checking table:   './mydb/wp_posts'
191007  9:25:28 [ERROR] /usr/sbin/mysqld: Table './mydb/wp_term_taxonomy' is marked as crashed and should be repaired
191007  9:25:28 [Warning] Checking table:   './mydb/wp_term_taxonomy'
191007  9:25:28 [ERROR] /usr/sbin/mysqld: Table './mydb/wp_term_relationships' is marked as crashed and should be repaired
191007  9:25:28 [Warning] Checking table:   './mydb/wp_term_relationships'</pre>
<p>And then, two days later, it crashed for real. Or actually, got killed. From the syslog:</p>
<pre>Oct 09 05:30:16 kernel: OOM killed process 22763 (mysqld) total-vm:2192796kB, anon-rss:128664kB, file-rss:0kB</pre>
<p>and</p>
<pre>191009  5:30:17 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
191009  5:30:17 [Note] Plugin 'FEDERATED' is disabled.
191009  5:30:17 InnoDB: The InnoDB memory heap is disabled
191009  5:30:17 InnoDB: Mutexes and rw_locks use GCC atomic builtins
191009  5:30:17 InnoDB: Compressed tables use zlib 1.2.8
191009  5:30:17 InnoDB: Using Linux native AIO
191009  5:30:17 InnoDB: Initializing buffer pool, size = 128.0M
<span style="color: #ff0000;"><strong>InnoDB: mmap(137363456 bytes) failed; errno 12</strong></span>
191009  5:30:17 InnoDB: Completed initialization of buffer pool
191009  5:30:17 InnoDB: Fatal error: cannot allocate memory for the buffer pool
191009  5:30:17 [ERROR] Plugin 'InnoDB' init function returned error.
191009  5:30:17 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
191009  5:30:17 [ERROR] Unknown/unsupported storage engine: InnoDB
191009  5:30:17 [ERROR] Aborting

191009  5:30:17 [Note] /usr/sbin/mysqld: Shutdown complete</pre>
<p>The mmap() is most likely anonymous (i.e. not related to a file), as I couldn&#8217;t find any memory mapped file that is related to the mysql processes (except for the obvious mappings of shared libraries).</p>
<h3>The smoking gun</h3>
<p>But here comes the good part: It turns out that the OOM killer had been active several times before. It just so happened that the processes were reborn every time this happened. It was the relaunch that failed this time &#8212; otherwise I wouldn&#8217;t have noticed this was going on.</p>
<p>This is the output of plain &#8220;dmesg&#8221;. All OOM entries but the last one were not available with journalctl, as old entries had been deleted.</p>
<pre>[3634197.152028] OOM killed process 776 (mysqld) total-vm:2332652kB, anon-rss:153508kB, file-rss:0kB
[3634197.273914] OOM killed process 71 (systemd-journal) total-vm:99756kB, anon-rss:68592kB, file-rss:4kB
[4487991.904510] OOM killed process 3817 (mysqld) total-vm:2324456kB, anon-rss:135752kB, file-rss:0kB
[4835006.413510] OOM killed process 23267 (mysqld) total-vm:2653112kB, anon-rss:131272kB, file-rss:4404kB
[4835006.767112] OOM killed process 32758 (apache2) total-vm:282528kB, anon-rss:11732kB, file-rss:52kB
[4884915.371805] OOM killed process 825 (mysqld) total-vm:2850312kB, anon-rss:121164kB, file-rss:5028kB
[4884915.509686] OOM killed process 17611 (apache2) total-vm:282668kB, anon-rss:11736kB, file-rss:444kB
[5096265.088151] OOM killed process 23782 (mysqld) total-vm:4822232kB, anon-rss:105972kB, file-rss:3784kB
[5845437.591031] OOM killed process 24642 (mysqld) total-vm:2455744kB, anon-rss:137784kB, file-rss:0kB
[5845437.608682] OOM killed process 3802 (systemd-journal) total-vm:82548kB, anon-rss:51412kB, file-rss:28kB
[6896254.741732] OOM killed process 11551 (mysqld) total-vm:2718652kB, anon-rss:144116kB, file-rss:220kB
[7054957.856153] OOM killed process 22763 (mysqld) total-vm:2192796kB, anon-rss:128664kB, file-rss:0kB</pre>
<p>Or, after calculating the time stamps (using the last OOM message as a reference):</p>
<pre>Fri Aug 30 15:17:36 2019 OOM killed process 776 (mysqld) total-vm:2332652kB, anon-rss:153508kB, file-rss:0kB
Fri Aug 30 15:17:36 2019 OOM killed process 71 (systemd-journal) total-vm:99756kB, anon-rss:68592kB, file-rss:4kB
Mon Sep  9 12:27:30 2019 OOM killed process 3817 (mysqld) total-vm:2324456kB, anon-rss:135752kB, file-rss:0kB
Fri Sep 13 12:51:05 2019 OOM killed process 23267 (mysqld) total-vm:2653112kB, anon-rss:131272kB, file-rss:4404kB
Fri Sep 13 12:51:05 2019 OOM killed process 32758 (apache2) total-vm:282528kB, anon-rss:11732kB, file-rss:52kB
Sat Sep 14 02:42:54 2019 OOM killed process 825 (mysqld) total-vm:2850312kB, anon-rss:121164kB, file-rss:5028kB
Sat Sep 14 02:42:54 2019 OOM killed process 17611 (apache2) total-vm:282668kB, anon-rss:11736kB, file-rss:444kB
Mon Sep 16 13:25:24 2019 OOM killed process 23782 (mysqld) total-vm:4822232kB, anon-rss:105972kB, file-rss:3784kB
Wed Sep 25 05:31:36 2019 OOM killed process 24642 (mysqld) total-vm:2455744kB, anon-rss:137784kB, file-rss:0kB
Wed Sep 25 05:31:36 2019 OOM killed process 3802 (systemd-journal) total-vm:82548kB, anon-rss:51412kB, file-rss:28kB
Mon Oct  7 09:25:13 2019 OOM killed process 11551 (mysqld) total-vm:2718652kB, anon-rss:144116kB, file-rss:220kB
Wed Oct  9 05:30:16 2019 OOM killed process 22763 (mysqld) total-vm:2192796kB, anon-rss:128664kB, file-rss:0kB</pre>
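<p>For the record, this conversion is plain arithmetic: the last OOM entry appears both in dmesg (seconds since boot) and in the syslog (wall clock), which pins down the boot time; each dmesg timestamp is then added to it. A sketch with GNU date (a fixed timezone is picked arbitrarily; any works, as long as it&#8217;s used consistently):</p>

```shell
# Deduce the boot time from the one event known in both timebases,
# then convert any dmesg timestamp to wall clock:
REF_UPTIME=7054957                 # last OOM entry, seconds since boot
REF_TIME="2019-10-09 05:30:16"     # the same event, from the syslog
BOOT=$(( $(TZ=UTC date -d "$REF_TIME" +%s) - REF_UPTIME ))
LC_ALL=C TZ=UTC date -d "@$(( BOOT + 3634197 ))"   # first OOM entry
# Fri Aug 30 15:17:36 UTC 2019
```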
<p>anon-rss is the resident RAM consumed by the process itself (anonymous = not memory mapped to a file or something like that).</p>
<p>total-vm is the total size of the Virtual Memory in use. This isn&#8217;t very relevant (I think), as it involves shared libraries, memory mapped files and other segments that don&#8217;t consume any actual RAM or other valuable resources.</p>
<p>So now it&#8217;s clear what happened. Next, to some finer resolution.</p>
<h3>The MySQL keepaliver</h3>
<p>The MySQL daemon is executed by virtue of a SysV init script, which launches /usr/bin/mysqld_safe, a patch-on-patch script that keeps the daemon alive, no matter what. It restarts the mysqld daemon if it dies for any or no reason, and should also produce log messages. On my system, the daemon is executed as</p>
<pre>/usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306</pre>
<p>The script issues log messages when something unexpected happens, but they don&#8217;t appear in /var/log/mysql/error.log or anywhere else, even though the file exists, is owned by the mysql user, and has quite a few messages from the mysql daemon itself.</p>
<p>Changing</p>
<pre>/usr/bin/mysqld_safe &gt; /dev/null 2&gt;&amp;1 &amp;</pre>
<p>to</p>
<pre>/usr/bin/mysqld_safe --syslog &gt; /dev/null 2&gt;&amp;1 &amp;</pre>
<p>in the init script should route mysqld_safe&#8217;s messages to the syslog. Frankly speaking, I don&#8217;t think this made any difference: I&#8217;ve seen nothing new in the logs.</p>
<p>It would have been nicer having the messages in mysql/error.log, but at least they are visible with journalctl this way.</p>
<h3>Shrinking the InnoDB buffer pool</h3>
<p>As the actual failure was on attempting to map memory for the buffer pool, maybe make it smaller&#8230;?</p>
<p>Launch MySQL as the root user:</p>
<pre>$ mysql -u root --password</pre>
<p>and check the InnoDB status, as suggested on <a href="https://dev.mysql.com/doc/refman/5.5/en/innodb-buffer-pool.html" target="_blank">this page</a>:</p>
<pre>mysql&gt; SHOW ENGINE INNODB STATUS;

<span style="color: #888888;"><em>[ ... ]</em></span>

----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 137363456; in additional pool allocated 0
Dictionary memory allocated 1100748
Buffer pool size   8192
Free buffers       6263
Database pages     1912
Old database pages 725
Modified db pages  0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 1894, created 18, written 1013
0.00 reads/s, 0.00 creates/s, 0.26 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 1912, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]</pre>
<p>I&#8217;m really not an expert, but if &#8220;Free buffers&#8221; is 75% of the total allocated space, I&#8217;ve probably allocated too much. So I reduced the pool to 32 MB &#8212; it&#8217;s not like I&#8217;m running a high-end server. I added /etc/mysql/conf.d/innodb_pool_size.cnf (owned by root, mode 0644), reading:</p>
<pre># Reduce InnoDB buffer size from default 128 MB to 32 MB
[mysqld]
innodb_buffer_pool_size=32M</pre>
<p>After restarting the daemon, the status reads:</p>
<pre>----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 34340864; in additional pool allocated 0
Dictionary memory allocated 1100748
Buffer pool size   2047
Free buffers       856
Database pages     1189
Old database pages 458</pre>
<h3>And finally, repair the tables</h3>
<p>Remember those warnings that the tables were marked as crashed? That&#8217;s the easy part:</p>
<pre>$ mysqlcheck -A --auto-repair</pre>
<p>That went smoothly, with no complaints. After all, it wasn&#8217;t really a crash.</p>
<h3>Some general words on OOM</h3>
<p>This whole idea that the kernel should do Roman Empire style decimation of processes is widely criticized by many, but it&#8217;s probably not such a bad idea. The root cause lies in the fact that the kernel agrees to allocate more RAM than it actually has. This is possible because the kernel doesn&#8217;t really allocate RAM when a process asks for memory with a brk() call; it only allocates the memory space segment. The actual RAM is allocated only when the process attempts to access a page that hasn&#8217;t been backed by RAM yet. The access attempt causes a page fault, the kernel quickly finds some RAM, and returns from the page fault interrupt as if nothing happened.</p>
<p>So when the kernel responds with an -ENOMEM, it&#8217;s not because it doesn&#8217;t have any RAM, but because it doesn&#8217;t want to.</p>
<p>More precisely, the kernel keeps account of how much memory it has given away (system-wise and/or cgroup-wise) and makes a decision. The common policy is to overcommit to some extent &#8212; that is, to allow the total allocated RAM to exceed the total physical RAM. Even, and in particular, if there&#8217;s no swap.</p>
<p>The common figure is to overcommit by 50%: On a 64 GiB RAM computer, there might be 96 GiB of promised RAM. This may seem an awfully stupid thing to do, but hey, it works. If that concept worries you, modern banking (with real money, that is) might worry you even more.</p>
<p>The problem arises when the processes run to the bank. That is, when the processes access the RAM they&#8217;ve been promised, and at some point the kernel has nowhere to take memory from. Let&#8217;s assume there&#8217;s no swap, all disk buffers have been flushed, all rabbits have been pulled. There&#8217;s a process waiting for memory, and it can&#8217;t go back running until the problem has been resolved.</p>
<p>Linux&#8217; solution to this situation is to select a process with a lot of RAM and little importance. How the kernel makes that judgement is documented everywhere. The important point is that it&#8217;s not necessarily the process that triggered the event, and that it will usually be the same victim over and over again. In my case, mysqld is the favorite: big, fat, and not a system process.</p>
<p>Thinking about it, the OOM is a good solution to get out of a tricky situation. The alternative would have been to deny memory to processes just launched, including the administrator&#8217;s attempt to rescue the system. Or an attempt to shut it down with some dignity. So sacrificing a large and hopefully not-so-important process isn&#8217;t such a bad idea.</p>
<h3>Why did the OOM kick in?</h3>
<p>This all took place on a VPS virtual machine with 1 GB leased RAM. With the stuff running on that machine, there&#8217;s no reason in the world that the total actual RAM consumption would reach that limit. This is a system that typically has 70% of its memory marked as &#8220;cached&#8221; (i.e. used by disk cache). This should be taken with a grain of salt, as &#8220;top&#8221; displays data from some bogus /proc/vmstat, and still.</p>
<p>As can be seen in the dmesg logs above, the amount of resident RAM of the killed mysqld process was 120-150 MB or so. Together with the other memory hog, apache2, they reach 300 MB. That&#8217;s it. No reason for anything drastic.</p>
<p>Having said that, it&#8217;s remarkable that the total-vm stood at 2.4-4.3 GB when killed. This is much higher than the typical 900 MB visible usually. So maybe there&#8217;s some kind of memory leak, even if it&#8217;s harmless? Looking at mysql over time, its virtual memory allocation tends to grow.</p>
<p>VPS machines do have a physical memory limit imposed, by virtue of the relevant cgroup&#8217;s <a href="https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html" target="_blank">memory.high and memory.max</a> limits. In particular the latter &#8212; if the cgroup&#8217;s total consumption exceeds memory.max, OOM kicks in. This is how the illusion of an independent RAM segment is made on a VPS machine. Plus faking some /proc files.</p>
<p>But there&#8217;s another explanation: Say that a VPS service provider takes a computer with 16 GB RAM, and places 16 VPS machines with 1 GB leased RAM each. What will the overall actual RAM consumption be? I would expect it to be much lower than 16 GB. So why not add a few more VPS machines, and make some good use of the hardware? It&#8217;s where the profit comes from.</p>
<p>Most of the time, there will be no problem. But occasionally, this will cause RAM shortages, in which case the kernel&#8217;s <strong>global</strong> OOM looks for a victim. I suppose there&#8217;s no significance to cgroups in this matter. In other words, the kernel sees all processes in the system the same, regardless of which cgroup (and hence VPS machine) they belong to. Which means that the process killed doesn&#8217;t necessarily belong to the VPS that triggered the problem. The processes of one VPS may suddenly demand their memory, but some other VPS will have its processes killed.</p>
<h3>Conclusion</h3>
<ul>
<li>Shrinking the buffer pool of mysqld was probably a good idea, in particular if a computer-wide OOM killed the process &#8212; odds are that it will kill some other mysqld instead this way.</li>
<li>Possibly restart mysql with a cronjob every day to keep its memory consumption in control. But this might create problems of its own.</li>
<li>It&#8217;s high time to replace the VPS guest with KVM or similar.</li>
</ul>
<h3>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</h3>
<h3>Rambling epilogue: Some thoughts about overcommitting</h3>
<p>The details of how overcommitting is accounted for are given in the kernel tree&#8217;s <a href="https://www.kernel.org/doc/Documentation/vm/overcommit-accounting" target="_blank">Documentation/vm/overcommit-accounting</a>. But to make a long story short, it&#8217;s done in a sensible way. In particular, if a piece of memory is shared by threads and processes, it&#8217;s only accounted for once.</p>
<p>Relevant files: /proc/meminfo and /proc/vmstat</p>
<p>It seems like CommitLimit and Committed_AS are not available on a VPS guest system. But the OOM killer probably knows these values (or was it because /proc/sys/vm/overcommit_memory was set to 1 on my system, meaning &#8220;Always overcommit&#8221;?).</p>
<p>To get a list of the current memory hogs, run &#8220;top&#8221; and press shift-M as it&#8217;s running.</p>
<p>To get an idea of how a process behaves, use pmap -x. For example, looking at a mysqld process (run it as root, or no memory map will be shown):</p>
<pre># pmap -x 14817
14817:   /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
Address           Kbytes     RSS   Dirty Mode  Mapping
000055c5617ac000   10476    6204       0 r-x-- mysqld
000055c5623e6000     452     452     452 r---- mysqld
000055c562457000     668     412     284 rw--- mysqld
000055c5624fe000     172     172     172 rw---   [ anon ]
000055c563e9b000    6592    6448    6448 rw---   [ anon ]
00007f819c000000    2296     320     320 rw---   [ anon ]
<span style="color: #ff0000;"><strong>00007f819c23e000   63240       0       0 -----   [ anon ]
</strong></span>00007f81a0000000    3160     608     608 rw---   [ anon ]
<span style="color: #ff0000;"><strong>00007f81a0316000   62376       0       0 -----   [ anon ]
</strong></span>00007f81a4000000    9688    7220    7220 rw---   [ anon ]
<span style="color: #ff0000;"><strong>00007f81a4976000   55848       0       0 -----   [ anon ]
</strong></span>00007f81a8000000     132       8       8 rw---   [ anon ]
<span style="color: #ff0000;"><strong>00007f81a8021000   65404       0       0 -----   [ anon ]
</strong></span>00007f81ac000000     132       4       4 rw---   [ anon ]
<span style="color: #ff0000;"><strong>00007f81ac021000   65404       0       0 -----   [ anon ]
</strong></span>00007f81b1220000       4       0       0 -----   [ anon ]
00007f81b1221000    8192       8       8 rw---   [ anon ]
00007f81b1a21000       4       0       0 -----   [ anon ]
00007f81b1a22000    8192       8       8 rw---   [ anon ]
00007f81b2222000       4       0       0 -----   [ anon ]
00007f81b2223000    8192       8       8 rw---   [ anon ]
00007f81b2a23000       4       0       0 -----   [ anon ]
00007f81b2a24000    8192      20      20 rw---   [ anon ]
00007f81b3224000       4       0       0 -----   [ anon ]
00007f81b3225000    8192       8       8 rw---   [ anon ]
00007f81b3a25000       4       0       0 -----   [ anon ]
00007f81b3a26000    8192       8       8 rw---   [ anon ]
00007f81b4226000       4       0       0 -----   [ anon ]
00007f81b4227000    8192       8       8 rw---   [ anon ]
00007f81b4a27000       4       0       0 -----   [ anon ]
00007f81b4a28000    8192       8       8 rw---   [ anon ]
00007f81b5228000       4       0       0 -----   [ anon ]
00007f81b5229000    8192       8       8 rw---   [ anon ]
00007f81b5a29000       4       0       0 -----   [ anon ]
00007f81b5a2a000    8192       8       8 rw---   [ anon ]
00007f81b622a000       4       0       0 -----   [ anon ]
00007f81b622b000    8192      12      12 rw---   [ anon ]
00007f81b6a2b000       4       0       0 -----   [ anon ]
00007f81b6a2c000    8192       8       8 rw---   [ anon ]
00007f81b722c000       4       0       0 -----   [ anon ]
00007f81b722d000   79692   57740   57740 rw---   [ anon ]
00007f81bc000000     132      76      76 rw---   [ anon ]
<span style="color: #ff0000;"><strong>00007f81bc021000   65404       0       0 -----   [ anon ]
</strong></span>00007f81c002f000    2068    2052    2052 rw---   [ anon ]
00007f81c03f9000       4       0       0 -----   [ anon ]
00007f81c03fa000     192      52      52 rw---   [ anon ]
00007f81c042a000       4       0       0 -----   [ anon ]
00007f81c042b000     192      52      52 rw---   [ anon ]
00007f81c045b000       4       0       0 -----   [ anon ]
00007f81c045c000     192      64      64 rw---   [ anon ]
00007f81c048c000       4       0       0 -----   [ anon ]
00007f81c048d000     736     552     552 rw---   [ anon ]
00007f81c0545000      20       4       0 rw-s- [aio] (deleted)
00007f81c054a000      20       4       0 rw-s- [aio] (deleted)
00007f81c054f000    3364    3364    3364 rw---   [ anon ]
00007f81c0898000      44      12       0 r-x-- libnss_files-2.19.so
00007f81c08a3000    2044       0       0 ----- libnss_files-2.19.so
00007f81c0aa2000       4       4       4 r---- libnss_files-2.19.so
00007f81c0aa3000       4       4       4 rw--- libnss_files-2.19.so
00007f81c0aa4000      40      20       0 r-x-- libnss_nis-2.19.so
00007f81c0aae000    2044       0       0 ----- libnss_nis-2.19.so
00007f81c0cad000       4       4       4 r---- libnss_nis-2.19.so
00007f81c0cae000       4       4       4 rw--- libnss_nis-2.19.so
00007f81c0caf000      28      20       0 r-x-- libnss_compat-2.19.so
00007f81c0cb6000    2044       0       0 ----- libnss_compat-2.19.so
00007f81c0eb5000       4       4       4 r---- libnss_compat-2.19.so
00007f81c0eb6000       4       4       4 rw--- libnss_compat-2.19.so
00007f81c0eb7000       4       0       0 -----   [ anon ]
00007f81c0eb8000    8192       8       8 rw---   [ anon ]
00007f81c16b8000      84      20       0 r-x-- libnsl-2.19.so
00007f81c16cd000    2044       0       0 ----- libnsl-2.19.so
00007f81c18cc000       4       4       4 r---- libnsl-2.19.so
00007f81c18cd000       4       4       4 rw--- libnsl-2.19.so
00007f81c18ce000       8       0       0 rw---   [ anon ]
00007f81c18d0000    1668     656       0 r-x-- libc-2.19.so
00007f81c1a71000    2048       0       0 ----- libc-2.19.so
00007f81c1c71000      16      16      16 r---- libc-2.19.so
00007f81c1c75000       8       8       8 rw--- libc-2.19.so
00007f81c1c77000      16      16      16 rw---   [ anon ]
00007f81c1c7b000      88      44       0 r-x-- libgcc_s.so.1
00007f81c1c91000    2044       0       0 ----- libgcc_s.so.1
00007f81c1e90000       4       4       4 rw--- libgcc_s.so.1
00007f81c1e91000    1024     128       0 r-x-- libm-2.19.so
00007f81c1f91000    2044       0       0 ----- libm-2.19.so
00007f81c2190000       4       4       4 r---- libm-2.19.so
00007f81c2191000       4       4       4 rw--- libm-2.19.so
00007f81c2192000     944     368       0 r-x-- libstdc++.so.6.0.20
00007f81c227e000    2048       0       0 ----- libstdc++.so.6.0.20
00007f81c247e000      32      32      32 r---- libstdc++.so.6.0.20
00007f81c2486000       8       8       8 rw--- libstdc++.so.6.0.20
00007f81c2488000      84       8       8 rw---   [ anon ]
00007f81c249d000      12       8       0 r-x-- libdl-2.19.so
00007f81c24a0000    2044       0       0 ----- libdl-2.19.so
00007f81c269f000       4       4       4 r---- libdl-2.19.so
00007f81c26a0000       4       4       4 rw--- libdl-2.19.so
00007f81c26a1000      32       4       0 r-x-- libcrypt-2.19.so
00007f81c26a9000    2044       0       0 ----- libcrypt-2.19.so
00007f81c28a8000       4       4       4 r---- libcrypt-2.19.so
00007f81c28a9000       4       4       4 rw--- libcrypt-2.19.so
00007f81c28aa000     184       0       0 rw---   [ anon ]
00007f81c28d8000      36      28       0 r-x-- libwrap.so.0.7.6
00007f81c28e1000    2044       0       0 ----- libwrap.so.0.7.6
00007f81c2ae0000       4       4       4 r---- libwrap.so.0.7.6
00007f81c2ae1000       4       4       4 rw--- libwrap.so.0.7.6
00007f81c2ae2000       4       4       4 rw---   [ anon ]
00007f81c2ae3000     104      12       0 r-x-- libz.so.1.2.8
00007f81c2afd000    2044       0       0 ----- libz.so.1.2.8
00007f81c2cfc000       4       4       4 r---- libz.so.1.2.8
00007f81c2cfd000       4       4       4 rw--- libz.so.1.2.8
00007f81c2cfe000       4       4       0 r-x-- libaio.so.1.0.1
00007f81c2cff000    2044       0       0 ----- libaio.so.1.0.1
00007f81c2efe000       4       4       4 r---- libaio.so.1.0.1
00007f81c2eff000       4       4       4 rw--- libaio.so.1.0.1
00007f81c2f00000      96      84       0 r-x-- libpthread-2.19.so
00007f81c2f18000    2044       0       0 ----- libpthread-2.19.so
00007f81c3117000       4       4       4 r---- libpthread-2.19.so
00007f81c3118000       4       4       4 rw--- libpthread-2.19.so
00007f81c3119000      16       4       4 rw---   [ anon ]
00007f81c311d000     132     112       0 r-x-- ld-2.19.so
00007f81c313e000       8       0       0 rw---   [ anon ]
00007f81c3140000      20       4       0 rw-s- [aio] (deleted)
00007f81c3145000      20       4       0 rw-s- [aio] (deleted)
00007f81c314a000      20       4       0 rw-s- [aio] (deleted)
00007f81c314f000      20       4       0 rw-s- [aio] (deleted)
00007f81c3154000      20       4       0 rw-s- [aio] (deleted)
00007f81c3159000      20       4       0 rw-s- [aio] (deleted)
00007f81c315e000      20       4       0 rw-s- [aio] (deleted)
00007f81c3163000      20       4       0 rw-s- [aio] (deleted)
00007f81c3168000    1840    1840    1840 rw---   [ anon ]
00007f81c3334000       8       0       0 rw-s- [aio] (deleted)
00007f81c3336000       4       0       0 rw-s- [aio] (deleted)
00007f81c3337000      24      12      12 rw---   [ anon ]
00007f81c333d000       4       4       4 r---- ld-2.19.so
00007f81c333e000       4       4       4 rw--- ld-2.19.so
00007f81c333f000       4       4       4 rw---   [ anon ]
00007ffd2d68b000     132      68      68 rw---   [ stack ]
00007ffd2d7ad000       8       4       0 r-x--   [ anon ]
ffffffffff600000       4       0       0 r-x--   [ anon ]
---------------- ------- ------- -------
total kB          640460   89604   81708</pre>
<p>The Kbytes and RSS columns&#8217; totals at the bottom match the VIRT and RSS figures shown by &#8220;top&#8221;.</p>
<p>I should emphasize that this is a freshly started mysqld process. Give it a few days to run, and some extra 100 MB of virtual space is added (not clear why), plus some real RAM, depending on the settings.</p>
<p>I&#8217;ve marked six anonymous segments that are completely virtual (no resident memory at all), summing up to ~360 MB. This means they are accounted for as 360 MB at least once &#8212; and that&#8217;s for a process that only uses 90 MB for real.</p>
<p>My own anecdotal test on another machine with a 4.4.0 kernel showed that putting /proc/sys/vm/overcommit_ratio below what was actually committed (making /proc/meminfo&#8217;s CommitLimit smaller than Committed_AS) didn&#8217;t have any effect unless /proc/sys/vm/overcommit_memory was set to 2. And when I did that, the OOM killer wasn&#8217;t invoked; instead, I had a hard time running new commands:</p>
<pre># echo 2 &gt; /proc/sys/vm/overcommit_memory
# cat /proc/meminfo
bash: fork: Cannot allocate memory</pre>
<p>So this is what it looks like when memory runs out and the system refuses to play ball.</p>
]]></content:encoded>
			<wfw:commentRss>http://billauer.co.il/blog/2019/10/mysqld-killed-oom/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Upgrading to Linux Mint 19, running the old system in a chroot</title>
		<link>http://billauer.co.il/blog/2018/11/linux-chroot-system-in-parallel/</link>
		<comments>http://billauer.co.il/blog/2018/11/linux-chroot-system-in-parallel/#comments</comments>
		<pubDate>Thu, 29 Nov 2018 20:30:10 +0000</pubDate>
		<dc:creator>eli</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[systemd]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://billauer.co.il/blog/?p=5605</guid>
		<description><![CDATA[Background Archaeological findings have revealed that prehistoric humans buried their forefathers under the floor of their huts. Fast forward to 2018, yours truly decided to continue running the (ancient) Fedora 12 as a chroot when migrating to Linux Mint 19. That&#8217;s an eight years difference. While a lot of Linux users are happy to just [...]]]></description>
			<content:encoded><![CDATA[<h3>Background</h3>
<p>Archaeological findings have revealed that prehistoric humans buried their forefathers under the floor of their huts. Fast forward to 2018, yours truly decided to continue running the (ancient) Fedora 12 as a chroot when migrating to Linux Mint 19. That&#8217;s an eight-year difference.</p>
<p>While a lot of Linux users are happy to just install the new system and migrate everything &#8220;automatically&#8221;, this isn&#8217;t a good idea if you&#8217;re into more than plain tasks. Upgrading is supposed to be smooth, but small changes in the default behavior, API or whatever always make things that worked before fail, and sometimes with significant damage. Of the sort of not receiving emails, backup jobs not really working as before etc. Or just <a href="http://billauer.co.il/blog/2018/11/fsck-inode-checksum-errors-resize2fs/" target="_blank">a new bug</a>.</p>
<p>I&#8217;ve talked with quite a few sysadmins who were responsible for computers that actually needed to work continuously and reliably, and it&#8217;s never long before the apology for their ancient Linux distribution arrives. There&#8217;s no need to apologize: Upgrading is not good for keeping the system running smoothly. If it ain&#8217;t broke, don&#8217;t fix it.</p>
<p>But after some time, the hardware gets old and it becomes difficult to install new software. So I had this idea to keep running the old computer, with all of its properly running services and cronjobs, as a virtual machine. And then I thought, maybe go VPS-style. And then I realized I don&#8217;t need the VPS isolation at all. So the idea is to keep the old system as a chroot inside the new one.</p>
<p>Some services (httpd, mail handling, dhcpd) will keep running in the chroot, and others (the desktop in particular, with new shiny GUI programs) running natively. Old and new on the same machine.</p>
<p>The trick is making sure one doesn&#8217;t stamp on the feet of the other. These are my insights as I managed to get this up and running.</p>
<h3>The basics</h3>
<p>The idea is to place the old root filesystem (only) into somewhere in the new system, and chroot into it for the sake of running services and oldschool programs:</p>
<ul>
<li>The old root is placed as e.g. /oldy-root/ in the new filesystem (note that oldy is a legit alternative spelling for oldie&#8230;).</li>
<li>bind-mounts are used for a unified view of home directories and those containing data.</li>
<li>Some services are executed from within the chroot environment. How to run them from Mint 19 (hence using systemd) is described below.</li>
<li>Running old programs is also possible by chrooting from shell. This is also discussed below.</li>
</ul>
<p>Don&#8217;t put the old root on a filesystem that contains useful data, because odds are that such file system will be bind-mounted into the chrooted filesystem, which will cause a directory tree loop. Then try to calculate disk space or <a href="http://billauer.co.il/blog/2018/11/tar-bind-mount/" target="_blank">backup with tar</a>. So pick a separate filesystem (i.e. a separate partition or LVM volume), or possibly as a subdirectory of the same filesystem as the &#8220;real&#8221; root.</p>
<h3>Bind mounting</h3>
<p>This is where the tricky choices are made. The point is to make the old and new systems see more or less the same application data, and also allow software to communicate over /tmp. So this is the relevant part in my /etc/fstab:</p>
<pre># Bind mounts for oldy root: system essentials
/dev                        /oldy-root/dev none bind                0       2
/dev/pts                    /oldy-root/dev/pts none bind            0       2
/dev/shm                    /oldy-root/dev/shm none bind            0       2
/sys                        /oldy-root/sys none bind                0       2
/proc                       /oldy-root/proc none bind               0       2

# Bind mounts for oldy root: Storage
/home                       /oldy-root/home none bind               0       2
/storage                    /oldy-root/storage none bind            0       2
/tmp                        /oldy-root/tmp  none bind               0       2
/mnt                        /oldy-root/mnt  none bind               0       2
/media                      /oldy-root/media none bind              0       2</pre>
<p>Most notable are /mnt and /media. Bind-mounting these allows temporary mounts to be visible at both sides. /tmp is required for the UNIX domain socket used for playing sound from the old system. And other sockets, I suppose.</p>
<p>Note that /run <strong>isn&#8217;t</strong> bind-mounted. The reason is that its tree structure has changed, so it&#8217;s quite pointless (the mounting point used to be /var/run, and the place of the runtime files tends to change with time). The motivation for bind-mounting would have been to let software from the old and new systems interact, and indeed, there are a few UNIX sockets there, most notably the DBus UNIX domain socket.</p>
<p>But DBus is a good example of how hopeless it is to bind-mount /run: Old software attempting to talk with the Console Kit on the new DBus server fails completely at the protocol level (or namespace? I didn&#8217;t really dig into that).</p>
<p>So just copy the old /var/run into the root filesystem and that&#8217;s it. CUPS ran smoothly, GUI programs run fairly OK, and sound is done through a UNIX domain socket as suggested in the comments of <a href="http://billauer.co.il/blog/2014/01/pa-multiple-users/" target="_blank">this post</a>.</p>
<p>I opted out of bind-mounting /lib/modules and /usr/src. This makes manipulation of kernel modules (as needed by VMware, for example) impossible from the old system. But the old gcc is too outdated to compile under the new Linux kernel&#8217;s build system anyhow, so there was little point.</p>
<p>/root isn&#8217;t bind-mounted either. I wasn&#8217;t so sure about that, but in the end, it&#8217;s not a very useful directory. Keeping them separate makes the shell history for the root user distinct, and that&#8217;s actually a good thing.</p>
<h3>Make /dev/log for real</h3>
<p>Almost all service programs (and others) send messages to the system log by writing to the UNIX domain socket /dev/log. It&#8217;s actually a misnomer, because /dev/log is not a device file. But you don&#8217;t break tradition.</p>
<p><strong>WARNING:</strong> If the logging server doesn&#8217;t work properly, Linux will fail to boot, dropping you into a tiny busybox rescue shell. So before playing with this, reboot to verify all is fine, and then make the changes. Be sure to prepare yourself for reverting your changes with plain command-line utilities (cp, mv, cat) and reboot to make sure all is fine.</p>
<p>In Mint 19 (and forever on), logging is handled by systemd-journald, which is a godsend. However, for some reason (does anyone know why? Kindly comment below), the UNIX domain socket it creates is placed at /run/systemd/journal/dev-log, and /dev/log is a symlink to it. There are a few bug reports out there on software refusing to log into a symlink.</p>
<p>But that&#8217;s small potatoes: Since I decided not to bind-mount /run, there&#8217;s no access to this socket from the old system.</p>
<p>The solution is to swap the two: Make /dev/log the UNIX socket (as it was before), and /run/systemd/journal/dev-log the symlink (I wonder if the latter is necessary). To achieve this, copy /lib/systemd/system/systemd-journald-dev-log.socket into /etc/systemd/system/systemd-journald-dev-log.socket. This will make the latter override the former (keep the file name accurate), and make the change survive possible upgrades &#8212; the file in /lib can be overwritten by apt, the one in /etc won&#8217;t be by convention.</p>
<p>Edit the file in /etc, in the part saying:</p>
<pre>[Socket]
Service=systemd-journald.service
<span style="color: #888888;"><strong><span style="color: #ff0000;">ListenDatagram=/run/systemd/journal/dev-log
Symlinks=/dev/log</span>
</strong></span>SocketMode=0666
PassCredentials=yes
PassSecurity=yes</pre>
<p>and swap the two paths, making it</p>
<pre>ListenDatagram=/dev/log
Symlinks=/run/systemd/journal/dev-log</pre>
<p>instead.</p>
<p>All in all, this works perfectly. Old programs work well (try the &#8220;logger&#8221; command line utility on both sides). This could cause problems if a program expects &#8220;the real thing&#8221; on /run/systemd/journal/dev-log. Quite unlikely.</p>
<p><em>As a side note, I had this idea to make journald listen to two UNIX domain sockets: dropping the Symlinks assignment in the original .socket file, and copying it into a new .socket file, setting ListenDatagram to /dev/log. Two .socket files, two UNIX sockets. Sounded like a good idea, only it failed with an error message saying &#8220;Too many /dev/log sockets passed&#8221;.</em></p>
<h3>Running old services</h3>
<p>systemd&#8217;s take on SysV-style services (i.e. those init.d, rcN.d scripts) is that when systemctl is called with reference to a service, it first tries its native services, and if none is found, it looks for a service of that name in /etc/init.d.</p>
<p>In order to run old services, I wrote a catch-all init.d script, /etc/init.d/oldy-chrooter. It&#8217;s intended to be symlinked to: it deduces which service to run from the name it was invoked by, then chroots and executes the script of that name inside the old system. And guess what, systemd plays along with this.</p>
<p>The script follows. Note that it&#8217;s written in Perl (string manipulations are easier this way), but it carries the standard INFO header, which is required in init scripts.</p>
<pre>#!/usr/bin/<span style="color: #ff0000;"><strong>perl</strong></span>
### BEGIN INIT INFO
# Required-Start:    $local_fs $remote_fs $syslog
# Required-Stop:     $local_fs $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# X-Interactive:     false
# Short-Description: Oldy root wrapper service
# Description:       Start a service within the oldy root
### END INIT INFO

use warnings;
use strict;

my $targetroot = '/oldy-root';

my ($realcmd) = ($0 =~ /\/oldy-([^\/]+)$/);

die("oldy chroot delegation script called with non-oldy command \"$0\"\n")
  unless (defined $realcmd);

chroot $targetroot or die("Failed to chroot to $targetroot\n");

exec("/etc/init.d/$realcmd", @ARGV) or
  die("Failed to execute \"/etc/init.d/$realcmd\" in oldy chroot\n");</pre>
<p>To expose the chroot&#8217;s httpd service, make a symlink in init.d:</p>
<pre># cd /etc/init.d/
# ln -s oldy-chrooter oldy-httpd</pre>
<p>And then enable with</p>
<pre># systemctl enable oldy-httpd
oldy-httpd.service is not a native service, redirecting to systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable oldy-httpd</pre>
<p>which indeed runs /lib/systemd/systemd-sysv-install, a shell script, which in turn runs /usr/sbin/update-rc.d with the same arguments. The latter is a Perl script, which analyzes the init.d file and, among other things, parses the INFO header.</p>
<p>The result is the SysV-style generation of S01/K01 symbolic links in /etc/rcN.d. Consequently, it&#8217;s possible to start and stop the service as usual. If the service isn&#8217;t first enabled (or disabled) with systemctl, attempting to start or stop it will result in an error message saying the service isn&#8217;t found.</p>
<p>It&#8217;s a good idea to install the same services on the &#8220;main&#8221; system and disable them afterwards. There&#8217;s no risk of overwriting the old root&#8217;s installation, and this allows installing and running programs that depend on these services (otherwise they would complain, based upon the software package database).</p>
<h3>Running programs</h3>
<p>Running stuff inside the chroot should be quick and easy. For this reason, I wrote a small C program, which opens a shell within the chroot when called without arguments. With one argument, it executes that command within the chroot. It can be called by a non-root user, and the same user applies within the chroot.</p>
<p>This is compiled with</p>
<pre>$ gcc oldy.c -o oldy -Wall -O3</pre>
<p>and placed in /usr/local/bin <strong>with setuid root</strong>:</p>
<pre>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;errno.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;pwd.h&gt;

int main(int argc, char *argv[]) {
  const char jail[] = "/oldy-root/";
  const char newhome[] = "/oldy-root/home/eli/";
  struct passwd *pwd;

  if ((argc!=2) &amp;&amp; (argc!=1)){
    printf("Usage: %s [ command ]\n", argv[0]);
    exit(1);
  }

  pwd = getpwuid(getuid());
  if (!pwd) {
    perror("Failed to obtain user name for current user(?!)");
    exit(1);
  }

  // It's necessary to set the ID to 0, or su asks for password despite the
  // root setuid flag of the executable

  if (setuid(0)) {
    perror("Failed to change user");
    exit(1);
  }

  if (chdir(newhome)) {
    perror("Failed to change directory");
    exit(1);
  }

  if (chroot(jail)) {
    perror("Failed to chroot");
    exit(1);
  }

  // oldycmd and oldyshell won't appear, as they're overridden by su

  if (argc == 1)
    execl("/bin/su", "oldyshell", "-", pwd-&gt;pw_name, (char *) NULL);
  else
    execl("/bin/su", "oldycmd", "-", pwd-&gt;pw_name, "-c", argv[1], (char *) NULL);
  perror("Execution failed");
  exit(1);
}</pre>
<p>Notes:</p>
<ul>
<li>Setuid root is a number-one source of security holes. I&#8217;m not sure I would have this thing on a computer used by strangers.</li>
<li>getpwuid() gets the real user ID (not the effective one, as set by setuid), so the call to &#8220;su&#8221; is made with the original user (even if it&#8217;s root, of course). It will fail if that user doesn&#8217;t exist.</li>
<li>&#8230; but note that the user in the chroot system is then the one having the same <strong>user name</strong> as in the original system, not the same uid. There should be no difference, but watch it if there is (security holes&#8230;?)</li>
<li>I used &#8220;su -&#8221; and not just executing bash for the sake of su&#8217;s &#8220;-&#8221; flag, which sets up the environment. Otherwise, it&#8217;s a mess.</li>
</ul>
<p>It&#8217;s perfectly OK to run GUI programs with this trick. However, it becomes extremely confusing with the command line. Is this shell prompt on the old or the new system? To fix this, edit /etc/bashrc <strong>in the chroot system only </strong>to change the prompt. I went for changing the line saying</p>
<pre>[ "$PS1" = "\\s-\\v\\\$ " ] &amp;&amp; PS1="[\u@\h \W]\\$ "</pre>
<p>to</p>
<pre>[ "$PS1" = "\\s-\\v\\\$ " ] &amp;&amp; PS1="\[\e[44m\][\u@chroot \W]\[\e[m\]\\$ "</pre>
<p>so the &#8220;\h&#8221; part, which would turn into the host&#8217;s name, now appears as &#8220;chroot&#8221;. But more importantly, the text background of the shell prompt is changed to blue (as opposed to no background at all), so it&#8217;s easy to tell where I am.</p>
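<p>To preview the effect without editing any bashrc, the escape sequence can be printed directly (the user name and directory here are just an example):</p>

```shell
# Blue background (ANSI code 44) for the bracketed part, then reset
printf '\033[44m[eli@chroot ~]\033[m$ \n'
```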
<p>If you&#8217;re into playing with the colors, I warmly recommend <a title="Setting PS1 with color codes properly with gnome-terminal" href="http://billauer.co.il/blog/2018/12/ansi-escape-color-bash-prompt/" target="_blank">looking at this</a>.</p>
<h3>Lifting the user processes limit</h3>
<p>At some point (it took a few months), I started to have failures of this sort:</p>
<pre>$ oldy
oldyshell: /bin/bash: Resource temporarily unavailable</pre>
<p>and even worse, some of the chroot-based utilities also failed sporadically.</p>
<p>Checking with ulimit -a, it turned out that the number of processes owned by my &#8220;regular&#8221; user was limited to 1024. Checking with ps, I had only about 510 processes belonging to that UID, so it&#8217;s not clear why I hit the limit. In the non-chroot environment, the limit is significantly higher.</p>
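<p>For the record, the two relevant figures can be compared directly; the exact numbers are of course system-dependent, and the ps flags assume procps:</p>

```shell
# Soft limit on the number of processes for the current user
ulimit -u
# Number of processes currently owned by this user
ps -u "$(id -un)" --no-headers | wc -l
```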
<p>So edit /etc/security/limits.d/90-nproc.conf (the one inside the jail), changing the line saying</p>
<pre>*          soft    nproc     <strong>1024</strong>
</pre>
<p>to</p>
<pre>*          soft    nproc     <strong>65536</strong></pre>
<p>There&#8217;s no need for a reboot or anything of that sort, but processes that are already running remain subject to the old limit.</p>
<h3>Desktop icons and wallpaper messup</h3>
<p>This is a seemingly small, but annoying thing: When Nautilus is launched from within the old system, it restores the old wallpaper and sets all icons on the desktop. There are suggestions on how to fix it, but they rely on gsettings, which came after Fedora 12. I haven&#8217;t tested this, but the common suggestion is:</p>
<pre><code>$ gsettings set org.gnome.desktop.background show-desktop-icons false</code></pre>
<p>So for old systems as mine, first, check the current value:</p>
<pre>$ gconftool-2 --get /apps/nautilus/preferences/show_desktop</pre>
<p>and if it&#8217;s &#8220;true&#8221;, fix it:</p>
<pre>$ gconftool-2 --type bool --set /apps/nautilus/preferences/show_desktop false</pre>
<p>The settings are stored in ~/.gconf/apps/nautilus/preferences/%gconf.xml.</p>
<h3>Setting title in gnome-terminal</h3>
<p>So someone thought that the possibility of setting the title of the Terminal window, directly from the GUI, is unnecessary. That happens to be one of the most useful features, if you ask me. I&#8217;d really like to know why they dropped it. Or maybe not.</p>
<p>After some wandering around, and reading suggestions on how to do it in <a href="https://askubuntu.com/questions/22413/how-to-change-gnome-terminal-title" target="_blank">various other ways</a>, I went for the old-new solution: Run the old executable in the new system. Namely:</p>
<pre># cd /usr/bin
# mv gnome-terminal new-gnome-terminal
# ln -s /oldy-root/usr/bin/gnome-terminal</pre>
<p>It was also necessary to install some library stuff:</p>
<pre># apt install libvte9</pre>
<p>But then it complained that it couldn&#8217;t find some terminal.xml file. So</p>
<pre># cd /usr/share/
# ln -s /oldy-root/usr/share/gnome-terminal</pre>
<p>And then I needed to set up the keystroke shortcuts (Copy, Paste, New Tab etc.) but that&#8217;s really no bother.</p>
<h3>Other things to keep in mind</h3>
<ul>
<li>Some users and groups must be migrated manually from the old system to the new one. I always do this when installing a new computer, to make NFS work properly etc., but in this case some service-related users and groups need to be in sync as well.</li>
<li>Not directly related, but if the IP address of the host changes (which it usually does), set the updated IP address in /etc/sendmail.mc and recompile. Otherwise, you get an error saying &#8220;opendaemonsocket: daemon MTA: cannot bind: Cannot assign requested address&#8221;.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://billauer.co.il/blog/2018/11/linux-chroot-system-in-parallel/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>VMplayer: Silencing excessive hard disk activity + getting rid of freezes</title>
		<link>http://billauer.co.il/blog/2017/05/vmplayer-disk-calm/</link>
		<comments>http://billauer.co.il/blog/2017/05/vmplayer-disk-calm/#comments</comments>
		<pubDate>Tue, 30 May 2017 14:18:29 +0000</pubDate>
		<dc:creator>eli</dc:creator>
				<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://billauer.co.il/blog/?p=5231</guid>
		<description><![CDATA[The disk is hammering For some unknown reason, possibly after a VMplayer upgrade, running any Windows virtual machine on my Linux machine with VMware Player caused some non-stop heavy hard disk activity, even when the guest machine was effectively idle, and had no I/O activity of its own. Except for being surprisingly annoying, it [...]]]></description>
			<content:encoded><![CDATA[<h3>The disk is hammering</h3>
<p>For some unknown reason, possibly after a VMplayer upgrade, running any Windows virtual machine on my Linux machine with VMware Player caused some non-stop heavy hard disk activity, even when the guest machine was effectively idle, and had no I/O activity of its own.</p>
<p>Besides being surprisingly annoying, it also made the mouse pointer non-responsive, and it affected the hosting machine adversely as well.</p>
<div>
<p>So eventually I managed to get things back to normal by editing the virtual machine&#8217;s .vmx file as described below.</p>
<p>I have VMplayer 6.0.2 on Fedora 12 (I suppose both are considered quite old).</p>
<p>Following <a href="https://artykul8.com/2012/06/vmware-performance-enhancing/" target="_blank">this post</a>, add</p>
<pre>isolation.tools.unity.disable = "TRUE"
unity.allowCompositingInGuest = "FALSE"
unity.enableLaunchMenu = "FALSE"
unity.showBadges = "FALSE"
unity.showBorders = "FALSE"
unity.wasCapable = "FALSE"</pre>
<p>(unity.wasCapable was already in the file, so remove it first)</p>
<p>That appeared to help somewhat. But what really gave the punch was also adding</p>
<pre>MemTrimRate = "0"
sched.mem.pshare.enable = "FALSE"
MemAllowAutoScaleDown = "FALSE"</pre>
<p>Don&#8217;t ask me what it means. Your guess is as good as mine.</p>
<h3>The Linux desktop freezes</h3>
<p>Freezes = Cinnamon&#8217;s clock stops advancing for a minute or so. Apparently, it&#8217;s the graphics that doesn&#8217;t update for about 1.5 seconds each time the mouse pointer moves on or off the area belonging to the guest&#8217;s display. But it accumulates, so moving the mouse all over the place trying to figure out what&#8217;s going on easily stretches this freeze into a whole minute.</p>
<p><del>Just turn off the display&#8217;s hardware acceleration. That is, enter the virtual machine settings the GUI menus, pick the display, and uncheck &#8220;Accelerate 3D graphics&#8221;. Bliss.</del></p>
<p>Nope, it didn&#8217;t help. :(</p>
<p>Also tried to turn off the usage of OpenGL with</p>
<pre>mks.noGL = "TRUE"</pre>
<p>and indeed there was nothing OpenGL related in the log file (vmware.log), but the problem remained.</p>
<p>This command was taken from a <a href="https://www.basvanbeek.nl/linux/undocumented-vmware-vmx-parameters/" target="_blank">list of undocumented parameters</a> (there also <a href="http://www.sanbarrow.com/vmx/vmx-advanced.html" target="_blank">this one</a>).</p>
<p>Upgrading to VMPlayer 15.5.6 didn&#8217;t help. Neither did adding vmmouse.present = &#8220;FALSE&#8221;.</p>
<p>But after the upgrade, my Windows XP got horribly slow, and it seems like it had problems accessing the disk as well (upgrading is always a good idea, as we all know). Programs didn&#8217;t seem to launch properly and such. I may have worked around that by setting the VM&#8217;s type to &#8220;Other&#8221; (i.e. not something Windows related). That turns VMTools off, and maybe that&#8217;s actually a good idea.</p>
<p>The solution I eventually adopted was to use VMPlayer as a VNC server. So I ignore the emulated display window that is opened directly by VMPlayer, and connect to it with a VNC viewer on the local machine instead. Rather odd, but works. The only annoying thing is that Alt-Tab and Alt-Shift keystrokes etc. aren&#8217;t captured by the guest. To set this up, go to the virtual machine settings &gt; Options &gt; VNC Connections and set to enabled. If the port number is set to 5901 (i.e. 5900 with an offset of 1), the connection is done with</p>
<pre>$ vncviewer :1 &amp;</pre>
<p>(or pick your other favorite viewer).</p>
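<p>The display-to-port mapping is just the usual VNC convention, TCP port 5900 + N for display :N. For example:</p>

```shell
# Derive the VNC display number from the port configured in VMPlayer
port=5901
echo ":$((port - 5900))"   # → :1
```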
<h3>The computer is a slug</h3>
<p>On a newer machine, with 64 GiB RAM and a more recent version of VMPlayer, it took a few seconds to switch back and forth between the VMPlayer window and anything else. The fix, as root, is:</p>
<pre># echo never &gt; /sys/kernel/mm/transparent_hugepage/defrag
# echo never &gt; /sys/kernel/mm/transparent_hugepage/enabled</pre>
<p>taken from <a href="https://unix.stackexchange.com/questions/161858/arch-linux-becomes-unresponsive-from-khugepaged/185172#185172" target="_blank">here</a>. There are still some slight freezes when working in a window that overlaps the VMPlayer window (and with other kinds of back-and-forth with VMPlayer), but it&#8217;s significantly better this way.</p>
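<p>These settings don&#8217;t survive a reboot. One way to make them persistent (an obvious sketch, not something taken from the original answer) is to add the same two lines to /etc/rc.local or a similar boot-time script:</p>

```shell
# /etc/rc.local fragment: disable transparent hugepage handling at boot
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
```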
</div>
]]></content:encoded>
			<wfw:commentRss>http://billauer.co.il/blog/2017/05/vmplayer-disk-calm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Playing with Linux namespaces</title>
		<link>http://billauer.co.il/blog/2015/07/linux-namespaces-unshare/</link>
		<comments>http://billauer.co.il/blog/2015/07/linux-namespaces-unshare/#comments</comments>
		<pubDate>Thu, 16 Jul 2015 07:39:00 +0000</pubDate>
		<dc:creator>eli</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Linux kernel]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://billauer.co.il/blog/?p=4723</guid>
		<description><![CDATA[Intro Linux namespaces is the foundation for container-based virtualization, which is becoming increasingly popular. Aside from the ability to isolate a shell (and the processes it generates) from the &#8220;main environment&#8221;, as is required for this kind of lightweight virtualization, namespaces is useful for overriding selected functionalities. So I&#8217;m jotting down things I use myself. [...]]]></description>
			<content:encoded><![CDATA[<h3>Intro</h3>
<p>Linux namespaces are the foundation for container-based virtualization, which is becoming increasingly popular. Aside from the ability to isolate a shell (and the processes it generates) from the &#8220;main environment&#8221;, as is required for this kind of lightweight virtualization, namespaces are useful for overriding selected functionalities.</p>
<p>So I&#8217;m jotting down things I use myself.</p>
<h3>Using an ad-hoc host name</h3>
<pre>[root@mycomputer eli]# uname -n
mycomputer.localdomain
[root@mycomputer eli]# unshare -u bash
[root@mycomputer eli]# uname -n
<strong>mycomputer.localdomain</strong>
[root@mycomputer eli]# hostname newname.localdomain
[root@mycomputer eli]# uname -n
<strong>newname.localdomain</strong>
[root@mycomputer eli]# <strong>exit</strong>
[root@mycomputer eli]# uname -n
mycomputer.localdomain</pre>
<p>Note that unshare started a new bash shell, with a separate namespace. That&#8217;s why hostname&#8217;s effect ended when pressing CTRL-D and exiting the shell (where it says &#8220;exit&#8221;).</p>
<p>The truth is that &#8220;unshare [options] bash&#8221; and just &#8220;unshare [options]&#8221; do the same thing &#8212; in the absence of a program to execute, unshare kicks off a new shell.</p>
<h3>Hiding the network from a program</h3>
<p>Some programs have an annoying &#8220;feature&#8221; of reporting back (hopefully anonymous) information to their vendor&#8217;s server. If such a program has nothing to do on the internet, it can be run from a subshell that can&#8217;t possibly see any network interface. Except for loopback, of course. For example,</p>
<pre># unshare -n bash
# ifconfig lo up
# ifconfig
lo        Link encap:Local Loopback 
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)</pre>
<p>I should mention that iptables allows rules per user, and there&#8217;s always SELinux for those with strong nerves, so these are alternative solutions. But a network namespace looks much simpler to me.</p>
<p>The loopback interface is down unless explicitly brought up, as shown above.</p>
<p>In a more sophisticated version, one can add a tunnel interface to the &#8220;real world&#8221; and apply iptables rules on that dedicated interface. Note that iptables is local to the new network namespace, so its rule set starts out empty. The firewall rules can therefore be written inside the network namespace, leaving the main firewall untouched, or rules can be set up on the main firewall for the tunnel interface in the &#8220;real world&#8221; network namespace.</p>
]]></content:encoded>
			<wfw:commentRss>http://billauer.co.il/blog/2015/07/linux-namespaces-unshare/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>VMware Player or Workstation: Patching for Linux kernel 3.12 (or so)</title>
		<link>http://billauer.co.il/blog/2014/05/vmware-player-workstation-patches-kernel-3-12/</link>
		<comments>http://billauer.co.il/blog/2014/05/vmware-player-workstation-patches-kernel-3-12/#comments</comments>
		<pubDate>Fri, 30 May 2014 14:59:25 +0000</pubDate>
		<dc:creator>eli</dc:creator>
				<category><![CDATA[Linux kernel]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://billauer.co.il/blog/?p=4257</guid>
		<description><![CDATA[For a reason not so clear to me, VMware doesn&#8217;t keep its drivers up to date with newer kernels, so they fail to compile against newer kernels. Consequently, there&#8217;s an insane race to patch them up. It starts with a compilation failure at the GUI level, and sooner or later it becomes clear that there&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>For a reason not so clear to me, VMware doesn&#8217;t keep its drivers up to date, so they fail to compile against newer kernels. Consequently, there&#8217;s an insane race to patch them up. It starts with a compilation failure at the GUI level, and sooner or later it becomes clear that there&#8217;s no choice but to get down to work.</p>
<p>The procedure, if you don&#8217;t want to install VMware on the computer you&#8217;re working on, is to first extract the files with something like (as a non-root user)</p>
<pre>$ bash VMware-Player-6.0.2-1744117.x86_64.bundle --extract newplayer</pre>
<p>which just writes the files into a new directory newplayer/.</p>
<p>Next step is to untar the relevant directories into a fresh working directory. May I suggest:</p>
<pre>for i in <span style="color: #888888;">/path/to</span>/newplayer/vmware-vmx/lib/modules/source/*.tar ; do tar -xvf $i ; done</pre>
<p>This creates five new directories, each containing a Makefile and kernel module sources. In principle, the goal is to type &#8220;make&#8221; in all five and not have an error.</p>
<p>That&#8217;s where the headache is. I&#8217;ve packed up a set of patches that took me from Player 6.0.2 (or Workstation 10.0.2) to kernel 3.12 (in a rather messy way, but hey, nothing about it is really in order): <a href="http://billauer.co.il/blog/wp-content/uploads/2014/05/patches.tar.gz">Here they are as a tarball.</a></p>
<p>Once that is done, wrap it all up with</p>
<pre>$ for i in vmblock vmci vmmon vmnet vsock ; do tar -cf $i.tar $i-only ; done</pre>
<p>and copy the *.tar files into /usr/lib/vmware/modules/source/ (actually, replace the existing files) in an existing installation.</p>
<p>And then just run VMware as usual, and behave nicely when it wants to install modules.</p>
<p>Or if you want to see it with your own eyes (as root):</p>
<pre># vmware-modconfig --console --install-all</pre>
<h3>VMCI issues</h3>
<p>As if it wasn&#8217;t enough as is, there was a problem with running the VMCI:</p>
<pre>Starting VMware services:
 Virtual machine monitor                                 [  OK  ]
 Virtual machine communication interface                 [<span style="color: #ff0000;"><strong>FAILED</strong></span>]
 VM communication interface socket family                [  OK  ]
 Blocking file system                                    [  OK  ]
 Virtual ethernet                                        [  OK  ]
 VMware Authentication Daemon                            [  OK  ]</pre>
<p>And of course, no virtual machine would agree to run this way.</p>
<p>After a lot of messing around, it turned out that for some reason, the module wasn&#8217;t installed. Go figure.</p>
<p>The solution was to go back to the vmci-only directory (one of the directories untarred above), compile it with &#8220;make&#8221; (it should work by now, after all), and then copy the module to the running kernel&#8217;s module repository:</p>
<pre># cp vmci.ko /lib/modules/$(uname -r)/kernel/drivers/misc/
# depmod -a</pre>
<p>Or maybe just create a kernel/drivers/vmware directory, copy all the *.ko files that were compiled into it, and run depmod.</p>
<p>I. Never. Want. To hear. About. This. Again.</p>
]]></content:encoded>
			<wfw:commentRss>http://billauer.co.il/blog/2014/05/vmware-player-workstation-patches-kernel-3-12/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Running a nested X-Windows server</title>
		<link>http://billauer.co.il/blog/2013/10/xnest-prevent-focus-stealing/</link>
		<comments>http://billauer.co.il/blog/2013/10/xnest-prevent-focus-stealing/#comments</comments>
		<pubDate>Wed, 30 Oct 2013 19:35:25 +0000</pubDate>
		<dc:creator>eli</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://billauer.co.il/blog/?p=3886</guid>
		<description><![CDATA[Why? It&#8217;s sometimes desired to run an X-Windows program in a separate &#8220;screen&#8221; but not actually have another screen. The expensive way is to bring up a whole virtual server. But if it&#8217;s fine to run the program on the same computer, all we want is to have a window, in which the program is [...]]]></description>
			<content:encoded><![CDATA[<h3>Why?</h3>
<p>It&#8217;s sometimes desired to run an X-Windows program in a separate &#8220;screen&#8221; but not actually have another screen. The expensive way is to bring up a whole virtual server. But if it&#8217;s fine to run the program on the same computer, all we want is to have a window, in which the program is confined.</p>
<p>This is handy if the program has a tendency to steal focus with popups all the time.</p>
<p>It&#8217;s also useful for opening windows from a remote machine when the regular X server refuses, despite being generous with &#8220;xhost +&#8221;. The nested server isn&#8217;t picky about who it&#8217;s hosting.</p>
<h3>Some installations</h3>
<p>First, install Xnest if it&#8217;s not already installed, e.g. (as root)</p>
<pre># yum install Xnest</pre>
<p>It&#8217;s also possible to install a very simple (and somewhat yucky) window manager</p>
<pre># yum install twm</pre>
<h3>Action</h3>
<p>Then open a new window, which turns into a new X server:</p>
<pre>$ Xnest -s 0 -ac :1 -geometry 1900x1020+5+0 &amp;</pre>
<p>The dimensions given by the &#8220;geometry&#8221; flag are those that cover the full screen on my monitor. This varies, of course.</p>
<p>Launch a Window Manager and a terminal window in the new X server. The former is necessary to make windows movable, resizable, etc.</p>
<pre>$ twm -display :1
$ DISPLAY=:1 gnome-terminal &amp;</pre>
<p>Note that apparently nothing happens after launching the first command, because there are no clients in the Xnest window.</p>
<p>And then use the terminal to run applications inside the nested X-window server.</p>
<h3>twm too yucky?</h3>
<p>The Gnome window manager can be used instead, by issuing the following rather than the twm command:</p>
<pre>$ DISPLAY=:1 gnome-wm &amp;</pre>
<p>The reason not to use Gnome&#8217;s window manager is that it allows minimizing windows. If that is done accidentally, the window is lost (unless a bottom panel is generated, which starts to get a bit messy for a simple task).</p>
]]></content:encoded>
			<wfw:commentRss>http://billauer.co.il/blog/2013/10/xnest-prevent-focus-stealing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Setting up a VPS server. It was a bumpy road.</title>
		<link>http://billauer.co.il/blog/2013/01/vps-service-openvz-hosting/</link>
		<comments>http://billauer.co.il/blog/2013/01/vps-service-openvz-hosting/#comments</comments>
		<pubDate>Fri, 11 Jan 2013 11:32:06 +0000</pubDate>
		<dc:creator>eli</dc:creator>
				<category><![CDATA[email]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Server admin]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://billauer.co.il/blog/?p=3341</guid>
		<description><![CDATA[Introduction These are my own notes as I set up an OpenVZ VPS server, based upon CentOS 5.6 to function as a web and mailing list server. A $36/year 128 MB RAM machine was good enough for this. Since there&#8217;s some criticism about the hosting provider, and it looks like they&#8217;re OK after all, I&#8217;m [...]]]></description>
			<content:encoded><![CDATA[<h3>Introduction</h3>
<p>These are my own notes as I set up an OpenVZ VPS server, based upon CentOS 5.6 to function as a web and mailing list server. A $36/year 128 MB RAM machine was good enough for this.</p>
<p>Since there&#8217;s some criticism about the hosting provider, and it looks like they&#8217;re OK after all, I&#8217;m leaving their name out for now. The main purpose of this post is to help myself get started again, if that is ever necessary (I sure hope it will never be).</p>
<h3>Foul #1: Mails from hosting provider marked as spam</h3>
<p>This is the first time it has happened to me that automated emails from a service provider go to Gmail&#8217;s spam box. That includes the welcome mails as I subscribed, the payment confirmation and the message telling me the server was up. And messages about support tickets. None arrived in the inbox.</p>
<p>Spamassassin gives these mails some points (1.3 or so) as well. I&#8217;ve hardly seen anything like this from any decent automatic mail producer. I first thought this was a major blunder, but then it turned out that machine-generated emails tend to get caught by spam filters. Since email messages relayed by a mailing list (mailman) don&#8217;t get caught, it looks like the spam filter checks the &#8220;Received&#8221; chain of headers for the first hops of the message, and tries to figure out whether there&#8217;s a decent ISP there. Just a wild guess.</p>
<p>Workaround: Add a filter in Gmail to never send emails from *@the-hosting-provider.com to the spam box. Simple, when you know about it.</p>
<h3>Foul #2: 12 hours from payment to server running</h3>
<p>Even for a low-cost service, 12 hours of &#8220;pending&#8221; is a bit too much. In particular when $36 has been paid. That alone filters out most scammers, I suppose.</p>
<h3>Foul #3: Root password not set</h3>
<p>Maybe I was naive to expect that the root password would already be set on the server: I tried to SSH into it with the password I had assigned during the subscription, but was consistently denied.</p>
<p>Workaround: Enter the VPS control panel and change the password.</p>
<h3>Foul #4: Uncertified HTTPS link</h3>
<p>The control panel of the VPS is accessed with a link to an IP address. Which is a bit weird, but let&#8217;s leave that alone. I mean, what about buying a domain for that purpose? To make things even worse, they supply an HTTPS link as well. Which works, but makes the browser display a scare &#8220;GET ME OUT OF HERE&#8221; message.</p>
<p>An uncertified HTTPS link is better than HTTP, even though cryptologists will argue that in the absence of a certificate, a man-in-the-middle attack is possible. But let&#8217;s get serious. It&#8217;s not really dangerous. It&#8217;s just yet another sign that they don&#8217;t give a shit. Setting up a domain and certifying it is something you would expect from any serious company, just to avoid that scary warning message. But they didn&#8217;t.</p>
<h3>Bump #1: Lacking yum repository</h3>
<p>Among the first things I did after logging in (because I&#8217;m addicted):</p>
<pre># yum install git
Loaded plugins: fastestmirror
Determining fastest mirrors
 * base: mirror01.th.ifl.net
 * extras: mirror01.th.ifl.net
 * updates: mirror01.th.ifl.net
base                                                    | 1.1 kB     00:00    
base/primary                                            | 967 kB     00:00    
base                                                                 2725/2725

[ ... yada yada ... ]

vz-updates/primary                                      | 1.0 kB     00:00    
vz-updates                                                                 3/3
Setting up Install Process
<strong>No package git available.</strong>
Nothing to do</pre>
<p>Are you kidding me? Using</p>
<pre># rpm -qa --last | head</pre>
<p>I got a list of packages installed, many of which were marked with &#8220;el5&#8221;, which isn&#8217;t surprising,</p>
<pre># cat /etc/redhat-release
CentOS release 5.6 (Final)</pre>
<p>since it&#8217;s a CentOS 5 distro (EL = Enterprise Linux).</p>
<p>The list of existing repos:</p>
<pre># yum repolist
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror01.th.ifl.net
 * extras: mirror01.th.ifl.net
 * updates: mirror01.th.ifl.net
repo id                                repo name                                      status
base                                   CentOS-5 - Base                                enabled: 2,725
extras                                 CentOS-5 - Extras                              enabled:   286
updates                                CentOS-5 - Updates                             enabled: 1,003
vz-base                                vz-base                                        enabled:     5
vz-updates                             vz-updates                                     enabled:     3
repolist: 4,022</pre>
<p>So where&#8217;s git? On Fedora 12, I checked where git was loaded from (for comparison) with</p>
<pre>$ yumdb info 'git*'
Loaded plugins: presto, refresh-packagekit
git-1.7.2.3-1.fc12.x86_64
 changed_by = 1010
 checksum_data = 470af233244731e51076c6aac5007e1eebd2f73f23cd685db7cd8bd6fb2b3dd1
 checksum_type = sha256
 command_line = install git-email
 from_repo = updates
 from_repo_revision = 1291265770
 from_repo_timestamp = 1291266900
 reason = user
 releasever = 12

[ ... here comes info about git-deamon and other packages ]</pre>
<p>So CentOS&#8217; repository doesn&#8217;t have git? That looks odd to me. A last try:</p>
<pre># yum list available | less</pre>
<p>No, git wasn&#8217;t on the list. The fix was to add <a href="http://repoforge.org/" target="_blank">Repoforge </a>to the list of repositories on the server (following the <a href="http://repoforge.org/use/" target="_blank">instructions</a>):</p>
<pre># wget http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.2-2.el5.rf.i386.rpm
# rpm -i rpmforge-release-0.5.2-2.el5.rf.i386.rpm</pre>
<p>And then &#8220;yum install git&#8221; went fine.</p>
<h3>Bump #2: Bad host name</h3>
<p>OK, it&#8217;s partly my fault. A short host name (without a dot) isn&#8217;t good enough, at least not for sending mails. A fully-qualified host name (such as example.com, as opposed to just &#8220;example&#8221;) is needed. Otherwise sendmail starts up very slowly and then refuses to send mails.</p>
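<p>A crude way to illustrate the distinction (all that matters here is whether the name contains a dot; the sample names are made up):</p>

```shell
# Report whether a host name looks fully qualified, i.e. contains a dot
is_fqdn() { case "$1" in *.*) echo FQDN ;; *) echo short ;; esac; }
is_fqdn example.com   # → FQDN
is_fqdn example       # → short
```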
<h3>Bump #3: Setting up reverse DNS</h3>
<p>For the server to be able to send emails that aren&#8217;t immediately detected as spam, its IP must have the reverse DNS set to its host name.</p>
<p>There is a place to edit the rDNS name in the control panel (under &#8220;Network&#8221;) but it was disabled. Contact support, it said. So I did.</p>
<p>Having rDNS disabled by default is somewhat understandable to at least keep an eye on spammers. On the other hand, show me a spammer paying $36 upfront.</p>
<p>It took support 11 hours to answer this support request, asking me to supply the rDNS record I needed for manual setting. The actual fix came an hour later, so overall this was fairly OK.</p>
<p>The automatic feature is simply not supported. But it&#8217;s not like decent people need to change their rDNS every day.</p>
<h3>Bump #4: No swap</h3>
<p>It seems like there is no way to activate a swap file on the VPS server (in particular, losetup returns &#8220;permission denied&#8221;, so there is nothing to attach a swap partition to). So there&#8217;s no choice but to make sure that the overall memory consumption doesn&#8217;t exceed the allocation of virtual RAM, which is 128 MB in my case. Otherwise processes will just die. I can understand the commercial sense in this limitation: if users could put large swap files on their systems, they would buy lower-cost machines and then complain that they&#8217;re not responsive.</p>
<p>The figure to keep track of is the amount of free + cached memory. For example,</p>
<pre>$ cat /proc/meminfo
MemTotal:         131072 kB
MemFree:           47516 kB
Cached:            27156 kB
[ ... ]</pre>
<p>The free memory is 47516 + 27156 = 74672 kB, which means 131072 &#8211; 74672 = 56400 kB is used. These are the figures that appear in the Control Panel.</p>
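<p>The same figure can be computed directly (a small sketch; the file path is an argument only so it can be pointed at a saved copy of /proc/meminfo):</p>

```shell
# "Used" memory as the control panel counts it: MemTotal minus
# (MemFree + Cached), all in kB, read from a meminfo-style file.
meminfo_used() {
  awk '/^MemTotal:/ {t=$2}
       /^MemFree:/  {f=$2}
       /^Cached:/   {c=$2}
       END          {print t - (f + c)}' "${1:-/proc/meminfo}"
}
```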
<h3>Installation note: Setting up sendmail</h3>
<p>By default, sendmail doesn&#8217;t accept external connections. Edit /etc/mail/sendmail.mc, changing</p>
<pre>DAEMON_OPTIONS(`Port=smtp,Addr=127.0.0.1, Name=MTA')dnl</pre>
<p>to</p>
<pre>DAEMON_OPTIONS(`Port=smtp, Name=MTA')dnl</pre>
<p>This removes the restriction that only the local address is listened on. Also change</p>
<pre>dnl DAEMON_OPTIONS(`Port=submission, Name=MSA, M=Ea')dnl</pre>
<p>to</p>
<pre>DAEMON_OPTIONS(`Port=submission, Name=MSA, M=Ea')dnl</pre>
<p>(remove the &#8220;dnl&#8221;, thereby uncommenting the line).</p>
<p>Check with the hosting provider if they supply a mail relay server. Relaying through a well-reputed server can decrease the spam score of your mails. Besides, the hosting provider can decide out of the blue to block all direct outgoing connections to mail servers, because spam was flying out of their machines.</p>
<p>A line like the following should be added (possibly close to where SMART_HOST is mentioned in the file).</p>
<pre>define(`SMART_HOST',`relay.myprovider.com')dnl</pre>
<p>And then compile the file into sendmail.cf, and restart the server as follows:</p>
<pre># make -C /etc/mail
# service sendmail restart</pre>
<p>Then test the server, possibly using <a title="A perl script sending mails for testing a mail server" href="http://billauer.co.il/blog/2013/01/perl-sendmail-exim-postfix-test/" target="_blank">a script</a>. In particular, verify that the mail server isn&#8217;t relaying (accepting messages to other domains) or the server turns into a spam machine.</p>
<h3>Installation note: Mailman</h3>
<pre># yum install mailman</pre>
<p>Then customize. See instructions <a href="http://minhtech.com/featuredlinux/install-and-configure-mailman/" target="_blank">here</a> and also have a look at /usr/share/doc/mailman-2.1.9/INSTALL.REDHAT. There&#8217;s a need to access the host by its domain name (as opposed to just the IP address), so the local computer&#8217;s /etc/hosts may need to be fiddled with when working on a server that hasn&#8217;t been allocated its final address yet.</p>
<p><strong>Note</strong>: Do <strong>not</strong> edit /usr/lib/mailman/Mailman/mm_cfg.py for changing DEFAULT_URL_HOST and DEFAULT_EMAIL_HOST, if they happen to say</p>
<pre>DEFAULT_URL_HOST   = fqdn
DEFAULT_EMAIL_HOST = fqdn</pre>
<p>because in this case they&#8217;re set up automagically.</p>
<p>First, make sure that the mailman daemon is off:</p>
<pre># service mailman stop</pre>
<p>To <a href="http://vuksan.com/linux/mailman_moving_lists.html" target="_blank">migrate a few lists from one server to another</a>, copy the respective lists in /var/lib/mailman/{lists,archives} into the new server. Note that the &#8220;data&#8221; directory doesn&#8217;t contain any information on the lists, so copying these two directories is enough.</p>
<p>The lists will not appear in the web interface if there was a domain switch during the list migration, as can be observed by searching for &#8216;web_page_url&#8217; in the output of</p>
<pre># /usr/lib/mailman/bin/dumpdb config.pck</pre>
<p>To fix this, go (for each list)</p>
<pre># /usr/lib/mailman/bin/withlist -l -r fix_url the-list-name</pre>
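<p>With more than a couple of lists, this can be looped over whatever sits in the lists directory (a sketch; assumes the stock Red Hat Mailman paths used above):</p>

```shell
# Print the name of every list found under a Mailman lists directory,
# one per line; pipe into withlist to run fix_url on each.
each_list() {
  for d in "${1:-/var/lib/mailman/lists}"/*/; do
    [ -d "$d" ] && basename "$d"
  done
}
# each_list | xargs -n 1 /usr/lib/mailman/bin/withlist -l -r fix_url
```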
<p>Make sure the files are owned by mailman with</p>
<pre># chown -R mailman:mailman ... the copied directories ...</pre>
<p>At this point, the list should appear on the web console. Copy the entries into /etc/aliases, more or less like this:</p>
<pre>## listname mailing list
listname:              "|/usr/lib/mailman/mail/mailman post listname"
listname-admin:        "|/usr/lib/mailman/mail/mailman admin listname"
listname-bounces:      "|/usr/lib/mailman/mail/mailman bounces listname"
listname-confirm:      "|/usr/lib/mailman/mail/mailman confirm listname"
listname-join:         "|/usr/lib/mailman/mail/mailman join listname"
listname-leave:        "|/usr/lib/mailman/mail/mailman leave listname"
listname-owner:        "|/usr/lib/mailman/mail/mailman owner listname"
listname-request:      "|/usr/lib/mailman/mail/mailman request listname"
listname-subscribe:    "|/usr/lib/mailman/mail/mailman subscribe listname"
listname-unsubscribe:  "|/usr/lib/mailman/mail/mailman unsubscribe listname"</pre>
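<p>Typing that stanza out for each migrated list gets old quickly; it can be generated instead (a hypothetical helper, using the same wrapper path as the stanza above):</p>

```shell
# Emit the /etc/aliases stanza for one Mailman list: the "post" alias
# plus the nine administrative aliases.
mailman_aliases() {
  list="$1"
  printf '## %s mailing list\n' "$list"
  printf '%s:  "|/usr/lib/mailman/mail/mailman post %s"\n' "$list" "$list"
  for cmd in admin bounces confirm join leave owner request subscribe unsubscribe; do
    printf '%s-%s:  "|/usr/lib/mailman/mail/mailman %s %s"\n' "$list" "$cmd" "$cmd" "$list"
  done
}
# mailman_aliases listname >> /etc/aliases && newaliases
```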
<p>And finally, turn the service on again:</p>
<pre># service mailman start</pre>
<p>And make it a permanent service:</p>
<pre># chkconfig mailman on</pre>
<p>Changing the subscription confirmation message (to tell users to look in the spam folder): Edit /usr/lib/mailman/templates/en/subscribe.html, and remove /var/lib/mailman/lists/{listname}/en/subscribe.html (if it exists, and contains nothing special for the list).</p>
<p>Then restart qrunner to flush cached templates:</p>
<pre># /usr/lib/mailman/bin/mailmanctl restart</pre>
<p>See <a href="http://mail.python.org/pipermail/mailman-users/2004-June/037497.html" target="_blank">these</a> <a href="http://mail.python.org/pipermail/mailman-users/2004-June/037650.html" target="_blank">two</a> pages for more info about this (even though they&#8217;re not very accurate).</p>
]]></content:encoded>
			<wfw:commentRss>http://billauer.co.il/blog/2013/01/vps-service-openvz-hosting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Virtualization: Notes to self</title>
		<link>http://billauer.co.il/blog/2010/01/virtualization-notes-to-self/</link>
		<comments>http://billauer.co.il/blog/2010/01/virtualization-notes-to-self/#comments</comments>
		<pubDate>Fri, 15 Jan 2010 21:29:42 +0000</pubDate>
		<dc:creator>eli</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://billauer.co.il/blog/?p=414</guid>
		<description><![CDATA[This is just things I wrote down while playing with QEMU/KVM virtualization, for my own purposes of packing two existing computers into a third one. There is no point to make here, and neither do I expect anyone to understand this. It&#8217;s published because I don&#8217;t care to. Log files There are definitely two files [...]]]></description>
			<content:encoded><![CDATA[<p>This is just things I wrote down while playing with QEMU/KVM virtualization, for my own purposes of packing two existing computers into a third one. There is no point to make here, and neither do I expect anyone to understand this. It&#8217;s published because I don&#8217;t care to.</p>
<h3>Log files</h3>
<p>There are definitely two files one wants to peek at every now and then</p>
<ul>
<li>/var/log/libvirt/qemu/{guest-name}.log (in particular when a USB device doesn&#8217;t attach)</li>
<li>/var/log/audit/audit.log (SELinux audit, possibly piped to grep -E &#8216;^type=(AVC|SELINUX\_ERR)&#8217; to reduce amount of junk)</li>
</ul>
<p>Start with running virt-manager <strong>as a non-root user</strong> (it will fail with a nondescriptive message if trying to run it as root). A root password will have to be fed.</p>
<h3>Use qcow2 disk images</h3>
<p>If you&#8217;re going to play around with the image, and then maybe want to get rid of the changes, this is sooo simple:</p>
<pre># qemu-img create -f qcow2 -b clean-hda.img -F raw hda.qcow2</pre>
<p>Note that qemu-img can also create and apply snapshots of images, which is also good.</p>
<p>Don&#8217;t try to use Virtual Machine manager&#8217;s clone utility for qcow2 images, since the tool will rewrite all data.</p>
<h3>USB device passthrough</h3>
<p>It looks like libvirt <a href="https://bugzilla.redhat.com/show_bug.cgi?id=550951" target="_blank">doesn&#8217;t massage</a> the permissions of USB character devices before adopting them, so both classic permissions and SELinux blow up when trying to run passthrough. The current workaround is to change the USB device&#8217;s classic permissions manually, and run in permissive mode. Which is indeed unhealthy, but harmless given the fact that XP gives a healthy blue screen in response to the new device anyhow. I think I saw XP complaining about something regarding USB 2.0 versus USB 1.1.</p>
<p>Running the same scenario with a knoppix LiveCD image, I managed to find the device (a Canon 500D camera) with lsusb. The root USB hub was shown to be a UHCI. I&#8217;m not sure whether a real PTP camera would respond to a UHCI hub, which maybe explains why Windows got confused finding such a camera under the emulated hub.</p>
<p>The emulator appears as /usr/bin/qemu-kvm under the SELinux domain (type) svirt_t, which is declared in /usr/share/selinux/devel/include/services/virt.if (which you don&#8217;t want to read).</p>
<p>USB devices appear somewhere under /dev/bus/usb/ as character devices with SELinux type usb_device_t.</p>
<h3>Command line</h3>
<p>Create a guest, run it and pause it. Then dump its XML by something like (as root):</p>
<pre># virsh dumpxml try1 &gt; ~eli/virt/try1.xml</pre>
<p>Then play around with the XML file a bit, destroy the previous domain, and</p>
<pre># virsh create ~eli/virt/try2.xml</pre>
<p>How can I manually manipulate the list of guests?</p>
<pre>virsh # start demo
error: Failed to start domain demo
error: internal error unable to start guest: qemu: could not open disk image /home/eli/virt/tryxp.img: Permission denied</pre>
<p>Reason(?): tryxp.img has the wrong context, so <a href="http://www.mail-archive.com/libvir-list@redhat.com/msg11381.html" target="_blank">SELinux prevents it from being opened</a>&#8230;? But I ran SELinux in permissive mode. How could this happen? Or, as put in the <a href="http://docs.fedoraproject.org/virtualization-guide/f12/en-US/html/chap-Virtualization_Guide-Security_for_virtualization.html" target="_blank">Fedora 12 Virtualization Guide</a>:</p>
<blockquote><p>SELinux prevents guest images from loading if SELinux is enabled and the images are not in the correct directory. SELinux requires that all guest images are stored in <code class="filename">/var/lib/libvirt/images</code>.</p></blockquote>
<p>Basically, what solved this was to move the image to the dedicated directory, and going:</p>
<pre># virt-install --force --name demo3 --ram 1024 --import --disk path=/var/lib/libvirt/images/tryxp.img</pre>
<p>or even better:</p>
<pre># virt-install --force --name demo5 --ram 1024 --import --disk path=/var/lib/libvirt/images/hda.img</pre>
<p>For playing around:</p>
<pre># virsh</pre>
<h3>&#8220;Stealing&#8221; command lines from Virtual Machine Manager</h3>
<p>After running a machine under the GUI interface, it&#8217;s possible to reuse its command line for running on an external VNC console. Just find the command with ps aux | grep qemu-kvm. The following changes in flags apply:</p>
<ul>
<li>Remove the -S flag. It says that the guest should not start until commanded to do so.</li>
<li>Remove the -monitor flag. We&#8217;re running on VNC only</li>
<li>Change the -vnc flag&#8217;s address to point to an address known to the outside world, if necessary</li>
<li>Add &#8220;-usbdevice tablet&#8221; so that the mouse is followed correctly</li>
<li>Change the -net flags (two of them) to &#8220;-net nic -net user&#8221;. This is said to have a performance hit, but it&#8217;s simple and it works with an internal (fake) DHCP server</li>
</ul>
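<p>Getting the raw command line out of /proc takes a little massaging, since the arguments are NUL-separated there (a sketch; the commented line shows the intended use, with pgrep picking the first matching guest):</p>

```shell
# Print a process's argv one argument per line, ready for hand editing.
# /proc/<pid>/cmdline separates arguments with NUL bytes.
show_cmdline() {
  tr '\0' '\n' < "$1"
}
# show_cmdline "/proc/$(pgrep -f qemu-kvm | head -n 1)/cmdline"
```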
<h3>The tip of the day</h3>
<p>In the relevant terminology, &#8220;source&#8221; refers to information seen by the <strong>host</strong> OS, while &#8220;target&#8221; to the <strong>guest</strong> OS.</p>
<p>Another little tip: If I try to install Windows 7, and the installation gets stuck for very long periods of time with nothing really happening, maybe it&#8217;s because the disk image is read-only? :-O</p>
<h3>VMPlayer</h3>
<p>Running VMplayer on a 2.6.35 kernel requires a small fix, which was published <a href="http://www.rrfx.net/2010/06/vmware-vmmon-module-compilation-issues.html" target="_blank">here</a>. The thing is that some kernel symbol has changed its name, and hence the vmmon module fails to compile. How I love it when people are sensitive about backward compatibility.</p>
<p>To make a long story short, one needs to go to where the VMplayer module sources are, and go:</p>
<pre>$ perl -pi -e 's,_range,,' iommu.c</pre>
<p>which is a bit of a Columbus egg, I would say. Also, VMplayer repeatedly wanted to compile the modules every time I started it, because it missed vsock (which wasn&#8217;t compiled in the first place), so I followed the <a href="http://www.rrfx.net/2010/06/vmware-vmmon-module-compilation-issues.html" target="_blank">same page</a>&#8216;s hint and edited /etc/vmware/config to say</p>
<pre>VSOCK_CONFED = "no"</pre>
<p>By the way, I tried to figure out what this module does, and all google results tell you how to tweak and patch. Don&#8217;t people care what they do on their computers? Maybe this component is useful?</p>
<p>The following remark was true when this post was published, but no more:</p>
<p><span style="text-decoration: line-through;">To run VMPlayer under Fedora 12, there&#8217;s a need for a little hack, or the VMPlayer closes <a href="http://forums.fedoraforum.org/showthread.php?p=1314667" target="_blank">right after starting</a>:</span></p>
<pre><span style="text-decoration: line-through;"># cd /usr/lib/vmware/resources/
# mv mozilla-root-certs.crt old-mozilla-root-certs.crt</span></pre>
<p><span style="text-decoration: line-through;">Have no idea why this is.</span></p>
<h3>VMPlayer networking</h3>
<p>The interesting stuff is at /etc/vmware/networking, which pretty much says which interface is NATed and which is host-only. To pick a certain device for bridging, additional lines configuring add_bridge_mapping should be added as explained on <a href="http://cromwell-intl.com/unix/vmware-networking.html" target="_blank">this page</a>.</p>
<p>Also useful to play with</p>
<pre># vmware-networks --status
# vmware-networks --start
# vmware-networks --stop</pre>
<p>etc. (as root)</p>
<h3>Moving an old Linux computer</h3>
<p>The mission: Move an old 2.4.21-kernel based computer into a VMPlayer virtual machine. The strategy was to run a LiveCD on the virtual machine, create an empty ext3 file system on it, copy the files and boot. Caveats follow.</p>
<p>The most important lesson learned is that everything on the new machine has to be done with a kernel of the same or an earlier version than the one that will eventually run. In simple words, the rescue disk must run a matching kernel. In particular, with a newer rescue disk, the ext3 is generated with an inode size of 256, which old kernels don&#8217;t support. Even worse, even if the file system was generated properly, newer kernels (say, 2.6.35) write things to the disk that will confuse the old kernel. This leads to &#8220;attempt to access beyond end of device&#8221; errors during boot, from 01:00 (the ramdisk) as well as 08:01 (the root filesystem&#8217;s partition).</p>
<p>So fdisk, mkfs.ext3 and mkinitrd must be done with a matching kernel running. And copying the files too, of course. The rescue disk must match, as mentioned above.</p>
<p>The next thing to note is that all hda&#8217;s turn into sda&#8217;s. That needs to be adjusted in /etc/fstab as well as /etc/lilo.conf.</p>
<p>The most difficult thing to handle was the fact that SCSI drivers were not installed in the kernel by default, so the initrd image had to be adjusted. So after the file system is copied, mount it, chroot to it, and go</p>
<pre># mkinitrd --with=scsi_mod --with=sd_mod --with=BusLogic /boot/initrd-2.4.21-mykern.vmware.img 2.4.21-mykern</pre>
<p>The insmods are attempted in the same order the flags appear, and the latter two modules depend on scsi_mod. So it&#8217;s important to keep the order of the flags as above.</p>
<p>If these modules aren&#8217;t loaded, the generation of /dev/sda and /dev/sda1 doesn&#8217;t occur, resulting in a kernel panic with various complaints (pivot_root fails, init not found and some more).</p>
<p>And then fix /etc/lilo.conf and /etc/fstab and run lilo. It&#8217;s recommended to copy the /boot directory first, so that the kernel image falls within the lower 4 GB. Or pick lba32 option, as lilo will tell you.</p>
<p>And then boot. Mount the ISO image for VMware tools (/dev/hdc) and run ./vmware-install (going with the defaults most of the time).</p>
<h3>Converting a VMPlayer machine to VirtualBox</h3>
<p>Make a copy of the entire directory to a new place. No point messing things up. Then create an OVF file, without converting the disks (because that takes forever, and possibly fails):</p>
<pre>$ ovftool --noDisks CleanXP.vmx export.ovf</pre>
<p>For some reason, ovftool doesn&#8217;t fill in the correct names of the disks (I asked it not to convert them, not to <strong>ignore</strong> them). The simplest way around this is to remove the disk definitions altogether, and import the disks manually from VirtualBox. For a VM with three disks, the lines marked in red should be removed:</p>
<pre> &lt;References&gt;
<span style="color: #ff0000;"><strong> &lt;File ovf:href="export-disk1.vmdk" ovf:id="file1" ovf:size="0"/&gt;</strong></span>
<span style="color: #ff0000;"><strong> &lt;File ovf:href="export-disk2.vmdk" ovf:id="file2" ovf:size="0"/&gt;</strong></span>
<span style="color: #ff0000;"><strong> &lt;File ovf:href="export-disk3.vmdk" ovf:id="file3" ovf:size="0"/&gt;</strong></span>
 &lt;/References&gt;
 &lt;DiskSection&gt;
 &lt;Info&gt;Virtual disk information&lt;/Info&gt;
<span style="color: #ff0000;"><strong> &lt;Disk ovf:capacity="10" ovf:capacityAllocationUnits="byte * 2^30" ovf:diskId="vmdisk1" ovf:fileRef="file1" ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" ovf:populatedSize="0"/&gt;</strong></span>
<span style="color: #ff0000;"><strong> &lt;Disk ovf:capacity="30" ovf:capacityAllocationUnits="byte * 2^30" ovf:diskId="vmdisk2" ovf:fileRef="file2" ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" ovf:populatedSize="0"/&gt;</strong></span>
<span style="color: #ff0000;"><strong> &lt;Disk ovf:capacity="40" ovf:capacityAllocationUnits="byte * 2^30" ovf:diskId="vmdisk3" ovf:fileRef="file3" ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized" ovf:populatedSize="0"/&gt;</strong></span>
 &lt;/DiskSection&gt;</pre>
<p>and also the other references to these vmdisk<em>n</em>:</p>
<pre>&lt;Item&gt;
 &lt;rasd:AddressOnParent&gt;0&lt;/rasd:AddressOnParent&gt;
 &lt;rasd:ElementName&gt;disk0&lt;/rasd:ElementName&gt;
 &lt;rasd:HostResource&gt;ovf:/disk/vmdisk1&lt;/rasd:HostResource&gt;
 &lt;rasd:InstanceID&gt;7&lt;/rasd:InstanceID&gt;
 &lt;rasd:Parent&gt;6&lt;/rasd:Parent&gt;
 &lt;rasd:ResourceType&gt;17&lt;/rasd:ResourceType&gt;
 &lt;/Item&gt;
 &lt;Item&gt;
 &lt;rasd:AddressOnParent&gt;1&lt;/rasd:AddressOnParent&gt;
 &lt;rasd:ElementName&gt;disk1&lt;/rasd:ElementName&gt;
 &lt;rasd:HostResource&gt;ovf:/disk/vmdisk2&lt;/rasd:HostResource&gt;
 &lt;rasd:InstanceID&gt;8&lt;/rasd:InstanceID&gt;
 &lt;rasd:Parent&gt;6&lt;/rasd:Parent&gt;
 &lt;rasd:ResourceType&gt;17&lt;/rasd:ResourceType&gt;
 &lt;/Item&gt;
&lt;Item&gt;
 &lt;rasd:AddressOnParent&gt;0&lt;/rasd:AddressOnParent&gt;
 &lt;rasd:ElementName&gt;disk2&lt;/rasd:ElementName&gt;
 &lt;rasd:HostResource&gt;ovf:/disk/vmdisk3&lt;/rasd:HostResource&gt;
 &lt;rasd:InstanceID&gt;10&lt;/rasd:InstanceID&gt;
 &lt;rasd:Parent&gt;4&lt;/rasd:Parent&gt;
 &lt;rasd:ResourceType&gt;17&lt;/rasd:ResourceType&gt;
 &lt;/Item&gt;</pre>
<p>Delete the export.mf file. It contains export.ovf&#8217;s hash signature, and it&#8217;s wrong after editing the file.</p>
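<p>Alternatively, the manifest can be regenerated instead of deleted (a sketch, assuming the OVF 1.x manifest format of one &#8220;SHA1(file)= hash&#8221; line per referenced file):</p>

```shell
# Emit a fresh SHA1 manifest line for an (edited) OVF descriptor.
regen_manifest() {
  ovf="$1"
  printf 'SHA1(%s)= %s\n' "$ovf" "$(sha1sum "$ovf" | cut -d' ' -f1)"
}
# regen_manifest export.ovf > export.mf
```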
<p>Then in the VM VirtualBox Manager, pick File &gt; Import Appliance&#8230; and choose export.ovf.</p>
<p>Then add the hard disks by choosing Storage &gt; (Diskette Icon) &gt; Add Hard Disk and pick Choose existing disk. Pick the .vmdk file with no number attached to it (<strong>not</strong> the e.g. *-s011.vmdk).</p>
<p>Enlarge the video memory to 18 MB (or more), or Virtualbox complains that it&#8217;s too little. Enable audio.</p>
<p>That was nice so far. The only problem is that Windows required re-activation because too much hardware had changed, and the internet activation failed, probably because the NIC wasn&#8217;t detected. To install the driver, I&#8217;d need to activate first, so I was stuck, lost patience and left the whole thing for now.</p>
]]></content:encoded>
			<wfw:commentRss>http://billauer.co.il/blog/2010/01/virtualization-notes-to-self/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
