Mangling win32 executables with a hex editor

This is a short note about how to make small manipulations in executables or DLLs in order to get rid of unwanted, malware-like behavior. For example, when some application pops up a dialog box which I’d like to eliminate. It can also be the final step in cracking (which I highly recommend as an educational experience).

Keep in mind that getting something useful done with this technique requires a very good understanding of assembly language, and how high-level languages are translated into machine code.

The general idea is to hook up a debugger (Microsoft Visual Studio’s will do, using Tools > Debug Processes), and try to get a breakpoint exactly where the bad thing happens. Then, after verifying that there is a clear relation between a certain point in the code and the undesired behavior, use the debugger to skip that point a few times, to be sure that this is indeed the fix. Tracing the API calls can be very helpful in finding the crucial point, as I’ve explained in another post. But if the offending behavior involves some message box (even if the popup only announces the issue), odds are that the critical point can be found by looking at the call stack, after attaching the debugger to the process with the popup still open. Look for the frame where the caller is the application itself (or the related DLL/OCX).

Lastly, the code must be changed in the original file. Fortunately, this can be done with a hex editor, since no CRC check is performed on executables.

One thing to bear in mind is that the code is possibly relocated when loaded into memory. During this relocation, absolute addresses in the machine code are adjusted to point at the right place. This is why an opcode which contains an absolute address can’t just be changed to NOPs: the loader will patch the bytes at that position, and do bad things to our NOPs.

A crucial step is to match the memory viewed in the debugger with a position in the file. First we need to know where the EXE, DLL or OCX is mapped in memory. The Dumper application from the WinAPIOverride32 suite, for example, gives the mapped addresses for each component of a process. The file is typically mapped linearly into memory, with the first byte going to the first address. An application such as the PE Viewer (sources can be downloaded from here) can be helpful in getting a closer look at the Portable Executable data structure, but this is usually not necessary.
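As a sketch of the arithmetic involved, assuming the linear mapping just described, translating a debugger address to a file offset is a simple subtraction. The addresses below are made up for illustration; use the ones Dumper and your debugger report:

```shell
# Hypothetical numbers: BASE is where Dumper says the DLL is mapped,
# ADDR is the instruction's address as seen in the debugger.
BASE=0x10000000
ADDR=0x10004a3c
# Under a linear mapping, the file offset is just the difference:
printf 'file offset: 0x%x\n' $(( ADDR - BASE ))
```

The resulting offset is what you seek to in the hex editor.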

Once the hex data in the file matches what we see in the debugger, we’re left with pinpointing the position in the hex editor and making the little change. There are a few classic tricks for manipulating machine code:

  • Put a RET opcode at the beginning of the subroutine which does the bad thing. RET’s opcode is 0xC3. Eliminating the call itself is problematic, since the loader may fiddle with the addresses.
  • Put NOPs where some offending operation takes place. NOP’s opcode is 0x90. Override only code which contains no absolute addresses.
  • Insert JMPs (opcode 0xEB). This is cool in particular when there is some kind of JNE or JE branching between desired and undesired behavior. Or just to skip some piece of code. This is a two-byte instruction, in which the second byte is a signed offset for how far to jump. Offset zero simply goes to the next instruction, effectively a NOP.
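The JNE-to-JMP trick can also be demonstrated from a shell with dd, which does the same thing a hex editor does interactively. This is a sketch on a dummy file standing in for the real executable; the bytes and the offset are made up for illustration:

```shell
# Create a four-byte dummy "binary": JNE +5 (0x75 0x05) followed by two NOPs
printf '\x75\x05\x90\x90' > demo.bin
# Patch the first byte from 0x75 (JNE) to 0xEB (JMP short),
# leaving the offset byte untouched
printf '\xeb' | dd of=demo.bin bs=1 seek=0 count=1 conv=notrunc 2>/dev/null
# Show the result: should now read eb 05 90 90
xxd -p demo.bin
```

On the real file, set seek to the file offset found earlier, and always verify the patched bytes with the debugger afterwards.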

When the mangling is done, I suggest opening the application with the debugger again, to see that the disassembly makes sense and that the change is correct. Keeping the breakpoint in place is good for catching the critical event again, and seeing that the program flows correctly from there on.

Tracing API calls on Windows

Linux has ltrace. Windows has…? I was looking for applications to trace DLL calls, so I could tell why a certain application goes wrong. The classic way is to get hints from library calls. Or system calls. Or both.

In the beginning, I was put off by the idea that most trackers only support the basic system DLLs (kernel32 and friends), but I soon found out that one gets loads of information about what the application is up to through them alone.

I found a discussion about such tools in a forum, leading me to the WinAPIOverride32 application (“the tracker” henceforth), which is released under the GPL. It doesn’t come with a fancy installer, so you have to run WinAPIOverride32.exe yourself and read the help file at WinAPIOverride.chm. Having looked at a handful of commercial and non-commercial alternatives, I have to say that it’s excellent. The only reason I looked further is that it didn’t run on Windows 2000, which is an understandable drawback, but still a problem for me.

The documentation is pretty straightforward about what to do, but here’s a short summary anyhow:

You can pick your target application by running it from the tracker, or hook onto a live process by picking its process number, or even better, dragging an icon over the target process’ window. Then click the green “play” button at the upper left. If you chose to run your process ad hoc, you’ll set up the modules immediately after the launch, and then resume operation by confirming a dialog box. As a side feature, this allows starting a process just to halt it immediately, let a debugger hook onto it, and then resume. A bit ugly, but effective (with Microsoft Visual Studio, for example).

Module filters set which calls are included or excluded, depending on the module making them. One problem with including modules such as kernel32 is that several calls are made while the hooks are installed (by the tracker itself), so the log explodes with calls while the target application is still paused. This is solved by using the default exclusion list (NotHookedModuleList.txt). Just be sure to have “Use list” checked, and both “Apply for Monitoring” and “Apply for Overriding” set. Or hell breaks loose.

At this point, the idea is to select which API calls are monitored. Click the torch. There’s a list of monitor files, which basically contain the names of the DLLs to be hooked, along with the function prototypes. One can pinpoint which functions to monitor or not, but the general idea is that the monitor files are sorted according to the subjects which the calls cover (registry, I/O, etc).

Choosing kernel32 will give a huge amount of data, which reveals more or less what the target application was doing. Monitoring “reg” is also useful, as it reveals registry access. Specific other DLLs can be helpful as well.

When huge amounts of data come out, the log will keep running even after monitoring has been stopped. If this goes on for a while, a dialog box opens, saying that unhooking seems to take too much time, and asking whether it should keep waiting. Answering “no” will cut the log right there, possibly causing the target application to crash. Whether to chop it right there or not is an open question: the critical event is already over, yes, but we can’t be sure it has been logged.

To make things easier, just minimize the tracker’s window, since it’s mainly the display of the log which slows things down. Quit the target application, wait for the popup telling about the unload of the process, and then look.

A very nice thing is that it’s possible to create monitor files automatically with the MonitoringFileBuilder.exe application, also in the bundle. Pick an application and create a monitor file for all its special DLLs, or pick a DLL to monitor its calls. The only problem with these files is that since the information about the function prototypes is missing, parsing the arguments is impossible.

It’s possible to write the call logs to an XML file or just a simple text file, of course. The only annoying thing is that the output format is 16-bit Unicode. Notepad takes this easily, but simpler text editors don’t.
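On a Linux or Cygwin box, iconv turns such a log into something any editor reads. A sketch, demonstrated on a small stand-in file (the file names here are made up, and the tracker’s actual log may also carry a byte-order mark):

```shell
# Create a small UTF-16LE file, standing in for the tracker's log
printf 'CreateFileA\n' | iconv -f UTF-8 -t UTF-16LE > log-utf16.txt
# Convert it to plain UTF-8 text
iconv -f UTF-16LE -t UTF-8 log-utf16.txt > log-utf8.txt
cat log-utf8.txt
```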

In short, it’s a real workhorse. And it just happened to help me solve a problem I had. This is the classic case where free software, written by the person who uses it, takes first prize over several applications which were written to be sold.

I should also mention Dumper.exe, which allows attaching to any process, and not only dumping and modifying the process’ memory on the fly, but also seeing which DLL is mapped where in memory (useful when reading in-memory raw assembly with a debugger). It also displays the call stack for each thread, which is sometimes more instructive than Microsoft’s debugger (is that really a surprise?).

Since I also had a brief look at other tools, I wrote down my impressions. They may not be all that accurate.

Other things I had a look at

There’s SpyStudio API monitor, which I had a look at (it isn’t free software, by the way; that is, free as in free beer, but not as in freedom). Its main drawback is that it logs only specific functions, and doesn’t appear to allow easily applying hooks to a massive number of functions. In other words, one needs to know what the relevant functions are, which isn’t useful when one wants to find out what an application is doing.

I also had a look at API monitor, which was completely useless to me, since it doesn’t allow adding command-line arguments when launching a new process. Not to mention that their trial version completely stinks (buy me! buy me!). Everything useful was blocked in the trial version. I wonder if the real version is better. That was an application I gladly uninstalled.

API sniffers with names such as Kerberos and KaKeeware Application Monitor seem to include trojan horses, according to AVG. Didn’t take the risk.

Rohitab API Monitor v1.5 (which I picked, since v2 is marked Alpha) wouldn’t let me start a new process, and since I was stupid enough to monitor all calls on all processes, this brought a nasty computer crash (when I killed the tracker). After corresponding with the author, it turned out that this version is 10 years old, and that it is in fact possible to start a process with arguments. Then I tried v2. I would summarize it like this: it indeed looks very pretty, and seems to have a zillion features, but somehow I didn’t manage to get the simplest things done with it. Since I don’t want to rely on contacting the author for clarifications all the time, I don’t see it as an option.

Auto Debug appears pretty promising. It’s not free in any way, though, and even though it caught all kernel32 calls, and has neat dissection capabilities, I couldn’t see how to create a simple text output of the whole log. Maybe because I used the application in trial mode.

The Generic Tracker looks very neat, and it’s a command-line application, which makes me like it even more. I didn’t try it though, because it allows tracking only four functions (as it’s based upon breakpoints). But it looks useful for several other things as well.

DigiPro drawing tablet on Fedora 12

I just bought a DigiPro 5″/4″ drawing tablet to run with my Fedora 12. When plugging it in, the system recognized it, but every time I touched the tablet with the stylus pen, the cursor went to the upper left corner. Clicks worked OK, but it looked like the system needed to know the tablet’s dimensions.

To the system, this tablet is UC-LOGIC Tablet WP5540U. How do I know? Because when I plug it in, /var/log/messages gives:

Jun 29 19:49:06 big kernel: usb 6-1: new low speed USB device using uhci_hcd and address 5
Jun 29 19:49:06 big kernel: usb 6-1: New USB device found, idVendor=5543, idProduct=0004
Jun 29 19:49:06 big kernel: usb 6-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Jun 29 19:49:06 big kernel: usb 6-1: Product: Tablet WP5540U
Jun 29 19:49:06 big kernel: usb 6-1: Manufacturer: UC-LOGIC
Jun 29 19:49:06 big kernel: usb 6-1: configuration #1 chosen from 1 choice
Jun 29 19:49:06 big kernel: input: UC-LOGIC Tablet WP5540U as /devices/pci0000:00/0000:00:1d.0/usb6/6-1/6-1:1.0/input/input9
Jun 29 19:49:06 big kernel: generic-usb 0003:5543:0004.0005: input,hidraw1: USB HID v1.00 Mouse [UC-LOGIC Tablet WP5540U] on usb-0000:00:1d.0-1/input0

To get it going, I followed the solution in Fedoraunity (which requires registration to access, would you believe that!).

First, I downloaded the wizardpen RPM package from here.

And installed it:

# rpm -i wizardpen-0.7.0-0.fc12.x86_64.rpm

And then I ran the calibration utility. For some, the device goes by /dev/input/event8; just play with the numbers until hitting gold:

# wizardpen-calibrate /dev/input/event6

Please, press the stilus at ANY
 corner of your desired working area: ok, got 1928,3766

Please, press the stilus at OPPOSITE
 corner of your desired working area: ok, got 30360,28914

According to your input you may put following
 lines into your XF86Config file:

Driver        "wizardpen"
 Option        "Device"    "/dev/input/event6"
 Option        "TopX"        "1928"
 Option        "TopY"        "3766"
 Option        "BottomX"    "30360"
 Option        "BottomY"    "28914"
 Option        "MaxX"        "30360"
 Option        "MaxY"        "28914"

Now, one of the side effects of installing the wizardpen RPM package was that it created the file /etc/hal/fdi/policy/99-x11-wizardpen.fdi, which is a HAL fdi file. If you’ve edited an xorg.conf file before, that is now old history. Instead of the mumbo-jumbo above, there is a new mumbo-jumbo, which is supposed to work even if the device is hotplugged. No need to restart X for new devices! Hurray!

So I downloaded the recommended XML file from here, modified the values according to my own calibration, and saved the following as /etc/hal/fdi/policy/99-wizardpen.fdi (and trashed the previous file; the names are slightly different, but who cares):

<?xml version="1.0" encoding="ISO-8859-1"?>
<deviceinfo version="0.2">
<device>
<!-- This MUST match with the name of your tablet -->
<match key="info.product" contains="UC-LOGIC Tablet WP5540U">
<merge key="input.x11_driver" type="string">wizardpen</merge>
<merge key="input.x11_options.SendCoreEvents" type="string">true</merge>
<merge key="input.x11_options.TopX" type="string">1928</merge>
<merge key="input.x11_options.TopY" type="string">3766</merge>
<merge key="input.x11_identifier" type="string">stylus</merge>
<merge key="input.x11_options.BottomX" type="string">30360</merge>
<merge key="input.x11_options.BottomY" type="string">28914</merge>
<merge key="input.x11_options.MaxX" type="string">30360</merge>
<merge key="input.x11_options.MaxY" type="string">28914</merge>
</match>
</device>
</deviceinfo>

According to the reference mentioned above, there’s a need to log in again. It turns out that just replugging the tablet into the USB jack is enough to get it up and running.

To make GIMP respond to pressure sensitivity, set up the input device as follows: Edit > Preferences > Input Devices, press Configure Extended Input Devices. Under Devices, find the tablet’s name, and set the mode to Screen.

Using GIMP to get rid of cellulite

This is a short note about how to get rid of cellulite on natural skin, using GIMP 2.6 (it will most likely work on earlier versions as well).

The truth is that I don’t really understand why this works, but it fixed a nasty case of ugly skin texture in a low key photo. The trick was using the hard light layer mode, which is described in detail in the GIMP documentation. Unfortunately, the explanations and equations didn’t help me much in understanding why it happened as it happened.

So here’s the procedure, as I did it. If it doesn’t work for you, don’t blame me. I have no idea what I actually did.

Original image:

Original image

Duplicate the layer, and blur the upper layer strongly (Gaussian blur, radius 40 in our case).

Stage two: Image blurred

Set the upper layer’s mode to “Hard light”.

Stage 3: Hard light applied

Merge down the upper layer, so the two become one layer, and reduce the saturation:

Final result

This may not look like a significant change, but when zooming out, it is.

Color Range Mapping on GIMP 2.6: Getting it back.

One of the nice things about upgrading software is not only that you get a lot of new, confusing and useless features, but also that things which used to work don’t anymore. At best, features one used a lot have completely disappeared. Upgrading to Fedora 12, with its GIMP 2.6, was no exception.

It looks like the GIMP developers found the Color Range Mapping plugin useless, as is apparent from their correspondence.  As Sven Neumann says over there, “Let’s remove those plug-ins then if the code sucks that much. Are they in anyway useful at all?”

Let me answer you, Sven. Yes, it’s very very useful. I don’t know if Photoshop has a similar feature, but color range mapping is extremely useful in photomontage. That’s when you need one exact color at one place, and another exact color at another.

When the absence of the relevant plugin was reported in a bug report, it was said that “Looking at it again, the plug-in really was so badly broken, I would prefer if we would not have to add it back.” Broken or not, I love this plugin.

To fix my own problem, I followed this post and fetched myself an x86_64 RPM of an old version of GIMP. A matching architecture is important, because the plugins are precompiled binaries.

I downloaded just the first RPM I could find of GIMP, which was of version 2.4 and compiled for x86_64, and then extracted the files in an empty directory with

rpm2cpio gimp-2.4.6-1.fc7.x86_64.rpm | cpio -idvm

And then, as root:

cp usr/lib64/gimp/2.0/plug-ins/mapcolor /usr/lib64/gimp/2.0/plug-ins/

and that’s all! Restarting GIMP I found my beloved plugin there. Happy, happy, joy, joy!

Update 14.7.21: In Gimp 2.10, there’s a separate directory for each plug-in, but all in all, it’s the same:

# cd /usr/lib/gimp/2.0/plug-ins/
# mkdir mapcolor
# cp /path/to/old/gimp/2.0/plug-ins/mapcolor mapcolor/

Enumerating FSM states automatically for Verilog with Perl

Having a pretty large state machine, I wanted the states enumerated automatically. Or at least not do the counting by hand. I mean, doing it once is maybe bearable, but what if I’ll want to insert a new state in the future?

So what I was after is something like

module main_state #(parameter
		    ST_start = 0,
		    ST_longwait1 = 1,
		    ST_longwait2 = 2,
		    ST_synthesizer = 3,
		    ST_synth_dwell = 4)

to support a state machine like

	case (state)
	  ST_start:
	    begin
	       state <= ST_longwait1;
	       reset <= 1;
	    end 

	  ST_longwait1:
	    begin
              (...)
	    end

and so on.  Only with more than 20 states.

The solution is short and elegant. At the command prompt (a DOS window, for the unfortunate Windows users, who will need to adjust the quoting), this simple Perl one-liner does the job:

perl -ne 'print "$1 = ".$i++.",\n" if /^[ \t]*(\w+)[ \t]*:/;' < main_state.v

where main_state.v is the Verilog module’s file, of course. The script looks for any word that starts a line and is followed by a ‘:’. This is not bulletproof, but is likely to work. The output is a list of state number assignments, which you can copy-paste into the code.

Note that this script works only if there’s a single case statement in the input file, which is the Verilog module itself. If there are several, just copy the relevant case statement into a separate file, and put that file’s name instead of main_state.v in the example above.
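For the record, here’s a slightly more defensive variant of the same one-liner, which skips a “default:” label. Still just a sketch; the sample input below is a miniature stand-in for the real module:

```shell
# A miniature case statement, standing in for the real Verilog module
cat > main_state.v <<'EOF'
case (state)
  ST_start:
    state <= ST_longwait1;
  ST_longwait1:
    state <= ST_start;
  default:
    state <= ST_start;
endcase
EOF
# Same regex as before, plus a guard against the "default" label
perl -ne 'print "$1 = ".$i++.",\n" if /^[ \t]*(\w+)[ \t]*:/ && $1 ne "default";' < main_state.v
```

This prints the enumeration for the two ST_ states only, leaving the default label out of the numbering.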

If you happen to be impressed by the magic, then you should probably take some time to play around with Perl. Every minute spent on that will be regained later. Believe me.

And if you don’t have Perl on your computer, that surely says something about your taste in operating systems. There are two possible ways to get around this:

  • If you have Xilinx ISE installed on your computer, try writing “xilperl” instead of “perl” above. xilperl is just a normal Perl interpreter, installed along with the Xilinx tools.
  • Download Perl for your operating system. It’s free (as in freedom) software, so there is no reason why you should pay a penny for this. There are many sources for Perl. I gave the link to ActivePerl, which is the one I happen to know about.

ImageMagick convert: Making viewable copies of underexposed images

The problem is relatively simple: sometimes I take images that are deliberately underexposed, or that have important parts in the dark areas. This is then fixed with GIMP. But in order to choose which image to play with, I need those details visible in some test image, so I can browse them with an image viewer. Playing with each shot manually is out of the question.

My original thought was to use GIMP in a script, as I’ve shown in the past, and feed GIMP some LISP commands so it resizes the image and runs a “Curves” command.

But then I thought it would be much easier with the “convert” utility. So here’s a short script, which downsizes the image by 4 and brings out some visible dynamic range. If you want to use this, I warmly suggest reading the ImageMagick manual page, since the values given below were right for one specific set of shots. You’ll need to tweak them a bit to get it right for you.

The script generates copies of the originals, of course…

#!/bin/bash

for i in IMG_* ; do
  echo "$i"
  convert "$i" -resize 25%x25% -level 0,1.0,16300 -gamma 2.0 "view_$i"
done

Adding remarks to an existing pdf file with pdftk

This is how to solve a special case, when a PDF file is given, but I want to add my remarks in some free space.

The trick is to write the remarks into another single-page PDF file, so that the new text occupies the blank area in the original. In my case, I needed the remark on the second page, so I used the pdftk command-line utility to split the pages into two files, watermark the second page with my own PDF file, and then rejoin them.

pdftk is free software, and can be downloaded for various platforms here. If you have a fairly sane Linux distribution, you should be able to just grab a package with it (“yum install pdftk” or something).

Surprisingly enough, this was the most elegant solution I could come up with. This is the little bash script I wrote:

#!/bin/bash

tmpfile=tmp-delme-$$

# Split two pages into two files:
pdftk original.pdf cat 1 output $tmpfile-page1.pdf
pdftk original.pdf cat 2 output $tmpfile-page2.pdf

# Add footnote as if it's a watermark
pdftk $tmpfile-page2.pdf stamp footnote.pdf output $tmpfile-marked.pdf

# Recombine the two pages again
pdftk $tmpfile-page1.pdf $tmpfile-marked.pdf cat output original-marked.pdf

# Clean up
rm -f $tmpfile-page1.pdf $tmpfile-page2.pdf $tmpfile-marked.pdf

When Firefox starts up sooo slowly.

A short note, since it’s so simple and so important. When Firefox gets painfully slow, just compact its Sqlite databases. As has been pointed out elsewhere, the quick fix is to close Firefox, go to where it holds its files, find the .sqlite files, and go (bash under Cygwin, in my case):

$ for i in *.sqlite; do echo "VACUUM;" | sqlite3 $i ; done

And it helps a lot. It’s not just the files getting smaller. It’s everything getting faster.

The sqlite binary for Windows can be found here.

There is a Firefox plugin for this and a standalone application, but I like it in good-old command line with my full control on what’s happening.
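If you’re worried about the vacuuming damaging something, SQLite’s built-in integrity check gives some peace of mind. A sketch on a scratch database (the file name is made up; run the same pragma on the real .sqlite files):

```shell
# Create and populate a scratch database, standing in for e.g. places.sqlite
sqlite3 scratch.sqlite 'CREATE TABLE t(x); INSERT INTO t VALUES (1); DELETE FROM t;'
# Vacuum it, exactly as in the loop above
echo 'VACUUM;' | sqlite3 scratch.sqlite
# Verify the database is intact: should print "ok"
sqlite3 scratch.sqlite 'PRAGMA integrity_check;'
```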

LaTeX, pdf and imported 90-degrees rotated EPS images

The problem: in LaTeX, if I import an EPS file with \includegraphics and rotate it by 90 degrees, hell breaks loose in the resulting PDF file.

My processing chain, in case you wonder, is latex, dvips and ps2pdf. I avoid pdflatex, since it won’t import EPS (as far as I can recall), only images converted to PDF. Or something. It was a long time ago.

The figure is generated with

\begin{figure}[!btp]\begin{center}
\includegraphics[width=0.9\textheight,angle=-90]{blockdiagram.eps}
\caption{Modulator's block diagram}\label{blockdiagram}
\end{center}\end{figure}

which successfully rotates the figure as requested, but unfortunately also causes the page to appear in landscape format in Acrobat. While this is slightly annoying, the real problem is that the file will or won’t print properly, depending on the particular computer you print from, and possibly on the weather as well.

The curious thing is that if I choose angle=-89.99, it turns out the way I want it, but I have a feeling this will not end well.

Using \rotatebox instead didn’t work either:

\rotatebox{-90}{\includegraphics[width=0.9\textheight]{blockdiagram.eps}}

It looks like this does exactly the same (and the -89.99 trick also works here). Now, it’s pretty evident that the clean 90-degree value triggers some hidden mechanism which tries to be helpful, but ends up messing things up instead. So this is how I solved it, eventually:

\begin{figure}[!btp]\begin{center}
\rotatebox{-1}{\includegraphics[width=0.9\textheight,angle=-89]{blockdiagram.eps}}
\caption{Modulator's block diagram}\label{blockdiagram}
\end{center}\end{figure}

In words: rotate the EPS by 89 degrees, and then by another degree, so we have exactly 90 degrees. This leaves some room for precision errors, if the rotation involves actual coordinate calculations (I have no idea whether this is the case), but this is as close to 90 degrees as I managed to get without having the page messed up.

Not an ideal solution. If you know how to do this better, please comment below!

Ah, I should mention that it’s possible to rotate the EPS file first, and then import it into the LaTeX document as is. If the whole generation runs with a makefile anyhow, this shouldn’t be too annoying. But it turns out (see here) that it’s not all that simple to rotate an EPS. I haven’t tried that solution, but it looks like it should work on any Linux machine. Anyhow, I didn’t feel like playing with bounding boxes.