Command-line (bash/GIMP) mass conversion and processing

The purpose

I use GIMP a lot. I store the images in the native file format, XCF. Now I’m stuck with a lot of files I can’t see outside GIMP, but I don’t want to save those files as anything else, because I’ll lose all the layer data. Solution: Batch conversion to JPEG as a simple bash script to run from the command line.

The idea is to make a JPEG copy of the image as it’s seen when it’s opened with GIMP. For that reason, I’ve chosen to flatten the image by visible layers only, and to crop it to image size.

Also, I’ll show an example of how to fix images in bulk with a GIMP script.

2022 update: This post is really old. Nowadays ImageMagick supports XCF, so it’s possible to just go

$ convert this.xcf this.jpg

But this post can still be of use for more sophisticated tasks with GIMP.

The script

It looks like LISP and it looks like bash. In fact, they’re mixed.

#!/bin/bash
{
cat <<EOF
(define (convert-xcf-to-jpeg filename outfile)
  (let* (
	 (image (car (gimp-file-load RUN-NONINTERACTIVE filename filename)))
	 (drawable (car (gimp-image-merge-visible-layers image CLIP-TO-IMAGE)))
	 )
    (file-jpeg-save RUN-NONINTERACTIVE image drawable outfile outfile .9 0 0 0 " " 0 1 0 1)
    (gimp-image-delete image) ; ... or the memory will explode
    )
  )

(gimp-message-set-handler 1) ; Messages to standard output
EOF

for i in *.xcf; do
  echo "(gimp-message \"$i\")"
  echo "(convert-xcf-to-jpeg \"$i\" \"${i%%.xcf}.jpg\")"
done

echo "(gimp-quit 0)"
} | gimp -i -b -

To try it out, simply execute the script from a directory containing several .xcf files. Be sure not to have any .jpg files you care about in the same directory, because the outputs get the same file names, just with the .jpg extension (old files are overwritten with no warning).
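A quick pre-flight check along these lines can show which .jpg files are in danger before the conversion is run. It's only a sketch: list_clobbered is a name made up here, and it assumes the same naming rule as the script (strip .xcf, append .jpg).

```shell
#!/bin/bash
# List the .jpg files that a conversion run in the current directory
# would overwrite. Prints nothing if no output name is already taken.
list_clobbered() {
  for f in *.xcf; do
    [ -e "$f" ] || continue          # the glob matched nothing
    out="${f%.xcf}.jpg"              # same naming rule as the script above
    if [ -e "$out" ]; then
      echo "$out"
    fi
  done
}

list_clobbered
```

If this prints anything, move those files away (or rename them) first.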

You will get kind-of-warning messages while the script is iterating, indicating which file is being processed. This is normal.

The concept of this script is simple. An ad-hoc LISP script is generated in the Bash block, which is enclosed by curly brackets. First we define the function, which converts one file. The bash script then creates calls to this function, by means of the (Bash) for-loop. All this is then fed into GIMP through standard input (piping).
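To see what actually travels through the pipe, the generating part can be run on its own, without GIMP at the other end. The following is a minimal sketch of that idea. The function name emit_calls is invented for this example, and it assumes file names without double quotes or backslashes (those would break the generated strings).

```shell
#!/bin/bash
# Print the calls that the for-loop would generate, without starting
# GIMP, so the generated text can be inspected first.
emit_calls() {
  for f in "$@"; do
    printf '(gimp-message "%s")\n' "$f"
    printf '(convert-xcf-to-jpeg "%s" "%s")\n' "$f" "${f%.xcf}.jpg"
  done
  echo '(gimp-quit 0)'
}

emit_calls first.xcf second.xcf
```

Once the output looks right, the same text can be piped to gimp -i -b - exactly as in the script above.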

Some LISP notes

I’m not really into LISP. So I ran into some trouble. These are my notes, so I won’t go through it again:

First, there’s the Script-Fu console, which was, well, sort-of helpful. The internal functions’ API can be found there as well.

As a LISP novice, I didn’t know the difference between “let” and “let*”. It turns out, that let* allows the use of previous assignments in the following ones, so this is what you get in the Script-Fu console:

> (let ( (x 2) (y x)) y)
Error: eval: unbound variable: x 

> (let* ( (x 2) (y x)) y)
2

It’s also worth noting that the GIMP interpreter does not remember functions across different -b command-line arguments:

# First statement succeeds, second fails.
gimp -i -b '(define (myfun x y) (- x y))' -b '(myfun 2 3)'

# This works, because it's two statements in one execution (yuck!)
gimp -i -b '(define (myfun x y) (- x y)) (myfun 2 3)'

Mass processing of frames

As the title implies, I needed this to make adjustments on a video clip. It’s true that many video editors (Cinelerra included) have filters for that, but running a sequence of GIMP commands is probably stronger than any possible video editor.

So here’s a little script which runs some operations on a list of frames. This was useful, mainly because I wanted to run curves on a clip (and Cinelerra doesn’t have that operation. I wonder which video editor has).

#!/bin/bash
{
cat <<EOF

(define (do-retouch filename outfile)
 (let* (
 (image (car (gimp-file-load RUN-NONINTERACTIVE filename filename)))
 (drawable (car (gimp-image-merge-visible-layers image CLIP-TO-IMAGE)))
 )
 (gimp-colorize drawable 10 50 0)
 (gimp-curves-spline drawable HISTOGRAM-VALUE 6 #(0 0 147 44 255 122 ) )
 (gimp-hue-saturation drawable ALL-HUES 0 0 20)
 (plug-in-gauss RUN-NONINTERACTIVE image drawable 10 10 1)
 (file-png-save2 RUN-NONINTERACTIVE image drawable outfile outfile 0 9 0 0 0 0 0 0 1 )
 (gimp-image-delete image) ; ... or the memory will explode
 )
 )

(gimp-message-set-handler 1) ; Messages to standard output
EOF

mkdir -p fixed # The output directory must exist before GIMP writes to it
for i in frame*.png; do
 echo "(gimp-message \"$i\")"
 echo "(do-retouch \"$i\" \"fixed/x_${i%%.png}.png\")"
done

echo "(gimp-quit 0)"
} | gimp -i -b -

The Script-Fu console was helpful in finding the functions’ names, which are pretty obvious. I’ll only mention that running gimp-image-merge-visible-layers in this case is probably not necessary, but I suppose GIMP won’t waste too much time on merging a layer with itself.

As for the curves operation (gimp-curves-spline), I first tried to use curves traces which are saved by GIMP as they are being used in the GUI, but that turned out to be pretty complicated. So I went for the simple approach: Open the GUI, find the X-Y points on the graph, and copy them manually. The “6” says that there are 6 numbers ahead (3 X-Y pairs), and then we have X0, Y0, X1, Y1, etc. The values go from 0 to 255. So it’s pretty trivial, and does exactly the same as the GUI.
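Since the count is just the number of values that follow, it can be computed rather than typed by hand. Here is a small sketch of that. curves_call is a made-up helper name, and it assumes the variable is called drawable, as in the script above.

```shell
#!/bin/bash
# Build a gimp-curves-spline call from a flat list of X Y values.
# The count argument is simply the number of values given.
curves_call() {
  if [ $(( $# % 2 )) -ne 0 ] || [ "$#" -eq 0 ]; then
    echo "curves_call: need a non-empty, even number of values" >&2
    return 1
  fi
  echo "(gimp-curves-spline drawable HISTOGRAM-VALUE $# #($*))"
}

curves_call 0 0 147 44 255 122
```

With the three points from the script, this prints the same call as the hand-written one, with 6 as the count.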

Importing the frames to Cinelerra

Not that it’s directly relevant, but to import a bunch of frames into Cinelerra, use the mkframelist command line utility and load it like any file. For example,

$ ls *.png | mkframelist -r 30 > framelist

That’s for 30 fps. Then load “framelist” into Cinelerra. Note that the paths in the list are absolute, so you can’t just move the files around.

 

Canon EOS 500D: Using the wrong driver intentionally

Foreword

Before I say a word, my children, I have to warn you: What I’m about to teach you here is basically how to mess up your computer. It’s how to make Windows install the wrong driver for a USB device (possibly PCI devices as well). Don’t complain about a headache when you want to switch to the correct driver (or just another one): Once Windows adopts a driver for a device, it develops certain sentiments for it, and will not give it up so easily.

Also, my children, I have to confess that I still use Windows 2000 these days. It’s about time I moved on to at least XP, but I have too much installed software to carry with me (none of which will run, I fear, after an upgrade).

Having these issues out of the way, let’s get to business.

My motivation

I bought a brand new Canon EOS 500D, which came with a brand new EOS Digital Solution disk (v20.0, if we’re at it). It’s black, it’s pretty, it autoboots, but it tells me to go &@^%$ myself with my ancient operating system (Windows 2000, as mentioned). Canon’s “support” is more like a black hole, so I’m on my own. All I want is to download the pics to my computer. When I plug the camera in, I get “Found new hardware” but no driver to get it working with.

I was slightly more successful with Linux (Fedora 9, on my laptop). gphoto2 managed to download the images (using command line, which is cool) using PTP (Picture Transfer Protocol) but I want this working on my desktop.

Now here’s the problem in summary: The camera connects to the computer and says: “I was made by Canon (manufacturer number 04A9) and I’m a 500D camera (product ID 31CF), and I support the standard interface 6.1.1, which means ‘Still Imaging’.” An XP computer would say “Aha! A camera! Let’s try PTP, even though I’ve never heard about this device!” but a Windows 2000 won’t talk with strangers (if they are cameras, that is).

Drivers for this camera for Windows 2000 are not to be found. I tried to find a generic PTP driver for Windows, but couldn’t find one. There’s a generic interface (WIA), but no generic driver. Then I thought: Since any camera driver would talk PTP with any PTP camera, why not put just any Canon driver to talk with my camera? After all, the driver just transfers files. I hope.

Update (February 5th, 2010): I got the following laconic mail from Canon’s help desk today:

“Dear Customer,

The EOS 500D camera can only be connected to personal computers with either Windows Vista, Windows XP or Mac OS X operating systems. Unfortunately Windows 2000 is not supported”

Wow! That’s a piece of valuable information after having the camera for over six months!

The black magic

It just so happens, that the 400D has a PTP TWAIN driver for Windows 2000 (the installation file is k6906mux.exe). So I downloaded that one, and installed it as is. Which didn’t help much, of course. But it left me the INF file at the destination directory. That allowed me some wicked manipulations.

The trick is to bind the driver’s software to the specific hardware ID. So I opened the INF file, and found the part saying:

[Models]
%DSLRPTP.DeviceDesc%=DSLRPTP.Camera, USB\VID_04A9&PID_3110

That means, “if some device says it was made by 04A9 (Canon) and that its product ID is 3110 (EOS 400D, I suppose)”, use this driver.

Hey, this is an open invitation for intervention! I simply changed it to:

[Models]
%DSLRPTP.DeviceDesc%=DSLRPTP.Camera, USB\VID_04A9&PID_31CF

(actually, I did it on a copy of the file)
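For the record, the same edit-on-a-copy can be scripted. This is just a sketch under a few assumptions: retarget_inf is a name invented here, the file names in the comment are placeholders, and the INF file is assumed to be plain 8-bit text (if it happened to be UTF-16, sed would not find the pattern).

```shell
#!/bin/bash
# Write a copy of an INF file with one USB product ID swapped for another.
# The original file is left untouched.
retarget_inf() {
  # $1 = source INF, $2 = destination INF, $3 = old PID, $4 = new PID
  sed "s/PID_$3/PID_$4/g" "$1" > "$2"
}

# Example (file names are placeholders):
#   retarget_inf original.inf patched.inf 3110 31CF
```

The free-text description strings still need to be edited by hand, since they depend on what you want to see in the device list.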

And then I went for all these places saying

[DSCamera.Addreg]
HKLM,"%DS_REG%\TWAIN\EOSPTP",DeviceDesc,,"EOS Kiss_X REBEL_XTi 400D"
HKLM,"%DS_REG%\TWAIN\EOSPTP",ModelName,,"EOS Kiss_X REBEL_XTi 400D"

and changed them to something saying it’s a 500D using 400D driver. Just free text so I know what I’m doing.

By the way, you may wonder where I got the 500D’s product ID from. The answer is Linux again. There’s a utility called lsusb, which supplies all that info. You can get it in Windows too, I suppose. I just don’t know how.
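lsusb prints one line per device, with the two IDs as a lowercase vvvv:pppp pair after the word "ID". A sketch for turning that into the notation the INF file wants could look like this (lsusb_to_hwid is a hypothetical helper name, not part of any package):

```shell
#!/bin/bash
# Turn lsusb-style lines ("... ID vvvv:pppp ...") read from standard input
# into the hardware ID notation used in INF files.
lsusb_to_hwid() {
  awk '{
    for (i = 1; i < NF; i++)
      if ($i == "ID") {
        split($(i + 1), id, ":")
        printf "USB\\VID_%s&PID_%s\n", toupper(id[1]), toupper(id[2])
        break
      }
  }'
}

# Typical use on a Linux machine with the camera plugged in:
#   lsusb | lsusb_to_hwid
```

This prints one hardware ID per connected USB device, so the camera’s line still has to be picked out by eye.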

Putting it to work

At this point, I plugged in my camera, and powered it on. Windows told me it found new hardware, great, and then asked me to supply a driver (a couple of wizard windows ahead). It actually wants an INF file, so I gave it the one I cooked.

Since the VID/PID in the file match those given by the camera, Windows installed the drivers and associated them with the camera from now on. Mission accomplished.

Did it work?

The truth is that the result isn’t very impressive. Maybe because Canon’s own EOS utility failed to talk with the camera this way, and Picasa’s interface with the TWAIN driver is a bit uncomfortable. But the bottom line is that I can now download the images to my Windows 2000 computer.

On the other hand, maybe it’s this ugly with proper drivers as well. The most important thing is that it works, after all.

Verilog: Declaring each port (or argument) once

(…or why the Verilog-emacs AUTOARG is redundant)

In Verilog, I never understood why port declarations appear both in the module declaration, and then immediately afterwards, along with the wires and registers. I mean, if the ports in the module declaration are always deducible from what follows immediately, why is the language forcing me to write it twice?

The short answer is: It doesn’t.

Let’s have a look at this simple module:

module example(clk, outdata, inbit, outbit);
   parameter width = 16;

   input clk;
   input inbit;
   output outbit;
   output [(width-1):0] outdata;

   reg [(width-1):0]		outdata;

   assign 	outbit = !inbit;

   always @(posedge clk)
     outdata <= outdata + 1;

endmodule

There is nothing new here, even for the Verilog beginner: It demonstrates a simple combinational input-output relation. We also have an output, which happens to be a register as well (I didn’t even bother to reset it).

And as usual, every port is mentioned twice. Yuck.

Instead, we can go:

module example #(parameter width = 16)
  (
   input clk,
   input inbit,
   output outbit,
   output reg [(width-1):0] outdata
   );	       

   assign 	outbit = !inbit;

   always @(posedge clk)
     outdata <= outdata + 1;

endmodule

At this point, I’d like to point out, that this is no dirty trick; This type of module declaration is explicitly defined in the Verilog 2001 standard (or by its official name, IEEE Std 1364-2001). This goes both for defining the ports and the parameters  (thanks to Evgeni for pointing out the possibility to set parameters as shown above).

According to the BNF definition in Annex A (A.1.3, to be precise), a module definition must take one of the two formats shown above, but mixing them is not allowed.

So here are a few things to note when using the latter format:

  • Port declarations are separated by commas, not terminated by semicolons (so there is no comma after the last one). Same goes for parameter declarations.
  • It’s not allowed to declare anything about the port again in the module’s body. Repeating the port’s name as a wire or reg is not allowed.
  • Use “output reg” (which is legal in either format)  instead of declaring the register in the module’s body (which is not allowed in this setting)
  • Syntax highlighters and indenters may not work well

The question is now: How could I not know about this?

Porting to Virtex-4: Who ate my IOB registers?

Surprise, surprise!

When porting a design from Spartan-3 to Virtex-4, I discovered that many registers, which were correctly placed in the IOB in the Spartan-3, fell off into fabric-placed flip-flops. Which is very bad news, since keeping the registers in the IOB isn’t just a matter of better timing, but rather repeatable timing, which is much more important to me. I don’t want the timing to change when I reimplement, or a poorly designed board with marginal signal quality can make a new FPGA version appear buggy, just because something stopped working.

It turns out that the tools got lost somewhere in the transition from plain IOBs to ILOGICs and OLOGICs. In other words, the synthesizer (XST J.39, with ISE 9.2.03i) or maybe the mapper failed to take obvious hints. My hunch is that the mapper is to blame.

What part of “ILOGIC” didn’t you understand?

There’s always the aggressive solution of instantiating the IOBUF and the relevant flip-flops explicitly. In fact, it may be enough to just instantiate the IOBUF itself. The only explanation I can think of why this would help, is that the synthesizer packs the registers during synthesis, and maybe also makes some minor fixes to allow this packing. It’s ugly, but it works. Or if it doesn’t work, at least I know why: A major advantage of instantiating IDDR and ODDR (or IDDR_2CLK, if you want to feed it with two clocks) is that it forces the mapper to complain loudly when it refuses to put them in place. It can’t just hide the flip-flops in the fabric, say nothing, and hope I won’t notice.

In theory, differences between the clocking and reset schemes should be allowed between the ILOGIC and OLOGIC flip-flops. How do I know? Because I can get it done with the FPGA Editor. In practice, the packer is terribly picky about what it’s ready to pair into an ILOGIC/OLOGIC couple. I haven’t tested all combinations, but it appears it won’t try to pack a DDR flip-flop next to a non-DDR one. And if there’s a difference in clocking or reset, forget about it.

Example: I had a case where I used the PRESET input for the ODDR, but no reset at all for the IDDR, and got:

ERROR:Pack:1564 – The dual data rate register controller/ddr_bus_T[10] failed to join the OLOGIC component as required.  The OLOGIC SR signal does not match the ILOGIC SR signal, or the ILOGIC SR signal is absent.

Inference (and its black magic)

Just using the “IOB of … is TRUE” synthesis pragma seems to make the synthesizer do no more than to avoid eliminating equivalent registers, and duplicating them when their content is used in several sites. That’s nice, but sometimes not enough.

Let’s see a few examples of what worked and what didn’t work with the ISE foodchain, when instantiation was avoided. The target is Virtex-4. Note that no IOB pragma is used here.

First, let’s look at this:

module try(
	   input clk,
	   output reg toggle
	   );
   reg [1:0] 	  count;

   always @(posedge clk)
     begin
	count <= count + 1;

	case (count)
	  0: toggle <= 1'b0;
	  1: toggle <= 1'bz;
	  2: toggle <= 1'b1;
	  3: toggle <= 1'bz;
	endcase
     end

endmodule

This actually worked well: Both tri-state buffer and data register were placed in the OLOGIC element, resulting in optimal timing. But hey, what if we want to read from the data lines while they are tri-stated? So we go for this (note that it’s not functionally equivalent):

module try(
	   input clk,
	   inout toggle
	   );
   reg [1:0] 	  count;

   reg 		  toggle_reg;
   reg 		  z_reg;

   assign 	  toggle = z_reg ? 1'bz : toggle_reg;

   always @(posedge clk)
     begin
	count <= count + 1;
	z_reg <= count[1];
	case (count)
	  0: toggle_reg <= 1'b1;
	  1: toggle_reg <= 1'b0;
	  2: toggle_reg <= 1'b1;
	  3: toggle_reg <= 1'b1;
	endcase
     end

endmodule

This placed both registers in the OLOGIC as well (note that I didn’t bother to read from the IO, but never mind). The truth is that I was lucky here. If z_reg and toggle_reg happened to be equivalent, they would melt into a single register, which would remain outside the OLOGIC element. Or let’s look at this example (note the subtle difference…):

module try(
	   input clk,
	   output toggle
	   );
   reg [1:0] 	  count;

   reg 		  toggle_reg;
   reg 		  z_reg;

   assign 	  toggle = !z_reg ? 1'bz : toggle_reg;

   always @(posedge clk)
     begin
	count <= count + 1;
	z_reg <= count[1];
	case (count)
	  0: toggle_reg <= 1'b1;
	  1: toggle_reg <= 1'b0;
	  2: toggle_reg <= 1'b1;
	  3: toggle_reg <= 1'b1;
	endcase
     end

endmodule

For those who missed the difference, here it is: The polarity of z_reg, as a data out enabler, is reversed. The z_reg register which is implemented by the synthesizer turns out to be in the wrong polarity for the tri-state buffer in the pad, so the OLOGIC is used as a piece of combinational logic. A NOT gate, to be precise. It would be wiser and possible, of course, to implement both the NOT gate and the flip-flop inside the OLOGIC, but it looks like the tools don’t take it that far.

But let’s be a bit fair. These oddities could be fixed simply by adding a single synthesis hint:

// synthesis attribute IOB of z_reg is "TRUE"

The thing is, that the tools managed without this hint in several cases when targeting Spartan-3 devices. What’s ugly here,  is not that the synthesis pragma is necessary, but that the tools behave differently suddenly.

And another small pitfall: Don’t put double quotes around the register’s name (z_reg in our case). That will cause the synthesizer to silently ignore the pragma comment. It’s OK to put them around “TRUE” but not around a name which lives in the synthesizer’s name space.

Bottom line

When porting to Virtex-4 (and most likely newer FPGAs) keep a very close eye on where the IOB registers are placed. Also, be aware of how picky the tools have become about the similarity between paired OLOGIC and ILOGIC.

Getting the right names in the UCF file: Using netgen

The problem: NGDBUILD tells you it can’t find a net or instance given in the UCF file. It’s likely that the synthesizer changed the names, sometimes slightly and sometimes beyond recognition. You need these names to define a timing group, for example, but how do you know them?

Normally, I would get the net and instance names from FPGA Editor, or possibly from the Timing analyzer. But without any successful place-and-route, how can I know what names the synthesizer gave them, if I can’t even get through NGDBUILD?

Solution: Create a simulation model in Verilog (it’s also possible in VHDL, but I’ll show Verilog):

If my synthesis gave mydesign.ngc, simply write at command prompt (to most of you it’s a DOS window):

netgen -ofmt verilog mydesign.ngc delme.v

And delme.v will contain the simulation model. It’s a fairly readable file, in which the design is broken down to small primitives, which makes it pretty heavy. But the names used for nets and logic are those that go to NGDBUILD, and with some searching in the text file, one can get around.

Note that if mydesign.ncd is used rather than mydesign.ngc, you’ll get the simulation model for the post-PAR result (which can be useful too at times).
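The "some searching" part can be made a bit less painful with standard text tools. The following is only a sketch: find_names is a name made up for this example, and it assumes nothing about netgen's output beyond it being a text file in which names are delimited by whitespace, parentheses, commas or semicolons.

```shell
#!/bin/bash
# Print every distinct token in a netgen-generated file that contains
# the given fragment (matched as a fixed string, not a regex).
find_names() {
  # $1 = name fragment, $2 = file produced by netgen (e.g. delme.v)
  tr -s ' \t();,' '\n' < "$2" | grep -F -- "$1" | sort -u
}

# Example:
#   find_names ctrl_dout delme.v
```

The result is a short list of full names to copy into the UCF file, instead of a huge file to scroll through.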

The PCF file: Xilinx timing constraints as the tools understood them

One of the problems with setting up timing constraints in the UCF file, is to be sure that you got the right elements in, and kept the unnecessary ones out.

Suppose I wrote something like

NET "the_clock" TNM_NET = "tnm_ctrl_clk";
TIMESPEC "TS_ctrl_clk" = PERIOD "tnm_ctrl_clk" 40 ns HIGH 50 %;

What logic element does it apply to? Did it work like I expected?

The information can be obtained by creating a timegroup report in the Timing Analyzer, but it’s actually available in a much easier way: The PCF file, which is created by the MAP tool. This file has the same syntax as the UCF file, but it reflects the constraints as understood by the tools.

You will find the as-made pin placements there (not shown here), and the timing groups as TIMEGRP statements. It goes something like:

TIMEGRP tnm_ctrl_clk = BEL "controller/bus_oe_16" BEL
        "controller/ctrl_dout_15" BEL "controller/bus_oe_15" BEL
        "controller/ctrl_dout_14" BEL "controller/bus_oe_14" BEL
        "controller/ctrl_dout_13" BEL "controller/bus_oe_13" BEL
        "controller/ctrl_dout_12" BEL "controller/bus_oe_12" BEL
        "controller/ctrl_dout_11" BEL "controller/bus_oe_11" BEL
        "controller/ctrl_dout_10" BEL "controller/bus_oe_10" BEL
        "controller/ctrl_dout_9" BEL "controller/bus_oe_9" BEL
        "controller/ctrl_dout_8" BEL "controller/bus_oe_8" BEL
        "controller/ctrl_dout_7" BEL "controller/bus_oe_7" BEL
        "controller/ctrl_dout_6" BEL "controller/bus_oe_6" BEL
        "controller/ctrl_dout_5" BEL "controller/bus_oe_5" BEL
        "controller/ctrl_dout_4" BEL "controller/bus_oe_4" BEL
        "controller/ctrl_dout_3" BEL "controller/bus_oe_3" BEL
        "controller/ctrl_dout_2" BEL "controller/bus_oe_2" BEL
        "controller/ctrl_dout_1" BEL "controller/bus_oe_1" BEL
        "controller/ctrl_dout_0";

There you have it, in plain text. The relevant constraint is just a few rows away:

TS_ctrl_clk = PERIOD TIMEGRP "tnm_ctrl_clk" 40 ns HIGH 50%;

As simple as that.
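For large groups, it may be handy to pull the member list out of the PCF file automatically. Here is a minimal sketch, assuming the TIMEGRP statement looks like the excerpt above (unquoted group name, quoted members, a terminating semicolon). timegrp_members is a name invented here.

```shell
#!/bin/bash
# Print the quoted member names of one TIMEGRP statement in a PCF file,
# one per line, from the TIMEGRP keyword up to the closing semicolon.
timegrp_members() {
  # $1 = group name, $2 = PCF file
  awk -v grp="$1" '
    $1 == "TIMEGRP" && $2 == grp { inside = 1 }
    inside {
      line = $0
      while (match(line, /"[^"]*"/)) {
        print substr(line, RSTART + 1, RLENGTH - 2)
        line = substr(line, RSTART + RLENGTH)
      }
      if ($0 ~ /;/) inside = 0
    }
  ' "$2"
}

# Example:
#   timegrp_members tnm_ctrl_clk mydesign.pcf
```

Piping the output through wc -l gives a quick sanity check on whether the group is as large as expected.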

Catching the transient cookies: Log in, then crawl

The old way

Sometimes all you need is a quick crawl within a site, which requires logging in first. There are two main techniques I can think of: One is to POST the login form with your script, and get the necessary cookie setting. The second is to log in manually with a browser, and then hand over the web cookies to your script. Let’s start with the first (traditional?) method:

You could use WWW::Mechanize for that (not that I’ve tried), or use the good old LWP. Something like:

#!/usr/bin/perl

use warnings;
use HTTP::Request::Common qw(POST);
use HTTP::Cookies;
use LWP::UserAgent;

$basedir = 'http://www.somesite.com/';

# Create a cookie jar and log into the server

$ua = LWP::UserAgent->new;

$ua->agent("Mozilla/5.0"); # pretend we are a very capable browser
$jar = HTTP::Cookies->new();
$ua->cookie_jar($jar);

my $req = POST $basedir.'login.php',
  [ username => 'dracula',
    password => 'bloodisgood'
  ];
print "Now logging in...\n";
$res = $ua->request($req);

# We're not really interested in the result.
# This was only a cookie thing.

die "Error: " . $res->status_line . "\n"
  unless ($res->is_success);

# And now we continue to whatever we wanted to do

The problem is that sometimes the login form is complicated. At times it’s obfuscated intentionally, and uses several tricks to make it difficult to automate the login. Sniffing a successful login (your own, I hope) may be helpful, since the correct POST data is there. If the login is through https, just go through the web page, replace all “https” with “http” and make a fake login. It may not log in for real (it usually does), but at least you have the dump info.

But the bottom line is that it may be difficult. In some cases, it’s easier to do the login manually, and continue with your script from there.

Cookie stealing basics

So the plan is to log in manually, and then give away the web cookies to your script. The target server can’t tell the difference. In extreme cases, you may need to set up the HTTP headers, so that your script and the browser send exactly the same ones. I suggest making your script identify itself with exactly the same user agent header as your browser. Some sites check that, and reject your login if there’s no match. Believe that.

There are several examples of this trick. One is using wget and its --load-cookies flag. It’s quick and dirty, and loads cookies from a cookie file in good old Netscape format. Some browsers can export their cookies to such a file (Firefox uses another format internally, for example). But there is still one major problem, and that’s the transient cookies.

Who ate my (transient) cookie?

Every cookie, which is sent from the server to the browser (or whatever you have there) has an expiration date. Some cookies are marked to be erased when the browser (ha!) quits. These are transient cookies.

The thing is, that the browser has no reason to write these cookies to the cookie file on the disk. Why write something that will be erased anyhow? So stealing cookies from the cookie file doesn’t help very much, if the crucial cookies are transient. If you can’t stay logged in to a site after shutting down your browser and getting it back on, that site may be using transient cookies for its session.

The only simple way I know to get hold of those transient cookies, is to dump them into a file while the browser is alive and kicking. The Export Cookies add-on for Firefox does exactly that.

I suppose that wget works properly with the add-on’s output. I haven’t tried. I wanted to do this with Perl.

Importing Netscape cookies to LWP

It was supposed to be simple. The HTTP::Cookies::Netscape should have slurped the cookie file with joy, and taken things from there. But it didn’t. The module, if I may say so, has a problematic programming interface, which is miles away from the common Perl spirit.

The worst thing about it, is that if no cookies are imported because it didn’t like the cookie file, or didn’t find it at all, there is no notification. An empty cookie jar is silently created. I think that any Perl programmer would expect a noisy die() on that event. I mean, if the cookie file wasn’t read, there’s no point going on in 99% of the cases.

A second problem is with transient cookies. Their expiration time in the cookie file is set to 0 (surprise, surprise, they’re not supposed to survive at all), and the module simply discards them. I don’t blame the module for that, since transient cookies aren’t supposed to be found in a cookie file.
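To check whether a given cookie file actually contains such transient cookies, the expiry field can be inspected directly. A sketch, assuming the usual Netscape layout of seven tab-separated fields per line (domain, flag, path, secure, expiry, name, value) and comment lines starting with "#". The function name transient_cookies is made up here.

```shell
#!/bin/bash
# List domain and name of the cookies whose expiry field is 0 in a
# Netscape-format cookie file (seven tab-separated fields per line).
transient_cookies() {
  # $1 = cookie file, e.g. the cookies.txt written by the export add-on
  awk -F '\t' 'NF >= 7 && $1 !~ /^#/ && $5 == 0 { print $1 "\t" $6 }' "$1"
}

# Example:
#   transient_cookies cookies.txt
```

If the session cookie of the site in question shows up in this list, a stock Netscape-format reader that drops expired entries will lose it.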

I’ve made the necessary changes to make it work with the Export Cookies add-on, and got a new module, Exported.pm (click to download). I suggest copying it next to where you find Netscape.pm in your Perl distribution.

Bottom line, the script looks like this:

#!/usr/bin/perl

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies::Exported;

my $baseurl = 'http://www.somesite.com/juicydata.php';
my $ua = LWP::UserAgent->new;
# A user agent string matching your browser is a good idea.
$ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7');

# Not loading file in new(), because we don't want writeback
my $cookie_jar = HTTP::Cookies::Exported->new();
$cookie_jar->load('cookies.txt');
$ua->cookie_jar($cookie_jar);

my $req = HTTP::Request->new(GET => $baseurl);

my $res = $ua->request($req);
if ($res->is_success) {
  my $data = $res->content;

  # Here we do something with the data
}
else {
  die "Fatal error: " . $res->status_line . "\n"; # method calls don't interpolate inside strings
}

Google ads: How to lose the wait

Someone once told me that WWW stands for World Wide Wait. But when the page is held because the browser waits for a Google Ad to come in, that’s really annoying. I didn’t want that to happen in my site.

So here’s the story about how to have the page displayed first, ads later. One may argue that you want your money machine up (ha!) before giving away the goods. One could also argue that the ads coming in late will draw more attention. I say: A slow site is like no site.

And as with any story, the solution comes in the end. If you want to know who the killer is, and don’t care about the drama, just skip to the end.

Trial #1: Absolute positioning

Spoiler: This didn’t work for me. But it may work for you. I don’t know if it’s because I’m using a lot of Javascript in my page or not.

The idea was that if the browser doesn’t need to know what’s inside the ad box, it will go on rendering the page. Since an absolutely positioned section is out of the flow, and won’t change any other part’s placement, I hoped that the browser would be smart enough not to wait for the ads. Or stupid enough to overlook the inconsistency in scripting. I was wrong.

Anyhow, this is the code I used:

<div style="position: relative; height: 620px; width: 200px; overflow: hidden; ">
<div style="position: absolute; top: 0px; left: 0px">
<script type="text/javascript"><!--
google_ad_client = "pub-xxxxxxx";
google_ad_slot = "xxxxxxxx";
google_ad_width = 160;
google_ad_height = 600;
//-->
</script>
<script type="text/javascript"
src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script>
</div></div>

The outer DIV allocates the necessary space on the page. The inner DIV is absolutely positioned, so no matter what comes in there, everything else will remain in place.

What really happened: IE6 went right on with the rest of the page, Firefox still waited for the ads to load. Even though IE played it nice this time, it’s actually a bug in IE.

The thing is, that I have a lot of Javascript embedded in my page. Since the Google ad code consists of Javascript itself, the browser must wait until that code is loaded before running any other Javascript code (which may, as far as the browser is concerned, depend on it). Firefox waited to prevent unpredictable execution of following Javascript code, IE just went on. Ignorance is bliss.

This way or another, the problem remains. I mean, even Microsoft will fix this bug some day.

(non)-Trial #2:

The idea is simple. If the Google ad code appears just before the closing </body>, the browser must be pretty boneheaded to wait for it. So now the trick is to get the ad in the right place.

Proposed trick: Create a <div> container at the end of the document, and push it to its place with absolute positioning. With Javascript or without. This can be done only if one knows the exact pixel position in terms of the entire document. There is no safe way, that I know of, to get the absolute position of an element by its ID. Nor is there one to get it into a known position within a containing block other than the one you’re currently inside.

There are scripts out there which claim to do that, but they don’t rely on any standard. And getting a block of ads in the wrong place is too bad to risk.

And I haven’t even mentioned what happens when the page is resized. Unless all surrounding elements are nailed in place with absolute positioning, bad things can happen…

Anyhow, this method didn’t fit my case. I dropped it.

Trial #3: Hijacking document.write()

One widely proposed solution is to hack document.write(), which Google uses to implant the extra HTML. I don’t like it very much, because it’s well, hacky. But I tried. And then I saw what the script in show_ads.js produces:

<script src="http://pagead2.googlesyndication.com/pagead/expansion_embed.js">
</script>
<script src="http://googleads.g.doubleclick.net/pagead/test_domain.js">
</script>
<script>window.google_render_ad();</script>

So show_ads.js writes a script that loads other scripts? Doing what? Calling yet others?

This was just enough to scare me off. If hijacking means to recursively run scripts loaded by scripts, there’s a good chance to really mess things up. So I gave this idea up as well.

Trial #4: Moving the DOM element

This is the way to do it. Let the browser reach the ad segment at the end of the document, and then manipulate the document tree, moving the element to its right position. This is pretty intrusive, and it turns out that it pushes the browser’s tolerance to the limit.

But first, let’s look at the code. It has two parts. First, we put this where we really want the ads:

<div style="position: relative; height: 620px; width: 200px; overflow: hidden; ">
<div id="googletarget" style="position: absolute; top: 0px; left: 0px">
</div></div>

That’s pretty much like what I did in trial #1, with two important differences: The google ad isn’t here, of course, and I’ve given the inner DIV an ID, so I can target it later. Absolute positioning is still important, because I don’t want the page to flicker when the ad comes in.

Then, at the end of the document (before closing </body>) we have:

<div style="display: none;"><div id="googleads">
<script type="text/javascript"><!--
google_ad_client = "pub-xxxxxxx";
google_ad_slot = "xxxxxxxx";
google_ad_width = 160;
google_ad_height = 600;
//-->
</script>
<script type="text/javascript"
src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script>
</div></div>
<script type="text/javascript">
document.getElementById("googletarget").appendChild(document.getElementById("googleads"));
document.getElementById("googleads").style.display="block";
</script>

That rings an old bell too. It’s exactly like the original script, now embedded in other DIVs and with an extra script snippet, which moves the ads to their right position using appendChild(). The W3C is very clear that appendChild() never duplicates a node, but moves it if it’s already placed in the document tree.

And the DIV into which the ads are loaded is wrapped by another DIV which is set to “display: none”. This makes them invisible in their original position, so nothing flickers at the bottom of the page and the scroll bars don’t go crazy. And even better, if the appendChild() fails (because the browser is old or primitive), the ads remain invisible. Which is rare enough to neglect businesswise, and harmless in terms of how badly we’ve messed up the page.

In IE6, it was necessary to set the display mode back to “block” after moving it. That’s what the line after the appendChild() is for.

By the way, “visibility: hidden;” wasn’t enough, because the Google data contains an IFRAME, which doesn’t respect the external DIV’s visibility status.

And in the extreme case, in which the browser doesn’t respect the “display: none” property, we will have some ugliness at the bottom of the page. Functionally, it’s still harmless, and graphically it’s not so bad compared with what such a browser’s user sees every day.
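
The move can also be wrapped in a function, so that a missing element or a missing appendChild() leaves everything as it was (that is, with the ads hidden). The following is a minimal sketch under that assumption. moveElement() is a hypothetical helper name, and the tiny fake document below it exists only so that the function can be tried outside a browser.

```javascript
// Hypothetical helper: move the element with ID sourceId into the element
// with ID targetId, and make it visible. Returns false, changing nothing,
// if either element or appendChild() is missing.
function moveElement(doc, sourceId, targetId) {
  if (!doc || typeof doc.getElementById !== "function") return false;
  var source = doc.getElementById(sourceId);
  var target = doc.getElementById(targetId);
  if (!source || !target || typeof target.appendChild !== "function") return false;
  target.appendChild(source);      // Moves the node, doesn't copy it
  source.style.display = "block";  // Needed on IE6, harmless elsewhere
  return true;
}

// Minimal stand-in for a document, just enough for a dry run
function FakeElement(id) {
  this.id = id;
  this.parent = null;
  this.children = [];
  this.style = {};
}
FakeElement.prototype.appendChild = function (child) {
  if (child.parent) { // Detach from the old parent first, like the real DOM
    var i = child.parent.children.indexOf(child);
    child.parent.children.splice(i, 1);
  }
  child.parent = this;
  this.children.push(child);
  return child;
};

var hidden = new FakeElement("hidden");
var ads = new FakeElement("googleads");
var target = new FakeElement("googletarget");
hidden.appendChild(ads);

var fakeDoc = {
  elements: { googleads: ads, googletarget: target },
  getElementById: function (id) { return this.elements[id] || null; }
};

var moved = moveElement(fakeDoc, "googleads", "googletarget");
var missing = moveElement(fakeDoc, "googleads", "nosuchid");
```

In the page itself, the call would be moveElement(document, "googleads", "googletarget").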

So, this looks like a bulletproof solution. Of course it worked on any fairly updated browser I tested with. What could possibly go wrong?

The answer came to me when I tried it on a Mozilla 0.99 (seven years old). Don’t think I’m using it for myself, but testing on a steam engine is a good way to find weak spots. In our case, moving the ads around caused the browser to display a blank page. All was fine just before the call to appendChild(), but then the page went blank, and the browser started to behave strangely.

This is bad news. I’m trying to make a page load slightly faster, and I might make it unloadable (to some) instead. Even if I don’t believe that someone really uses a browser from 2002, a new, cutting-edge and somewhat buggy browser might fall in the same hole.

The problem, as it turned out, is that the element I move around consists of scripts. Mozilla 0.99 executes the script once when it’s loaded, and once again when it’s moved. Modern browsers execute the script once. IE6 is in the middle: It indeed runs the script once, but loads it twice. Not harmful to the user, but this indicates that moving a script around is not an issue of the past.

It also looks like there was an issue with appendChild() and scripts around the development of Firefox 3.1, but given the relatively little attention it got (and the fact that the relevant remark in the docs has vanished), I suppose this is a non-issue.

To see how the browser behaves, I wrote a small snippet:

<div id="taker"></div>
<p>Hello, world</p>
<div id="loser">
<div id="node">
<script>
alert("Script running");
document.write("I was first!");
</script>
</div></div>
<script>
var node = document.getElementById("node");
alert("Now swapping");
document.getElementById("taker").appendChild(node);
</script>

The idea is that the two rows swap places by virtue of appendChild(). The “script running” alert should appear only once, of course. On Mozilla 0.99 it popped up twice, and the page ended up with “I was first” written before and after “Hello world”, which sort-of explains why nothing worked with Google ads. Other browsers I checked, including Konqueror 3.0.0 from 2002, ran the script once, as expected.

Conclusion

The solution I picked was appendChild(), of course. But I also realize that moving around an element which contains a script pushes the browser to its limits. On the other hand, AJAX web sites are becoming increasingly popular, so browsers are more and more likely to support this properly in the future.

For my own use, I’ve decided to do this trick only on browsers I’ve tested. This is possible since the page is produced by a server-side script, which checks the browser version. This will make the page load slightly slower on non-mainstream browsers, but on the other hand, it guarantees that the page loads at all.
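
The decision itself is simple enough to sketch. The snippet below is only an illustration of the idea, written in JavaScript for the sake of the example. The server-side script mentioned above isn’t shown in this post, and both the function name shouldMoveAds() and the patterns in the list are placeholders, not a real list of tested browsers.

```javascript
// Hypothetical whitelist: patterns for User-Agent strings of browsers on
// which the trick was tested. The entries are placeholders, not a real list.
var testedBrowsers = [
  /Firefox\/3\./,
  /MSIE [67]\./
];

// Returns true if the ads should be loaded at the end of the page and moved,
// false if the plain ad code should go where the ads appear.
function shouldMoveAds(userAgent) {
  if (!userAgent) return false; // No header: play it safe
  for (var i = 0; i < testedBrowsers.length; i++) {
    if (testedBrowsers[i].test(userAgent)) return true;
  }
  return false;
}
```

Anything that isn’t on the list, including a browser that sends no User-Agent header at all, gets the plain ad code in place.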

One little button for Firefox: XUL is cool

Introduction

This post summarizes some of my findings while attempting to make a decent single-button Firefox extension. I’ve tested the code shown below with Firefox 1.5 and 3.0 under Windows, and Firefox 3.0 under Linux, and there were zero portability issues.

I would suggest reading a tutorial about toolbars and possibly Mozilla’s own tutorial about XUL before the notes below.

The magic

All I wanted was to put a little button on my Firefox browser. At some point I realized that I won’t get things working like I want unless I understand what I’m doing. So I dived into the thing called XUL.

I have to admit that it looks terrible at first glance. I started to change my mind when I realized that XUL is a format for describing GUI items, and that my case of toolbar extensions is a very narrow one. To get an idea of what XUL can do, just have a look at the so-called periodic table. But that’s really nothing compared with the simple fact that all of Firefox’ GUI (menus, buttons, dialogs) is implemented in XUL and JavaScript.

But the real enlightenment came to me on pages 391-392 of Programming Firefox. That’s where I realized that I should unzip the browser.jar file, which is exactly where one could expect to find it: In the Mozilla Firefox/chrome directory (where the application is installed, not where it puts its cache data etc.)

Also, unzipping classic.jar is useful to see the names of the image files used, so that the relevant place can be found easily in browser.jar’s files. It’s much easier to find the place at which the reload button is defined when its image file’s name is known.

Reverse engineering

Let’s investigate the reload button, then. It wasn’t too difficult to realize that its icon can be found in the classic.jar, in a file named Toolbar.png (find the directory yourself). Searching for that string in all of classic.jar’s files, the next clue was  in browser.css (same directory as Toolbar.png).

Now, I won’t pretend to really understand what’s going on in that CSS file, and neither did I make a real effort to do so. All I needed was to come across a CSS definition like:

#reload-button {
  -moz-image-region: rect(0px 96px 24px 72px);
}

If you’re not so good at CSS, that’s a definition for an object with ID ‘reload-button’. I wonder what that object could be…

So now I had my next clue: Look for reload-button in browser.jar’s files. Surprise, surprise, it’s in browser.xul. The button is declared as

      <toolbarbutton id="reload-button" class="toolbarbutton-1 chromeclass-toolbar-additional"
                     label="&reloadCmd.label;"
                     command="Browser:Reload"
                     tooltiptext="&reloadButton.tooltip;"/>

Which, to tell the truth, is not so interesting. What I’m really looking for, is the ID of the enclosing XML tag. Scrolling up a bit, I found it:

<toolbarpalette id="BrowserToolbarPalette">

This is all I could ask for. But what is a “toolbarpalette”, I asked myself. Google took me straight to Mozilla’s documentation, which spelled it out for me.

The upshot is, of course, that there was no need for reverse engineering, since the documentation tells me exactly how to get the button at the desired position. On the other hand, I didn’t find that information before.

I see no point in telling the whole story of going back and forth from searching the jar files for strings, then trying Google and/or trying to squeeze in an element of my own. The bottom line is that once I understood how the XUL represents the GUI itself, it became fairly easy to find the right magic words.

So I’ll go on with the bottom line of how to put the button in some interesting places.

The skeleton

I’ll skip the framework, since it’s documented all over. You can always unzip any Firefox extension XPI file to get an idea of what should be in it.

I assume that we already have a Javascript file, testingbutton.js. In that file, we have a function called testing_doit(), which we want executed every time our button is pushed.
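
If you don’t have such a file yet, something like the following will do as a placeholder. It’s only a sketch: the counter and the message are made up for the example, and the only thing the rest of this post relies on is that a function named testing_doit() exists.

```javascript
// testingbutton.js: a placeholder for the real thing
var testing_count = 0; // How many times the button has been pushed

function testing_doit() {
  testing_count++;
  var message = "Testing button pushed " + testing_count + " time(s)";
  // alert() is available in the browser window's scope. The check is
  // there only so that the file can be tried out elsewhere as well.
  if (typeof alert === "function") {
    alert(message);
  }
  return message;
}
```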

Basically, the XUL file has the following format:

<?xml version="1.0"?>
<?xml-stylesheet href="chrome://testingbutton/skin/testingbutton.css" type="text/css"?>

<overlay id="Testing-Overlay"
  xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul">

  <script type="application/x-javascript"
    src="chrome://testingbutton/content/testingbutton.js" />

... One or more element definitions come here ...

</overlay>

The important magic word here is <overlay>. What it means is that we can add elements to existing GUI elements by using their ID.  For example (shown in detail below), if I want to add an element to the toolbarpalette with ID “BrowserToolbarPalette”, I simply repeat its opening tag, and put my things there. This will not override the existing definition, but rather append new elements.
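
The merging rule can be illustrated with a toy model. This is only a sketch of the idea, and has nothing to do with how Firefox actually implements overlays: The elements are plain objects with an ID and a list of children, applyOverlay() is a made-up function, and the two small trees are stand-ins for browser.xul and for an overlay file.

```javascript
// Find the element with the given ID in a tree of { id, children } objects
function findById(node, id) {
  if (node.id === id) return node;
  var children = node.children || [];
  for (var i = 0; i < children.length; i++) {
    var found = findById(children[i], id);
    if (found) return found;
  }
  return null;
}

// For each top-level overlay element, look up the existing element with
// the same ID and append the overlay's children to it. Existing children
// are left alone. In this toy model, overlay elements with no match are
// simply skipped.
function applyOverlay(base, overlay) {
  overlay.children.forEach(function (item) {
    var match = findById(base, item.id);
    if (!match) return;
    match.children = (match.children || []).concat(item.children || []);
  });
  return base;
}

// Stand-in for the browser's own XUL
var browserWindow = {
  id: "main-window",
  children: [
    { id: "BrowserToolbarPalette", children: [{ id: "reload-button" }] }
  ]
};

// Stand-in for the overlay: same ID on the palette, one new button in it
var overlay = {
  id: "Testing-Overlay",
  children: [
    { id: "BrowserToolbarPalette", children: [{ id: "Testing-Doit-Button" }] }
  ]
};

applyOverlay(browserWindow, overlay);
// The palette now holds reload-button followed by Testing-Doit-Button
```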

Note that any number of snippets from the examples below can be inserted to add several controls to the browser window.

And just before starting, here’s the CSS that comes along: Basically, it defines the image to use on the buttons, which are pinpointed through their IDs (using classes may have been more elegant?)

#Testing-Toolbar-Button {
    list-style-image: url("chrome://testingbutton/skin/logo.png");
}
#Testing-Doit-Button {
    list-style-image: url("chrome://testingbutton/skin/logo.png");
}
#Testing-Doit-Button2 {
    list-style-image: url("chrome://testingbutton/skin/logo.png");
}

In the toolbar (not)

I’ll start with a spoiler: The code below will add the button to the Toolbar Palette and not to the toolbar itself. In other words, after installing the extension, nothing will happen until the user manually puts the button in place. Not very impressive. I suppose there is a way to get the button there automatically, but given the alternatives, I didn’t bother to check how to do that.

Here’s the code:

  <toolbarpalette id="BrowserToolbarPalette">
    <toolbarbutton
      id="Testing-Doit-Button"
      label="Testing button"
      tooltiptext="Do it!"
      oncommand="testing_doit();"
    />
  </toolbarpalette>

The only really interesting thing here is that the ‘toolbarpalette’ tag shares its ID with the one defined in browser.xul. That’s the point, after all.

In a toolbar of its own (ugly)

I’m showing this one, only because it’s popular in examples I’ve seen, and it turns out ugly for a single button. The code below creates a full toolbar, from left to right, and puts one little button in it. Looks pathetic. Makes sense if you have a lot of buttons (and users who love your application a lot).

  <toolbox id="navigator-toolbox">
    <toolbaritem flex="0">
      <toolbarbutton id="Testing-Toolbar-Button"
       tooltiptext="Do it!"
       oncommand="testing_doit();" />
    </toolbaritem>
  </toolbox>

Note that ‘toolbarbutton’ has an ID which is mentioned in the CSS file. This is how the graphical icon appears.

In the existing toolbar (bingo!)

Well, I’m not sure if this is exactly where you would expect it, but the code below puts the button at the rightmost position next to the toolbar the user drags bookmarks to. The purely logical place might have been next to the address bar, but the actual position is excellent. Code follows:

  <toolbar id="PersonalToolbar">
    <toolbarbutton id="Testing-Doit-Button2"
     tooltiptext="Do it!"
     oncommand="testing_doit();" />
  </toolbar>

(By the way, I found the magic ‘PersonalToolbar’ ID simply by looking at browser.xul and trying it out. The name sounded promising, after all)

In the status bar (So cool!)

At the bottom-right corner, with minimal distraction:

  <statusbar id="status-bar">
    <statusbarpanel class="statusbarpanel-iconic"
     id="Testing-status-button"
     onclick="testing_doit();"
     tooltiptext="Do it!"
     src="chrome://testingbutton/skin/logo.png" />
  </statusbar>

An entry in the context menu

For users who don’t like to travel much with their mouse (laptops?), we have the menu which pops up when you right-click. How to add an entry to the context menu is described here. Note that the executed script can get some info about what was right-clicked (it’s called a context menu, after all).

Anyhow, this is what I made of it:

  <popup id="contentAreaContextMenu">
    <menuitem id="Testing-menu-entry"
     label="Do it!"
     oncommand="testing_doit();"/>
  </popup>

Conclusion

This XUL framework takes some time to get used to, but it’s a fantastic demonstration of how an open-source project can expose its internals without needing to compile anything. Despite some difficulty, working with Firefox’ internals is far more enjoyable than its well-known counterpart. Which shouldn’t surprise anyone.

Making an IE toolbar button: Notes to self (from hell)

Nothing to see here, folks…

These are just some notes I wrote down for myself, in case I’ll ever want to repeat this mess.  Microsoft Visual Studio 2003 was used for developing the C++ class as well as the setup project. For testing, IE 6 was used.

A button running a script or executable

The simple way is described here. The interesting thing is that it all boils down to setting up some registry keys and values, putting a couple of files somewhere (containing the icon and whatever you want to execute), and off you go. The execution target can be some EXE or a script, including Javascript (!) which is pretty cool. What is less cool is that the script is pretty crippled. In particular, it can’t manipulate the browser (as of IE6) and I’m not sure about its capabilities in manipulating the current document. So it’s easy, but not very useful.

Now, seriously

I didn’t want to face the facts, but I had no choice: There is no easy way to write a Toolbar button that actually does something useful. A terrible Microsoft document (“the guide” henceforth) offers some clues about how to make a COM DLL for this purpose.

I’ve seen plenty of web sites offering extensions for Firefox, but not for Internet Explorer. I thought the reason was that people with brains prefer Firefox. After writing an extension for both IE and Firefox, I realize that the huge difference in difficulty is the probable reason.

Making a “Hello, world” toolbar button

  • In Visual Studio, create a “regular” ATL project. Keep it as DLL, uncheck “Attributed” and then check ““. Otherwise a separate stub/proxy DLL is created, and the IID/CLSID/LIBID symbols aren’t resolved in the h-file. I’m sure there’s a better way to solve this. I’m sure it would take me days to find out how.
  • Right-clicking the “Source Files”, add a class. Pick ATL Simple Object, and be sure to set the Options: Aggregation is “No” and IObjectWithSite checked.
  • (Build it and see that it is OK. Just so you know it’s possible)
  • Now open the .rc file in the solution explorer. Just walk through its properties and make sure that they make sense. The language may be set to something unnecessarily local. In particular, fix the Version->VS_VERSION_INFO so that Company Name and such say something more respectable than TODO-something.

At this point, we sort-of follow Microsoft’s disastrous guide. The first changes are in the .h-file.

  • The guide tells us to add the IOleCommandTarget interface. This boils down to adding only two lines (the public declaration and COM_INTERFACE_ENTRY), which are those mentioning IOleCommandTarget explicitly. All the rest is already there, courtesy of Visual Studio.
  • Add an #include <atlctl.h> in the beginning.
  • And immediately after END_COM_MAP:
    public:
        STDMETHOD(Exec)(const GUID *pguidCmdGroup, DWORD nCmdID,
            DWORD nCmdExecOpt, VARIANTARG *pvaIn, VARIANTARG *pvaOut);
        STDMETHOD(QueryStatus)(const GUID *pguidCmdGroup, ULONG cCmds,
            OLECMD *prgCmds, OLECMDTEXT *pCmdText);
  • These methods need to be implemented, of course. For a “Hello, world” application, this is enough (put in .cpp file):
    STDMETHODIMP Cjunkie::Exec(const GUID *pguidCmdGroup, DWORD nCmdID,
        DWORD nCmdExecOpt, VARIANTARG *pvaIn, VARIANTARG *pvaOut)
    {
         MessageBox(NULL, _T("Hello, world"), _T("It works!"), 0);
    
        return S_OK;
    }
    
    STDMETHODIMP Cjunkie::QueryStatus(const GUID* pguidCmdGroup, ULONG cCmds,
        OLECMD prgCmds[], OLECMDTEXT* pCmdText)
    {
    	int i;
    
    	// Indicate that we can do everything!
    
    	for (i=0; i<((int) cCmds); i++)
    		prgCmds[i].cmdf = OLECMDF_SUPPORTED | OLECMDF_ENABLED;
    
        return  S_OK;
    }
  • Just a word about the QueryStatus method implemented above: Microsoft describes what this function should do, but I found almost no sample implementation of it. Basically, the purpose of this function is to tell the world what the module is ready to do and what not. I went for an I-can-do-all approach, since any call to a toolbar button means “do your thing”. I’m not sure if this is the right thing to do, but given the promises regarding how narrowminded the calls are expected to be, I think this approach wins. I mean, ask a silly question, get a silly answer.
  • At this point, believe it or not, the project should build. The source code up to this stage is listed at the end of this post.

Setup project

  • Create a new Setup project. Give it a nice name (it will be the MSI file’s name)
  • Put its configuration as Release (as opposed to Debug) and check its “build” checkbox in the Configuration Manager if necessary. So it gets compiled…
  • Create a special folder (Windows Folder) to put the files in (too little to open an application folder for)
  • Make a subfolder in the Windows folder.
  • Put all files there: The DLL (Add->Project Output…->Primary output) and the icon file (read its format here).
  • Set the “Register” property of “Primary Output” to vsdrpCOMSelfReg (explained below).
  • Open a properties window, and set up the Setup project’s properties.
  • Open the Setup project’s Registry Editor and set up the entries. A sample screenshot below.
  • Make sure that the ‘DeleteAtUninstall’ property of the extension’s GUID is  ‘True’ (but none of the others’!)
Visual Studio's Setup project: The Registry Editor


Note that the path to the Windows Folder is given as [WindowsFolder]. This makes the value point to where the file was actually installed. A list of such variables can be found here.

And of course, the ‘{4B19…}’ -thing is the button’s class ID (in GUID form). Put your own instead.

  • Next, I went for the User Interface Editor. That’s a great opportunity to make the installation process neater. First I removed the “Installation Folder” and “Confirm Installation” steps. The only folder used is the Windows folder anyhow, and with nothing to choose there is nothing to confirm.
  • Then a 500x70 BMP file was added to the target directory. This is used as a banner on the installation dialogs by setting the BannerBitmap property for each installation dialog. Since the banner is overlaid with black text, it makes sense to put the logo at the bottom right corner and keep the banner bright.

A note about registration

This was a really bad one. The DLL has to be registered as the owner of the GUID, so that when that GUID is mentioned in the Explorer’s extension list, Explorer knows what DLL to fetch and run “Exec” on. (I suppose the important part is an entry with the key HKEY_CLASSES_ROOT\CLSID\{here comes the GUID}. Or maybe HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\{here comes the GUID}?)

I wasn’t ready to think about pinpointing the keys to be set up (what do I know about Windows?). Neither was I ready to run Regsvr32 at installation for that (a great opportunity to fail the installation on a hostile computer).

The solution was proposed here: You go to the setup project, pick the item which marks the placement of the DLL (appears as “Primary Output from …”), right-click it and open the Properties page. There you change the “Register” property from vsdrpDoNotRegister (the default) to vsdrpCOMSelfReg.

Now, the project has an RGS file, which it seems wasn’t respected at all, but since the DLL’s registration is now secured, I don’t mind setting up the rest in the Setup project (the “Registry Editor” within a Setup project comes in handy for this).

Just a word of caution: In the Setup project’s Registry Editor, you need to line up some of the existing keys as if they should be added, so as to bring you to the desired path in the Registry (that is, ‘Microsoft’, ‘Internet Explorer’ and ‘Extensions’). Be sure that the ‘DeleteAtUninstall’ property of these is ‘False’, or you will cause some serious damage to the registry during uninstallation. Also, it’s a good idea to back up the complete registry before starting to play with the Setup project.

To make things a bit more complicated, your own GUID key should have its ‘DeleteAtUninstall’ property set to ‘True’, so that Explorer won’t look for your button after uninstalling.

Interaction with the browser

The “Hello world” application could have been written in Javascript. For some real action, just follow that horrible guide. At this stage, things actually get pretty easy.

  • To get a hold of the browser, we need to implement the SetSite method. Copied it right off the guide to the .cpp file.
  • The private property declaration, as well as the prototype of SetSite, were copied into the .h-file
  • At this point, the project built and ran (and I could verify that SetSite had been executed once, before the first call to Exec)
  • Then I switched to the Exec() method they offered. Basically, I changed nothing except the class name, and put zero instead of  navOpenInNewTab (not supported in my environment, which hasn’t heard about IE7).

Making a POST request

At some point, I decided that I needed to implement a POST request. This was more or less the stage at which I realized that I was actually writing Visual Basic, only in C++. The lesson learned was that maybe I should have started with Visual Basic (YUCK!) to begin with. And of course, I confirmed an old rule in Microsoft programming: “Prepare to spend a crazy amount of time to implement a trivial feature.”

Implementing POST forces the use of Navigate2, which is a quicksand of SAFEARRAYs and VARIANTs. The available examples show you how to get it done with code that makes you puke and looks like it depends on luck more than some solid API.

To my surprise and delight, I managed to narrow the whole thing down to this relatively-elegant code:

char postdata[] = "postdata=yeah";
const int postlen = (int) strlen(postdata);
CComVariant RequestUrl(_T("http://my.site.com"));

VARIANT noArg;
noArg.vt = VT_EMPTY;

VARIANT flags;
flags.vt = VT_I4;
flags.lVal = 0;

// Navigate2 wants the POST data as a SAFEARRAY of bytes
CComSafeArray<byte> pSar(postlen, 0);

for (int x=0; x<postlen; x++)
  pSar.SetAt(x, postdata[x]);

// Not named "postdata", which is already taken by the string above
CComVariant postvar(pSar); // Make this an array

m_spWebBrowser->Navigate2(&RequestUrl, &flags, &noArg, &postvar, &noArg);
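
As a side note, the string in postdata has the same format that a browser uses when it submits an ordinary HTML form: name=value pairs joined with “&”, with special characters escaped. Here’s a minimal JavaScript sketch for producing such a string. encodeFormBody() and encodeFormPart() are made-up names, and that the server on the other side expects this format is an assumption.

```javascript
// Escape one name or value. Forms encode a space as "+".
function encodeFormPart(text) {
  return encodeURIComponent(String(text)).replace(/%20/g, "+");
}

// Hypothetical helper: turn { name: value, ... } into a form-style POST body
function encodeFormBody(fields) {
  var pairs = [];
  for (var name in fields) {
    if (Object.prototype.hasOwnProperty.call(fields, name)) {
      pairs.push(encodeFormPart(name) + "=" + encodeFormPart(fields[name]));
    }
  }
  return pairs.join("&");
}

var simpleBody = encodeFormBody({ postdata: "yeah" });
var escapedBody = encodeFormBody({ q: "a b&c", n: 1 });
```

The first call just gives back the string used in the C++ code above. The second one shows why the escaping matters: an “&” inside a value would otherwise be taken as the start of the next pair.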

Sources of Hello World application

Since the most difficult part was to get the application to open a dialog box when the button was clicked, here’s the code for it. The main attempt here is to keep it simple.

And by the way, opening dialog boxes seems to be a bad idea. Explorer crashed a few times when the button was clicked before the dialog box was closed. It seems like the response to the Exec() call should be swift.

Header file:

// junkie.h : Declaration of the Cjunkie

#pragma once
#include "resource.h"       // main symbols

#include "myproject.h"
#include
#include  // For handling BSTRs
#include  // For handling BSTRs

// Cjunkie

class ATL_NO_VTABLE Cjunkie :
public CComObjectRootEx,
  public CComCoClass,
  public IObjectWithSiteImpl,
  public IDispatchImpl,
  public IOleCommandTarget
{
 public:
  Cjunkie()
    {
    }

  DECLARE_REGISTRY_RESOURCEID(IDR_JUNKIE)

    DECLARE_NOT_AGGREGATABLE(Cjunkie)

    BEGIN_COM_MAP(Cjunkie)
    COM_INTERFACE_ENTRY(Ijunkie)
    COM_INTERFACE_ENTRY(IDispatch)
    COM_INTERFACE_ENTRY(IObjectWithSite)
    COM_INTERFACE_ENTRY(IOleCommandTarget)
    END_COM_MAP()

    public:
  STDMETHOD(Exec)(const GUID *pguidCmdGroup, DWORD nCmdID,
		  DWORD nCmdExecOpt, VARIANTARG *pvaIn, VARIANTARG *pvaOut);
  STDMETHOD(QueryStatus)(const GUID *pguidCmdGroup, ULONG cCmds,
			 OLECMD *prgCmds, OLECMDTEXT *pCmdText);

  DECLARE_PROTECT_FINAL_CONSTRUCT()

    HRESULT FinalConstruct()
    {
      return S_OK;
    }

  void FinalRelease()
    {
    }

};

OBJECT_ENTRY_AUTO(__uuidof(junkie), Cjunkie)

And application file:

// junkie.cpp : Implementation of Cjunkie

#include "stdafx.h"
#include "junkie.h"

STDMETHODIMP Cjunkie::Exec(const GUID *pguidCmdGroup, DWORD nCmdID,
			   DWORD nCmdExecOpt, VARIANTARG *pvaIn,
			   VARIANTARG *pvaOut)
{
  MessageBox(NULL, _T("Hello, world"), _T("It works!"), 0);

  return S_OK;
}

STDMETHODIMP Cjunkie::QueryStatus(const GUID* pguidCmdGroup, ULONG cCmds,
				  OLECMD prgCmds[], OLECMDTEXT* pCmdText)
{
  int i;

  // Indicate that we can do everything!

  for (i=0; i<((int) cCmds); i++)
    prgCmds[i].cmdf = OLECMDF_SUPPORTED | OLECMDF_ENABLED;

  return  S_OK;
}