Wednesday, August 12, 2009

$weeks[-1]; # Home Stretch

We're into the last few days of the 2009 GSoC. Things have been moving very well for me, and yesterday I was able to land the last major feature I wanted to implement in MojoX::UserAgent before the end of the program: opportunistic request pipelining. That is, the UserAgent (UA) will pipeline your HTTP requests when possible, based on three different settings:
  1. the maximum number of connections per host;
  2. the maximum number of consecutive pipelined requests;
  3. the strategy you want to use (see below).
Three strategies are available. The first (and current default) is not to pipeline at all. So, if you allow, say, 2 connections per host, then one request at a time will be sent on each of those connections, and any other request spooled into the UA beyond that is simply kept waiting until a connection becomes available, on a first-come, first-served basis.

The second strategy is what I call "horizontal" mode. In this mode, the UA prefers to spread requests out over as many connections as possible before starting to pipeline. So if you allow 2 connections per host and 3 consecutive pipelined requests, then spool up three requests (to the same host) and run them, you'll wind up with one request going out by itself on one connection and a two-request pipeline on the other.

In contrast, with the last strategy, which I call "vertical", the UA prefers to pipeline as much as possible before spreading requests out over more connections. In this case, spooling and running three requests to the same host (also with maxconnections=2 and maxpipereqs=3) will result in all three requests being pipelined over a single connection. I hope that made sense...
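
In case a concrete model helps, here is a rough sketch of how the three strategies distribute requests over connections. It is written in Python purely for illustration and is not the MojoX::UserAgent code: the function name `schedule`, the strategy labels, and the tie-breaking rule (pick the least-loaded connection, earliest first) are all assumptions made for this example.

```python
from collections import deque


def schedule(requests, strategy, max_connections=2, max_pipelined=3):
    """Assign requests for one host to connections (hypothetical model).

    Returns (connections, waiting): each connection is the list of requests
    sent over it, and waiting holds requests that could not be placed yet.
    """
    # Without pipelining, a connection carries one in-flight request at a time.
    depth = 1 if strategy == "none" else max_pipelined
    connections = []
    waiting = deque()

    for req in requests:
        with_room = [c for c in connections if len(c) < depth]
        can_open = len(connections) < max_connections

        if strategy == "vertical" and with_room:
            # Vertical: keep filling an existing pipeline before opening more.
            with_room[0].append(req)
        elif can_open:
            # Horizontal (and no pipelining): spread over connections first.
            connections.append([req])
        elif with_room:
            # All connections are open: pipeline on the least-loaded one.
            min(with_room, key=len).append(req)
        else:
            # Nothing has room: wait, first come, first served.
            waiting.append(req)

    return connections, list(waiting)


if __name__ == "__main__":
    for strategy in ("none", "horizontal", "vertical"):
        print(strategy, schedule(["r1", "r2", "r3"], strategy))
```

With 2 connections and up to 3 pipelined requests, this model reproduces the three-request examples above: no pipelining leaves the third request waiting, horizontal mode ends up with one lone request and one two-request pipeline, and vertical mode puts all three on a single connection.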

In any case, basic session cookies and automatic following of redirects are both in place. There are still plenty of rough edges and missing pieces (e.g., no proxy support, cookies don't expire, and so on). In the last few days, I hope to write up some documentation and code up a single upstream change in Mojo that will have a positive impact on pipelining performance. Once that's done, I'll call it a day for GSoC (and publish some reflections here). But I do plan to continue working (voluntarily and on an ongoing basis) on MojoX::UserAgent and to distribute releases through CPAN. In fact, my PAUSE ID request came through yesterday. And I have a few other ideas that may just turn into projects...

Monday, August 3, 2009

We now resume our regularly scheduled programming

1, 2, 1, 2 - is this thing on?

OK, it's about time I updated this blog. For a couple of weeks after the midterm checkpoint, some external events interfered significantly with my progress. But I am now back to forging ahead, so to speak. I'll be adding one week to my schedule to compensate for the time lost - my original schedule went to the "suggested pencils down" date of August 10th, but I'll keep going full time until August 17th, the "firm" pencils down date. My main goal will be to get MojoX::UserAgent to a usable state. I would like it to be fully asynchronous (using the LWPng model as inspiration), with support for persistent connections, automatic following of redirects, basic (i.e., session) cookies and (hopefully) opportunistic request pipelining.

I've already created the source repository, and in the last week have gotten to a point where it makes simple requests, invokes an asynchronous callback upon completion and automatically follows redirects. I have basic, in-memory cookie storage implemented, and am now working on per-request cookie retrieval. And I have discovered the "wonderful" world of cookie "standards" - or lack thereof. Multiple specifications, ambiguities and de-facto standard behavior established by dominant user-agents and even by quirks of major banking sites (example 1, example 2 (note that these read almost like entertaining short stories, at least for me (YMMV))). It even looks like the IETF's http-state working group is being resurrected. Ah well, here I go...
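
To give an idea of what in-memory storage and per-request retrieval involve, here is a deliberately simplified sketch. It is in Python for illustration only and is not the MojoX::UserAgent implementation: the `Cookie` and `CookieJar` names, the helper functions, and the matching rules (domain-suffix and path-prefix matching, one common reading of the specs) are assumptions, and it ignores expiry, the Secure flag and the many ambiguities mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Cookie:
    name: str
    value: str
    domain: str
    path: str = "/"


def domain_matches(host, domain):
    """True if host equals the cookie domain or is a subdomain of it."""
    host, domain = host.lower(), domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def path_matches(request_path, cookie_path):
    """True if cookie_path is a directory-style prefix of request_path."""
    if request_path == cookie_path:
        return True
    if not request_path.startswith(cookie_path):
        return False
    # "/account" must match "/account/x" but not "/accounting".
    return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"


class CookieJar:
    """Hypothetical in-memory jar keyed by (domain, path, name)."""

    def __init__(self):
        self._cookies = {}

    def store(self, cookie):
        # A cookie with the same domain, path and name replaces the old one.
        self._cookies[(cookie.domain, cookie.path, cookie.name)] = cookie

    def cookies_for(self, host, path):
        matches = [
            c for c in self._cookies.values()
            if domain_matches(host, c.domain) and path_matches(path, c.path)
        ]
        # More specific (longer) paths are conventionally sent first.
        matches.sort(key=lambda c: len(c.path), reverse=True)
        return matches


def cookie_header(jar, host, path):
    """Build the value of the Cookie request header for one request."""
    return "; ".join(f"{c.name}={c.value}" for c in jar.cookies_for(host, path))
```

For instance, after storing `Cookie("sid", "abc", "example.com")` and `Cookie("pref", "x", "example.com", "/account")`, a request to `www.example.com/account/settings` would carry both cookies, while a request to `www.example.com/` would carry only `sid`.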