Guerrilla career move: send your resume in its present form to the longest-shot company that you’ve always wanted to work for.
/via @rands

Concurrency and Me

I got to play with the java.util.concurrent package today, and I have to admit that after some of the lessons I’ve learned in Scala it didn’t seem nearly as scary as people make it out to be. Of course, I suppose part of the danger is exactly that it doesn’t seem difficult, but I’m optimistic.

The reason I was playing with it at all is that I’m trying to export data from Google Sites, but the current exporter was producing broken links and a lot of HTML cruft. Admittedly, the cruft most likely comes from the generated markup and not the export, but I wanted to try to clean it up anyway. Since the exporter is open source, I pulled it down (and imported it into Git from Mercurial) and started hacking away.

The easiest way I could think of to do XML manipulation was to include Scala and use its built-in libraries. After I did that, I noticed the export was ‘hanging’, so I looked at the processes and saw it was only using the main thread. I hacked in some multi-threaded support with Executors. It isn’t perfect and I broke some tests, but I can export most of my data in a timely fashion now. The work I did on this is visible on GitHub.
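For anyone curious what "multi-threaded support with Executors" can look like, here is a minimal sketch. It is not the actual exporter code: the class name, `exportPage`, `exportAll`, and the page names are all hypothetical stand-ins, and it uses lambdas, so it assumes Java 8 or later. The idea is simply to hand each page to a fixed-size thread pool and wait for all of them to finish.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelExportSketch {

    // Hypothetical stand-in for the real per-page work (fetch, clean up, write).
    static String exportPage(String pageName) {
        return "exported:" + pageName;
    }

    // Runs exportPage for every page on a fixed-size pool and returns
    // the results in the same order as the input list.
    static List<String> exportAll(List<String> pageNames, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String name : pageNames) {
                tasks.add(() -> exportPage(name));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every submitted task has completed.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            // Restore the interrupt flag before giving up.
            Thread.currentThread().interrupt();
            throw new RuntimeException("export interrupted", e);
        } catch (ExecutionException e) {
            throw new RuntimeException("a page failed to export", e.getCause());
        } finally {
            // Always release the worker threads, even on failure.
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(exportAll(Arrays.asList("home", "about", "contact"), 2));
    }
}
```

A fixed pool keeps the number of simultaneous requests bounded, which seems like the polite choice when the work is mostly waiting on a remote service. The right pool size is a guess you would have to tune.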

D’oh: GAELucene

As I said just a bit ago, I took a look at GAELucene, and I guess I should have spent more time reading than writing, because I noticed this bit on the project home page when I checked out the source code:

The GAEDirectory is read only, that is, you can not use the Directory to build index! You should do indexing on another machine, then push the indices onto google appengine datastore with LuceneIndexPushUtil.

Because of the quota limitation of google appengine, GAELucene is not fit to run with huge indices, it does better for small indices, around 100Mb. For large changing indices, you need to find other solutions.

Since I am looking to use a changing index I fall into the category of “need to find other solutions.”

Sorry Google, I jumped the gun.

I recently tweeted that something broke when I updated my Quote Crate application from version 1.2.5 of the AppEngine SDK to version 1.3.1. As it turns out this was unfair to Google (sorry). It doesn’t appear to have had anything to do with the SDK change but rather my poor implementation of Lucene for the AppEngine datastore.

Since the data in my application was all test data, I wiped it and tested the application again. Much to my surprise, starting from an empty datastore, entries saved without a problem until I had nine of them. After the ninth entry it started dying again. This, along with the fact that it still fails after rolling back to 1.2.5, makes me pretty certain it’s my code and nothing to do with the AppEngine.

So it seems it’s time to go back to the drawing board on how that all operates. It looks like someone else has already tackled the Lucene on AppEngine problem, though, in the form of GAELucene on Google Code. I’m going to dig through it a bit to see how it works and whether I can take some inspiration from it to clean up my code.

Respect is Earned