Groovy: Simple file download from URL – Fixed

The Grails app I'm working on right now has some cookbook code that takes a list of URLs and downloads the file each URL points to into a staging directory for other code to work on. There are a couple dozen similar examples on Groovy and Grails blogs around the net:

def downloadFiles = { sourceUrls ->
  def stagingDir = "/tmp/stagingdir"
  new File(stagingDir).mkdirs()
  sourceUrls.each { sourceUrl ->
    def filename = sourceUrl.tokenize('/')[-1]
    def file = new FileOutputStream("$stagingDir/$filename")
    def out = new BufferedOutputStream(file)
    out << new URL(sourceUrl).openStream()
    out.close()
  }
}

downloadFiles(
  ["http://lavezzo.com/saic/mvnBuildLifecycle.png",
   "http://lavezzo.com/saic/settings.xml"
  ])

Looks reasonable, right?

What happens if we call it like this?

downloadFiles(
  ["http://lavezzo.com/saic/mvnBuildLifecycle.png",
   "http://lavezzo.com/saic/I have a space.png"
  ])

Disaster! java.net.URL can't handle spaces? Normally, if I were writing the URLs myself, I'd just add my own %20s and call it a day. But in this case the list of URL strings is the output of an XmlSlurper pointed at an HTML file, so I have no control over the spaces in it. java.net.URLEncoder seems like a good place to look, but it turns out that class is intended for HTML form encoding (application/x-www-form-urlencoded). It substitutes a + for each space, which doesn't work in the path of a java.net.URL. java.net.URI's documentation mentions that it quotes illegal characters such as spaces, but not in the URI(String str) constructor, which expects a string that is already valid. Again, this class seems to assume that you are building the URL yourself and can enter the scheme, host, port, path, etc. each as its own constructor argument.
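Here is a quick sketch of those two dead ends, in case you want to see them for yourself. The variable names are mine, and I'm assuming UTF-8 as the encoding:

```groovy
// File name taken from the example above; variable names are illustrative only.
def fileName = "I have a space.png"

// URLEncoder targets HTML form data, so each space becomes '+', not '%20'.
def formEncoded = URLEncoder.encode(fileName, "UTF-8")
assert formEncoded == "I+have+a+space.png"

// The single-argument URI constructor refuses the raw space outright.
def rejected = false
try {
    new URI("http://lavezzo.com/saic/" + fileName)
} catch (URISyntaxException e) {
    rejected = true
}
assert rejected
```

Neither result is something you can hand to java.net.URL and expect the right file back.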

Well, it was hard for me to believe, but the answer was to split off JUST the scheme (the http) from the URL string I collected from the web page, pass the scheme and the remainder into the URI(String scheme, String ssp, String fragment) constructor, and then call URI's toURL() method. Some Groovy list manipulation conveniences made it a little easier:

def downloadFiles = { sourceUrls ->
  def stagingDir = "/tmp/stagingdir"
  new File(stagingDir).mkdirs()
  sourceUrls.each { sourceUrl ->
    def filename = sourceUrl.tokenize('/')[-1]
    def file = new FileOutputStream("$stagingDir/$filename")
    // Split off the scheme ("http") so the multi-argument URI constructor
    // can percent-encode illegal characters such as spaces in the rest.
    def protocolUrlTokens = sourceUrl.tokenize(':')
    // Rejoin everything after the scheme (a port number adds another ':');
    // a null fragment keeps a stray '#' off the end of the URI.
    def sourceUrlAsURI = new URI(protocolUrlTokens[0],
        protocolUrlTokens[1..-1].join(":"), null)
    def out = new BufferedOutputStream(file)
    out << sourceUrlAsURI.toURL().openStream()
    out.close()
  }
}

downloadFiles(
  ["http://lavezzo.com/saic/mvnBuildLifecycle.png",
   "http://lavezzo.com/saic/I have a space.png"
  ])

It looks silly to be splitting out the http just to put it back together in the constructor. It seems like a simple improvement would be for the one-argument URI constructor to parse the String for the scheme and then use the three-argument constructor internally.
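Until something like that exists, the split can at least be tucked away in one place. This is only a sketch: toEncodedUri is a name I made up, not anything in java.net, and it splits at the first colon instead of tokenizing on every colon.

```groovy
// Hypothetical helper (not part of java.net): split at the first ':' only,
// then let the three-argument URI constructor quote illegal characters.
def toEncodedUri = { String raw ->
    int colon = raw.indexOf(':')
    if (colon < 1) {
        throw new IllegalArgumentException("No scheme found in: $raw")
    }
    // null fragment: nothing is appended after the scheme-specific part
    new URI(raw.substring(0, colon), raw.substring(colon + 1), null)
}

def uri = toEncodedUri("http://lavezzo.com/saic/I have a space.png")
assert uri.toString() == "http://lavezzo.com/saic/I%20have%20a%20space.png"
```

With that in hand, the download line could become out << toEncodedUri(sourceUrl).toURL().openStream(). One caveat: as far as I can tell from the URI documentation, the multi-argument constructors always quote the percent character, so a URL that already contains %20 would come out double-encoded. This only suits input you know is raw.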

In Charlottesville, Virginia
Jeff

[Ed: Now with SyntaxHighlighter goodness]

beCamp 2010 is April 30 & May 1st

Since the Java Users Group fizzled out a few years ago, there haven't been a lot of networking opportunities for programmers in town. That's probably the least of the reasons you should pay attention to beCamp 2010.

beCamp 2010 is almost here! April 30th and May 1st are just four weeks away!

If you’re a geek in or around the Charlottesville metroplex or even if you’re merely tech-curious, this is the event you don’t want to miss. beCamp is Charlottesville’s version of the BarCamp unconference phenomenon—organized on the fly by attendees, for attendees. Realizing that the most energizing parts of any tech conference are the ad hoc conversations that take place in the hallways between the sessions, beCamp facilitates these types of interactions for an entire event.

So if you're a programmer (or "geek" or "tech-curious") in Charlottesville and NOT trekking to Reston for No Fluff Just Stuff, sign up and show up. I have some experience with the Open Space Technology philosophy of conferences, and the results are consistently interesting and unexpected.

In Charlottesville, Virginia, United States
Jeff Lavezzo

Charlottesville Tech-job-market

About half a dozen times over the last 6 months, I've found myself sending friends or friends-of-friends info on companies in town that hire programmers, software testers, tech writers, or project managers. Before I send it out one more time, I thought I'd put it up here for general reference and to solicit comments. I'm going to put it up quickly without fully linking each company, trusting that you all can use Google as well as I can.

Charlottesville Tech Companies in no particular order

These government contractors have offices all over so be sure to specify Charlottesville:

When looking for jobs by going directly to company websites, remember that a company busy enough to need to hire may be too busy to get the job posted on its site. If a site doesn't list the job you're looking for, use the website to find the name of a manager in the department you'd like to work in, or just call the front desk and ask for the name of the person in charge of developers, tech writers, testing, hiring project managers, etc. Ask for the HR manager. Ask if they're hiring now or expect to in the near future. Do some research ahead of time so you have some idea of what their industry is like and can ask informed questions. Then you can follow this sequence:

  1. Give them a reason to expect your cover letter (and resume)
  2. Write a cover letter that makes them want to read your resume
  3. Write a resume (customize it for each job) that makes them want to give you an interview
  4. Interview so they know they can work with you

That last point is often important. A lack of domain experience can be made up for in a month on the job. Being someone no one can get along with is really unlikely to be corrected.