Like many people, we use Jenkins at work as our continuous integration server, and we require that every committed change be built in CI before it can be deployed. Yesterday, someone asked if we could add another Jenkins slave to try to reduce the amount of time spent waiting on builds. The slaves are fully puppetized, so it’s not much work to bring an additional one online, but my own anecdotal experience made me think that we weren’t often held up in a way that additional slaves would help. I had a vague memory of some graphs within Jenkins and eventually found them, but I didn’t find them that enlightening. The scale is funky, the values are a weird exponential moving average, and I just didn’t find it easy to get any insight from them.
So last night, I sat down and wrote a quick little script to run via cron, pull some statistics and throw them into Graphite. Already, with less than a day of data, I can tell that we end up with a few periods of about ten minutes where having more executors could help, and that those periods are correlated with someone committing to one of the projects at the base of our dependency tree. That gives us a much better idea of whether the cost of an additional machine is worth the few minutes that we’d be able to save in those cases.
Since it didn’t look like anyone else had done anything along these lines yet, I put the code up on GitHub. There are a lot more stats that could be pulled out via the Jenkins API; this is really just a starting point for what I needed today.
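For anyone curious what this kind of script looks like, here is a minimal sketch of the idea rather than the actual code I published. It assumes the Jenkins JSON endpoints `/computer/api/json` (for `totalExecutors` and `busyExecutors`) and `/queue/api/json` (for the queued `items`), and it sends the numbers to Graphite over its plaintext protocol on port 2003. The host names, the `jenkins` metric prefix and all of the function names are made up for illustration.

```python
import json
import socket
import time
from urllib.request import urlopen

# Hypothetical values: point these at your own servers.
JENKINS_URL = "http://jenkins.example.com"
GRAPHITE_HOST = "graphite.example.com"
GRAPHITE_PORT = 2003  # default port of Graphite's plaintext listener
PREFIX = "jenkins"    # hypothetical metric namespace


def fetch_json(url):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def executor_stats(computers, queue):
    """Reduce the two Jenkins API responses to a flat dict of numbers."""
    total = computers.get("totalExecutors", 0)
    busy = computers.get("busyExecutors", 0)
    return {
        "executors.total": total,
        "executors.busy": busy,
        "executors.idle": total - busy,
        "queue.length": len(queue.get("items", [])),
    }


def graphite_lines(stats, timestamp, prefix=PREFIX):
    """Format stats as Graphite plaintext: '<path> <value> <epoch>' per line."""
    return "".join(
        f"{prefix}.{name} {value} {int(timestamp)}\n"
        for name, value in sorted(stats.items())
    )


def send_to_graphite(payload, host=GRAPHITE_HOST, port=GRAPHITE_PORT):
    """Open a TCP connection to Graphite and send the formatted metrics."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(payload.encode("ascii"))


def main():
    """One sampling pass; intended to be called once per cron run."""
    computers = fetch_json(f"{JENKINS_URL}/computer/api/json")
    queue = fetch_json(f"{JENKINS_URL}/queue/api/json")
    stats = executor_stats(computers, queue)
    send_to_graphite(graphite_lines(stats, time.time()))
```

Calling `main()` from a cron job every minute would be enough to see when the queue backs up while all executors are busy, which is the case where another slave might actually help.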

At HubSpot, we have a pretty wide array of different things being used for the webapps running behind the scenes. This isn’t surprising. There are also some home-grown scripts (in Python, as that’s the scripting language of choice… something I’m not complaining about) to take care of deploying the various webapps. It works, but I really want to get it doing a bit more so that it’s more useful, and also get the different scripts sharing more code so that we can improve one place and get the benefits everywhere.
Given that this seemed like a pretty typical problem, I figured I’d take a look at what open source projects exist out there and see if any of them were suitable, or at least close to a good fit, for what we need and want. Unfortunately, I was kind of disappointed…
- Capistrano seems to be the big player in this arena. It was originally written for Rails and still very strongly shows that heritage. This isn’t necessarily bad, but it makes it a lot harder to get working if you’re not doing something that’s Rails-like. Some people have gotten Java app deployments for Tomcat working, but they all feel a bit hacky. The other downside for me/us is that Capistrano is very much Ruby-based, both in how its own deployment language looks and in some of the “how it depends on things working” aspects. Also, the fact that it’s written in Ruby makes it a little more difficult for us to hack on if/when we run into problems, which is a point against. So it’s probably a non-starter for now, or at least a pretty difficult sell.
- Fabric is written in Python and seems to be following in the footsteps of Capistrano. Right now, it’s far, far simpler. This is in some ways good, but some of the pieces that we’d want (e.g., SCM integration) aren’t there, so I’d have to write them. And I’m not sure if the Fabric devs are really interested in expanding in that way; I haven’t sent email yet, but I’m planning to tomorrow to feel it out.
- Config Management + Binary deployment is the approach taken in Fedora Infrastructure for app deployment, and it seems to be working pretty well there. It might be something to get to eventually, but that’s going to be a longer-term thing, and I’m not actually convinced that it’s really the best approach. For Fedora, it grew out of only a couple of things which could be considered “webapps” and a lot of system config that much later turned into more webapps. From the work I did there, it also presupposes a more homogeneous environment than the one we have at HubSpot.
- Func is something that a few people have been working on that I keep wanting to find a use for, but it seems a little less well suited to doing a lot of Java app building/deployment given that it’s more https/xml-rpc based than shell based.
- Roll your own is what we’re doing now, and it seems to be pretty common. I don’t necessarily like this, but it’s certainly the path of least resistance.
So, what am I missing? Is there some great tool out there that I haven’t come across that you’re using for Java (and more) webapp deployments? Bonus points if it’s Python-based and pretty extensible.
As I wrap up my first week at HubSpot, I have a few observations that are at least sort of interesting.
- Real hardware. I’m pretty happy with my current laptop, so I just got a desktop machine to use at work. The box I got is a Dell quad core with 8 GB of RAM. Nice box overall, and Fedora installed with no problems. The nVidia graphics work fine for 2D and even xrandr seems to be doing the right thing. One thing that is annoying is that Dell is still shipping machines with VT turned off in the BIOS. Once I turned that on, though, KVM worked pretty well on the box too.
- Windows is just as annoying as ever, less annoying and more annoying, all at once. You can run it in a virtual machine without real problems. But installing things, the terminals, etc. are all still a pain. Stability is a bit improved. The whole “run as administrator” nonsense is a real pain when you’re trying to get a lot of stuff going.
- Coming in at the end of the scrum cycle seems to sort of be a good thing. I get to see the final push and then the demos from that cycle, followed by getting to sit in on the planning for the next sprint. I won’t be on a scrum team until the next sprint, so hopefully I’ll have a better frame of reference by then.
- Commuting to Kendall Square works really well for me. Okay, I knew this from riding into MIT but it’s still a takeaway. The bike ride in is a nice length; shorter would be fine, but longer really isn’t as practical.
- Complex build processes exist everywhere and are despised everywhere. But it always seems like a build and deployment process is the last thing cared about.
- I’m having a lot of fun being back in a startup environment.
So yeah, all in all, it’s been a good week. Now for a long weekend. Two four-day weeks in a row for me, I guess.
The new chapter begins… today was my first day working for HubSpot.
It’s a big change for me as I’ve been doing pretty much purely (fairly) low-level operating system work for a decade now. Going to a company that’s doing much more web development is making me shift how I think about everything from considering using Eclipse rather than a combination of Emacs/vim/terminals to the languages I’m writing in and the types of code I’ll be writing. And I think it’s a change that I need — I’ve been feeling a bit stagnant and so getting out of my comfort zone should help a lot.
Also, I think that HubSpot is doing some interesting stuff and I’m glad to be joining the team to help out in a variety of different ways.
Ramblings of a Cyclist Hacker