Category Archives: Fedora

Build systems are the new black

So I got into a little bit of a twitter argument last night about distro packaging. Which actually wasn’t where I was trying to go (this time ;) ) One of the problems with twitter is that 140 characters can be hard sometimes. So let’s see if more characters help.

There is a big push afoot from a lot of people towards omnibus packaging. It seems especially prevalent in the world of things written in Ruby, I suspect because most Linux distros, even “current” ones, are or were shipping some Ruby 1.8.x build up until very recently, and people want to take advantage of the newer language features.

I have a lot I could write on this topic. But I’m not going to today.

The main point I was trying to make is that in going down this path, people are putting together increasingly large build chains that have complex dependencies and take a long time. And so they’re starting to do things like caching of sources, looking at short-circuited builds and a whole lot of other related things. Which, incidentally, are all things that all of the Linux distributions ran into, fought with and figured out (admittedly, each in their own way) years ago. So rather than just rediscover these things and do it yet another way, there’s a ton of knowledge that could be gained from people that have built distribution build systems to make things better for the omnibus world.

So reach out to people that have worked on things like the Debian build system, the OpenSuSE build system, plague, koji and a slew of others. There are some tricks and a lot of tools which are just as valuable now as when they started being used. Things like caching build roots and how to fingerprint them for changes, what ccache can and can’t be good for, how to store things reasonably in a lookaside so that you don’t depend on upstream repositories being up while you build, incremental builds vs not, …
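To make two of those tricks a bit more concrete, here is a minimal sketch in Python. It is not how koji, plague or any other build system actually does it; the function names and the cache layout are made up for illustration. The idea is just that a build root can be fingerprinted by hashing its sorted package list, and that a lookaside cache can key sources by checksum so a build never has to go back upstream.

```python
import hashlib
from pathlib import Path


def buildroot_fingerprint(packages):
    """Return a stable hash for a build root described by its package list.

    `packages` is an iterable of name-version-release strings. Order and
    duplicates are ignored, so the same contents always hash the same.
    """
    digest = hashlib.sha256()
    for nvr in sorted(set(packages)):
        digest.update(nvr.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def lookaside_path(cache_dir, filename, data):
    """Return where a source file would live in a checksum-keyed cache.

    The layout (cache_dir/filename/checksum/filename) is a hypothetical
    one chosen for this sketch.
    """
    checksum = hashlib.sha256(data).hexdigest()
    return Path(cache_dir) / filename / checksum / filename
```

A cached build root can be reused as long as its fingerprint matches the one computed for the current package set. Any package update changes the fingerprint and forces a rebuild of the root.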

Just because you think that building system RPM or DEB packages doesn’t meet the needs of your users doesn’t mean that you have to throw away all of the work and experience that has gone into the toolchains and build systems present there.

Thoughts on DevOpsDays NYC

I’m currently on the train on my way back from DevOpsDays in Brooklyn. The conference was great — lots of smart people facing a lot of similar problems and trying to see what we could learn from each other. The scale was small, with only like 100-ish people present and not a ton of huge, in your face sponsorship. And the venue was a college campus. And so I kept making these comparisons in my head to LUG meetings, installfests and small scale Linux conferences.

Obviously the subject matter was a bit different: talking about and thinking about running large scale production infrastructures is a little bit different than the next cool Linux distribution. This led, I think, to more discussion around patterns and best practices than about the specifics of “you should do X to get Y to work”. So a higher level and more abstract discussion.

The composition of the audience and attendees was a pretty similar make-up. Linux events always had a strong majority of the attendees who self-identified as sysadmins and then there tended to be a smaller number of developers. And many of the latter group had ended up in that camp due to necessity. The breakdown for DevOpsDays felt pretty similar with an interesting twist where there were speakers who said they were (paraphrasing) “developers first and fell into operations because they needed to”.

One thing that felt more evolutionary than anything else was that the side channel discussion for the event took place on Twitter rather than on IRC. I have (fond) memories of many conferences where attendees sat in an IRC channel and then basically continued to interact on IRC long after the conference had ended. In fact, I made many friends in this fashion. Similarly there was an ongoing discussion on Twitter using the #devopsdays hash tag and I have followed (and am being followed by) a number of the other attendees and hope to keep in touch and call them friends in the future.

And maybe the thing that struck me the most strongly was where people were “from”. Not in the sense of where they lived but rather where they worked. The attendees were almost all from startups. We were in Brooklyn and not the heart of downtown Manhattan, but NYC is probably home to more financial services companies than anywhere else in the world. And all of those companies have *many* people working in software dev and operations-y roles. But they weren’t there.

So it feels like “the DevOps movement” is going through a similar growth and evangelism pattern as open source and Linux did years ago. Maybe that’s why it feels so comfortable to me.

Announcing ami-creator

I’ve been having to build some new CentOS images to be used with EC2 for work recently. I went into it thinking that it shouldn’t be too big of a deal. I know that some work had been going on in this area and Fedora 14 is now available on EC2, so I figured I could convince the same toolchain to work.

Unfortunately, I was pretty disappointed with my options.

  • Do some building by hand on an actual instance, then do the bundling and upload off of the running instance.
  • Some of the ThinCrust stuff initially looked promising, but it seems like it’s largely unmaintained these days and the ec2 conversion bits didn’t really work at this point. I was able to get my initial images this way, but mostly by having a wrapper shell script of doom that made me sad.
  • There’s always the rPath tools, but I wanted to stick to something more native and fully open source
  • The new kid on the block is apparently BoxGrinder but I found it to be a lot over-complicated and not that robust. I’m sorry, but generating your own format that you then transform into a kickstart config and even run through appliance-creator via exec from your ruby tool just felt wrong. No offense, but just felt like a lot more than I wanted to deal with

So, I sat down and spent an evening hacking and have the beginnings of a working ami-creator. It’s pretty straight-forward and uses all of the python-imgcreate stuff that’s used to build Fedora live images. Your input is a kickstart config and out the other side pops an image that you can bundle and upload to EC2.

Thus far, I’ve tested it to build CentOS 5 and Fedora 14 images. I’m sure there are some bugs but at this point, it’s worth getting it out for more people to play with. Hopefully it’s something that’s a lot simpler and more accessible for people to build images and I think it will also fit in a lot better with having Fedora release engineering building the EC2 images in Fedora 15 if they want.

One of the big outstanding pieces that I still want to add is the necessary bits to be able to (optionally) go ahead and upload and register as an AMI with your EC2 account.  But release early, release often.

Comments, etc appreciated in all the normal ways.

Minor update: switched the repo to live on github instead

Stop Using the Word “Cloud”

The more I see it, the more I want to just completely see the usage of the word “cloud” go away. While it’s somewhat of a cliche to say so, it’s a term that has a very hazy and non-concrete meaning. So whenever you start to use it, you immediately end up in the “well, what is a cloud” discussion. And thus, I have a set of suggestions for those places where you might have wanted to use the word “cloud” to instead use something which actually has meaning.

  • If you’re using cloud to refer to EC2, use EC2 instead. It’s concrete and it means very real things about your deployment and scaling models as well as how you’re managing your infrastructure.
  • If you’re using cloud to refer to some service which runs over the Internet, either refer to the service or just say the Internet. You don’t store your mail “in the cloud”, you host it with Google Apps. You don’t back up “to the cloud”, you have your backups stored over the Internet with Mozy or Carbonite.
  • If you’re using cloud to refer to the idea of some hosted application platform, just say the platform. You don’t run your python app “in the cloud”, you run it on AppEngine (or something else).
  • If you’re using cloud to mean that you are using virtualization and have some management stack on top of it, then please just say you’re running in a virtualized environment.
  • If you’re using cloud to refer to having your server infrastructure hosted in a virtualized environment by someone else, again, just say you’re running in a virtualized environment.
  • If you’re using cloud to refer to a “visible mass of little drops of water or frozen crystals suspended in the atmosphere”, then congratulations, you can continue to use the word cloud. And thanks to Wikipedia for the definition.

Following this simple idea will let you avoid the otherwise impossible to avoid discussion of the semantics of the word “cloud” and what you happen to mean about it and how you might be wrong and … This then means you’ll be that much closer to achieving whatever goal you hoped to achieve as you’ll spend less time talking and more time doing. And as an added benefit, you’ll avoid getting grumpy emails from me about the fact that you’ve used such a terribly over-used and under-meaninged term.

EC2 and Fedora: Still stuck at Fedora 8

Amazon’s EC2 service is great for being able to roll out new servers quickly and easily. It’s also really nice because we don’t ever have to worry about physical hardware and can just spin up more instances as we need them for experimenting or whatever.

Unfortunately, they’re still stuck in the dark ages with the newest AMIs available for Fedora being Fedora 8 based. With Fedora 12 around the corner, that’s two years old — something of an eternity in the pace of distribution development. I’d love to help out and build newer images, but while anyone can publish an AMI and make it public, you can’t publish newer kernel images, which really would be needed to use the newer system.

So, if you’re reading this at Amazon or know of someone I can talk with to try to move this forward, please let me know (katzj AT fedoraproject DOT org). I’d really strongly prefer to continue with Fedora and RHEL based images for our systems as opposed to starting to spin up Ubuntu images for the obvious reasons of familiarity.

Why do all deployment systems suck?

At HubSpot, we have a pretty wide array of different things being used for the webapps running behind the scenes. This isn’t surprising. There are also some home-grown scripts (in python, as that’s the scripting language of choice… something I’m not complaining about) to take care of deploying the various webapps. It works, but I really want to get it doing a bit more so that it’s more useful, and also get the different scripts sharing more code so that we can improve one place and get the benefits for everything.

Given that this seemed like a pretty typical problem, I figured I’d take a look and see what open source projects exist out there to see if any of them were suitable or could be at least close to a good fit for what we need and want. Unfortunately, I was kind of disappointed…

  • Capistrano seems to be the big player in this arena. It was originally written for Rails and still very very strongly shows that heritage. This isn’t necessarily bad, but it makes it a lot harder to get to work if you’re not doing something that’s rails-like. There are some people who have gotten some things working with Java app deployments for tomcat, but they all feel a bit hacky. The other downside for me/us is that Capistrano is very much Ruby-based, both in how its own deployment language looks as well as some of the “how it depends on things working” aspects. Also, the fact that it’s written in Ruby and thus a little bit more difficult for us to hack on if/when we run into problems is a point against. So it’s probably a non-starter for now, or at least a pretty difficult sell
  • Fabric is written in python and seems to be following in the footsteps of Capistrano. Right now, it’s far far simpler. This is in some ways good but some of the pieces that we’d want (eg, scm integration) aren’t there and so I’d have to write them. And I’m not sure if the Fabric devs are really interested in expanding in that way; haven’t sent email yet, but planning to tomorrow to feel it out.
  • Config Management + Binary deployment is the approach taken in Fedora Infrastructure for app deployment and it seems to be working pretty well there. It might be something to get to eventually, but that’s going to be a longer term thing and I’m not actually convinced that it’s really the best approach. For Fedora it grew out of only a couple of things which could be considered “webapps” and a lot of system config that has turned much later into more webapps. Judging from the work I did there, it also presupposes a more homogeneous environment than the one we have at HubSpot.
  • Func is something that a few people have been working on that I keep wanting to find a use for but it seems a little less well suited to doing a lot of java app building/deployment given that it’s more https/xml-rpc based than shell based.
  • Roll your own is what we’re doing now and what it seems like is pretty common. I don’t necessarily like this, but it’s certainly the path of least resistance

So, what am I missing? Is there some great tool out there that I haven’t found that you’re using for Java (and more) webapp deployments? Bonus points if it’s python-based and pretty extensible.

Beginning A New Chapter

The end of one chapter and the beginning of a new one for me. Today is my last day as an employee of Red Hat. I still remember walking in the door for my first day at Red Hat and having Nalin set up my account so I could get started as Preston was a little bit late getting in that morning. It’s been a great eight+ years across five offices and two states working with lots of great people.

During that time, I’ve also had the opportunity to play a big role in the development and growth of Fedora. While the start was somewhat rocky, I think we’ve now built up an incredibly strong community that successfully releases a whole distribution (arguably, several!) on a regular schedule. And within that community, we’ve grown a pretty awesome set of leaders to continue to drive Fedora forward.

While I’m planning to still keep at least in touch with the goings-on of Fedora as well as running Fedora in places, I certainly won’t have the time to spend on it that I do today. I hope to keep in touch and see people at conferences and events from time to time. But right now, I’m looking forward to what’s next for me. And for those wondering, it’s something pretty different really. More on it next week.

Repeating the cycle, time to kill rhpl

Continuing in the historical vein, once upon a time there was a package included in Red Hat Linux called pythonlib. One of the things I helped do was to finish killing it off. We went along and then a few releases later, wanted to share some python code again. Thus was born rhpl, the Red Hat Python Library. It started out simply enough: some wrappers for translation stuff and one or two other little things. And then it began to grow, as these things do over time. Some of the things made sense, some less so. Over time, pieces have moved around into other things (including rhpxl, the Red Hat Python Xconfig library).

Fast-forward to today and it’s a bit of a mess with things contributed by various people and used in one config tool (or two) and barely maintained. Also a lot of the things being wrapped have gotten a lot better in the python standard library. The gettext module is leaps and bounds better than the one from python 1.5 and also the subprocess module is awesome for spawning processes.
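As a rough illustration of the subprocess point, here is a small sketch of the kind of “run a command and capture its output” helper that config tools tend to want. The helper name is made up for this example and isn’t taken from rhpl, and it’s written against a current Python 3 rather than the python of the day.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run a command without a shell and return (exit status, output text)."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the captured output
        text=True,                 # decode bytes to str
        check=False,               # leave handling of failures to the caller
    )
    return result.returncode, result.stdout


# Use the running interpreter as a command that is guaranteed to exist.
status, output = run_and_capture([sys.executable, "-c", "print('hello')"])
```

Passing the command as a list avoids shell quoting problems, which is a large part of why a hand-rolled wrapper stops being necessary.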

Therefore, I think it’s time to continue the cycle and kill off rhpl for Fedora 12. I’m starting to make patches and file them for packages using rhpl to transition them over. Help much appreciated from anyone that wants to join in.

For the rhpl.translate -> gettext case, you generally want to replace the import of _ and N_ from rhpl.translate with something like

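import gettext
domain = "yourpackage"  # placeholder: use your package's translation domain
_ = lambda x: gettext.ldgettext(domain, x)  # look the string up in that domain
N_ = lambda x: x  # marks a string for extraction; returns it unchanged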

A request for some simple testing

Another thing that’s been on my list to look at that I’ve finally had time to sit down with this week is the new isohybrid support in syslinux. This lets you take an ISO image, post-process it and then be able to either burn the ISO to a CD or write it to a USB stick with dd. Given that we stopped making a disk image form of boot.iso a couple of releases ago to save on duplicated/wasted space, this is obviously kind of cool.

The problem was that the first time I tested it, it looked like it overwrote the checksums we use for the mediacheck functionality in anaconda. It turns out I just wasn’t thinking — we need to implant the checksum *after* we do the isohybrid modification.
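The ordering can be sketched in a few lines of Python. This is not the actual release tooling, just an illustration that assumes the isohybrid (from syslinux) and implantisomd5 commands are on your PATH; the function names are made up.

```python
import subprocess


def postprocess_commands(iso_path):
    """Return the post-processing commands in the order they must run."""
    return [
        # Modifies the image so it can also boot from a USB stick.
        ["isohybrid", iso_path],
        # Runs last, so the checksum is implanted into the final image.
        ["implantisomd5", iso_path],
    ]


def postprocess(iso_path):
    """Run each step in order, stopping at the first failure."""
    for argv in postprocess_commands(iso_path):
        subprocess.run(argv, check=True)
```

Calling postprocess("boot.iso") would run the two tools back to back. Keeping the order in one place makes it harder to reintroduce the mistake above.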

So without further ado, I’ve built a test version of the Fedora 11 boot.iso that is usable in this form. Testing of it would be much appreciated!

How to test

  1. Download the test image
  2. Try to burn it to a CD like you normally would. Ensure that it still boots normally. You don’t have to go through the full install, just boot it. Extra points if you can test mediacheck
  3. Find a USB stick that’s at least 256 megs that doesn’t have any data you care about on it. Now try to write the test image to it using dd (dd if=test-isohybrid-boot.iso of=/path/to/device bs=1M). Again, you don’t have to install, just boot into the installer. Note that we won’t automatically find the second stage and you’ll get asked where to find the installer images.
  4. Let me know the results in the comments (including type of machine).

Assuming this works, I’ll get the changes in so that we do this by default with boot.iso and then probably also try to make it so that the loader can automatically find the second stage image on either the CD or the USB stick. I’ll also consider doing similar for the livecds, although there’s more value with liveusb-creator / livecd-iso-to-disk there as you also want to set up persistence in a lot of cases.

Boot tales, woo ooh!

(Take the title in the context of the theme from Duck Tales and maybe it makes sense?)

There was a long and rambling discussion last week about the version of GRUB that’s shipped in Fedora and specifically the fact that the support for ext4 did not land in the version we shipped in Fedora 11. Now, as was said on the thread, this is because the patches weren’t reviewed and ready in time for beta (there are a couple of different ones… so which one is right?) and so we didn’t feel comfortable putting them in after beta, especially as with the way GRUB works, the same filesystem code gets used for ext2, ext3 and ext4 with the patches. A little unfortunate? Yes. Would it have been better if we had gotten them in so that you could do an install of Fedora 11 onto a single partition? Sure. But that’s one of the costs of a time-based release schedule.

In any case, one of the things that came out of the thread was that I gave a history of the version of GRUB in Fedora. For posterity, I’ll repeat that here, with some edits.

So, the gory history for those who might be interested. Eight years ago (!), we decided that the advantage of not having to rerun lilo after changing the config file, as grub can just read the config file off the filesystem, was worthwhile. We had, at that point, been patching lilo for quite a while to have a graphical menu. Therefore, keeping a graphical menu was a branding requirement. Conectiva at the time had a patch to grub that worked. We picked it up, shipped it, and it (mostly) worked. Efforts were made to integrate upstream, but they were largely uninterested. Along the way, significant changes to the graphics patch had to be made as grub evolved and a few other efforts were made to push it upstream. Eventually, the answer was “no, we’ll do something in the next big version of grub after grub 1.0”. Then the main developers went away and we were basically left maintaining a (large at this point) fork. As there is no upstream for grub 0.9x left, we’ve been left in a position of maintaining it and we’ve added some real features that have been needed along the way as grub 2’s progress has been slow at best and we were initially unhappy with some of the direction taken.

So, that’s where we are today. We essentially ship a fork of GRUB 0.9x with graphics support, support for a lilo -R type functionality (so you can reboot once into a single kernel), EFI, and some more little things that I’m not thinking of right now.

With that in mind, I sat down and spent some time with a current snapshot of grub2. Overall, it’s made a lot of progress in the time since I last looked at it (a year ago? maybe a little more?). It was actually able to successfully boot for me in KVM and there’s equivalent graphics support to what we’re carrying in our grub 0.9x package. That said, there are still quite a few things to verify exist before we can switch. And just in my look, there are a number of small things that would need work, especially around the way the config file gets created and updated. And with the very short runway for Fedora 12, I don’t think there’s really time to get it into shape. But I do think that it makes sense to look at for Fedora 13. So I’ve started a feature page to track as some of the things get tested and worked on. Then hopefully we can make the switch pretty painlessly early in the Fedora 13 cycle.