Last week I had the wonderful opportunity to travel to Paris for the 2012 EMEA FUDCon. I had never been to Paris, and to my knowledge no Fedora kernel maintainer had given the "state of the kernel" talk at an EMEA FUDCon before. While the talk itself is not earth-shattering, it does tend to give a decent overview of how the Fedora kernel team operates, the challenges we face, and what contributors can do to help. Hopefully it was worthwhile for the people who attended. I know I found some of the feedback interesting, and that is what I wanted to discuss a bit with this blog post.
After giving the overview of how we maintain the kernel in Fedora (rebases, regressions, bug reports, etc.), we discussed rawhide testing a bit. The importance of rawhide kernel testing is sometimes lost in the overall Fedora test land. It is basically the easiest way to both contribute to a solid Fedora release and contribute to the upstream kernel development process. The more testers we have on rawhide kernels, the more bugs we get reported and the closer we are to where the upstream action is. It's much easier to report a bug to an upstream developer if you're running what they're currently working on, as opposed to running a release that is a few versions old. While that seemed to be well understood in the room, there were two main problems voiced about rawhide kernel testing: 1) rawhide eats babies, 2) usability.
It's true. Rawhide can definitely destroy your install. I really don't know anyone who continues to use rawhide on a daily basis and blindly updates every day. You have to be fairly well in tune with the state of the distro, and that gets even harder to do when we're in a Branched state, as rawhide tends to just be a dumping ground. The kernel itself is not immune to this, but it is much more isolated from the rest of the distro. Aside from the compiler and binutils, there really isn't much that can impact the kernel build, so essentially the bugs to be found are isolated to the kernel code itself. We all know that's perfect, so what's the problem? The problem is that people don't want to have to run rawhide the distro to run the rawhide kernel.
Fortunately, they don't have to. In the past this was a problem, particularly during the KMS development timeframe. The kernel and xorg were very tightly tied together, and using a kernel from rawhide on a released version was likely not to work. Now that KMS is upstream, there are very very few situations where a stable Fedora release userspace can't work with a rawhide kernel and vice versa. When they do show up, we update the Requires/Conflicts in kernel.spec and they generally aren't issues after a few weeks because updates fix things across the board. This flexibility is a major reason we can continually rebase the kernel as newer versions come out, so it stands to reason that running development versions of those is definitely feasible.
To avoid the rest of the rawhide distro though, you have two (easy) options. You can either install the fedora-release-rawhide package and run:
yum update --enablerepo=rawhide kernel
or you can have a side repository somewhere with only kernels and have that enabled in your yum repos. Clearly the latter is just a special case of the former, but some people are apparently very scared of rawhide. A side repo also has some other advantages, in that you can put kernels that don't actually make it to rawhide in there too. However, such a repository doesn't exist at the moment.
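As a rough sketch of what the side repository option could look like on the client side, the function below writes a yum repo definition that only exposes kernel packages. No such repository exists yet, so the repo id `rawhide-kernel`, the function name, and any URL you pass in are all hypothetical. It assumes yum's standard `includepkgs` repo option for limiting the repo to kernel packages.

```shell
#!/bin/sh
# Write a yum repo definition for a kernel-only side repository.
# Usage: write_kernel_repo <baseurl> <output-file>
# The repo id "rawhide-kernel" and the baseurl are hypothetical; no such
# repository exists at the time of writing.
write_kernel_repo() {
    baseurl="$1"
    outfile="$2"
    cat > "$outfile" <<EOF
[rawhide-kernel]
name=Rawhide kernels only (hypothetical side repository)
baseurl=$baseurl
# Disabled by default so it is only used with --enablerepo.
enabled=0
# Unsigned scratch builds would need a different gpgcheck setting.
gpgcheck=1
# Only ever pull kernel packages from this repository.
includepkgs=kernel*
EOF
}
```

With a file like that saved under /etc/yum.repos.d/, the update would look the same as the rawhide case, just with `--enablerepo=rawhide-kernel` instead. Leaving the repo disabled by default keeps a normal `yum update` from picking up development kernels by accident.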
Such a repository wouldn't be difficult to create, and I might well create one on my fedora people space for a few kernels. However, I think the work Seth Vidal is doing with coprs is a much better mechanism for distributing these. Once that is up and running, we might be able to utilize it to provide rawhide kernel builds. What builds we provide kind of ties into issue #2 above as well.
In terms of usability, many people have found the debug kernels we normally build in rawhide to be very poor in terms of performance. They don't mean benchmark performance, or throughput, etc. On certain machines, the debug kernels make the desktop response very, very slow. I've seen this on one of my machines as well, and the current theory is that spin lock debugging is what triggers this the most. We could disable that, but it's one of the better debug options to have so issues can be found sooner rather than just having your machine hang one day.
To at least try and make something usable, we build one kernel for each major release and each kernel RC milestone (e.g. 3.7-rc2) with the debug options disabled entirely. Most often, those make it into the rawhide repository for a day or so before the next build comes along and has debug enabled again. The issue with that is it isn't trivial to determine from the kernel NVR if you're installing a debug kernel or not. You have to look at the installed config file or changelog to figure it out. To make this easier, the extra kernel repository might just contain non-debug builds of what goes into rawhide for a while. I would personally rather people run and test some form of rawhide kernel rather than get no testing at all.
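Checking the installed config file can be scripted. The sketch below is one way to do it, under the assumption that spin lock debugging (`CONFIG_DEBUG_SPINLOCK`) is a reasonable stand-in for "this is a debug build". That choice of option is mine, not an official marker, and the function name is made up for illustration. It reads the config that Fedora installs as /boot/config-<version>, defaulting to the running kernel.

```shell
#!/bin/sh
# Report whether a kernel config file has spin lock debugging enabled.
# Usage: is_debug_kernel [config-file]
# Defaults to the config file of the running kernel.
# Assumption: CONFIG_DEBUG_SPINLOCK=y is used here as a proxy for a
# debug build; it is not an official marker.
is_debug_kernel() {
    config="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$config" ]; then
        echo "unknown"
        return 2
    fi
    # An enabled option appears as "CONFIG_FOO=y"; a disabled one
    # appears as the comment "# CONFIG_FOO is not set".
    if grep -q '^CONFIG_DEBUG_SPINLOCK=y$' "$config"; then
        echo "debug"
    else
        echo "non-debug"
    fi
}
```

Calling `is_debug_kernel` with no argument checks the running kernel; passing a path such as the config file of a freshly installed kernel checks that one before you reboot into it.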
Another item that came out of our discussion was what to do with new kernel features. In the past, everything was enabled as it showed up in rawhide. That is really no longer sane to do. It leads to bloat and sometimes upstream adds a feature they really don't intend to be used by a large number of people. So instead, we tend to enable new drivers but leave big features off. If someone requests it, we'll take a look and try and determine how big the demand is.
For example, Checkpoint/Restore is a new feature working towards completion upstream. It's somewhat usable now, but it really isn't ready for broad consumption. The 3.7 kernel will contain much improved support, but we'd like to know how buggy it is, etc. As a result, we discussed enabling it in the normal rawhide debug kernels and leaving it off in non-debug kernels. That will provide those wishing to experiment with the feature something to install and test. If the results show that the feature isn't going to cause a world of pain, we can leave it enabled in the release kernels as well. (I don't think C/R will wind up being enabled fully in 3.7, but I do think eventually demand for it will be there and the code is coming along nicely.)
That sums up the bulk of the discussion that we had during the talk. I found it very worthwhile. Now I have a question for you, whether you were at this talk, have seen a similar one at a different FUDCon, or want to speak up in general: Are these talks worthwhile to you? Working with the kernel day in and day out, we see a lot of new code coming in so the talk changes up a bit because of that, but the overall topics and message tend to be the same. If people find that to be valuable still, great! It's an easy talk to roll up and the format allows it to be pretty open. However, if you find it growing stale we need to know that too. What kind of information would you like to hear about? Is there something you've been wondering about but never asked? Would you rather do some kind of kernel hackfest instead of a state of the union talk? Some kind of demonstration? Etc.
The kernel team is really open to feedback and we want the FUDCon sessions to be valuable. Let us know what you think and we'll see what we can come up with. You don't have to be a kernel developer to have interest or provide a suggestion, so please feel free to drop us a line. Leave a comment here, or email the Fedora kernel list. After all, FUDCon Lawrence is only a few months away.