Log in

pointless pontifications of a back seat driver [entries|archive|friends|userinfo]

[ userinfo | livejournal userinfo ]
[ archive | journal archive ]

When is a kernel bug not a kernel bug? [Jun. 30th, 2016|12:27 pm]

Think of this scenario: You're sitting at your shiny Fedora install and notice a kernel update is available. You get all excited, update it through dnf or Gnome Software, or whatever you use, reboot and then things stop working. "STUPID KERNEL UPDATE WHY DID YOU BREAK MY MACHINE" you might say. Clearly it's at fault, so you dutifully file a bug against the kernel (you do that instead of just complaining, right?). Then you get told it isn't a kernel problem, and you probably think the developers are crazy. How can a kernel update that doesn't work NOT be a kernel problem?

This scenario happens quite often. To be sure, a good portion of the issues people run into with kernel updates are clearly kernel bugs. However, there is a whole set of situations where it seems that way but really it isn't. So what is at fault? Lots of stuff. How? Let's talk about the boot sequence a bit.

Booting: a primer

Booting a computer is a terribly hacky thing. If you want a really deep understanding of how it works, you should probably talk to Peter Jones[1]. For the purposes of this discussion, we're going to skip all the weird and crufty stuff that happens before grub is started and just call it black magic.

Essentially there are 3 main pieces of software that are responsible for getting your machine from power-on to whatever state userspace is supposed to be in. Those are grub, the kernel, and the initramfs. Grub loads the kernel and initramfs into memory, then hands off control to the kernel. The kernel does the remainder of the hardware and low-level subsystem init, uncompresses the initramfs and jumps into the userspace code contained within. The initramfs bootstraps userspace as it sees fit, mounting the rootfs and switching control to that to finish up the boot sequence. Seems simple.

The initramfs

So what is this "initramfs"? In technical terms, it's a weird version of a CPIO archive that contains a subset of userspace binaries needed to get you to the rootfs. I say weird because it can also have CPU microcode tacked onto the front of it, which the kernel strips off before unpacking it and applies during the early microcode update. This is a good thing, but it's also kind of odd.

The binaries contained within the initramfs are typically your init process (systemd), system libraries, kernel modules needed for your hardware (though not all of them), firmware files, udev, dbus, etc. It's almost equivalent to the bare minimum you can get to a prompt with. If you want to inspect the contents for yourself, the lsinitrd command is very handy.

There are actually a couple of different 'flavors' of initramfs as well. The initramfs found in the install images is a generic initramfs that has content which should work on the widest variety of machines possible, and can be used as a rescue mechanism. It tends to be large though, which is why after an install the initramfs is switched to HostOnly mode. That means it is specific to the machine it is created on. The tool that creates the initramfs is called dracut, and if you're interested in how it works I would suggest reading the documentation.

The problems

OK, so now that we have the components involved, let's get to the actual problems that look like kernel bugs but aren't.

Cannot mount rootfs

One of the more common issues we see after an update is that the kernel cannot mount the rootfs, which results in the system panicking. How does this happen? Actually, there are a number of different ways. A few are:

* The initramfs wasn't included in the grub config file for unknown reasons and therefore wasn't loaded.
* The initramfs was corrupted on install.
* The kernel command line specified in the grub config file didn't include the proper rootfs arguments.

All of those happen, and none of them are the kernel's fault. Fortunately, they tend to be fairly easy to repair but it is certainly misleading to a user when they see the kernel panic.

A different update breaks your kernel update

We've established that the initramfs is a collection of binaries from the distro. It's worth clarifying that these binaries are pulled into the initramfs from what is already installed on the system. Why is that important? Because it leads to the biggest source of confusion when we say the kernel isn't at fault.

Fedora tends to update fast and frequently across the entire package set. We aren't really a rolling release, but even within a release our updates are somewhat of a firehose. That leads to situations where packages can, and often do, update independently across a given timeframe. In fact, the only time we test a package collection as a whole is around a release milestone (keep reading for more on this). So let's look at how this plays out in terms of a kernel update.

Say you're happily running a kernel from the GA release. A week goes by and you decided to update, which brings in a slew of packages, but no kernel update (rinse and repeat this N times). Finally, a kernel update is released. The kernel is installed, and the initramfs is built from the set of binaries that are on the system at the time of the install. Then you reboot and suddenly everything is broken.

In our theoretical example, let's assume there were lvm updates in the timeframe between release and your kernel update. Now, the GA kernel is using the initramfs that was generated at install time of the GA. It continues to do so forever. The initramfs is never regenerated automatically for a given kernel after it is created during the kernel install transaction. That means you've been using the lvm component shipped with GA, even though a newer version is available on the rootfs.

Again, theoretically say that lvm update contained a bug that made it not see particular volumes, like your root volume. When the new kernel is installed, the initramfs will suck in this new lvm with the bug. Then you reboot and suddenly lvm cannot see your root volume. Except it is never that obvious and it just looks like a kernel problem. Compounding that issue, everything works when you boot the old kernel. Why? Because the old kernel initramfs is still using the old lvm version contained within it, which doesn't have the bug.

This problem isn't specific to lvm at all. We've seen issues with lvm, mdraid, systemd, selinux, and more. However, because of the nature of updates and the initramfs creation, it only triggers when that new kernel is booted. This winds up taking quite a bit of time to figure out, with a lot of resistance (understandably) from users that insist it is a kernel problem.

Solution: ideas wanted

Unfortunately, we don't really have a great solution to any of these, particularly the one mentioned immediately above. People have suggested regenerating the initramfs after every update transaction, but that actually makes the problem worse. It takes something known to be working and suddenly introduces the possibility that it breaks.

Another solution that has been suggested is to keep the userspace components in the initramfs fixed, and only update to include newer firmware and modules. This sounds somewhat appealing at first, but there are a few issues with it. The first is that the interaction between the kernel and userspace isn't always disjoint. In rare cases, we might actually need a newer userspace component (such as the xorg drivers) to work properly with a kernel rebase. Today that is handled via RPM Requires, and fixing the initramfs contents cannot take that into account. Other times there may be changes within the userspace components themselves that mean something in the initramfs cannot interact with an update on the rootfs. That problem also exists in the current setup as well, but switching from today's known scenarios to a completely different setup while still having that problem doesn't sound like a good idea.

A more robust solution would be to stop shipping updates in the manner in which they are shipped in Fedora. Namely, treat them more like "service packs" or micro-releases that could be tested as a whole. Indeed, Fedora Atomic Host very much operates like this with a two week release cadence. However, that isn't prevalent across all of our Editions (yet). It also means individual package maintainers are impacted in their workflow. That might not be a bad thing in the long run, but a change of that proportion requires time, discussion, and significant planning to accomplish. It also needs to take into account urgent security fixes. All of that is something I think should be done, but none of it guarantees we solve these kinds of "kernel-ish" problems.

So should we all despair and throw up our hands and just live with it? I don't think so, no. I believe the distro as a whole will eventually help here, and in the meantime hopefully posts like this provide a little more clarity around how things work and why they may be broken. At the very least, hopefully we can use this to educate people and make the "no, this isn't a kernel problem" discussions a bit easier for everyone.

[1] It should be noted that Peter might not actually want to talk to you about it. It may bring up repressed memories of kludges and he is probably busy doing other things.
linkpost comment

(no subject) [May. 11th, 2016|02:03 pm]

Often we get bugs reported against the Fedora kernel for issues involving third party drivers. Sometimes those are virtualbox, sometimes VMWare guest tools, but most often it is the nvidia driver. We had another reported today. I'll pause to let you read it. Go ahead, read the whole thing. I'll wait.


Done? Good. You'll notice a couple things. First, it's closed as CANTFIX with a rote comment that we typically use for such bugs. That comment, while terse, is not incorrect. Second, the reporter is, frankly, pissed off. You know what?

He has every right to be.

So if the response in the bug is not incorrect, how does that line up with the assertion that the reporter's anger isn't wrong either? I thought I'd spend some time breaking this bug down in detail to try and explain it.

The crux of the reporters argument is that using the nvidia driver on Fedora causes pain to Fedora's users. I'm not going to argue against that. Using the nvidia driver on Fedora is very much painful. A user can finally get it working, and then we rebase the kernel and it breaks again for them. That isn't a good user experience at all. We've know this for a while and there are some tentative plans to help users in this situation by defaulting to a known working kernel if they have the nvidia driver installed. That doesn't fix the problem, but it at least reduces the element of surprise.

The reporter then goes on to make some assertions that might seem plausible, but in fact aren't accurate at all. Let's look at these more closely.

The claims

You do not intentionally break hardware compatibility? Oh, wait, you do.

We do not intentionally break anything. As we've written about before, we rebase the kernel to pick up the bugfixes the upstream maintainers are including in those newer releases. However, an additional benefit of those rebases is that we actively enable more new hardware by doing so. Yes, there are regressions and they are particularly prevalent if you are relying on out-of-tree drivers. Those regressions are unfortunate, but certainly not done out of malice.

You do not intentionally break API/ABI compatibility? Oh, wait, you do.

Greg-kh has talked and written extensively about the fact that the upstream kernel has no stable API. In fact, his document is included in the kernel source itself. Due to the fact that the kernel has no stable API, there is also no stable ABI. Now, it should be noted that ABI here is describing the ABI between the kernel and modules, not the ABI between the kernel and userspace. The kernel/userspace ABI is done via syscalls and that is stable and fanatically protected by the upstream kernel maintainers. However, modules don't have that luxury and therefore when a rebase is done the ABI can and does change. Include the fact that compiler versions change in Fedora, which can impact ABI, and it becomes evident that the reporter's claim is somewhat true.

We could freeze on a kernel version and attempt to keep the API/ABI stable, but that incurs a significant maintenance cost that our small team is not able to handle. Even the RHEL kernel, with it's much larger user base and development team, has a limited kABI they support. So yes, the API/ABI changes with a rebase (or more rarely with a new stable update), but that is one of the consequences of doing a rebase. It is not done with the intention of breaking anything purposefully.

You do not limit user's choice in regard to running software or drivers? Oh, wait, you do.

Fedora actually does not limit the user's choice in software. The user is free to install the nvidia driver or whatever other software they wish to use. Google Chrome, Lotus Notes, Steam, nvidia drivers, and other software not provided by Fedora has all been known to be installed and work. There is no problem with a user choosing to do this. It is their own computer!

What the Fedora kernel team cannot do is provide support for such software. As Justin mentions in the bug itself, providing such support is very difficult for us to do. We have no access to the driver source and cannot fix bugs in the driver. If there is a bug in the kernel itself that the driver happens to trigger, we lack a lot of context around what the driver is doing to cause. It is simply not a tenable position. Therefore we close all such bugs as CANTFIX.

You make sure your software is bugs free? Oh, wait, you don't.

This one is starting to leave reality of software in general, in that no software is ever bug free. We do, however, attempt to ensure we don't ship with known bugs. Also, the kernel Fedora ships is very close to the upstream kernel and 95% of the bugs reported are present in the upstream kernel as well. So "your" software here is collectively the entire kernel community.

You make sure you have the most stringent QA/QC process? Oh, wait, you don't.

I will not argue that our QA process is the most stringent. I won't even argue that it is more stringent than some other project's. It is, however, constantly improving. We've had an automated testsuite in place for more than a year now to test builds as they come out of koji. We continue to run tests on the kernel manually on a variety of machines to make sure things are not known to be broken on certain configurations. We are constantly looking to add more to this in as automated of a fashion as we can.

However, that only scales so far. Particularly in the case of the kernel, Fedora relies heavily on input from actual users via our updates-testing and bodhi infrastructure. Consider this a continued plea for help testing and catching things.

(paraphrased) The nouveau driver is slower, less stable, and unusable on recent nvidia hardware

Quite simply, there is a lot of truth to this statement particularly for accelerated graphics situations (which gnome-shell uses). However, this is not Fedora's fault or really anyone's fault. The nouveau driver is a reverse engineered solution that is continually improving despite having very few actual developers working on it. We've recognized this gap and there are more people assigned to nouveau now than ever before. Progress will be slow, but it is being made. As Justin mentions, support for newer nvidia cards should continue to improve in the 4.6 and 4.7 kernels now that some of the signed firmware issues are worked out.

(paraphrased) Fedora expects everyone to use Intel GPUs

We have no such expectations. Intel does have more market penetration due to the on-board GPUs it ships with its newer CPUs, and they have a development team working on their open source driver directly upstream. The i915 driver is held up as the ideal standard for open source GPU drivers, but it is not bug free by any means. The radeon driver is similarly open source and typically works well but it also is not bug free. GPU work is hard. At least those vendors are doing it in the open, and they should be applauded for it. That does not mean usage of their product is required.

(Personally, I use both Intel and ATI GPUs in my two primary machines.)

(paraphrased) Fedora sabotages its kernels to make it incompatible with third party drivers.

This is blatantly false. We do not add any patches to intentionally break anything. That would be untrue to Fedora's Foundations, limiting our users for no reason at all, and I would personally find it immoral.

The only limitation Fedora places on third party modules is under the Secure Boot case in order to fully support that mechanism. They must be signed with a cert that is imported in the kernel, and we have provided documentation on how to do this or disable Secure Boot for those wishing to not bother with it.

So now what

There is no good answer here. While I dislike the more inflammatory claims, personal attacks, and overall tone in the bug, I stand by my assertion that the reporter has every right to be angry. They simply want to use the hardware they have purchased. I think the anger is misdirected somewhat, but hopefully this post illustrates that there are two sides to every story and elaborates on why it isn't as simple as many make it out to be.

Fedora knows people want to use things that give them the best performance or user experience. We aren't actively trying to prevent either of those. We try and balance the needs of as many users as we can. In this specific case, we're also looking at ways to improve the user experience without compromising Fedora's stance on proprietary software. Unfortunately, there are situations where things will break and we simply cannot fix them. Please continue to tell us about them so we're aware. Perhaps with more understanding and less vitriol in the future.

This post reflects the author's opinion and is not necessarily representative of the Fedora project or the author's employer.
linkpost comment

4.4.y kernel and XFS. Sometimes being Fast and First is hard [Mar. 4th, 2016|02:51 pm]

TLDR: Update xfsprogs to xfsprogs-4.3.0-1 before you update the kernel to 4.4.4 on Fedora 22/23.

Every once in a while we get a bug report that sounds really really bad. So bad that it makes us do a double take and start thinking about it in detail even before we complete the normal daily bug review. Today was one of those days.

This morning I read a bug report that suggested the 4.4 kernel will not boot with an xfsprogs in userspace less than version 4.3.0. That sounds pretty bad. Fedora 23 has already been rebased to the 4.4.y stable series of kernels, and it is in update-testing for Fedora 22. Given that the default filesystem in Server edition is XFS, that definitely caused some concern.

However, thinking about it even for a moment it seemed somewhat odd. First, typically the only userspace component involved in mounting an XFS filesystem is mount(8). That isn't provided by the xfsprogs package. Even if that is what was broken, one would think that there would have been bug reports long before now given that Fedora 23 was rebased a while ago already. Perhaps fsck might come into play too, and that's a possibility but it isn't very likely. So it was time to dig in further. The report linked to an upstream discussion. Here is where things started to get clearer. I'll try and explain a bit.

Around the 3.15 kernel release, XFS added support for maintaining CRC checksums for on-disk metadata objects. According to the mkfs.xfs manpage:

CRCs enable enhanced error detection due to hardware issues, whilst the format changes also improves crash recovery algorithms and the ability of various tools to validate and repair metadata corruptions when they are found. The CRC algorithm used is CRC32c, so the overhead is dependent on CPU archi‐ tecture as some CPUs have hardware acceleration of this algorithm. Typically the overhead of calculat‐ ing and checking the CRCs is not noticable in normal operation.

Yay! More robustness and better error recovery. Fantastic. However, features like this are rarely enabled by default when they are first released. In fact, the addition of this necessitated bumping the version number of the XFS on-disk format to version 5. For further caution, xfsprogs didn't start supporting this until xfsprogs-3.2.0 and didn't make it the default for mkfs.xfs until xfsprogs-3.2.3. Pretty cautious on the part of the XFS developers. But that still leaves us trying to figure out why the report was submitted and what it means for Fedora.

First, the actual problem. In the 4.4 kernel release the XFS developers started validating some additional metadata of V5 filesystems against the log. This is fine for healthy filesystems. However, if you had a crash and ran xfs_repair from an older xfsprogs package, it would unconditionally zero out the log and the kernel would be very confused and report corruption when you tried to mount it. The xfsprogs-4.3 release fixed this in the xfs_repair utility. OK, so there's validity to the 4.4 kernel needing xfsprogs-4.3, but only in certain circumstances. Namely, you have to have a V5 filesystem on disk, and you have to run xfs_repair from an older xfsprogs against it.

Fortunately, that isn't a super common situation. For Fedora 22, we released with xfsprogs-3.2.2. That means any new XFS filesystems created from installation media should still be V4 and not hit this issue (prior EOL Fedora releases are in the same boat). It's possible someone manually specified crc=1 when they created their F22 XFS partition, but that is a rare case. Fedora 23, on the other hand, shipped with xfsprogs-3.2.4 and should be creating V5 XFS filesystems by default. Those users would still need to run xfs_repair, which isn't a massively common thing to do. Unfortunately, it is common enough we need to do something about it.

The first thing to do was get xfsprogs updated in Fedora 22 and 23 (24 and rawhide are fine already). Eric Sandeen, being the massively awesome human being he is, had that done before I had even fully understood the problem. He was, in fact, so fast that he had the bodhi updates file before we even talked about what to do in the kernel package. That isn't ideal, but it's workable.

The second thing to do was to make sure the kernel package took this into account. Originally we discussed adding a Requires on the appropriate xfsprogs version. That actually isn't a great idea though. The kernel package doesn't actually Require any of the filesystem creation tools packages and it shouldn't. Users don't want to be forced to drag around xfsprogs if they aren't even using XFS filesystems, or btrfs-progs if they aren't using btrfs, etc. So we quickly realized that we needed to use Conflicts.

For the 4.4.4 kernel updates in Fedora 22 and Fedora 23, we added the Conflicts for xfsprogs < 4.3.0-1. You may see output that looks similar to this when you dnf update:

Error: package kernel-core-4.4.4-301.fc23.x86_64 conflicts with xfsprogs < 4.3.0-1 provided by xfsprogs-3.2.4-1.fc23.x86_64.

The solution here is to make sure you update xfsprogs first. Then your kernel update should work fine.

We'd like to thank the original bug reporter and Eric Sandeen for their help.

[Edit: Added TLDR for people that don't like to read and/or hate my writing]
linkpost comment

Understanding the kernel release cycle [Dec. 2nd, 2015|09:14 pm]

A few days ago, a Fedora community member was asking if there was a 4.3 kernel for F23 yet (there isn't). When pressed for why, it turns out they were asking for someone that wanted newer support but thought a 4.4-rc3 kernel was too new. This was surprising to me. The assumption was being made that 4.4-rc3 was unstable or dangerous, but that 4.3 was just fine even though Fedora hasn't updated the release branches with it yet. This led me to ponder the upstream kernel release cycle a bit, and I thought I would offer some insights as to why the version number might not represent what most people think it does.

First, I will start by saying that the upstream kernel development process is amazing. The rate of change for the 4.3 kernel was around 8 patches per hour, by 1600 developers, for a total of 12,131 changes over 63 days total[1]. And that is considered a fairly calm release by kernel standards. The fact that the community continues to churn out kernels of such quality with that rate of change is very very impressive. There is actually quite a bit of background coordination that goes on between the subsystem maintainers, but I'm going to focus on how Linus' releases are formed for the sake of simplicity for now.

A kernel release is broken into a set of discrete, roughly time-based chunks. The first chunk is the 2 week merge window. This is the timeframe where the subsystem maintainers send the majority of the new changes for that release to Linus. He takes them in via git pull requests, grumbles about a fair number of them, refuses a few others. Most of the pull requests are dealt with in the first week, but there are always a few late ones so Linus waits the two weeks and then "closes" the window. This culminates in the first RC release being cut.

From that point on, the focus for the release is fixes. New code being taken at this point is fairly rare, but does happen in the early -rc releases. These are cut roughly every Sunday evening, making for a one week timeframe per -rc. Each -rc release tends to be smaller in new changesets than the previous, as the community becomes much more picky on what is acceptable the longer the release has gone on. Typically it gets to -rc7, but occasionally it will go to -rc8. One week after -rc7 is released, the "final" release is cut, which maps nicely with the 63 day timeframe quoted above.

Now, here is where people start getting confused. They see a "final" release and immediately assume it's stable. It's not. There are bugs. Lots and lots of bugs. So why would Linus release a kernel with a lot of bugs? Because finding them all is an economy of scale. Let's step back a second into the development and see why.

During the development cycle, people are constantly testing things. However, not everyone is testing the same thing. Each subsystem maintainer is often testing their own git tree for their specific subsystem. At the same time, they've opened their subsystem trees for changes for the next version of the kernel, not the one still in -rcX state. So they have new code coming in before the current code is even released. This is how they sustain that massive rate of change.

Aside from subsystem trees, there is the linux-next tree. This is daily merge of all the subsystem maintainer trees that have already opened up to new code on top of whatever Linus has in his tree. A number of people are continually testing linux-next, mostly through automated bots but also in VMs and running fuzzers and such. In theory and in practice, this catches bugs before they get to Linus the next round. But it is complicated because the rate of change means that if an issue is hit, it's hard to see if it's in the new new code only found in linux-next, or if it's actually in Linus' tree. Determining that usually winds up being a manual process via git-bisect, but sometimes the testing bots can determine the offending commit in an automated fashion.

If a bug is found, the subsystem maintainer or patch author or whomever must track which tree the bug is, whether it's a small enough fix to go into whatever -rcX state Linus' tree is in, and how to get it there. This is very much a manual process, and often involves multiple humans. Given that humans are terrible at multitasking in general, and grow ever more cautious the later the -rcX state is, sometimes fixes are missed or simply queued for the next merge window. That's not to say important bugs are not fixed, because clearly there are several weeks of -rcX releases where fixing is the primary concern. However, with all the moving parts, you're never going to find all the bugs in time.

In addition to the rate of change/forest of trees issue, there's also the diversity and size of the tester pool. Most of the bots test via VMs. VMs are wonderful tools, but they don't test the majority of the drivers in the kernel. The kernel developers themselves tend to have higher end laptops. Traditionally this was of the Thinkpad variety and a fair number of those are still seen, but there is some variance here now which is good. But it isn't good enough to cover all possible firmware, device, memory, and workload combinations. There are other testers to be sure, but they only cover a tiny fraction of the end user machines.

It isn't hard to see how bugs slip through, particularly in drivers or on previous generation hardware. I wouldn't even call it a problem really. No software project is going to cut a release with 0 bugs in it. It simply doesn't happen. The kernel is actually fairly high quality at release time in spite of this. However, as I said earlier, people tend to make assumptions and think it's good enough for daily use on whatever hardware they have. Then they're surprised when it might not be.

To combat this problem, we have the upstream stable trees. These trees backport fixes from the current development kernels that also apply to the already released kernels. Hence, 4.3.1 is Linus' 4.3 release, plus a number of fixes that were found "too late". This, in my opinion, is where the bulk of the work on making a kernel usable happens. It is also somewhat surprising when you look at it.

The first stable release of a kernel, a .1 release, is actually very large. It is often comprised of 100-200 individual changes that are being backported from the current development kernel. That means there are 100-200 bugs immediately being fixed there. Whew, that's a lot but OK maybe expected with everything above taken into account. Except the .2 release is often also 100-200 patches. And .3. And often .4. It isn't until you start getting into .5, .6, .7, etc that the patch count starts getting smaller. By the .9 release, it's usually time to retire the whole tree (unless it's a long-term stable) and start the fun all over again.

In dealing with the Fedora kernel, the maintainers take all of this into account. This is why it is very rare to see us push a 4.x.0 kernel to a stable release, and often it isn't until .2 that you see a build. For those thinking that this article is somehow deriding the upstream kernel development process, I hope you now realize the opposite is true. We rely heavily on upstream following through and tagging and fixing the issues it finds, either while under development or via the stable process. We help the stable process as well by reporting back fixes if they aren't already found.

So hopefully next time you're itching for a new kernel just because it's been released upstream, you'll pause and think about this. And if you really want to help, you'll grab a rawhide kernel and pitch in by reporting any issues when you find them. The only way to get the stable kernel releases smaller, and reduce the number of bugs still found in freshly released kernels, is to broaden the tester pool and let the upstream developers know as soon as possible. In this way, we're all part of the upstream kernel community and we can all keep making it awesome and impressive.

(4.3.y will likely be coming to F23 the first week of January. Greg-KH seems to have gone on some kind of walkabout the past few weeks so 4.3.1 hasn't been released yet. To be honest, it's a break well deserved. Or maybe he found 4.3 to be even more buggy as usual. Who knows!)

[1] https://lwn.net/Articles/661978/ (Of course I was going to link to lwn.net. If you aren't already subscribed to it, you really should be. They have amazing articles and technical content. They make my stuff look like junk even more than it already is. I'm kinda jealous at the energy and expertise they show in their writing.)
linkpost comment

Fedora kernel exploded tree part deux: Snakes and assumptions [Oct. 6th, 2015|01:13 pm]

A while back I wrote about some efforts to move to using an exploded source tree for the Fedora kernel. As that post details, it wasn't the greatest experience. However, I still think an exploded tree has good utility and I didn't want to give up on the idea of it existing. So after scraping our "switch" I decided to (slowly) work on a tool that would create such a tree automatically. In the spirit of release-early and pray nobody dies from reading your terrible code, we now have fedkernel. The readme file in the repo contains a high level overview of how the tool works, so I won't duplicate that here. Instead I thought I would talk about some of the process changes and decisions we made to make this possible.

Git all the things

One of the positive fallouts of the previous efforts was that all of the patches we carried in Fedora were nicely formatted with changelogs and authorship information. Being literally the output of git-format-patch instantly improved the patch quality. When it came time to figure out how to generate the patches from pkg-git to apply to the exploded tree, I really wanted to keep that quality. So I thought about how to accomplish this and then I realized there was no need to reinvent the wheel. The git-am and git-format-patch tools existed and were exactly what I wanted.

After discussing things with the rest of the team, we switched to using git-am to apply patches in the Fedora kernel spec. The mechanics of this are pretty simple: the spec unpacks the tarball (plus any -rcX patches) and uses this as the "base" commit. Stable update patches are applied as a separate commit on top of the base if it is a stable kernel. Then it walks through every patch and applies it with git-am. This essentially enforces our patch format guidelines for us. It does have the somewhat negative side effect of slowing down the %prep section quite a bit, but in practice it hasn't been slow enough to be a pain. (Doing a git add and git commit on the full kernel sources isn't exactly speedy, even on an SSD.)

So after %prep is done, the user is left with a git tree in the working directory that has all the Fedora patches as separate commits. "But wait, isn't the job done then?", you might ask. Well, no. We could call it good enough, but that isn't really what I or other users of an exploded tree were after. I wanted a tree with the full upstream commit history plus our patches. What this produces is just a blob, plus our patches. Not quite there yet but getting closer.


This is where fedkernel comes in. I needed tooling that could take the patches from this franken-tree and apply them to a real exploded git tree. My previous scripts were written in bash, and I could have done this in bash again but I wanted to make it automated and I wanted it to talk to the Fedora infrastructure. This means it was time to learn python again. Fortunately, the upstream python community has great documentation and there are modules for pretty much anything I needed. This makes my "bash keyboard in interactive python session" approach to the language pretty manageable and I was able to make decent progress.

To get the patches out of the prepped sources, I needed to know mainly one thing. What was the actual upstream base for this build? That is easy enough to figure out if you can parse certain macros in kernel.spec. The one part that proved to be somewhat difficult was for git snapshot kernels. We name these -gitY kernels, where X increases until the next -rcX release. E.g. kernel-4.3.0-0.rc3.git1.1, kernel-4.3.0-0.rc3.git2.1, etc. That's great for RPMs, but the only place we actually documented what upstream commit we generated the snapshot from was in an RPM %changelog comment.

Parsing it out of there is possible, but it's a bit cumbersome and it is somewhat error prone. The sha1sum is always recorded, but it isn't guaranteed to be the newest changelog. Other patches and changelogs can be added before the kernel is actually built. Fortunately, we use a script to generate these snapshots. To make it trivial to figure out the sha1sum, I modified the script to record the full commit has to a file called gitrev in pkg-git. Now fedkernel can easily read that file and use it as the base upstream revision to apply patches on top of. Yay for cheating and/or being lazy.

The rest of the code deals with prepping the tree, using the git python module to do manipulations and generate patches, and applying them to the other tree. Python actually made much of this very easy to do and again I'm really glad I used that instead of bash.


So now that we modified a few things in pkg-git to make this easier, the assumptions basically fall out to be:

- Patches in the prepped source can be retrieved via 'git format-patch'
- The upstream base revision is determinable from kernel.spec and the gitrev file.

Pretty simple, right? Yes. Except the code isn't complete by any means and it requires a bit more manual setup. Like having existing pkg-git and linux.git trees that it can modify which contain all the branches and proper remotes set up already. That isn't really a huge issue, but it does mean when f24 is branched and rawhide becomes f25, we'll need to do some updates. Perhaps before then we'll have fixed the code (or some nice person will submit a pull request that does so.)

I've been using the tool to generate exploded trees for the past week or so. It seems to be working well, and I've published them at https://git.kernel.org/cgit/linux/kernel/git/jwboyer/fedora.git/ once again. There is a history gap there as the tree fell into disrepair for a while, but it should be kept current going forward.

Even if the code is terrible and hacky, writing it was a good learning experience. I hope to keep refining it and improving things in the TODO over time. If you want to pitch in, patches are always welcome. You can email them to me, submit a pagure.io pull request, or mail the Fedora kernel list as usual.
linkpost comment

A word on Fedora meetings [Sep. 9th, 2015|07:01 pm]

A few people have noted that when I chair a Fedora meeting, I seem to move quickly and remain strictly focused on the current set topic. I discourage broader discussion on semi-related topics and tend to finish meetings faster than most. There is a reason for this. It is because THAT IS HOW MEETINGS ARE SUPPOSED TO WORK.

There are many many articles about productivity and meetings. I am not an expert on this at all, but I do strictly follow my own set of rules for meetings derived partly from said articles but mostly from experience. They can be summed up as so:

1) The meeting better have an agenda. If it doesn't, I'm likely to not pay attention. The agenda should lend itself to having items decided upon and completed during said meeting. If there are hairy topics, the should be last. The vast majority of discussion of agenda items should have already taken place elsewhere, either on lists or in tickets.

2) The meeting should actually STICK to the agenda. A meeting is really not a place to bring up random or tangential topics for discussion. At most such things should be noted for further discussion and decision at future meetings, preferably during open floor.

3) The chair of the meeting is responsible for keeping people on topic and completing the meeting in a timely manner. Be polite, but firm.

4) If it is clear that a decision is not going to be reached on an item during the meeting, defer it for further discussion elsewhere.

That's it.

Meetings should be focused with clear goals. Fedora meetings taking place on IRC necessitates some amount of leeway because of the medium, but that does not mean meetings exist for the sake of meetings or for chat time. IRC is also a terrible place to have in-depth discussions on things, as people feel time pressured and the format of the dialogue can be difficult to follow.

So yes, my FESCo meetings (or any other that I chair) tend to be shorter than most. This is simply because I want the meeting to be productive and I don't want to waste people's time. And if you feel that I am wrong on several points here, that is totally fine with me. Just volunteer to chair the meeting next time ;).
linkpost comment

LPC 2015 Day 2 [Aug. 24th, 2015|03:10 pm]
I started day 2 of LPC with the Graphics, mode setting, and Wayland microconference. This was an extremely dense microconference with lots of information on what is missing in various areas and what has been in the works for some time. To be honest, I felt out of my depth several times. However, it was very clear that the participants in the session were making good progress. One side-note: nobody likes to use mics :).

The afternoon session was heavy in hallway track for me. I had some conversations with Josef Bacik around btrfs and Fedora (summary: not yet). I also spoke with him and another engineer from Facebook around how they handle the kernel across their various machines. It was an interesting high level look at what a company is doing at that scale.

I also touched base with Matthew Garrett on the secure boot patchset. This has been something that we've carried in Fedora since around Fedora 18. It hasn't really changed much at all, but the patches have failed to get upstreamed for a number of trivial reasons. There are a few follow-ups that need to be looked into as well though. Matthew plans on submitting them upstream again and I'm going to see what I can do to help them get merged. Hopefully the third (fourth?) time is a charm.

The remainder of the afternoon was spent catching up with some old colleagues and meeting a few new people from various companies. Conversation was wide ranging. We touched on technical topics as well as completely irrelevant things. However, I think the time was very well spent as one of the major purposes of conferences is to meet up with people face to face. Without that interaction, it is too easy to resolve a person to nothing more than an email address and that never works out well.

The evening event was held at Rock Bottom Brewery and it was delicious. We had some impromptu Fedora kernel team time over dinner and shared dessert. Once all the ice cream and brownie was gone, I turned in shockingly early. Apparently the jet lag and two late nights in a row were starting to wear on me.
linkpost comment

LPC 2015 Day 1 [Aug. 21st, 2015|12:00 pm]
I have the privilege of attending the Linux Plumbers Conference in Seattle this week. This is by far my favorite conference. The talks and microconferences are all of high quality and the events are very well organized.

The first day is shared with LinuxCon, which lends itself to some talks that span both audiences. The first talk I attended with "Everything is a file descriptor" by Josh Triplett. Josh gave a great overview of why file descriptors are a great mechanism and some of the more recent system calls that have been added to give userspace the ability to write less crappy code. Things like timerfd, and signalfd allow userspace to avoid some of the awkwardness that comes with using the more traditional interfaces that UNIX has provided. He also described work on clonefd, which is a new system call to allow userspace to get a file descriptor tied to a task in the kernel. This will have some interested benefits, such as being able to pass the fd to a process that is not in your process hierarchy and being able to poll for child/task exit without hanging in waitpid. At the end Josh threw out some possible future additions that haven't been implemented. Overall a very well done talk.

Following that, I sat in on Steven Rostedt's talk on the RT patchset. I've seen this talk or a version of it around three times and I'm always highly entertained. Steven likes to pack a ton of information in his talks. The overall success of the RT patchset has been very good, and they are down to some of the final really hard bits. Some of them they actually need help with to figure out the best solution in the subsystem in question (like the VFS). One of the questions from the audience was about lessons learned working on the patchset, the issues they've fixed in mainline because of it, etc. Steven said that is a great idea for another talk, so I hope he writes that up. I would love to hear it.

THe afternoon session started off with an overview of ACPI and where some of the ACPI 6 features are coming into play. Things like low power idle and other PM related features look to be headed towards us in hardware, and ACPI is of course being adapted to work with this. Overall a good overview.

Following that I did a bit of hallway track and then went to Daniel Vetter's talk on screwing up (or how to not) ioctls. Working in the DRM layer has provided him with lots of experience over the past few years on how to properly design your code to make it easier to use. One of the main points that was stressed several times was having testcases for everything. This is fairly obvious, but he pointed out that the testcases are better written when you aren't looking at the code. If you just look at the data structures and create boundary cases based on that, with all kinds of crazy input, you will often catch cases that the code itself doesn't cover. He had a good amount to say and it was entertaining.

The traditional kernel panel was the wrap up for the day, but I skipped that to catch up on some work. Then it was off to the evening event at the Experience Music Project museum. This venue was amazing and had a variety of exhibits ranging from a guitar collection to Star Wars costumes. It was very cool and the food was excellent. After spending perhaps a bit more than I expected in the gift shop, it was back to the hotel to try and sleep off some of the weariness that comes from travel delays and cramming information into your head all day.
linkpost comment

Failing Fast [Jul. 20th, 2015|10:48 am]

Users of the Fedora exploded kernel git tree might have noticed that it has been stale for a couple of weeks now. What they might not know is why that is, and when it is going to be fixed. The answer is somewhat complicated and I'll try and summarize here.

The Fedora kernel team recently tried shifting to using the exploded tree as the canonical location for Fedora kernel work. The benefits and ideas were written here, and most of those still stand. So I went to work on some scripts that would make this easier to do. The results weren't terrible. Things worked, kernels were still built, and the exploded tree was spit out (albeit at a different location). By some measures, this was a success.

Except it really wasn't. Some of the motivation behind the change was to increase participation and transparency around the Fedora kernel. However, while we worked through the new process we quickly realized that it was much more cumbersome to actually produce kernel packages. Since the primary output of our team is packages, that seemed like a pretty bad side effect. Similarly, transparency wasn't decreased but the change made things much more confusing. Changes in the exploded tree weren't synced 1:1 back to the Fedora package repo, so it appeared like big code drops there instead of discrete commits. Telling people to look at the exploded tree was an option, but in Fedora's package-centric world it was a bit out of place.

So what do you do when you've spent time making a change and it isn't working out? Well, you scrap it. There's little point in pushing on with something that even the primary author finds to be awkward. The payoffs weren't likely to materialize in any sort of timeframe that would make it worth the continued effort there. Was the time wasted? In my opinion, no. It answered some questions we've had for a while, and showed us that both the tooling and processes Fedora uses aren't really amenable to working with exploded sources.

Unfortunately, the exploded tree has been stagnant since we changed back. However, I don't think it will remain that way forever. We tweaked some of the things we do to build the kernel package such that taking that content and creating an exploded tree will be easier. With a bit more scripting, it should be possible to almost automate the creation on every successful build. I'll be working on that off and on for the next few weeks. Hopefully by Flock I have something cobbled together to get things back on track again.

It's been a while since I've had what I would consider a pretty big failure. I make mistakes all the time like everyone else, but those are generally small and on short-term things. In a discussion with someone, they asked me if I was bothered by this being a big bigger. I'm not. Personally, I don't care if I fail spectacularly or not as long as I learn something from it. I think I did, so the exercise was worthwhile to me. Failing fast is a really good way to work through some complicated scenarios, and is far better than staying stagnant out of fear of failure. Time to move on to the next idea!

(A couple of notes:)

It was pointed out on the list that RHEL tooling can build from an exploded tree, but it seems it does this with some odd hacks that we'd likely try and avoid in Fedora. The workflow between RHEL and Fedora kernels are also massively different. I would love to see some more commonality between the two, but not at the expense of forcing ill fitting process on either one.

Pagure, is where the exploded tree was hosted in the interim. We tried it with the idea that making multiple committers there would be easier than kernel.org. It is a neat service, but I'm not sure we'd really use many of the features it was designed around. The only complaint I heard was that browsing via the web interface was slow.
linkpost comment

Fedora 22 and Kernel 4.0 [Apr. 16th, 2015|10:36 am]

As Ryan noted yesterday, Fedora 22 is on track to ship with the 4.0 kernel release. So what does that mean in the grand scheme of things? In short, not much.

The major version change wasn't done because of any major feature or change in process or really anything exciting at all. Linus Torvalds changed it because he felt the minor version number was getting a bit large and he liked 4.0 better. It was really a whim more than any thing contained within the kernel itself. The initial merge window builds of this kernel in Fedora were even called 3.20.0-rc0.gitX until the 4.0-rc1 release came out.

In fact, this kernel release is one of the more "boring" releases in a while. It has all the normal fixes and improvements and new hardware support one would normally expect, but overall it was a fairly quiet release. So version number aside, this is really more of the same from our beloved kernel development community.

However, there is one feature that people (and various media sites) seem to have keyed in on and that is the live patching functionality. This holds the promise of being able to patch your running kernel without rebooting. Indeed, this could be very useful, but it isn't quite there yet. And it also doesn't make a whole lot of sense for Fedora at this time. The Fedora kernels have this functionality disabled, both in Fedora 22 and Rawhide.

What was merged for 4.0 is the core functionality that is shared between two live patching projects, kPatch and kGraft. kPatch is being led by a development team in Red Hat whereas kGraft is being developed by a team from SuSE. They both accomplish the same end result, but they do so via a different approach internally. The two teams met at the Linux Plumbers conference last year and worked on some common ground to make it easier to merge into mainline rather than compete with each other. This is absolutely awesome and an example of how new features should be developed upstream. Kudos to all those involved on that front.

The in-kernel code can accept patches from both methods, but the process and tools to create those patches are still being worked on in their upstream communities. Neither set are in Fedora itself, and likely won't be for some time as it is still fairly early in the life of these projects. After discussing this a bit with the live patching maintainer, we decided to keep this disabled in the Fedora kernels for now. The kernel-playground COPR does have it enabled for those that want to dig in and generate their own patches and are willing to break things and support themselves.

In reality, we might not ever really leverage the live patching functionality in Fedora itself. It is understandable that people want to patch their kernel without rebooting, but the mechanism is mostly targeted at small bugfixes and security patches. You cannot, for example, live patch from version 4.0 to 4.1. Given that the Fedora kernel rebases both from stable kernel (e.g. 3.19.2 to 3.19.3) and major release kernels over the lifetime of a Fedora release, we don't have much opportunity to build the live patches. Our update shipping infrastructure also isn't really geared towards quick, small fixes. Really, the only viable target for this functionality in Fedora is likely on the oldest Fedora release towards the end of its lifecycle and even then it's questionable whether it would be worth the effort. So I won't say that Fedora will never have a live patch enabled kernel, but there is a lot of work and process stuff to be figured out before that ever really becomes an option.

So that's the story around the 4.0 kernel. On the one hand, it sounds pretty boring and is likely to disappoint those hoping for some amazing new thing. On the other hand, it's a great example of the upstream kernel process just chugging along and delivering pretty stable quality kernels. As a kernel maintainer, I like this quite a bit. If you have any questions about the 4.0 kernel, or really any Fedora kernel topics, feel free to email us at the Fedora kernel list. We're always happy to discuss things there.
link1 comment|post comment

[ viewing | most recent entries ]
[ go | earlier ]