|When is a kernel bug not a kernel bug?
||[Jun. 30th, 2016|12:27 pm]
Think of this scenario: You're sitting at your shiny Fedora install and notice a kernel update is available. You get all excited, update it through dnf or Gnome Software, or whatever you use, reboot and then things stop working. "STUPID KERNEL UPDATE WHY DID YOU BREAK MY MACHINE" you might say. Clearly it's at fault, so you dutifully file a bug against the kernel (you do that instead of just complaining, right?). Then you get told it isn't a kernel problem, and you probably think the developers are crazy. How can a kernel update that doesn't work NOT be a kernel problem?
This scenario happens quite often. To be sure, a good portion of the issues people run into with kernel updates are clearly kernel bugs. However, there is a whole set of situations where it seems that way but really it isn't. So what is at fault? Lots of stuff. How? Let's talk about the boot sequence a bit.
Booting: a primer
Booting a computer is a terribly hacky thing. If you want a really deep understanding of how it works, you should probably talk to Peter Jones. For the purposes of this discussion, we're going to skip all the weird and crufty stuff that happens before grub is started and just call it black magic.
Essentially there are 3 main pieces of software that are responsible for getting your machine from power-on to whatever state userspace is supposed to be in. Those are grub, the kernel, and the initramfs. Grub loads the kernel and initramfs into memory, then hands off control to the kernel. The kernel does the remainder of the hardware and low-level subsystem init, uncompresses the initramfs and jumps into the userspace code contained within. The initramfs bootstraps userspace as it sees fit, mounting the rootfs and switching control to that to finish up the boot sequence. Seems simple.
So what is this "initramfs"? In technical terms, it's a weird version of a CPIO archive that contains a subset of userspace binaries needed to get you to the rootfs. I say weird because it can also have CPU microcode tacked onto the front of it, which the kernel strips off before unpacking it and applies during the early microcode update. This is a good thing, but it's also kind of odd.
The binaries contained within the initramfs are typically your init process (systemd), system libraries, kernel modules needed for your hardware (though not all of them), firmware files, udev, dbus, etc. It's almost equivalent to the bare minimum you can get to a prompt with. If you want to inspect the contents for yourself, the lsinitrd command is very handy.
There are actually a couple of different 'flavors' of initramfs as well. The initramfs found in the install images is a generic initramfs that has content which should work on the widest variety of machines possible, and can be used as a rescue mechanism. It tends to be large though, which is why after an install the initramfs is switched to HostOnly mode. That means it is specific to the machine it is created on. The tool that creates the initramfs is called dracut, and if you're interested in how it works I would suggest reading the documentation.
OK, so now that we have the components involved, let's get to the actual problems that look like kernel bugs but aren't.
Cannot mount rootfs
One of the more common issues we see after an update is that the kernel cannot mount the rootfs, which results in the system panicking. How does this happen? Actually, there are a number of different ways. A few are:
* The initramfs wasn't included in the grub config file for unknown reasons and therefore wasn't loaded.
* The initramfs was corrupted on install.
* The kernel command line specified in the grub config file didn't include the proper rootfs arguments.
All of those happen, and none of them are the kernel's fault. Fortunately, they tend to be fairly easy to repair but it is certainly misleading to a user when they see the kernel panic.
A different update breaks your kernel update
We've established that the initramfs is a collection of binaries from the distro. It's worth clarifying that these binaries are pulled into the initramfs from what is already installed on the system. Why is that important? Because it leads to the biggest source of confusion when we say the kernel isn't at fault.
Fedora tends to update fast and frequently across the entire package set. We aren't really a rolling release, but even within a release our updates are somewhat of a firehose. That leads to situations where packages can, and often do, update independently across a given timeframe. In fact, the only time we test a package collection as a whole is around a release milestone (keep reading for more on this). So let's look at how this plays out in terms of a kernel update.
Say you're happily running a kernel from the GA release. A week goes by and you decided to update, which brings in a slew of packages, but no kernel update (rinse and repeat this N times). Finally, a kernel update is released. The kernel is installed, and the initramfs is built from the set of binaries that are on the system at the time of the install. Then you reboot and suddenly everything is broken.
In our theoretical example, let's assume there were lvm updates in the timeframe between release and your kernel update. Now, the GA kernel is using the initramfs that was generated at install time of the GA. It continues to do so forever. The initramfs is never regenerated automatically for a given kernel after it is created during the kernel install transaction. That means you've been using the lvm component shipped with GA, even though a newer version is available on the rootfs.
Again, theoretically say that lvm update contained a bug that made it not see particular volumes, like your root volume. When the new kernel is installed, the initramfs will suck in this new lvm with the bug. Then you reboot and suddenly lvm cannot see your root volume. Except it is never that obvious and it just looks like a kernel problem. Compounding that issue, everything works when you boot the old kernel. Why? Because the old kernel initramfs is still using the old lvm version contained within it, which doesn't have the bug.
This problem isn't specific to lvm at all. We've seen issues with lvm, mdraid, systemd, selinux, and more. However, because of the nature of updates and the initramfs creation, it only triggers when that new kernel is booted. This winds up taking quite a bit of time to figure out, with a lot of resistance (understandably) from users that insist it is a kernel problem.
Solution: ideas wanted
Unfortunately, we don't really have a great solution to any of these, particularly the one mentioned immediately above. People have suggested regenerating the initramfs after every update transaction, but that actually makes the problem worse. It takes something known to be working and suddenly introduces the possibility that it breaks.
Another solution that has been suggested is to keep the userspace components in the initramfs fixed, and only update to include newer firmware and modules. This sounds somewhat appealing at first, but there are a few issues with it. The first is that the interaction between the kernel and userspace isn't always disjoint. In rare cases, we might actually need a newer userspace component (such as the xorg drivers) to work properly with a kernel rebase. Today that is handled via RPM Requires, and fixing the initramfs contents cannot take that into account. Other times there may be changes within the userspace components themselves that mean something in the initramfs cannot interact with an update on the rootfs. That problem also exists in the current setup as well, but switching from today's known scenarios to a completely different setup while still having that problem doesn't sound like a good idea.
A more robust solution would be to stop shipping updates in the manner in which they are shipped in Fedora. Namely, treat them more like "service packs" or micro-releases that could be tested as a whole. Indeed, Fedora Atomic Host very much operates like this with a two week release cadence. However, that isn't prevalent across all of our Editions (yet). It also means individual package maintainers are impacted in their workflow. That might not be a bad thing in the long run, but a change of that proportion requires time, discussion, and significant planning to accomplish. It also needs to take into account urgent security fixes. All of that is something I think should be done, but none of it guarantees we solve these kinds of "kernel-ish" problems.
So should we all despair and throw up our hands and just live with it? I don't think so, no. I believe the distro as a whole will eventually help here, and in the meantime hopefully posts like this provide a little more clarity around how things work and why they may be broken. At the very least, hopefully we can use this to educate people and make the "no, this isn't a kernel problem" discussions a bit easier for everyone.
 It should be noted that Peter might not actually want to talk to you about it. It may bring up repressed memories of kludges and he is probably busy doing other things.