I have recently been thinking about a Linux distribution providing “vanilla packages” and what this really means. The basic idea is simple – provide packages as upstream released them. In practice, this is an impossibility (for reasons I will cover below).
Firstly, let's cover the reason for providing vanilla packages. And I use “reason” singular deliberately, because I think it comes down to a single point: the software developer knows their software better than you do. Your patch could unintentionally introduce a security issue – it has happened before… Or you could add a new feature that the developer then implements in a different and incompatible way in the next official release. Also, any bug found in your modified software will need to be reproduced in an unpatched version before the issue can be reported upstream.
So, in an ideal world, we would just run the equivalent of “./configure; make; make install” and all software would install perfectly. But the world is far from ideal… I will cover two points across two posts: firstly patching, followed by configure options and dependencies.
Patching software is a necessity in any Linux distribution. I am only considering rolling release distributions in the discussion below, which removes the need to backport fixes and features from newer versions of a software package. So, what patching is minimally required?
- Patches for build issues
- Patches for security issues
- Patches to fix major software features
I am completely ignoring patches released by upstream as part of their update process. For example, the Linux kernel provides a large patch that updates from their x.y.0 release to x.y.z. Bash releases a patch for each minor update, so bash-4.2.042 requires applying 42 patches… These patches are obviously required and are fully sanctioned by the developer, so they do not deviate from vanilla.
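To make the bash example concrete, here is a minimal sketch of generating the list of upstream patches for a given patch level rather than typing out all 42 by hand. It assumes upstream names its patches bash42-001 through bash42-042 (zero-padded, three digits); the function name is hypothetical and only for illustration.

```python
def upstream_patch_names(major: int, minor: int, patchlevel: int) -> list[str]:
    """Return the upstream patch names needed to reach a given patch level.

    Assumes the naming scheme bash<major><minor>-NNN with a zero-padded,
    three-digit patch number (an assumption made for this sketch).
    """
    prefix = f"bash{major}{minor}"
    return [f"{prefix}-{n:03d}" for n in range(1, patchlevel + 1)]


# bash-4.2.042: every patch from 001 up to 042 must be applied, in order.
patches = upstream_patch_names(4, 2, 42)
print(len(patches))
print(patches[0], patches[-1])
```

All of these come straight from the developer, so applying them is still vanilla.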
It should be fairly obvious what patches for build issues are… the software will not compile for some reason, so you need to patch it. A piece of software is almost never tested in every single environment before release and, even if it were, updates to other pieces of software can cause build issues. This is particularly common with gcc updates, which have become progressively stricter about which headers need to be included for a function. Even worse, a software developer might release with the “-Werror” flag enabled by default, meaning any new warning will result in a build failure (I do not have kind words about software developers that do that…). Then there are more complicated issues involving a library update with API changes, which require much more extensive fixes. While adding a single extra include really does not require upstream approval before applying, even that should be forwarded upstream.
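As an illustration of how small the "single extra include" kind of build fix is, the sketch below builds a unified diff for a made-up C file that calls getpid() without including the header that declares it. The file name and its contents are hypothetical; only the shape of the patch matters.

```python
import difflib

# Hypothetical source file that calls getpid() without declaring it.
before = [
    "#include <stdio.h>\n",
    "\n",
    "int main(void) {\n",
    '    printf("%d\\n", (int)getpid());\n',
    "    return 0;\n",
    "}\n",
]

# The fix: add the header that declares getpid().
after = before[:1] + ["#include <unistd.h>\n"] + before[1:]

# Produce the patch in unified diff format, as it would be sent upstream.
patch = "".join(
    difflib.unified_diff(before, after, fromfile="a/src/main.c", tofile="b/src/main.c")
)
print(patch)
```

A one-line patch like this is cheap to carry, and just as cheap to forward upstream so it can be dropped at the next release.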
Security issues are an important part of patching for non-rolling release distributions. However, new versions of software are usually released whenever a security hole is found, so rolling release distributions only need to update. Again, just grabbing a patch from anywhere on the internet and applying it is not a good idea – I have seen this actually result in a larger security issue than the original.
The final category of patches is those that fix major software features. For example, if an IRC client has a bug preventing it from connecting to any channels, firstly shake your fist at the developer and tell them to do some testing before release, and then patch it. If there is a typo in the help output, file a bug or submit a patch upstream, but there is no need to patch it. The guiding principle should be something like “if this is the only issue found in the software, would the developer consider making a new release?”. If the answer to that question is “yes”, then the answer to whether to provide the patch is also “yes”.
One guiding factor that cannot be stressed enough here is that all patches should be approved by upstream. The best situation is if upstream have committed the patch into the version control repository – preferably on the branch for the version you are using, so no mistakes can be made in back-porting. Failing that, a post on a mailing list or bug tracker by one of the main developers of the software approving a patch is acceptable.
Of course, much of this is subjective. Is that broken feature big enough to patch? Does this bug constitute a security issue? If upstream is rather unresponsive at the moment, should I apply this fix for a security bug? Is this build fix minor enough that I do not need to wait on an upstream comment before applying? Given it is hard to formulate these ideas into precise rules, I think the answer becomes one of how strict the packager is. I was far more likely to include patches when I started packaging for Arch Linux than I am now. So maybe it is not how strict the packager is, but rather how grumpy…
If you want vanilla packaging, start by moving opt dependencies back to dependencies where the package actually links against the “optional” library. I very much doubt upstream developers intended for their packages to be installed with broken dependencies.
I at least partially agree with you there. Issues with configuration and dependencies will be discussed in Part 2.