Advantage of a Simple “Database” Format
One fairly common criticism of the pacman package manager is that is very slow due to not using some sort of binary database as its backend. I found suggestions to use sqlite dating back to 2005 (although I am sure they go back further) and mailing list activity peaked around late 2007. Speed is one of pacman’s main features – and it beats the competition by a wide margin according to Linux Format – but I guess people want it even faster.
The problem is that we use a filesystem based “database” where each package has its information stored in multiple files. This means that we can get fragmentation of our “database” and the reading of all these files from the filesystem can be quite slow. Usually most of this is cached by the kernel after the first read so speed improves markedly after the first usage.
This was improved a lot in the pacman 3.5 release (March 2011). The sync databases started to be read directly from the downloaded tarball and the local database had the “desc” and “depends” files for each package merged into one file. This increased the speed of reading from the sync databases massively and was a reasonable improvement to the local database too.
So the local package “database” could be improved by reducing it to one or a few files. But every time I think about changing it, I am reminded why I like the plain text file format. I was updating a reasonably out of date computer when I had an issue with the python-pygame package being renamed to python2-pygame. All packages needing in the Arch Linux repos were rebuilt with the new dependency name, so it did not need a provides entry. But my solarwolf package from the AUR still depended on the old name:
$ pacman -S python2-pygame
looking for inter-conflicts...
:: python2-pygame and python-pygame are in conflict. Remove python-pygame? [y/N] y
error: failed to prepare transaction (could not satisfy dependencies)
:: solarwolf: requires python-pygame
As we have a file based database, adjusting the dependency is easy without rebuilding the package. Just open the relevant file and edit away (or use sed…)
$ vim /var/lib/pacman/local/solarwolf-1.5-5/desc
Now I can see my local database has an issue using the handy testdb tool – solarwolf depends on python2-pygame, but that is not installed.
missing python2-pygame dependency for solarwolf
But now I update as usual, installing python2-pygame which removes python-pygame, and my local pacman database is fully consistent.
I am sure all of this would still be possible if the database was in some other format, but it would have required more tools than a simple text editor. Of course, most people should never need to edit their local database, but I have introduced changes to it several times during pacman development and I consider being able to easily fix or revert these in the category of a “good thing”. And yes, I develop and test directly on my production system…
Of course, it is better to use a real database in performance critical situations. But pacman really does not fall into that category.