Monday, October 13, 2014

New features in mock-1.2

You may noticed there's been a new release of mock in rawhide (only). It incorporates all the new features I've been working on during my Google Summer of Code project, so I'd like to summarize them here for the people who haven't been reading my blog. Note that there were some other new features that weren't implemented by me, so I don't mention them here. You can read more about the release at http://miroslav.suchy.cz/blog/archives/2014/10/12/big_changes_in_mock/index.html.

LVM plugin
The usual way to cache already initialized buildroot is using tarballs. Mock can now also use LVM as a backend for caching buildroots which is a bit faster and enables efficient snapshotting (copy-on-write). This feature is intended to be used by people who maintain a lot packages and find themselves waiting for mock to install the same set of BuildRequires over and over again.
Mock uses LVM thin provisioning which means that one logical volume (called thinpool) can hold all thin logical volumes and snapshots used by all buildroots (you have to set it like that in the config) without each of them having fixed size. Thinpool is created by mock when it's starts initializing and after the buildroot is initialized, it creates a postinit snapshot which will be used as default. Default snapshot means that when you execute clean or start a new build without --no-clean option, mock will rollback to the state in default snapshot. As you install more packages you can create your own snapshots (usually for dependency chains that are common to many of your packages). I'm a Java packager and most of my packages BuildRequire maven-local which pulls in 100MB worth of packages. Therefore I can install maven-local just once and then make a snapshot with
mock --snapshot maven
and then it will be used as the default snapshot to which --clean will rollback whenever I build another package. When I want to rebuild a package that doesn't use maven-local, I can use
mock --rollback-to postinit
and the initial snapshot will be used for following builds. My maven snapshot will still exist, so I can get back to it later using --rollback-to maven. To get rid of it completely, I can use
mock --remove-snapshot maven
So how do you enable it?
The plugin is distributed as separate subpackage mock-lvm because it pulls in additional dependencies which are not available on RHEL6. So you first need to install it.
You need to specify a volume group which mock will use to create it's thinpool. Therefore you need to have some unoccupied space in your volume group, so you'll probably need to shrink some partition a bit. Mock won't touch anything else in the VG, so don't be afraid to use the VG you have for your system. It won't eat your data, I promise. The config for enabling it will look like this:
config_opts['plugin_conf']['root_cache_enable'] = False
config_opts['plugin_conf']['lvm_root_enable'] = True
config_opts['plugin_conf']['lvm_root_opts'] = {
    'volume_group': 'my-volume-group',
    'size': '8G',
    'pool_name': 'mock',
}

To explain it: You need to disable root_cache - having two caches with the same contents would just slow you down. You need to specify a size for the thinpool. It can be shared across all mock buildroots so make sure it's big enough. Ideally there will be just one thinpool. Then specify name for the thinpool - all configs which have the same pool_name will share the thinpool, thus being more space-efficient. Just make sure the name doesn't clash with existing volumes in your system (you can list existing volumes with lvs command). For information about more configuration options for LVM plugin see config documentation in /etc/mock/site-defaults.cfg.
Additional notes:
Mock leaves the volume mounted by default so you can easily acces the data. To conveniently unmount it, there's a command --umount. To remove all volumes use --scrub lvm. This will also remove the thinpool only if no other configuration has it's volumes there.
Make sure there's always enough space, overflown thinpool will stop working.

Nosync - better IO performance

One of the reasons why mock has always been quite slow is because installing a lot of packages generates heavy IO load. But the main bottleneck regarding IO is not unpacking files from packages to disk but writing Yum DB entries. Yum DB access (used by both yum and dnf) generates a lot of fsync(2) calls. Those don't really make sense in mock because people generally don't try to recover mock buildroots after hardware failure. We discovered that getting rid of fsync improves the package installation speed by almost a factor of 4. Mikolaj Izdebski developed small C library 'nosync' that is LD_PRELOADed and replaces fsync family of calls with (almost) empty implementations. I added support for it in mock.
How to activate it?
You need to install nosync package (available in rawhide) and for multilib systems (x86_64) you need version for both architectures. Then it can be enabled in mock by setting
config_opts['nosync'] = True
It is requires those extra steps to set up but it really pays off quickly.

DNF support
Mock now has support for using DNF as package manager instead of Yum. To enable it, set
config_opts['package_manager'] = 'dnf'
You need to have dnf and dnf-plugins-core installed. There are also commandline switches --yum and --dnf which you can use to choose the package manager without altering the config. The reason for this is that DNF is still not yet 100% mature and there may be a situation where you'd need to revert back to Yum to install something.
You can specify separate config for dnf with dnf.conf config option. If you omit it, mock will use the configuration you have for Yum (config_opts['yum.conf']). To use yum-cache with DNF you have to explicitly set
cachedir=/var/cache/yum
in the dnf.conf or yum.conf config option.
Otherwise, it should behave the same in most situations and also be a bit faster.

Printing more useful output on terminal
Mock will now print the output of Yum/DNF and rpmbuild. It also uses a pseudoterminal to trick it into believing it's attached to terminal directly and also get package downloading output including the progress bars. That way you know whether it's dowloading something or it cannot connect. You need to have debuglevel=2 in your yum.conf for this to work.

Concurrent shell acces to buildroot
Non-desctructive operations use just a shared lock intread of exclusive one. That means you can get shell even though there's a build running. Please use it with caution to not alter the environment of the running build. Destructive operations like clean still need exclusive lock.

Executing package management commands
Mock now has a switch --pm-cmd which you can use to execute arbitrary Yum/DNF command. Example:
mock --pm-cmd clean metadata-expire
There are also --yum-cmd and --dnf-cmd aliases which force using particular package manager.

--enablerepo and --disablerepo options
Passing --enablerepo/--disablerepo to package manager whenever mock invokes it. Now you can have a list of disabled repos in your mock config and enable them only when you need them.

short-circuit
rpmbuild has --short-circuit option that can skip certain stages of build. It can be very useful for debuging builds which fail in later stages. Mock now also has --short-circuit option which leverages it. It accepts a name of the stage that will be the first one to be executed. Available stages are: build, install and binary. (prep stage is also possible, but I'm not the one who added that and I have no idea what it's supposed to do :D). Example:
mock --short-circuit install foo.1.2-fc22.src.rpm
rpmbuild arguments
You can specify arbitrary options that will be passed to rpmbuild with --rpmbuild-opts. Mainly for build debugging purposes.

Configurable executable paths
Mock now also supports specifying paths to rpm, rpmbuild, yum, yum-builddep and dnf executables so you can use different than system-wide versions. This may be useful for Software Collections in the future.

Automatic initialization
You don't need to call --init, you can just do --rebuild and it will do init for you. It will also correctly detect when the initialization didn't finish succesfully and start over.

More thorough cleanup logic
There sould be no more mounted volumes left behind after you interrupted build by ^C. And if they are (i.e. because it was killed), it should handle it without crashing.

Python 3 support
Main part of mock should be fully Python 3 compatible. Python 2 is still used as default. Unported parts are LVM plugin and mockchain.

--unique-ext
This is a feature that was already present for few releases, but it seems only a few people know about it, so I'd like to mention it even though it's not new. I quite often find myself in situation when I want to build a package with the same config, but there's some other build already running, so I cannot. A lot of people just copy the config and change the name of chroot, but that means additional work and most importantly it cannot use the same caches as the original config, because mock sees them as something different. Unique-ext provides a better way. It's a commandline switch that adds a suffix to chroot name, so mock creates different chroot, but it uses the same config and in turn also same caches. Caching mechanisms provide locking to make this work. Using unique-ext with LVM plugin means that the new chroot is based on the postinit snapshot. There's a lock that prevents the postinit snapshot being unnecessarily initialized twice.

If you have any questions, ping me on #fedora-devel (nick msimacek)