Wednesday, June 25, 2014

GSoC week 5

Nofsync
I created another implementation of the nofsync plugin (it disables fsync(), which makes package installation much faster), this time in Python, as a DNF plugin that disables fsyncing in the YumDB. It is slightly slower than the C library loaded with LD_PRELOAD, because it doesn't eliminate fsyncs made from scriptlets (by gtk-update-icon-cache and such). But it's much simpler from a packaging perspective (mock can stay noarch) and could actually be upstreamable (in dnf), because there are other use cases where you don't try to recover from a hardware failure anyway, for example anaconda. If the power goes down, you probably won't try to resume an existing installation. And this could make it faster (nofsync makes package installation approximately 3 times faster).
To compare the two implementations, set either
config_opts['nofsync'] = 'python'
or
config_opts['nofsync'] = 'ld_preload'
The default is 'python'. To disable nofsync, set the option to something else (for example an empty string).
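To illustrate the idea behind the Python variant, here is a minimal sketch of replacing the fsync functions with no-ops inside a Python process. It is not the actual plugin code; the helper names disable_fsync and restore_fsync are made up for this example, and the real plugin may hook in differently.

```python
import os
import tempfile


def disable_fsync():
    """Replace os.fsync and os.fdatasync with no-ops (hypothetical helper).

    This only affects calls made from Python code in the current process.
    Child processes (for example scriptlets) still call the real functions.
    """
    originals = {}
    for name in ("fsync", "fdatasync"):
        if hasattr(os, name):
            originals[name] = getattr(os, name)
            setattr(os, name, lambda fd: None)
    return originals


def restore_fsync(originals):
    """Put the real functions back."""
    for name, func in originals.items():
        setattr(os, name, func)


saved = disable_fsync()
with tempfile.TemporaryFile() as f:
    f.write(b"data")
    f.flush()
    result = os.fsync(f.fileno())  # calls the no-op, nothing is forced to disk
restore_fsync(saved)
```

This also shows why the Python variant cannot catch fsyncs made by scriptlets: they run as separate programs, which only the LD_PRELOAD approach reaches.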

LVM support
Last week I implemented the base of the LVM plugin for mock using regular snapshots. This week I rewrote the plugin to use LVM thin snapshots, which offer better performance and flexibility, and share space with the original volume and other snapshots, so they don't waste much space. I created basic commands that can be used to manipulate the snapshots.
Example workflow:
I'll try to demonstrate how building different packages can be faster with the LVM plugin. Let's repeat the configuration options necessary to set it up:
config_opts['plugin_conf']['root_cache_enable'] = False
config_opts['plugin_conf']['lvm_root_enable'] = True
config_opts['plugin_conf']['lvm_root_opts'] = {
    'volume_group': 'my-volume-group',
}
You can now also specify 'mount_options', which will be passed to the -o option of mount. To set the size to something larger than the default 2GB, use for example 'size': '4G' (it is passed to lvcreate's -L option, so it can be any string lvcreate will understand). Now let's initialize it:
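Put together, a config using both optional keys might look like the sketch below. The 'noatime' value and the assumption that 'mount_options' takes a plain string are mine, chosen only for illustration. The try/except at the top is there only so that the snippet also runs outside of mock, where config_opts is not predefined.

```python
try:
    config_opts  # provided by mock when it reads the config file
except NameError:
    # Stand-in so the snippet can be run on its own.
    config_opts = {'plugin_conf': {}}

config_opts['plugin_conf']['root_cache_enable'] = False
config_opts['plugin_conf']['lvm_root_enable'] = True
config_opts['plugin_conf']['lvm_root_opts'] = {
    'volume_group': 'my-volume-group',
    # Passed to lvcreate's -L option.
    'size': '4G',
    # Passed to mount's -o option (value and type assumed for illustration).
    'mount_options': 'noatime',
}
```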
$ mock --init
Mock will now create a thin pool with the given size, create a logical volume in it, mount it and install the base packages into it. After the initialization is done, it creates a new snapshot named 'postinit', which will then be used to roll back changes during --clean (which is by default also executed as part of --rebuild). Now try to install some packages you often use for building your own packages. I'm a Java packager and almost every Java package in Fedora requires maven-local to build.
$ mock --install maven-local
But since I now want to rebuild more Java packages, I'd like to make a snapshot of the buildroot.
$ mock --snapshot mvn
This creates a new snapshot of the current state and sets it as the default. We can list the snapshots of the current buildroot with the --list-snapshots command (the default snapshot is prefixed with an asterisk):
$ mock --list-snapshots
Snapshots for mock-devel:
  postinit
* mvn


So let's rebuild something
$ mock --rebuild jetty-9.2.1-1.fc21.src.rpm
$ mock --rebuild jetty-schemas-3.1-3.fc21.src.rpm
Because the 'mvn' snapshot was set as the default, each clean executed as part of the rebuild command didn't return to the state in 'postinit', but to the state in the 'mvn' snapshot. And that was the reason we wanted LVM support in the first place - mock didn't have to install 300+MB of maven-local's dependencies again (with the original mock, this would probably take more than 3 minutes), but the buildroot was still cleaned of the packages pulled in by the previous build. We could then install some additional packages, for example eclipse, and make a snapshot that can be used to build eclipse plugins.
Now let's pretend there has been an update to my 'rnv' package, which is in C and doesn't use maven-local.
$ mock --rollback-to postinit
$ mock --list-snapshots
  mvn
* postinit
Now the 'postinit' snapshot is set as the default and the buildroot has been restored to the state it was in when the 'postinit' snapshot was taken (after initialization, with no maven-local there). The 'mvn' snapshot is retained and we can switch back to it using --rollback-to mvn.
So now I can rebuild my hypothetical rnv update. If I decide that I don't need the 'mvn' snapshot anymore, I can remove it with
$ mock --remove-snapshot mvn
You cannot remove the 'postinit' snapshot. To remove all logical volumes belonging to the buildroot, use mock --scrub lvm
 
So that's it. You can create as many snapshots as you want (and snapshots of snapshots) and keep a hierarchy of them to build packages that have different sets of BuildRequires.
A few more details:
  • The real snapshot names passed to LVM commands have the root name prefixed to avoid clashes with other buildroots or with volumes that don't belong to mock at all. Mock also checks whether the snapshots belong to its thin pool.
  • The volume group needs to be provided by the user, mock won't create one. It won't touch anything besides the thin pool, so it should be quite safe if it uses the same volume group as your system (I have it like that).
  • The command names suck. I know. I'll try to provide short options for them.
  • If you try the version in my jenkins repository, everything is renamed to xmock, including the command, so that it can exist alongside the original mock.

Wednesday, June 18, 2014

GSoC - week 4

Last week I had exams at the university and that left me with less time for work. But I made some progress anyway.

Mock performance
Mock builds usually take a considerable amount of time. Not much can be done about the speed of the actual build, but package installation can be improved. Last time I created the noverify plugin, which provided a considerable speedup, and my mentor recommended trying to remove fsync calls during package installation. I do that by building a small library in C that contains only empty fsync() and fdatasync() functions and copying it into the buildroot. LD_PRELOAD is then used to make it replace the actual libc implementation of these calls. My mentor measured the performance differences on his kvm virtual machine and the results are amazing - these are the times for installing @buildsys-build and maven-local (look at the wall clock time):

Standard yum:
User time (seconds): 55.66
System time (seconds): 5.78
Percent of CPU this job got: 25%
Elapsed (wall clock) time (h:mm:ss or m:ss): 4:02.03

Standard dnf:
User time (seconds): 49.61
System time (seconds): 5.68
Percent of CPU this job got: 23%
Elapsed (wall clock) time (h:mm:ss or m:ss): 3:50.94

With noverify plugin:
User time (seconds): 47.85
System time (seconds): 5.32
Percent of CPU this job got: 36%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:25.25
Maximum resident set size (kbytes): 150248

With noverify plugin, fsync() and fdatasync() disabled:
User time (seconds): 46.38
System time (seconds): 4.97
Percent of CPU this job got: 87%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:58.56
Maximum resident set size (kbytes): 150260


That's more than 4x faster than standard yum (from 4:02 down to under 0:59 of wall clock time) and could be a valuable improvement for both packagers and koji builders.
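For anyone who wants to check the arithmetic, the speedups can be recomputed from the wall clock times listed above. The small helper to_seconds is just something written for this calculation. Relative to standard yum the final configuration comes out at roughly 4.1x, and relative to standard dnf at roughly 3.9x.

```python
def to_seconds(wall_clock):
    """Convert a 'm:ss.ss' or 'h:mm:ss' wall clock string to seconds."""
    seconds = 0.0
    for part in wall_clock.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Wall clock times copied from the measurements above.
timings = {
    "standard yum": "4:02.03",
    "standard dnf": "3:50.94",
    "noverify": "2:25.25",
    "noverify + nofsync": "0:58.56",
}

baseline = to_seconds(timings["standard yum"])
for label, wall in timings.items():
    elapsed = to_seconds(wall)
    print(f"{label}: {elapsed:.2f} s, {baseline / elapsed:.2f}x vs. standard yum")
```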

LVM plugin
I started implementing the basis for LVM support. It can already use an LVM snapshot instead of the root cache. To enable it, put the following in your config:
config_opts['plugin_conf']['root_cache_enable'] = False
config_opts['plugin_conf']['lvm_root_enable'] = True
config_opts['plugin_conf']['lvm_root_opts'] = {
    'volume_group': 'mock-vg'
}

where mock-vg is the name (not the path) of the volume group you want mock to use for creating new volumes. Other configuration options are available - filesystem, size, snapshot_size, mkfs_args. The root cache is disabled because it would be redundant. When started, mock creates a logical volume and mounts it. After the volume is initialized, mock makes a snapshot and mounts the snapshot instead of the original. All following builds then alter only the snapshot, and when the clean command is executed (usually at the beginning of a new build), the snapshot is deleted and replaced with a new one. I originally tried to implement it the other way around - making a snapshot while still working with the original volume, and then merging it when cleaning. But that was very slow - the merging took more than 10s. The current approach is fast enough - cleaning is just deleting a snapshot and creating a new one, which happens almost instantly (compared to deleting the buildroot and unpacking the root cache).
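The clean step described above can be sketched as a sequence of standard LVM and mount commands. This is only a rough model of the approach, not the plugin's code: the function name, the volume names, the snapshot size and the mount point are all hypothetical, and the commands are only built as lists here, not executed.

```python
def clean_commands(volume_group, origin, snapshot, snapshot_size, mountpoint):
    """Build the commands that throw away the working snapshot and recreate it.

    All names are illustrative. Nothing is executed; the lists could be
    handed to subprocess.check_call() by a caller running as root.
    """
    return [
        # Stop using the dirty snapshot.
        ["umount", mountpoint],
        # Delete it without asking for confirmation.
        ["lvremove", "--force", f"{volume_group}/{snapshot}"],
        # Take a fresh regular snapshot of the untouched original volume.
        ["lvcreate", "--snapshot", "--name", snapshot,
         "--size", snapshot_size, f"{volume_group}/{origin}"],
        # Mount the fresh snapshot where the buildroot is expected.
        ["mount", f"/dev/{volume_group}/{snapshot}", mountpoint],
    ]


for command in clean_commands("mock-vg", "mock-devel", "mock-devel-current",
                              "1G", "/var/lib/mock/devel/root"):
    print(" ".join(command))
```

No data is copied in any of these steps, which is why cleaning this way is so much cheaper than deleting the buildroot and unpacking the root cache.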

Next week I'll try to implement more advanced features of the LVM plugin - snapshot management that would allow having a hierarchy of snapshots with different preinstalled packages, making the workflow faster for packagers who work with a more diverse set of packages.


Thursday, June 5, 2014

GSoC - week 3

This week I've mostly been continuing the parts I started the week before. I performed a more drastic refactoring - splitting up the main monolithic class (Root), which does basically everything except a few specific features that are delegated to other modules. I moved the rest of the buildroot-building code to the Buildroot class I created earlier for this purpose. The state logging and plugin loading/calling were decoupled into separate classes. The package management code resides in the PackageManager class and its Yum/Dnf subclasses. I renamed the former Root class to Commands, which now does just what the name suggests - executes commands, such as build and buildsrpm, in the buildroot. I've adapted the plugins to the new model and customized the initialization to prepare the ground for the LVM backend, which will of course be implemented as a plugin (but doing that without hardcoding some parts into the core will require enhancing the plugin API).

Jenkins
I've requested a Jenkins project for my improved version of mock, which is now available at http://jenkins.cloud.fedoraproject.org/job/mock/ and provides built packages (and a repository) at http://jenkins.cloud.fedoraproject.org/job/mock/ws/RPMS. I build it using the spec file in the mock source tree, but before building I inject a script into %prep which replaces all occurrences of 'mock' with 'xmock' (including filenames). Thus, the package can be installed alongside upstream mock without any conflicts. Everything is simply renamed to xmock - the binary, the config directory, and also the group. So feel free to test it :-)
(But be careful, mock is running as root. If I make a mistake it may have consequences on your system)

DNF
I've already got feedback from my mentor (Mikolaj Izdebski). The bad news is that dnf installroot support doesn't work for him, although it works perfectly fine for me. The problem is that for him, dnf doesn't load the config from the installroot and uses the system-wide one instead. I've read the corresponding part of dnf's source code and I don't see any reason why it might fail to find the configuration. If installroot is defined and the config file within it is readable (tested with access(2)), it will be used. I have yet to find out why it doesn't work for him.

Interactive output
I've modified mock to print the output of building and package management commands. One thing my mentor suggested was to get more output from dnf/yum. Currently, neither dnf nor yum outputs anything when synchronizing repos or downloading packages, only when installing. The reason is that they check whether the output is a terminal (with os.isatty()), which it is not - in the case of mock, it's a pipe. I had to trick them into thinking the output is a tty. I solved it by using a pseudoterminal instead of a pipe, so os.isatty() now happily returns True in the child processes and dnf prints everything, including the progress bars when downloading. But since it uses carriage returns and backspaces to erase parts of already printed output, it also introduced a lot of mess into the logs, because in text files those characters aren't interpreted the same way as on a terminal. So I also had to modify the output logging to get rid of these.
Note: to get reasonable output from yum/dnf, its debuglevel has to be set to at least 2 in the config (the current default is 1).
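The pseudoterminal trick and the log cleanup can be sketched with the standard library alone. This is a simplified illustration of the technique, not mock's code: the function names run_with_pty and clean_for_log are invented for the example, and the cleanup only approximates what a terminal does (it keeps the text printed after the last carriage return on each line instead of emulating in-place overwriting).

```python
import os
import pty
import subprocess
import sys


def run_with_pty(command):
    """Run command with stdout/stderr on a pseudoterminal, return its output."""
    master_fd, slave_fd = pty.openpty()
    try:
        process = subprocess.Popen(command, stdout=slave_fd, stderr=slave_fd)
    finally:
        os.close(slave_fd)  # the parent only needs the master end
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 4096)
        except OSError:
            break  # on Linux the read fails once the child has closed its end
        if not data:
            break
        chunks.append(data)
    os.close(master_fd)
    process.wait()
    return b"".join(chunks).decode(errors="replace")


def clean_for_log(text):
    """Remove progress-bar style overwrites so the text is readable in a file."""
    cleaned = []
    # The terminal layer turns "\n" into "\r\n", so normalize that first.
    for line in text.replace("\r\n", "\n").split("\n"):
        segments = [segment for segment in line.split("\r") if segment]
        line = segments[-1] if segments else ""
        chars = []
        for ch in line:
            if ch == "\b":
                if chars:
                    chars.pop()  # a backspace erases the previous character
            else:
                chars.append(ch)
        cleaned.append("".join(chars))
    return "\n".join(cleaned)


# The child sees a tty on its stdout even though we capture the output.
output = run_with_pty([sys.executable, "-c", "import os; print(os.isatty(1))"])
print(clean_for_log(output))
```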

Skipping package verification
Another feature he requested was speeding up mock by skipping the verification of packages when installing. Neither yum nor dnf has an option to disable it, but since they're written in Python, there's always a hackish way to accomplish what you need. There are two solutions to this problem: 1. create a plugin that modifies yum/dnf to not verify packages 2. create a wrapper module that will modify them. I chose number 1 and have implemented it only for dnf so far. I created a simple plugin that, once loaded, rebinds dnf's verify_transaction method to a no-op lambda. Then I just copy it into the buildroot and inject the plugin path into dnf.conf. It can be toggled with the config option 'noverify' (enabled by default).
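The rebinding itself is ordinary Python monkey-patching. The sketch below shows the technique on a stand-in class, because I don't want to misquote dnf's internals here: FakeBase and disable_verification are hypothetical names, and only the method name verify_transaction is taken from the description above.

```python
class FakeBase:
    """Hypothetical stand-in for the package manager object (not dnf's class)."""

    def __init__(self):
        self.verified = False

    def verify_transaction(self, *args, **kwargs):
        # Imagine an expensive check of every installed package here.
        self.verified = True


def disable_verification(cls):
    """Rebind verify_transaction to a no-op that accepts any arguments."""
    cls.verify_transaction = lambda self, *args, **kwargs: None


disable_verification(FakeBase)
base = FakeBase()
base.verify_transaction()  # does nothing now
print(base.verified)
```

Patching the class rather than a single instance means that every object created later picks up the no-op as well, which suits a plugin that is loaded once at startup.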

Other
Also, I fixed some corner-case behavior. Mock no longer fails when you delete the /var/lib/mock directory. Previously it recreated the directory with wrong permissions, which made it hard to figure out why it failed and how to make it work again (the user had to manually set the setgid bit on the directory). It also prints a warning when incorrectly executed as a regular user without the setuid wrapper (previously it printed OSError: Success).

Future
I've been exploring the possibility of mock executing commands in a contained environment and possibly not doing most of its work as root. I've discovered some interesting things about Linux namespaces that might considerably change the way mock works. I will try to write a follow-up post about this soon.