Development/Packaging/IurtAnalysis

From Mandriva Community Wiki

Iurt2 Analysis

Iurt is the build bot used to build Mandriva Linux packages. This page describes the internal workings of Iurt and the environment required to replicate it on a different system, and examines its main problems and possible solutions.

Introduction

About Iurt

Iurt is a "build box" program that builds an rpm package in a freshly created chroot jail. Supplementary tasks such as spawning and coordinating builds on different architectures or uploading built packages to distribution repositories are performed by other tools such as Ulri and Emi.

Currently Iurt relies heavily on Mandriva-specific tools and requires a carefully built environment to work properly. Iurt can optionally layer a different filesystem over its base chroot using unionfs; that mode of operation is not investigated in this document.

Trivia: The "Iurt" name is not an acronym, but the name of a character of the "Le Patriarche" sci-fi novel written by Florent Villard.

Purpose of this document

This document presents an analysis of Iurt as of October 20th, 2006, documenting a simple use case and the requirements for deployment in another environment. It goes further by commenting on design issues, problems and possible improvements to the tool. Measurements are conducted with a separate, simpler script, since the overall complexity of the current Iurt implementation doesn't allow easy instrumentation for benchmarking.

This document was started as a collection of notes on Iurt deployment in third-party environments, a requirement of the EDOS project.

Basic usage

Invocation

Iurt's core functionality is implemented in the iurt2 program. iurt2 is normally invoked by the iurt wrapper script, which takes care of adjusting the environment and supplying the appropriate command-line arguments to iurt2.

The wrapper script must be executed as the superuser, or through sudo, in order to execute iurt2 as the build user (mandrake). The build user should also have sudo access in order to chroot.

Note: the package name must be specified on the command line with its full path name; otherwise recreate_srpm() will fail, looking for the package in the wrong place (in ./ inside the chroot).
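A caller can satisfy this requirement by resolving the package argument to an absolute path before handing it to the wrapper. The helper below is a minimal sketch of that idea; the name abs_pkg_path is hypothetical and not part of Iurt.

```shell
# Hypothetical helper (not part of Iurt): turn a package argument into an
# absolute path, so the name stays valid after the working directory changes.
abs_pkg_path() {
    case $1 in
        /*) printf '%s\n' "$1" ;;            # already absolute, keep as is
        *)  printf '%s/%s\n' "$PWD" "$1" ;;  # relative: anchor at the current directory
    esac
}

abs_pkg_path cowsay-3.03-11mdv2007.1.src.rpm
```

The printed path can then be passed wherever the package name is expected.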

Note: Only the chroot should be executed as the superuser. The way Iurt changes users should be redesigned. The package should be built as an ordinary user with fakeroot.

Execution

  1. Run iurt2 as the build user (mandrake) with the correct environment and command-line arguments.
  2. Create build chroot.
    • Check whether a recent tarball (~mandrake/chroot_cooker.i586.tar.gz, less than one week old) already exists; otherwise rebuild the chroot from scratch.
    • Populate chroot creating directories (with iurt_root_command --mkdir -p ).
    • Initialize rpm database (with iurt_root_command --initdb ).
    • Install base packages with urpmi from /cooker. Base packages are: basesystem, rpm-build, rpm-mandriva-setup-build, sudo, urpmi.
    • Create rpm directories in /home/builder/rpm using iurt_root_command --mkdir inside the chroot filesystem.
  3. Remove /var/lib/rpm/__db* inside the chroot filesystem.
  4. Create chroot tarball to speed up recreation of chroot environments.
  5. Remove temporary chroot (?).
  6. Install urpmi (again) and sudo inside the chroot.
  7. Create appropriate /home/builder/.rpmmacros inside the chroot.
  8. (As the original user) Use iurt_root_command --cp to install the .src.rpm inside the chroot and recreate it according to target environment standards.
  9. Install build dependencies inside the chroot.
  10. Install the package to be built inside the chroot.
  11. Build the packages with rpm inside the chroot.
  12. Copy the resulting packages to ~user/iurt/<distro>/<arch>/.
  13. Remove chroot with iurt_root_command --rm.
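The freshness test in step 2 can be expressed with find. This is a sketch of the check only, not the code Iurt uses; the function name and the TARBALL variable are assumptions, and the default path simply mirrors the tarball name quoted above.

```shell
# Succeed if the cached chroot tarball exists and was modified less than
# seven days ago; fail if it is missing or older.
tarball_is_fresh() {
    [ -f "$1" ] && [ -n "$(find "$1" -mtime -7 -print)" ]
}

# TARBALL is a hypothetical override; the default follows the name in the text.
tarball="${TARBALL:-${HOME:-.}/chroot_cooker.i586.tar.gz}"
if tarball_is_fresh "$tarball"; then
    echo "reuse $tarball"
else
    echo "rebuild chroot from scratch"
fi
```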

Note: In step 2 above rpm reports the following error. It is not fatal and the rpm database is created correctly.

Running /usr/local/bin/iurt_root_command --initdb <chroot>
error: can't create transaction lock on <chroot>/var/lib/rpm/__db.000
iurt_root_command: Success!

Note: Removal of /var/lib/rpm/__db* inside the chroot filesystem may fail as well with the following error:

Running /usr/local/bin/iurt_root_command --rm <chroot>/var/lib/rpm/__db*
iurt_root_command: removal of <chroot>/var/lib/rpm/__db* forbidden
iurt_root_command: nothing deleted
ERROR: Removing files

Output tree

After the build, binary packages are placed under a path specified on the iurt2 command line, along with the logs generated during the build. The structure is:

cooker
cooker/i586
cooker/i586/log
cooker/i586/log/chroot
cooker/i586/log/chroot/initialize_chroot-1..log
cooker/i586/log/cowsay-3.03-11mdv2007.0.src.rpm
cooker/i586/log/cowsay-3.03-11mdv2007.1.src.rpm
cooker/i586/log/cowsay-3.03-11mdv2007.1.src.rpm/install_deps_cowsay-3.03-11mdv2007.1.src.rpm-1.0.20061021220204.log
cooker/i586/log/cowsay-3.03-11mdv2007.1.src.rpm/rpm_qa_cowsay-3.03-11mdv2007.1.src.rpm.0.20061021220204.log
cooker/i586/log/cowsay-3.03-11mdv2007.1.src.rpm/build_cowsay-3.03-11mdv2007.1.src.rpm.0.20061021220204.log
cooker/i586/log/cowsay-3.03-11mdv2007.1.src.rpm/binary_test_cowsay-3.03-11mdv2007.1.src.rpm-1.log
cooker/i586/log/wrong_srpm_names.log
cooker/i586/log/status.log
cooker/i586/cowsay-3.03-11mdv2007.1.noarch.rpm
cooker/i586/cowsay-3.03-11mdv2007.1.src.rpm

Note: The log file names are too long. Simple names such as install.log, packages.log, build.log and test.log would suffice, since the namespace is already protected by the directory name.
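With the shorter names, the path of a log file follows directly from the output tree shown above. The function below is only an illustration of that layout; log_path is a hypothetical name.

```shell
# Build the path of a per-stage log file. The stage is one of the short
# names suggested in the note (install, packages, build, test).
log_path() {
    local top="$1" arch="$2" srpm="$3" stage="$4"
    printf '%s/%s/log/%s/%s.log\n' "$top" "$arch" "$srpm" "$stage"
}

log_path cooker i586 cowsay-3.03-11mdv2007.1.src.rpm build
```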

Deployment requirements

Runtime environment

Iurt currently runs on Mandriva Linux systems only. A number of requirements mandate this, urpmi being the most significant.

In order to run Iurt successfully, the following items should be available and configured in the target environment:

  • Build user: builds are performed by a specific user, which also hosts the chroot trees in its home directory. On the Mandriva build system this user is mandrake.
  • Sudo: sudo must be configured for the user executing the iurt wrapper script (if not the superuser).
  • Build user sudo: the build user must also be allowed to sudo, and NOPASSWD: ALL in sudoers is mandatory (verified by check_sudo_access() in iurt2 ).
  • Repository: packages under /cooker/i586 are used to build the chroot.
  • Chroot repository: a different repository (accessible for urpmi ) is used when inside the chroot and specified in the command line for iurt2.
  • Maintainer list: is retrieved during the build process from http://qa.mandriva.com/cgi-bin/srpmmaints.cgi (thus requiring an Internet connection).
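Part of this checklist can be verified before a build is attempted. The sketch below only tests that the command-line tools named in this document are on PATH; it does not check the sudoers configuration or the repositories. The function name and the command list are assumptions.

```shell
# Print every command from the argument list that cannot be found on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# The list is an assumption based on the tools mentioned in this document.
missing=$(missing_commands urpmi rpm sudo)
if [ -n "$missing" ]; then
    # Unquoted on purpose: joins the one-per-line names on a single line.
    echo "missing tools:" $missing
else
    echo "all required tools found"
fi
```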

Dependencies

In the Mandriva (post-2007.0 Cooker) distribution, the following packages and their dependencies are required to run Iurt:

  • perl-RPM4
  • perl-Filesys-Statvfs_Statfs_Df
  • perl-MIME-tools
  • perl-File-NCopy
  • mkcd
  • rpmmon
  • urpmi

Analysis and comments

Main problems

  • Excessive, confusing or misleading output.
  • UID dance: script changes euid to build user and superuser often and in a not very organized way; build user needs sudo access.
  • Usage is tied to rpm and Mandriva.
  • Interfacing with Ulri and Emi is not robust.
  • No package status tracking.
  • Error recovery is subpar.
  • The script connects to a remote server during the build process.
  • Lack of documentation for maintainers.

Design issues

It is easy to criticize Iurt for doing things in a confusing or needlessly complex way, but much of the problem is caused by (a) the nature of the operations put as requirements for the utility, and (b) design issues of other tools used by Iurt (e.g. urpmi and rpm ). Writing another script with the same requirements would result in a similar design. Iurt's problems lie more in its internal organization (source code layout, consistency, programming practices).

The main constraints are:

  • chroot() calls must be performed by the superuser.
  • rpm --root calls chroot() internally so the command must be executed as the superuser.
  • urpmi --root calls rpm --root internally so the command must be executed as the superuser.
  • If the script is to be executed by an unprivileged user, a suid or sudoable helper to do the above is needed (and here is the origin of iurt_root_command, although it is being abused in the current implementation of iurt2 ).
  • urpmi --distrib assumes a distribution repository in the filesystem.
  • Installation of a chroot environment is a slow procedure and introduces a significant overhead for small packages.

Solutions

  • chroot as superuser problem: There is not much one can do if rpm --root is to be used. Debootstrap has a fake chroot option, but it is not limited by rpm design issues.
  • Package tracking: can be handled by a build agent that runs Iurt instead of Iurt itself.
  • Environment construction overhead: persistent chroot environment with unionfs.
  • EUID unpredictability: use sudo, iurt_root_command or suid agents strictly when needed.
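The last point, using elevated privileges strictly when needed, can be sketched as a single wrapper through which every privileged step passes, with everything else running as the calling user. This is an illustration rather than Iurt's mechanism; as_root, DRY_RUN and AS_ROOT_UID are hypothetical names.

```shell
# Run one command with root privileges and nothing else. When DRY_RUN is set
# the command line is printed instead of executed. AS_ROOT_UID exists only so
# the sketch can be exercised without being root.
as_root() {
    local uid="${AS_ROOT_UID:-$(id -u)}"
    if [ "$uid" -ne 0 ]; then
        set -- sudo -n "$@"      # -n: fail instead of prompting for a password
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Dry run: shows what would be executed for the rpm database step.
DRY_RUN=1 as_root rpm --initdb --root /path/to/chroot
```

Keeping the privileged calls behind one function makes them easy to list and audit, which is the property the current EUID switching lacks.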

Possible improvements

Bugfixes

  • Don't test for glob characters in iurt_root_command. Or at least check for /[*?]/ instead of /\*?/. (Is globbing really needed? Doesn't the shell take care of glob expansion?)
  • Prevent rpm database creation from reporting a false error.
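The difference between the two expressions in the first item can be seen by applying them to sample paths. The sketch takes both patterns exactly as quoted and uses grep -E, which treats them the same way here; how iurt_root_command actually applies its check is not shown in this document, and the sample paths are made up.

```shell
# Return success if the path matches the given extended regular expression.
matches() {
    printf '%s\n' "$1" | grep -Eq -- "$2"
}

old_check='\*?'     # an optional literal "*": also matches the empty string
new_check='[*?]'    # exactly one literal "*" or "?"

for path in '/chroot/var/lib/rpm/__db*' '/chroot/home/builder/rpm'; do
    for re in "$old_check" "$new_check"; do
        if matches "$path" "$re"; then verdict=forbidden; else verdict=allowed; fi
        printf '%-10s %-6s %s\n' "$verdict" "$re" "$path"
    done
done
```

Because the first pattern can match nothing at all, it flags paths that contain no glob character; the bracket expression flags only paths that really contain "*" or "?".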

Performance

  • Don't compress the chroot tarball. Or better yet, don't create a tarball but rsync to/from a cache tree, or keep a single, updated work tree. Together with unionfs for discarding transient data, this will give the best performance for keeping a sound chroot environment. Regenerate the work tree every n compilations or days to ensure sanity.
  • Don't retrieve the full list of maintainers, just get what we want. Or don't rely on an external list of maintainers and trust the source rpm package.
  • Add distributed compiling client (e.g. icecream) support inside the chroot. Add distributed compiling servers in the other farm nodes.
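The "regenerate every n compilations" rule from the first item needs only a small counter kept next to the work tree. A minimal sketch, assuming a plain text state file; the function name and file location are hypothetical.

```shell
# Count builds in a small state file and report when the work tree is due
# for regeneration: succeeds on every Nth call and resets the counter.
regen_due() {
    local counter_file="$1" every="$2" count=0
    if [ -f "$counter_file" ]; then
        count=$(cat "$counter_file")
    fi
    count=$((count + 1))
    if [ "$count" -ge "$every" ]; then
        echo 0 > "$counter_file"
        return 0
    fi
    echo "$count" > "$counter_file"
    return 1
}

# Typical use before each build (state file path is an assumption):
#   if regen_due /var/tmp/chroot.builds 50; then rebuild the work tree; fi
```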

Security

  • Audit iurt_root_command. Is it really necessary?
  • Build packages with fakeroot instead of real superuser account.
  • Avoid no-password sudo for build user.

Interface sanity

  • Assume package in pwd when only the file basename is given.
  • Abort execution when a mandatory stage (e.g. chroot creation) fails.
  • Reduce verbosity, improve signal/noise ratio.
  • Log file names could be shorter.
  • Rethink configurable parameters.
  • Don't get packages from two different sources for setting up chroot and using it. Adding build dependencies to the chroot environment can be done using urpmi --src --root before chrooting.
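Aborting on a failed mandatory stage can be done with a small stage runner. The sketch below is generic shell, not Iurt code; run_stage is a hypothetical name and the stages are stand-ins.

```shell
# Run one named stage; if it fails, report it and stop the whole script with
# the stage's exit status instead of carrying on with a broken environment.
run_stage() {
    local name="$1" status
    shift
    if "$@"; then
        echo "ok: $name"
    else
        status=$?
        echo "FAILED: $name (exit status $status)" >&2
        exit "$status"
    fi
}

run_stage "create chroot" true          # stand-in for the real chroot creation
run_stage "install build deps" true     # stand-in for the dependency install
```

Each stage also produces exactly one line of output, which helps with the signal/noise point above.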

Modularity

  • General source code refactoring. Organize according to the tasks performed by the utility:
    • Create a chroot environment
      • Install base system
    • Build package
      • Copy source rpm to chroot
      • Install basic development tools (if not in dependency list)
      • Install build dependencies
      • Rebuild package as the build user, using fakeroot
    • Preserve binary packages
    • Clean up after build
  • Move base system installation to another program (like mdvbootstrap, similar to debootstrap ). This program could be used for generic distro bootstrapping, e.g. for xen guests. (Mandriva bootstrapping with urpmi is very simple so this may not be needed.)
  • Insulate urpmi and rpm-specific package building in plug-ins similar to those used in Youri. Plug-ins for different distros could be created, using distro-specific tools for chroot and package building.
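In a shell implementation, the plug-in idea from the last item could be approximated by dispatching on a backend name. This is a loose sketch and not how Youri plug-ins work; every function and variable name is hypothetical, and the urpmi backend only prints the command (with the options used elsewhere in this document) instead of running it.

```shell
# Each backend provides <name>_install_builddeps ROOT SRPM. Only a urpmi
# backend is sketched here, and it prints the command rather than running it.
urpmi_install_builddeps() {
    echo urpmi --use-distrib "$DIST" --root "$1" --auto -q --src "$2"
}

# Dispatch to the handler of the selected backend, failing if none exists.
install_builddeps() {
    local handler="${BACKEND}_install_builddeps"
    if ! command -v "$handler" >/dev/null 2>&1; then
        echo "no such backend: $BACKEND" >&2
        return 1
    fi
    "$handler" "$@"
}

BACKEND=urpmi DIST=/cooker/i586
install_builddeps /path/to/chroot cowsay-3.03-11mdv2007.1.src.rpm
```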

Maintainability

  • General source code refactoring.
  • Remove old/unused/obsolete code.
  • Encourage consistent usage of functions in source code e.g. sometimes Iurt uses copy() , sometimes it uses system("cp") .
  • Create a test environment. Test new code in the test environment and commit it prior to deployment instead of live updating the production system with experimental code.
  • Rethink %todo, %run and %options, replace them with more straightforward, rational variables.
  • Modularize rpm-specific portions.
  • Modularize urpmi-specific portions.
  • Modularize Mandriva-specific portions.
  • Move chroot creation to a stand-alone program.
  • Don't depend on mkcd.
  • Architecture documentation.

Integration

  • Signal build failure in filesystem so Emi could easily see that something went wrong.
  • Be controlled by a build agent which communicates with a build system manager through a network protocol instead of being spawned via ssh by Ulri.
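Signalling the outcome in the filesystem can be as simple as a one-line marker file per package. A minimal sketch, assuming a file called build_status; the file name, its contents and the function name are illustrative and not something Emi currently reads.

```shell
# Record the outcome of a build as a one-line marker file in the package's
# output directory.
mark_build_status() {
    local outdir="$1" state="$2"
    mkdir -p "$outdir" &&
    printf '%s\n' "$state" > "$outdir/build_status.tmp" &&
    # Rename into place so a reader never sees a partially written file.
    mv "$outdir/build_status.tmp" "$outdir/build_status"
}

# Example (path is hypothetical):
#   mark_build_status cooker/i586/log/cowsay-3.03-11mdv2007.1.src.rpm failed
```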

Nice to have

  • Web-formatting parser for build logs.
  • Pod-based man page.
  • Easy-to-deploy packaged software.

Proof-of-concept remake

A simple script

The following simple remake performs the essential actions of a build box, with no caching, linting or special logging. In order to be able to chroot and run rpm --root, the script must be executed as the superuser. Like Iurt, it is rpm and Mandriva-specific, but modularizable portions are clearly visible. Unlike Iurt, it doesn't require urpmi to be installed inside the chroot jail or /proc to be mounted (but an auto-updated persistent chroot environment would need these).

#!/bin/bash

ROOT=/home/claudio/mdvbs/test/a/
DIST=/cooker/i586
PKG=gkrellm-2.2.9-4mdv2007.0.src.rpm

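# Print a timestamped progress message.
report() {
        echo "[$(date +%H:%M:%S)] $*"
}

# Install binary packages (and their dependencies) into the chroot,
# printing one "installing <package>" line per installed package.
installpkg() {
        report "install dependencies for package(s) $*"
        LC_ALL=POSIX \
        urpmi --use-distrib "$DIST" --root "$ROOT" --auto -q "$@" \
                2>&1 | grep ^installing | cut -d' ' -f 1,2
}

# Install the build dependencies of a source package into the chroot.
installsrc() {
        report "install build dependencies for $*"
        LC_ALL=POSIX \
        urpmi --use-distrib "$DIST" --root "$ROOT" --auto -q --src "$@" \
                2>&1 | grep ^installing | cut -d' ' -f 1,2
}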

report "creating chroot environment"
rm -Rf "$ROOT"
mkdir "$ROOT"

report "creating rpm database"
mkdir -p "$ROOT/var/lib/rpm"
rpm --initdb --root "$ROOT"

installpkg basesystem
installpkg rpm-build
installpkg rpm-mandriva-setup-build
installpkg fakeroot
installsrc $PKG

# workaround for bug in fakeroot package
ln -sf libfakeroot.so.0 "$ROOT/usr/lib/libfakeroot.so"

report "copy source package to buildroot"
cp "$PKG" "$ROOT"

report "generate buildscript"
cat <<EOF > "$ROOT/buildscript"
#!/bin/bash
export LC_ALL=POSIX
adduser build
chown -R build /usr/src/rpm
su build -c "fakeroot rpm --rebuild \"/$PKG\""
EOF
chmod +x "$ROOT/buildscript"

report "execute buildscript"
chroot "$ROOT" /buildscript
report "finished"

Results

Using the script to build gkrellm, we get the following results (package installation and build log removed for clarity):

[23:33:03] creating chroot environment
[23:33:14] creating rpm database
[23:33:15] install dependencies for package(s) basesystem
[23:35:19] install dependencies for package(s) rpm-build
[23:36:17] install dependencies for package(s) rpm-mandriva-setup-build
[23:36:43] install dependencies for package(s) fakeroot
[23:37:09] install build dependencies for gkrellm-2.2.9-4mdv2007.0.src.rpm
[23:40:03] copy source package to buildroot
[23:40:03] generate buildscript
[23:40:03] execute buildscript
[23:42:11] finished

Image:gkrellm-m1-noopt.png

Comments

Note that the setup time is a significant overhead for small packages that compile fast, such as gkrellm. In this case, compilation takes roughly 2 minutes of the roughly 9 minutes used by the entire build process. Of the remaining 7 minutes, approximately 3.5 minutes can be optimized with chroot filesystem caching. Installation of the build dependencies for the package takes an additional 3 minutes, and there isn't much that can be done about it except auditing requirements to remove unnecessary dependencies and using a faster filesystem or a faster computer. (With a significant amount of extra complexity we could differentially install and remove packages based on the dependencies of the previous package, but the loss in robustness and maintainability means this is not a recommended solution.)
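The breakdown can be recomputed from the timestamps in the log above. The helper names are hypothetical, and the grouping of log lines into three phases is a reading of the log rather than something the script reports itself.

```shell
# Convert HH:MM:SS to seconds. One leading zero is stripped from each field
# so that values such as "08" are not rejected as invalid octal numbers.
to_seconds() {
    local h="${1%%:*}" rest="${1#*:}" m s
    m="${rest%%:*}"
    s="${rest#*:}"
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Print the time between two timestamps of the same day as M:SS.
elapsed() {
    local d=$(( $(to_seconds "$2") - $(to_seconds "$1") ))
    printf '%d:%02d\n' $((d / 60)) $((d % 60))
}

echo "base chroot and tools: $(elapsed 23:33:03 23:37:09)"
echo "build dependencies:    $(elapsed 23:37:09 23:40:03)"
echo "package build:         $(elapsed 23:40:03 23:42:11)"
echo "total:                 $(elapsed 23:33:03 23:42:11)"
```

For this log the four values are 4:06, 2:54, 2:08 and 9:08. The sketch does not handle runs that cross midnight.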

Using an updated persistent chroot tree overlaid with unionfs would save approximately 4 minutes per build, or 160 hours (almost a week) if each package in the main section is built once.
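The two figures in this estimate can be related with simple arithmetic. The number of packages is not stated in this document; the value computed below is only what a 4-minute saving and a 160-hour total imply.

```shell
saving_min=4        # minutes saved per build, from the text
total_hours=160     # total saving quoted in the text

# Number of single builds needed for 4 minutes each to add up to 160 hours.
packages=$(( total_hours * 60 / saving_min ))
echo "implied number of packages: $packages"

# The same total expressed in days, to compare with "almost a week".
echo "total saving: $(( total_hours / 24 )) days and $(( total_hours % 24 )) hours"
```

The result is 2400 packages, and 160 hours is 6 days and 16 hours.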

Conclusions

Although Iurt performs all the tasks it is supposed to, some of them with a degree of sophistication (such as speeding up chroot creation by maintaining a cache tarball), its implementation is sub-optimal regarding internal organization, robustness and maintainability. Much of the mandatory complexity arises from design issues of tools used by Iurt such as rpm, and these cannot be easily changed or replaced. A number of feasible improvements have been listed in this document, but in order to add modularity a significant rewrite and internal redesign is needed. Modularity and distro-agnosticism are not requirements for internal usage at Mandriva, but they are needed by EDOS. Indirectly, however, Mandriva could strongly benefit from the gain in maintainability and robustness arising from this rewrite.

A simple script that reimplements basic Iurt functionality shows the steps needed to build a package inside a chroot jail in Mandriva Linux, and can be used as a roadmap for a modular reimplementation of Iurt. The script doesn't handle chroot caching or logging; it just creates the environment and builds a new package. It shows that building a basic chroot environment from an NFS-mounted distribution tree takes roughly 2 minutes on a 1.8 GHz Sempron-based desktop system, with the dependencies for rpm-build taking an extra minute. These times are expected to decrease significantly on a server system with fast disks.

Integration with the rest of the system, currently done through Ulri and Emi, is somewhat crude and generates problems in package tracking. A more robust solution is recommended, perhaps based on a protocol-based agent daemon that executes the build bot and reports status to a build system manager.
