Development/Packaging/BuildSystem/Theory

The Build System Explained

Status: Work in progress


Introduction


What it is

The Mandriva Buildsystem is a collection of repositories and bots created to automate building distribution packages in a clean, trusted environment, offering revision control of package data and metadata as well as transparent multi-architecture support.

The system as it exists today was implemented by Florent Villard using custom tools and third-party products such as Youri and Mdvsys. It can be roughly separated into a Front desk subsystem, which receives and stores packages, and a Build core subsystem, which performs the actual build. The Front desk was designed by Gustavo Niemeyer for use in the Conectiva buildsystem, and the Build core is the brainchild of Florent Villard.


About this document

This description of the build system was written primarily for build system maintainers, not for package developers. It is also not an original design document: it was written after an analysis of the implemented system, so names and terms used here (such as "build box" or "the pipeline") may not match those used in the original buildsystem design. Furthermore, this document only presents the general concepts of how the system works: discussion of problems and possible improvements is in Build System Improvements, and installation details are in Build System Details.

Curious users may find this description of the system interesting, since knowing the internals allows them to take advantage of the design and implementation and work more productively. This knowledge, however, is not essential: one can use the system normally after reading only the end-user documentation.


Further reference


How it works


The big picture

The build system's main characters are Ulri (the scheduler), Iurt (the builder) and Emi (the uploader). Ulri keeps a queue of incoming packages, chooses a suitable build node for each package and runs Iurt on it. Emi asynchronously collects the resulting packages from Iurt jobs and uploads them to the distribution repository.

Image:bigpicture_small.png (Build system architecture)

Ulri and Emi are asynchronous tasks executed by cron. Iurt is executed by Ulri via ssh on a remote machine. Input packages for Ulri are assembled, at the user's request, from data extracted from a Subversion repository; this job is accomplished by Repsys or Mdvsys (the extractor). In this document we will refer to the entire path a package follows from the developer's workstation to the distribution repository as The Pipeline.
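
As a rough illustration of this control flow, the sketch below shows how a cron-driven task like Ulri might copy a source package to a build node and start Iurt there over ssh. The builder account name, the /tmp destination and the iurt command line are assumptions for the example, not the real configuration.

 import subprocess

 def build_on_node(node, srpm):
     """Copy a source rpm to a build node and start iurt there (illustrative only)."""
     # source packages travel to the node via scp ("builder" and "/tmp" are placeholders)
     subprocess.check_call(["scp", srpm, "builder@%s:/tmp/" % node])
     # iurt is then run through ssh, relying on public-key authentication
     subprocess.check_call([
         "ssh", "-o", "BatchMode=yes",
         "builder@" + node,
         "sudo", "iurt", "/tmp/" + srpm.split("/")[-1],   # hypothetical iurt invocation
     ])

 # cron runs the scheduler periodically; a free node such as n1 would then be picked
 build_on_node("n1", "/home/mandrake/uploads/todo/foo-1.0-1.src.rpm")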

Package flow

The complete package pathway goes through a six-stage pipeline, the first stage being the developer's computer and the last the FTP site from which the final product can be obtained.

Image:pipeline_small.png (The Pipeline)

A package is born at the Developer's workstation. There the developer packages a piece of software, tests the packaging and then commits tarballs, patches and specfiles to the Subversion repository. The developer can change this package as many times as needed. Only after submission does the package enter the pipeline heading to the distribution repository. When a job is submitted, data is checked out from the Subversion repository and a source rpm package is generated. The package lands in the Input queue managed by Ulri, which can then schedule it. Once scheduled, the package is passed along the pipeline to Iurt to be built in a Build box. Different instances of Iurt handle different architectures. Built packages wait in an Output queue. After the build is complete on all mandatory architectures, Emi collects the resulting binary packages from the Output queue and uploads them to the Distribution repository. This is the master repository, which is then mirrored to many sites worldwide.
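
The same six stages can be summarized as plain data. This is only an aid to reading the paragraph above; the labels follow the text, not any real configuration file.

 PIPELINE = [
     "Developer's workstation",   # packaging, testing, commits to Subversion
     "Subversion repository",     # tarballs, patches and specfiles
     "Input queue",               # source rpms waiting for Ulri to schedule them
     "Build box",                 # clean build by Iurt, one instance per architecture
     "Output queue",              # built binary packages waiting for Emi
     "Distribution repository",   # master repository, mirrored worldwide
 ]

 for number, stage in enumerate(PIPELINE, start=1):
     print("stage %d: %s" % (number, stage))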

The implementation


Package transport

The build system relies on different protocols for data transfer and control, including scp, NFS and HTTP. Data is copied to and from the Subversion repository using svn+ssh, source packages are sent to the build cluster via scp, and binary packages are also retrieved using scp.

Image:protocols_small.png (Protocols used for package transport and control)

The cooker repository used to build the chroot environment on the build cluster nodes is accessed through the filesystem using NFS. If an update is needed, the chroot environment is updated using HTTP. Binary packages collected by Emi are copied to the distribution repository using rsync, and the cluster's cooker repository is synced back, also via rsync.
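
The sketch below groups these transfers, calling the same tools (scp and rsync) from Python. The user name and the rsync destination are assumptions; only the per-node output path also appears later in this document.

 import subprocess

 def send_source_package(node, srpm):
     # source packages are sent to the build cluster via scp ("builder" is hypothetical)
     subprocess.check_call(["scp", srpm, "builder@%s:" % node])

 def collect_binaries(node, arch, destination):
     # built binary packages come back from <node>:/home/mandrake/iurt/<arch>, also via scp
     subprocess.check_call(
         ["scp", "builder@%s:/home/mandrake/iurt/%s/*.rpm" % (node, arch), destination])

 def upload_to_distribution(done_dir):
     # Emi's upload to the distribution repository uses rsync (destination is hypothetical)
     subprocess.check_call(["rsync", "-a", done_dir + "/", "ken:/distribution/cooker/"])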

Pipeline stages

Most of the system runs on kenobi. It serves the Subversion repository and both the input and output queues, holds a local copy of the distribution repository, and runs the scheduler. The final upload to the distribution directory is performed by a script that runs on ken.

Image:machines_small.png (Actual location of pipeline stages)

The actual location of the input queue is kenobi:/home/mandrake/uploads/todo, and the output is placed in <node>:/home/mandrake/iurt/<arch> (whose contents are later moved to kenobi:/home/mandrake/uploads/done; see Build System Details for a detailed description). Critical data transfers are marked in the diagram with red arrows: special care must be taken to recover the system if something goes wrong (a network outage or hardware failure) at these points. This is discussed in Build System Improvements.
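
For quick reference, the locations just mentioned can be collected in one place; the structure is only illustrative, and the paths themselves come from the text above.

 QUEUE_LOCATIONS = {
     "input queue":      "kenobi:/home/mandrake/uploads/todo",
     "per-node output":  "<node>:/home/mandrake/iurt/<arch>",
     "collected output": "kenobi:/home/mandrake/uploads/done",
 }

 for name, location in QUEUE_LOCATIONS.items():
     print("%-18s %s" % (name, location))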

The build cluster

The build cluster builds binary packages for different targets and architectures (such as i586 2007.1 or x86_64 cooker). It currently has three i586 nodes (n1, n3 and n5) and two x86_64 nodes (seggie and deborah). A single source rpm package is dispatched to all architectures and is considered "built" only if the build succeeds on all mandatory architectures.

Cluster nodes must have an account for the build user, with sudo access to run iurt. Nodes are scheduled using ssh and files are copied with scp, so the build user should be able to authenticate with a public key instead of an interactive password.
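
A hedged sketch of the cluster layout and the "built only if all mandatory architectures succeed" rule follows. The node and architecture names come from the paragraphs above; treating both architectures as mandatory is an assumption for the example.

 CLUSTER = {
     "i586":   ["n1", "n3", "n5"],
     "x86_64": ["seggie", "deborah"],
 }
 MANDATORY_ARCHS = ["i586", "x86_64"]   # assumption: both architectures are mandatory

 def package_built(results):
     """results maps an architecture name to True/False for one source rpm."""
     return all(results.get(arch, False) for arch in MANDATORY_ARCHS)

 # the x86_64 build failed, so the package is not considered built
 print(package_built({"i586": True, "x86_64": False}))   # False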


Final upload


The scheduler


Queue management

Packages wait in the input queue until they are scheduled. At each run the scheduler checks, for each package in the queue, whether a build node is available; if a build bot is free, it dispatches the build job to that node, otherwise it exits.
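
A minimal sketch of this decision logic is shown below. The is_free() and dispatch() helpers are hypothetical placeholders; this is not the real Ulri code, only the behaviour described above.

 import sys

 def scheduler_run(input_queue, nodes):
     """One cron-driven pass over the input queue."""
     for package in list(input_queue):
         node = next((n for n in nodes if n.is_free()), None)   # hypothetical helper
         if node is None:
             sys.exit(0)            # no free build bot: wait for the next run
         node.dispatch(package)     # hypothetical: ssh to the node and start iurt
         input_queue.remove(package)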


Scheduling algorithm
