Tools/urpmi/Development/Synthesis

From Mandriva Community Wiki

Synthesis hdlist

This page describes the format of the synthesis.hdlist.cz index, which is generated by genhdlist.

Note: where hdlists/synthesis are built on mirrors, be aware that there are hard links from media/media_info/hdlist_cz to media/main/media_info/hdlist.cz

So if you rebuild your hdlist, don't forget to recreate the hard link. The best way to regenerate the hdlists for all media is to use gendistrib.

Parsing synthesis is easy. I wrote a parser in Python in half a day, without any documentation on the format and without special Python libraries. See the attachment.

Format

First, synthesis.hdlist.cz is compressed with gzip, which the name does not suggest. So the first thing to do is to decompress it with something like perl IO::Gzip or the Python gzip module.
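As a minimal sketch of that first step, the file can be read line by line with the standard Python gzip module. The function name read_synthesis_lines and the UTF-8 encoding are assumptions made for this example; they do not come from genhdlist or perl-URPM.

```python
import gzip


def read_synthesis_lines(path):
    """Yield the decompressed lines of a synthesis file, without newlines.

    Hypothetical helper for illustration only.
    """
    # "rt" opens the gzip stream in text mode. The encoding is an assumption;
    # errors="replace" keeps the reader from failing on unexpected bytes.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")
```

Calling list(read_synthesis_lines("synthesis.hdlist.cz")) would then give the raw @-delimited lines described below.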

The format is easy to understand, even though it is not documented at all. Most of the work is done by a Perl XS library, and the "description" can be found in the perl-URPM source code.

Here is a sample entry from the file:

@provides@openldap1[== 1.2.12-4mdk]
@requires@libldap1[== 1.2.12-4mdk]@rpm-helper[*]@/bin/sh[*]@/bin/sh[*]@bash@libc.so.6@libc.so.6(GLIBC_2.0)@libc.so.6(GLIBC_2.1)\
@libc.so.6(GLIBC_2.3)@libcrypt.so.1@libcrypt.so.1(GLIBC_2.0)@libnsl.so.1@libpthread.so.0@libpthread.so.0(GLIBC_2.0)@libpthread.so.0(GLIBC_2.1)\
@libpthread.so.0(GLIBC_2.3.2)@libresolv.so.2@libtermcap.so.2
@summary@LDAP servers and sample clients.
@info@openldap1-1.2.12-4mdk.i586@0@2054148@System/Servers

The lines always appear in the same order. At the very least, the @info@ line always marks the end of an entry.
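Since @info@ closes each entry, grouping lines into entries can be sketched as below. This is only an illustration: iter_entries is a hypothetical name, the requires line in the sample is shortened, and the code assumes each logical line is already a single string.

```python
def iter_entries(lines):
    """Group synthesis lines into dicts; each @info@ line closes one entry."""
    entry = {}
    for line in lines:
        if not line.startswith("@"):
            continue  # skip blank or unexpected lines
        # Everything up to the second "@" is the tag; the rest is the payload.
        tag, _, rest = line[1:].partition("@")
        entry[tag] = rest
        if tag == "info":
            yield entry
            entry = {}


sample = [
    "@provides@openldap1[== 1.2.12-4mdk]",
    "@requires@libldap1[== 1.2.12-4mdk]@rpm-helper[*]@/bin/sh[*]@bash",
    "@summary@LDAP servers and sample clients.",
    "@info@openldap1-1.2.12-4mdk.i586@0@2054148@System/Servers",
]
entries = list(iter_entries(sample))
```

Here entries holds one dict whose keys are the tags seen for that package. The payloads are left unparsed.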

The first field is always the type of the line. So far, it can be:

  • provides
  • requires
  • obsoletes
  • conflicts
  • summary
  • info

The first four tags (provides, requires, obsoletes, and conflicts) share the same scheme. Each is followed by one or more package names, sometimes with a version restriction in brackets (like package[== version]). The restriction can be <=, >=, or ==, as far as I have seen; the sample above also contains entries marked [*]. Multiple packages are separated by @.
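The dependency scheme described above could be parsed roughly as follows. This is a sketch, not the perl-URPM logic: parse_dependencies and the regular expression are assumptions for illustration, and the bracket content is returned as raw text rather than being interpreted.

```python
import re

# A name, optionally followed by one bracketed restriction at the end.
# Names may contain parentheses, as in libc.so.6(GLIBC_2.0).
_DEP_RE = re.compile(r"^(?P<name>.+?)(?:\[(?P<restriction>[^\]]*)\])?$")


def parse_dependencies(rest):
    """Split the text after the tag into (name, restriction) tuples.

    restriction is None when the item has no brackets.
    """
    deps = []
    for item in rest.split("@"):
        if not item:
            continue  # ignore empty fields, e.g. from a trailing "@"
        match = _DEP_RE.match(item)
        deps.append((match.group("name"), match.group("restriction")))
    return deps


deps = parse_dependencies(
    "libldap1[== 1.2.12-4mdk]@rpm-helper[*]@bash@libc.so.6(GLIBC_2.0)"
)
```

For this input, deps contains ("libldap1", "== 1.2.12-4mdk"), ("rpm-helper", "*"), ("bash", None) and ("libc.so.6(GLIBC_2.0)", None).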

The summary line is simple too: the tag is followed by the summary text, on a single line.

The last line is info, which is split like this:

@info@name-version-release.arch@epoch@size@group@

As most names are self-explanatory, I will not explain them in detail. arch is src for a src.rpm; otherwise it is a value such as i586, ppc, or noarch. Size is in bytes, and the group is the rpm group, as listed in Mandriva Groups.
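Splitting the info line into its fields might look like the sketch below. parse_info is a hypothetical helper. It assumes the arch never contains a dot and that version and release never contain a hyphen, so the full name can be split from the right.

```python
def parse_info(rest):
    """Parse the text that follows "@info@" into a dict of fields."""
    fields = rest.split("@")
    # Only the first four fields are used, so a trailing "@" is harmless.
    fullname, epoch, size, group = fields[:4]
    # Split from the right: the package name itself may contain "-" or ".".
    nvr, arch = fullname.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return {
        "name": name,
        "version": version,
        "release": release,
        "arch": arch,
        "epoch": int(epoch),
        "size": int(size),  # size in bytes
        "group": group,
    }


info = parse_info("openldap1-1.2.12-4mdk.i586@0@2054148@System/Servers")
```

With the sample entry, info["name"] is "openldap1", info["version"] is "1.2.12", info["release"] is "4mdk" and info["arch"] is "i586".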

Problem with synthesis

However, one problem remains.

What if an rpm includes a @ in its name or in its description?

Right now, nothing prevents this: genhdlist will crash, and most of the tools using synthesis will be broken by this bug.
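To illustrate the ambiguity from a parser's point of view, here is a small sketch with a made-up summary line (the text is hypothetical, not taken from a real package). A naive split on @ produces one field too many. Splitting only at the first separator keeps a summary intact, but that would not help for a package name in a dependency line, where @ also separates the items.

```python
# Hypothetical line whose summary text contains the delimiter.
line = "@summary@Mail client for user@host addresses"

# Naive split: the summary is cut in two at the embedded "@".
naive_fields = line.split("@")

# Splitting only once after the tag keeps the summary text whole.
tag, _, text = line[1:].partition("@")
```

Here naive_fields has four elements instead of the expected three, while text still holds the complete summary.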
