Optimizing urpmi

From Mandriva Community Wiki


Here are some ideas and ramblings about possible ways of optimizing urpmi and related stuff. -Per Øyvind


Dependency optimization

Claudio suggested a way to optimize dependencies in his earlier review. Unfortunately, that approach is very intrusive and also removes package metadata that may be useful. An alternative, non-intrusive way of optimizing dependencies for the dependency solver is fairly easy to achieve: resolve all dependencies to direct package dependencies and remove the redundant ones, writing the result from the existing hdlist/synthesis to a new synthesis file.

This way urpmi will be faster, as it no longer has to resolve all canonical and redundant dependencies, and the synthesis will be a lot smaller, as all redundant dependencies and unused provides are dropped.
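The idea above could be sketched roughly as follows. This is only a minimal illustration in Python, not the actual implementation: the `PACKAGES` dictionary, the function names and the "first provider wins" rule are all assumptions made for the example, and a real solver would have to handle alternative providers and version constraints properly.

```python
import re

# Hypothetical in-memory view of a synthesis: package name -> provides/requires.
PACKAGES = {
    "planner-evolution": {
        "provides": ["planner-evolution[== 0.14.2-6mdv2007.1]"],
        "requires": [
            "planner[== 0.14.2-6mdv2007.1]",
            "libc.so.6()(64bit)",
            "libc.so.6(GLIBC_2.2.5)(64bit)",
            "libglib-2.0.so.0()(64bit)",
        ],
    },
    "planner": {"provides": ["planner[== 0.14.2-6mdv2007.1]"], "requires": []},
    "glibc": {
        "provides": ["libc.so.6()(64bit)", "libc.so.6(GLIBC_2.2.5)(64bit)"],
        "requires": [],
    },
    "lib64glib2.0_0": {"provides": ["libglib-2.0.so.0()(64bit)"], "requires": []},
}


def strip_version(capability):
    """Drop a trailing version constraint such as '[== 0.14.2-6mdv2007.1]'."""
    return re.sub(r"\[.*\]$", "", capability)


def build_provider_index(packages):
    """Map every package name and provided capability to one providing package.

    Assumption: the first provider seen wins; alternatives are ignored.
    """
    index = {}
    for name, meta in packages.items():
        index.setdefault(name, name)
        for cap in meta["provides"]:
            index.setdefault(strip_version(cap), name)
    return index


def optimize_requires(name, packages, index):
    """Resolve required capabilities to package names, dropping duplicates."""
    resolved = set()
    for cap in packages[name]["requires"]:
        provider = index.get(strip_version(cap))
        if provider is None:
            # Unresolvable capability: keep it verbatim rather than lose it.
            resolved.add(cap)
        elif provider != name:
            resolved.add(provider)
    return sorted(resolved)


index = build_provider_index(PACKAGES)
print(optimize_requires("planner-evolution", PACKAGES, index))
```

In this toy data the two `libc.so.6` capabilities collapse into a single `glibc` entry, which is the kind of redundancy the optimized synthesis gets rid of.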


Example:

Original synthesis

@provides@libecalbackendplanner.so.0()(64bit)@libeds-plugin.so()(64bit)@liborg-gnome-planner-source.so()(64bit)@planner-evolution[== 0.14.2-6mdv2007.1]
@requires@planner[== 0.14.2-6mdv2007.1]@libICE.so.6()(64bit)@libORBit-2.so.0()(64bit)@libSM.so.6()(64bit)@libart_lgpl_2.so.2()(64bit)@libatk-1.0.so.0()(64bit)@libbonobo-2.so.0()(64bit)@libbonobo-activation.so.4()(64bit)@libbonoboui-2.so.0()(64bit)@libc.so.6()(64bit)@libc.so.6(GLIBC_2.2.5)(64bit)@libcairo.so.2()(64bit)@libcamel-1.2.so.10()(64bit)@libcamel-provider-1.2.so.10()(64bit)@libdl.so.2()(64bit)@libebook-1.2.so.9()(64bit)@libecal-1.2.so.7()(64bit)@libedata-cal-1.2.so.6()(64bit)@libedataserver-1.2.so.9()(64bit)@libeutil.so.0()(64bit)@libgconf-2.so.4()(64bit)@libgdk-x11-2.0.so.0()(64bit)@libgdk_pixbuf-2.0.so.0()(64bit)@libglib-2.0.so.0()(64bit)@libgmodule-2.0.so.0()(64bit)@libgnome-2.so.0()(64bit)@libgnome-keyring.so.0()(64bit)@libgnomecanvas-2.so.0()(64bit)@libgnomeui-2.so.0()(64bit)@libgnomevfs-2.so.0()(64bit)@libgobject-2.0.so.0()(64bit)@libgthread-2.0.so.0()(64bit)@libgtk-x11-2.0.so.0()(64bit)@libm.so.6()(64bit)@libpango-1.0.so.0()(64bit)@libpangocairo-1.0.so.0()(64bit)@libpangoft2-1.0.so.0()(64bit)@libplanner-1.so.0()(64bit)@libpopt.so.0()(64bit)@libpthread.so.0()(64bit)@librt.so.1()(64bit)@libxml2.so.2()(64bit)@rtld(GNU_HASH)
@summary@Planner evolution support
@info@planner-evolution-0.14.2-6mdv2007.1.x86_64@0@68782@Office


Optimized synthesis

@requires@evolution@lib64GConf2_4@lib64ORBit2_0@lib64art_lgpl2@lib64atk1.0_0@lib64bonobo2_0@lib64bonoboui2_0@lib64cairo2@lib64camel-provider10@lib64camel10@lib64ebook9@lib64ecal7@lib64edata-cal6@lib64edataserver9@lib64gdk_pixbuf2.0_0@lib64glib2.0_0@lib64gnome-keyring0@lib64gnome-vfs2_0@lib64gnome2_0@lib64gnomecanvas2_0@lib64gnomeui2_0@lib64gtk+-x11-2.0_0@lib64ice6@lib64pango1.0_0@lib64planner-1_0@lib64popt0@lib64sm6@lib64xml2@planner
@summary@Planner evolution support
@info@planner-evolution-0.14.2-6mdv2007.1.x86_64@0@68782@Office
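For anyone who wants to play with entries like the two above, here is a small Python sketch of how such records might be read. It assumes one `@tag@value@value...` line per field and that each package record is closed by its `@info@` line; `parse_synthesis` and `SAMPLE` are names made up for this example, not part of urpmi.

```python
# Shortened sample record, assuming one tag per line.
SAMPLE = [
    "@requires@planner@lib64xml2",
    "@summary@Planner evolution support",
    "@info@planner-evolution-0.14.2-6mdv2007.1.x86_64@0@68782@Office",
]


def parse_synthesis(lines):
    """Yield one dict per package, mapping each tag to its list of values.

    Assumption: a record ends with its @info@ line.
    """
    record = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("@"):
            continue  # skip blank or unexpected lines
        tag, *values = line[1:].split("@")
        record[tag] = values
        if tag == "info":
            yield record
            record = {}


for package in parse_synthesis(SAMPLE):
    print(package["info"][0], "requires", len(package["requires"]), "packages")
```

Counting the values under `requires` before and after optimization gives a quick feel for how much each entry shrinks.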


Original size

-rw-r--r-- 1 peroyvind peroyvind 595K 2007-07-27 17:31 synthesis.hdlist_main.cz

New size

-rw-r--r-- 1 peroyvind peroyvind 244K 2007-09-26 18:02 testsynth.gz

-rw-r--r-- 1 peroyvind peroyvind 163K 2007-08-22 13:38 testsynth.lzma

Latest

With a new, better and almost safe implementation written in Python, the latest results are:
-rw-r--r-- 1 peroyvind peroyvind 200K 2007-10-17 02:08 newsynt6.lzma
-rw-r--r-- 1 peroyvind peroyvind 294K 2007-10-17 01:48 newsynt6.gz
-rw-r--r-- 1 peroyvind peroyvind 1,3M 2007-10-17 17:28 newsynt6
-rw-r--r-- 1 peroyvind peroyvind 390K 2007-10-18 01:42 syntmain.lzma
-rw-r--r-- 1 peroyvind peroyvind 594K 2007-10-11 21:26 syntmain.gz
-rw-r--r-- 1 peroyvind peroyvind 3,5M 2007-10-17 17:20 syntmain


With descriptions added:
-rw-r--r-- 1 peroyvind peroyvind 499K 2007-10-17 17:29 newsynt6desc.lzma
-rw-r--r-- 1 peroyvind peroyvind 760K 2007-10-17 17:29 newsynt6desc.gz
-rw-r--r-- 1 peroyvind peroyvind 2,9M 2007-10-17 17:29 newsynt6desc
-rw-r--r-- 1 peroyvind peroyvind 690K 2007-10-17 17:20 syntmaindesc.lzma
-rw-r--r-- 1 peroyvind peroyvind 1,1M 2007-10-17 17:20 syntmaindesc.gz
-rw-r--r-- 1 peroyvind peroyvind 5,1M 2007-10-17 17:20 syntmaindesc
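Comparisons like the ones listed above can be reproduced with a few lines of Python. This is just a sketch under assumptions: the figures above were presumably produced with the command line tools, and the helper name `compressed_sizes` and the sample data are invented for the example, so the numbers will not match exactly.

```python
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of data raw, gzip-compressed and lzma-compressed."""
    return {
        "raw": len(data),
        "gz": len(gzip.compress(data)),
        # FORMAT_ALONE writes the legacy .lzma container rather than .xz.
        "lzma": len(lzma.compress(data, format=lzma.FORMAT_ALONE)),
    }


# Stand-in for a real synthesis file; read the actual file's bytes instead.
sample = b"@requires@lib64glib2.0_0@lib64xml2@planner\n" * 2000
for kind, size in compressed_sizes(sample).items():
    print(f"{kind:>4}: {size} bytes")
```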

Code

The synthesis above was generated with a bash shell script, which didn't really produce a usable synthesis but at least demonstrated the difference.

Bash script

Proof of concept: Image:Syntoptimize.sh (I don't know how to include it any other way, so it's uploaded as an image).

A better, working version written in Python is coming soon.

Questions

  • How much speed is actually gained by this?
  • What difference does it make to optimize only the synthesis and not the actual packages/rpmdb?

Issues

  • Metadata that is not used within the distro may still be needed by third-party repositories/packages

Presolved dependencies

Solving all required dependencies up front would avoid the need to resolve them on the user's side and thus improve speed. Implementing a utility for this in C would improve speed a lot more, as Perl is responsible for much of the slowness.

With a compressed list size of less than 200KB (see below), downloading these lists should be quite fast.
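Presolving boils down to computing, for each package, everything it pulls in directly or indirectly. A rough Python sketch, assuming the direct dependencies are already resolved to package names (the `DIRECT` table and the function names are invented for illustration; the real data would come from the synthesis):

```python
from collections import deque

# Hypothetical direct dependencies, already resolved to package names.
DIRECT = {
    "gcc-gfortran": ["gcc", "libgfortran2"],
    "gcc": ["binutils", "gcc-cpp", "glibc-devel"],
    "glibc-devel": ["glibc"],
    "libgfortran2": ["glibc"],
}


def presolve(package, direct_deps):
    """Return the sorted transitive closure of dependencies, package included."""
    seen = {package}
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for dep in direct_deps.get(current, ()):
            if dep not in seen:  # also guards against dependency cycles
                seen.add(dep)
                queue.append(dep)
    return sorted(seen)


def format_entry(package, direct_deps):
    """Render the closure as one '@'-separated line."""
    return "@" + "@".join(presolve(package, direct_deps))


print(format_entry("gcc-gfortran", DIRECT))
```

With such a list shipped ready-made, the client would only need a lookup instead of walking the dependency graph itself.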


Example

[peroyvind@localhost ~]$ echo @`urpmq -d gcc-gfortran`|sed -e "s# #@#g"

@ash@bash@binutils@bzip2@chkconfig@coreutils@cracklib-dicts@findutils@gawk@gcc@gcc-cpp@gcc-gfortran@glibc@glibc-devel@grep@info-install@lib64acl1@lib64attr1@lib64audit0@lib64binutils2@lib64bzip2_1@lib64crack2@lib64db4.2@lib64gmp3@lib64mpfr1@lib64pam0@lib64pcre0@lib64termcap2@libgcc1@libgfortran2@libstdc++6@lzma@mktemp@pam@perl-base@rpm-helper@setup@shadow-utils@update-alternatives

Size:

-rw-r--r-- 1 peroyvind peroyvind 12M 2007-09-25 15:03 tmp/deps2.txt

-rw-r--r-- 1 peroyvind peroyvind 848K 2007-09-25 14:57 tmp/deps2.txt.gz

-rw-r--r-- 1 peroyvind peroyvind 169K 2007-09-25 14:57 tmp/deps2.txt.lzma

Utility

Recipe

  • rpmlib
  • libcurl
  • liblzma(dec)

Ramblings

  • Avoid unnecessary rpm checks by doing them when generating the list, and only include the md5sum of the verified file in the list
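The md5sum part could look something like this in Python. It is only a sketch: the `filename@md5` entry layout and both function names are assumptions for the example, since no list format for this has been decided.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=65536):
    """Compute the md5 hex digest of a binary stream without loading it whole."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def list_entry(filename, stream):
    """Build a hypothetical 'filename@md5' list entry for an already verified file."""
    return f"{filename}@{md5_of_stream(stream)}"


# In-memory stand-in for an rpm file; a real run would use open(path, "rb").
print(list_entry("example.rpm", io.BytesIO(b"hello")))
```

The client would then only have to compare the checksum of the downloaded file against the entry instead of redoing the full rpm checks.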