Optimizing urpmi
From Mandriva Community Wiki
Here are some ideas and ramblings about possible ways of optimizing urpmi and related tools. -Per Øyvind
Dependency optimization
Claudio suggested ways to optimize dependencies in his earlier review. Unfortunately, that approach is very intrusive and also removes possibly useful package metadata. An alternative, non-intrusive way of optimizing dependencies for the dependency solver can be accomplished rather easily: resolve all dependencies in the existing hdlist/synthesis to direct package dependencies, remove the redundant ones, and write the result to a new synthesis file.
This way urpmi will be faster, as it doesn't have to resolve all canonical and redundant dependencies, while the synthesis will be a lot smaller, as all redundant dependencies and unused provides are dropped.
Example:
Original synthesis
@provides@libecalbackendplanner.so.0()(64bit)@libeds-plugin.so()(64bit)@liborg-gnome-planner-source.so()(64bit)@planner-evolution[== 0.14.2-6mdv2007.1]
@requires@planner[== 0.14.2-6mdv2007.1]@libICE.so.6()(64bit)@libORBit-2.so.0()(64bit)@libSM.so.6()(64bit)@libart_lgpl_2.so.2()(64bit)@libatk-1.0.so.0()(64bit)@libbonobo-2.so.0()(64bit)@libbonobo-activation.so.4()(64bit)@libbonoboui-2.so.0()(64bit)@libc.so.6()(64bit)@libc.so.6(GLIBC_2.2.5)(64bit)@libcairo.so.2()(64bit)@libcamel-1.2.so.10()(64bit)@libcamel-provider-1.2.so.10()(64bit)@libdl.so.2()(64bit)@libebook-1.2.so.9()(64bit)@libecal-1.2.so.7()(64bit)@libedata-cal-1.2.so.6()(64bit)@libedataserver-1.2.so.9()(64bit)@libeutil.so.0()(64bit)@libgconf-2.so.4()(64bit)@libgdk-x11-2.0.so.0()(64bit)@libgdk_pixbuf-2.0.so.0()(64bit)@libglib-2.0.so.0()(64bit)@libgmodule-2.0.so.0()(64bit)@libgnome-2.so.0()(64bit)@libgnome-keyring.so.0()(64bit)@libgnomecanvas-2.so.0()(64bit)@libgnomeui-2.so.0()(64bit)@libgnomevfs-2.so.0()(64bit)@libgobject-2.0.so.0()(64bit)@libgthread-2.0.so.0()(64bit)@libgtk-x11-2.0.so.0()(64bit)@libm.so.6()(64bit)@libpango-1.0.so.0()(64bit)@libpangocairo-1.0.so.0()(64bit)@libpangoft2-1.0.so.0()(64bit)@libplanner-1.so.0()(64bit)@libpopt.so.0()(64bit)@libpthread.so.0()(64bit)@librt.so.1()(64bit)@libxml2.so.2()(64bit)@rtld(GNU_HASH)
@summary@Planner evolution support
@info@planner-evolution-0.14.2-6mdv2007.1.x86_64@0@68782@Office
Optimized synthesis
@requires@evolution@lib64GConf2_4@lib64ORBit2_0@lib64art_lgpl2@lib64atk1.0_0@lib64bonobo2_0@lib64bonoboui2_0@lib64cairo2@lib64camel-provider10@lib64camel10@lib64ebook9@lib64ecal7@lib64edata-cal6@lib64edataserver9@lib64gdk_pixbuf2.0_0@lib64glib2.0_0@lib64gnome-keyring0@lib64gnome-vfs2_0@lib64gnome2_0@lib64gnomecanvas2_0@lib64gnomeui2_0@lib64gtk+-x11-2.0_0@lib64ice6@lib64pango1.0_0@lib64planner-1_0@lib64popt0@lib64sm6@lib64xml2@planner
@summary@Planner evolution support
@info@planner-evolution-0.14.2-6mdv2007.1.x86_64@0@68782@Office
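The rewrite from capability requires to direct package requires could look roughly like the Python sketch below. It is not the implementation mentioned on this page. The function names (parse_entry, build_providers, optimize_requires, format_requires) are made up for illustration, and it assumes that each synthesis tag sits on its own line and that the first package providing a capability is the one to pick. It only collapses several capabilities onto one package name and drops unresolvable ones; pruning packages already pulled in by another required package is not shown.

```python
def parse_entry(lines):
    """Parse the '@tag@value@value...' lines of one synthesis entry into a dict."""
    entry = {}
    for line in lines:
        if not line.strip():
            continue
        _, tag, *values = line.strip().split("@")
        entry[tag] = values
    return entry


def strip_version(capability):
    """Drop a version constraint such as 'planner[== 0.14.2-6mdv2007.1]'."""
    return capability.split("[", 1)[0]


def package_name(entry):
    """Extract the package name from the name-version-release.arch info field."""
    return entry["info"][0].rsplit("-", 2)[0]


def build_providers(entries):
    """Map every provided capability to the first package that provides it."""
    providers = {}
    for entry in entries:
        name = package_name(entry)
        providers.setdefault(name, name)
        for capability in entry.get("provides", []):
            providers.setdefault(strip_version(capability), name)
    return providers


def optimize_requires(entry, providers):
    """Return the sorted, de-duplicated package names that satisfy the requires.

    Capabilities without a known provider, and requires satisfied by the
    package itself, are dropped in this sketch.
    """
    own_name = package_name(entry)
    packages = set()
    for capability in entry.get("requires", []):
        package = providers.get(strip_version(capability))
        if package is not None and package != own_name:
            packages.add(package)
    return sorted(packages)


def format_requires(packages):
    """Render a package list as a synthesis '@requires@' line."""
    return "@requires@" + "@".join(packages)
```

Running build_providers over every entry of a synthesis first, then optimize_requires on each entry, gives one short requires line per package, similar in shape to the optimized example.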
Original size
-rw-r--r-- 1 peroyvind peroyvind 595K 2007-07-27 17:31 synthesis.hdlist_main.cz
New size
-rw-r--r-- 1 peroyvind peroyvind 244K 2007-09-26 18:02 testsynth.gz
-rw-r--r-- 1 peroyvind peroyvind 163K 2007-08-22 13:38 testsynth.lzma
Latest
With a new, better and almost safe implementation written in Python, the latest results are:
-rw-r--r-- 1 peroyvind peroyvind 200K 2007-10-17 02:08 newsynt6.lzma
-rw-r--r-- 1 peroyvind peroyvind 294K 2007-10-17 01:48 newsynt6.gz
-rw-r--r-- 1 peroyvind peroyvind 1,3M 2007-10-17 17:28 newsynt6
-rw-r--r-- 1 peroyvind peroyvind 390K 2007-10-18 01:42 syntmain.lzma
-rw-r--r-- 1 peroyvind peroyvind 594K 2007-10-11 21:26 syntmain.gz
-rw-r--r-- 1 peroyvind peroyvind 3,5M 2007-10-17 17:20 syntmain
With descriptions added:
-rw-r--r-- 1 peroyvind peroyvind 499K 2007-10-17 17:29 newsynt6desc.lzma
-rw-r--r-- 1 peroyvind peroyvind 760K 2007-10-17 17:29 newsynt6desc.gz
-rw-r--r-- 1 peroyvind peroyvind 2,9M 2007-10-17 17:29 newsynt6desc
-rw-r--r-- 1 peroyvind peroyvind 690K 2007-10-17 17:20 syntmaindesc.lzma
-rw-r--r-- 1 peroyvind peroyvind 1,1M 2007-10-17 17:20 syntmaindesc.gz
-rw-r--r-- 1 peroyvind peroyvind 5,1M 2007-10-17 17:20 syntmaindesc
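To put the listings above into relative terms, the reduction can be computed with a few lines of Python. This sketch assumes that the syntmain* files are the unmodified synthesis and the newsynt6* files the optimized one, which the file names suggest but the listing does not state. Only the pairs where both sizes are given in KB are used.

```python
def reduction(original_kb, optimized_kb):
    """Return the size reduction as a percentage of the original size."""
    return 100.0 * (1 - optimized_kb / original_kb)


# (original, optimized) sizes in KB, copied from the listings above.
# Assumption: syntmain* = original synthesis, newsynt6* = optimized synthesis.
sizes = {
    "gz": (594, 294),
    "lzma": (390, 200),
    "lzma, with descriptions": (690, 499),
}

for label, (original, optimized) in sizes.items():
    print(f"{label}: {reduction(original, optimized):.1f}% smaller")
```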
Code
The synthesis above was generated with a bash shell script, which did not really produce a usable synthesis but at least demonstrated the size difference.
Bash script
Proof of concept: Image:Syntoptimize.sh (I don't know how to include it any other way, so it's included as a picture).
A better, working version written in Python is coming soon.
Questions
- How much speed is actually gained by this?
- What difference does it make to optimize only the synthesis and not the actual packages/rpmdb?
Issues
- Metadata not used within the distro may still be used by third-party repositories/packages.
Presolved dependencies
Solving all required dependencies up front would avoid the need to resolve them on the user side and thus improve speed. Implementing a utility for this in C would improve speed a lot more, as Perl carries much of the responsibility for the slow speed.
With a list size of less than 200KB (see below), downloading these lists should be quite fast.
Example
[peroyvind@localhost ~]$ echo @`urpmq -d gcc-gfortran`|sed -e "s# #@#g"
@ash@bash@binutils@bzip2@chkconfig@coreutils@cracklib-dicts@findutils@gawk@gcc@gcc-cpp@gcc-gfortran@glibc@glibc-devel@grep@info-install@lib64acl1@lib64attr1@lib64audit0@lib64binutils2@lib64bzip2_1@lib64crack2@lib64db4.2@lib64gmp3@lib64mpfr1@lib64pam0@lib64pcre0@lib64termcap2@libgcc1@libgfortran2@libstdc++6@lzma@mktemp@pam@perl-base@rpm-helper@setup@shadow-utils@update-alternatives
Size:
-rw-r--r-- 1 peroyvind peroyvind 12M 2007-09-25 15:03 tmp/deps2.txt
-rw-r--r-- 1 peroyvind peroyvind 848K 2007-09-25 14:57 tmp/deps2.txt.gz
-rw-r--r-- 1 peroyvind peroyvind 169K 2007-09-25 14:57 tmp/deps2.txt.lzma
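Presolving amounts to computing, for each package, the transitive closure of its direct package requires when the list is generated. The Python sketch below illustrates that idea only. The direct_requires mapping and the function names are hypothetical, and the output format simply mimics the '@'-separated urpmq example above (which includes the package itself).

```python
from collections import deque


def presolve(package, direct_requires):
    """Return the sorted transitive closure of a package's requires.

    `direct_requires` maps a package name to the names of the packages it
    requires directly. The package itself is included in the result, as in
    the urpmq output above. Cycles are handled by tracking visited names.
    """
    seen = {package}
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for dependency in direct_requires.get(current, ()):
            if dependency not in seen:
                seen.add(dependency)
                queue.append(dependency)
    return sorted(seen)


def format_line(packages):
    """Render a package list in the '@'-separated form used above."""
    return "@" + "@".join(packages)
```

With such a list, the client side only needs a lookup instead of a dependency walk, at the cost of regenerating the list whenever the repository changes.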
Utility
Recipe
- rpmlib
- libcurl
- liblzma(dec)
Ramblings
- Avoid unnecessary rpm checks by doing them when generating the list (only include the md5sum of the verified file in the list)
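As a rough illustration of the md5sum idea, the client would only have to compare a downloaded file against the checksum recorded when the list was generated. This is a sketch in Python using hashlib. The function names are made up, and how the checksum would be stored in the list is left open here, just as it is above.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_list(path, expected_md5):
    """Return True if the file matches the checksum recorded in the list."""
    return md5sum(path) == expected_md5.strip().lower()
```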

