The original TODO issue was reincarnated (thank you, git log -S) and was originally marked fixed by 6.20160613-81-gc4229be9a, AKA 6.20160619~60. Using the standalone build of annex is notably (~30%) slower than using any other. I was stracing a run of a sample datalad test and looked inside to see e.g.
4188353 1595043333.316032 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
4188353 1595043333.316089 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
4188353 1595043333.316524 openat(AT_FDCWD, "/usr/lib/git-annex.linux/bin/git", O_RDONLY) = 3
4188353 1595043333.316818 openat(AT_FDCWD, "/usr/lib/git-annex.linux/shimmed/git/git", O_RDONLY|O_CLOEXEC) = 3
4188353 1595043333.316992 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv/tls/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317022 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv/tls/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317049 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv/tls/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317079 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv/tls/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317106 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317133 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317160 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317187 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317216 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/audit/tls/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317248 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/audit/tls/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317278 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/audit/tls/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317306 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/audit/tls/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317336 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/audit/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317367 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/audit/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317395 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/audit/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317426 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/audit/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317455 openat(AT_FDCWD, "/usr/lib/git-annex.linux//etc/ld.so.conf.d/tls/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317482 openat(AT_FDCWD, "/usr/lib/git-annex.linux//etc/ld.so.conf.d/tls/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317509 openat(AT_FDCWD, "/usr/lib/git-annex.linux//etc/ld.so.conf.d/tls/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317536 openat(AT_FDCWD, "/usr/lib/git-annex.linux//etc/ld.so.conf.d/tls/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317562 openat(AT_FDCWD, "/usr/lib/git-annex.linux//etc/ld.so.conf.d/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317589 openat(AT_FDCWD, "/usr/lib/git-annex.linux//etc/ld.so.conf.d/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317615 openat(AT_FDCWD, "/usr/lib/git-annex.linux//etc/ld.so.conf.d/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317642 openat(AT_FDCWD, "/usr/lib/git-annex.linux//etc/ld.so.conf.d/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317670 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib64/tls/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317697 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib64/tls/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317724 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib64/tls/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317750 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib64/tls/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317778 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib64/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317805 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib64/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317832 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib64/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317858 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317889 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib/x86_64-linux-gnu/tls/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317917 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib/x86_64-linux-gnu/tls/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317943 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib/x86_64-linux-gnu/tls/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317970 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib/x86_64-linux-gnu/tls/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.317997 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib/x86_64-linux-gnu/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318024 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib/x86_64-linux-gnu/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318051 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib/x86_64-linux-gnu/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318077 openat(AT_FDCWD, "/usr/lib/git-annex.linux//lib/x86_64-linux-gnu/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318105 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/tls/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318135 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/tls/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318162 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/tls/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318189 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/tls/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318216 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318243 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/haswell/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318270 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
4188353 1595043333.318299 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = 3
So there are over 40 unsuccessful attempts (47 in this trace) to find/open libpcre2-8.so.0. Other libraries might fare better or worse; I didn't check. This one alone starts at 1595043333.316992 and finally succeeds at 1595043333.318299, so ~1.3 msec just to find this single library in a single invocation of git-annex (and we do many of them). Maybe prelink or some other trick could be tried again? Or, if that is a lost effect of the original patch, could some regression test be added?
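The count and the elapsed time can be pulled out of such a trace mechanically. Below is a minimal sketch, assuming the trace was taken with strace options that produce the "pid timestamp syscall" layout shown above; the function name library_lookups and the regular expression are illustrative, not part of git-annex or datalad. The sample uses only the first miss, the last miss, and the final hit from the trace.

```python
import re
from decimal import Decimal

# Matches lines such as:
#   4188353 1595043333.316992 openat(AT_FDCWD, "/path/lib.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (...)
LINE_RE = re.compile(
    r'^(?P<pid>\d+)\s+(?P<ts>\d+\.\d+)\s+openat\([^,]+,\s*"(?P<path>[^"]+)"[^)]*\)'
    r'\s*=\s*(?P<ret>-?\d+)'
)

def library_lookups(lines, libname):
    """Return (failed_opens, seconds from first attempt to first success).

    The second element is None if the library was never opened successfully.
    Decimal is used so microsecond timestamps are not rounded by float.
    """
    failed = 0
    first_ts = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m or not m.group("path").endswith("/" + libname):
            continue
        ts = Decimal(m.group("ts"))
        if first_ts is None:
            first_ts = ts
        if int(m.group("ret")) < 0:
            failed += 1
        else:
            return failed, ts - first_ts
    return failed, None

# Three lines taken from the trace above: first miss, last miss, and the hit.
SAMPLE = [
    '4188353 1595043333.316992 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv/tls/haswell/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)',
    '4188353 1595043333.318270 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/x86_64/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)',
    '4188353 1595043333.318299 openat(AT_FDCWD, "/usr/lib/git-annex.linux//usr/lib/x86_64-linux-gnu/libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = 3',
]

failed, elapsed = library_lookups(SAMPLE, "libpcre2-8.so.0")
print(failed, elapsed)
```

Fed the complete trace instead of the three sample lines, the same function would report the full number of misses, and a check against a chosen threshold could be built on top of it.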
git annex build: 8.20200617+git192-g5849bd634-1~ndall+1
A test could, I guess, be as simple as counting those lookups and comparing against some number other than the ideal 2 (i.e. a direct hit right away ;)) but something less than the current 47.
Also, for some reason paths like ld.so.conf.d are consulted. I thought that maybe ld.so.conf could indeed be used to actually configure the paths for libraries to index at the Debian package installation level...
Interestingly, if I do directly what you are doing in the ld.so symlink shim, I get a "direct hit", but something, I guess, throws it off in the "top level" runshell (sorry, dunno a better word), since I do observe all those numerous attempts when stracing /usr/lib/git-annex.linux/git-annex.
And since the Debian standalone is a "static" installation, most likely all the dance in runshell could be avoided on every invocation: paths could be computed/recorded at installation time, and then the shim could be as lean as what I get with the direct invocation. The speedup on a random sample test is already obvious.
bundled git is also a victim on many counts (bringing locale question back as well)
For details of the older todo, including some timings with and without prelinking, see c4229be9a7a2318ef71b9ae433bc14bf604c9caf.
A ghc bug, since fixed, was causing it to look in, IIRC, thousands of unnecessary directories per library. This todo, by contrast, complains about fewer than 100 extra lookups total.
The way you run the shim does not put the bundled git in PATH. That kind of throws off your results, because git, not git-annex is what links to pcre. I was not actually able to reproduce your result of it finding pcre without any failed seeks, but perhaps it ran a different git binary than the ones I have access to. Anyway, that all seems like a bit of a red herring due to that problem and puts your timings in doubt too.
Simply moving all the libraries to a single directory would cut down on the failed seeking a lot.
The remaining /lib64/tls/x86_64/x86_64/ lookup seems like kind of weird behavior from ld-linux.so; I wonder if that's a bug on its part. The rest of the seeking seems reasonable. I guess these extra 14 seeks are not a major performance hit.
But that library consolidation does not seem to speed it up appreciably at all. Timings were almost identical before and after. 100 failed opens, when the cache is hot, is just not that much overhead compared with the script's overhead. I don't think it's even worth implementing the library consolidation based on this.
Running git-annex.linux/git-annex version takes 0.060s. Compare with around 0.030s to run /usr/bin/git-annex version. (Sometimes it runs in more like 0.020s, but not often. Probably have to catch the cache in the right mood.) So, runshell and the other 2 shell scripts have around 0.030s overhead themselves.
(This is with runshell modified so GIT_ANNEX_PACKAGE_INSTALL is set. IIRC the way the git-annex-standalone deb is built sets that. Otherwise, runshell does an additional 0.060s of locale setup stuff.)
I tried putting set -x in all 3 levels of shell scripts and time stamping the output to see what was expensive. This was a bit surprising, because other than the abovementioned locale setup stuff, all the rest of the set -x output happened within 0.000965s, which is 30 times faster than the timings above say it should be. Could be that the time stamping, which used ts, is not accurate enough.
Anyway, runshell uses several unix utilities (and at least dirname is run redundantly between the git-annex script and runshell), and there are 3 levels of shell scripts for the shell to parse and run. Combining them into a single shell script would eliminate some redundant work. Probably rewriting in C would be a bigger win.
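Since single runs vary with the state of the cache, one way to compare the two invocations is to time each several times and take the median. The following is only a sketch under that assumption; median_runtime and the run count are illustrative names and values, and the example call times a trivial Python start-up so that it runs anywhere.

```python
import statistics
import subprocess
import sys
import time

def median_runtime(cmd, runs=20):
    """Return the median wall-clock time in seconds of running cmd, output discarded."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# To compare the invocations discussed above (adjust paths to the local install):
#   median_runtime(["git-annex.linux/git-annex", "version"])
#   median_runtime(["/usr/bin/git-annex", "version"])
# Placeholder command so the sketch is runnable without git-annex installed:
print(f"{median_runtime([sys.executable, '-c', 'pass'], runs=5):.4f}s")
```

The median is used rather than the mean so that an occasional cold-cache run does not dominate the result.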
Quick comment: note that some directories seem to be considered twice; I guess some paths get listed multiple times. Sure, not major, but still wasteful if it could be avoided.
I do not think I did anything special. To reproduce, I installed the bleeding edge build (8.20200720.1+git52-gf5e65d680-1~ndall+1) from datalad-extensions, and it does not matter how I invoke it (via an outside git or directly through the git-annex bundle): I get those 46 failed lookups.
WOW, a random discovery and possible note to myself: I am getting all those libpcre misses when I run git annex version within a git repository and not otherwise.
Not sure why git annex version needs to run git when inside a git repo for its version. (The same happens for git annex --help.)
That was with a slightly dated version (8.20200501+git61-g64e081d58-1~ndall+1); with 8.20200720.1-1~ndall+1 it looks a bit better,
but still a more pertinent test/demonstration of current situation would be
It is premature optimisation to try to reduce these seeks when they have not been shown to reduce performance, and at least in my tests, have been shown to not affect it to the limits I can measure.
I don't think it would be safe to consolidate all the libs into one directory, as I did in my test, because it seems possible that there could be two versions of a .so file with the same name, in e.g. lib64/tls and lib64/x86_64. Installing only one of them would either lose actual optimisations in the other .so file, or cause breakage. I don't know when those optimised .so files exist, but implementing this consolidation would make the build more fragile.
I don't know if prelink(8) is able to prelink things such that they can be relocated, as git-annex.linux can. I think it hardcodes a fixed path in the binary. Similarly, ld.so.cache (which is why this seeking doesn't happen on Debian AFAICS, not prelinking) contains a list of directories and the libraries in them, so unless it were created by runshell the first time, it would not work. And creating it by runshell the first time would make the thing more complicated, and thus more fragile. (Also /sbin/ldconfig is 1 MB in size so would increase the bundle rather a lot.) Also, I benchmarked prelinking back in c4229be9a7a2318ef71b9ae433bc14bf604c9caf and the speedup was not measurable.
So these are not appealing with the information I have. It's a very different situation than the ghc bug that was adding one directory to rpath for every haskell library, which had an easily measurable performance impact because there are hundreds of those libraries and 30000 seeks did add up to a measurable time.
git-annex init runs git something like 30 times, so it's close to the worst case for a single git-annex command, other than when smudge filters are run.
I tried inlining runshell into both git and git-annex scripts, thinking that the overhead of starting the second shell script might be measurable. It was not; I saw git-annex init taking 0.10-0.14s before and after.
I also tried trimming out some parts of the script that normally don't run, like the android support, but that didn't speed it up.
With the consolidated lib dirs, I did see git-annex init drop to 0.07-0.10s.
Ok, I found a way to consolidate the directories that will not include directories for optimised libs. Implemented that.
Setting LD_HWCAP_MASK=0 prevents the linker from looking for hardware optimised libraries. I compared with and without it, and there was a savings of 280 failed seeks across git-annex init.
In particular, it skips ones with a double x86_64 in the path, while it still does some others which I would have thought would also be disabled.
And it eliminates looking for libpcre twice in the same place (git-annex.linux//lib/x86_64-linux-gnu/tls/x86_64/libpcre2-8.so.0), which it otherwise does while linking git.
This is some weird behavior from the linker. I get the impression that it's looking in x86_64 for two different reasons, hwcap and something else. And tls is not being filtered either. Found this message that I think explains why: https://sourceware.org/pipermail/libc-alpha/2020-May/113878.html
So, the kernel provides hwcaps, which we can filter, but then these fake ones are added on top of it.
It seems likely that's a bug... What if I actually had a good reason to want to mask out those libraries from being used, and the linker used them anyway?
Anyway, that reduces the runtime to 0.08-0.09s from 0.10-0.11s.
Enabling this would need some way to detect that there are no hwcap optimised libs being included in the bundle, otherwise it would probably be a pessimisation.
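The arithmetic behind the seek counts can be illustrated with a small model of the search order seen in the strace output of the original report. This is a sketch that reproduces the pattern in that trace only; it is not taken from the glibc sources, and the names hwcap_subdirs and candidate_paths are made up for illustration. The directory list and the component order (tls, haswell, x86_64) are read off the trace.

```python
from itertools import product

# Directories searched under the bundle root, in the order they appear in the
# strace output of the original report.
SEARCH_DIRS = [
    "usr/lib/x86_64-linux-gnu/gconv",
    "usr/lib/x86_64-linux-gnu/audit",
    "etc/ld.so.conf.d",
    "lib64",
    "lib/x86_64-linux-gnu",
    "usr/lib/x86_64-linux-gnu",
]

def hwcap_subdirs(caps):
    """Every ordered combination of caps, most specific first, ending with ''."""
    subdirs = []
    for mask in product([True, False], repeat=len(caps)):
        subdirs.append("/".join(c for c, used in zip(caps, mask) if used))
    return subdirs

def candidate_paths(search_dirs, caps, libname):
    """Paths probed for libname, in order, relative to the bundle root."""
    paths = []
    for d in search_dirs:
        for sub in hwcap_subdirs(caps):
            # Skip the empty component when no capability subdirectory is used.
            paths.append("/".join(p for p in (d, sub, libname) if p))
    return paths

with_caps = candidate_paths(SEARCH_DIRS, ["tls", "haswell", "x86_64"], "libpcre2-8.so.0")
no_caps = candidate_paths(SEARCH_DIRS, [], "libpcre2-8.so.0")
print(len(with_caps), len(no_caps))
```

With three components each directory costs eight probes, so a library sitting directly in the sixth directory is found on the 48th attempt, after 47 misses. With no components at all it would be one probe per directory, which is why filtering them out, or consolidating the directories, reduces the count.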
Just a brief one:
Have a look at the original issue description. They introduced 1.3ms to seek a single library... We saved about 30% of runtime switching Travis CI from standalone to conda build of annex. So performance reduction is obvious IMHO.
Awesome! I have restarted the gh workflows job which runs the prospectively added test running annex init and counting missing seeks for libpcre (was over 90)... Unfortunately the regular git annex test failed before it tried the added "custom" ones (I should isolate them, I guess), so dunno what the status is on that github action.
FTR: testing of git annex started to show some failures 2 days ago: https://github.com/datalad/datalad-extensions/actions/runs/187943962 (although only in crippled fs runs, and it might be just a sign of flakiness), so not necessarily a fresh one. I have yet to download a freshly built package and see what it brings locally.
Already reported here and fixed by Joey with 5a5873e05 (fix bug caught by test suite, 2020-07-31).
Kyle, get ready for me to use a new word I just have learned!
As for "... included in the bundle": for the neurodebian git-annex-standalone, ideally any check of that kind should be done at package build time, since that is when it could check just once and avoid any checks later when installed.
Implemented the LD_HWCAP_MASK=0 optimisation, which left only a few failed opens for libraries.
There are more failed opens now for locale files for commands like grep when running it than there are for libraries. So, no need to consider further prelinking.
I think that rewriting runshell in C would be the logical next choice, but dunno if it would speed it up by enough to be worth the effort. So I'm going to close this now.