It seems that git annex status is much slower than git status, at least in direct mode. The man page does not give any hint about why it should be slower. Does git annex status do something that git status does not?
Here is an example in a repo with 8000+ files in direct mode and with no modified files:
$ time git -c core.bare=false status --porcelain > /dev/null
real 0m0.096s
user 0m0.042s
sys 0m0.071s
$ time git annex status
real 0m17.144s
user 0m10.555s
sys 0m1.934s
It is strange to see that git annex status is ~200 times slower than plain git status.
git status looks at the index and work tree. In an indirect mode repository, git annex status does too, and is not significantly slower.
In direct mode, git annex status has to look up from git the key that corresponds to each file in the work tree. This is the main thing that slows it down. (See the code for details, it's quite clear.)
The best workaround is probably to pass git-annex status a subdirectory that you're interested in, so that it only has to look at the contents of that one directory.
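To see how much the subdirectory workaround helps in a given repository, the two invocations can be timed side by side. This is only a sketch: compare_status is a made-up helper name, the directory argument is whatever subdirectory you care about, and the time keyword assumes a bash-like shell.

```shell
# Hypothetical helper, not part of git-annex.
# Usage (from inside the repository): compare_status some/subdir
compare_status() {
    echo "whole repository:"
    time git annex status > /dev/null
    echo "only $1:"
    # Limiting status to one directory means only the files
    # under it need their keys looked up.
    time git annex status "$1" > /dev/null
}
```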
If git annex status should take more time in direct mode, then what I'm experiencing is strange. On Windows, every 100M file adds approximately 1 second to the status duration (on my laptop), but on Linux it does not. On Linux, git annex status takes milliseconds even in direct mode. What is wrong with my setup?
The sizes of the files should not affect how fast git-annex status runs.
But direct mode certainly does. git-annex has to do significantly more work in direct mode to figure out the status of a file, including querying git. In indirect mode, it can just stat the symlink and see if its content is present, which is much faster.
(There are probably also some other inefficiencies on Windows.)
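The cheap indirect-mode check mentioned above (stat the symlink and see whether its content is present) can be imitated from the shell. This is a rough illustration, not git-annex's actual code; content_present is a hypothetical name, and it only assumes that an annexed file is a symlink which dangles when its content is missing.

```shell
# Hypothetical helper, not part of git-annex.
# Succeeds when FILE is a symlink whose target exists; fails for a
# dangling symlink or for anything that is not a symlink.
content_present() {
    # -L tests the link itself; -e follows it and tests the target.
    [ -L "$1" ] && [ -e "$1" ]
}
```

No call into git is needed for this test, which is why it stays fast however many files are checked.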