a very minor and probably should not happen in real-life since noone should check out git-annex branch (besides may be to fix up something manually) but I still do not see the point of scanning for unlocked/annexed files if init
ran in that branch. ATM (8.20210428-270-g651fe3f39) I see
$> bash annex-scanning-git-annex
> set -eu
>> mktemp -d /home/yoh/.tmp/dl-XXXXXXX
> cd /home/yoh/.tmp/dl-yHMtPGl
> git clone --branch git-annex http://datasets.datalad.org/.git out
Cloning into 'out'...
Fetching objects: 11612, done.
> cd out
> git branch
* git-annex
> git annex init
init (scanning for annexed files...)
ok
(recording state in git...)
real 0m0.622s
user 0m0.435s
sys 0m0.163s
so it reports scanning while in git-annex, although goes quickly (has only about 1000 keys there), so may be doesn't even actually do any scanning and it is just a matter of reporting?
Actually -- it made me think: is that scanning
branch specific? then what would happen if e.g. master has no unlocked files in the tree and some other branch has unlocked files in the tree -- I could checkout/switch between branches without causing git-annex
to redo its scanning and would require manual git annex init
?
I feel this is unnecessary complexity and I've optimised the scans quite a lot in the meantime, so wontfix --Joey
Well, the init scanning is quite optimised by now, and since it will find no annexed objects, there are no database writes needed, which are the slower part of that.
You will pay the price though when you later check out the master branch, since it then has to scan the delta. And that scanning is less optimised. It would be more beneficial to speed up that scanning (reconcileStaged), which should be doable by using the git cat-key --batch trick.
I think the effect is somewhat psychological; if it says it's doing a scan then people are going to feel on guard that it's expensive. I know that sometimes git avoids displaying certain progress messages unless it's determined the operation is going to take a long time (eg git status will show a progress display in some circumstances but not commonly). That could be useful here, and probably in other parts of git-annex.
Made the scanning message not be displayed unless it takes at least 1 second. Of course, if some test suite is still looking at that message, it will break..
That got implemented.