When git-annex is updating the git-annex branch, it currently uses a separate index file. This adds overhead and complexity to the code. Especially when there are many files, the index file gets large and writing it gets slow.
It might be an improvement to use `git mktree --batch` to inject a
tree object into git, without using the index file. `git hash-object`
is already used to add the files to git. All that would be needed is to
generate an updated tree containing the new file(s), and then update each
parent tree up to the root tree. This new tree can then be committed using
`git commit-tree`.
The only thing I can see that might make this slow at all is reading the old
tree contents, in order to update them. This would need a `git ls-tree` for
each tree; it does not have a batch mode, so 4 processes would need to be
spawned when generating a tree that changes 1 file. For any repo that's not
very small, that's probably still faster than rewriting the index file.
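
To make the idea concrete, here is a rough sketch in Python that models git's object store in memory rather than calling git. It is not git-annex code. `ObjectStore`, `update_path` and the `aaa/bbb/key1.log` path are made-up names for illustration. `read_tree` stands in for a `git ls-tree` call and `write_tree` for `git mktree`; the object encoding follows git's SHA-1 object format.

```python
import hashlib

TREE_MODE = "40000"   # mode git stores for a subtree entry
FILE_MODE = "100644"  # mode git stores for a regular file


class ObjectStore:
    """In-memory stand-in for git's object database (hypothetical helper)."""

    def __init__(self):
        self.objects = {}    # hex sha -> (object type, payload bytes)
        self.tree_reads = 0  # each read models one `git ls-tree` process

    def write(self, obj_type, payload):
        # Git hashes "<type> <size>\0" followed by the payload.
        header = f"{obj_type} {len(payload)}".encode() + b"\0"
        sha = hashlib.sha1(header + payload).hexdigest()
        self.objects[sha] = (obj_type, payload)
        return sha

    def write_tree(self, entries):
        """Store a tree built from {name: (mode, hex sha)}; models `git mktree`."""
        def sort_key(item):
            name, (mode, _) = item
            # Git sorts subtree names as if they ended in "/".
            return (name + "/" if mode == TREE_MODE else name).encode()

        payload = b"".join(
            f"{mode} {name}".encode() + b"\0" + bytes.fromhex(sha)
            for name, (mode, sha) in sorted(entries.items(), key=sort_key)
        )
        return self.write("tree", payload)

    def read_tree(self, sha):
        """Parse a stored tree back into {name: (mode, hex sha)}."""
        self.tree_reads += 1
        obj_type, payload = self.objects[sha]
        assert obj_type == "tree"
        entries, pos = {}, 0
        while pos < len(payload):
            nul = payload.index(b"\0", pos)
            mode, name = payload[pos:nul].decode().split(" ", 1)
            entries[name] = (mode, payload[nul + 1:nul + 21].hex())
            pos = nul + 21
        return entries


def update_path(store, tree_sha, path, blob_sha):
    """Return the sha of a new tree: tree_sha with path pointing at blob_sha."""
    name, _, rest = path.partition("/")
    # Read the old tree so that its other entries are carried over.
    entries = store.read_tree(tree_sha) if tree_sha else {}
    if rest:
        mode, old_sub = entries.get(name, (TREE_MODE, None))
        sub = old_sub if mode == TREE_MODE else None
        entries[name] = (TREE_MODE, update_path(store, sub, rest, blob_sha))
    else:
        entries[name] = (FILE_MODE, blob_sha)
    return store.write_tree(entries)


store = ObjectStore()
blob1 = store.write("blob", b"hello\n")
root1 = update_path(store, None, "aaa/bbb/key1.log", blob1)

store.tree_reads = 0
blob2 = store.write("blob", b"world\n")
root2 = update_path(store, root1, "aaa/bbb/key2.log", blob2)
print("tree reads for one changed file:", store.tree_reads)
```

Changing one file costs one tree read per tree along its path, plus one new tree object written per level. With real git, each of those reads would be a separate `git ls-tree` process, which is the cost discussed above.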
- The union merge code currently uses the index. No particular reason it needs to; that's just how the code is written, and it might be a large rewrite to change it.
- A new git-annex branch can be pushed into the repository at any time.
  The current code uses the index to detect when this happens, and
  union merges the new branch head into the index. Something like a
  `GIT_ANNEX_HEAD` ref would be needed to do the same if the index is removed.
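
A rough sketch of how such a ref could take over the detection job, again as a pure Python model and not git-annex code. The assumptions: refs are modelled as a dict, `snapshots` maps a commit id to its files, and the union merge is a simple line-wise union. `catch_up`, `union_merge_lines` and the data layout are all hypothetical. With real git the refs would be read and written with something like `git rev-parse` and `git update-ref`.

```python
BRANCH = "refs/heads/git-annex"
MARKER = "GIT_ANNEX_HEAD"  # hypothetical ref: the branch head last merged


def union_merge_lines(ours, theirs):
    """Line-wise union of two files, keeping first-seen order."""
    seen, out = set(), []
    for line in ours.splitlines() + theirs.splitlines():
        if line not in seen:
            seen.add(line)
            out.append(line)
    return b"".join(line + b"\n" for line in out)


def catch_up(refs, snapshots, pending):
    """Union merge a newly pushed branch head into pending, if there is one.

    refs:      ref name -> commit id
    snapshots: commit id -> {path: file content as bytes}
    pending:   {path: bytes}, local state not yet committed
    Returns True when the branch had moved and a merge was done.
    """
    head = refs.get(BRANCH)
    if head is None or head == refs.get(MARKER):
        return False  # nothing was pushed since the last merge
    for path, theirs in snapshots[head].items():
        pending[path] = union_merge_lines(pending.get(path, b""), theirs)
    refs[MARKER] = head  # remember which head has been merged
    return True
```

The marker ref only needs to be compared against the branch head before each update, so the check itself is cheap. The union merge is the part that would need rewriting to work without the index.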
Thanks to sm for indirectly suggesting this. --Joey