I understand from here that there is no reverse index from a key to the list of file paths pointing to that key (i.e. pointing to the value).
find . -lname '*<key>'
would be extremely slow on a big repo, since it has to walk the whole tree. And this is an operation I expect to want frequently.
What if I wanted to build one? How would I make sure that moves/renames update the index?
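For locked files, a one-off reverse index could be sketched as follows: walk the symlinks once, take the last path component of each link target as the key, and write sorted key/path pairs to a file that can then be queried cheaply. This is only a sketch under assumptions: the helper names build_key_index and paths_for_key are made up for illustration, only symlinks whose target contains /annex/objects/ are considered, nothing keeps the index fresh after moves/renames, and paths containing tabs or newlines are not handled.

```shell
# build_key_index DIR
# Print "KEY<TAB>PATH" for every annex-style symlink under DIR, sorted by key.
# Hypothetical helper: the key is assumed to be the last component of the
# symlink target.
build_key_index() {
    find "$1" -name .git -prune -o -type l -print |
    while IFS= read -r path; do
        target=$(readlink "$path") || continue
        case $target in
            */annex/objects/*)
                # Strip everything up to the last slash to get the key.
                printf '%s\t%s\n' "${target##*/}" "$path"
                ;;
        esac
    done | sort
}

# paths_for_key INDEX_FILE KEY
# Print every path recorded for KEY in an index written by build_key_index.
paths_for_key() {
    awk -F '\t' -v key="$2" '$1 == key { print $2 }' "$1"
}
```

Usage would be something like build_key_index . > /tmp/key-index once, then paths_for_key /tmp/key-index '<key>' for each lookup. The slow walk is paid once instead of on every query; the index still has to be rebuilt or patched whenever files move.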
I understand from here that you can attach meta information to a key (via git annex metadata). This sounds as if it would be useful for storing such reverse information, right?
If I had a good answer to that question I would have built it already.
I mean, a post-commit hook can notice changes after the fact, but noticing them when they've just been staged is harder.
It does not make sense to store such an index in the git-annex branch, because it would be redundant with what's already stored in git trees.
This is discussed in cache key info.
Thanks for the answer.
How does
git status
check for changes? I feel it is quite fast at that.
So you could update the persistent database from a post-commit hook, and have a temporary virtual overlay, applied at query time, that also takes the current staged changes into account. And maybe you could also add a
--fast
option, which would skip this part, because the user probably knows when to expect staged changes.
I think this would be pretty useful. It would also somewhat change the way I use the annex: I expect to quite often have file content that is referenced from multiple file paths.
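The split suggested here (a persistent index refreshed from a post-commit hook, plus an overlay of whatever is currently staged) maps onto two plain git queries. A minimal sketch, assuming the helper names committed_paths and staged_paths (made up for illustration) and leaving the actual index update out:

```shell
# committed_paths
# Paths touched by the most recent commit, i.e. what a post-commit hook
# would need to refresh in the persistent index. --root makes this work
# for the very first commit as well.
committed_paths() {
    git diff-tree --root --no-commit-id --name-only -r HEAD
}

# staged_paths
# Paths with staged but not yet committed changes, i.e. the temporary
# overlay. A --fast style lookup would simply skip this call.
staged_paths() {
    git diff --cached --name-only
}
```

Both commands quote unusual path names by default, so a robust version would probably want their -z output instead.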
I just figured this out today!
In v8 repos, every annexed file gets replaced by a text string like this:
((
you can see them with
git log -p
or))
So to get the reverse pointer, you can use
git grep
:
git grep
might be just as slow as
find
in the end, but so far it has always been very fast for me. Maybe I just don't have a hyper-huge number of files.
git grep
doesn't use an index per se, but is probably faster than
find
at least sometimes. And probably somewhat faster than things involving
git-annex find
too.
It's unfortunate that
git grep
doesn't search the text of symlinks (their targets), so it can't be used to find locked files.
The equivalent using git-annex find, which will find both types: