I want to see aggregate stats on all keys known to git-annex while using matching options like --in here, --copies, etc.
The obvious thing I tried was to use git annex info --in here
but that complains:
git-annex: File matching options can only be used when getting info on a directory.
There should be a way to use info to query aggregate properties of all keys instead of directories.
I have used git annex info .
in the repos I have used up until now, because every key was in the tree. I also have a feeling that operating on all keys could be significantly faster than filtering them to match some directory.
You can use:
git-annex info . --in here
But it should be possible, when getting info for all keys, to use matching options like --in that do not match on a filename. It used to be that there was no way to tell which kind of matching options were used, but now
matchNeedsFileName
is available, so info could reject only the options that need a filename. So this can be implemented by making cachedPresentData and cachedRemoteData (etc.) get the matcher, check that it is the right kind, and apply it to the keys.
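The idea above can be sketched roughly as follows. This is only an illustration in Python, not git-annex's actual Haskell code: Matcher, needs_file_name, info_all_keys and the toy location data are all hypothetical, with needs_file_name standing in for whatever matchNeedsFileName reports.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set


@dataclass
class Matcher:
    """Hypothetical stand-in for one matching option such as --in or --include."""
    name: str
    needs_file_name: bool              # what matchNeedsFileName would report
    matches_key: Callable[[str], bool]


def info_all_keys(keys: Iterable[str], matchers: List[Matcher]) -> List[str]:
    """Return the keys accepted by every matcher, refusing filename-based ones."""
    for m in matchers:
        if m.needs_file_name:
            raise ValueError(f"{m.name} needs a file name and cannot be used on all keys")
    return [k for k in keys if all(m.matches_key(k) for m in matchers)]


# Toy location log: key -> repositories holding it (made-up data).
locations: Dict[str, Set[str]] = {
    "KEY-a": {"here", "backup"},
    "KEY-b": {"backup"},
    "KEY-c": {"here"},
}

in_here = Matcher("--in here", False, lambda k: "here" in locations[k])
include = Matcher("--include", True, lambda k: False)

# Only the key-based matcher is accepted when operating on all keys.
print(info_all_keys(locations, [in_here]))
```

Passing the include matcher instead raises an error, which mirrors the current "File matching options can only be used..." complaint but limits it to options that really need a filename.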
done
Sorry for the late reply but this somehow didn't reach me.
I just tried querying some basic stats such as
--not --copies 2
and running that with
git annex info .
works as expected.
git annex info --not --copies 2
confusingly displays the repo overview and then starts calculating first the local keys and then the "annexed files in working tree" keys. However, it does not show the useful repo overview that displays how many matched keys are in each repo that has at least one.
With that filter,
git annex info .
takes 15s in my case while
git annex info
takes 20s, likely due to the local key lookup. Trying to avoid the local lookup using
--fast
unfortunately also avoids the working tree lookup, so it outputs the same info as
git annex info --fast
without any filters, which doesn't seem very useful.
Does the latter really query all keys, however? It appears to me that it's the same as querying
.
I have nearly 100 unused keys in that repo but both info commands show the same amount.
git annex info --not --copies 2
displays the global overview (because you have not specified a directory), and limits it to keys that have the specified number of copies.
Unused keys are included in the "local annexed keys" count when using
git-annex info
without a directory.
I agree it would make sense for
git-annex info
to display something similar to the "repositories containing these files". In the global overview, though, it should show the total annex size of each repository.
info show total annex sizes of repositories
I may have been unclear in my wording but what I really want is
info
to be able to show me numbers on all keys of the git-annex branch, including keys not in the current working tree or repo.
The intent is that I want to preserve past versions and also be able to query the copies, where they're located, etc. of those together with the "used" keys. I don't care whether they're referenced in the current revision and want to be able to tell
git-annex info
not to care either.
Ideally, this should skip any unused check, operating only on the git-annex branch's data and none of master's. (That might also yield a much-needed performance benefit.)
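Until info can do this itself, one possible workaround is to aggregate the per-key output of git annex whereis --all --json outside of git-annex. The following is a minimal sketch, assuming each output line is a JSON object with a "key" field and a "whereis" list whose entries carry "uuid" and "description", and assuming the key's size is encoded in its -s field; key_size, summarize and whereis_all are hypothetical helper names, and untrusted repositories (reported separately, as far as I know) are ignored.

```python
import json
import subprocess
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def key_size(key: str) -> int:
    """Size in bytes from the key's -s field, or 0 if the key has none."""
    fields = key.split("--", 1)[0].split("-")[1:]
    for field in fields:
        if field.startswith("s") and field[1:].isdigit():
            return int(field[1:])
    return 0


def summarize(json_lines: Iterable[str],
              below_copies: Optional[int] = None) -> Dict[str, Tuple[int, int]]:
    """Per repository: (key count, total bytes).

    If below_copies is given, only keys with fewer copies than that are
    counted, roughly like --not --copies N.
    """
    totals = defaultdict(lambda: [0, 0])
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        copies = record.get("whereis", [])
        if below_copies is not None and len(copies) >= below_copies:
            continue
        size = key_size(record["key"])
        for repo in copies:
            label = repo.get("description") or repo["uuid"]
            totals[label][0] += 1
            totals[label][1] += size
    return {name: (count, size) for name, (count, size) in totals.items()}


def whereis_all() -> List[str]:
    """Run git-annex in the current repository and return its JSON lines."""
    proc = subprocess.run(["git", "annex", "whereis", "--all", "--json"],
                          capture_output=True, text=True, check=False)
    return proc.stdout.splitlines()
```

Calling summarize(whereis_all(), below_copies=2) inside a repository would then give, for each repository, how many keys with fewer than two copies it holds and their combined size, without looking at the working tree. Keys with no known copies never show up in the per-repository totals.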
Total size of annex sounds amazing; I've been wanting that ever since I set up distributed cold storage. I'll give it a spin.