It would be great to have a way to find out when a remote was last "active" and what happened. This can help with:
- knowing which remotes can safely be marked "dead" because they haven't "done" anything in the last 10 years
- finding out what exactly a repository is or where it was, if the (auto-generated) description isn't helpful
- double-checking that one's setup indeed does what is expected, puts files where they should go, etc.
- maybe for sorting the remotes in git annex info by last contact time? (see also here, sorting by description would already be cool and better than sorting by UUID, which is random: https://git-annex.branchable.com/todo/Sorting_remotes_by_description_in_96git_annex_info96/)
For a specific file, one can already get an activity log for a specific remote with git annex log:
git annex log thatfile.pdf | grep 5091aa91-fb08-44b1-aece-7406257103f8
# + Mon, 27 Mar 2023 21:39:07 CEST thatfile.pdf | 5091aa91-fb08-44b1-aece-7406257103f8 -- ThatRemoteName
But that is very slow and doesn't scale at all for considering all files.
There's also the activity.log in the git-annex branch, which AFAIK currently only logs git annex fsck invocation timestamps:
git show git-annex:activity.log | grep 091aa91-fb08-44b1-aece-7406257103f8
# 091aa91-fb08-44b1-aece-7406257103f8 Fsck timestamp=1743873054s
# ...
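To turn such lines into a "last fsck" date per remote, something like the following could work. It is only a sketch: it assumes every relevant line has the shape shown above (UUID, the word Fsck, then timestamp=<epoch>s), and the function name last_fsck_times is made up for illustration.

```python
from datetime import datetime, timezone


def last_fsck_times(activity_log_text):
    """Map each UUID to its most recent Fsck time (UTC) found in the text."""
    latest = {}
    for line in activity_log_text.splitlines():
        fields = line.split()
        # Assumed line shape: UUID Fsck timestamp=<epoch>s
        if len(fields) != 3 or fields[1] != "Fsck":
            continue
        uuid, _, stamp = fields
        if not (stamp.startswith("timestamp=") and stamp.endswith("s")):
            continue
        try:
            epoch = float(stamp[len("timestamp="):-1])
        except ValueError:
            continue  # skip lines whose timestamp is not a number
        when = datetime.fromtimestamp(epoch, tz=timezone.utc)
        if uuid not in latest or when > latest[uuid]:
            latest[uuid] = when
    return latest
```

The input would be the full text of activity.log from the git-annex branch, for example captured from git show git-annex:activity.log.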
The commit messages on the git-annex branch don't include the remote's ID (which might be helpful but hogs disk space?), so that can't be used to inspect activity - one doesn't know who did the commit.
Any ideas how this could be done?
Cheers, Yann
Apparently, git grep in the git-annex branch is pretty performant, so this can be used to find activity times.

git grep for uuid is a good simple solution.
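As a rough illustration of the git grep approach, the sketch below pulls the newest location-log timestamp for one UUID out of the grep output. It assumes the output comes from something like git grep UUID git-annex (so each line is prefixed with git-annex:path:) and that location log lines end in "<epoch>s <status> <uuid>" with status 1, 0 or X. Both are assumptions about the log format, and last_location_change is a hypothetical name.

```python
import re

# Assumed tail of a location log line: "<epoch>s <status> <uuid>"
LOCATION_LINE = re.compile(
    r"(?P<ts>\d+(?:\.\d+)?)s (?P<status>[01X]) (?P<uuid>[0-9a-fA-F-]+)$"
)


def last_location_change(grep_output, uuid):
    """Return the newest epoch timestamp recorded for uuid, or None."""
    latest = None
    for line in grep_output.splitlines():
        match = LOCATION_LINE.search(line)
        # Lines from other logs (uuid.log, activity.log, ...) do not match
        if match is None or match.group("uuid") != uuid:
            continue
        ts = float(match.group("ts"))
        if latest is None or ts > latest:
            latest = ts
    return latest
```

Lines from logs with a different layout are simply skipped, so the result only reflects changes in where file content is located, not other kinds of activity.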
Maybe git-annex log --all could be made to show all location log changes for all keys. Then you could just grep that for the uuid to see what changes have been happening to what files (if it mapped keys back to current filenames when possible). Implementation would be git log filtered to location log files, with --raw to get the diff, then parsing the diff.

There is already code that does something very similar in Annex.RepoSize.diffBranchRepoSizes. And since that is already run by git-annex info, it would be cheap to pull out a last activity date for each repo at the same time as the repo's size, and have git-annex info display it or use it in the other ways you suggest.

The only wrinkle is that it is an incremental diff since the last time it was called, so it would not include dates for repos that have not changed since. So the dates would need to be cached somewhere.
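To make the git log --raw idea a bit more concrete, here is a sketch of the parsing step only, not of how git-annex does or would do it. It assumes the text comes from something like git log --raw --format=%ct git-annex, so each commit contributes a line holding its epoch time followed by raw diff lines, and it assumes location logs sit two directories deep and end in .log. The name changed_location_logs is hypothetical.

```python
def changed_location_logs(raw_log_text):
    """Yield (commit_epoch, status, path) for each changed location log."""
    commit_time = None
    for line in raw_log_text.splitlines():
        if line.isdigit():
            # A line of digits is the %ct commit time of the next entries
            commit_time = int(line)
        elif line.startswith(":") and commit_time is not None:
            # Raw format: ":oldmode newmode oldsha newsha status<TAB>path"
            meta, _, path = line.partition("\t")
            status = meta.split()[-1]
            # Assumed layout: xxx/yyy/KEY.log (skips uuid.log, *.log.web, ...)
            if path.endswith(".log") and path.count("/") == 2:
                yield commit_time, status, path
```

This only tells which location logs changed and when. Finding out which UUID a change belongs to would still require reading the old and new blobs named on each raw line and comparing their contents.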