It would be great to have a way to find out when a remote was last "active" and what happened. This can help with:
- knowing which remotes can safely be marked "dead", as they didn't "do" anything for the last 10 years
- finding out what exactly a repository is or where it was, if the (auto-generated) description isn't helpful
- double-checking if one's setup indeed does the expected, puts files where they should go, etc.
- maybe for sorting the remotes in git annex info by a last contact time? (see also here, sorting by description would already be cool and better than sorting by UUID, which is random: https://git-annex.branchable.com/todo/Sorting_remotes_by_description_in_96git_annex_info96/)
For a specific file, one can already get an activity log for a specific remote with git annex log:
git annex log thatfile.pdf | grep 5091aa91-fb08-44b1-aece-7406257103f8
# + Mon, 27 Mar 2023 21:39:07 CEST thatfile.pdf | 5091aa91-fb08-44b1-aece-7406257103f8 -- ThatRemoteName
But that is very slow and doesn't scale at all for considering all files.
There's also the activity.log in the git-annex branch, which AFAIK currently only logs git annex fsck invocation timestamps:
git show git-annex:activity.log | grep 091aa91-fb08-44b1-aece-7406257103f8
# 091aa91-fb08-44b1-aece-7406257103f8 Fsck timestamp=1743873054s
# ...
The commit messages on the git-annex branch don't include the remote's ID (which might be helpful but hogs disk space?), so that can't be used to inspect activity - one doesn't know who did the commit.
Any ideas how this could be done?
Cheers, Yann
Apparently, git grep in the git-annex branch is pretty performant, so this can be used to find activity times.
git grep for uuid is a good simple solution.
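A rough Python sketch of the git grep idea, in case it helps. It assumes location log lines look like `<epoch>s <status> <uuid>` (status `1`, `0` or `X`), possibly prefixed with `branch:path:` by git grep. The function names `last_activity` and `grep_branch` are made up for this example, and lines from other logs that also mention the UUID are simply skipped.

```python
import re
import subprocess

# Assumed location-log line shape: "<epoch>s <status> <uuid>", optionally
# preceded by a "branch:path:" prefix as printed by git grep.
LINE_RE = re.compile(r"(?:^|:)(\d+(?:\.\d+)?)s [01X] ([0-9a-f-]{36})\s*$")


def last_activity(lines, uuid):
    """Return the newest epoch timestamp recorded for uuid, or None."""
    latest = None
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group(2) == uuid:
            ts = float(m.group(1))
            if latest is None or ts > latest:
                latest = ts
    return latest


def grep_branch(uuid, repo="."):
    """Collect every line mentioning uuid in the git-annex branch (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo, "grep", "-h", "-F", "-e", uuid, "git-annex"],
        capture_output=True,
        text=True,
        check=False,  # git grep exits with 1 when nothing matches
    )
    return result.stdout.splitlines()
```

Something like `last_activity(grep_branch(uuid), uuid)` would then give the last time the location of any key changed for that remote, as seconds since the epoch.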
Maybe git-annex log --all could be made to show all location log changes for all keys. Then you could just grep that for the uuid to see what changes have been happening to what files (if it mapped keys back to current filenames when possible). Implementation would be git log filtered to location log files, with --raw to get the diff, then parsing the diff.
There is already code that does something very similar in Annex.RepoSize.diffBranchRepoSizes. And since that is already run by git-annex info, it would be cheap to pull out a last activity date for each repo at the same time as the repo's size, and have git-annex info display it or use it in the other ways you suggest.
The only wrinkle is that this is an incremental diff since the last time it was called, so it would not include dates for repos that have not changed since. So the dates would need to be cached somewhere.
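To illustrate the caching wrinkle, here is a small Python sketch, not how Annex.RepoSize actually works. It assumes the incremental diff is available as patch text (as from `git log -p`, a simplification of the `--raw` approach described above) and that added location log lines look like `+<epoch>s <status> <uuid>`. The names `activity_from_patch` and `merge_into_cache` are invented for the example.

```python
import re

# Assumed shape of an added location-log line inside a patch:
# "+<epoch>s <status> <uuid>". Patch headers like "+++ b/path" do not match.
ADDED_RE = re.compile(r"^\+(\d+(?:\.\d+)?)s [01X] ([0-9a-f-]{36})\s*$")


def activity_from_patch(patch_text):
    """Map each uuid to the newest timestamp among the added lines."""
    latest = {}
    for line in patch_text.splitlines():
        m = ADDED_RE.match(line)
        if m:
            ts, uuid = float(m.group(1)), m.group(2)
            latest[uuid] = max(ts, latest.get(uuid, ts))
    return latest


def merge_into_cache(cache, update):
    """Fold an incremental update into previously cached dates.

    Repos absent from the update keep their cached date, which is why
    the cache is needed when the diff only covers recent changes.
    """
    merged = dict(cache)
    for uuid, ts in update.items():
        merged[uuid] = max(ts, merged.get(uuid, ts))
    return merged
```

The cache here is just a dict of uuid to epoch seconds. Where it would be persisted between runs is left open.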
Basically the same todo previously: show time of last interaction with a repo
I'll close that one in favor of this new one. The old one did have some ideas about using groups to manually track activity, and a way to use git-annex expire to list recently fscked repos.
Copying a related idea from @nobodyinperson on remove webapp:
Furthermore, a command like git annex activity that goes arbitrarily far back in time and statically (non-live) lists recent activities like:
- document.txt (10MB)
- document.txt (from today 10:45) to remote1, remote2 and remote3
- document.txt (12MB) and uploaded to remote2
Basically a human-readable (or as JSON), chronological log of things that happened in the repo. This is a superpower of git-annex: all this information is available as far back as one wants, we just don't have a way to access it nicely. git log and git annex log exist, but they are too specific, too broad or a bit hard to parse on their own. For example:
- git annex activity --since="2 weeks ago" --include='*.doc' would list things (who committed, which remote received it, etc.) that happened in the last two weeks to *.doc files
- git annex activity --only-annex --in=remote2 would list recent annex operations (in the git-annex branch only) of remote2
- git annex activity --only-changes --largerthan=10MB would list recent file changes (additions, modifications, deletions, etc., in git log only)
This git annex assistant-log and git annex activity would be a very nice feature to showcase git-annex's power (which other file syncing tool can do this? 🤔) and also solve Recent remote activities.
A git-annex activity (or git-annex log) could also optionally stream live activity as it is happening. E.g., when a transfer is started it could display the start, and then later the end. That would be easy to build with what's in git-annex already. The assistant already uses the transfer logs that way, using inotify to notice changes.
This is essentially the same as git-annex log with a path. It also supports --since and --json. The difference I guess is the idea to also include information about git commits of the files, not only git-annex location changes. That would complicate the output, and apparently git-annex log's output is too hard to parse already. So a design for a better output would be needed.
This is the same as git-annex log --all with the output filtered to only list a given remote. (--in does not influence --all currently).
Can probably be accomplished with git log with some -S regexp.