It would be great to have a way to find out when a remote was last "active" and what happened. This can help with:
- knowing which remotes can safely be marked "dead", because they haven't "done" anything in the last 10 years
- finding out what exactly a repository is or where it was, if the (auto-generated) description isn't helpful
- double-checking that one's setup indeed does what is expected, puts files where they should go, etc.
- maybe for sorting the remotes in git annex info by last contact time? (see also here, sorting by description would already be cool and better than sorting by UUID, which is random: https://git-annex.branchable.com/todo/Sorting_remotes_by_description_in_96git_annex_info96/)
For a specific file, one can already get an activity log for a specific remote with git annex log:
git annex log thatfile.pdf | grep 5091aa91-fb08-44b1-aece-7406257103f8
# + Mon, 27 Mar 2023 21:39:07 CEST thatfile.pdf | 5091aa91-fb08-44b1-aece-7406257103f8 -- ThatRemoteName
But that is very slow and doesn't scale at all for considering all files.
There's also the activity.log in the git-annex branch, which AFAIK currently only logs git annex fsck invocation timestamps:
git show git-annex:activity.log | grep 5091aa91-fb08-44b1-aece-7406257103f8
# 5091aa91-fb08-44b1-aece-7406257103f8 Fsck timestamp=1743873054s
# ...
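To turn those activity.log lines into readable dates, something like the following could work. It is only a sketch: the line shape (UUID, activity name, timestamp=SECONDSs) is assumed from the sample output above, and the names parse_activity_line and last_activity are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Assumed line shape, taken from the sample output above:
#   <uuid> <Activity> timestamp=<seconds>s
ACTIVITY_LINE = re.compile(
    r"^(?P<uuid>\S+) (?P<activity>\S+) timestamp=(?P<ts>\d+(?:\.\d+)?)s$"
)


def parse_activity_line(line):
    """Return (uuid, activity, UTC datetime), or None if the line does not match."""
    m = ACTIVITY_LINE.match(line.strip())
    if m is None:
        return None
    when = datetime.fromtimestamp(float(m.group("ts")), tz=timezone.utc)
    return m.group("uuid"), m.group("activity"), when


def last_activity(lines):
    """Map each UUID to its most recent (datetime, activity) pair."""
    latest = {}
    for line in lines:
        parsed = parse_activity_line(line)
        if parsed is None:
            continue  # skip anything that is not an activity line
        uuid, activity, when = parsed
        if uuid not in latest or when > latest[uuid][0]:
            latest[uuid] = (when, activity)
    return latest


sample = ["5091aa91-fb08-44b1-aece-7406257103f8 Fsck timestamp=1743873054s"]
for uuid, (when, activity) in last_activity(sample).items():
    print(uuid, activity, when.isoformat())
```

The lines could come from git show git-annex:activity.log; since that file apparently only records fsck runs, this gives a lower bound on when a remote was last seen, not a full activity history.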
The commit messages on the git-annex branch don't include the remote's ID (which might be helpful but hogs disk space?), so that can't be used to inspect activity - one doesn't know who did the commit.
Any ideas how this could be done?
Cheers, Yann
Apparently, git grep in the git-annex branch is pretty performant, so this can be used to find activity times.
git grep for uuid is a good simple solution.
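As a rough sketch of the git grep approach: grep the git-annex branch for the UUID, keep only lines that look like location log entries, and take the newest timestamp. This assumes location log lines have the shape "SECONDSs STATUS UUID" with STATUS being 1, 0 or X; the function names are hypothetical, and other log files that mention the UUID (uuid.log, activity.log, ...) are simply ignored because they don't match that shape.

```python
import re
import subprocess

# Assumed location log line shape: "<seconds>s <1|0|X> <uuid>"
LOCATION_LINE = re.compile(
    r"^(?P<ts>\d+(?:\.\d+)?)s (?P<status>[01X]) (?P<uuid>\S+)$"
)


def newest_location_change(lines, uuid):
    """Return the newest timestamp (float seconds) recorded for uuid, or None."""
    newest = None
    for line in lines:
        m = LOCATION_LINE.match(line.strip())
        if m is None or m.group("uuid") != uuid:
            continue  # not a location log line for this remote
        ts = float(m.group("ts"))
        if newest is None or ts > newest:
            newest = ts
    return newest


def grep_git_annex_branch(uuid, repo="."):
    """Run git grep for uuid over *.log files in the git-annex branch."""
    result = subprocess.run(
        ["git", "-C", repo, "grep", "-h", "--fixed-strings", uuid,
         "git-annex", "--", "*.log"],
        capture_output=True,
        text=True,
    )
    # git grep exits with status 1 when nothing matches; treat that as "no lines".
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr)
    return result.stdout.splitlines()
```

Inside a repository, newest_location_change(grep_git_annex_branch(uuid), uuid) would then give the time of the most recent recorded location change for that remote, as far as the current state of the branch shows it. Older changes that were later overwritten in the logs are not visible this way.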
Maybe git-annex log --all could be made to show all location log changes for all keys. Then you could just grep that for the uuid to see what changes have been happening to what files (if it mapped keys back to current filenames when possible). Implementation would be git log filtered to location log files, with --raw to get the diff, then parsing the diff.
There is already code that does something very similar in Annex.RepoSize.diffBranchRepoSizes. And since that is already run by git-annex info, it would be cheap to pull out a last activity date for each repo at the same time as the repo's size, and have git-annex info display it or use it in the other ways you suggest.
The only wrinkle is that it is an incremental diff since the last time it was called, so it would not include dates for repos that have not changed since. So the dates would need to be cached somewhere.
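The caching wrinkle could look roughly like this: each incremental run yields dates only for repos that changed, and those are merged into a stored map so that older dates survive. This is just an illustration in Python, not how git-annex (which is written in Haskell) does or would do it; the JSON cache file and both function names are assumptions.

```python
import json
from pathlib import Path


def merge_last_activity(cached, fresh):
    """Combine two {uuid: unix_seconds} maps, keeping the newer time per UUID."""
    merged = dict(cached)
    for uuid, ts in fresh.items():
        if uuid not in merged or ts > merged[uuid]:
            merged[uuid] = ts
    return merged


def update_cache(path, fresh):
    """Merge fresh dates into the JSON cache at path and return the merged map."""
    path = Path(path)
    cached = json.loads(path.read_text()) if path.exists() else {}
    merged = merge_last_activity(cached, fresh)
    path.write_text(json.dumps(merged, indent=2, sort_keys=True))
    return merged
```

Repos that did not show up in the latest incremental diff keep whatever date was cached for them earlier, which is exactly the information that would otherwise be lost.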
This is basically the same as an earlier todo: show time of last interaction with a repo
I'll close that one in favor of this new one. The old one did have some ideas about using groups to manually track activity, and a way to use git-annex expire to list recently fscked repos.
Copying a related idea from @nobodyinperson on remove webapp:
Furthermore, a command like git annex activity that goes arbitrarily far back in time and statically (non-live) lists recent activities like:
- document.txt (10MB)
- document.txt (from today 10:45) to remote1, remote2 and remote3
- document.txt (12MB) and uploaded to remote2
Basically a human-readable (or as JSON), chronological log of things that happened in the repo. This is a superpower of git-annex: all this information is available as far back as one wants, we just don't have a way to access it nicely. git log and git annex log exist, but they are too specific, too broad or a bit hard to parse on their own. For example:
- git annex activity --since="2 weeks ago" --include='*.doc' would list things (who committed, which remote received it, etc.) that happened in the last two weeks to *.doc files
- git annex activity --only-annex --in=remote2 would list recent annex operations (in the git-annex branch only) of remote2
- git annex activity --only-changes --largerthan=10MB would list recent file changes (additions, modifications, deletions, etc., in git log only)
This git annex assistant-log and git annex activity would be a very nice feature to showcase git-annex's power (which other file syncing tool can do this? 🤔) and also solve Recent remote activities.
A git-annex activity (or git-annex log) could also optionally stream live activity as it is happening. Eg, when a transfer is started it could display the start, and then later the end. That would be easy to build with what's in git-annex already. The assistant already uses the transfer logs that way, using inotify to notice changes.
This is essentially the same as git-annex log with a path. It also supports --since and --json. The difference I guess is the idea to also include information about git commits of the files, not only git-annex location changes. That would complicate the output, and apparently git-annex log's output is too hard to parse already. So a design for a better output would be needed.
This is the same as git-annex log --all with the output filtered to only list a given remote. (--in does not influence --all currently).
Can probably be accomplished with git log with some -S regexp.