trust based on time since last fsck

It'd be really useful if I could specify my level of trust in a remote holding a file as a function of the time since the file was last fsck'd in that remote.

This way, if I haven't fsck'd, say, my off-site cold storage in some amount of time, git-annex could, for example, automatically try to create additional copies of its files in other remotes.

Expiry can be used in a similar way, but declaring the remote dead is overkill and has unwanted side effects.


comment 1

You can query for repositories that have not been fscked for some amount of time:

git annex expire 10d --no-act --activity=Fsck

From there, it's a simple script to set the unfscked ones to untrusted, or whatever.

git annex expire 10d --no-act --activity=Fsck | grep '^expire' | awk '{print $2}' | xargs git-annex untrust

I suppose git-annex expire could have an option added, like --untrust to specify how to expire, rather than the default of marking the repo dead.

I suppose you'd also want a way to go the other way, to stop untrusting a repo once it's been fscked. There is not currently a way to do that.

Note that a fsck that is interrupted does not count as a fsck activity, and git-annex does not keep track of which files were fscked. That would bloat the git-annex branch. On the other hand, if you run git annex fsck onefile, that counts as a fsck activity, even though other files in the repo didn't get fscked. So you would have to limit the ways you use fsck to ones that generate the activity you want, perhaps to git annex fsck --all.

Perhaps fsck should also have a way to control whether it records an activity or not.

Comment by joey — Mon Jun 14 17:14:44 2021


comment 2

What if git annex fsck --all recorded an additional activity, e.g. FsckAll? Then there could be a command, or a config, that untrusts repos that do not have a recent enough FsckAll activity.

A git config would be simplest, e.g.:

git config annex.untrustLastFscked 10d

Comment by joey — Mon Jun 14 17:29:29 2021


comment 3

I tried to implement this, but ran into a problem adding FsckAll: if it only logs FsckAll and not also Fsck, then an old git-annex expire will see the FsckAll, not understand it, treat it as no activity, and so expire the repo. (I have now fixed git-annex so that an unknown activity is not treated as no activity.)

And, the way recordActivity is implemented, it removes previous activities, and adds the current activity. So a FsckAll followed by a Fsck would remove the FsckAll activity.

That could be fixed, and both be logged, but old git-annex would probably not be able to parse the result. And if old git-annex is then used to do a fsck, it would log Fsck and remove the previously added FsckAll.

So, it seems this will need to use some log other than activity.log to keep track of fsck --all.
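The replacement behaviour described above can be illustrated with a toy model. This is only a sketch in Python, not git-annex's actual (Haskell) code: the dictionary layout and the `record_activity` name are assumptions made for illustration, with one recorded activity kept per repository.

```python
# Toy model of an activity log: one (activity, timestamp) entry per repository.
# Hypothetical layout, for illustration only; not git-annex's real format.

def record_activity(log, repo, activity, timestamp):
    """Return a new log in which `activity` replaces any earlier entry for `repo`."""
    updated = dict(log)
    updated[repo] = (activity, timestamp)  # the previous activity is dropped
    return updated


log = {}
log = record_activity(log, "repo-a", "FsckAll", 1000)
log = record_activity(log, "repo-a", "Fsck", 2000)

# Only the later Fsck survives; the FsckAll record is gone.
print(log["repo-a"])
```

Under this model, a plain fsck run after a full one erases the evidence that a full fsck ever happened, which is why a separate log looks necessary.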

Comment by joey — Mon Jun 14 17:56:23 2021


comment 4

Maybe it's better not to tie this directly to fsck. Another way would be:

git annex untrust foo --after=100days

The first time this is run, it would record that the trust level will change to untrusted after 100 days. The next time it's run, it would advance the timeout.

So, you could do whatever fsck or other checks make you still trust the repo, and then run this again.

Implementation would, I guess, need a separate future-trust.log in addition to trust.log. When loading trust levels, if there is a value in future-trust.log that has a newer timestamp than the value in trust.log, and enough time has passed, it would be used instead of the value from trust.log. That way it avoids breaking older git-annex with changes to trust.log.

No need to change what's in trust.log, although it could, which would also let older git-annex versions learn about the change to trust.
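The lookup rule described above could be sketched roughly as follows. This is a Python illustration under stated assumptions, not an implementation: future-trust.log is only proposed in this thread, and the `TrustEntry`, `FutureTrustEntry` and `effective_trust` names, along with their fields, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

DAY = 86400  # seconds


@dataclass(frozen=True)
class TrustEntry:
    """Hypothetical stand-in for a trust.log value."""
    level: str          # e.g. "semitrusted"
    recorded_at: float  # timestamp of the log entry


@dataclass(frozen=True)
class FutureTrustEntry:
    """Hypothetical stand-in for a value in the proposed future-trust.log."""
    level: str          # level to switch to, e.g. "untrusted"
    recorded_at: float  # timestamp of the log entry
    delay: float        # seconds to wait before the change takes effect


def effective_trust(current: TrustEntry,
                    future: Optional[FutureTrustEntry],
                    now: float) -> str:
    """Use the future entry only if it is newer and its delay has elapsed."""
    if future is None:
        return current.level
    newer = future.recorded_at > current.recorded_at
    due = now >= future.recorded_at + future.delay
    return future.level if (newer and due) else current.level


current = TrustEntry("semitrusted", recorded_at=0)
scheduled = FutureTrustEntry("untrusted", recorded_at=10 * DAY, delay=100 * DAY)

before_deadline = effective_trust(current, scheduled, now=50 * DAY)
after_deadline = effective_trust(current, scheduled, now=111 * DAY)

# Running the command again at day 60 would record a newer entry,
# pushing the deadline out to day 160.
rescheduled = FutureTrustEntry("untrusted", recorded_at=60 * DAY, delay=100 * DAY)
after_rerun = effective_trust(current, rescheduled, now=111 * DAY)
```

Here `before_deadline` and `after_rerun` stay at the trust.log value, while `after_deadline` switches to untrusted, matching the "advance the timeout" behaviour described for repeated runs.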

Comment by joey — Mon Jun 14 18:20:06 2021

