Whenever I do an fsck, it's always annoyed me that you have to think of adding --incremental and then also think about whether an incremental fsck was started and interrupted before, which would then require --more instead.
Forgetting to add --incremental can leave you in a pickle when you later find out that you need to interrupt the fsck, losing all progress.
I've found myself wondering whether there'd ever be a case where I'd not want an fsck to be resumable. Could git-annex not simply always store that information and leave it up to the next fsck execution to decide whether to use it or not?
I actually don't see much reason not to make use of an existing incremental fsck either, unless it's really old, but I find this a lot more debatable than at least storing fsck state on each run.
On that note: there also does not appear to be a documented method to figure out whether an fsck was interrupted before. You could infer existence and date from the annex internal directory structure, but seeing the progress requires manual SQL.
Perhaps there could be a fsck --info flag for showing both interrupted fsck progress and perhaps also the progress of the current fsck.
I've implemented the default recording to the fsck database. done --Joey
I think it could make sense, when --incremental/--more are not passed, to initialize a new fsck database if there is not already one, and add each fscked key to the fsck database.
That way, the user could run any combination of fscks, interrupted or not, and then use --more to fsck only new files. When the user wants to start a new fsck pass, they would use --incremental.
It would need to avoid recording an incremental fsck pass start time, to avoid interfering with --incremental-schedule.
The only problem I see with this is that someone might have a long-term incremental fsck they're running that is doing full checksumming. If they then do a quick fsck --fast for other reasons, it would record that every key has been fscked, and so lose their place. So it seems --fast should disable this new behavior. (Also, an incremental --fast fsck is not likely to be very useful anyway.)
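To make the proposed rules concrete, here is a toy model in Python. It is only a sketch of the behavior described above, not git-annex's code: `FsckState` and `run_fsck` are made-up names, the "database" is just a set of keys, and treating --more with no existing database as starting from an empty one is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class FsckState:
    """Toy stand-in for the fsck database (hypothetical, not the real schema)."""
    fscked_keys: Set[str] = field(default_factory=set)
    # Start time of an incremental pass; the proposal says a plain fsck
    # must not set this, so that --incremental-schedule is unaffected.
    pass_start_time: Optional[float] = None


def run_fsck(
    state: Optional[FsckState],
    keys: List[str],
    now: float,
    incremental: bool = False,
    more: bool = False,
    fast: bool = False,
) -> Tuple[Optional[FsckState], List[str]]:
    """Model one fsck run; return the resulting state and the keys checked."""
    explicit = incremental or more
    # A plain fsck records keys too, unless --fast was given.
    record = explicit or not fast

    if incremental:
        # --incremental starts a new pass and records its start time.
        state = FsckState(pass_start_time=now)
    elif state is None and record:
        # Initialize a database if there is not already one (no start time).
        state = FsckState()

    # Only --more skips keys that were already fscked.
    already = state.fscked_keys if (more and state is not None) else set()
    checked = [k for k in keys if k not in already]

    if record:
        state.fscked_keys.update(checked)
    return state, checked
```

In this model, a plain fsck over k1 and k2 followed by a --more run over k1, k2 and k3 checks only k3, while a plain --fast run in between leaves the recorded keys untouched.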
That's a hard judgement call for a program to make... someone might think 10 minutes is really old, and someone else that a month is.
As to figuring out whether a fsck was interrupted before, surely what matters is you remembering that? All git-annex has is a timestamp when the last fsck pass started, which is available in .git/annex/fsck/*/state, and a list of the keys that were fscked, which is not very useful as far as determining the progress of that fsck.
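For anyone who wants to look at that timestamp without digging around by hand, something like the following Python sketch could work. It assumes a non-bare repository with the default .git/annex layout, and since it is not stated here whether the timestamp lives in the file's contents or in its modification time, it simply reports both. The function name `fsck_state_files` is made up for this example.

```python
import glob
import os
import time


def fsck_state_files(repo="."):
    """Yield (path, mtime, raw_contents) for each fsck state file under repo."""
    pattern = os.path.join(repo, ".git", "annex", "fsck", "*", "state")
    for path in sorted(glob.glob(pattern)):
        with open(path, "r", errors="replace") as f:
            raw = f.read().strip()
        yield path, os.path.getmtime(path), raw


if __name__ == "__main__":
    for path, mtime, raw in fsck_state_files():
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        print(f"{path}: mtime {stamp}, contents {raw!r}")
```

If nothing is printed, no state file matched the pattern, which under these assumptions would mean no incremental fsck pass has been recorded in that repository.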