When fsck'ing a remote repo, files seem to be copied from the remote to a local dir (thus written to disk), read back again for checksumming and then deleted.
This is very time-inefficient and wastes precious SSD erase cycles, which is especially problematic in the case of special remotes because (AFAIK) they can only be fsck'd "remotely".
Instead, remote files should be directly piped into an in-memory checksum function and never written to disk on the machine performing the fsck.
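For illustration only, a minimal sketch of that idea in Haskell, using the incremental hashing API from the cryptonite package and assuming a hypothetical `nextChunk` action that yields data from the remote as it arrives:

```haskell
import Crypto.Hash (Digest, SHA256, hashInit, hashUpdate, hashFinalize)
import qualified Data.ByteString as B

-- Fold incoming chunks into an incremental hash context held in
-- memory; nothing is ever written to local storage, and the digest
-- is finalized when the stream ends.
checksumStream :: IO (Maybe B.ByteString) -> IO (Digest SHA256)
checksumStream nextChunk = go hashInit
  where
    go ctx = do
      mchunk <- nextChunk
      case mchunk of
        Nothing    -> return (hashFinalize ctx)
        Just chunk -> go (hashUpdate ctx chunk)
```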
Only some remotes support checksumming in flight; this recently includes downloads from other git-annex repositories over ssh. Progress on that front is being tracked at https://git-annex.branchable.com/todo/OPT__58_____34__bundle__34___get_+_check___40__of_checksum__41___in_a_single_operation/. Most special remotes can't yet, but that should change eventually for at least some of them.
I've made fsck notice when content could be verified as part of a transfer, and avoid redundantly checksumming it.
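Roughly, the behavior is (hypothetical names, not the actual code):

```haskell
data Verification = Verified | UnVerified

-- If the transfer already verified the content in flight, fsck can
-- skip the redundant checksum; otherwise it falls back to hashing
-- the file it received.
verifyAfterTransfer :: Verification -> IO Bool -> IO Bool
verifyAfterTransfer Verified   _            = return True
verifyAfterTransfer UnVerified checksumFile = checksumFile
```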
What I've not done, and don't think I will be able to, is make the file not be written to disk by fsck in that case. Since the `retrieveKeyFile` interface is explicitly about writing to a file on disk, it would take either a whole separate interface, implemented for all remotes, that avoids writing to the file when they can checksum in flight, or some change to the `retrieveKeyFile` interface to do the same. Neither seems worth the complication to implement just to reduce disk IO in this particular case. And it seems likely that, for files that fit in memory, the data never actually reaches disk before the file is deleted. Also, if this is a concern for you, I guess you can avoid fscking remotes too frequently, or use a less fragile medium?
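To make the interface problem concrete, here is a hypothetical, much-simplified sketch of the two shapes involved (these are not the real git-annex types or signatures, just an illustration):

```haskell
import Data.ByteString (ByteString)

-- Stand-ins for git-annex's real types, for illustration only.
type Key = String
data Verification = Verified | UnVerified

-- The existing shape: the remote is told to write the content to a
-- destination path on disk.
type RetrieveKeyFile m = Key -> FilePath -> m Verification

-- A streaming shape that would avoid the on-disk copy: the remote
-- feeds chunks to a consumer as they arrive. Every remote would have
-- to implement this variant before fsck could rely on it, which is
-- the complication mentioned above.
type RetrieveKeyStream m = Key -> (ByteString -> m ()) -> m Verification
```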
Checksumming during transfer is now implemented for as many remotes as it reasonably can be, which is almost all of them. But it's not 100% of remotes in all circumstances, and there's no way to know whether a remote will support it before doing the transfer.
To avoid changing the API, it occurs to me that `retrieveKeyFile` could be passed `/dev/null`. But any remote that does not support resuming and tries to overwrite the existing destination file would fail. Also, some kinds of remotes download to the file in one process or thread, and while the download is happening, git-annex checksums the file as new data appears in it. External special remotes in particular do this. That would break with `/dev/null` too.
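To illustrate why, here is a rough sketch (again, not the actual git-annex code) of that checksum-the-file-as-it-grows pattern; with `/dev/null` as the destination, the written data is discarded and there is never anything to read back:

```haskell
import Control.Concurrent (threadDelay)
import Crypto.Hash (Digest, SHA256, hashInit, hashUpdate, hashFinalize)
import qualified Data.ByteString as B
import System.IO

-- Hash a file that another thread or process is still writing,
-- consuming new data as it appears, until the expected size has
-- been seen. Simplified: no timeout or error handling.
hashGrowingFile :: FilePath -> Integer -> IO (Digest SHA256)
hashGrowingFile path expected = withFile path ReadMode (go hashInit 0)
  where
    go ctx seen h
      | seen >= expected = return (hashFinalize ctx)
      | otherwise = do
          chunk <- B.hGetSome h 65536
          if B.null chunk
            then threadDelay 100000 >> go ctx seen h  -- wait for more data
            else go (hashUpdate ctx chunk)
                    (seen + fromIntegral (B.length chunk)) h
```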
Putting the temp file on some other medium seems like the only way to address this. If there were a config setting for the directory to use, you could point it at a spinning disk rather than the SSD, or even at a ram disk if you have sufficient memory. I'm unsure whether it's worth adding such an option, though; probably few people would use it. And cloning the repository onto the other medium and running the remote fsck from there would have the same result without needing an option.
I'm inclined to close this, since I don't think it can be addressed.