Please describe the problem.
Probably it is more of a todo than a bug.
What steps will reproduce the problem?
This is a use-case where I am trying to establish a special remote to be shared by multiple unrelated repositories.
So I had an original repo1, in which I
- created an external special remote with chunking, it got UUID1
- uploaded some data (all got chunked)
Then I created repo2, in which I
- initialized the special remote with identical settings and provided uuid=UUID1
- decided to test if annex would be able to get a key from the shared special remote
but annex fsck --key KEY --from remote --fast, since it doesn't have an exact chunk list, just provides the special remote backend with the original full key only, which is obviously not found, and it reports failure. But I wondered: couldn't git-annex just use the chunk size and "mint" the possible chunk keys to test on the special remote, since it has all the information? After all, chunk keys AFAIK are deterministically minted and are pretty much just the original key "augmented" with -S<chunksize>-C<chunkindex> added to the key.
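A minimal sketch of that minting idea, in Python and purely for illustration. It assumes the serialized key looks like BACKEND-s<size>--<name>, that the chunk fields go just before the "--" separator, that chunk indices start at 1, and that the chunk count is the key size divided by the chunk size, rounded up. `mint_chunk_keys` is a hypothetical helper, not git-annex code.

```python
def mint_chunk_keys(key: str, chunk_size: int) -> list[str]:
    """Return the chunk keys expected for `key` at `chunk_size` (illustrative only)."""
    # Split the key into its metadata fields and its name at the first "--".
    fields, sep, name = key.partition("--")
    if not sep:
        raise ValueError("key has no '--' separator")

    # Look for the -s<size> field; without it the chunk count is unknowable.
    size = None
    for field in fields.split("-")[1:]:
        if field.startswith("s") and field[1:].isdigit():
            size = int(field[1:])
    if size is None:
        raise ValueError("key size unknown, cannot mint chunk keys")

    count = -(-size // chunk_size)  # ceiling division
    # Assumption: chunk indices start at 1.
    return [f"{fields}-S{chunk_size}-C{i}--{name}" for i in range(1, count + 1)]


if __name__ == "__main__":
    for chunk_key in mint_chunk_keys("SHA256E-s2500--abc.dat", 1000):
        print(chunk_key)
```

With a 2500-byte key and a chunk size of 1000, this yields three chunk keys, which is the list that could then be checked on the special remote.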
What version of git-annex are you using? On what operating system?
8.20200908+git175-g95d02d6e2-1~ndall+1
Note that what you are trying to do will only work if the special remote is not encrypted.
Besides your use case, which seems very unusual, I think one other use case would be a clone that uploaded to the special remote but was lost before it ever synced out its git-annex branch, with fsck --from remote then being run in another clone to reconstruct it. Currently it won't try chunks, as none are recorded.
Speculatively trying the current remote's chunk config would handle the majority of cases, though wouldn't help if the other clone had adjusted the special remote's chunk size too.
There's some overhead, but git-annex can check the speculative chunk size last, and skip it if it's already in the list of known chunk sizes, so the overhead would usually only be paid when content git-annex expected to be present has gone missing, which I think is rare enough to not care about.
(Also, this can only be done when the size of the key is known, so not for e.g. addurl --relaxed keys.)
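The ordering described above can be sketched roughly as follows. This is only an illustration in Python, not how git-annex implements it: `chunk_sizes_to_try` and its parameters are hypothetical names, and the chunk log is modelled as a plain list of chunk sizes.

```python
from typing import Optional


def chunk_sizes_to_try(
    logged: list[int],
    configured: Optional[int],
    key_size: Optional[int],
) -> list[int]:
    """Chunk sizes to probe, in order: logged ones first, the speculative one last."""
    candidates = list(logged)
    speculative_ok = (
        configured is not None          # the remote has a chunk= setting
        and key_size is not None        # e.g. addurl --relaxed keys have no size
        and configured not in logged    # already covered by the chunk log
    )
    if speculative_ok:
        candidates.append(configured)
    return candidates


if __name__ == "__main__":
    # Nothing logged: only the remote's configured chunk size is tried.
    print(chunk_sizes_to_try([], 1000, 2500))
    # A different size is logged: it is tried first, the configured one last.
    print(chunk_sizes_to_try([500], 1000, 2500))
```

Because the speculative size goes last and is skipped when already logged, the extra check only costs anything after every logged chunk size has failed.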
Implemented that. But..
As implemented, there's nothing to make the chunk size get stored in the chunk log for a key, after it accesses its content using the configured chunk size.
So, changing the chunk= of the remote can prevent accessing content that was accessible before. Of course, avoiding that is why chunk sizes are logged in the first place.
Seems like maybe fsck --from should fix the chunk log? I think fsck would always need to be used, to fix up the location log, before any other commands rely on the data being in the special remote, so it seems fine to only fix the chunk log there.
But, also a bit unclear how fsck would find out when it needs to do this. It only needs to when the remote's configured chunk size is not listed in the chunk log. But that's also common after changing the chunk size of a remote. So it would have to mess around with checking the presence of chunk keys itself, which would be extra work and also ugly to implement.
I'm leaving this todo^Wbug open for now due to this.
Ok, made it update the chunk log as needed while checking if chunks are present. So this is done.
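As a rough model of that behaviour (a sketch in Python under stated assumptions, not the actual git-annex code): the presence check walks the candidate chunk sizes in order, and when all chunks for a size are found and that size is not yet logged, it records the size. `check_present_chunked`, `remote_has`, and the set standing in for the chunk log are all hypothetical.

```python
from typing import Callable, Iterable


def check_present_chunked(
    candidates: Iterable[tuple[int, list[str]]],
    logged_sizes: set[int],
    remote_has: Callable[[str], bool],
) -> bool:
    """Return True if every chunk for some candidate chunk size is on the remote.

    `candidates` holds (chunk_size, chunk_keys) pairs in probe order.
    `logged_sizes` stands in for the chunk log and is updated in place.
    """
    for chunk_size, chunk_keys in candidates:
        if chunk_keys and all(remote_has(k) for k in chunk_keys):
            # Record the size so later changes to chunk= cannot hide the content.
            if chunk_size not in logged_sizes:
                logged_sizes.add(chunk_size)
            return True
    return False


if __name__ == "__main__":
    stored = {"K-S10-C1", "K-S10-C2"}   # what the remote holds
    log: set[int] = set()               # nothing logged yet
    found = check_present_chunked(
        [(10, ["K-S10-C1", "K-S10-C2"])], log, stored.__contains__
    )
    print(found, log)
```

Doing the log update inside the presence check means fsck does not need its own separate pass over chunk keys.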