Please describe the problem.
❯ ( source ~/git-annexes/10.20230407+git201-g5df89d58c7.env; git annex version | head -n 1; git annex findkeys --in here | git annex dropkey --force --batch -z ; )
git-annex version: 10.20230407+git201-g5df89d58c7-1~ndall+1
git-annex: <stdout>: commitBuffer: resource vanished (Broken pipe)
dropkey MD5E-s5663237--4608ffbd6b78ce3a325eb338fa556589.nii.gz
ok
❯ ls -ld .git/annex/objects/**/*gz/MD5E-s5663237--4608ffbd6b78ce3a325eb338fa556589.nii.gz
-r-------- 1 yoh yoh 5663237 May 19 09:50 .git/annex/objects/V7/Pj/MD5E-s5663237--4608ffbd6b78ce3a325eb338fa556589.nii.gz/MD5E-s5663237--4608ffbd6b78ce3a325eb338fa556589.nii.gz
❯ ( source ~/git-annexes/10.20230407+git201-g5df89d58c7.env; git annex version | head -n 1; git annex findkeys --in here | git annex dropkey --force --batch ; )
git-annex version: 10.20230407+git201-g5df89d58c7-1~ndall+1
git-annex: <stdout>: commitBuffer: resource vanished (Broken pipe)
dropkey MD5E-s5663237--4608ffbd6b78ce3a325eb338fa556589.nii.gz ok
❯ ls -ld .git/annex/objects/**/*gz/MD5E-s5663237--4608ffbd6b78ce3a325eb338fa556589.nii.gz
ls: cannot access '.git/annex/objects/**/*gz/MD5E-s5663237--4608ffbd6b78ce3a325eb338fa556589.nii.gz': No such file or directory
This was also reported on 10.20230407 as not returning anything, causing us to stall: https://github.com/datalad/datalad/issues/7315#issuecomment-1554348911.
What steps will reproduce the problem?
What version of git-annex are you using? On what operating system?
Please provide any additional information below.
You are piping non-null-terminated output into a command that needs terminating nulls. So it reads the entire findkeys output, newlines included, as the name of a single key, and drops that key, which of course doesn't exist.
With `findkeys --print0`, it does work. It would also be fine to not use `-z`, since keys should never actually contain a newline in their name.

However, after successfully dropping all the keys with `--print0`, there is then this oddity:

That's a bug in nul splitting when there's a trailing nul. Oops. I've fixed that.
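The effect of feeding newline-terminated keys to a NUL-splitting reader can be reproduced without git-annex. The following is only a sketch: `xargs -0` (a GNU/BSD extension) stands in for any consumer that expects NUL-terminated records, `count_nul_records` is a helper made up for this illustration, and `KEY-a`/`KEY-b` are placeholder key names. It is not how git-annex itself parses `-z` input.

```shell
# count_nul_records: hypothetical helper that counts how many records a
# NUL-splitting consumer sees on stdin. xargs -0 plays the consumer and
# runs one "echo record" per record it receives.
count_nul_records() {
  xargs -0 -n 1 sh -c 'echo record' sh | wc -l | tr -d ' '
}

# Newline-terminated keys: there is no NUL anywhere in the stream, so the
# whole stream (newlines included) arrives as a single record.
newline_count=$(printf 'KEY-a\nKEY-b\n' | count_nul_records)

# NUL-terminated keys: each key is its own record, and the trailing NUL
# ends the last record rather than starting an extra empty one.
nul_count=$(printf 'KEY-a\0KEY-b\0' | count_nul_records)

echo "newline-terminated input: $newline_count record(s)"
echo "NUL-terminated input: $nul_count record(s)"
```

With the newline-terminated stream the consumer sees one oversized "key", which matches the single `dropkey ... ok` line in the `-z` transcript above.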
Also while I reproduced the rest of the behavior, I didn't see this part:
I'm not sure which command that comes from. Probably the findkeys, I think, if its entire output was not consumed for some reason.
`ok` for dropping an unknown key. I guess like with `rm unknownfile` (unless `-f` is used) I would have expected it to error out.

re vanished -- it is from `annex version` whenever its output is not fully written out due to use of `head`:

Aha, thanks for clearing up that `git-annex version` does that! That seems like a bit of a bug on its own really... Fixed that.
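The general mechanism behind a "Broken pipe" message is easy to see with ordinary tools: the reader (`head`) exits after one line, and the writer's next write to the closed pipe fails. In this sketch `yes` stands in for any program that keeps writing, and the temporary status file is just an illustration device for capturing the writer's exit status portably.

```shell
# A writer that never stops on its own (yes) feeds a reader that exits
# after one line (head -n 1). yes only terminates once its write fails,
# so the right-hand side of || records that failure status.
status_file=$(mktemp)
( yes line || echo "$?" > "$status_file" ) | head -n 1 > /dev/null
writer_status=$(cat "$status_file")
rm -f "$status_file"

# Non-zero: typically 141 (128 + SIGPIPE) when the signal kills the writer,
# or the writer's own error status if SIGPIPE is ignored and it sees EPIPE.
echo "writer exited with status $writer_status"
```

A program that handles the failed write itself, rather than being killed by the signal, is the kind that can print a message like the `resource vanished (Broken pipe)` line in the transcript.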
The reason dropkey does not error on an unknown key is that it's entirely possible to get a repository into a state where a key's content is present but the key is otherwise unknown to git-annex. E.g., it doesn't have any location tracking information for it, there are no files in the git repo that point to it, etc.
It makes sense to support dropping the content of such a key.
And dropkey intentionally behaves the same when a key's content is not present as when the content is present and is successfully dropped, because in either case the result is that the specified key's content is not present.
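The "report success whenever the content is absent afterwards" rule described here is the same idea as `rm -f`. As a rough sketch, with `drop_content` being a made-up helper operating on a plain temporary file rather than anything git-annex actually runs:

```shell
# drop_content PATH: hypothetical helper that reports "ok" whenever PATH
# does not exist afterwards, whether or not it existed beforehand.
drop_content() {
  rm -f -- "$1" || return 1    # rm -f does not complain about a missing file
  [ ! -e "$1" ] && echo ok
}

tmpdir=$(mktemp -d)
printf 'data' > "$tmpdir/object"

first=$(drop_content "$tmpdir/object")     # content present, gets removed
second=$(drop_content "$tmpdir/object")    # content already absent

echo "first=$first second=$second"
rmdir "$tmpdir"
```

Both calls report the same result because both leave the system in the same end state, which is the property the explanation above relies on.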
Gotcha. Just food for possible future discussion: I think the "annotation" of an action's outcome should be more than just a binary "ok/fail". Indeed `dropkey` can say "ok" as to the promise that in the end there is no key (either it was known or not, etc.). But it can arrive there differently. Similarly for "fail". In DataLad we now have 4 "status" states for that reason: "ok", "notneeded", "impossible", "error", where the first two are for "ok" and the other two for "fail" ([documented here](https://github.com/datalad/datalad/blob/HEAD/docs/source/design/result_records.rst#status)). So here `dropkey unknown` was more of a "notneeded" success, I guess, if it were for DataLad to report. Maybe `--json` records and non-json output of `git-annex` could in the future somehow discriminate between those outcomes.

Many commands do reflect "notneeded" by not displaying any output.
(I suppose that could even be a problem with --json --batch, since a command like drop will not output anything when it has nothing to do.)
In the case of dropkey, it could have skipped displaying anything for keys that don't exist, but changing that now doesn't seem wise.
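For illustration only, the finer-grained outcome reporting discussed in this thread could look roughly like the sketch below. `drop_with_status` is a hypothetical helper working on a plain file; it borrows three of the status names mentioned above but does not reproduce actual git-annex or DataLad output.

```shell
# drop_with_status PATH: hypothetical helper printing ok, notneeded or error.
drop_with_status() {
  if [ ! -e "$1" ]; then
    echo notneeded      # nothing to do: the content was already absent
  elif rm -f -- "$1"; then
    echo ok             # the content was present and has been removed
  else
    echo error          # removal was attempted and failed
  fi
}

tmpdir=$(mktemp -d)
printf 'data' > "$tmpdir/object"

status_present=$(drop_with_status "$tmpdir/object")
status_absent=$(drop_with_status "$tmpdir/object")

echo "present: $status_present, absent: $status_absent"
rmdir "$tmpdir"
```

Both outcomes still count as success for a caller that only cares about the end state, but a caller that wants to know whether any work was done can tell them apart.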