As seen in https://github.com/datalad/datalad/pull/6510 datalad is having to parse error messages to determine when nonexistant filenames were passed to git-annex. It would be better if there was some kind of API that it could use, since error messages can (and have) changed.
What would this API look like? It seems it could not be a special exit status of git-annex, because exit status 101 is already allocated for indicating when things like --size-limit and --time-limit have caused it to skip processing certian files.
Perhaps something like --json-error-unmatch
which would make it output
an additional line with a JSON record indicating that git ls-files
--error-unmatch had errored out.
Or --json
could enable this additional JSON record output.
The advantage would be that datalad would not need to check the git-annex
version to see if the option is supported. It would need to output a
JSON record even when git-ls-files did not fail. Perhaps something like
this:
{"git-ls-files exit status": 0}
{"git-ls-files exit-status": 1}
This does risk breaking things that parse the existing JSON and fall over on the new record, but I think git-annex should be free to add new records and fields to its JSON output in general, and it has probably at least added new fields before. --Joey
Although sounds logical, FWIW, not sure if DataLad would be robust to such an assumption since I don't think it was exercised before and indeed all returned records were typically associated with some particular
"path"
so our code might rely on that. Also it makes it a bit unclear on what datalad code should do with "unrecognized" outputs -- are they sign of a problem, or could be safely ignored?Ideally, IMHO and IIRC, DataLad should not concern itself with exit codes of
git
commands whichgit-annex
ran -- it should be up togit-annex
to decide on what any particular underlyinggit
invocation supposed to do etc. I guess only in the cases of some faulty, unknown to git-annex how to handle, behavior where git-annex itself reports an error, might be worth including details on underlying failedgit
invocation in the output, so datalad also could channel it to the user.But datalad is apparently concerning itself with just that, since there was a bug when git-annex's "not found" stderr changed.
I do think this needs to have a good motivation before it's implemented, and that's currently all I have, that datalad for some reason cares about it. Maybe there is not a good reason for datalad to care though..
You may well be right about adding new JSON objects breaking parsers.