Recent comments posted to this site:

This is a great feature, especially --hide-missing! I really missed this in the past. (Strangely it took me until now to notice that you implemented it.) Thank you.
Comment by mario Thu Jan 23 19:52:47 2020
Thank you Joey! I can only confirm that the file system was likely a crippled/NFS one... So we would likely need to do some sensing on DataLad side and instruct git-annex. Will continue on our end at https://github.com/datalad/datalad/issues/4075
Comment by yarikoptic Thu Jan 23 17:51:58 2020
"the only user-visible improvement is these error messages" -- FWIW, I've been bitten by the lack of config param checking in the past (thought I had set a chunk size but didn't due to misspelled param name, had to re-create the remote.)
Comment by Ilya_Shlyakhter Thu Jan 23 16:51:44 2020

I notice that debug output has no BatchMode=true in any ssh call. But the version of git-annex you show always runs ssh with that when -J is used, unless sshcaching is disabled.

More evidence that sshcaching is disabled in your transcript is that when it does run ssh, it does not pass -S.

I think the repository must be on a crippled filesystem, on which git-annex can't do ssh connection caching, because the filesystem does not support unix sockets. (Or it potentially could be crippled in some other way.) So it ignores the annex.sshcaching setting. You could work around this by setting the (undocumented) GIT_ANNEX_TMP_DIR to some temporary directory on a non-crippled filesystem.

I'm going to add a warning message in this situation.
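
To illustrate the kind of check involved, here is a minimal Python sketch that probes whether a directory can hold a unix socket and, if not, points GIT_ANNEX_TMP_DIR at a fallback directory. This is not git-annex's own detection code (git-annex is written in Haskell). The function names and the probe file name are hypothetical, and it assumes a POSIX system where Python exposes socket.AF_UNIX.

```python
import os
import socket


def supports_unix_sockets(directory):
    """Return True if a unix domain socket can be created inside `directory`."""
    # Hypothetical probe file name; the pid keeps concurrent runs from colliding.
    path = os.path.join(directory, f".sock-probe-{os.getpid()}")
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # bind() raises OSError if the filesystem cannot hold sockets,
        # if the directory is missing, or if the path is too long.
        sock.bind(path)
        return True
    except OSError:
        return False
    finally:
        sock.close()
        if os.path.lexists(path):
            os.unlink(path)


def env_for_git_annex(repo_dir, fallback_dir):
    """Build an environment for a git-annex subprocess (hypothetical helper)."""
    env = dict(os.environ)
    if not supports_unix_sockets(repo_dir):
        # Workaround described above: use a tmp dir on a non-crippled filesystem.
        env["GIT_ANNEX_TMP_DIR"] = fallback_dir
    return env
```

The returned dictionary could be passed as the env argument of subprocess.run when invoking git-annex. A False result only says that the probe failed, not which limitation of the filesystem caused it.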

Comment by joey Thu Jan 23 15:51:46 2020
Error message has been improved.
Comment by joey Wed Jan 22 17:04:09 2020

No, the external special remote protocol is not aimed at downloading git config files. Anyway, this code path is never involved with using special remotes; the uuid of a special remote is known and so there is no need to ever download a git config file to discover it.

Comment by joey Wed Jan 22 16:31:16 2020

git-annex could use git credential if the config download fails with 401 unauthorized and then retry with the credentials. (The git-lfs special remote already does this.) And it would also need to do the same thing when getting a key from the remote.

But that would not help with the https://git.bic.mni.mcgill.ca example, apparently, because there's no 401, but a 302 redirect to a 200, that is indistinguishable from a successful download.

Yeah, when git-annex expects a git config, if it doesn't parse as one, it could retry, asking for credentials. But that seems asking for trouble: what if it fails to parse for another reason, maybe the web server served up something other than the expected config, maybe a captive portal got in the way. There would be a username/password prompt that doesn't make sense to the user at all.

And if this happens in a key download, git-annex certainly has no way to tell that what it downloaded is not intended as the content of a key, short of verifying the content, and failure to verify certainly doesn't justify prompting for a username/password.

So, I am not comfortable with falling back to ask for credentials unless I've seen a http status code that indicates they are necessary. And IMHO gitlab's use of a 302 redirect to a login page is a bug in gitlab, and will need to be fixed there, or a better http server used.
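
The rule argued for above can be sketched in a few lines of Python. This is only an illustration, not code from git-annex; the function name and the idea of passing the list of status codes seen for one request are assumptions made for the example.

```python
def should_prompt_for_credentials(status_chain):
    """Decide whether asking for a username/password is justified.

    `status_chain` is the list of HTTP status codes seen for one request,
    in order, for example [302, 200] for a redirect to a login page.
    """
    # Only an explicit 401 says that credentials are needed. A 302 that
    # ends in a 200 looks exactly like a successful download, so no
    # prompt is justified there, whatever the body turns out to contain.
    return bool(status_chain) and status_chain[-1] == 401


print(should_prompt_for_credentials([401]))       # True
print(should_prompt_for_credentials([302, 200]))  # False
```

A response body that fails to parse, or a key that fails verification, never enters the decision, which matches the reasoning above.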

Comment by joey Wed Jan 22 16:04:37 2020

I've said this before, but I'll say it one more time: --json-error-messages is not a guarantee that every possible error message that may be output by git-annex in some exceptional circumstance will be formatted as json.

In this case, while I happened to make it be captured in an unrelated change, there's actually no benefit to it being captured. If it were no longer captured tomorrow, I would not consider that a bug. This error message is not specific to a particular file in the repository, so if git-annex get outputs it, it doesn't help for the error message to be wrapped up in json. The actual purpose of --json-error-messages is being able to correlate a failure to, e.g., get a particular file with an error message related to that action. Not in avoiding all possible stderr.
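
As a sketch of that correlation from the consumer's side, the following Python reads one JSON object per line and maps each failed file to its messages. The field names "file", "success" and "error-messages" are assumptions about the JSON output and should be checked against what your git-annex version actually emits. The function name is hypothetical.

```python
import json


def failures_by_file(json_lines):
    """Map each failed file to the error messages reported for it."""
    failures = {}
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Not every message is guaranteed to arrive as JSON, so
            # anything that does not parse is skipped, not treated as fatal.
            continue
        if not isinstance(record, dict) or record.get("success", True):
            continue
        failures[record.get("file")] = record.get("error-messages", [])
    return failures
```

Errors that are not tied to a particular file will not show up in this mapping, so a caller still needs to handle stderr separately.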


The extra newlines output to stdout are there because the warning action does not know if something may have been output to stdout earlier without a terminating newline, and it wants to avoid an ugly interleave of stdout and stderr. While state could be maintained to keep track of that, the end result would be that git-annex would become some milliseconds slower, and it does not seem worth the complexity or minor speed hit to cater to the case where stderr is /dev/nulled. Note that this doesn't happen when using --json. Also, IIRC it's avoided when using concurrent output, which does pay the time/complexity overhead already to keep track of the state of the display to that extent. Anyway, I'm obviously not going to leave this bug report open for such a minor and tangential issue after the main issue in it is fixed, so it's kind of annoying to need to write this wall of text about it. May I suggest one bug report per distinct issue is a good way to avoid my current annoyed state?


Not wanting to sit down and write all this is why, the previous two or three times I opened this issue, I promptly closed the window rather than addressing any part of it.

Comment by joey Wed Jan 22 15:11:05 2020

Honestly I feel like the (perceived) semantics of sync are broken by this behaviour. I would expect git-annex to do what it has to to make what I asked for happen.

I agree that in general it's a good thing not to needlessly override git settings but for the sync command I really don't see any way that not merging can be considered sensible behaviour. To me as a user it just feels like I changed a setting completely unrelated to git-annex-sync and suddenly sync broke.

Consider this: the git-annex-sync(1) man page never actually mentions that it will run git-merge. On the other hand git-pull(1) is very forthcoming with the fact that it's just a shorthand for git fetch; git merge so it's obvious to me that settings affecting merge will affect git-pull, not so for sync.

I've been unable to sync my git-annex repos for a couple of months now because of this issue, so I firmly believe this is a serious usability issue.

At the very least we have a documentation issue here. Though I would still argue the behaviour is bonkers :)

Comment by dxld Tue Jan 21 19:28:29 2020

I feel it's right for git-annex sync to honor git configs, so it's right for it to not merge origin/master. And, without that merge, it's right for it to fail to push master to origin. Since it does push synced/master, this does not prevent other clones of the repo, where git-annex sync is later run, from getting the changes made by this sync.

That leaves only this ugly thing:

fatal: ambiguous argument 'refs/heads/master..refs/heads/synced/master': unknown revision or path not in the working tree.

Which comes from Git.Branch.changed, but I'm not clear how the fast forward configuration would prevent either of those refs from existing.

Comment by joey Tue Jan 21 18:31:22 2020