Please describe the problem.
This is a continuation to the prior report/discussion to facilitate access to private repositories on public hosting portals.
Setting aside the more odd/custom behavior of gitlab etc. installations, which forward to a login screen (thus no 401 or 404 response) upon an attempt to access something which might be within a private repo, the situation with github and gogs (a github clone, which powers gin, as I had mentioned in that prior discussion) is different: they return a 404 response. And I think (I didn't check the git code, this is just based on its behavior) git is then asking for credentials as the "next way to try". I think git-annex should do the same -- if a 404 is received, ask git credential to fill for that domain (as it would do now in the case of a 401).
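For reference, a minimal sketch of what "ask git credential to fill for that domain" amounts to. `git credential fill` reads `key=value` lines (such as `protocol`, `host`, `path`) terminated by a blank line. The helper names `credential_fill_input` and `ask_git_for_credentials` are hypothetical and this is not how git-annex implements it.

```python
import subprocess
from urllib.parse import urlsplit


def credential_fill_input(url: str) -> str:
    """Build the key=value description that `git credential fill` reads on stdin."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port:
        # git expects the port to be part of the host attribute
        host = f"{host}:{parts.port}"
    lines = [f"protocol={parts.scheme}", f"host={host}"]
    path = parts.path.lstrip("/")
    if path:
        lines.append(f"path={path}")
    # A blank line terminates the description.
    return "\n".join(lines) + "\n\n"


def ask_git_for_credentials(url: str) -> str:
    """Hypothetical wrapper: run `git credential fill` and return its raw output.

    May prompt the user or consult configured credential helpers.
    """
    result = subprocess.run(
        ["git", "credential", "fill"],
        input=credential_fill_input(url),
        text=True,
        capture_output=True,
        check=True,
    )
    return result.stdout
```

Only `credential_fill_input` is pure; `ask_git_for_credentials` needs git installed and may prompt interactively.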
What steps will reproduce the problem?
Try to clone and get data from a private repository on https://gin.g-node.org/ (a repo could be created, or let me know and I would create one, but you would still need to register there). I am not yet 100% certain that upon authentication you would be able to fetch that /config (haven't tried). Satellite issue/discussion I just initiated on gin is here
What version of git-annex are you using? On what operating system?
8.20201127+git54-ga1b227171-1~ndall+1
edit 1: although a deeper look into how/why git decides to ask for credentials for private repos is probably due. Maybe a similar check should be done by git-annex first, since otherwise there might be no way to tell this apart from a "proper" 404 for inability to get /config from github
The git source code does not appear to behave like that, see http.c normalize_curl_result, which reauths on 401, but not on 404. If you think git behaves like this, you need to show an example where it clearly accesses a url that is 404 and goes on to authenticate.
Seems to me that these hosting sites may simply not be exposing foo.git/config to http. Git does not request that file over http. Such a hosting site would probably also not expose foo.git/annex/ over http, so git-annex would not be able to use it anyway. To support git-annex, it would need to expose both, and then git-annex's handling of 401 should work fine for authentication.
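The distinction being drawn above can be modelled in a few lines. This is a simplified illustration of "reauth on 401, but not on 404", not a transcription of git's C code; the `Action` enum and `classify_http_status` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    OK = "ok"            # request succeeded
    REAUTH = "reauth"    # ask for credentials and retry
    MISSING = "missing"  # target does not exist; do not prompt
    ERROR = "error"      # anything else


def classify_http_status(code: int) -> Action:
    """Map an HTTP status code to what the client does next (simplified model)."""
    if 200 <= code < 300:
        return Action.OK
    if code == 401:
        return Action.REAUTH
    if code == 404:
        return Action.MISSING
    return Action.ERROR
```

Under this model a server that answers 404 for a private repository never triggers a credential prompt, which is why the status code the server sends matters.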
foo.git/annex/ -- that is what gin has extended the original gogs with. Example repo to try: https://gin.g-node.org/ljchang/Sherlock . The problem/difficulty is only in access to "private" repositories -- access to config and annexed files is working fine through http
It still seems easy to demonstrate that git does not ask for creds on 404:
So I need you to show me what makes you think that git does such a strange thing, before I can take seriously a request to replicate that behavior in git-annex. Because the only possible reason I would implement such an insane thing is if git has lost its collective mind and so I needed to follow into the abyss.
If the actual issue is that gogs has implemented support for git-annex, but that it sends 404 when git-annex requests config from a private repo, rather than 401, it seems to me the place to fix that is in gogs.
yeap, it is not about 404 ...
with gogs/gin the situation is obscure but "easyish" - 401 is returned upon access to /info/refs but not for the above:
github is ... trickier, or to say -- my C/gdb/whatever foo is not good enough, since it is still 404 with simple wget but git remote-https seems to get 401:
but overall the point is that git does seem to get 401 with auth availability (although I failed to dig out how exactly it gets it). So I will leave it to the experts to figure out how
These possibilities seem about equally likely to me:
So why try to work around it in git-annex when it's a coin flip whether git-annex can at all, when in either case there's clearly a bug in gogs, and it is specifically in code in gogs that is intended to support git-annex?
github has a bad habit of using user-agent to make urls do different things when git accesses them than when other http clients do. That is the case in your example; use wget -U git/1 and it will 401. But I don't see how that's relevant, since git-annex does not talk to github except for a) via git and b) via its git-lfs implementation (which supports http basic auth although I can't remember if I tested it against github's server or only other servers like gitlab).
If github's lfs endpoint did do user-agent sniffing, IMHO that would violate their spec, but also yeah, I'd probably put in some appropriately snarky fake user-agent in git-annex there. But not in general, and none of this says git-annex should be treating 404 like 401.
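To compare how a server answers different user-agents without wget, something like the following sketch could be used. The URL is a hypothetical placeholder, and `build_request` and `status_for` are illustrative helpers, not part of git or git-annex.

```python
import urllib.error
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a GET request that announces the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def status_for(url: str, user_agent: str) -> int:
    """Return the HTTP status the server sends for this url/user-agent pair."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent)) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is what we want.
        return err.code


if __name__ == "__main__":
    # Hypothetical private repository URL; substitute a real one to try it.
    url = "https://github.com/example/private-repo/info/refs?service=git-upload-pack"
    for agent in ("Wget/1.20", "git/1"):
        print(agent, status_for(url, agent))
```

Running it against the same url with a wget-like and a git-like agent shows whether the server distinguishes the two.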
gin portal just returns 401 in such cases
github's rationale for the sniffing, such as it is, is that an url to a git repository lets you view it in the web ui, and the same url can be cloned by git.
Agreed, I'll close this in git-annex, and they can fix it in gin.