Please describe the problem.
With p2phttp --wideopen, a git annex drop will lock content on the remote before dropping. With p2phttp --unauth-readonly, git annex drop will instead be satisfied with a "RecentlyVerifiedCopy". This is an issue for forgejo-aneksajo, as it does its own authentication before handing over to p2phttp --wideopen, at which point a drop will try to lock the file on the remote but authentication will fail. Instead, it should fall back to the "recently verified is enough" behavior of unauth-readonly (and dumb http).
Sorry for the rather unuseful title, the character limit makes coming up with a good summary hard.
What steps will reproduce the problem?
- serve a repository with git annex --debug p2phttp --wideopen
- get and drop a file in a clone
- observe file locking
- do the same with git annex --debug p2phttp --unauth-readonly
- do not observe file locking
What version of git-annex are you using? On what operating system?
git-annex version: 10.20240927-g3d7f94ea398b5e84dab3bc89bc5b37746de1d40c
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Servant Benchmark Feeds Testsuite S3 WebDAV
dependency versions: aws-0.24.1 bloomfilter-2.0.1.2 crypton-0.33 DAV-1.3.4 feed-1.3.2.1 ghc-9.4.7 http-client-0.7.14 persistent-sqlite-2.13.2.0 torrent-10000.1.3 uuid-1.3.15 yesod-1.6.2.1
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL GITBUNDLE GITMANIFEST VURL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg rclone hook external
operating system: linux x86_64
supported repository versions: 8 9 10
upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10
local repository version: 10
Please provide any additional information below.
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
# End of transcript or log.
git annex p2phttp itself does, is that right? Maybe the specific response expected by git-annex in case of an unauthorized request could be part of the protocol specification...
It's not a special case about locking. p2phttp always uses 403 when the mode it's serving does not allow the class of action.
For example, with --unauth-appendonly a remove request will cause a 403 response, and with --unauth-readonly any non-read request does.
The docs say:
"When authentication is successful but does not allow a request to be performed, it will fail with 403 Forbidden."
A 401 does make git-annex prompt for a password. p2phttp responds with that when --authenv is used and the client didn't use basic authentication.
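The status-code rules described in this comment can be summarized as a small decision table. This is only an illustrative sketch in Python, not p2phttp's actual code: the `Action` classes, the `status_for` function, and the mode strings are hypothetical names, and it assumes that an authenticated --authenv user is allowed every action and that a permitted request simply returns 200.

```python
from enum import Enum


class Action(Enum):
    READ = "read"      # e.g. retrieving content
    APPEND = "append"  # e.g. storing content
    REMOVE = "remove"  # e.g. removing content


# Which classes of action an unauthenticated client may perform in each mode.
# Only the 403 cases for remove and non-read requests come from the thread;
# the rest of the table is an assumption made for this sketch.
ALLOWED_UNAUTHENTICATED = {
    "wideopen": {Action.READ, Action.APPEND, Action.REMOVE},
    "unauth-appendonly": {Action.READ, Action.APPEND},
    "unauth-readonly": {Action.READ},
}


def status_for(mode, action, authenticated=False):
    """Return the HTTP status a server in the given mode would answer with."""
    if mode == "authenv":
        # No basic authentication supplied: ask the client to authenticate.
        return 200 if authenticated else 401
    # The mode does not allow this class of action: 403, never 401.
    return 200 if action in ALLOWED_UNAUTHENTICATED[mode] else 403
```

Under these assumptions, `status_for("unauth-readonly", Action.REMOVE)` gives 403, while `status_for("authenv", Action.REMOVE)` gives 401 and so would trigger a password prompt.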
I see. Is a combination of --unauth-* and --authenv supposed to work?
I just tested it and if I serve a repository with git annex p2phttp --unauth-readonly --authenv-http -J2 --port 54321 and try to do a git annex drop --from origin then it responds with a 403 and doesn't ask for credentials, even though there is a user configured that has write permissions and dropping works without the --unauth-readonly. Even if I previously authenticated and have the credentials in my keyring it still 403s, as git-annex seems to always first try the request without authentication.
This means the --unauth-readonly option currently isn't "allow unauthenticated read-access", but "only allow unauthenticated read-access, deny all writes".
I think wanting anonymous read and authenticated write access is quite common, so maybe this should be supported?
This kind of thing starts to work as soon as p2phttp responds with 401 for non-read requests, prompting git-annex to ask for credentials. But then you get the issue that a drop on the client side will try to lock, get a 401, and ask for credentials, instead of falling back to the read-only way of dropping. This is where lockcontent is special: it isn't strictly necessary for a drop to succeed, whereas the other endpoints have nothing meaningful to fall back to.
This is why I assumed that lockcontent was handled specially already, and maybe it should be?
I think the way it is written in the design document doesn't support the current behavior. It says "When authentication is successful but does not allow a request to be performed, it will fail with 403 Forbidden." but authentication hasn't even been attempted before returning a 403 with --unauth-readonly. Instead, it also says "When a request needs authentication, it will fail with 401 Unauthorized.", which would apply to this situation (under the assumption that --unauth-readonly doesn't mean "no authentication possible at all", which I had).
I didn't consider combining the two in the current implementation, so behavior is essentially undefined. It happens to check for --unauth-* before --authenv currently.
Agreed.
Well there are benefits to it actually locking rather than the fallback. It allows dropping in more situations. So falling back on a 401 does not seem like a good idea to me.
It might be that lockcontent should be allowed in a readonly connection. The only possible issue is that would allow an anon to keep an object locked indefinitely as some kind of DOS attack, so long as they were willing to keep a connection open for keeplocked.
I've implemented combining --unauth-readonly (or --unauth-appendonly) with --authenv/--authenv-http.
Read-only drop locking in that configuration still needs to be addressed; it currently prompts for authentication.
In a sense the underlying problem here is that git-annex as a client to p2phttp does not know if the user wants it to be read-only or prompt for a password as necessary to perform write operations.
Another way that would be a problem is if a p2phttp server supports both readonly and authenticated operation, but the user does not have an account, and is using, say, git-annex assist, which wants to store content on the server if possible. So it will prompt repeatedly for a login and password which the user does not have.
In this sense, the server is fine in sending a 401; the problem is that the client doesn't know when the user doesn't want that to result in a password prompt. If the client did know, it could treat that 401 the same as a 403.
Looking at drop locking through this lens, if the client wants to avoid password prompts and the server requires authentication for lockcontent, it's reasonable for the client to fall back to checkpresent for dropping. The same as it's reasonable for checkpresent to be used when the remote is a dumb http git remote.
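The client-side choice described here could be sketched roughly as follows. This is a hypothetical Python illustration of the reasoning, not git-annex's implementation: `plan_drop`, its parameters, and the returned strings are invented for the example, and it assumes the client already knows whether the user wants to avoid password prompts.

```python
def plan_drop(lock_status, allow_prompt):
    """Decide how a drop proceeds given the server's answer to lockcontent.

    lock_status:  HTTP status returned for the lockcontent request.
    allow_prompt: False when the user wants to avoid password prompts.
    Returns "lock", "prompt", or "checkpresent".
    """
    if lock_status == 200:
        # The remote copy is locked, so the drop can rely on it.
        return "lock"
    if lock_status == 401 and allow_prompt:
        # The server wants authentication and the user accepts prompts.
        return "prompt"
    # A 403, or a 401 that should be treated like a 403: fall back to
    # checkpresent, as with a dumb http git remote.
    return "checkpresent"
```

For example, `plan_drop(401, allow_prompt=False)` yields "checkpresent", the same outcome as a 403.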
The url to the p2phttp remote seems like the natural way for the user to tell git-annex if they want an anonymous or an authenticated connection.
It already works to use annex+http://joey@example.com/git-annex/, which will make it prompt for my password when operations need authentication.
So it would make sense to support "anonymous@" and special case that to treat 401 the same as 403.
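To make the "anonymous@" proposal concrete, here is a minimal Python sketch of how a client might read that intent from the remote URL. It illustrates the idea as proposed in this comment only; `wants_anonymous` and `effective_status` are hypothetical helpers, and the example URLs other than the one quoted above are made up.

```python
from urllib.parse import urlsplit


def wants_anonymous(remote_url):
    """True when the URL's user part is the proposed special name 'anonymous'."""
    # urlsplit accepts schemes containing '+', such as annex+http.
    return urlsplit(remote_url).username == "anonymous"


def effective_status(remote_url, status):
    """Treat a 401 like a 403 when the URL asks for an anonymous connection."""
    if status == 401 and wants_anonymous(remote_url):
        return 403
    return status
```

With this sketch, a 401 received for annex+http://anonymous@example.com/git-annex/ is handled as a 403 (no prompt), while the same response for annex+http://joey@example.com/git-annex/ stays a 401.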
@matrss would there be a way for forgejo-aneksajo to draw this distinction between anonymous and authenticated urls in its user interface?
I agree, this would be easy to deal with if the user intent was somehow made clear.
Technically anonymous is a valid username in Forgejo.
The way I've implemented it so far the user interface only exposes the http(s):// URL for the normal git remote and the annex+http(s):// URL is then taken from the server-side git config (on git annex init). Since I believe that git-annex will always try to read the config without authentication first I don't yet see a way to distinguish user intent. It could show the user different URLs for the git remote depending on if the user is logged in or not, but I don't think there is a way to make it dependent on the plain-git remote URL? That also doesn't sound like the best idea, and I am not sure if it really is a good proxy for user intent; I would expect people to not be authenticated in the web interface and still want to push files to the server after cloning (e.g. maybe they got logged out from a timeout, don't notice, and now go to a public repository of their own to clone it and make changes).
To mark a remote as read-only there is the already existing remote.<remote>.annex-readonly config option that could be used. But when to set it...
I still think the most practical option would be to special-case lockcontent... I'd expect that any user who seriously uses http will have some kind of credential helper configured. If they have used copy/move/sync/assist/etc then they will already have their credentials in the helper and the lockcontent request can just use them, and if they have just gotten some files and now drop them and don't have credentials for the remote in their helper then I think it is safe to assume they don't want to specify them just for dropping, i.e. fall back to checkpresent. In a situation where a lock is really necessary for dropping and it wouldn't succeed without one, git-annex could still ask for credentials. Is there a situation in which that wouldn't do "the right thing"?
Ok I decided to allow locking for unauthenticated users by default.
In case that gets abused there is a --unauth-nolocking option which will result in a 401 when --authenv is used, or a 403 otherwise.
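The resulting rule for lockcontent can be written down as a short decision function. This is a Python sketch of the behavior as stated in this comment, not the actual implementation: `lockcontent_status` and its parameters are hypothetical, and it assumes an authenticated client is always allowed to lock and that a permitted request returns 200.

```python
def lockcontent_status(authenticated, unauth_nolocking, authenv):
    """HTTP status for a lockcontent request under the final behavior.

    authenticated:    the client supplied valid credentials (assumed to
                      permit locking, which the thread does not spell out).
    unauth_nolocking: the server was started with --unauth-nolocking.
    authenv:          the server was started with --authenv.
    """
    if authenticated or not unauth_nolocking:
        # Locking is allowed for unauthenticated users by default.
        return 200
    # Locking refused for unauthenticated clients: 401 if they could
    # authenticate, 403 otherwise.
    return 401 if authenv else 403
```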