Recent comments posted to this site:

comment 1

git-annex get picks which remote to use, and falls back as needed to another remote if the first is not available, and of course does nothing if the content is present already.

It would be perhaps most symmetric with that if git-annex put picked one remote to send content to (ie, the lowest cost one that wants it), fell back to the next best remote if that one was not available, and avoided sending any content for files that are in some other repository already.

As well as just being symmetric, that feels like a useful behavior that is not currently possible to get from any git-annex command.

That's in tension with the idea that git-annex put --json would send to the same remotes that git-annex push would. Maybe that behavior should be an option? Or maybe that belongs in yet another command.

Just how useful would the 1 copy behavior be? One indication maybe is that no one has ever asked for that behavior. And it seems like it would be easy for the content to go to an unexpected place and break a workflow. Eg, suppose a user starts using git-annex put, which sends the content to origin and makes it available to others. But they also have a remote for a local USB drive, which has been disconnected all that time. When they one day reconnect that drive, it has a lower cost, and so their puts start going there, preventing others from accessing the files.
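To make the USB drive scenario concrete, here is a minimal sketch of the selection rule under discussion: pick the lowest-cost remote that is available and wants the content. This is not git-annex code. The `pick_put_remote` function, the `name cost available wants` input format, and the cost numbers are all assumptions made up for illustration.

```shell
# Hypothetical helper, not part of git-annex: reads lines of
# "name cost available wants" on stdin and prints the name of the
# lowest-cost remote that is both available and wants the content.
pick_put_remote() {
    awk '$3 == "yes" && $4 == "yes" { print $2, $1 }' |
        sort -n | head -n 1 | cut -d' ' -f2
}

# While the USB drive is disconnected, origin is chosen.
printf '%s\n' 'origin 200 yes yes' 'usbdrive 100 no yes' | pick_put_remote

# Once the drive is reconnected, its lower cost wins and origin gets nothing.
printf '%s\n' 'origin 200 yes yes' 'usbdrive 100 yes yes' | pick_put_remote
```

Nothing in the second call signals that the destination changed, which is the workflow breakage described above.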

Also it's worth noting that pull picks a remote to get from, but push sends to all remotes that want it. So this particular symmetry is not maintained all the way up. So perhaps it's not a useful symmetry.

Overall, it seems like something that could be an option and not the default, if someone has a good use case.

Comment by joey
comment 11

Hmm, git-annex get --auto and git-annex copy --auto don't just check preferred content, they will also make a copy if numcopies is not satisfied for the file.

That's something that git-annex pull and git-annex push don't do.

I think it would be a good idea to add a --wanted option to commands that support --auto. It would only operate on preferred content, avoiding making copies only to work toward satisfying numcopies.

Update: Added --wanted

Comment by joey
comment 10

Summarizing the current status of this todo, if someone wants the equivalent of git-annex pull --json $someremote, they can run:

git-annex pull --no-content $someremote
git-annex get --auto --json --from $someremote
git-annex drop --auto --json

The first command above does not have json output, but outputs the usual git pull messages for the user to deal with as they see fit.

And, if someone wants the equivalent of git-annex push --json $someremote, they can run:

git-annex copy --auto --json --to $someremote
git-annex drop --auto --json --from $someremote
git-annex push --no-content $someremote

The last command above does not have json output, but outputs the usual git push messages for the user to deal with as they see fit.
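For scripting, the two sequences above could be wrapped roughly as follows. This is only a sketch: the function names `annex_pull_json` and `annex_push_json`, the `run` helper, and the `DRY_RUN` variable are made up here and are not part of git-annex. Sending the non-JSON git pull/push messages to stderr is one possible way to "deal with" them, so that stdout carries only JSON.

```shell
# Sketch only. Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Rough stand-in for "git-annex pull --json REMOTE".
annex_pull_json() {
    # The first step has no JSON output, so keep it off stdout.
    run git-annex pull --no-content "$1" >&2 &&
    run git-annex get --auto --json --from "$1" &&
    run git-annex drop --auto --json
}

# Rough stand-in for "git-annex push --json REMOTE".
annex_push_json() {
    run git-annex copy --auto --json --to "$1" &&
    run git-annex drop --auto --json --from "$1" &&
    # The last step has no JSON output, so keep it off stdout.
    run git-annex push --no-content "$1" >&2
}
```

A caller could then do something like `annex_push_json origin > push.json` and still see the git push messages on the terminal.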

The argument for adding --json to pull/push now seems to be reduced. But not gone entirely I suppose.

For example, git-annex push without a remote pushes content to all remotes that want it. That needs multiple runs of git-annex copy, one per remote. Notice that in the git-annex pull case, it can be made to operate on all remotes:

git-annex pull --no-content
git-annex get --auto --json
git-annex drop --auto --json

This seems like an argument for adding a git-annex put command that copies to all remotes that want content. Which is nicely symmetric with git-annex get.
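Until something like that exists, the "one git-annex copy per remote" approach can be sketched as a loop. The `put_all` function and the `ANNEX_CMD` override are hypothetical names used only for this illustration. Falling back to `git remote` is also an assumption, since that list may include remotes that git-annex cannot send content to.

```shell
# Hypothetical stand-in for a "put to every remote that wants it" command.
# Remote names are taken from the arguments, or from `git remote` if none
# are given. ANNEX_CMD can be overridden (eg, ANNEX_CMD='echo git-annex')
# to see what would be run.
put_all() {
    if [ "$#" -eq 0 ]; then
        set -- $(git remote)
    fi
    status=0
    for remote in "$@"; do
        # --auto limits the copy to content the remote wants
        # (or that is needed to satisfy numcopies).
        ${ANNEX_CMD:-git-annex} copy --auto --json --to "$remote" || status=1
    done
    return "$status"
}
```

Each iteration is a separate git-annex run that traverses the tree again, so the overhead grows with the number of remotes.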

Another difference is that a single git-annex push or pull (or sync) does less work than several git-annex commands. In the scripts above, git-annex has to traverse the tree twice. That is a pretty small difference in overhead.

Comment by joey
comment 9

FWIW, I've split updateBranches between pull and push now.

On git-annex push all it does is propagate adjusted branch changes back to the original branch.

On git-annex pull it handles updating the view branch and/or propagating changes from the original branch to the adjusted branch.

Also, git-annex push was fixed to not merge synced/master into master and to not update the adjusted branch when the original branch has changed.

Comment by joey
comment 3

I'm surprised it responds to HEAD at all. It's not a documented part of the p2phttp API, and the implementation is only a GET endpoint. I guess that servant makes GET endpoints also support HEAD? Urk.

Yes, I think all of the "higher-level http server frameworks" I've encountered (definitely Flask and the construct Forgejo is using, but also others) automatically support HEAD for all GET endpoints, because a properly implemented HEAD is a subset of GET anyway. I'd expect servant to do the same.

(I do think it could have also happened without HEAD with just the right timing of the client hanging up on GET, still have not verified that. Of course, we had a whole bug about p2phttp can get stuck with interrupted clients that was dealt with previously, but maybe we missed it back then.)

At least I didn't get the p2phttp server stuck with interrupted clients while investigating this issue (that was my initial guess on what was causing the server to get stuck in the first place), but I did see a different bug that I didn't yet report which caused the p2phttp server to exit with exit code 141 if a client was interrupted at the "right" time. This one might already be fixed by https://git-annex.branchable.com/bugs/SIGPIPE_behavior_change/ though.

I've also documented HEAD /git-annex/$uuid/key/$key as supported by p2phttp because if you give an HTTP client a URL, I suppose it may try HEAD.

The initial use-case by mih was to point git annex addurl at this key endpoint, and that does try HEAD, which triggered the bug. So even git-annex itself does it, it just fell out of the report when I reduced the reproducer as far as possible :)

Fixed this.

Thank you!

Comment by matrss
comment 2

Fixed this.

(I do think it could have also happened without HEAD with just the right timing of the client hanging up on GET, still have not verified that. Of course, we had a whole bug about p2phttp can get stuck with interrupted clients that was dealt with previously, but maybe we missed it back then.)

I've also documented HEAD /git-annex/$uuid/key/$key as supported by p2phttp because if you give an HTTP client a URL, I suppose it may try HEAD.

I would rather that the versioned GET endpoints not also support HEAD, just because it's not part of the interface git-annex uses. If I find a way to prevent servant from automatically supporting HEAD for those, I will use it.

Comment by joey
comment 1

Reproduced this.

I'm surprised it responds to HEAD at all. It's not a documented part of the p2phttp API, and the implementation is only a GET endpoint. I guess that servant makes GET endpoints also support HEAD? Urk.

It seems possible that this doesn't only happen on HEAD, but also on a GET where the client disconnects without reading any of the response body. The code path looks like it would possibly be the same.

It is getting stuck on getP2PConnection. So far I've determined that the connection servicer thread gets stuck handling a connection release. Which is why the subsequent HEAD fails. So will any subsequent request actually. So this can take down a p2phttp server with a single request.
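A probe for this kind of hang could look roughly like the sketch below, which sends two HEAD requests in a row with a timeout. The `key_url` and `probe_head_twice` names and the `CURL` override are made up for illustration. The path layout assumes the /git-annex/$uuid/key/$key endpoint, and the base URL, uuid and key are placeholders that need to be filled in for a real server.

```shell
# Build the key URL from a base URL, a repository uuid and a key.
key_url() {
    printf '%s/git-annex/%s/key/%s' "$1" "$2" "$3"
}

# Send HEAD twice. CURL can be overridden (eg, CURL=echo) for a dry run.
probe_head_twice() {
    url=$(key_url "$1" "$2" "$3")
    # --head sends a HEAD request. --max-time makes a stuck server show up
    # as a curl timeout (exit status 28) rather than hanging the probe.
    ${CURL:-curl} --silent --head --max-time 10 "$url" &&
    ${CURL:-curl} --silent --head --max-time 10 "$url"
}
```

If the first request succeeds and the second one times out, that would match the symptom described above.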

Comment by joey
comment 1

Note that the transcripts are not quite what git-annex usually outputs, due to this bug, which has now been fixed.

I have long disliked how this is displayed in the ssh case too.


At the level of the P2P protocol, a solution to this could be for the server to send an ERROR message back while the client is still in the process of sending the file with DATA. The P2P protocol allows ERROR to be sent at any time.

This would look something like P2P.IO.runNet in SendBytes on error, trying to getProtocolLine, and when it gets an ERROR returning the error message as the Left ProtoFailureMessage rather than the current exception.

(Something would need to be done in the P2PHandleTMVar to handle proxying too.)

For the P2P protocol over http, the /put response would look something like:

{"stored": false, "error-message": "not enough free space, need 1.05 GB more"}

Currently p2phttp actually replies with 500 Internal Server Error, which git-annex does display to the user.
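If the /put response did carry such a field, a client-side script could surface it along these lines. This is a sketch of the idea only: "error-message" is the field name proposed above, not something p2phttp currently sends, and `put_error_message` is a hypothetical helper. The sed pattern is a stand-in for a real JSON parser and does not handle escaped quotes inside the message.

```shell
# Hypothetical helper: prints the value of the proposed "error-message"
# field from a /put response body, or nothing if the field is absent.
put_error_message() {
    printf '%s\n' "$1" |
        sed -n 's/.*"error-message"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

response='{"stored": false, "error-message": "not enough free space, need 1.05 GB more"}'
put_error_message "$response"
```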


The other side of the problem though is that the disk space message is displayed as a warning. So how would git-annex-shell or p2phttp intercept it to send it along to the client? There would need to be quite a lot of restructuring to make that an exception.

There are other warnings as well that it would be good to send to the client. One that comes to mind is "transfer already in progress, or unable to take transfer lock". So this is a more general problem.

Comment by joey
Red herring
The last two updates (posted yesterday by myself) are misleading and the underlying cause is different. At least some aspects are explained by differences in handling access tokens provisioned by forgejo for action runs vs longer-lived access tokens. A fix for this other issue is in the works for forgejo-aneksajo.
Comment by mih
Credential is rejected!

I wanted to investigate further and added a credential "helper" that documents what was queried:

cat << EOT > /usr/local/bin/git-credential-echo
#!/usr/bin/env bash
exec cat >&2
EOT
chmod +x /usr/local/bin/git-credential-echo
git config --global --add credential.helper echo
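To see what such a helper does in isolation, it can be fed the kind of input git sends to a credential helper. The sketch below builds the same script in a temporary directory rather than /usr/local/bin. The temporary paths and the example.com host are placeholders chosen for illustration.

```shell
# Build the same echo helper in a temporary directory.
tmpdir=$(mktemp -d)
cat << 'EOT' > "$tmpdir/git-credential-echo"
#!/usr/bin/env bash
exec cat >&2
EOT
chmod +x "$tmpdir/git-credential-echo"

# git runs a helper with an operation argument (get, store or erase) and
# key=value lines on stdin. This helper copies stdin to stderr and prints
# nothing on stdout, so it only documents the query and never supplies a
# credential itself.
printf 'protocol=https\nhost=example.com\n\n' |
    "$tmpdir/git-credential-echo" get 2> "$tmpdir/seen.txt"
cat "$tmpdir/seen.txt"
```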

I also switched from annex push to annex copy (because this is the aspect that failed). I now see (what I could have seen in the log above already) that the issue is not that the credential isn't retrieved properly. It is actually rejected, and the superficial/original error is the result of prompting for another valid credential. Here is the log of a copy call:

git annex --debug copy -t origin .
[2026-05-16 10:01:43.073507233] (Utility.Process) process [639] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","git-annex"]
[2026-05-16 10:01:43.075431464] (Utility.Process) process [639] done ExitSuccess
[2026-05-16 10:01:43.075932187] (Utility.Process) process [640] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"]
[2026-05-16 10:01:43.077917969] (Utility.Process) process [640] done ExitSuccess
[2026-05-16 10:01:43.078259638] (Annex.Branch) read remote.log
[2026-05-16 10:01:43.079214293] (Utility.Process) process [641] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch"]
[2026-05-16 10:01:43.081261767] (Annex.Branch) read proxy.log
[2026-05-16 10:01:43.082419578] (Utility.Process) process [642] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","ls-files","--stage","-z","--error-unmatch","--","."]
[2026-05-16 10:01:43.082798858] (Utility.Process) process [643] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)","--buffer"]
[2026-05-16 10:01:43.083296281] (Utility.Process) process [644] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch=%(objectname) %(objecttype) %(objectsize)","--buffer"]
[2026-05-16 10:01:43.083778403] (Utility.Process) process [641] done ExitSuccess
[2026-05-16 10:01:43.086359241] (Utility.Process) process [645] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch=%(objectname) %(objecttype) %(objectsize)","--buffer"]
copy static/graph.json (to origin...) [2026-05-16 10:01:43.210382475] (Utility.Process) process [647] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","-c","filter.annex.smudge=","-c","filter.annex.clean=","-c","filter.annex.process=","write-tree"]
[2026-05-16 10:01:43.214071033] (Utility.Process) process [647] done ExitSuccess
[2026-05-16 10:01:43.214666158] (Utility.Process) process [648] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/annex/last-index"]
[2026-05-16 10:01:43.218272853] (Utility.Process) process [648] done ExitSuccess
[2026-05-16 10:01:43.218310804] (Database.Keys) reconcileStaged start
[2026-05-16 10:01:43.218806637] (Utility.Process) process [649] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)","--buffer"]
[2026-05-16 10:01:43.219327951] (Utility.Process) process [650] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch=%(objectname) %(objecttype) %(objectsize)","--buffer"]
[2026-05-16 10:01:43.219921627] (Utility.Process) process [651] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","-c","filter.annex.smudge=","-c","filter.annex.clean=","-c","filter.annex.process=","-c","diff.external=","diff","ee47aafecb91c163b0eb9e7ef1a35b07d5b1e0b9","8ea4ce9a4065396e07306bc2f30bcf295837ad6f","--raw","-z","--no-abbrev","-G/annex/objects/","--no-renames","--ignore-submodules=all","--no-textconv","--no-ext-diff"]
[2026-05-16 10:01:43.223295855] (Utility.Process) process [651] done ExitSuccess
[2026-05-16 10:01:43.225251807] (Database.Handle) commitDb start
[2026-05-16 10:01:43.225610276] (Database.Handle) commitDb done
[2026-05-16 10:01:43.225676608] (Utility.Process) process [650] done ExitSuccess
[2026-05-16 10:01:43.2257297] (Utility.Process) process [649] done ExitSuccess
[2026-05-16 10:01:43.226178161] (Utility.Process) process [652] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","update-ref","refs/annex/last-index","8ea4ce9a4065396e07306bc2f30bcf295837ad6f"]
[2026-05-16 10:01:43.228765129] (Utility.Process) process [652] done ExitSuccess
[2026-05-16 10:01:43.22880699] (Database.Keys) reconcileStaged end
[2026-05-16 10:01:43.246406143] (Utility.Process) process [653] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","credential","fill"]
[2026-05-16 10:01:43.253477499] (Utility.Process) process [653] done ExitSuccess
[2026-05-16 10:01:43.274667127] (Utility.Process) process [656] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","credential","reject"]
protocol=https
host=hub.psychoinformatics.de
username=myuser
password=***
[2026-05-16 10:01:43.2873155] (Utility.Process) process [656] done ExitSuccess
25%   31.98 KiB        70 MiB/s 0s
100%  126.9 KiB       179 MiB/s 0s[2026-05-16 10:01:43.330200019] (Utility.Process) process [662] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","credential","fill"]
protocol=https
host=hub.psychoinformatics.de
fatal: could not read Username for 'https://hub.psychoinformatics.de': No such device or address
[2026-05-16 10:01:43.341958838] (Utility.Process) process [662] done ExitFailure 128

  user error (git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","credential","fill"] exited 128)
[2026-05-16 10:01:43.357710043] (Utility.Process) process [668] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","credential","fill"]
protocol=https
host=hub.psychoinformatics.de
fatal: could not read Username for 'https://hub.psychoinformatics.de': No such device or address
[2026-05-16 10:01:43.369272597] (Utility.Process) process [668] done ExitFailure 128

  user error (git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","credential","fill"] exited 128)

failed
[2026-05-16 10:01:43.371334491] (Utility.Process) process [645] done ExitSuccess
[2026-05-16 10:01:43.371445144] (Utility.Process) process [644] done ExitSuccess
[2026-05-16 10:01:43.371533016] (Utility.Process) process [643] done ExitSuccess
[2026-05-16 10:01:43.371599238] (Utility.Process) process [642] done ExitSuccess
copy: 1 failed
Comment by mih