Recent changes to this wiki:
initial report on the dances with annexurl
diff --git a/doc/bugs/annex_overwrites_existing_p2p_annexurl.mdwn b/doc/bugs/annex_overwrites_existing_p2p_annexurl.mdwn new file mode 100644 index 0000000000..0decc22896 --- /dev/null +++ b/doc/bugs/annex_overwrites_existing_p2p_annexurl.mdwn @@ -0,0 +1,71 @@ +### Please describe the problem. + +I am trying to orchestrate testing against a forgejo+aneksjo fixture instance in [datalad-fuse](https://github.com/datalad/datalad-fuse/pull/127) where I run forgejo+aneksjo in podman container mapping to a different (fixed) external ports. Overall verdict - "have difficulties". + + +One particular, potentially a core issue (besides that odd "push master first to get things rolling", likely to initiate smth on server side I guess), is that `annexurl`, even if I predefine it correct: + +```shell +❯ git annex version | head -n 1 +git-annex version: 10.20260316+git1-g768707adf4-1~ndall+1 +❯ git annex copy --to=forgejo testfile.bin < /dev/null +Username for 'http://127.0.0.1:41713': ^C +❯ tail -n 5 .git/config +[remote "forgejo"] + url = http://127.0.0.1:41713/testadmin/test-annex-91e0e3b7.git + fetch = +refs/heads/*:refs/remotes/forgejo/* + pushurl = http://testadmin:c8bacefb551205d5b1f75bba7af38aabc2dd7287@127.0.0.1:41713/testadmin/test-annex-91e0e3b7.git + annexurl = annex+http://127.0.0.1:41713/git-annex-p2phttp +❯ git push forgejo master +Enumerating objects: 9, done. +Counting objects: 100% (9/9), done. +Delta compression using up to 20 threads +Compressing objects: 100% (8/8), done. +Writing objects: 100% (9/9), 1003 bytes | 1003.00 KiB/s, done. 
+Total 9 (delta 1), reused 0 (delta 0), pack-reused 0 (from 0) +To http://127.0.0.1:41713/testadmin/test-annex-91e0e3b7.git + * [new branch] master -> master +❯ tail -n 5 .git/config +[remote "forgejo"] + url = http://127.0.0.1:41713/testadmin/test-annex-91e0e3b7.git + fetch = +refs/heads/*:refs/remotes/forgejo/* + pushurl = http://testadmin:c8bacefb551205d5b1f75bba7af38aabc2dd7287@127.0.0.1:41713/testadmin/test-annex-91e0e3b7.git + annexurl = annex+http://127.0.0.1:41713/git-annex-p2phttp +``` + +it would get overridden by git-annex: + +```shell +❯ git annex copy --to=forgejo testfile.bin < /dev/null +copy testfile.bin (unable to connect to HTTP server: Network.Socket.connect: <socket: 30>: does not exist (Connection refused)) failed +copy: 1 failed +❯ tail -n 5 .git/config + url = http://127.0.0.1:41713/testadmin/test-annex-91e0e3b7.git + fetch = +refs/heads/*:refs/remotes/forgejo/* + pushurl = http://testadmin:c8bacefb551205d5b1f75bba7af38aabc2dd7287@127.0.0.1:41713/testadmin/test-annex-91e0e3b7.git + annexurl = annex+http://localhost:3000/git-annex-p2phttp + annex-uuid = ab484350-97c4-48a7-9d3d-8321dc966cb4 +``` + +and fixing port is not enough: + +```shell +❯ sed -i -e 's,:3000,:41713,g' .git/config +❯ tail -n 5 .git/config + url = http://127.0.0.1:41713/testadmin/test-annex-91e0e3b7.git + fetch = +refs/heads/*:refs/remotes/forgejo/* + pushurl = http://testadmin:c8bacefb551205d5b1f75bba7af38aabc2dd7287@127.0.0.1:41713/testadmin/test-annex-91e0e3b7.git + annexurl = annex+http://localhost:41713/git-annex-p2phttp + annex-uuid = ab484350-97c4-48a7-9d3d-8321dc966cb4 +❯ git annex copy --to=forgejo testfile.bin < /dev/null +Username for 'annex+http://localhost:41713/git-annex-p2phttp': ^C +``` + +seems that pointing to IP "fixes" it + +``` +❯ sed -i -e 's,localhost:41713,127.0.0.1:41713,g' .git/config +❯ git annex copy --to=forgejo testfile.bin < /dev/null +copy testfile.bin (to forgejo...) ok +(recording state in git...) +```
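The workaround at the end of the report above (pointing `annexurl` back at the fixture's IP and mapped port after git-annex has rewritten it) could be scripted roughly as follows. This is only a sketch: `annexurl_for` is a hypothetical helper, and the `/git-annex-p2phttp` path is taken from this particular forgejo fixture, not from any general rule.

```shell
# Hypothetical helper: derive an annexurl from a remote's git URL by keeping
# the scheme and host:port and dropping any embedded credentials.
# Assumes the URL has the form scheme://[user[:password]@]host[:port]/path.
annexurl_for() {
    au_scheme=${1%%://*}            # e.g. "http"
    au_rest=${1#*://}               # everything after "scheme://"
    au_hostport=${au_rest%%/*}      # cut at the first "/" of the path
    au_hostport=${au_hostport##*@}  # drop "user:password@" if present
    printf 'annex+%s://%s/git-annex-p2phttp\n' "$au_scheme" "$au_hostport"
}

# Re-pin the value after git-annex has replaced it (remote name "forgejo" as in the report; not run here):
#   git config remote.forgejo.annexurl "$(annexurl_for "$(git config remote.forgejo.url)")"
```

Deriving the value from `remote.forgejo.url` keeps the IP and port in one place, which matters here because the report shows that `localhost` and `127.0.0.1` behave differently for credential lookup.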
Added a comment: Re: registering multi-file torrent urls for existing files
diff --git a/doc/special_remotes/bittorrent/comment_6_588119c1ac99558a94631689dce9363a._comment b/doc/special_remotes/bittorrent/comment_6_588119c1ac99558a94631689dce9363a._comment new file mode 100644 index 0000000000..dc1b2cea7e --- /dev/null +++ b/doc/special_remotes/bittorrent/comment_6_588119c1ac99558a94631689dce9363a._comment @@ -0,0 +1,16 @@ +[[!comment format=mdwn + username="miris" + avatar="http://cdn.libravatar.org/avatar/bd975774ecba53f7454a61d50fc7d8cc" + subject="Re: registering multi-file torrent urls for existing files" + date="2026-03-22T12:36:40Z" + content=""" +Coming back to this one, turns out this is feasible using `git annex registerurl`: + +``` +aria2c --show-files multi-file.torrent +# get the number of the file +git annex registerurl \"$(git annex lookupkey file.mp4)\" \"magnet:...#<number>\" +``` + +Works as intended; however, this will skip verification, so it's imperative that this is only used when you are 100% certain the file in the torrent is exactly the same as your local one. :) +"""]]
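For several files from the same torrent, the `#<number>` suffix from the comment above can be factored into a small helper. A minimal sketch, assuming the file numbers come from `aria2c --show-files`; `torrent_file_url` is a hypothetical name, not part of git-annex.

```shell
# Hypothetical helper: append the "#<n>" file selector to a torrent or magnet URL.
torrent_file_url() {
    printf '%s#%s\n' "$1" "$2"
}

# Usage (not run here; needs a git-annex repository with the file already annexed,
# and $magnet set to the torrent's magnet link):
#   key=$(git annex lookupkey file.mp4)
#   git annex registerurl "$key" "$(torrent_file_url "$magnet" 3)"
```

As the comment notes, `registerurl` does not verify content, so this is only safe when the local file is known to match the one in the torrent.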
diff --git a/doc/forum/__91__android__93_____91__adb__93___Support_for_adb_over_tcp.mdwn b/doc/forum/__91__android__93_____91__adb__93___Support_for_adb_over_tcp.mdwn new file mode 100644 index 0000000000..11900c6a50 --- /dev/null +++ b/doc/forum/__91__android__93_____91__adb__93___Support_for_adb_over_tcp.mdwn @@ -0,0 +1,7 @@ +What is the best way to use adb special remote with device connected and paired over wifi? It uses random ports for connection every time, so I haven't even bothered yet to try IP:PORT as a serialnumber. + + +``` +$ adb devices -l +XXXXX:12345 device product:XXX model:XXX device:XXX transport_id:1 +```
mih seems to have stated that it works for him just fine.
diff --git a/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn b/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn index a657aa92b1..6f3492dc88 100644 --- a/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn +++ b/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn @@ -106,4 +106,4 @@ FWIW -- https://github.com/datalad/git-annex testing was not happy for awhile bu [[!meta author=yoh]] -[[!tag projects/FZJ]] +
update
diff --git a/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs/comment_1_fa5d62289f76fefaf808edfea41622cd._comment b/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs/comment_1_fa5d62289f76fefaf808edfea41622cd._comment index e943321592..05b7cf0035 100644 --- a/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs/comment_1_fa5d62289f76fefaf808edfea41622cd._comment +++ b/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs/comment_1_fa5d62289f76fefaf808edfea41622cd._comment @@ -3,20 +3,21 @@ subject="""comment 1""" date="2026-03-17T17:47:39Z" content=""" -Note that forgejo does support raw file access, and I expect that supports -range requests for annex objected. +Note that forgejo does support raw file access, and I expect that it +supports range requests for annex objects. -The endpoint where it might make sense to support range requests -is `https://git-annex.branchable.com/design/p2p_protocol_over_http/#index1h3` +The p2phttp endpoint where it might make sense to support range requests +is `/git-annex/$uuid/key/$key` When p2phttp is proxying to a special remote, it would need to download the whole file from the special remote even if the range request was for a small part. So I don't think it should be supported for proxying. One way to implement this might be to use Servant.Server.StaticFiles -to serve `.git/annex/objects/`, and make the `serveGetGeneric` API endpoint -redirect requests to that. That uses warp's built-in static file serving, -which supports range requests. +with a StaticSettings ssLookupFiles that returns the file location under +`.git/annex/objects/` (or even a location in another git-annex repository +when proxying to one, eg a cluster node.) That uses warp's built-in static +file serving, which supports range requests. But how to handle authentication? 
It seems like the only way would be to reimplement p2phttp's authentication checking as
comment
diff --git a/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs/comment_1_fa5d62289f76fefaf808edfea41622cd._comment b/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs/comment_1_fa5d62289f76fefaf808edfea41622cd._comment new file mode 100644 index 0000000000..e943321592 --- /dev/null +++ b/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs/comment_1_fa5d62289f76fefaf808edfea41622cd._comment @@ -0,0 +1,24 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 1""" + date="2026-03-17T17:47:39Z" + content=""" +Note that forgejo does support raw file access, and I expect that supports +range requests for annex objected. + +The endpoint where it might make sense to support range requests +is `https://git-annex.branchable.com/design/p2p_protocol_over_http/#index1h3` + +When p2phttp is proxying to a special remote, it would need to download +the whole file from the special remote even if the range request was for a +small part. So I don't think it should be supported for proxying. + +One way to implement this might be to use Servant.Server.StaticFiles +to serve `.git/annex/objects/`, and make the `serveGetGeneric` API endpoint +redirect requests to that. That uses warp's built-in static file serving, +which supports range requests. + +But how to handle authentication? It seems like +the only way would be to reimplement p2phttp's authentication checking as +WAI middleware. +"""]] diff --git a/doc/todo/p2phttp_ranged_requests.mdwn b/doc/todo/p2phttp_ranged_requests.mdwn deleted file mode 100644 index 674125a12c..0000000000 --- a/doc/todo/p2phttp_ranged_requests.mdwn +++ /dev/null @@ -1,13 +0,0 @@ -The p2phttp endpoint for raw download of a key does not currently support -range requests or other such things. 
While it's not always possible for -p2phttp to support that, eg when proxying to a special remote it cannot, it -would be useful if it supported it in configurations where it is possible -to do so. - -One way to implement this might be to use Servant.Server.StaticFiles -to serve `.git/annex/objects/`, and make the `serveGetGeneric` API endpoint -redirect requests to that. And reject ranged requests when proxying. - -But how to handle authentication? It seems like -the only way would be to reimplement p2phttp's authentication checking as -WAI middleware. --[[Joey]]
removed association with FZJ project as out of current scope/need
diff --git a/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn b/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn index bbbbb8775b..368d470e88 100644 --- a/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn +++ b/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn @@ -9,4 +9,3 @@ and It would be great to have those supported. The former should facilitate datalad-fuse via fsspec sparse access to data via p2p protocol, the latter -- any additional verification/treatment of the fetched content by any standard http* library/downloader. [[!meta author=yoh]] -[[!tag projects/FZJ]]
todo
diff --git a/doc/todo/p2phttp_ranged_requests.mdwn b/doc/todo/p2phttp_ranged_requests.mdwn new file mode 100644 index 0000000000..674125a12c --- /dev/null +++ b/doc/todo/p2phttp_ranged_requests.mdwn @@ -0,0 +1,13 @@ +The p2phttp endpoint for raw download of a key does not currently support +range requests or other such things. While it's not always possible for +p2phttp to support that, eg when proxying to a special remote it cannot, it +would be useful if it supported it in configurations where it is possible +to do so. + +One way to implement this might be to use Servant.Server.StaticFiles +to serve `.git/annex/objects/`, and make the `serveGetGeneric` API endpoint +redirect requests to that. And reject ranged requests when proxying. + +But how to handle authentication? It seems like +the only way would be to reimplement p2phttp's authentication checking as +WAI middleware. --[[Joey]]
reassign to existing FZJ
diff --git a/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn b/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn index 5b19dcb9f3..a657aa92b1 100644 --- a/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn +++ b/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn @@ -106,4 +106,4 @@ FWIW -- https://github.com/datalad/git-annex testing was not happy for awhile bu [[!meta author=yoh]] -[[!tag projects/trr379.de]] +[[!tag projects/FZJ]]
reassign to already existing FZJ
diff --git a/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn b/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn index 985fd7d0a2..bbbbb8775b 100644 --- a/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn +++ b/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn @@ -9,4 +9,4 @@ and It would be great to have those supported. The former should facilitate datalad-fuse via fsspec sparse access to data via p2p protocol, the latter -- any additional verification/treatment of the fetched content by any standard http* library/downloader. [[!meta author=yoh]] -[[!tag projects/trr379.de]] +[[!tag projects/FZJ]]
TODO on range requests
diff --git a/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn b/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn new file mode 100644 index 0000000000..985fd7d0a2 --- /dev/null +++ b/doc/todo/p2p__58___add_Range___40__in__41___and_Content-Length___40__out__41___hdrs.mdwn @@ -0,0 +1,12 @@ +ATM according to [.../design/p2p_protocol_over_http/](https://git-annex.branchable.com/design/p2p_protocol_over_http/) + +> Request headers are currently ignored, so eg Range requests are not supported. (This would be possible to implement, up to a point.) + +and + +> Note that there is no Content-Length header. + +It would be great to have those supported. The former should facilitate datalad-fuse via fsspec sparse access to data via p2p protocol, the latter -- any additional verification/treatment of the fetched content by any standard http* library/downloader. + +[[!meta author=yoh]] +[[!tag projects/trr379.de]]
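To make the request concrete, here is a sketch of what a sparse read could look like from the client side if Range were honoured. The design page quoted above says request headers are currently ignored, so this shows the desired behaviour, not the current one. `p2phttp_key_url` is a hypothetical helper, and the `/git-annex/$uuid/key/$key` layout is assumed from the curl commands in the p2phttp bug report elsewhere on this page.

```shell
# Hypothetical helper: build the raw-download URL for a key.
# Arguments: base URL, repository UUID, key.
p2phttp_key_url() {
    printf '%s/git-annex/%s/key/%s\n' "${1%/}" "$2" "$3"
}

# What fetching the first KiB might look like once Range is supported (not run here):
#   curl --range 0-1023 -o first-kib.bin "$(p2phttp_key_url http://localhost:8081 "$uuid" "$key")"
```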
added server side log and relation to trr379 project
diff --git a/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn b/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn index ddf14504b5..5b19dcb9f3 100644 --- a/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn +++ b/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn @@ -1,6 +1,6 @@ ### Please describe the problem. -Finally got to try the neat p2p, ultimately with the hope to connect to datalad-fuse. Wanted to test range request support (since was reported to be lacking by claude on a forgejo+aneksjo instance) and thus thought to try on the most recent version locally. +Finally got to try the neat p2p (inspired by trr379 use-case raised in matrix), ultimately with the hope to connect to datalad-fuse. Wanted to test range request support (since was reported to be lacking by claude on a forgejo+aneksjo instance) and thus thought to try on the most recent version locally. Unfortunately <details> @@ -77,6 +77,20 @@ curl: (52) Empty reply from server curl: (56) Recv failure: Connection reset by peer ``` +edit: added the server side: + +``` +❯ git annex version --raw; echo; git annex --debug p2phttp --port 8081 --wideopen + +10.20260316+git1-g768707adf4-1~ndall+1 +[2026-03-17 10:16:22.435272107] (Utility.Process) process [2104460] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","git-annex"] +[2026-03-17 10:16:22.441213037] (Utility.Process) process [2104460] done ExitSuccess +[2026-03-17 10:16:22.441610323] (Utility.Process) process [2104461] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"] +[2026-03-17 10:16:22.446123885] (Utility.Process) process [2104461] done ExitSuccess +[2026-03-17 10:16:22.447208284] (Utility.Process) process [2104462] chat: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch"] +[2026-03-17 10:16:22.453605123] (Annex.Branch) read proxy.log + +``` </details> @@ -89,3 +103,7 @@ this was on a local clone of https://datasets.datalad.org/dbic/QA/.git with `sou 10.20260316+git1-g768707adf4-1~ndall+1 - not ok; 10.20251029 - not ok. FWIW -- https://github.com/datalad/git-annex testing was not happy for awhile but I think it was due to some change in behavior affecting RIA archives (I did not have yet chance to troubleshoot manually to report) , thus unrelated here. + + +[[!meta author=yoh]] +[[!tag projects/trr379.de]]
reporting about inability for p2p with recent version
diff --git a/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn b/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn new file mode 100644 index 0000000000..ddf14504b5 --- /dev/null +++ b/doc/bugs/recent_annex_p2phttp_silently___40__in_--debug__41___refuses.mdwn @@ -0,0 +1,91 @@ +### Please describe the problem. + +Finally got to try the neat p2p, ultimately with the hope to connect to datalad-fuse. Wanted to test range request support (since was reported to be lacking by claude on a forgejo+aneksjo instance) and thus thought to try on the most recent version locally. +Unfortunately + +<details> +<summary>whenever p2phttp worked fine (for a full file) request on 10.20251029</summary> + +```shell +❯ curl http://localhost:8081/git-annex/90d896aa-00d0-4f85-bcae-2fd1e992fcab/key/SHA256E-s101318091--02b4a96d66121ddbb5d51fa3f22c2b929bc16d955f438421c1f1b04a1264a50f.tgz >| out.tgz + % Total % Received % Xferd Average Speed Time Time Time Current + Dload Upload Total Spent Left Speed +100 96.62M 0 96.62M 0 0 563.2M 0 0 +❯ sha256sum out.tgz +02b4a96d66121ddbb5d51fa3f22c2b929bc16d955f438421c1f1b04a1264a50f out.tgz + +``` + +and on server side: + +``` +❯ git annex version --raw; echo; git annex --debug p2phttp --port 8081 --wideopen +10.20251029 +[2026-03-17 10:12:16.496986375] (Utility.Process) process [2098931] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","git-annex"] +[2026-03-17 10:12:16.499045819] (Utility.Process) process [2098931] done ExitSuccess +[2026-03-17 10:12:16.499500423] (Utility.Process) process [2098934] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"] +[2026-03-17 10:12:16.501265293] (Utility.Process) process [2098934] done ExitSuccess +[2026-03-17 10:12:16.502428567] (Utility.Process) process [2098935] chat: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch"] +[2026-03-17 10:12:16.505874247] (Annex.Branch) read proxy.log +[2026-03-17 10:12:26.349175199] (P2P.IO) [http client] [ThreadId 22] P2P > GET 0 SHA256E-s101318091--02b4a96d66121ddbb5d51fa3f22c2b929bc16d955f438421c1f1b04a1264a50f.tgz +[2026-03-17 10:12:26.349539971] (P2P.IO) [http server] [ThreadId 19] P2P < GET 0 SHA256E-s101318091--02b4a96d66121ddbb5d51fa3f22c2b929bc16d955f438421c1f1b04a1264a50f.tgz +[2026-03-17 10:12:26.350670225] (Utility.Process) process [2099177] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","-c","filter.annex.smudge=","-c","filter.annex.clean=","-c","filter.annex.process=","write-tree"] +[2026-03-17 10:12:26.366674858] (Utility.Process) process [2099177] done ExitSuccess +[2026-03-17 10:12:26.367081645] (Utility.Process) process [2099178] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/annex/last-index"] +[2026-03-17 10:12:26.369153188] (Utility.Process) process [2099178] done ExitSuccess +[2026-03-17 10:12:26.370404636] (P2P.IO) [http server] [ThreadId 19] P2P > DATA 101318091 +[2026-03-17 10:12:26.370496387] (P2P.IO) [http client] [ThreadId 22] P2P < DATA 101318091 +[2026-03-17 10:12:26.518654447] (P2P.IO) [http client] [ThreadId 22] P2P > SUCCESS +[2026-03-17 10:12:26.518800582] (P2P.IO) [http server] [ThreadId 19] P2P < SUCCESS + +``` +</details> + + +<details> +<summary> +with the most recent version 10.20260316+git1-g768707adf4-1~ndall+1 we get silent treatment -- no content provided and nothing in the --debug log +</summary> + + +``` +❯ curl http://localhost:8081/git-annex/90d896aa-00d0-4f85-bcae-2fd1e992fcab/key/SHA256E-s101318091--02b4a96d66121ddbb5d51fa3f22c2b929bc16d955f438421c1f1b04a1264a50f.tgz >| out.tgz + % Total % Received % Xferd Average Speed Time Time Time Current + Dload Upload Total Spent Left Speed + 0 0 0 0 0 
0 0 0 0 +curl: (52) Empty reply from server +❯ curl -v http://localhost:8081/git-annex/90d896aa-00d0-4f85-bcae-2fd1e992fcab/key/SHA256E-s101318091--02b4a96d66121ddbb5d51fa3f22c2b929bc16d955f438421c1f1b04a1264a50f.tgz >| out.tgz +* Host localhost:8081 was resolved. +* IPv6: ::1 +* IPv4: 127.0.0.1 + % Total % Received % Xferd Average Speed Time Time Time Current + Dload Upload Total Spent Left Speed + 0 0 0 0 0 0 0 0 0* Trying [::1]:8081... +* connect to ::1 port 8081 from ::1 port 56576 failed: Connection refused +* Trying 127.0.0.1:8081... +* Established connection to localhost (127.0.0.1 port 8081) from 127.0.0.1 port 58866 +* using HTTP/1.x +> GET /git-annex/90d896aa-00d0-4f85-bcae-2fd1e992fcab/key/SHA256E-s101318091--02b4a96d66121ddbb5d51fa3f22c2b929bc16d955f438421c1f1b04a1264a50f.tgz HTTP/1.1 +> Host: localhost:8081 +> User-Agent: curl/8.18.0 +> Accept: */* +> +* Request completely sent off +* Recv failure: Connection reset by peer + +* closing connection #0 +curl: (56) Recv failure: Connection reset by peer +``` + +</details> + + +### What steps will reproduce the problem? + +this was on a local clone of https://datasets.datalad.org/dbic/QA/.git with `sourcedata/sub-qa64/ses-20240715/func/sub-qa64_ses-20240715_acq-faX77_bold.dicom.tgz` content + +### What version of git-annex are you using? On what operating system? + +10.20260316+git1-g768707adf4-1~ndall+1 - not ok; 10.20251029 - not ok. + +FWIW -- https://github.com/datalad/git-annex testing was not happy for awhile but I think it was due to some change in behavior affecting RIA archives (I did not have yet chance to troubleshoot manually to report) , thus unrelated here.
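The working case in the report checks the download by eye against the key name. A sketch of doing that mechanically, assuming the `SHA256E-s<size>--<sha256>.<ext>` layout seen in the key above; `key_size` and `key_sha256` are hypothetical helpers.

```shell
# Hypothetical helpers: pull the size and checksum out of a SHA256E key name.
key_size() {
    # Print the digits between "-s" and "--"; prints nothing if the key has no size field.
    printf '%s\n' "$1" | sed -n 's/^[^-]*-s\([0-9][0-9]*\)--.*$/\1/p'
}
key_sha256() {
    ks_rest=${1#*--}                # drop everything up to and including "--"
    printf '%s\n' "${ks_rest%%.*}"  # drop the extension
}

# Check a download against its key (not run here):
#   [ "$(sha256sum out.tgz | cut -d' ' -f1)" = "$(key_sha256 "$key")" ] && echo match
```

Comparing the byte count with `key_size` as well is a cheap way to notice an empty or truncated reply, as in the failing case, before hashing.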
improve docs to close bug
diff --git a/doc/bugs/git_annex_export_--fast_deletes_files_on_remote.mdwn b/doc/bugs/git_annex_export_--fast_deletes_files_on_remote.mdwn index ae34d49200..a9ad943cf8 100644 --- a/doc/bugs/git_annex_export_--fast_deletes_files_on_remote.mdwn +++ b/doc/bugs/git_annex_export_--fast_deletes_files_on_remote.mdwn @@ -45,3 +45,5 @@ local repository version: 10 [[!tag projects/ICE4]] + +> [[fixed|done]] --[[Joey]] diff --git a/doc/bugs/git_annex_export_--fast_deletes_files_on_remote/comment_9_31a1d65c06fff526c08eea4d35e5c4d5._comment b/doc/bugs/git_annex_export_--fast_deletes_files_on_remote/comment_9_31a1d65c06fff526c08eea4d35e5c4d5._comment new file mode 100644 index 0000000000..6c45a883ab --- /dev/null +++ b/doc/bugs/git_annex_export_--fast_deletes_files_on_remote/comment_9_31a1d65c06fff526c08eea4d35e5c4d5._comment @@ -0,0 +1,7 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 9""" + date="2026-03-17T13:39:50Z" + content=""" +Updated documentation. +"""]] diff --git a/doc/git-annex-export.mdwn b/doc/git-annex-export.mdwn index 0ec85668d8..c87051c07c 100644 --- a/doc/git-annex-export.mdwn +++ b/doc/git-annex-export.mdwn @@ -59,12 +59,22 @@ tell it what branch to track. For example: git config remote.myremote.annex-tracking-branch master git annex push myremote -You can combine using `git annex export` to send changes to a special +When the special remote is not also configured with `importtree=yes`, +git-annex does not try to preserve other files that may be written to the +special remote by other means. Exporting a tree of files to a special remote +will overwrite any files already stored on it with the same filenames. +In some cases, an update to what is exported that deletes a subdirectory +will delete not only the exported files that were in that subdirectory, +but any other files that might have been written to the same subdirectory +by other means. 
+ +When the special remote is also configured with `importtree=yes`, +you can combine using `git annex export` to send changes to a special remote with `git annex import` to fetch changes from a special remote. -When a file on a special remote has been modified by software other than -git-annex, exporting to it will not overwrite the modified file, and the -export will not succeed. You can resolve this conflict by using -`git annex import`. +In this case, when a file on a special remote has been modified by +software other than git-annex, exporting to it will not overwrite the +modified file, and the export will not succeed. You can resolve this +conflict by using `git annex import`. (Some types of special remotes such as S3 with versioning may instead let an export overwrite the modified file; then `git annex import`
correct branch name
diff --git a/doc/todo/Ephemeral_special_remotes/comment_9_c0fb4d7034229dd12d89c54c046f15e4._comment b/doc/todo/Ephemeral_special_remotes/comment_9_c0fb4d7034229dd12d89c54c046f15e4._comment index f4a292dd86..fa0d3f7a4d 100644 --- a/doc/todo/Ephemeral_special_remotes/comment_9_c0fb4d7034229dd12d89c54c046f15e4._comment +++ b/doc/todo/Ephemeral_special_remotes/comment_9_c0fb4d7034229dd12d89c54c046f15e4._comment @@ -3,7 +3,7 @@ subject="""comment 9""" date="2026-03-05T19:00:16Z" content=""" -Started developing this in the `ephemeral` branch. +Started developing this in the `delegate` branch. It seems to also make sense to allow DELEGATE as a response to WHEREIS.
add news item for git-annex 10.20260316
diff --git a/doc/news/version_10.20251029.mdwn b/doc/news/version_10.20251029.mdwn deleted file mode 100644 index b98b28583c..0000000000 --- a/doc/news/version_10.20251029.mdwn +++ /dev/null @@ -1,5 +0,0 @@ -git-annex 10.20251029 released with [[!toggle text="these changes"]] -[[!toggleable text=""" * Support ssh remotes with '#' and '?' in the path to the repository, - the same way git does. - * assistant: Fix reversion that caused files to be added locked by - default."""]] \ No newline at end of file diff --git a/doc/news/version_10.20260316.mdwn b/doc/news/version_10.20260316.mdwn new file mode 100644 index 0000000000..7ca6f4e414 --- /dev/null +++ b/doc/news/version_10.20260316.mdwn @@ -0,0 +1,17 @@ +git-annex 10.20260316 released with [[!toggle text="these changes"]] +[[!toggleable text=""" * Added CHECKPRESENT-URL extension to the external special remote protocol. + * Fix reversion in previous version that caused auto-initializing of + local git remotes that have annex-ignore set. + * Fix bug that caused git credential to be rejected when a http request + failed for some reason other than 401. + * Importing from the directory special remote will no longer add sizes + to keys, which overrode backends that generate unsized keys. + * Fix retrieval from http git remotes of keys with '%' in their names. + * Fix behavior when initremote is used with --sameas= + combined with --private. + * web, S3, git: Fix bugs in checking if content is present on a remote + when configuration does not allow accessing it. + * httpalso: Fix bugs in handling content not being present on the remote. + * adb: Avoid deleting contents of a non-empty directory when + removing the last exported file from the directory. + * Improve display of http exceptions."""]] \ No newline at end of file
response
diff --git a/doc/special_remotes/borg/comment_5_d6c9d91578ddcb12a5a33e313094179e._comment b/doc/special_remotes/borg/comment_5_d6c9d91578ddcb12a5a33e313094179e._comment new file mode 100644 index 0000000000..456d576eff --- /dev/null +++ b/doc/special_remotes/borg/comment_5_d6c9d91578ddcb12a5a33e313094179e._comment @@ -0,0 +1,12 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 5""" + date="2026-03-16T10:48:29Z" + content=""" +I have not looked at borg 2 in any detail, but it seems they are trying to +keep the CLI to some extent the same. So it seems it would depend on whether +CLI changes break something git-annex relies on. + +I'd treat any incompatibility as a [[bug|bugs]] or [[todo]] on the git-annex +side, so if you try it and find problems, please report them. +"""]]
Added a comment: borg 2.0
diff --git a/doc/special_remotes/borg/comment_4_2327577bd65e66d1c88b5d05de38cb5c._comment b/doc/special_remotes/borg/comment_4_2327577bd65e66d1c88b5d05de38cb5c._comment new file mode 100644 index 0000000000..40c5e76b46 --- /dev/null +++ b/doc/special_remotes/borg/comment_4_2327577bd65e66d1c88b5d05de38cb5c._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + username="nadir" + avatar="http://cdn.libravatar.org/avatar/2af9174cf6c06de802104d632dc40071" + subject="borg 2.0" + date="2026-03-15T13:27:42Z" + content=""" +With borg 2.0 on the horizon, I was wondering how support for that would look should that be planned. + +With how much has changed and the necessity to create new repos, it might make the most sense to create a separate borg2 remote, but I have no idea. Mostly just curious if there are any plans for that at all. +"""]]
update
diff --git a/doc/thanks/list b/doc/thanks/list index 0c9056062d..0e8eb59c97 100644 --- a/doc/thanks/list +++ b/doc/thanks/list @@ -129,3 +129,4 @@ Andrew Poelstra, joshingly, Melody Tolly, username, +Steffen Vogel,
update
diff --git a/doc/thanks/list b/doc/thanks/list index 0a65388abb..0c9056062d 100644 --- a/doc/thanks/list +++ b/doc/thanks/list @@ -128,3 +128,4 @@ mpol, Andrew Poelstra, joshingly, Melody Tolly, +username,
comment
diff --git a/doc/bugs/import_adds_size_to_external_backend_keys/comment_4_18d10c9f97fec86ce16776021777bd17._comment b/doc/bugs/import_adds_size_to_external_backend_keys/comment_4_18d10c9f97fec86ce16776021777bd17._comment new file mode 100644 index 0000000000..b0924ad2fe --- /dev/null +++ b/doc/bugs/import_adds_size_to_external_backend_keys/comment_4_18d10c9f97fec86ce16776021777bd17._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 4""" + date="2026-03-11T15:14:44Z" + content=""" +Turns out Remote.Directory did not need to add the size at all. I've fixed +it. +"""]]
close
diff --git a/doc/bugs/import_adds_size_to_external_backend_keys.mdwn b/doc/bugs/import_adds_size_to_external_backend_keys.mdwn index df9ed4c6f5..7574182477 100644 --- a/doc/bugs/import_adds_size_to_external_backend_keys.mdwn +++ b/doc/bugs/import_adds_size_to_external_backend_keys.mdwn @@ -85,3 +85,5 @@ index 0000000..8620ffe [[!tag projects/ICE4]] + +> [[fixed|done]] --[[Joey]]
removed
diff --git a/doc/special_remotes/bittorrent/comment_5_45b73c0d9f8b2fd2238412fb9feb3e5b._comment b/doc/special_remotes/bittorrent/comment_5_45b73c0d9f8b2fd2238412fb9feb3e5b._comment deleted file mode 100644 index 0643ca5fee..0000000000 --- a/doc/special_remotes/bittorrent/comment_5_45b73c0d9f8b2fd2238412fb9feb3e5b._comment +++ /dev/null @@ -1,26 +0,0 @@ -[[!comment format=mdwn - username="miris" - avatar="http://cdn.libravatar.org/avatar/bd975774ecba53f7454a61d50fc7d8cc" - subject="registering multi-file torrent urls for existing files" - date="2026-03-11T15:09:29Z" - content=""" -Heyo, - -Is it possible to assign multi-file torrent links to files, without needing to download the entire torrent again using Annex? - -Consider the following scenario: - -A multi-file torrent has been downloaded in the past, say, using qBittorrent... -... I then add the files to Git Annex, and would like to efficiently register torrent/magnet URLs for them... - -Annex maps files to a multi-file torrent using `#<n>` suffixes, so I've tried the following: - -1. `aria2c --show-files <torrent file>` -- to get the numbers for each file -2. `git annex addurl --file \"file.mp4\" \"magnet:...#<N>\" -- to map the file to its original torrent/magnet - -The following error occurs: - -> (downloading torrent file...) git-annex: That url contains multiple files according to the bittorrent remote; cannot add it to a single file. - -If there's any information or clarification needed, please don't hesitate to let me know :) -"""]]
Added a comment: registering multi-file torrent urls for existing files
diff --git a/doc/special_remotes/bittorrent/comment_6_ee0e1e0ed6e97dffc5840db0baf7c25c._comment b/doc/special_remotes/bittorrent/comment_6_ee0e1e0ed6e97dffc5840db0baf7c25c._comment
new file mode 100644
index 0000000000..61faf94ccd
--- /dev/null
+++ b/doc/special_remotes/bittorrent/comment_6_ee0e1e0ed6e97dffc5840db0baf7c25c._comment
@@ -0,0 +1,27 @@
+[[!comment format=mdwn
+ username="miris"
+ avatar="http://cdn.libravatar.org/avatar/bd975774ecba53f7454a61d50fc7d8cc"
+ subject="registering multi-file torrent urls for existing files"
+ date="2026-03-11T15:10:37Z"
+ content="""
+Heyo,
+
+Is it possible to assign multi-file torrent links to files, without needing to download the entire torrent again using Annex?
+
+Consider the following scenario:
+
+A multi-file torrent has been downloaded in the past, say, using qBittorrent...
+
+... I then add the files to Git Annex, and would like to efficiently register torrent/magnet URLs for them...
+
+Annex maps files to a multi-file torrent using `#<n>` suffixes, so I've tried the following:
+
+1. `aria2c --show-files <torrent file>` -- to get the numbers for each file
+2. `git annex addurl --file \"file.mp4\" \"magnet:...#<N>\"` -- to map the file to its original torrent/magnet
+
+The following error occurs:
+
+> (downloading torrent file...) git-annex: That url contains multiple files according to the bittorrent remote; cannot add it to a single file.
+
+If there's any information or clarification needed, please don't hesitate to let me know :)
+"""]]
Added a comment: registering multi-file torrent urls for existing files
diff --git a/doc/special_remotes/bittorrent/comment_5_45b73c0d9f8b2fd2238412fb9feb3e5b._comment b/doc/special_remotes/bittorrent/comment_5_45b73c0d9f8b2fd2238412fb9feb3e5b._comment
new file mode 100644
index 0000000000..0643ca5fee
--- /dev/null
+++ b/doc/special_remotes/bittorrent/comment_5_45b73c0d9f8b2fd2238412fb9feb3e5b._comment
@@ -0,0 +1,26 @@
+[[!comment format=mdwn
+ username="miris"
+ avatar="http://cdn.libravatar.org/avatar/bd975774ecba53f7454a61d50fc7d8cc"
+ subject="registering multi-file torrent urls for existing files"
+ date="2026-03-11T15:09:29Z"
+ content="""
+Heyo,
+
+Is it possible to assign multi-file torrent links to files, without needing to download the entire torrent again using Annex?
+
+Consider the following scenario:
+
+A multi-file torrent has been downloaded in the past, say, using qBittorrent...
+... I then add the files to Git Annex, and would like to efficiently register torrent/magnet URLs for them...
+
+Annex maps files to a multi-file torrent using `#<n>` suffixes, so I've tried the following:
+
+1. `aria2c --show-files <torrent file>` -- to get the numbers for each file
+2. `git annex addurl --file \"file.mp4\" \"magnet:...#<N>\" -- to map the file to its original torrent/magnet
+
+The following error occurs:
+
+> (downloading torrent file...) git-annex: That url contains multiple files according to the bittorrent remote; cannot add it to a single file.
+
+If there's any information or clarification needed, please don't hesitate to let me know :)
+"""]]
comment
diff --git a/doc/bugs/import_adds_size_to_external_backend_keys/comment_3_f6c47c2feb1d1fc1351bc21f9bc1e2d7._comment b/doc/bugs/import_adds_size_to_external_backend_keys/comment_3_f6c47c2feb1d1fc1351bc21f9bc1e2d7._comment
new file mode 100644
index 0000000000..0bc811b44b
--- /dev/null
+++ b/doc/bugs/import_adds_size_to_external_backend_keys/comment_3_f6c47c2feb1d1fc1351bc21f9bc1e2d7._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2026-03-11T14:48:31Z"
+ content="""
+    $ git annex migrate --remove-size
+    migrate data.bin
+    git-annex: failed creating link from old to new key
+
+That happens when the file is not present in the local repository.
+If you get the file first it will work.
+
+Migrate needs the content present so it can populate the new key.
+Otherwise, there can be a situation where the new key never ends up being
+populated with the content.
+"""]]
comment
diff --git a/doc/bugs/import_adds_size_to_external_backend_keys/comment_2_b5f15e7850a6c162349010157e2b55be._comment b/doc/bugs/import_adds_size_to_external_backend_keys/comment_2_b5f15e7850a6c162349010157e2b55be._comment
new file mode 100644
index 0000000000..8ec73b1954
--- /dev/null
+++ b/doc/bugs/import_adds_size_to_external_backend_keys/comment_2_b5f15e7850a6c162349010157e2b55be._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2026-03-11T14:21:42Z"
+ content="""
+This bug only happens with the directory special remote and not others.
+
+The problem is that the directory special remote implements `importKey`,
+and explicitly adds the size back!
+[[!commit 15000dee07a06d04285351616915794bd6ec7f14]] explains why
+it needs to do that.
+"""]]
rename
diff --git a/doc/todo/extrahader_config.mdwn b/doc/todo/extraheader_config.mdwn
similarity index 100%
rename from doc/todo/extrahader_config.mdwn
rename to doc/todo/extraheader_config.mdwn
todo
diff --git a/doc/todo/extrahader_config.mdwn b/doc/todo/extrahader_config.mdwn
new file mode 100644
index 0000000000..f753319af7
--- /dev/null
+++ b/doc/todo/extrahader_config.mdwn
@@ -0,0 +1,23 @@
+Support git config `http.extraHeader` and `http.<url>.extraheader`.
+
+This would particularly be useful for P2P over HTTP, where an `annex+https`
+url could be configured to send headers for http basic auth.
+
+And related to that case, it might also make sense to make the
+`remote.<name>.annexUrl` default to inheriting the extraheader
+of the `remote.<name>.url`. So that when using eg, forgejo-aneksajo,
+the user only needs to configure the auth header in one place.
+
+One concern with `http.extraHeader` is that, since git only uses that for
+git repo access, it could easily contain auth headers that the user would
+be surprised to find `git-annex addurl` using, for example. So it might
+make sense to only support the `http.<url>.extraheader` form for uses in
+git-annex.
+
+The P2P over HTTP inheriting idea above could still use
+`http.extraheader` when the annexUrl and url are on the same host.
+(As is already done when querying `git credential` in that case,
+via `isP2PHttpSameHost`.)
+--[[Joey]]
+
+[[!tag projects/INM7]]
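For context on the todo above: git itself already honors a URL-scoped `http.<url>.extraHeader` for repository access. The following is a minimal sketch of setting a basic-auth header that way. The host `https://forge.example.com/` and the credentials `user:token` are placeholders invented for illustration, and whether git-annex will read this key is exactly what the todo proposes, so this only shows the git side.

```sh
# Throwaway repository so the example does not touch any real config.
repo=$(mktemp -d)
git init -q "$repo"

# Hypothetical host and credentials, for illustration only.
auth=$(printf '%s' 'user:token' | base64)
git -C "$repo" config "http.https://forge.example.com/.extraHeader" \
    "Authorization: Basic $auth"

# Show what was stored for that URL prefix.
git -C "$repo" config --get "http.https://forge.example.com/.extraHeader"
```

Scoping the header to a URL prefix, rather than using the global `http.extraHeader`, is what keeps it from being sent to unrelated hosts, which is the concern the todo raises about `git-annex addurl`.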
Added a comment
diff --git a/doc/bugs/import_adds_size_to_external_backend_keys/comment_1_6bc5340138bfcbe312a0fec446370094._comment b/doc/bugs/import_adds_size_to_external_backend_keys/comment_1_6bc5340138bfcbe312a0fec446370094._comment
new file mode 100644
index 0000000000..1df2668084
--- /dev/null
+++ b/doc/bugs/import_adds_size_to_external_backend_keys/comment_1_6bc5340138bfcbe312a0fec446370094._comment
@@ -0,0 +1,19 @@
+[[!comment format=mdwn
+ username="matrss"
+ avatar="http://cdn.libravatar.org/avatar/cd1c0b3be1af288012e49197918395f0"
+ subject="comment 1"
+ date="2026-03-06T13:53:07Z"
+ content="""
+Unfortunately it seems like I can't even remove the size after the import:
+
+```
+$ git switch --detach importdir/import
+HEAD is now at 6d3aa43 import from importdir
+$ git annex migrate --remove-size
+migrate data.bin
+git-annex: failed creating link from old to new key
+failed
+migrate: 1 failed
+[ble: exit 1]
+```
+"""]]
diff --git a/doc/bugs/import_adds_size_to_external_backend_keys.mdwn b/doc/bugs/import_adds_size_to_external_backend_keys.mdwn new file mode 100644 index 0000000000..df9ed4c6f5 --- /dev/null +++ b/doc/bugs/import_adds_size_to_external_backend_keys.mdwn @@ -0,0 +1,87 @@ +### Please describe the problem. + +I have an external backend for grib files that deliberately does not include a size in its generated keys. It intentionally produces the same key for data that is different on a binary level, but equivalent in practice. This means I don't want to include a size, because two files with equivalent data can have a different size but generate the same key after normalization. + +Now, the problem is that `git annex import` seems to unconditionally add a size to imported keys anyway, as seen in the log below. + + +### What steps will reproduce the problem? + +1. Take the XFOO example backend from here: <https://git-annex.branchable.com/design/external_backend_protocol/git-annex-backend-XFOO> +2. Make this change to remove the size from generated keys: + + ``` + 38c38 + < echo "GENKEY-SUCCESS" "XFOO-s$sz--$hash" + --- + > echo "GENKEY-SUCCESS" "XFOO--$hash" + ``` +3. Create a repository and `git annex add --backend XFOO` a file +4. Validate that this file's key does not have a size +5. Set up a directory special remote with importtree=yes and import the same file from it +6. Observe that the imported file's key has a size set + +### What version of git-annex are you using? On what operating system? 
+ +``` +git-annex version: 10.20260213-g1b947233f21755c0c4d1f00e5a24f39d62fa3f1e +build flags: Assistant Webapp Inotify DBus DesktopNotify TorrentParser MagicMime Benchmark Feeds Testsuite S3 WebDAV Servant OsPath +dependency versions: aws-0.25.2 bloomfilter-2.0.1.3 crypton-1.0.4 DAV-1.3.4 feed-1.3.2.1 ghc-9.10.3 http-client-0.7.19 torrent-10000.1.3 uuid-1.3.16 yesod-1.6.2.1 +key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL GITBUNDLE GITMANIFEST VURL X* +remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg rclone hook external compute mask +operating system: linux x86_64 +supported repository versions: 8 9 10 +upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10 +``` + +### Please provide any additional information below. + +[[!format sh """ +# If you can, paste a complete transcript of the problem occurring here. +# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log + +$ datalad create test-import-key-size +create(ok): /home/icg149/Playground/test-import-key-size (dataset) +$ head -c 10K /dev/urandom > test-import-key-size/data.bin +$ mkdir test-import-key-size-import-dir +$ cp test-import-key-size/data.bin test-import-key-size-import-dir/ +$ cd test-import-key-size +$ git annex add --backend XFOO data.bin +add data.bin +ok +(recording state in git...) 
+$ ls -l +total 4 +lrwxrwxrwx 1 icg149 icg149 102 Mär 6 14:17 data.bin -> .git/annex/objects/jM/ZJ/XFOO--cc9401a4c19c25864b650740c215b3bd/XFOO--cc9401a4c19c25864b650740c215b3bd +$ git annex initremote importdir type=directory directory=../test-import-key-size-import-dir importtree=yes encryption=none +initremote importdir ok +(recording state in git...) +$ git annex import --no-content --backend XFOO import --from importdir +list importdir ok +import importdir data.bin +ok +update refs/remotes/importdir/import ok +(recording state in git...) +$ git show importdir/import +commit 6d3aa436a9c643cef7d6f02647b952107bc09951 (importdir/import) +Author: Matthias Riße <m.risse@fz-juelich.de> +Date: Fri Mar 6 14:18:14 2026 +0100 + + import from importdir + +diff --git a/data.bin b/data.bin +new file mode 120000 +index 0000000..8620ffe +--- /dev/null ++++ b/data.bin +@@ -0,0 +1 @@ ++.git/annex/objects/GF/8Z/XFOO-s10240--cc9401a4c19c25864b650740c215b3bd/XFOO-s10240--cc9401a4c19c25864b650740c215b3bd +\ No newline at end of file + +# End of transcript or log. +"""]] + +### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders) + + +[[!tag projects/ICE4]]
Added a comment: FTR commands to check and "fix up"
diff --git a/doc/bugs/tries_to_download_a_.mkv_video_without_yt-dlp/comment_3_f911957e4dc3c2c61cbdb04e54218705._comment b/doc/bugs/tries_to_download_a_.mkv_video_without_yt-dlp/comment_3_f911957e4dc3c2c61cbdb04e54218705._comment
new file mode 100644
index 0000000000..77c7838a67
--- /dev/null
+++ b/doc/bugs/tries_to_download_a_.mkv_video_without_yt-dlp/comment_3_f911957e4dc3c2c61cbdb04e54218705._comment
@@ -0,0 +1,91 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="FTR commands to check and "fix up""
+ date="2026-03-05T22:46:18Z"
+ content="""
+For fear of modifying files in the git-annex branch directly, here are the commands to 'check':
+
+
+```
+$> f=Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
+$> key=$(readlink \"$f\" | xargs basename); alog=$(git ls-tree -r git-annex | grep \"$key\" | awk '/.web$/{print $4;}'); git show \"git-annex:$alog\"
+1772708470s 1 https://www.youtube.com/watch?v=0fcKYGsBZxU
+```
+
+First I tried to fix via re-addurl, and we do get some difference:
+```
+$> git rm \"$f\"; git annex addurl --no-raw --file \"$f\" \"$url\"
+rm 'Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv'
+addurl https://www.youtube.com/watch?v=0fcKYGsBZxU (using yt-dlp) (to Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv) ok
+(recording state in git...)
+$> git status
+On branch master
+Your branch is up to date with 'origin/master'.
+
+Changes to be committed:
+ (use \"git restore --staged <file>...\" to unstage)
+ modified: Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
+
+$> git diff --cached
+diff --git a/Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv b/Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
+index e59a58c35..e12bb1280 120000
+--- a/Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
++++ b/Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
+@@ -1 +1 @@
+-../.git/annex/objects/KQ/x1/URL-s0--https&c%%www.youtube.com%watch,63v,610fcKYGsBZxU/URL-s0--https&c%%www.youtube.com%watch,63v,610fcKYGsBZxU
+\ No newline at end of file
++../.git/annex/objects/wq/jM/URL--yt&chttps&c%%www.youtube.com%watch,63v,610fcKYGsBZxU/URL--yt&chttps&c%%www.youtube.com%watch,63v,610fcKYGsBZxU
+\ No newline at end of file
+
+```
+
+which I did not really mind, as long as I got the file and the metadata was transferred, but it wasn't:
+
+```
+$> git commit -m 'redownloaded \"unlucky\" video for which no yt: was added' $f
+[master 379d379ea] redownloaded \"unlucky\" video for which no yt: was added
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+$> git annex metadata \"$f\"
+metadata Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
+
+ok
+```
+
+Also, when I used a recent (after 202601) version, which would auto-upgrade to use VURL keys, the difference was a switch from URL to VURL. Could you please point me to where I can read up on VURLs and their benefit for relaxed URLs?
+
+
+Then I tried the dance with unregisterurl, rmurl, addurl, which ended up with:
+
+```
+$> key=$(readlink \"$f\" | xargs basename); alog=$(git ls-tree -r git-annex | grep \"$key\" | awk '/.web$/{print $4;}'); git show \"git-annex:$alog\"
+1772750261s 0 https://www.youtube.com/watch?v=0fcKYGsBZxU
+1772750309s 1 yt:https://www.youtube.com/watch?v=0fcKYGsBZxU
+
+```
+
+and I still was not able to get the file:
+
+```
+$> git annex get \"$f\"
+get Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv (from web...)
+ Verification of content failed
+
+ Unable to access these remotes: web
+
+ No other repository is known to contain the file.
+
+ (Note that these git remotes have annex-ignore set: origin)
+failed
+get: 1 failed
+git annex get \"$f\" 8.22s user 3.63s system 112% cpu 10.505 total
+```
+
+although I think it did fetch it. I guess the failure is because of the `-s0` in the original key! So the original way with `git rm` + `addurl` was kinda legit, as it also fixed up the URL, BUT it lost the metadata for the key.
+
+Is there a quick way to copy metadata from another key? (like internally it does for the same path?)
+
+Or is there a better way to 'fix up URL/key' that you would recommend, Joey, so I could retain metadata?
+
+
+"""]]
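The `git show "git-annex:$alog"` output in the comment above has one `timestamp state url` record per line, where state `1` marks a URL as registered and `0` as removed. As a rough sketch of reading such a log, the hypothetical helper below (`live_urls` is not a git-annex command) keeps the newest record per URL and prints the URLs whose newest state is `1`. It assumes URLs contain no spaces and that "newest timestamp wins" is a fair approximation of how the log is interpreted.

```sh
# Hypothetical helper: read ".log.web"-style lines on stdin and print
# the URLs whose most recent record has state 1.
live_urls() {
    awk '{
        ts = $1 + 0                      # "1772750261s" -> 1772750261
        if (!($3 in best) || ts >= best[$3]) {
            best[$3] = ts                # remember the newest timestamp per URL
            state[$3] = $2               # and the state recorded with it
        }
    }
    END {
        for (u in state)
            if (state[u] == 1) print u
    }'
}

# Usage with the two records shown in the comment:
printf '%s\n' \
    '1772750261s 0 https://www.youtube.com/watch?v=0fcKYGsBZxU' \
    '1772750309s 1 yt:https://www.youtube.com/watch?v=0fcKYGsBZxU' | live_urls
```

In a repository this could be fed from the same pipeline the comment uses, for example `git show "git-annex:$alog" | live_urls`.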
Added a comment: FTR commands to check and "fix up"
diff --git a/doc/bugs/tries_to_download_a_.mkv_video_without_yt-dlp/comment_2_fe63fdf9a0d957c7944ad5fe241243fc._comment b/doc/bugs/tries_to_download_a_.mkv_video_without_yt-dlp/comment_2_fe63fdf9a0d957c7944ad5fe241243fc._comment
new file mode 100644
index 0000000000..8eee05f75d
--- /dev/null
+++ b/doc/bugs/tries_to_download_a_.mkv_video_without_yt-dlp/comment_2_fe63fdf9a0d957c7944ad5fe241243fc._comment
@@ -0,0 +1,91 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="FTR commands to check and "fix up""
+ date="2026-03-05T22:45:31Z"
+ content="""
+For fear of modifying files in the git-annex branch directly, here are the commands to 'check':
+
+
+```
+$> f=Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
+$> key=$(readlink \"$f\" | xargs basename); alog=$(git ls-tree -r git-annex | grep \"$key\" | awk '/.web$/{print $4;}'); git show \"git-annex:$alog\"
+1772708470s 1 https://www.youtube.com/watch?v=0fcKYGsBZxU
+```
+
+First I tried to fix via re-addurl, and we do get some difference:
+```
+$> git rm \"$f\"; git annex addurl --no-raw --file \"$f\" \"$url\"
+rm 'Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv'
+addurl https://www.youtube.com/watch?v=0fcKYGsBZxU (using yt-dlp) (to Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv) ok
+(recording state in git...)
+$> git status
+On branch master
+Your branch is up to date with 'origin/master'.
+
+Changes to be committed:
+ (use \"git restore --staged <file>...\" to unstage)
+ modified: Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
+
+$> git diff --cached
+diff --git a/Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv b/Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
+index e59a58c35..e12bb1280 120000
+--- a/Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
++++ b/Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
+@@ -1 +1 @@
+-../.git/annex/objects/KQ/x1/URL-s0--https&c%%www.youtube.com%watch,63v,610fcKYGsBZxU/URL-s0--https&c%%www.youtube.com%watch,63v,610fcKYGsBZxU
+\ No newline at end of file
++../.git/annex/objects/wq/jM/URL--yt&chttps&c%%www.youtube.com%watch,63v,610fcKYGsBZxU/URL--yt&chttps&c%%www.youtube.com%watch,63v,610fcKYGsBZxU
+\ No newline at end of file
+
+```
+
+which I did not really mind, as long as I got the file and the metadata was transferred, but it wasn't:
+
+```
+$> git commit -m 'redownloaded \"unlucky\" video for which no yt: was added' $f
+[master 379d379ea] redownloaded \"unlucky\" video for which no yt: was added
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+$> git annex metadata \"$f\"
+metadata Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv
+
+ok
+```
+
+Also, when I used a recent (after 202601) version, which would auto-upgrade to use VURL keys, the difference was a switch from URL to VURL. Could you please point me to where I can read up on VURLs and their benefit for relaxed URLs?
+
+
+Then I tried the dance with unregisterurl, rmurl, addurl, which ended up with:
+
+```
+$> key=$(readlink \"$f\" | xargs basename); alog=$(git ls-tree -r git-annex | grep \"$key\" | awk '/.web$/{print $4;}'); git show \"git-annex:$alog\"
+1772750261s 0 https://www.youtube.com/watch?v=0fcKYGsBZxU
+1772750309s 1 yt:https://www.youtube.com/watch?v=0fcKYGsBZxU
+
+```
+
+and I still was not able to get the file:
+
+```
+$> git annex get \"$f\"
+get Чат_рулетка/2026-03-05-_армянин_за_путина._Армянин_из_россии_Воевал_против_Украины.mkv (from web...)
+ Verification of content failed
+
+ Unable to access these remotes: web
+
+ No other repository is known to contain the file.
+
+ (Note that these git remotes have annex-ignore set: origin)
+failed
+get: 1 failed
+git annex get \"$f\" 8.22s user 3.63s system 112% cpu 10.505 total
+```
+
+although I think it did fetch it. I guess the failure is because of the `-s0` in the original key! So the original way with `git rm` + `addurl` was kinda legit, as it also fixed up the URL, BUT it lost the metadata for the key.
+
+Is there a quick way to copy metadata from another key? (like internally it does for the same path?)
+
+Or is there a better way to 'fix up URL/key' that you would recommend, Joey, so I could retain metadata?
+
+
+"""]]
initial report on stuck'iness
diff --git a/doc/bugs/git_annex_copy_+_git_push_get_stuck__in_parallel.mdwn b/doc/bugs/git_annex_copy_+_git_push_get_stuck__in_parallel.mdwn new file mode 100644 index 0000000000..9f301ff7f3 --- /dev/null +++ b/doc/bugs/git_annex_copy_+_git_push_get_stuck__in_parallel.mdwn @@ -0,0 +1,49 @@ +### Please describe the problem. + +I was investigating the safety of parallel workers pushing + annex copying to the same original local origin location. +It gets stuck (I have not even tried yet that "ssh" option...) and it seems potentially the counter play of 'annex copy' like + +``` +yoh 3858496 0.0 0.0 7628 3844 pts/29 S 15:25 0:00 | \_ /usr/bin/bash -c worker 3 +yoh 3861583 0.0 0.0 7984 3936 pts/29 S 15:25 0:00 | | \_ git annex copy --to origin file-3-1.txt file-3-2.txt file-3-3.txt file-3-4.txt file-3-5.txt file-common.txt +yoh 3861585 3.7 0.1 1075475816 72956 pts/29 Sl 15:25 0:05 | | \_ /usr/bin/git-annex copy --to origin file-3-1.txt file-3-2.txt file-3-3.txt file-3-4.txt file-3-5.txt file-common.txt +yoh 3861886 0.0 0.0 8116 4564 pts/29 S 15:25 0:00 | | \_ git --git-dir=.git --work-tree=. 
--literal-pathspecs cat-file --batch + +``` +(of which I have ATM multiple going on) + +and 'annex post-receive' hook (of which I have only one) running upon `git push` + +``` +yoh 3858498 0.0 0.0 7628 3764 pts/29 S 15:25 0:00 | \_ /usr/bin/bash -c worker 4 +yoh 3863122 0.0 0.0 17340 4948 pts/29 Sl 15:25 0:00 | | \_ git push origin master:br-4 +yoh 3863130 0.0 0.0 2692 1912 pts/29 S 15:25 0:00 | | \_ /bin/sh -c git-receive-pack '/home/yoh/.tmp/parallel-push-3858314/origin' git-receive-pack '/home/yoh/.tmp/parallel-push-3858314/origin' +yoh 3863134 0.0 0.0 16980 5312 pts/29 Sl 15:25 0:00 | | \_ git-receive-pack /home/yoh/.tmp/parallel-push-3858314/origin +yoh 3863172 0.0 0.0 2692 1872 pts/29 S 15:25 0:00 | | \_ /bin/sh hooks/post-receive +yoh 3863192 0.0 0.0 7984 4060 pts/29 S 15:25 0:00 | | \_ git annex post-receive +yoh 3863195 0.3 0.0 1074074572 17816 pts/29 Sl 15:25 0:00 | | \_ /usr/bin/git-annex post-receive +``` + + +### What steps will reproduce the problem? + +[here](https://www.oneukrainian.com/tmp/parallel-push.sh) is a claude-code (with use of LLMs and HI; I did not even review in detail/try yet 'ssh' part coded there) generated script (look or use at your own discretion), on execution of which as `./parallel-push.sh --max-file-size 409600 --max-files 10 40 20` it gets stuck (this is the 4th run I think, consistent stuck at different places) with + +``` +copy file-common.txt ok +copy file-12-1.txt (to origin...) (checksum...) ok +(recording state in git...) +clone-12: done (1 files) +Cloning into '/home/yoh/.tmp/parallel-push-3858314/clone-12'... +done. +remote: (recording state in git...) +remote: (recovering from race...) +To /home/yoh/.tmp/parallel-push-3858314/origin + * [new branch] master -> br-12 + + +``` + +### What version of git-annex are you using? On what operating system? 
+ +this run is with `10.20251029` but I tried bleeding edge standalone build `10.20260213+git57-gffa771e735-1~ndall+1` to the same result but process traces are more garbled so for the benefit of our both HI I pasted from the non-standalone built version
comment
diff --git a/doc/todo/Ephemeral_special_remotes/comment_9_c0fb4d7034229dd12d89c54c046f15e4._comment b/doc/todo/Ephemeral_special_remotes/comment_9_c0fb4d7034229dd12d89c54c046f15e4._comment
new file mode 100644
index 0000000000..f4a292dd86
--- /dev/null
+++ b/doc/todo/Ephemeral_special_remotes/comment_9_c0fb4d7034229dd12d89c54c046f15e4._comment
@@ -0,0 +1,19 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 9"""
+ date="2026-03-05T19:00:16Z"
+ content="""
+Started developing this in the `ephemeral` branch.
+
+It seems to also make sense to allow DELEGATE as a response to WHEREIS.
+
+I'm on the fence about delegating GETORDERED. Probably most remotes won't
+bother to respond to GETORDERED at all, and the only time it makes sense to
+delegate it is when always delegating to the same type of special remote.
+If delegating to different special remotes at different times, it
+doesn't make sense to delegate it to a single one of them.
+
+Similarly I don't think it makes sense to delegate GETINFO unless only
+delegating to a single special remote. Will probably wait to see if someone
+has a use case before supporting GETINFO, GETAVAILABILITY, CLAIMURL, etc.
+"""]]
format
diff --git a/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment b/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
index 5c8fedfa69..0443160cc2 100644
--- a/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
+++ b/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
@@ -21,4 +21,4 @@ of reinitialization.
 With ephemeral=yes, the delegate is automatically removed when the external
 special remote program shuts down (unless another one is using it.)
 With ephemeral=no, the delegate remains initialized for use next time.
-""""]]
+"""]]
remove name
diff --git a/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment b/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
index 0ed6c2b9ae..5c8fedfa69 100644
--- a/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
+++ b/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
@@ -6,7 +6,7 @@
 Add to external special remote protocol, enabled by the `DELEGATE`
 extension:
 
-    DELEGATE name type=whatever ephemeral=yes|no [params]
+    DELEGATE type=whatever ephemeral=yes|no [params]
 
 Which can be used as a response to TRANSFER, REMOVE, CHECKPRESENT,
 TRANSFEREXPORT, CHECKPRESENTEXPORT, REMOVEEXPORT, REMOVEEXPORTDIRECTORY,
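With the name dropped, a reply under this proposed (still in development) design is just the keyword followed by `key=value` settings. As a rough sketch of how an external special remote written in shell might emit such a line, the hypothetical helper `delegate_reply` below assembles it from its arguments. The `directory` type and its parameters are only example values, and nothing here is a confirmed part of the protocol beyond the line format quoted in the diff.

```sh
# Hypothetical helper: print a DELEGATE reply in the proposed format
#   DELEGATE type=<type> ephemeral=<yes|no> [params]
delegate_reply() {
    type=$1
    ephemeral=$2
    shift 2
    printf 'DELEGATE type=%s ephemeral=%s' "$type" "$ephemeral"
    # Remaining arguments are passed through as key=value parameters.
    for param in "$@"; do
        printf ' %s' "$param"
    done
    printf '\n'
}

# Example: delegate to a directory special remote, removed on shutdown.
delegate_reply directory yes directory=/tmp/store encryption=none
```

Because the reply carries the whole configuration, two replies with identical settings describe the same delegate, which is what lets later uses skip reinitialization.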
format
diff --git a/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment b/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
index 76679a1fa4..0ed6c2b9ae 100644
--- a/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
+++ b/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
@@ -21,4 +21,4 @@ of reinitialization.
 With ephemeral=yes, the delegate is automatically removed when the external
 special remote program shuts down (unless another one is using it.)
 With ephemeral=no, the delegate remains initialized for use next time.
-""]]
+""""]]
comment
diff --git a/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment b/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
new file mode 100644
index 0000000000..76679a1fa4
--- /dev/null
+++ b/doc/todo/Ephemeral_special_remotes/comment_8_e89046bfe2cd0cfff9d317367e3675f1._comment
@@ -0,0 +1,24 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""simplified design with better name"""
+ date="2026-03-05T15:30:59Z"
+ content="""
+Add to external special remote protocol, enabled by the `DELEGATE`
+extension:
+
+    DELEGATE name type=whatever ephemeral=yes|no [params]
+
+Which can be used as a response to TRANSFER, REMOVE, CHECKPRESENT,
+TRANSFEREXPORT, CHECKPRESENTEXPORT, REMOVEEXPORT, REMOVEEXPORTDIRECTORY,
+RENAMEEXPORT
+
+This initializes a delegate special remote in a private namespace, and
+uses it to perform the operation.
+
+Subsequent uses of DELEGATE with the same configuration avoid the overhead
+of reinitialization.
+
+With ephemeral=yes, the delegate is automatically removed when the external
+special remote program shuts down (unless another one is using it.)
+With ephemeral=no, the delegate remains initialized for use next time.
+""]]
rename
diff --git a/doc/todo/Ephemeral_special_remotes/comment_6_213e927f9d25acc01c9c647712c349ee._comment b/doc/todo/Ephemeral_special_remotes/comment_7_213e927f9d25acc01c9c647712c349ee._comment
similarity index 98%
rename from doc/todo/Ephemeral_special_remotes/comment_6_213e927f9d25acc01c9c647712c349ee._comment
rename to doc/todo/Ephemeral_special_remotes/comment_7_213e927f9d25acc01c9c647712c349ee._comment
index 2eab7302d7..2699c1fc66 100644
--- a/doc/todo/Ephemeral_special_remotes/comment_6_213e927f9d25acc01c9c647712c349ee._comment
+++ b/doc/todo/Ephemeral_special_remotes/comment_7_213e927f9d25acc01c9c647712c349ee._comment
@@ -1,6 +1,6 @@
 [[!comment format=mdwn
  username="joey"
- subject="""comment 6"""
+ subject="""Re: comment 6"""
  date="2026-03-05T14:44:39Z"
  content="""
 > Is a non-ephemeral aspect visible/accessible outside the context of the special remote that set it up? Would it appear as a regular special remote for a CLI user, as if they ran initremote?
comment
diff --git a/doc/todo/Ephemeral_special_remotes/comment_6_213e927f9d25acc01c9c647712c349ee._comment b/doc/todo/Ephemeral_special_remotes/comment_6_213e927f9d25acc01c9c647712c349ee._comment
new file mode 100644
index 0000000000..2eab7302d7
--- /dev/null
+++ b/doc/todo/Ephemeral_special_remotes/comment_6_213e927f9d25acc01c9c647712c349ee._comment
@@ -0,0 +1,59 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 6"""
+ date="2026-03-05T14:44:39Z"
+ content="""
+> Is a non-ephemeral aspect visible/accessible outside the context of the special remote that set it up? Would it appear as a regular special remote for a CLI user, as if they ran initremote?
+
+I think it would be best for it not to be visible to the user. Since these
+remotes can still set their own git configs though, they will necessarily
+show up in `git remote list`. (Any `git config remote.foo.bar` setting is
+enough for that.) It would be possible for git-annex to not treat them as
+valid remotes when used outside of the aspect context though.
+Easiest would be to set annex-ignore on them.
+
+It would be possible to point `GIT_CONFIG` at a different config file
+when setting up and using the ephemeral special remote. That would have
+the problem though that if the special remote looks at some user-set
+git configs, it wouldn't see them. An example that comes to mind
+that a special remote would be expected to see is the
+"credential.helper" configuration. Maybe git-annex could merge .git/config
+into the ephemeral remote's version when using it? Seems complex and
+potentially slow though.
+
+(BTW, Even ephemeral aspects will be user-visible while git-annex is running.)
+
+> At which stage would INITASPECT (have to) be used? The PREPARE stage, I guess.
+
+I think it could be used at any point.
+
+> How expensive could INITASPECT be? Would it (immediately) trigger init/prepare of the aspect-remote?
+
+As expensive as `git-annex initremote` initially, but subsequently close to
+a noop when the remote configuration includes ephemeral=no.
+
+Also, calling it repeatedly in the same session with the same configuration
+should be a noop after the first time. So you could call it immediately
+before USEASPECT.
+
+That does suggest a simplification: Rather than having a separate
+INITASPECT command:
+
+USEASPECT type=whatever ephemeral=yes|no [params]
+
+Neat, this avoids needing to name the aspect! And avoids any problem with
+the aspect name having been used before with a different config.
+
+It also means that any
+failure to initialize will necessarily make the USEASPECT response be
+an error message, so error handling takes care of itself.
+
+git-annex would still need a remote name internally; it could eg hash the
+configuration to get a name.
+
+I'm inclined to go with this simplification.
+
+> Do I understand correctly that it would be possible to set the active aspect on a per-key and per-operation basis?
+
+It's per-operation. If you want different aspects for different types of keys it would be up to you to pick between them.
+"""]]
Added a comment
diff --git a/doc/todo/Ephemeral_special_remotes/comment_6_1d9cd49500c4b896dd26095c487db142._comment b/doc/todo/Ephemeral_special_remotes/comment_6_1d9cd49500c4b896dd26095c487db142._comment new file mode 100644 index 0000000000..10c4eac6ff --- /dev/null +++ b/doc/todo/Ephemeral_special_remotes/comment_6_1d9cd49500c4b896dd26095c487db142._comment @@ -0,0 +1,22 @@ +[[!comment format=mdwn + username="mih" + avatar="http://cdn.libravatar.org/avatar/f881df265a423e4f24eff27c623148fd" + subject="comment 6" + date="2026-03-05T08:08:45Z" + content=""" +I concur with your reasoning, in particular with the observation that making this about URLs would be a mistake. I was already trying to have the \"redirect\" approach do things that it was not meant to be used for. + +Here is my understanding of the proposed design: + +I could use this to implement an \"orchestration\" special remote that, rather than implementing store and retrieve procedures, is focused on what other implementations shall be used. For this, it can rely on the full set of special remotes available on a system. It would be possible to have a single remote (using this new feature) abstract a data holding site that can be talked to via various protocols, and the specific access approach can be selected dynamically. This would, therefore, include the ability to use a redirect special remote for URL-based downloads. + +A few questions which I could not answer with confidence: + +- Is a non-ephemeral aspect visible/accessible outside the context of the special remote that set it up? Would it appear as a regular special remote for a CLI user, as if they ran `initremote`? +- At which stage would `INITASPECT` (have to) be used? The `PREPARE` stage, I guess. +- How expensive could `INITASPECT` be? Would it (immediately) trigger init/prepare of the aspect-remote? +- `INITASPECT-OK|INITASPECT-FAILURE` are responses sent by the main git-annex process to the special remote, right? 
Any implementation would need to implement some kind of error handling (try another aspect, or error also). +- Do I understand correctly that it would be possible to set the active aspect on a per-key and per-operation basis? + + +"""]]
update
diff --git a/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment b/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment index 3a186efadb..f7272e1c8f 100644 --- a/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment +++ b/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment @@ -25,7 +25,12 @@ With ephemeral=yes, the aspect is automatically removed when the external special remote program shuts down (unless another one is using it.) With ephemeral=no, the aspect remains initialized for use next time. -Note that INITASPECT will successfully do nothing if the remote already -exists with the same config. If a remote exists with that name but a -different config, it will remove the old one and init the new one. +Note that INITASPECT will successfully do nothing if the aspect already +exists with the same config. If an aspect exists with that name but a +different config, it will fail. I earlier thought it could remove the old +one and make a new one, but that risks removing an aspect that is still +in use by another process, which could result in unexpected behavior when +that aspect reads its git config or cached creds or etc. It should +be easy enough in most cases to avoid reusing the same aspect name for +two different configs. """]]
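The revised INITASPECT rule in this diff (do nothing if the aspect already exists with the same config, fail if the name is taken by a different config) can be illustrated with a small in-memory model. This is a sketch only: the class `AspectRegistry` and its method are hypothetical, and the only strings taken from the proposal are the `INITASPECT-OK` and `INITASPECT-FAILURE reason` responses.

```python
from typing import Dict, Mapping


class AspectRegistry:
    """Toy model of the proposed INITASPECT semantics (hypothetical)."""

    def __init__(self) -> None:
        # Maps aspect name -> the configuration it was initialized with.
        self._configs: Dict[str, Dict[str, str]] = {}

    def init_aspect(self, name: str, config: Mapping[str, str]) -> str:
        existing = self._configs.get(name)
        if existing is None:
            # First use of this name: initialize the aspect.
            self._configs[name] = dict(config)
            return "INITASPECT-OK"
        if existing == dict(config):
            # Same name, same config: successfully do nothing.
            return "INITASPECT-OK"
        # Same name, different config: refuse rather than replace an
        # aspect that another process may still be using.
        return "INITASPECT-FAILURE aspect name already used with a different config"
```

The failure branch reflects the reasoning given in the diff: replacing the old aspect would risk pulling it out from under another process that still reads its git config or cached credentials.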
layout
diff --git a/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment b/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment index 423e78f281..3a186efadb 100644 --- a/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment +++ b/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment @@ -2,7 +2,7 @@ username="joey" subject="""proposed design""" date="2026-03-04T16:31:28Z" - content="""" + content=""" I'm here going with the name "aspect" to refer to a sameas remote that is in a private namespace belonging to the external special remote that uses it. This name is a bit of a placeholder, but I think some name is needed,
comment
diff --git a/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment b/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment new file mode 100644 index 0000000000..423e78f281 --- /dev/null +++ b/doc/todo/Ephemeral_special_remotes/comment_5_4ec8597802b2f76839a7e5050417409d._comment @@ -0,0 +1,31 @@ +[[!comment format=mdwn + username="joey" + subject="""proposed design""" + date="2026-03-04T16:31:28Z" + content="""" +I'm here going with the name "aspect" to refer to a sameas remote that is +in a private namespace belonging to the external special remote that uses +it. This name is a bit of a placeholder, but I think some name is needed, +because it would be surprising if "INITREMOTE" did a different thing +than `git-annex initremote`. + +Add to external special remote protocol, enabled by the `REDIRECTREMOTE` +extension: + + INITASPECT name type=whatever ephemeral=yes|no [params] + INITASPECT-OK + INITASPECT-FAILURE reason + +Add response to TRANSFER, REMOVE, CHECKPRESENT, TRANSFEREXPORT, +CHECKPRESENTEXPORT, REMOVEEXPORT, REMOVEEXPORTDIRECTORY, RENAMEEXPORT: + + USEASPECT name + +With ephemeral=yes, the aspect is automatically removed when the external +special remote program shuts down (unless another one is using it.) +With ephemeral=no, the aspect remains initialized for use next time. + +Note that INITASPECT will successfully do nothing if the remote already +exists with the same config. If a remote exists with that name but a +different config, it will remove the old one and init the new one. +"""]]
formatting
diff --git a/doc/todo/Ephemeral_special_remotes/comment_4_69f32af134b36304ca01f93c62c0c9cf._comment b/doc/todo/Ephemeral_special_remotes/comment_4_69f32af134b36304ca01f93c62c0c9cf._comment index 2533a7a46d..dca6b4ccc2 100644 --- a/doc/todo/Ephemeral_special_remotes/comment_4_69f32af134b36304ca01f93c62c0c9cf._comment +++ b/doc/todo/Ephemeral_special_remotes/comment_4_69f32af134b36304ca01f93c62c0c9cf._comment @@ -24,12 +24,12 @@ automatically use that namespace. For example: - INITREMOTE blah type=blah url=whatever - INITREMOTE-OK - [...] - REDIRECT_REMOTE blah - [...] - REMOVEREMOTE blah + INITREMOTE blah type=blah url=whatever + INITREMOTE-OK + [...] + REDIRECT_REMOTE blah + [...] + REMOVEREMOTE blah That might make a remote named eg "foo-$foouuid-blah" where $foouuid is the uuid of the special remote foo that owns it. So there is no possibility
comment
diff --git a/doc/todo/Ephemeral_special_remotes/comment_4_69f32af134b36304ca01f93c62c0c9cf._comment b/doc/todo/Ephemeral_special_remotes/comment_4_69f32af134b36304ca01f93c62c0c9cf._comment new file mode 100644 index 0000000000..2533a7a46d --- /dev/null +++ b/doc/todo/Ephemeral_special_remotes/comment_4_69f32af134b36304ca01f93c62c0c9cf._comment @@ -0,0 +1,51 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 4""" + date="2026-03-04T15:47:39Z" + content=""" +Continuing my line of thought, `REDIRECT_REMOTE` would I guess be +provided with a remote name, not a uuid, since with --sameas the +remote would have the same uuid. + +While special remote "foo" could use "foo-bar", "foo-baz" etc as +the name of its not-really-ephemeral helper remotes, that is not +entirely satisfactory, since the user might have their own "foo-bar" +remote. Or the user might notice "foo-bar" exists, and start using it, +and then it would be painful if "foo" later removes it. + +And a new protocol command like `initremote` does seem to be needed, +because if a special remote runs `git-annex initremote` itself, the +git-annex process that is using the special remote won't know about +the new remote. + +If there's an `initremote`-like protocol command, the special remotes +it inits could be in a separate namespace, and `REDIRECT_REMOTE` could +automatically use that namespace. + +For example: + + INITREMOTE blah type=blah url=whatever + INITREMOTE-OK + [...] + REDIRECT_REMOTE blah + [...] + REMOVEREMOTE blah + +That might make a remote named eg "foo-$foouuid-blah" where $foouuid is +the uuid of the special remote foo that owns it. So there is no possibility +of collision. That would be in `.git/config` for the reasons I discussed +earlier. + +Depending on the type of remote, it might be cheap enough to INITREMOTE +and REMOVEREMOTE in the same session. Making it ephemeral, although with +some disk writes happening behind the scenes to update the git config etc. 
+Or, the REMOVEREMOTE could be skipped to leave it set up for the next +session. Then an `INITREMOTE` with the same settings would be optimised +to a no-op. + +That would have `git remote remove foo` leave behind the configs for +the not-so-ephemeral remotes that it set up. Not a big problem, the user +can go in and delete them or a `git-annex removeremote` could handle it, +as well as deleting `.git/annex/journal-private/remote.log`, cached creds, +etc. +"""]]
removed
diff --git a/doc/forum/support_for_git_sparse_checkout/comment_2_3eadd702526c31b73b187d2e89f2e1ab._comment b/doc/forum/support_for_git_sparse_checkout/comment_2_3eadd702526c31b73b187d2e89f2e1ab._comment deleted file mode 100644 index 215ab35502..0000000000 --- a/doc/forum/support_for_git_sparse_checkout/comment_2_3eadd702526c31b73b187d2e89f2e1ab._comment +++ /dev/null @@ -1,12 +0,0 @@ -[[!comment format=mdwn - username="yarikoptic" - avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4" - subject="Reigniting interested in this topic and linking to related efforts (BABS etc)" - date="2026-03-02T16:18:28Z" - content=""" -I just saw support for [git sparse-checkout](https://git-scm.com/docs/git-sparse-checkout) merged in [BABS](https://github.com/PennLINC/babs/pull/337) and frankly I never knew/used it before! Inspired by an enthusiastic [Meng](https://github.com/just-meng) who ~~made a strategic mistake for her PhD progress~~ [pioneered use of use of git worktrees](https://blog.datalad.org/posts/git-worktree-workflow/) in DataLad having attended Distribits 2025, I thought to check if git-annex has support for the `sparse-checkout`. - -In conjunction with `sparse-checkout` (existing already) support for worktrees in `git-annex` can make a perfect \"couple\" for an efficient [ephemeral](https://myyoda.github.io/principles-examples/stamped_principles/e/) compute where we checkout only what is really needed, e.g. following the `datalad run` input/output specifications. - -This is just a summary of the potential research/implementation since may be it even somehow magically all works already given that BABS merged the sparse-checkout support and they extensively use `git annex` already...? -"""]]
comment
diff --git a/doc/forum/support_for_git_sparse_checkout/comment_3_cba791314c59f61dd49e10b92e41637a._comment b/doc/forum/support_for_git_sparse_checkout/comment_3_cba791314c59f61dd49e10b92e41637a._comment new file mode 100644 index 0000000000..e51735c17c --- /dev/null +++ b/doc/forum/support_for_git_sparse_checkout/comment_3_cba791314c59f61dd49e10b92e41637a._comment @@ -0,0 +1,40 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 3""" + date="2026-03-03T22:50:38Z" + content=""" +In a sparse checkout, most git commands behave as if they were run in a +worktree that contains only the files in the sparse checkout, and not other +files. Since git-annex uses git commands extensively when identifying files +to work on, its commands skip over files not in the sparse checkout. + +There are exceptions, the main one seems to be `git ls-files`, which +does list files not in the sparse checkout. All commands that operate on +annexed files and that use `git ls-files` to enumerate files though feed +the files into `git cat-file --batch`, and that will say a file is not +found when it's not part of the sparse checkout. So git-annex skips those. + +The only exception I can find, and possibly the only one, is that +`git-annex add` will add files that are in a subdirectory that is not +included in the sparse checkout. (It uses `git ls-files` without `git +cat-file`.) That is a different behavior than `git +add`, which refuses to add such files (though the --sparse option +overrides and causes them to be added). + +I don't know if this `git-annex add` behavior would be a problem. +The documentation for `--sparse` says that the reason git add doesn't +default to it is because, after it adds such a file, it could get removed +from the worktree without warning. Which would make it hard to get the +file's content back if it didn't get committed first. 
+ +If that were a problem, it could be fixed by making git-annex run `git +ls-files` with the `--sparse` option, which is supposed to filter out files +not in the sparse checkout... Except that doesn't seem to work right when +I try it. Maybe a bug in git (2.51.0)? + +Anyway, my impression is that this would all need +playing with to determine if it happens to meet your needs. +Bearing in mind that sparse checkout is itself an experimental feature +(for 6+ years?) that is documented to be subject to future behavior +changes. +"""]]
Added a comment: Reigniting interested in this topic and linking to related efforts (BABS etc)
diff --git a/doc/forum/support_for_git_sparse_checkout/comment_2_3eadd702526c31b73b187d2e89f2e1ab._comment b/doc/forum/support_for_git_sparse_checkout/comment_2_3eadd702526c31b73b187d2e89f2e1ab._comment new file mode 100644 index 0000000000..215ab35502 --- /dev/null +++ b/doc/forum/support_for_git_sparse_checkout/comment_2_3eadd702526c31b73b187d2e89f2e1ab._comment @@ -0,0 +1,12 @@ +[[!comment format=mdwn + username="yarikoptic" + avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4" + subject="Reigniting interested in this topic and linking to related efforts (BABS etc)" + date="2026-03-02T16:18:28Z" + content=""" +I just saw support for [git sparse-checkout](https://git-scm.com/docs/git-sparse-checkout) merged in [BABS](https://github.com/PennLINC/babs/pull/337) and frankly I never knew/used it before! Inspired by an enthusiastic [Meng](https://github.com/just-meng) who ~~made a strategic mistake for her PhD progress~~ [pioneered use of use of git worktrees](https://blog.datalad.org/posts/git-worktree-workflow/) in DataLad having attended Distribits 2025, I thought to check if git-annex has support for the `sparse-checkout`. + +In conjunction with `sparse-checkout` (existing already) support for worktrees in `git-annex` can make a perfect \"couple\" for an efficient [ephemeral](https://myyoda.github.io/principles-examples/stamped_principles/e/) compute where we checkout only what is really needed, e.g. following the `datalad run` input/output specifications. + +This is just a summary of the potential research/implementation since may be it even somehow magically all works already given that BABS merged the sparse-checkout support and they extensively use `git annex` already...? +"""]]
comment
diff --git a/doc/bugs/git_annex_export_--fast_deletes_files_on_remote/comment_8_ed8bec2b8b941fd816cfb17704f5290f._comment b/doc/bugs/git_annex_export_--fast_deletes_files_on_remote/comment_8_ed8bec2b8b941fd816cfb17704f5290f._comment new file mode 100644 index 0000000000..2df3b55051 --- /dev/null +++ b/doc/bugs/git_annex_export_--fast_deletes_files_on_remote/comment_8_ed8bec2b8b941fd816cfb17704f5290f._comment @@ -0,0 +1,13 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 8""" + date="2026-03-02T16:02:51Z" + content=""" +Ok, I've tagged the todos about import support from rsync, and hopefully +that will be able to get implemented. + +As for this bug, it seems that at least documentation improvements are +needed in order to close it. I have also fixed the adb special remote to +avoid the behavior, which leaves webdav and any external special remotes +that might have the behavior. +"""]]
Added a comment: Reigniting interested in this topic and linking to related efforts (BABS etc)
diff --git a/doc/forum/support_for_git_sparse_checkout/comment_1_d65f3931efc3bdc101d12594694db4f4._comment b/doc/forum/support_for_git_sparse_checkout/comment_1_d65f3931efc3bdc101d12594694db4f4._comment new file mode 100644 index 0000000000..2388998b37 --- /dev/null +++ b/doc/forum/support_for_git_sparse_checkout/comment_1_d65f3931efc3bdc101d12594694db4f4._comment @@ -0,0 +1,12 @@ +[[!comment format=mdwn + username="yarikoptic" + avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4" + subject="Reigniting interested in this topic and linking to related efforts (BABS etc)" + date="2026-03-02T16:17:46Z" + content=""" +I just saw support for [git sparse-checkout](https://git-scm.com/docs/git-sparse-checkout) merged in [BABS](https://github.com/PennLINC/babs/pull/337) and frankly I never knew/used it before! Inspired by an enthusiastic [Meng](https://github.com/just-meng) who ~~made a strategic mistake for her PhD progress~~ [pioneered use of use of git worktrees](https://blog.datalad.org/posts/git-worktree-workflow/) in DataLad having attended Distribits 2025, I thought to check if git-annex has support for the `sparse-checkout`. + +In conjunction with `sparse-checkout` (existing already) support for worktrees in `git-annex` can make a perfect \"couple\" for an efficient [ephemeral](https://myyoda.github.io/principles-examples/stamped_principles/e/) compute where we checkout only what is really needed, e.g. following the `datalad run` input/output specifications. + +This is just a summary of the potential research/implementation since may be it even somehow magically all works already given that BABS merged the sparse-checkout support and they extensively use `git annex` already...? +"""]]
adb: Avoid deleting contents of a non-empty directory when removing the last exported file from the directory
Same as was done for rsync in commit e1ce4a530bbcd7ded1bd67613c15f9f75005336d
toybox rmdir supports --ignore-fail-on-non-empty, as does busybox rmdir,
so I assume this will work broadly across current android.
diff --git a/CHANGELOG b/CHANGELOG
index 2fb70abd8b..951afee0c9 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -10,6 +10,8 @@ git-annex (10.20260214) UNRELEASED; urgency=medium
* Improve display of http exceptions.
* Fix reversion in previous version that caused auto-initializing of
local git remotes that have annex-ignore set.
+ * adb: Avoid deleting contents of a non-empty directory when
+ removing the last exported file from the directory.
-- Joey Hess <id@joeyh.name> Mon, 16 Feb 2026 13:38:21 -0400
diff --git a/Remote/Adb.hs b/Remote/Adb.hs
index 745e4dc207..1352296057 100644
--- a/Remote/Adb.hs
+++ b/Remote/Adb.hs
@@ -274,7 +274,7 @@ removeExportDirectoryM serial abase dir =
unlessM go $
giveup "adb failed"
where
- go = adbShellBool serial [Param "rm", Param "-rf", File (fromAndroidPath adir)]
+ go = adbShellBool serial [Param "rmdir", Param "--ignore-fail-on-non-empty", File (fromAndroidPath adir)]
adir = androidExportLocation abase (mkExportLocation (fromExportDirectory dir))
checkPresentExportM :: AndroidSerial -> AndroidPath -> Key -> ExportLocation -> Annex Bool
diff --git a/Types/Remote.hs b/Types/Remote.hs
index 87a4f1009e..e83395a3c2 100644
--- a/Types/Remote.hs
+++ b/Types/Remote.hs
@@ -288,7 +288,7 @@ data ExportActions a = ExportActions
, removeExport :: Key -> ExportLocation -> a ()
-- Removes an exported directory. Typically the directory will be
-- empty, but it could possibly contain files or other directories,
- -- and it's ok to delete those (but not required to).
+ -- and it's ok to delete those (but better to avoid doing so).
-- If the remote does not use directories, or automatically cleans
-- up empty directories, this can be Nothing.
--
diff --git a/doc/design/external_special_remote_protocol/export_and_import_appendix.mdwn b/doc/design/external_special_remote_protocol/export_and_import_appendix.mdwn
index 0bb70b7e60..f89256798c 100644
--- a/doc/design/external_special_remote_protocol/export_and_import_appendix.mdwn
+++ b/doc/design/external_special_remote_protocol/export_and_import_appendix.mdwn
@@ -91,8 +91,8 @@ a request, it can reply with `UNSUPPORTED-REQUEST`.
The directory will be in the form of a relative path, and may contain path
separators, whitespace, and other special characters.
Typically the directory will be empty, but it could possibly contain
- files or other directories, and it's ok to remove those, but not required
- to do so.
+ files or other directories, and it's ok to remove those, but better to
+ avoid doing so.
* `REMOVEEXPORTDIRECTORY-SUCCESS`
Indicates that a `REMOVEEXPORTDIRECTORY` was done successfully.
* `REMOVEEXPORTDIRECTORY-FAILURE`
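The behavior this change moves toward, removing an exported directory only when it is empty and otherwise leaving its contents alone, can be sketched for a local filesystem as follows. This is an illustration under assumptions, not the adb remote's code: the function name `remove_export_directory` is made up, and the sketch treats "directory still has contents" as a non-error, similar in spirit to `rmdir --ignore-fail-on-non-empty`.

```python
import errno
import os


def remove_export_directory(path: str) -> bool:
    """Remove path only if it is an empty directory; never delete contents.

    Returns True when the directory is gone afterwards, and False when it
    was left in place because it still holds files or subdirectories.
    """
    try:
        # os.rmdir refuses to remove a directory that is not empty.
        os.rmdir(path)
        return True
    except FileNotFoundError:
        # Already gone, which is the state the caller wanted.
        return True
    except OSError as e:
        # ENOTEMPTY is the usual errno; some systems report EEXIST instead.
        if e.errno in (errno.ENOTEMPTY, errno.EEXIST):
            return False
        raise
```

An external special remote that handles `REMOVEEXPORTDIRECTORY` in this style would presumably still reply with success when the directory is left in place, since keeping unrelated files is the preferred outcome rather than a failure.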
tag ice4 based on https://git-annex.branchable.com/bugs/git_annex_export_--fast_deletes_files_on_remote/#comment-5b577e1037b84b82c6b93a100fd5153f
diff --git a/doc/todo/import_tree_from_rsync_special_remote.mdwn b/doc/todo/import_tree_from_rsync_special_remote.mdwn index d2238d79d3..c31dd0b199 100644 --- a/doc/todo/import_tree_from_rsync_special_remote.mdwn +++ b/doc/todo/import_tree_from_rsync_special_remote.mdwn @@ -37,4 +37,10 @@ importtree, but there are several roadblocks: So, it seems that, importtree would need to be able to run commands other than rsync on the server. --[[Joey]] +Or, implement [[todo/importtree_only_remote]] and make rsync special +remotes support either importtree or exporttree, but not both, which +avoids that problem. --[[Joey]] + [[!tag needsthought]] + +[[!tag projects/ICE4]] diff --git a/doc/todo/importtree_only_remotes.mdwn b/doc/todo/importtree_only_remotes.mdwn index 2f9174b670..1728406876 100644 --- a/doc/todo/importtree_only_remotes.mdwn +++ b/doc/todo/importtree_only_remotes.mdwn @@ -78,4 +78,4 @@ Or by complicating Remote.Helper.ExportImport further.. --[[Joey]] [[!tag confirmed]] -[[!tag projects/dandi/potential]] +[[!tag projects/ICE4]]
comment
diff --git a/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__/comment_3_cb1762046ebf95beb7c3555567fd2af5._comment b/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__/comment_3_cb1762046ebf95beb7c3555567fd2af5._comment new file mode 100644 index 0000000000..5927ee0daf --- /dev/null +++ b/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__/comment_3_cb1762046ebf95beb7c3555567fd2af5._comment @@ -0,0 +1,15 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 3""" + date="2026-03-02T15:45:30Z" + content=""" +git-annex allows you to have any number of remotes pointing at the same git +repository. It is able to tell it's the same git repository, so you don't +need anything like sameas in this case. + +All you need is a git remote with an url pointing at the network local +host, and set the `<remote>.annex-cost` of that one lower than the other +remote. And git-annex will try it first. + +Of course a dynamic ssh config is a fine way to do it too.. +"""]]
comment
diff --git a/doc/bugs/macos_switch_to_openrsync_seems_to_break_sync/comment_1_ef46a9e4de73ecf9ac1932212851f92a._comment b/doc/bugs/macos_switch_to_openrsync_seems_to_break_sync/comment_1_ef46a9e4de73ecf9ac1932212851f92a._comment new file mode 100644 index 0000000000..874ea7439d --- /dev/null +++ b/doc/bugs/macos_switch_to_openrsync_seems_to_break_sync/comment_1_ef46a9e4de73ecf9ac1932212851f92a._comment @@ -0,0 +1,18 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 1""" + date="2026-03-02T15:35:22Z" + content=""" +I'm not sure how it could be a bug in git-annex that it uses +any and all rsync options it might choose to use. + +In any case, "rsync error: unexpected end of file" kind of looks like +openrsync is having difficulty communicating with the remote rsync server, +and not like an un-implemented option. + +Searching the web for that error message finds plenty of other openrsync +users, that are not using git-annex, and have similar problems with it. + +I think my suggestion has to be to install the real rsync somewhere in PATH +before openrsync, so git-annex will use it. +"""]]
response
diff --git a/doc/forum/unrelated_history/comment_1_e929a305503c5fa8af5e9d4f399e4015._comment b/doc/forum/unrelated_history/comment_1_e929a305503c5fa8af5e9d4f399e4015._comment new file mode 100644 index 0000000000..ef1605a07b --- /dev/null +++ b/doc/forum/unrelated_history/comment_1_e929a305503c5fa8af5e9d4f399e4015._comment @@ -0,0 +1,16 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 1""" + date="2026-03-02T15:28:01Z" + content=""" +The git-annex branch is automatically merged by git-annex, +it doesn't matter if it has unrelated histories or not. The merge will +always succeed, without conflicts. + +All you need to do is pull the git-annex branch from a remote, and run +`git-annex merge`. + +If for some reason you need to manually merge the git-annex branches, +yes all it takes is a simple union merge where on conflict you concatenate +both versions of the conflicted file together. +"""]]
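The manual union merge described in the comment above, where both versions of a conflicted file are concatenated, can be sketched as below. This is a minimal illustration, not git-annex's own merge code: the function name `union_merge` is hypothetical, and dropping exact duplicate lines is an added assumption that goes slightly beyond plain concatenation.

```python
def union_merge(ours: str, theirs: str) -> str:
    """Union-merge two versions of a line-oriented log file.

    Keeps every line from both sides, in first-seen order. Removing
    exact duplicate lines is an assumption of this sketch; plain
    concatenation would keep them.
    """
    seen = set()
    merged = []
    for line in ours.splitlines() + theirs.splitlines():
        if line not in seen:
            seen.add(line)
            merged.append(line)
    # Re-terminate every line so the result is a well-formed text file.
    return "".join(line + "\n" for line in merged)
```

For example, merging `"a\nb\n"` with `"b\nc\n"` gives `"a\nb\nc\n"`. In the normal case none of this is needed, since, as the comment says, `git-annex merge` handles the git-annex branch automatically.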
fix auto-initializing reversion
commit f79f7d322bc0278e4edba13c3b57093753fded6c caused auto-initializing
in local git remotes that have annex-ignore set.
Not only did tryGitConfigRead do auto-initialization, but
git-annex sync's use of onLocalRepo did too. And possibly in other cases
too. Now fixed comprehensively.
diff --git a/CHANGELOG b/CHANGELOG
index ecef14dd3c..2fb70abd8b 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -8,7 +8,8 @@ git-annex (10.20260214) UNRELEASED; urgency=medium
* web, S3, git: Fix bugs in checking if content is present on a remote
when configuration does not allow accessing it.
* Improve display of http exceptions.
- *
+ * Fix reversion in previous version that caused auto-initializing of
+ local git remotes that have annex-ignore set.
-- Joey Hess <id@joeyh.name> Mon, 16 Feb 2026 13:38:21 -0400
diff --git a/Command/Sync.hs b/Command/Sync.hs
index e1e9c146f0..5fed2027bf 100644
--- a/Command/Sync.hs
+++ b/Command/Sync.hs
@@ -694,7 +694,7 @@ pushRemote o remote (Just branch, _) = do
, return True
)
where
- needemulation = Remote.Git.onLocalRepo repo $
+ needemulation = Remote.Git.onLocalRepo remote repo $
(annexCrippledFileSystem <$> Annex.getGitConfig)
<&&>
needUpdateInsteadEmulation
diff --git a/Remote/Git.hs b/Remote/Git.hs
index 8e2f6a7443..9d810ff5f7 100644
--- a/Remote/Git.hs
+++ b/Remote/Git.hs
@@ -191,7 +191,8 @@ configRead autoinit r = do
annexignore <- liftIO $ getDynamicConfig (remoteAnnexIgnore gc)
case (repoCheap r, annexignore, hasuuid) of
(True, _, _)
- | remoteAnnexCheckUUID gc -> tryGitConfigRead gc autoinit r hasuuid
+ | remoteAnnexCheckUUID gc ->
+ tryGitConfigRead gc (not annexignore && autoinit) r hasuuid
| otherwise -> return r
(_, True, _)
| remoteAnnexIgnoreAuto gc ->
@@ -413,9 +414,12 @@ tryGitConfigRead gc autoinit r hasuuid
": " ++ show e
Annex.getState Annex.repo
let r' = r { Git.repoPathSpecifiedExplicitly = True }
- s <- newLocal r'
- liftIO $ Annex.eval s $ check
- `finally` quiesce True
+ if autoinit
+ then do
+ s <- newLocal r'
+ liftIO $ Annex.eval s $ check
+ `finally` quiesce True
+ else liftIO $ Git.Config.read r'
failedreadlocalconfig = do
unless hasuuid $ case Git.remoteName r of
@@ -758,12 +762,15 @@ repairRemote r a = return $ do
ensureInitialized noop (pure [])
a `finally` quiesce True
-data LocalRemoteAnnex = LocalRemoteAnnex Git.Repo (MVar [(Annex.AnnexState, Annex.AnnexRead)])
+data LocalRemoteAnnex = LocalRemoteAnnex Git.Repo Bool (MVar [(Annex.AnnexState, Annex.AnnexRead)])
{- This can safely be called on a Repo that is not local, but of course
- onLocal will not work if used with the result. -}
-mkLocalRemoteAnnex :: Git.Repo -> Annex (LocalRemoteAnnex)
-mkLocalRemoteAnnex repo = LocalRemoteAnnex repo <$> liftIO (newMVar [])
+mkLocalRemoteAnnex :: Git.Repo -> RemoteGitConfig -> Annex (LocalRemoteAnnex)
+mkLocalRemoteAnnex repo gc =
+ LocalRemoteAnnex repo
+ <$> liftIO (getDynamicConfig (remoteAnnexIgnore gc))
+ <*> liftIO (newMVar [])
{- Runs an action from the perspective of a local remote.
-
@@ -777,9 +784,9 @@ mkLocalRemoteAnnex repo = LocalRemoteAnnex repo <$> liftIO (newMVar [])
onLocal :: State -> Annex a -> Annex a
onLocal (State _ _ _ _ _ lra) = onLocal' lra
-onLocalRepo :: Git.Repo -> Annex a -> Annex a
-onLocalRepo repo a = do
- lra <- mkLocalRemoteAnnex repo
+onLocalRepo :: Remote -> Git.Repo -> Annex a -> Annex a
+onLocalRepo r repo a = do
+ lra <- mkLocalRemoteAnnex repo (gitconfig r)
onLocal' lra a
newLocal :: Git.Repo -> Annex (Annex.AnnexState, Annex.AnnexRead)
@@ -795,15 +802,17 @@ newLocal repo = do
})
onLocal' :: LocalRemoteAnnex -> Annex a -> Annex a
-onLocal' (LocalRemoteAnnex repo mv) a = liftIO (takeMVar mv) >>= \case
+onLocal' (LocalRemoteAnnex repo annexignore mv) a = liftIO (takeMVar mv) >>= \case
[] -> do
liftIO $ putMVar mv []
v <- newLocal repo
- go (v, ensureInitialized noop (pure []) >> a)
+ go (v, initialized >> a)
(v:rest) -> do
liftIO $ putMVar mv rest
go (v, a)
where
+ initialized = unless annexignore $
+ ensureInitialized noop (pure [])
go ((st, rd), a') = do
curro <- Annex.getState Annex.output
let act = Annex.run (st { Annex.output = curro }, rd) $
@@ -902,7 +911,7 @@ mkState r u gc = do
pool <- Ssh.mkP2PShellConnectionPool
copycowtried <- liftIO newCopyCoWTried
fastcopy <- getFastCopy gc
- lra <- mkLocalRemoteAnnex r
+ lra <- mkLocalRemoteAnnex r gc
(duc, getrepo) <- go
return $ State pool duc copycowtried fastcopy getrepo lra
where
diff --git a/doc/bugs/annex-ignore_check_is_skipped_for_local_remotes.mdwn b/doc/bugs/annex-ignore_check_is_skipped_for_local_remotes.mdwn
index c7b4683be0..a34c828ab7 100644
--- a/doc/bugs/annex-ignore_check_is_skipped_for_local_remotes.mdwn
+++ b/doc/bugs/annex-ignore_check_is_skipped_for_local_remotes.mdwn
@@ -107,3 +107,5 @@ fi
[[!meta author=yoh]]
[[!tag projects/forgejo]]
+
+> [[fixed|done]] --[[Joey]]
comment
diff --git a/CHANGELOG b/CHANGELOG
index 346e077c7a..ecef14dd3c 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -8,6 +8,7 @@ git-annex (10.20260214) UNRELEASED; urgency=medium
* web, S3, git: Fix bugs in checking if content is present on a remote
when configuration does not allow accessing it.
* Improve display of http exceptions.
+ *
-- Joey Hess <id@joeyh.name> Mon, 16 Feb 2026 13:38:21 -0400
diff --git a/doc/bugs/annex-ignore_check_is_skipped_for_local_remotes/comment_1_e422a8cb9dedf13de845a5cdf3b9a14c._comment b/doc/bugs/annex-ignore_check_is_skipped_for_local_remotes/comment_1_e422a8cb9dedf13de845a5cdf3b9a14c._comment
new file mode 100644
index 0000000000..3c86ad17c6
--- /dev/null
+++ b/doc/bugs/annex-ignore_check_is_skipped_for_local_remotes/comment_1_e422a8cb9dedf13de845a5cdf3b9a14c._comment
@@ -0,0 +1,20 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2026-03-02T14:04:38Z"
+ content="""
+The LLM generated text is incorrect. It is conflating two different commits
+which both modified the same code.
+
+[[!commit 87e0b77a0435522dce7be8ebec77a1326f2ede20]] was the push-to-create
+commit, and it only involves cases after the repoCheap case is handled.
+
+In the meantime, we have [[!commit f79f7d322bc0278e4edba13c3b57093753fded6c]]
+which explicitly involves annex-ignore and local git remotes and is
+intentionally making the config be read.
+
+I have to say that, once again, I find this kind of LLM-generated text
+counterproductive. My work on git-annex is necessarily detail-oriented, and
+needing to deal with something that subtly gets details wrong by
+construction is not helpful.
+"""]]
diff --git a/doc/forum/unrelated_history.mdwn b/doc/forum/unrelated_history.mdwn
index b0bb693d0b..11e9ddeaad 100644
--- a/doc/forum/unrelated_history.mdwn
+++ b/doc/forum/unrelated_history.mdwn
@@ -2,7 +2,8 @@ Hi Joey,
 
 We have a particular case where we need to merge datalad repos which were created separately.
 I was thinking to simply merge the git-annex branches with `--allow-unrelated-histories` and initial tests show limited conflicts that can be easily fixed by using an union merge driver.
-Conflicts mainly arise in
+Conflicts mainly arise in:
+
 - preferred-content.log
 - remote.log
 - uuid.log
diff --git a/doc/forum/unrelated_history.mdwn b/doc/forum/unrelated_history.mdwn
new file mode 100644
index 0000000000..b0bb693d0b
--- /dev/null
+++ b/doc/forum/unrelated_history.mdwn
@@ -0,0 +1,13 @@
+Hi Joey,
+
+We have a particular case where we need to merge datalad repos which were created separately.
+I was thinking to simply merge the git-annex branches with `--allow-unrelated-histories` and initial tests show limited conflicts that can be easily fixed by using an union merge driver.
+Conflicts mainly arise in
+- preferred-content.log
+- remote.log
+- uuid.log
+- very few <key>.tsv and <key>.tsv.log for file that were created as identical of both sides.
+
+Do you see any issues doing that?
+
+Thanks
Added a comment
diff --git a/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__/comment_2_c748bbd8d0c21624e22da873b5d9a9e9._comment b/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__/comment_2_c748bbd8d0c21624e22da873b5d9a9e9._comment
new file mode 100644
index 0000000000..e6cd91cdbe
--- /dev/null
+++ b/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__/comment_2_c748bbd8d0c21624e22da873b5d9a9e9._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="xentac"
+ avatar="http://cdn.libravatar.org/avatar/773b6c7b0dc34f10b66aa46d2730a5b3"
+ subject="comment 2"
+ date="2026-02-24T20:18:01Z"
+ content="""
+I may have actually come up with a solution. Instead of creating a second remote, I was able to make my `~/.ssh/config` dynamic based on the results of a `dig` command: https://fmartingr.com/blog/2022/08/12/using-ssh-config-match-to-connect-to-a-host-using-multiple-ip-or-hostnames/
+
+Thanks for going with me on this journey!
+"""]]
Added a comment
diff --git a/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__/comment_1_c00027a3e12fae961ff0797891504101._comment b/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__/comment_1_c00027a3e12fae961ff0797891504101._comment
new file mode 100644
index 0000000000..7b49d3b7c7
--- /dev/null
+++ b/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__/comment_1_c00027a3e12fae961ff0797891504101._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="xentac"
+ avatar="http://cdn.libravatar.org/avatar/773b6c7b0dc34f10b66aa46d2730a5b3"
+ subject="comment 1"
+ date="2026-02-24T19:54:47Z"
+ content="""
+Trying to set up and test this, I just realized these aren't special remotes. They're proper git (git-annex) remotes on my nas. I'm trying to figure out what setting I need to change to mark them as `sameas` and if cost will work in that case as well.
+"""]]
diff --git a/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__.mdwn b/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__.mdwn
new file mode 100644
index 0000000000..623533c7c0
--- /dev/null
+++ b/doc/forum/Cheap_and_costly_special_remote_using_sameas__63__.mdwn
@@ -0,0 +1,3 @@
+I'm using tailscale for a lot of my infrastructure, even to ssh to my nas from any internet connection. This means that tailscale encrypts my traffic and ssh also encrypts the traffic.
+
+This can be fine for a lot of things, but when I'm using git-annex to `git annex sync --content` it would be nice if I could use the host directly on my network instead of using tailscale's wireguard. I think I can do this by creating two separate special remotes and associating them using `sameas`. The part I don't fully grasp is how I tell git-annex that it should first prefer the network local host before trying the tailscale global host. Is `<remote>.annex-cost` enough to do that?
initial report on a regression
diff --git a/doc/bugs/annex-ignore_check_is_skipped_for_local_remotes.mdwn b/doc/bugs/annex-ignore_check_is_skipped_for_local_remotes.mdwn new file mode 100644 index 0000000000..c7b4683be0 --- /dev/null +++ b/doc/bugs/annex-ignore_check_is_skipped_for_local_remotes.mdwn @@ -0,0 +1,109 @@ +### Please describe the problem. + +we started to get a test test_ria_postclone_noannex to fail, claude bisected to + + This is a git-annex regression in 10.20260213. The configRead function in Remote/Git.hs was reordered to support a "Push to Create" feature: + +``` + Before (10.20250630) — annex-ignore checked first: + (_, True, _) -> return r -- annex-ignore → bail out immediately + (True, _, _) | remoteAnnexCheckUUID gc -> tryGitConfigRead ... + + After (10.20260213) — local repos checked first, bypassing annex-ignore: + (True, _, _) | remoteAnnexCheckUUID gc -> tryGitConfigRead ... -- local repo → auto-init! + (_, True, _) | remoteAnnexIgnoreAuto gc -> checkpushedtocreate gc + + For local remotes, tryGitConfigRead → readlocalannexconfig → autoInitialize recreates the annex/ directory, even though annex-ignore=true is set. The annex-ignore case is never reached due to Haskell's top-to-bottom pattern matching. +``` + +which indeed sounds correct as pushToCreate feature still should not touch remove which are already known to be annex-ignore'd I think. If needed -- flag should be cleared first + +<details> +<summary>reproducer it created which passes on 10.20251029-1 and fails with 10.20260115+git119-g43a3f3aaf2-1~ndall+1 (might have been my patched version, so subtract few commits back)</summary> + +```shell +#!/bin/bash +# +# Reproducer for git-annex regression: annex-ignore not respected for local remotes +# +# In git-annex <= 10.20250630, configRead in Remote/Git.hs checked annex-ignore +# BEFORE repoCheap (local), so local remotes with annex-ignore=true were skipped. +# +# In git-annex >= 10.20260213, the case ordering was swapped for "Push to Create" +# support. 
Now repoCheap is matched first, causing tryGitConfigRead -> +# readlocalannexconfig -> autoInitialize to run even when annex-ignore=true. +# +# Expected: annex/ directory is NOT created on the bare remote +# Actual (>= 10.20260213): annex/ directory IS created + +set -eu + +echo "=== git-annex annex-ignore regression reproducer ===" +echo "git-annex version: $(git annex version --raw 2>/dev/null || git annex version | head -1)" +echo + +WORKDIR=$(mktemp -d) +trap "chmod -R u+w '$WORKDIR' 2>/dev/null; rm -rf '$WORKDIR'" EXIT + +ORIGIN="$WORKDIR/origin" +BARE="$WORKDIR/bare.git" +CLONE="$WORKDIR/clone" + +# 1. Create an annex repo with some content +echo "--- Step 1: Create origin repo with annexed content" +git init "$ORIGIN" +cd "$ORIGIN" +git annex init "origin" +echo "hello" > file.txt +git annex add file.txt +git commit -m "add file" +echo + +# 2. Create bare repo, push git-annex branch but remove annex/ and annex UUID. +# This simulates a RIA store where the annex objects dir was removed — +# the bare repo has a git-annex branch (metadata) but no local annex. +echo "--- Step 2: Create bare repo, push, then strip local annex state" +git clone --bare "$ORIGIN" "$BARE" +cd "$ORIGIN" +git remote add bare "$BARE" +git push bare --all +git annex copy --to bare file.txt +git annex sync --content bare 2>&1 | tail -5 +# Now strip the annex/ directory and annex.uuid from the bare repo, +# simulating a store that was never locally annex-initialized +chmod -R u+w "$BARE/annex" +rm -rf "$BARE/annex" +git -C "$BARE" config --unset annex.uuid || true +git -C "$BARE" config --unset annex.version || true +echo +echo "annex/ exists in bare after stripping: $(test -d "$BARE/annex" && echo YES || echo NO)" +echo "annex.uuid in bare: $(git -C "$BARE" config annex.uuid 2>/dev/null || echo '<unset>')" +echo + +# 3. 
Clone from the bare repo, set annex-ignore BEFORE git-annex init +echo "--- Step 3: Clone from bare, set annex-ignore=true, then git-annex init" +git clone "$BARE" "$CLONE" +cd "$CLONE" +git config remote.origin.annex-ignore true +echo "remote.origin.annex-ignore = $(git config remote.origin.annex-ignore)" +git annex init "clone" +echo + +# 4. Check: was annex/ recreated on the bare repo? +echo "--- Result" +if test -d "$BARE/annex"; then + echo "FAIL: annex/ was recreated on the bare remote despite annex-ignore=true" + echo " This is the git-annex regression." + echo " annex.uuid in bare is now: $(git -C "$BARE" config annex.uuid 2>/dev/null || echo '<unset>')" + exit 1 +else + echo "OK: annex/ was NOT recreated. annex-ignore is respected." + exit 0 +fi + +``` +</details> + + +[[!meta author=yoh]] +[[!tag projects/forgejo]]
comment
diff --git a/doc/todo/Ephemeral_special_remotes/comment_3_5c222cb37669b5f0168579e7e642ef70._comment b/doc/todo/Ephemeral_special_remotes/comment_3_5c222cb37669b5f0168579e7e642ef70._comment new file mode 100644 index 0000000000..f055601ce0 --- /dev/null +++ b/doc/todo/Ephemeral_special_remotes/comment_3_5c222cb37669b5f0168579e7e642ef70._comment @@ -0,0 +1,36 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 3""" + date="2026-02-19T18:10:08Z" + content=""" +After a bug fix, it's now possible to make a sameas remote that +is private to the local repository. + + git-annex initremote bar --sameas=foo --private type=... + +While not ephemeral as such, if you `git remote remove bar`, +the only trace left of it will probably +be in `.git/annex/journal-private/remote.log`, and possibly +any creds that got cached for it. +It would be possible to have a command that removes the remote, and also +clears that. + +If that is close enough to ephemeral, then we could think about the +second part, extending the external special remote protocol with +REDIRECT-REMOTE. + +That is similar to [[todo/Special_remote_redirect_to_URL]]. +And a few comments over there go in a similar direction. +In particular, the discussion of CLAIMURL. If TRANSFER-RETRIEVE-URL +and TRANSFER-CHECKPRESENT-URL supported CLAIMURL, then if the ephermeral +special remote had some type of url, that it claimed, those could be used +rather than REDIRECT-REMOTE. + +That would not cover TRANSFER STORE and REMOVE though. And it probably +doesn't make sense to extend those to urls generally. (There are too many +ways to store to an url or remove an url, everything isn't WebDAV..) + +I don't know if it is really elegant to drag +urls into this anyway. The user may be left making up an url scheme for +something that does not involve urls at all. +"""]]
Added CHECKPRESENT-URL extension to the external special remote protocol
diff --git a/Annex/Url.hs b/Annex/Url.hs
index 148fa2f188..05e408351c 100644
--- a/Annex/Url.hs
+++ b/Annex/Url.hs
@@ -15,6 +15,7 @@ module Annex.Url (
getUserAgent,
ipAddressesUnlimited,
checkBoth,
+ checkBoth',
download,
download',
exists,
@@ -197,6 +198,10 @@ checkBoth url expected_size uo =
Right r -> return r
Left err -> warning (UnquotedString err) >> return False
+checkBoth' :: U.URLString -> Maybe Integer -> U.UrlOptions -> Annex (Either String Bool)
+checkBoth' url expected_size uo = either (Left . show) id
+ <$> tryNonAsync (liftIO $ U.checkBoth url expected_size uo)
+
download :: MeterUpdate -> Maybe IncrementalVerifier -> U.URLString -> OsPath -> U.UrlOptions -> Annex Bool
download meterupdate iv url file uo =
liftIO (U.download meterupdate iv url file uo) >>= \case
diff --git a/CHANGELOG b/CHANGELOG
index a7573f8004..4301e3804a 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -3,6 +3,7 @@ git-annex (10.20260214) UNRELEASED; urgency=medium
* Fix retrieval from http git remotes of keys with '%' in their names.
* Fix behavior when initremote is used with --sameas=
combined with --private.
+ * Added CHECKPRESENT-URL extension to the external special remote protocol.
-- Joey Hess <id@joeyh.name> Mon, 16 Feb 2026 13:38:21 -0400
diff --git a/Remote/External.hs b/Remote/External.hs
index 6418dc7f89..02de11dbea 100644
--- a/Remote/External.hs
+++ b/Remote/External.hs
@@ -95,7 +95,7 @@ gen rt externalprogram r u rc gc rs
{ storeExport = storeExportM external
, retrieveExport = retrieveExportM external gc
, removeExport = removeExportM external
- , checkPresentExport = checkPresentExportM external
+ , checkPresentExport = checkPresentExportM external gc
, removeExportDirectory = Just $ removeExportDirectoryM external
, renameExport = Just $ renameExportM external
}
@@ -118,7 +118,7 @@ gen rt externalprogram r u rc gc rs
(storeKeyM external)
(retrieveKeyFileM external gc)
(removeKeyM external)
- (checkPresentM external)
+ (checkPresentM external gc)
rmt
where
mk c cst ordered avail towhereis togetinfo toclaimurl tocheckurl exportactions cheapexportsupported =
@@ -276,8 +276,8 @@ removeKeyM external _proof k = either giveup return =<< go
respErrorMessage "REMOVE" errmsg
_ -> Nothing
-checkPresentM :: External -> CheckPresent
-checkPresentM external k = either giveup id <$> go
+checkPresentM :: External -> RemoteGitConfig -> CheckPresent
+checkPresentM external gc k = either giveup id <$> go
where
go = handleRequestKey external CHECKPRESENT k Nothing $ \resp ->
case resp of
@@ -288,6 +288,8 @@ checkPresentM external k = either giveup id <$> go
CHECKPRESENT_UNKNOWN k' errmsg
| k' == k -> result $ Left $
respErrorMessage "CHECKPRESENT" errmsg
+ CHECKPRESENT_URL k' url
+ | k == k' -> checkKeyUrl' gc k url
_ -> Nothing
whereisKeyM :: External -> Key -> Annex [String]
@@ -327,8 +329,8 @@ retrieveExportM external gc k loc dest p = do
_ -> Nothing
req sk = TRANSFEREXPORT Download sk (fromOsPath dest)
-checkPresentExportM :: External -> Key -> ExportLocation -> Annex Bool
-checkPresentExportM external k loc = either giveup id <$> go
+checkPresentExportM :: External -> RemoteGitConfig -> Key -> ExportLocation -> Annex Bool
+checkPresentExportM external gc k loc = either giveup id <$> go
where
go = handleRequestExport external loc CHECKPRESENTEXPORT k Nothing $ \resp -> case resp of
CHECKPRESENT_SUCCESS k'
@@ -338,6 +340,8 @@ checkPresentExportM external k loc = either giveup id <$> go
CHECKPRESENT_UNKNOWN k' errmsg
| k' == k -> result $ Left $
respErrorMessage "CHECKPRESENT" errmsg
+ CHECKPRESENT_URL k' url
+ | k == k' -> checkKeyUrl' gc k url
UNSUPPORTED_REQUEST -> result $
Left "CHECKPRESENTEXPORT not implemented by external special remote"
_ -> Nothing
@@ -861,6 +865,11 @@ checkKeyUrl gc k = do
us <- getWebUrls k
anyM (\u -> withUrlOptions (Just gc) $ checkBoth u (fromKey keySize k)) us
+checkKeyUrl' :: RemoteGitConfig -> Key -> URLString -> Maybe (Annex (ResponseHandlerResult (Either String Bool)))
+checkKeyUrl' gc k url =
+ Just $ withUrlOptions (Just gc) $ \uo ->
+ Result <$> checkBoth' url (fromKey keySize k) uo
+
getWebUrls :: Key -> Annex [URLString]
getWebUrls key = filter supported <$> getUrls key
where
diff --git a/Remote/External/Types.hs b/Remote/External/Types.hs
index 75f6d801f5..724b54486a 100644
--- a/Remote/External/Types.hs
+++ b/Remote/External/Types.hs
@@ -1,6 +1,6 @@
{- External special remote data types.
-
- - Copyright 2013-2025 Joey Hess <id@joeyh.name>
+ - Copyright 2013-2026 Joey Hess <id@joeyh.name>
-
- Licensed under the GNU AGPL version 3 or higher.
-}
@@ -116,6 +116,7 @@ supportedExtensionList = ExtensionList
, "GETGITREMOTENAME"
, "UNAVAILABLERESPONSE"
, "TRANSFER-RETRIEVE-URL"
+ , "CHECKPRESENT-URL"
, asyncExtension
]
@@ -247,6 +248,7 @@ data Response
| CHECKPRESENT_SUCCESS Key
| CHECKPRESENT_FAILURE Key
| CHECKPRESENT_UNKNOWN Key ErrorMsg
+ | CHECKPRESENT_URL Key URLString
| REMOVE_SUCCESS Key
| REMOVE_FAILURE Key ErrorMsg
| COST Cost
@@ -286,6 +288,7 @@ instance Proto.Receivable Response where
parseCommand "CHECKPRESENT-SUCCESS" = Proto.parse1 CHECKPRESENT_SUCCESS
parseCommand "CHECKPRESENT-FAILURE" = Proto.parse1 CHECKPRESENT_FAILURE
parseCommand "CHECKPRESENT-UNKNOWN" = Proto.parse2 CHECKPRESENT_UNKNOWN
+ parseCommand "CHECKPRESENT-URL" = Proto.parse2 CHECKPRESENT_URL
parseCommand "REMOVE-SUCCESS" = Proto.parse1 REMOVE_SUCCESS
parseCommand "REMOVE-FAILURE" = Proto.parse2 REMOVE_FAILURE
parseCommand "COST" = Proto.parse1 COST
diff --git a/doc/design/external_special_remote_protocol.mdwn b/doc/design/external_special_remote_protocol.mdwn
index f79b8230ae..7c97d721b2 100644
--- a/doc/design/external_special_remote_protocol.mdwn
+++ b/doc/design/external_special_remote_protocol.mdwn
@@ -45,7 +45,7 @@ Recent versions of git-annex respond with a message indicating
protocol extensions that it supports. Older versions of
git-annex do not send this message.
- EXTENSIONS INFO ASYNC GETGITREMOTENAME UNAVAILABLERESPONSE TRANSFER-RETRIEVE-URL
+ EXTENSIONS INFO ASYNC GETGITREMOTENAME UNAVAILABLERESPONSE TRANSFER-RETRIEVE-URL CHECKPRESENT-URL
The special remote can respond to that with its own EXTENSIONS message, listing
any extensions it wants to use.
@@ -162,6 +162,11 @@ The following requests *must* all be supported by the special remote.
* `CHECKPRESENT-UNKNOWN Key ErrorMsg`
Indicates that it is not currently possible to verify if the key is
present in the remote. (Perhaps the remote cannot be contacted.)
+ * `CHECKPRESENT-URL Key Url`
+ Rather than the special remote checking an url itself,
+ this lets it offload that work to git-annex. This response is a protocol
+ extension; it's only safe to send it to git-annex after it sent an
+ `EXTENSIONS` that included `CHECKPRESENT-URL`.
* `REMOVE Key`
Requests the remote to remove a key's contents.
* `REMOVE-SUCCESS Key`
@@ -488,6 +493,9 @@ These protocol extensions are currently supported.
* `TRANSFER-RETRIEVE-URL`
This allows the `TRANSFER-RETRIEVE-URL` response to be used
in reply to `TRANSFER` and `TRANSFEREXPORT`.
+* `CHECKPRESENT-URL`
+ This allows the `CHECKPRESENT-URL` response to be used
+ in reply to `CHECKPRESENT` and `CHECKPRESENTEXPORT`.
## signals
diff --git a/doc/design/external_special_remote_protocol/export_and_import_appendix.mdwn b/doc/design/external_special_remote_protocol/export_and_import_appendix.mdwn
index 1f30828c48..0bb70b7e60 100644
--- a/doc/design/external_special_remote_protocol/export_and_import_appendix.mdwn
+++ b/doc/design/external_special_remote_protocol/export_and_import_appendix.mdwn
@@ -71,6 +71,11 @@ a request, it can reply with `UNSUPPORTED-REQUEST`.
* `CHECKPRESENT-UNKNOWN Key ErrorMsg`
Indicates that it is not currently possible to verify if content is
present in the remote. (Perhaps the remote cannot be contacted.)
+ * `CHECKPRESENT-URL Key Url`
+ Rather than the special remote checking an url itself,
+ this lets it offload that work to git-annex. This response is a protocol
+ extension; it's only safe to send it to git-annex after it sent an
+ `EXTENSIONS` that included `CHECKPRESENT-URL`.
* `REMOVEEXPORT Key`
Requests the remote to remove content stored by `TRANSFEREXPORT`
with the previously provided `EXPORT` Name.
diff --git a/doc/todo/CHECKPRESENT_redirect_to_URL.mdwn b/doc/todo/CHECKPRESENT_redirect_to_URL.mdwn
index a44884ef63..103c522c71 100644
--- a/doc/todo/CHECKPRESENT_redirect_to_URL.mdwn
+++ b/doc/todo/CHECKPRESENT_redirect_to_URL.mdwn
(Diff truncated)
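
The `CHECKPRESENT-URL` exchange documented in the diff above can be sketched from the special remote's side. This is only an illustration under stated assumptions: `BASE_URL`, `supports_checkpresent_url` and `reply_checkpresent` are hypothetical names that do not exist in git-annex, and the sketch assumes the key needs no url escaping. It shows the one rule the documentation states, namely that the response is only safe to send after git-annex sent an `EXTENSIONS` that included `CHECKPRESENT-URL`.

```shell
#!/bin/sh
# Sketch only. BASE_URL and both function names are hypothetical;
# they are not part of git-annex or of any existing special remote.
BASE_URL="https://example.com/store"

# Succeeds when the EXTENSIONS line received from git-annex
# lists the CHECKPRESENT-URL extension.
supports_checkpresent_url() {
    case " $1 " in
        *" CHECKPRESENT-URL "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the reply to a "CHECKPRESENT Key" request.
# $1 is the EXTENSIONS line from git-annex, $2 is the key.
reply_checkpresent() {
    if supports_checkpresent_url "$1"; then
        # Offload the check: git-annex checks the url itself.
        printf 'CHECKPRESENT-URL %s %s/%s\n' "$2" "$BASE_URL" "$2"
    else
        # Older git-annex: this read-only sketch has no HTTP code of its own.
        printf 'CHECKPRESENT-UNKNOWN %s %s\n' "$2" "CHECKPRESENT-URL extension not negotiated"
    fi
}
```

A real remote would probably want its own fallback check for older git-annex versions rather than always answering `CHECKPRESENT-UNKNOWN`.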
update
diff --git a/doc/thanks/list b/doc/thanks/list
index dfeda7a813..0a65388abb 100644
--- a/doc/thanks/list
+++ b/doc/thanks/list
@@ -126,3 +126,5 @@ Lilia.Nanne,
 Dusty Mabe,
 mpol,
 Andrew Poelstra,
+joshingly,
+Melody Tolly,
break out todo
diff --git a/doc/todo/CHECKPRESENT_redirect_to_URL.mdwn b/doc/todo/CHECKPRESENT_redirect_to_URL.mdwn new file mode 100644 index 0000000000..a44884ef63 --- /dev/null +++ b/doc/todo/CHECKPRESENT_redirect_to_URL.mdwn @@ -0,0 +1,15 @@ +Following up on [[todo/Special_remote_redirect_to_URL]], +it would be useful for CHECKPRESENT (and also CHECKPRESENTEXPORT) +to be able to redirect to an url, and let git-annex do the checking. + +This will let external special remotes that are readonly and can calculate +urls on their own avoid needing to implement HTTP at all. + +The protocol extension would be: + + EXTENSIONS CHECKPRESENT-URL + CHECKPRESENT-URL Key Url + +--[[Joey]] + +[[!tag projects/INM7]] diff --git a/doc/todo/Special_remote_redirect_to_URL/comment_9_2a6eaab78886f0d6c372c7bdc929d7c4._comment b/doc/todo/Special_remote_redirect_to_URL/comment_9_2a6eaab78886f0d6c372c7bdc929d7c4._comment new file mode 100644 index 0000000000..d83f4a8fb6 --- /dev/null +++ b/doc/todo/Special_remote_redirect_to_URL/comment_9_2a6eaab78886f0d6c372c7bdc929d7c4._comment @@ -0,0 +1,7 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 9""" + date="2026-02-18T20:51:57Z" + content=""" +Opened [[todo/CHECKPRESENT_redirect_to_URL]]. +"""]]
update
diff --git a/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment b/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment index f44216176b..f6180724bf 100644 --- a/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment +++ b/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment @@ -7,11 +7,9 @@ Some of this strikes me as perhaps coming at [[todo/Ephemeral_special_remotes]] from a different direction? Re the inflation of the git-annex branch when using sameas, -I've checked and `git-annex initremote --sameas=foo --private` -still writes to the git-annex branch. But -it should be possible to keep the sameas remote's -config out of the git-annex branch and only stored locally. -Opened a bug report, [[bugs/sameas_private]]. +I fixed a bug ([[bugs/sameas_private]]) and you'll be able to use +`git-annex initremote --sameas=foo --private` to keep the configuration +of the new sameas remote out of the git-annex branch. So, it seems to me that your broker, if it knows of several different urls that can be used to access `myplace`, can be configured at `initremote`
make annex-private use annex-config-uuid when set, rather than annex-uuid
Fix behavior when initremote is used with --sameas= combined with --private.
diff --git a/CHANGELOG b/CHANGELOG index 21670f4a1f..a7573f8004 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -1,6 +1,8 @@ git-annex (10.20260214) UNRELEASED; urgency=medium * Fix retrival from http git remotes of keys with '%' in their names. + * Fix behavior when initremote is used with --sameas= + combined with --private. -- Joey Hess <id@joeyh.name> Mon, 16 Feb 2026 13:38:21 -0400 diff --git a/Types/GitConfig.hs b/Types/GitConfig.hs index 5772bebb0c..962cb732ae 100644 --- a/Types/GitConfig.hs +++ b/Types/GitConfig.hs @@ -307,8 +307,10 @@ extractGitConfig configsource r = GitConfig | Git.Config.isTrueFalse' v /= Just True = Nothing | isRemoteKey (remoteAnnexConfigEnd "private") k = do remotename <- remoteKeyToRemoteName k - toUUID <$> Git.Config.getMaybe - (remoteAnnexConfig remotename "uuid") r + let getu c = + toUUID <$> Git.Config.getMaybe + (remoteAnnexConfig remotename c) r + getu "config-uuid" <|> getu "uuid" | otherwise = Nothing in mapMaybe get (M.toList (Git.config r)) ] diff --git a/doc/bugs/sameas_private.mdwn b/doc/bugs/sameas_private.mdwn index db48f5b688..e6aae78912 100644 --- a/doc/bugs/sameas_private.mdwn +++ b/doc/bugs/sameas_private.mdwn @@ -1,13 +1,22 @@ `git-annex initremote --sameas=foo --private` is not actually -private. +private, or not in a way that seems to make sense. -It writes to the git-annex branch, adding in remote.log the config uuid of the -sameas remote. +Currently, it writes to the git-annex branch, adding in remote.log the +config uuid of the sameas remote. It should be possible to avoid writing that there. Since the config uuid is the only place a sameas remote touches the git-annex branch, this would -allow making up sameas remotes for local use. Location log changes -for a private sameas remote would still be recorded in the git-annex -branch, as long as the remote uuid is not itself private. --[[Joey]] +allow making up sameas remotes for local use. 
+ +But also, and worse, that actually makes location log changes for remote +foo be logged to the private journal. That happens because +remote.name.annex-private is set for the sameas remote, and +it has the same annex-uuid as foo. This is highly surprising and wrong +behavior! + +The fix will be to make remote.name.annex-private affect the +annex-config-uuid when there is one, rather than the annex-uuid. + +> [[fixed|done]] --[[Joey]] [[!tag projects/INM7]] diff --git a/doc/git-annex-initremote.mdwn b/doc/git-annex-initremote.mdwn index bcb3494b7f..f4d5705308 100644 --- a/doc/git-annex-initremote.mdwn +++ b/doc/git-annex-initremote.mdwn @@ -87,6 +87,11 @@ want to use `git annex renameremote`. branch. The special remote will only be usable from the repository where it was created. + When used in combination with `--sameas=foo`, the configuration of the + new special remote is kept private. But when files are sent to the new + special remote, it will be public that they are present in remote "foo", + unless it is also private. + * `--json` Enable JSON output. This is intended to be parsed by programs that use
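
The lookup order introduced in `Types/GitConfig.hs` above (`getu "config-uuid" <|> getu "uuid"`) can be mimicked with plain `git config`, as a rough way to see which UUID a private remote resolves to. This is a sketch, not git-annex code: `private_uuid_for_remote` is a hypothetical helper, and it assumes the config keys are `remote.<name>.annex-private`, `remote.<name>.annex-config-uuid` and `remote.<name>.annex-uuid`.

```shell
#!/bin/sh
# Sketch only: private_uuid_for_remote is a hypothetical helper,
# not something git-annex provides.
# Prints the UUID that remote.<name>.annex-private applies to:
# annex-config-uuid when set, otherwise annex-uuid.
# $1 is the path to a git config file, $2 is the remote name.
private_uuid_for_remote() {
    private=$(git config --file "$1" --bool "remote.$2.annex-private" 2>/dev/null) || private=false
    # Remotes that are not marked private are skipped.
    [ "$private" = "true" ] || return 1
    git config --file "$1" "remote.$2.annex-config-uuid" 2>/dev/null ||
        git config --file "$1" "remote.$2.annex-uuid" 2>/dev/null
}
```

For example, `private_uuid_for_remote .git/config bar` would print the config uuid of a sameas remote `bar`, leaving the uuid it shares with `foo` untouched.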
formatting
diff --git a/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment b/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment index 0b59cb561c..f44216176b 100644 --- a/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment +++ b/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment @@ -1,6 +1,6 @@ [[!comment format=mdwn username="joey" - subject="""Re: "download URL broker"""" + subject="""Re: download URL broker""" date="2026-02-18T19:43:01Z" content=""" Some of this strikes me as perhaps coming at
response
diff --git a/doc/bugs/sameas_private.mdwn b/doc/bugs/sameas_private.mdwn new file mode 100644 index 0000000000..db48f5b688 --- /dev/null +++ b/doc/bugs/sameas_private.mdwn @@ -0,0 +1,13 @@ +`git-annex initremote --sameas=foo --private` is not actually +private. + +It writes to the git-annex branch, adding in remote.log the config uuid of the +sameas remote. + +It should be possible to avoid writing that there. Since the config uuid +is the only place a sameas remote touches the git-annex branch, this would +allow making up sameas remotes for local use. Location log changes +for a private sameas remote would still be recorded in the git-annex +branch, as long as the remote uuid is not itself private. --[[Joey]] + +[[!tag projects/INM7]] diff --git a/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment b/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment new file mode 100644 index 0000000000..0b59cb561c --- /dev/null +++ b/doc/todo/Special_remote_redirect_to_URL/comment_8_81c4dd572b3f91eaa577158756150f7e._comment @@ -0,0 +1,21 @@ +[[!comment format=mdwn + username="joey" + subject="""Re: "download URL broker"""" + date="2026-02-18T19:43:01Z" + content=""" +Some of this strikes me as perhaps coming at +[[todo/Ephemeral_special_remotes]] from a different direction? + +Re the inflation of the git-annex branch when using sameas, +I've checked and `git-annex initremote --sameas=foo --private` +still writes to the git-annex branch. But +it should be possible to keep the sameas remote's +config out of the git-annex branch and only stored locally. +Opened a bug report, [[bugs/sameas_private]]. + +So, it seems to me that your broker, if it knows of several different urls +that can be used to access `myplace`, can be configured at `initremote` +time which set of urls to use. 
And you can initialize multiple instances +of the broker, each configured to use a different set of url, with +`--sameas --private`. +"""]]
response 3
diff --git a/doc/todo/Special_remote_redirect_to_URL/comment_7_f9d5dbb6ca33aa5e9d143fe7f5c93822._comment b/doc/todo/Special_remote_redirect_to_URL/comment_7_f9d5dbb6ca33aa5e9d143fe7f5c93822._comment new file mode 100644 index 0000000000..fc5b383949 --- /dev/null +++ b/doc/todo/Special_remote_redirect_to_URL/comment_7_f9d5dbb6ca33aa5e9d143fe7f5c93822._comment @@ -0,0 +1,19 @@ +[[!comment format=mdwn + username="joey" + subject="""Re: CLAIMURL""" + date="2026-02-18T19:29:02Z" + content=""" +CLAIMURL is not currently used for TRANSFER-RETRIEVE-URL. (It's also +not quite accurate to say that the `web` special remote is used.) + +Supporting that would mean that, each time a remote replies with +TRANSFER-RETRIEVE-URL, git-annex would need to query each other remote +in turn to see if they claim the url. That could mean starting up a lot +of extenal special remote programs (when not running yet) and doing a +roundtrip through them, so latency might start to become a problem. + +Also, there would be the possibility of loops between 2 or more remotes. +Eg, remote A replies with TRANSFER-RETRIEVE-URL with an url that remote B +CLAIMURLs, only to then reply with TRANSFER-RETRIEVE-URL, with an url that +remote A CLAIMURLs. +"""]]
comment 2
diff --git a/doc/todo/Special_remote_redirect_to_URL/comment_6_7bb4abf3b30527d67b3c72a397fc4cd0._comment b/doc/todo/Special_remote_redirect_to_URL/comment_6_7bb4abf3b30527d67b3c72a397fc4cd0._comment new file mode 100644 index 0000000000..ffce28fb67 --- /dev/null +++ b/doc/todo/Special_remote_redirect_to_URL/comment_6_7bb4abf3b30527d67b3c72a397fc4cd0._comment @@ -0,0 +1,18 @@ +[[!comment format=mdwn + username="joey" + subject="""Re: multiple URLS for a key""" + date="2026-02-18T19:16:06Z" + content=""" +TRANSFER-RETRIEVE-URL was designed as a redirect, so it only redirects to +one place. And git-annex won't try again to retrieve from the same remote +if url fails to download. + +I could imagine extending TRANSFER-RETRIEVE-URL to have a list of urls. But +I can also imagine needing to extend it with HTTP headers to use for the +url, and these things conflict, given the simple line and word based +protocol. + +I think that sameas remotes that use other urls might be a solution. +Running eg `git-annex get` without specifying a remote, it will keep trying +different remotes until one succeeds. +"""]]
response 1
diff --git a/doc/todo/Special_remote_redirect_to_URL/comment_5_c38df20873ec737243f27f1f33882703._comment b/doc/todo/Special_remote_redirect_to_URL/comment_5_c38df20873ec737243f27f1f33882703._comment new file mode 100644 index 0000000000..cd8dd55280 --- /dev/null +++ b/doc/todo/Special_remote_redirect_to_URL/comment_5_c38df20873ec737243f27f1f33882703._comment @@ -0,0 +1,35 @@ +[[!comment format=mdwn + username="joey" + subject="""Re: CHECKPRESENT""" + date="2026-02-18T18:47:51Z" + content=""" +Yes CHECKPRESENT still needs the special remote to do HTTP. + +I do think that was an oversight. The original todo mentioned +"taking advantage of the testing and security hardening of the +git-annex implementation" and if a special remote is read-only, +CHECKPRESENT may be the only time it needs to do HTTP. + +A protocol extension for this would look like: + + EXTENSIONS CHECKPRESENT-URL + CHECKPRESENT-URL Key Url + +--- + +> Would it impact the usage of such a special remote, if it would be configured +> with sameas=otherremote? Would both remote implementations need to implement +> CHECKPRESENT (consistently), or would one (in this case otherremote) by enough. + +git-annex won't try to use the otherremote when it's been asked to use +the sameas remote. + +If one implemented CHECKPRESENT and the other always replied with +"CHECKPRESENT-UNKNOWN", then a command like `git-annex fsck --fast --from` +when used with the former remote would be able to verify that the content +is present, and when used with the latter remote would it would error out. + +So you could perhaps get away with not implementing that. For a readonly +remote, fsck is I think the only thing that uses CHECKPRESENT on a +user-specified remote. It's more used on remotes that can be written to. +"""]]
Added a comment: Functionality gaps?
diff --git a/doc/todo/Special_remote_redirect_to_URL/comment_4_d7d6814a2a19227ceb43ba2ba05c32ba._comment b/doc/todo/Special_remote_redirect_to_URL/comment_4_d7d6814a2a19227ceb43ba2ba05c32ba._comment
new file mode 100644
index 0000000000..b115e0cd15
--- /dev/null
+++ b/doc/todo/Special_remote_redirect_to_URL/comment_4_d7d6814a2a19227ceb43ba2ba05c32ba._comment
@@ -0,0 +1,34 @@
+[[!comment format=mdwn
+ username="mih"
+ avatar="http://cdn.libravatar.org/avatar/f881df265a423e4f24eff27c623148fd"
+ subject="Functionality gaps?"
+ date="2026-02-18T18:10:07Z"
+ content="""
+I looked into adopting this new feature for a special remote implementation. Four questions arose:
+
+1. In order to implement CHECKPRESENT it appears that a special remote still needs to implement the logic for the equivalent of an HTTP HEAD request. From my POV this limits the utility of a git-annex based download, because significant logic still needs to be implemented in a special remote itself. Would it impact the usage of such a special remote, if it would be configured with `sameas=otherremote`? Would both remote implementations need to implement CHECKPRESENT (consistently), or would one (in this case `otherremote`) be enough?
+
+2. I am uncertain re the signaling in case of multiple possible URL targets for a key, and an eventual download failure regarding one URL communicated via TRANSFER-RETRIEVE-URL. I believe that, when git-annex fails to download from a reported URL successfully, it can only send another TRANSFER-RETRIEVE request to the special remote (possibly go to the next remote first). This would mean that the special remote either needs to maintain a state on which URL has been reported before, or it would need to implement the capacity to test for availability (essentially the topic of Q1), and can never report more than one URL. Is this correct?
+
+3. What is the logic git-annex uses to act on a URL communicated via TRANSFER-RETRIEVE-URL? Would it match it against all available special remotes via CLAIMURL, or give it straight to `web` (and only that)?
+
+4. I am wondering, if it would be possible and sensible, to use this feature for implementing a download URL \"broker\"? A use case would be an informed selection of a download URL from a set of URLs associated with a key. This is similar to the `urlinclude/exclude` feature of the `web` special remote, but (depending on Q3) is relevant also to other special remotes acting as downloader implementations.
+
+
+Elaborating on (4) a bit more: My thinking is focused on the optimal long-term accessibility of keys -- across infrastructure transitions and different concurrent environments. From my POV git-annex provides me with the following options for making `myplace` as a special remote optimally work across space and time.
+
+- via `sameas=myplace`, I can have multiple special remotes point to `myplace`. In each environment I can use the additional remotes (by name) to optimally access `myplace`. The decision making process is independent of git-annex. However, the possible access options need to be encoded in the annex branch to make this work. This creates a problem of inflation of this space in case of repositories that are used in many different contexts (think public (research) data that want to capitalize on the decentralized nature of git-annex).
+
+- via `enableremote` I can swap out the type and parameterization of `myplace` entirely. However, unlike with `initremote` there is no `--private`, so this is more geared toward the use case of \"previous access method is no longer available\", rather than a temporary optimization.
+
+- when key access is (temporarily) preferred via URLs, I could generate a temporary `web` special remote via `initremote --private` and a `urlinclude` pattern.
+
+In all cases, I cannot simply run `git annex get`, but I need to identify a specific remote that may need to be created first, or set a low cost for it.
+
+I'd be glad to be pointed at omissions in this assessment. Thanks!
+
+
+
+
+
+"""]]
diff --git a/doc/bugs/macos_switch_to_openrsync_seems_to_break_sync.mdwn b/doc/bugs/macos_switch_to_openrsync_seems_to_break_sync.mdwn
new file mode 100644
index 0000000000..6a5fc6ce7e
--- /dev/null
+++ b/doc/bugs/macos_switch_to_openrsync_seems_to_break_sync.mdwn
@@ -0,0 +1,46 @@
+### Please describe the problem.
+Syncing content to an rsync remote no longer works on macOS 26.3.
+(Specifically an encrypted rsync remote with shared encryption)
+
+### What steps will reproduce the problem?
+MacOS 26.3 (with openrsync)
+run: `git annex sync my-rsync-remote --content`
+observe: `rsync error: unexpected end of file.`
+(these files are all fine and can be opened locally)
+
+### What version of git-annex are you using? On what operating system?
+MacOS 26.3 with git-annex from homebrew
+
+git-annex version: 10.20260213
+build flags: Assistant Webapp FsEvents TorrentParser MagicMime Benchmark Feeds Testsuite S3 WebDAV Servant
+dependency versions: aws-0.25.2 bloomfilter-2.0.1.3 crypton-1.0.6 DAV-1.3.4 feed-1.3.2.1 ghc-9.14.1 http-client-0.7.19 torrent-10000.1.3 uuid-1.3.16 yesod-1.6.2.1
+key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL GITBUNDLE GITMANIFEST VURL X*
+remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg rclone hook external compute mask
+operating system: darwin aarch64
+supported repository versions: 8 9 10
+upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10
+local repository version: 10
+
+### Please provide any additional information below.
+
+I have a suspicion it's because macOS now has openrsync instead of rsync.
+The files themselves all seem to be fine.
+The error I'm seeing seems to be some kind of rsync crash in the background.
+Googling shows that openrsync doesn't accept all the command line arguments rsync does.
+I don't know how git-annex is using it in the backend but I imagine this could be something that would break it.
+Are you able to confirm if git-annex should work on MacOS with openrsync? (default)
+If it is indeed the case that openrsync isn't supported, checking compatibility and aborting with a message directing the user on how to switch to use the other rsync would be great.
+Note: It's totally possible it's something else... no idea what though.
+
+[[!format sh """
+# If you can, paste a complete transcript of the problem occurring here.
+# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
+
+
+# End of transcript or log.
+"""]]
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+Yeah was working great for syncing and archiving large static files across multiple drives and machines. (helps deduping them too)
+Have also had success in the past syncing to NAS storage using rsync remote and then using it to pull data on demand when away from home.
+Thanks for taking the time to read this.
switch arm autobuild links to armhf
armel no longer being updated
diff --git a/doc/install/Linux_standalone.mdwn b/doc/install/Linux_standalone.mdwn index 57bcf5bfae..83bae6af9b 100644 --- a/doc/install/Linux_standalone.mdwn +++ b/doc/install/Linux_standalone.mdwn @@ -7,7 +7,7 @@ dependencies and is self-contained. * x86-64: [download tarball](https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-amd64.tar.gz) * x86-32: [download tarball](https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-i386.tar.gz) -* arm: [download tarball](https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-armel.tar.gz) +* arm: [download tarball](https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-armhf.tar.gz) * arm64: [download tarball](https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-arm64.tar.gz) * arm64, for ancient kernels: [download tarball](https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-arm64-ancient.tar.gz) @@ -37,7 +37,7 @@ An hourly autobuild is also available, hosted by [[Joey]]: * x86-32: [download tarball](https://downloads.kitenet.net/git-annex/autobuild/i386/git-annex-standalone-i386.tar.gz) ([build logs](https://downloads.kitenet.net/git-annex/autobuild/i386/)) * arm64: [download tarball](https://downloads.kitenet.net/git-annex/autobuild/arm64/git-annex-standalone-arm64.tar.gz) ([build logs](https://downloads.kitenet.net/git-annex/autobuild/arm64/)) * arm64, for ancient kernels: [download tarball](https://downloads.kitenet.net/git-annex/autobuild/arm64-ancient/git-annex-standalone-arm64-ancient.tar.gz) ([build logs](https://downloads.kitenet.net/git-annex/autobuild/arm64-ancient/)) -* arm: [download tarball](https://downloads.kitenet.net/git-annex/autobuild/armel/git-annex-standalone-armel.tar.gz) ([build logs](https://downloads.kitenet.net/git-annex/autobuild/armel/)) +* arm: [download tarball](https://downloads.kitenet.net/git-annex/autobuild/armhf/git-annex-standalone-armhf.tar.gz) ([build 
logs](https://downloads.kitenet.net/git-annex/autobuild/armhf/)) ## download security
switch to armhf
diff --git a/doc/builds.mdwn b/doc/builds.mdwn index dc639170c7..48744dda62 100644 --- a/doc/builds.mdwn +++ b/doc/builds.mdwn @@ -9,8 +9,8 @@ <h2>Linux amd64</h2> <iframe width=1024 height=40em scrolling=no frameborder=0 marginheight=0 marginwidth=0 src="https://downloads.kitenet.net/git-annex/autobuild/amd64/build-version"> </iframe> -<h2>Linux armel</h2> -<iframe width=1024 height=40em scrolling=no frameborder=0 marginheight=0 marginwidth=0 src="https://downloads.kitenet.net/git-annex/autobuild/armel/build-version"> +<h2>Linux armhf</h2> +<iframe width=1024 height=40em scrolling=no frameborder=0 marginheight=0 marginwidth=0 src="https://downloads.kitenet.net/git-annex/autobuild/armhf/build-version"> </iframe> <h2>Linux arm64</h2> <iframe width=1024 height=40em scrolling=no frameborder=0 marginheight=0 marginwidth=0 src="https://downloads.kitenet.net/git-annex/autobuild/arm64/build-version"> @@ -34,8 +34,8 @@ <h2>Linux amd64</h2> <iframe width=1024 scrolling=no frameborder=0 marginheight=0 marginwidth=0 src="https://downloads.kitenet.net/git-annex/autobuild/amd64/"> </iframe> -<h2>Linux armel</h2> -<iframe width=1024 scrolling=no frameborder=0 marginheight=0 marginwidth=0 src="https://downloads.kitenet.net/git-annex/autobuild/armel/"> +<h2>Linux armhf</h2> +<iframe width=1024 scrolling=no frameborder=0 marginheight=0 marginwidth=0 src="https://downloads.kitenet.net/git-annex/autobuild/armhf/"> </iframe> <h2>Linux arm64</h2> <iframe width=1024 scrolling=no frameborder=0 marginheight=0 marginwidth=0 src="https://downloads.kitenet.net/git-annex/autobuild/arm64/">
comments
diff --git a/doc/bugs/assistant__58___nothing_added_to_commit_but_untracked/comment_2_922d428bb1506ba413b9ae6e5119aa25._comment b/doc/bugs/assistant__58___nothing_added_to_commit_but_untracked/comment_2_922d428bb1506ba413b9ae6e5119aa25._comment
new file mode 100644
index 0000000000..a85edbb7ce
--- /dev/null
+++ b/doc/bugs/assistant__58___nothing_added_to_commit_but_untracked/comment_2_922d428bb1506ba413b9ae6e5119aa25._comment
@@ -0,0 +1,29 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2026-02-16T18:24:00Z"
+ content="""
+Looked in more detail into fixing this by moving the ignore check to after
+a set of files has been gathered and fed through `git ls-files`. Unfortunately
+that will be complicated significantly by the fact that, after the ignore check
+it currently does things like re-writing symlinks to annex objects when the
+link target needs updating. There is a chicken and egg problem here,
+because the type of Change that gets queued depends on parts of that same
+code having run.
+
+BTW: Another way this same bug can manifest is that an annex object is added
+to a submodule, and the assistant updates its symlink to point out of the
+submodule, to the wrong annex objects directory.
+
+There is some very delicate timing going on in
+Assistant.Threads.Committer in order to gather Changes that happen close
+together in time. Which makes me think that even a simple approach of
+running `git ls-files` once per changed file, before the ignore check,
+might throw the timing off enough to be a problem. As well as being murder
+on the CPU when eg, a lot of files have been moved around.
+
+Note that [[todo/replace_assistant_with_assist]] would fix this bug,
+since `git-annex assist` does use `git ls-files`. Not that implementing
+that would be any easier than just fixing this bug. But, fixing this bug
+moves the assistant in the direction of that todo one way or the other.
+"""]]
diff --git a/doc/todo/replace_assistant_with_assist/comment_1_685f1aa27ff31fd24cb987f9ff743d93._comment b/doc/todo/replace_assistant_with_assist/comment_1_685f1aa27ff31fd24cb987f9ff743d93._comment
new file mode 100644
index 0000000000..600430c568
--- /dev/null
+++ b/doc/todo/replace_assistant_with_assist/comment_1_685f1aa27ff31fd24cb987f9ff743d93._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject=""""Gather inotify events""""
+ date="2026-02-16T18:51:50Z"
+ content="""
+The assistant has some very tricky, and probably also fragile code that
+gathers related inotify events. That would need to be factored out for
+this.
+"""]]
close
diff --git a/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file.mdwn b/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file.mdwn index 5303ad1b52..d1d59cec5c 100644 --- a/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file.mdwn +++ b/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file.mdwn @@ -105,3 +105,4 @@ when I was lucky - yes. [[!meta author=yoh]] [[!tag projects/repronim]] +> [[done]] --[[Joey]]
comment
diff --git a/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file/comment_3_f03f1b609b6b3ab46081913977488230._comment b/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file/comment_3_f03f1b609b6b3ab46081913977488230._comment new file mode 100644 index 0000000000..da4a15f67d --- /dev/null +++ b/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file/comment_3_f03f1b609b6b3ab46081913977488230._comment @@ -0,0 +1,54 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 3""" + date="2026-02-16T16:35:28Z" + content=""" +Congrats I guess, that's the first LLM-generated patch to git-annex, and +it seems approximately correct. + +It was unambiguously helpful to get the hint that `Remote/Git.hs:485` +was the location of the bug. That probably saved 10 minutes of my time. + +But, I probably would have found it easier to fix this on my own without +seeing that patch than it was to fix it given that patch. I had to do a +considerable amount of thinking about whether the patch was correct, or +just confidently sounding incorrect in a different manner than a +human-generated patch would be. (Not helped, certainly, by this being an +area of the code with no type system guardrails helping it be correct.) + +For one thing, I wondered, why does it use isUnescapedInURIComponent rather +than isUnescapedInURI? The latter handles '/' correctly without needing a +special case. + +Being faced with an LLM-generated patch also meant that I needed to consider +what its license is. I was faced with needing to clean-room my own version, +which is a bit difficult given how short the patch is (while probably still +long enough to be copyrightable). + +But, it turns out that git-annex already contains essentially the same +code in Remote/S3.hs, in genericPublicUrl: + + baseurl Posix.</> escapeURIString skipescape p + where + -- Don't need to escape '/' because the bucket object + -- is not necessarily a single url component. 
+ -- But do want to escape eg '+' and ' ' + skipescape '/' = True + skipescape c = isUnescapedInURIComponent c + +This code was presumably in the LLM's training set, and certainly appeared +to be available to it for context, so its mirroring of this could simply be +a case of Garbage In, Garbage Out. + +Note that "skipescape" is a much better name than the LLM-generated +"escchar" which behaves backwards from what its name suggests. + +Why did I use isUnescapedInURIComponent in that and isUnescapedInURI +in Remote/WebDav/DavLocation.hs? +I doubt there was a good reason for either choice, but a full analysis +did find a reason to prefer the isUnescapedInURIComponent approach, +to handle a path containing '[' or ']. + +So, in [[!commit 8fd9b67ed82ca0f39796a8d59431d42a7eb84957]], I've +factored out a general purpose function, and fixed this bug by using it. +"""]]
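The escaping rule discussed in the comment above (escape everything that is not allowed unescaped in a URI component, but leave '/' alone) can be illustrated outside Haskell. The following is a minimal Python sketch, not git-annex code: it assumes that `urllib.parse.quote` with `safe="/"` behaves roughly like the `skipescape` predicate, and `key_url`, the example host, and the shortened object path are hypothetical.

```python
from urllib.parse import quote


def key_url(base_url, object_path):
    """Join a repository URL and an annex object path, escaping the path.

    safe="/" keeps path separators intact, while characters such as
    '%', '&', ',', '+' and ' ' are percent-encoded.
    """
    return base_url.rstrip("/") + "/" + quote(object_path, safe="/")


# Object path modelled on the key from the bug report; the host is made up.
path = "annex/objects/zZ/3v/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ZXfZfnRfyM"
print(key_url("https://example.com/repo/.git", path))
```

Because every literal '%' in the key file name becomes `%25`, the resulting URL no longer contains malformed percent escapes, which is what made the unescaped URL fail to parse.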
comment
diff --git a/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file/comment_2_d2dee9d9be3ad9a6726397be2093e92d._comment b/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file/comment_2_d2dee9d9be3ad9a6726397be2093e92d._comment
new file mode 100644
index 0000000000..1f1214eaa9
--- /dev/null
+++ b/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file/comment_2_d2dee9d9be3ad9a6726397be2093e92d._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2026-02-16T16:09:47Z"
+ content="""
+> SHA256E keys (like SHA256E-s107998--4545...) contain only alphanumeric characters
+
+> The keyFile encoding produces no % or & characters
+
+Incorrect statements FWIW.
+
+Certain SHA*E keys will also be affected by this bug.
+"""]]
Added a comment
diff --git a/doc/bugs/Number_of_p2phttp_OS_threads_scales_with_-J_flag/comment_3_c127a0a56bacf6d6faabe4964afad37e._comment b/doc/bugs/Number_of_p2phttp_OS_threads_scales_with_-J_flag/comment_3_c127a0a56bacf6d6faabe4964afad37e._comment
new file mode 100644
index 0000000000..fca8715d4a
--- /dev/null
+++ b/doc/bugs/Number_of_p2phttp_OS_threads_scales_with_-J_flag/comment_3_c127a0a56bacf6d6faabe4964afad37e._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="matrss"
+ avatar="http://cdn.libravatar.org/avatar/cd1c0b3be1af288012e49197918395f0"
+ subject="comment 3"
+ date="2026-02-16T11:19:58Z"
+ content="""
+I must have forgotten to set the email replies checkbox. Thanks for implementing this, it sounds exactly like what I want :)
+"""]]
add news item for git-annex 10.20260213
diff --git a/doc/news/version_10.20250929.mdwn b/doc/news/version_10.20250929.mdwn deleted file mode 100644 index 4d46ac2cf1..0000000000 --- a/doc/news/version_10.20250929.mdwn +++ /dev/null @@ -1,7 +0,0 @@ -git-annex 10.20250929 released with [[!toggle text="these changes"]] -[[!toggleable text=""" * enableremote: Allow type= to be provided when it does not change the - type of the special remote. - * importfeed: Fix encoding issues parsing feeds when built with OsPath. - * Fix build with ghc 9.0.2. - * Remove the Servant build flag; always build with support for - annex+http urls and git-annex p2phttp."""]] \ No newline at end of file diff --git a/doc/news/version_10.20260213.mdwn b/doc/news/version_10.20260213.mdwn new file mode 100644 index 0000000000..873cee77a3 --- /dev/null +++ b/doc/news/version_10.20260213.mdwn @@ -0,0 +1,30 @@ +git-annex 10.20260213 released with [[!toggle text="these changes"]] +[[!toggleable text=""" * When used with git forges that allow Push to Create, the remote's + annex-uuid is re-probed after the initial push. + * addurl, importfeed: Enable --verifiable by default. + * fromkey, registerurl: When passed an url, generate a VURL key. + * unregisterurl: Unregister both VURL and URL keys. + * Fix behavior of local git remotes that have annex-ignore + set to be the same as ssh git remotes. + * Added annex.security.allow-insecure-https config, which allows + using old http servers that use TLS 1.2 without Extended Main + Secret support. + * p2phttp: Commit git-annex branch changes promptly. + * p2phttp: Fix a server stall by disabling warp's slowloris attack + prevention. + * p2phttp: Added --cpus option. + * Avoid ever starting more capabilities than the number of cpus. + * fsck: Support repairing a corrupted file in a versioned S3 remote. + * Fix incorrect transfer direction in remote transfer log when + downloading from a local git remote. 
+ * Fix bug that prevented 2 clones of a local git remote + from concurrently downloading the same file. + * rsync: Avoid deleting contents of a non-empty directory when + removing the last exported file from the directory. + * unregisterurl: Fix display of action to not be "registerurl". + * The OsPath build flag requires file-io 0.2.0, which fixes several + issues. + * Remove deprecated commands direct, indirect, proxy, and transferkeys. + * Deprecate undo command. + * Remove undo action from kde and nautilus integrations. + * Fix build on BSDs. Thanks, Greg Steuck"""]] \ No newline at end of file
Added a comment: Ensuring only one process
diff --git a/doc/design/external_special_remote_protocol/async_appendix/comment_4_9ce6f33448a4fde446365746fd094cd6._comment b/doc/design/external_special_remote_protocol/async_appendix/comment_4_9ce6f33448a4fde446365746fd094cd6._comment
new file mode 100644
index 0000000000..97fa175719
--- /dev/null
+++ b/doc/design/external_special_remote_protocol/async_appendix/comment_4_9ce6f33448a4fde446365746fd094cd6._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="calmofthestorm"
+ avatar="http://cdn.libravatar.org/avatar/9d97e9bcb1cf7680309e37cd69fab408"
+ subject="Ensuring only one process"
+ date="2026-02-13T06:01:44Z"
+ content="""
+I changed it so that each instance binds an RPC server to a UNIX domain socket and connects to it, so one gets chosen as coordinator, and it works great. Probably overkill, but it works. Still interested in the question of how this **should** work though.
+"""]]
Added a comment: the fix
diff --git a/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file/comment_1_12ec16bb350e9a291b6ce39bceeea692._comment b/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file/comment_1_12ec16bb350e9a291b6ce39bceeea692._comment new file mode 100644 index 0000000000..0ecad855ce --- /dev/null +++ b/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file/comment_1_12ec16bb350e9a291b6ce39bceeea692._comment @@ -0,0 +1,75 @@ +[[!comment format=mdwn + username="yarikoptic" + avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4" + subject="the fix" + date="2026-02-13T01:43:15Z" + content=""" +TL;DR, the patch + +```patch +diff --git a/Remote/Git.hs b/Remote/Git.hs +index 6b7dc77d98..4faaea082d 100644 +--- a/Remote/Git.hs ++++ b/Remote/Git.hs +@@ -482,7 +482,12 @@ inAnnex' repo rmt st@(State connpool duc _ _ _ _) key + keyUrls :: GitConfig -> Git.Repo -> Remote -> Key -> [String] + keyUrls gc repo r key = map tourl locs' + where +- tourl l = Git.repoLocation repo ++ \"/\" ++ l ++ tourl l = Git.repoLocation repo ++ \"/\" ++ escapeURIString escchar l ++ -- Escape characters that are not allowed unescaped in a URI ++ -- path component, but don't escape '/' since the location ++ -- is a path with multiple components. ++ escchar '/' = True ++ escchar c = isUnescapedInURIComponent c + -- If the remote is known to not be bare, try the hash locations + -- used for non-bare repos first, as an optimisation. + locs +``` + +seems to work well. Built in https://github.com/datalad/git-annex/pull/251 (CI tests still run), tested locally: + +``` +❯ /usr/bin/git-annex version +git-annex version: 10.20260115+git119-g43a3f3aaf2-1~ndall+1 +build flags: Assistant Webapp Inotify DBus DesktopNotify TorrentParser MagicMime Benchmark Feeds Testsuite S3 WebDAV Servant +dependency versions: aws-0.24.1 bloomfilter-2.0.1.2 crypton-0.34 DAV-1.3.4 feed-1.3.2.1 ghc-9.6.6 http-client-0.7.17 torrent-10000.1.3 uuid-1.3.15 yesod-1.6.2.1 +... 
+❯ /usr/bin/git-annex get --from origin video.mkv +get video.mkv (from origin...) ok +(recording state in git...) +``` +to work. Here is claude's analysis which lead it to the fix: + +``` + Bug Analysis: fails_to_get_from_apache2_server_URL_backend_file + + Root Cause + + The bug is in Remote/Git.hs:485 — the keyUrls function constructs URLs by simple string concatenation without URL-encoding the path components: + + tourl l = Git.repoLocation repo ++ \"/\" ++ l + + How the failure occurs + + 1. Key: URL--yt:https://www.youtube.com/watch,63v,613ZXfZfnRfyM + 2. keyFile encoding (Annex/Locations.hs:783-795) converts : → &c and / → %: + URL--yt&chttps&c%%www.youtube.com%watch,63v,613ZXfZfnRfyM + 3. keyUrls concatenates this directly into the URL path: + https://datasets.datalad.org/.../.git//annex/objects/zZ/3v/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ZXfZfnRfyM/... + 4. parseURIRelaxed (Utility/Url/Parse.hs:45-47) tries to parse this URL. It calls escapeURIString isAllowedInURI first, but % is allowed in URIs (it's the percent-encoding introducer), so it passes through unescaped. + 5. parseURI then sees %%w and %wa which are invalid percent-encoding sequences (% must be followed by two hex digits). The parse fails, returning Nothing. + 6. download' (Utility/Url.hs:389-391) hits the Nothing branch and returns \"invalid url\". + + Why SHA256E keys work + + SHA256E keys (like SHA256E-s107998--4545...) contain only alphanumeric characters, -, and .. The keyFile encoding produces no % or & characters, so the concatenated URL is always valid. + + The fix + + keyUrls in Remote/Git.hs:485 needs to URL-encode the path components. Other remotes already do this: + + - S3 (Remote/S3.hs:1221-1229): uses escapeURIString with a custom predicate keeping / but encoding everything else + - WebDAV (Remote/WebDAV/DavLocation.hs:35): uses escapeURIString isUnescapedInURI +``` +"""]]
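The failure walked through in the analysis above can be reproduced in miniature. This Python sketch is an illustration, not git-annex code: `key_file_partial` applies only the two substitutions named in the analysis (the real `keyFile` may handle further characters), and `invalid_percent_escapes` is a hypothetical helper that flags each '%' not followed by two hex digits, which is the condition that makes a strict URI parser reject the URL.

```python
import re


def key_file_partial(key):
    """Apply only the two substitutions named in the analysis:
    ':' becomes '&c' and '/' becomes '%'. (Partial, illustrative.)"""
    return key.replace(":", "&c").replace("/", "%")


def invalid_percent_escapes(text):
    """Return the offsets of every '%' that does not start a valid
    percent-encoded octet (i.e. is not followed by two hex digits)."""
    return [m.start() for m in re.finditer(r"%(?![0-9A-Fa-f]{2})", text)]


encoded = key_file_partial("URL--yt:https://www.youtube.com/watch,63v,613ZXfZfnRfyM")
print(encoded)
print(invalid_percent_escapes(encoded))
```

Every '%' in the encoded name is flagged, so a URL built by plain string concatenation cannot be parsed until those characters are themselves percent-encoded.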
report on inability to get video!
diff --git a/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file.mdwn b/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file.mdwn new file mode 100644 index 0000000000..5303ad1b52 --- /dev/null +++ b/doc/bugs/fails_to_get_from_apache2_server_URL_backend_file.mdwn @@ -0,0 +1,107 @@ +### Please describe the problem. + +For a relaxed url youtube video, git-annex seems just completely skip even trying (I see no apache2 log hits) to download from http git remote where it even points to correct HTTP address, and then just proceeds to yt-dlp to just fail there: + +```shell +❯ git clone https://datasets.datalad.org/repronim/ReproTube/AFNIBootcamp/.git/ +Cloning into 'AFNIBootcamp'... +remote: Enumerating objects: 5904, done. +remote: Counting objects: 100% (5904/5904), done. +remote: Compressing objects: 100% (1793/1793), done. +remote: Total 5904 (delta 2659), reused 5554 (delta 2644), pack-reused 0 (from 0) +Receiving objects: 100% (5904/5904), 743.23 KiB | 2.23 MiB/s, done. +Resolving deltas: 100% (2659/2659), done. 
+❯ cd AFNIBootcamp +authors.tsv@ channel.json channel_avatar.jpg@ playlists/ videos/ +❯ git annex whereis videos/2020/04/2020-04-17_AFNI-Academy-AFNI-GUI-Clusterizing/video.mkv +whereis videos/2020/04/2020-04-17_AFNI-Academy-AFNI-GUI-Clusterizing/video.mkv (3 copies) + 00000000-0000-0000-0000-000000000001 -- web + cc815e85-73bc-4a5c-81c3-81a39b0c677b -- yoh@falkor:/srv/datasets.datalad.org/www/repronim/ReproTube/AFNIBootcamp [origin] + f574aace-b921-4987-b376-f43cfcc0e925 -- annextube YouTube archive + + web: https://www.youtube.com/watch?v=3ZXfZfnRfyM +ok +❯ git annex --debug get --from origin videos/2020/04/2020-04-17_AFNI-Academy-AFNI-GUI-Clusterizing/video.mkv +[2026-02-12 17:58:20.476127402] (Utility.Process) process [1348659] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","git-annex"] +[2026-02-12 17:58:20.477586712] (Utility.Process) process [1348659] done ExitSuccess +[2026-02-12 17:58:20.477947473] (Utility.Process) process [1348660] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"] +[2026-02-12 17:58:20.479504195] (Utility.Process) process [1348660] done ExitSuccess +[2026-02-12 17:58:20.480128621] (Utility.Process) process [1348661] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","log","refs/heads/git-annex..164c7074ef367be9c939366c3febb2322f70c103","--pretty=%H","-n1"] +[2026-02-12 17:58:20.482481122] (Utility.Process) process [1348661] done ExitSuccess +[2026-02-12 17:58:20.484072231] (Utility.Process) process [1348662] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch"] +[2026-02-12 17:58:20.488013705] (Utility.Process) process [1348663] read: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","ls-files","--stage","-z","--error-unmatch","--","videos/2020/04/2020-04-17_AFNI-Academy-AFNI-GUI-Clusterizing/video.mkv"]
+[2026-02-12 17:58:20.488431021] (Utility.Process) process [1348664] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)","--buffer"]
+[2026-02-12 17:58:20.488864415] (Utility.Process) process [1348665] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch=%(objectname) %(objecttype) %(objectsize)","--buffer"]
+[2026-02-12 17:58:20.489285814] (Utility.Process) process [1348666] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch=%(objectname) %(objecttype) %(objectsize)","--buffer"]
+[2026-02-12 17:58:20.491835062] (Utility.Process) process [1348666] done ExitSuccess
+[2026-02-12 17:58:20.491913957] (Utility.Process) process [1348665] done ExitSuccess
+[2026-02-12 17:58:20.491944604] (Utility.Process) process [1348664] done ExitSuccess
+[2026-02-12 17:58:20.491970167] (Utility.Process) process [1348663] done ExitSuccess
+get videos/2020/04/2020-04-17_AFNI-Academy-AFNI-GUI-Clusterizing/video.mkv (from origin...)
+[2026-02-12 17:58:20.516237522] (Utility.Url) https://datasets.datalad.org/repronim/ReproTube/AFNIBootcamp/.git//annex/objects/zZ/3v/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ZXfZfnRfyM/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ZXfZfnRfyM
+[2026-02-12 17:58:20.519744566] (Utility.Url) https://datasets.datalad.org/repronim/ReproTube/AFNIBootcamp/.git//annex/objects/950/20d/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ZXfZfnRfyM/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ZXfZfnRfyM
+ download failed: invalid url
+
+ failed to download content
+(Delaying 1s before retrying....)
+[2026-02-12 17:58:21.524457718] (Utility.Url) https://datasets.datalad.org/repronim/ReproTube/AFNIBootcamp/.git//annex/objects/zZ/3v/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ZXfZfnRfyM/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ZXfZfnRfyM
+[2026-02-12 17:58:21.527050375] (Utility.Url) https://datasets.datalad.org/repronim/ReproTube/AFNIBootcamp/.git//annex/objects/950/20d/URL--yt&chttps&c%%www.youtube.com%watch,63v,613ZXfZfnRfyM/URL--yt&chttps&c%%www.youtube.com%watch,63v,6get videos/2020/04/2020-04-17_AFNI-Academy-AFNI-GUI-Clusterizing/video.mkv (from origin...)
+ download failed: invalid url
+
+ failed to download content
+(Delaying 1s before retrying....)
+
+ download failed: invalid url
+
+ failed to download content
+(Delaying 2s before retrying....)
+
+ download failed: invalid url
+
+ failed to download content
+failed
+[2026-02-12 17:58:23.537290686] (Utility.Process) process [1348662] done ExitSuccess
+get: 1 failed
+```
+
+for a simpler file -- works fine
+
+```
+❯ git annex whereis channel_avatar.jpg
+whereis channel_avatar.jpg (2 copies)
+ cc815e85-73bc-4a5c-81c3-81a39b0c677b -- yoh@falkor:/srv/datasets.datalad.org/www/repronim/ReproTube/AFNIBootcamp [origin]
+ f574aace-b921-4987-b376-f43cfcc0e925 -- annextube YouTube archive
+ok
+❯ git annex get --from origin channel_avatar.jpg
+get channel_avatar.jpg (from origin...) ok
+(recording state in git...)
+❯ ls -l channel_avatar.jpg
+lrwxrwxrwx 1 yoh yoh 196 Feb 12 17:57 channel_avatar.jpg -> .git/annex/objects/54/77/SHA256E-s107998--454529608f75da5804000d74018ff790ec24a03eef3544fc44c28071e31acd15.jpg/SHA256E-s107998--454529608f75da5804000d74018ff790ec24a03eef3544fc44c28071e31acd15.jpg
+```
+
+### What steps will reproduce the problem?
+
+```
+ git clone https://datasets.datalad.org/repronim/ReproTube/AFNIBootcamp/.git/
+ cd AFNIBootcamp
+ git annex whereis videos/2020/04/2020-04-17_AFNI-Academy-AFNI-GUI-Clusterizing/video.mkv
+ git annex --debug get --from origin videos/2020/04/2020-04-17_AFNI-Academy-AFNI-GUI-Clusterizing/video.mkv
+```
+
+### What version of git-annex are you using? On what operating system?
+
+
+```shell
+❯ git annex version
+git-annex version: 10.20250929-gf014fd60d05a3407e2f747e0394997d3780eeafc
+```
+but did try even most recent
+
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+when I was lucky - yes.
+
+[[!meta author=yoh]]
+[[!tag projects/repronim]]
+
removed
diff --git a/doc/special_remotes/S3/comment_41_4e5ec774da272291dfd39bf04e3075ed._comment b/doc/special_remotes/S3/comment_41_4e5ec774da272291dfd39bf04e3075ed._comment
deleted file mode 100644
index f91c71e705..0000000000
--- a/doc/special_remotes/S3/comment_41_4e5ec774da272291dfd39bf04e3075ed._comment
+++ /dev/null
@@ -1,31 +0,0 @@
-[[!comment format=mdwn
- username="Basile.Pinsard"
- avatar="http://cdn.libravatar.org/avatar/87e1f73acf277ad0337b90fc0253c62e"
- subject="git-annex: potential data loss with initremote and push with S3 special remotes. "
- date="2026-02-12T20:41:19Z"
- content="""
-A colleague used a wrong config, which was pointing to minio console rather than the S3 endpoint. When they ran initremote, the console wrongfully replied 200-OK when PUTting the annex-uuid file, same when they then pushed the data. The minio console always redirect to a login page, and doesn't fail on PUT ( which is non-compliant ). So the dataset recorded all the data being present in that remote, while there was no trace of any buckets or objects in the S3.
-
-## steps to reproduce:
-
-```
-git init test_s3
-cd test_s3/
-git-annex init
-export AWS_ACCESS_KEY_ID=john AWS_SECRET_ACCESS_KEY=doe
-git annex initremote -d test_remote host=\"play.min.io\" bucket=\"test_bucket\" type=S3 encryption=none autoenable=true port=9443 protocol=https chunk=1GiB requeststyle=pathecho test > test_annexed_file
-git-annex add test_annexed_file
-git commit -m 'add annexed file'
-git-annex copy --fast --to test_remote
-```
-
-I am showing it with `--fast` flag here, as this is what datalad uses by default. Without `--fast`, it fails with (HeaderException {headerErrorMessage = \"ETag missing\"}) failed which is better.
-
-So to sum it up, the unfortunate circumstances are:
-
-1. the initremote PUT of annex-uuid is not performing check that the annex-uuid file was effectively pushed in a bucket.
-2. minio console replies with 200-OK for all http requests
-3. datalad uses `push --fast` by default, which recorded files as being pushed without performing a HEAD after push. I guess that's for performance reason, but that is dangerous if a server or reverse-proxy ends-up responding 200-OK to all requests after init.
-
-Thanks for your help!
-"""]]
Added a comment: git-annex: potential data loss with initremote and push with S3 special remotes.
diff --git a/doc/special_remotes/S3/comment_41_4e5ec774da272291dfd39bf04e3075ed._comment b/doc/special_remotes/S3/comment_41_4e5ec774da272291dfd39bf04e3075ed._comment
new file mode 100644
index 0000000000..f91c71e705
--- /dev/null
+++ b/doc/special_remotes/S3/comment_41_4e5ec774da272291dfd39bf04e3075ed._comment
@@ -0,0 +1,31 @@
+[[!comment format=mdwn
+ username="Basile.Pinsard"
+ avatar="http://cdn.libravatar.org/avatar/87e1f73acf277ad0337b90fc0253c62e"
+ subject="git-annex: potential data loss with initremote and push with S3 special remotes. "
+ date="2026-02-12T20:41:19Z"
+ content="""
+A colleague used a wrong config, which was pointing to minio console rather than the S3 endpoint. When they ran initremote, the console wrongfully replied 200-OK when PUTting the annex-uuid file, same when they then pushed the data. The minio console always redirect to a login page, and doesn't fail on PUT ( which is non-compliant ). So the dataset recorded all the data being present in that remote, while there was no trace of any buckets or objects in the S3.
+
+## steps to reproduce:
+
+```
+git init test_s3
+cd test_s3/
+git-annex init
+export AWS_ACCESS_KEY_ID=john AWS_SECRET_ACCESS_KEY=doe
+git annex initremote -d test_remote host=\"play.min.io\" bucket=\"test_bucket\" type=S3 encryption=none autoenable=true port=9443 protocol=https chunk=1GiB requeststyle=pathecho test > test_annexed_file
+git-annex add test_annexed_file
+git commit -m 'add annexed file'
+git-annex copy --fast --to test_remote
+```
+
+I am showing it with `--fast` flag here, as this is what datalad uses by default. Without `--fast`, it fails with (HeaderException {headerErrorMessage = \"ETag missing\"}) failed which is better.
+
+So to sum it up, the unfortunate circumstances are:
+
+1. the initremote PUT of annex-uuid is not performing check that the annex-uuid file was effectively pushed in a bucket.
+2. minio console replies with 200-OK for all http requests
+3. datalad uses `push --fast` by default, which recorded files as being pushed without performing a HEAD after push. I guess that's for performance reason, but that is dangerous if a server or reverse-proxy ends-up responding 200-OK to all requests after init.
+
+Thanks for your help!
+"""]]
Added a comment: git-annex: potential data loss with initremote and push with S3 special remotes.
diff --git a/doc/special_remotes/S3/comment_40_63191a8ab9482ef5ff8503261bdf6b8d._comment b/doc/special_remotes/S3/comment_40_63191a8ab9482ef5ff8503261bdf6b8d._comment
new file mode 100644
index 0000000000..294f7200ea
--- /dev/null
+++ b/doc/special_remotes/S3/comment_40_63191a8ab9482ef5ff8503261bdf6b8d._comment
@@ -0,0 +1,31 @@
+[[!comment format=mdwn
+ username="Basile.Pinsard"
+ avatar="http://cdn.libravatar.org/avatar/87e1f73acf277ad0337b90fc0253c62e"
+ subject="git-annex: potential data loss with initremote and push with S3 special remotes. "
+ date="2026-02-12T20:40:44Z"
+ content="""
+A colleague used a wrong config, which was pointing to minio console rather than the S3 endpoint. When they ran initremote, the console wrongfully replied 200-OK when PUTting the annex-uuid file, same when they then pushed the data. The minio console always redirect to a login page, and doesn't fail on PUT ( which is non-compliant ). So the dataset recorded all the data being present in that remote, while there was no trace of any buckets or objects in the S3.
+
+## steps to reproduce:
+
+```
+git init test_s3
+cd test_s3/
+git-annex init
+export AWS_ACCESS_KEY_ID=john AWS_SECRET_ACCESS_KEY=doe
+git annex initremote -d test_remote host=\"play.min.io\" bucket=\"test_bucket\" type=S3 encryption=none autoenable=true port=9443 protocol=https chunk=1GiB requeststyle=pathecho test > test_annexed_file
+git-annex add test_annexed_file
+git commit -m 'add annexed file'
+git-annex copy --fast --to test_remote
+```
+
+I am showing it with `--fast` flag here, as this is what datalad uses by default. Without `--fast`, it fails with (HeaderException {headerErrorMessage = \"ETag missing\"}) failed which is better.
+
+So to sum it up, the unfortunate circumstances are:
+
+1. the initremote PUT of annex-uuid is not performing check that the annex-uuid file was effectively pushed in a bucket.
+2. minio console replies with 200-OK for all http requests
+3. datalad uses `push --fast` by default, which recorded files as being pushed without performing a HEAD after push. I guess that's for performance reason, but that is dangerous if a server or reverse-proxy ends-up responding 200-OK to all requests after init.
+
+Thanks for your help!
+"""]]
Added a comment: Ensuring only one process
diff --git a/doc/design/external_special_remote_protocol/async_appendix/comment_3_cb80269f5a1ebc292bc63f2736f13262._comment b/doc/design/external_special_remote_protocol/async_appendix/comment_3_cb80269f5a1ebc292bc63f2736f13262._comment
new file mode 100644
index 0000000000..2aecd31bed
--- /dev/null
+++ b/doc/design/external_special_remote_protocol/async_appendix/comment_3_cb80269f5a1ebc292bc63f2736f13262._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="calmofthestorm"
+ avatar="http://cdn.libravatar.org/avatar/9d97e9bcb1cf7680309e37cd69fab408"
+ subject="Ensuring only one process"
+ date="2026-02-12T19:10:11Z"
+ content="""
+My remote (version 1) uses a database that is multithreaded but has process-level locking. Despite using async, multiple remote processes are still being started in `testremote`. Right now I have it working with POSIX advisory locks and open/close the database for each operation in each thread in each process, but that's a lot of overhead. Is there a better way to do this? I could make them coordinate via IPC, or have them release the lock only when idle/when others are waiting, but it seems like it shouldn't be that complex.
+
+I get clear failures when I use `testremote`. On real workloads (with -j 24) it is more confusing. There are not errors, but at some point the `git-annex` command hangs. Quite possibly a bug in my code, given `testremote` is failing.
+
+I guess my question is: Is there a way to force git-annex to only use one special remote process, either by configuration or by having all but the first return \"use the other one\" (without -j 1 always)? And does the way this is handled differ between actual use and `testremote`?
+
+Or to put it another way: how do you envision one should design a special remote that supports concurrency and relies on a database with process-level locking?
+"""]]
update
diff --git a/doc/todo/replace_assistant_with_assist.mdwn b/doc/todo/replace_assistant_with_assist.mdwn
index 939717afab..a7fc10c923 100644
--- a/doc/todo/replace_assistant_with_assist.mdwn
+++ b/doc/todo/replace_assistant_with_assist.mdwn
@@ -23,15 +23,17 @@ Basically replacing the assistant needs 3 things:
change has been made to a remote.
3. Wait for commits and trigger `git-annex push` to remotes.
-There is more than that to the assistant, eg automatic periodic fscking,
-various attempts to diagnose and fix problems with repositories, live
-detection of configuration changes, detecting drive mount events, etc. But
-those 3 would be enough for most users.
-
Those could be 3 separate programs, which would gain the benefits of
composition. If the user only wants automatic commits but not pushing or
pulling, they can run one 1 program.
+There is more than that to the assistant, eg automatic periodic fscking,
+various attempts to diagnose and fix problems with repositories, live
+detection of configuration changes, detecting drive mount events, etc. But
+those 3 would be enough for most users. Alternatively, keeping that other
+stuff, but replacing the parts of the assistant that do those three things,
+would also ease maintenance.
+
This would also probably involve [[remove_webapp]], although in theory the
webapp could be retained, with only the parts of the assistant that handle
staging, committing, pull, and push replaced.