Recent changes to this wiki:

Suggest that 'git annex unused' reports total unused size
diff --git a/doc/todo/Size_of_unused_files.mdwn b/doc/todo/Size_of_unused_files.mdwn
new file mode 100644
index 0000000000..d16b4e09fc
--- /dev/null
+++ b/doc/todo/Size_of_unused_files.mdwn
@@ -0,0 +1,14 @@
+It would be nice to have `git annex unused` report the total size of unused files.
+
+I tried:
+
+- ❌ `git annex unused`: doesn't show the size
+- ❌ `git annex info`: doesn't list unused size
+- ❌ `git annex info --unused`: doesn't work
+- ❌ `git annex list --unused`: doesn't work
+- ❌ `git annex find --unused`: doesn't work
+
+Maybe I'm missing something, but it feels like having `git annex unused` and `git annex info` directly report it would be nice.
+
+Cheers,
+Yann
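
Until `git annex unused` reports a total itself, the size can often be read from the key names. A minimal sketch, assuming keys follow the `BACKEND-sSIZE--NAME` shape seen elsewhere in this feed (for example `SHA256E-s30--...`), and that keys such as `WORM--foo` record no size. The function names are hypothetical, and collecting the list of unused keys is left to the caller:

```python
import re

# Assumption: the fields before "--" may include "-sSIZE" (size in bytes).
_SIZE_FIELD = re.compile(r"-s(\d+)(?:-|$)")


def key_size(key):
    """Return the size in bytes recorded in a key name, or None if absent."""
    fields, sep, _name = key.partition("--")
    if not sep:
        return None  # does not look like a key at all
    match = _SIZE_FIELD.search(fields)
    return int(match.group(1)) if match else None


def total_unused_size(keys):
    """Sum the known sizes; also count keys whose size is not recorded."""
    sizes = [key_size(k) for k in keys]
    known_bytes = sum(s for s in sizes if s is not None)
    unknown_count = sum(1 for s in sizes if s is None)
    return known_bytes, unknown_count
```

Reporting the number of keys without a size separately keeps the total honest when some backends do not record one.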

merge from proxy branch
diff --git a/doc/design/passthrough_proxy.mdwn b/doc/design/passthrough_proxy.mdwn
index 4b86037471..22c8d34e82 100644
--- a/doc/design/passthrough_proxy.mdwn
+++ b/doc/design/passthrough_proxy.mdwn
@@ -148,40 +148,8 @@ Configuring the instantiated remotes like that would let anyone who can
 write to the git-annex branch flood other people's repos with configs
 for any number of git remotes. Which might be obnoxious.
 
-## single upload with fanout
-
-If we want to send a file to multiple repositories that are behind the same
-proxy, it would be wasteful to upload it through the proxy repeatedly.
-
-Perhaps a good user interface to this is `git-annex copy --to proxy`.
-The proxy could fan out the upload and store it in one or more nodes behind
-it. Using preferred content to select which nodes to use.
-This would need `storeKey` to be changed to allow returning a UUID (or UUIDs)
-where the content was actually stored.
-
-Alternatively, `git-annex copy --to proxy-foo` could notice that proxy-bar
-also wants the content, and fan out a copy to there. Then it could 
-record in its git-annex branch that the content is present in proxy-bar.
-If the user later does `git-annex copy --to proxy-bar`, it would avoid
-another upload (and the user would learn at that point that it was in
-proxy-bar). This avoids needing to change the `storeKey` interface.
-
-Should a proxy always fanout? if `git-annex copy --to proxy` is what does
-fanout, and `git-annex copy --to proxy-foo` doesn't, then the user has
-content. But if the latter does fanout, that might be annoying to users who
-want to use proxies, but want full control over what lands where, and don't
-want to use preferred content to do it. So probably fanout should be
-configurable. But it can't be configured client side, because the fanout
-happens on the proxy. Seems like remote.name.annex-fanout could be set to
-false to prevent fanout to a specific remote. (This is analagous to a
-remote having `git-annex assistant` running on it, it might fan out uploads
-to it to other repos, and only the owner of that repo can control it.)
-
-A command like `git-annex push` would see all the instantiated remotes and
-would pick ones to send content to. If the proxy does fanout, this would
-lead to `git-annex push` doing extra work iterating over instantiated
-remotes that have already received content via fanout. Could this extra
-work be avoided?
+Ah, instead git-annex's tab completion can be made to include instantiated
+remotes, no need to list them in git config.
 
 ## clusters
 
@@ -208,8 +176,20 @@ For this we need a UUID for the cluster. But it is not like a usual UUID.
 It does not need to actually be recorded in the location tracking logs, and
 it is not counted as a copy for numcopies purposes. The only point of this
 UUID is to make commands like `git-annex drop --from cluster` and
-`git-annex get --from cluster` talk to the cluster's frontend proxy, which
-has as its UUID the cluster's UUID.
+`git-annex get --from cluster` talk to the cluster's frontend proxy.
+
+Cluster UUIDs need to be distinguishable from regular repository UUIDs.
+This is partly to guard against a situation where a regular repository's
+UUID gets used for a cluster. Also it will make implementation easier to be
+able to inspect a UUID and know if it's a cluster UUID. Use a version 8
+UUID, all random except the first octet set to 'a' and the second to 'c'.
+
+The proxy log contains the cluster UUID (with a remote name like
+"cluster"), as well as the UUIDs of the nodes of the cluster.
+This lets the client access the cluster using the proxy, and it lets the
+client access individual nodes (so it can lock content on them while
+dropping). Note that more than one proxy can be in front of the same
+cluster, and multiple clusters can be accessed via the same proxy.
 
 The cluster UUID is recorded in the git-annex branch, along with a list of
 the UUIDs of nodes of the cluster (which can change at any time).
@@ -220,11 +200,11 @@ of the cluster, the cluster's UUID is added to the list of UUIDs.
 When writing a location log, the cluster's UUID is filtered out of the list
 of UUIDs.
 
-The cluster's frontend proxy fans out uploads to nodes according to
-preferred content. And `storeKey` is extended to be able to return a list
-of additional UUIDs where the content was stored. So an upload to the
-cluster will end up writing to the location log the actual nodes that it
-was fanned out to. 
+When proxying an upload to the cluster's UUID, git-annex-shell fans out
+uploads to nodes according to preferred content. And `storeKey` is extended
+to be able to return a list of additional UUIDs where the content was
+stored. So an upload to the cluster will end up writing to the location log
+the actual nodes that it was fanned out to. 
 
 Note that to support clusters that are nodes of clusters, when a cluster's
 frontend proxy fans out an upload to a node, and `storeKey` returns
@@ -232,45 +212,89 @@ additional UUIDs, it should pass those UUIDs along. Of course, no cluster
 can be a node of itself, and cycles have to be broken (as described in a
 section below).
 
-When a file is requested from the cluster's frontend proxy, it can send its
-own local copy if it has one, but otherwise it will proxy to one of its
-nodes. (How to pick which node to use? Load balancing?) This behavior will
-need to be added to git-annex-shell, and to Remote.Git for local paths to a
-cluster.
-
-The cluster's frontend proxy also fans out drops to all nodes, attempting
-to drop content from the whole cluster, and only indicating success if it
-can. Also needs changes to git-annex-sjell and Remote.Git.
+When a file is requested from the cluster's UUID, git-annex-shell picks one
+of the nodes that has the content, and proxies to that one.
+(How to pick which node to use? Load balancing?)
+And, if the proxy repository itself contains the requested key, it can send
+it directly. This allows the proxy repository to be primed with frequently
+accessed files when it has the space.
+
+(Should uploads check preferred content of the proxy repository and also
+store a copy there when allowed? I think this would be ok, so long as when
+preferred content is not set, it does not default to storing content
+there.)
+
+When a drop is requested from the cluster's UUID, git-annex-shell drops
+from all nodes, as well as from the proxy itself, only indicating success
+if it is able to delete all copies from the cluster. This needs
+`removeKey` to be extended to return UUIDs that the content was dropped
+from in addition to the remote's UUID (both on success and on failure)
+so that the local location log can be updated.
 
 It does not fan out lockcontent, instead the client will lock content
 on specific nodes. In fact, the cluster UUID should probably be omitted
 when constructing a drop proof, since trying to lockcontent on it will
-usually fail.
+always fail. Also, when constructing a drop proof for a cluster's UUID,
+the nodes of that cluster should be omitted, otherwise a drop from the
+cluster can lock content on individual nodes, causing the drop to fail.
 
 Some commands like `git-annex whereis` will list content as being stored in
-the cluster, as well as on whicheven of its nodes, and whereis currently
+the cluster, as well as on whichever of its nodes, and whereis currently
 says "n copies", but since the cluster doesn't count as a copy, that
 display should probably be counted using the numcopies logic that excludes
 cluster UUIDs.
 
-No other protocol extensions or special cases should be needed. Except for
-the strange case of content stored in the cluster's frontend proxy.
-
-Running `git-annex fsck --fast` on the cluster's frontend proxy will look
-weird: For each file, it will read the location log, and if the file is
-present on any node it will add the frontend proxy's UUID. So fsck will
-expect the content to be present. But it probably won't be. So it will fix
-the location log... which will make no changes since the proxy's UUID will
-be filtered out on write. So probably fsck will need a special case to
-avoid this behavior. (Also for `git-annex fsck --from cluster --fast`)
-
-And if a key does get stored on the cluster's frontend proxy, it will not
-be possible to tell from looking at the location log that the content is
-really present there. So that won't be counted as a copy. In some cases,
-a cluster's frontend proxy may want to keep files, perhaps some files are
-worth caching there for speed. But if a file is stored only on the
-cluster's frontend proxy and not in any of its nodes, clients will not
-consider the cluster to contain the file at all.
+No other protocol extensions or special cases should be needed.
+
+## single upload with fanout
+
+If we want to send a file to multiple repositories that are behind the same
+proxy, it would be wasteful to upload it through the proxy repeatedly.
+
+This is certainly needed when doing `git-annex copy --to remote-cluster`:
+the cluster picks the nodes to store the content in, and it needs to report
+back some UUID that is different than the cluster UUID, in order for the
+location log to get updated. (Cluster UUIDs are not written to the location
+log.) So this will need a change to the P2P protocol to support reporting
+back additional UUIDs where the content was stored.
+
+This might also be useful for proxies. `git-annex copy --to proxy-foo`
+could notice that proxy-bar also wants the content, and fan out a copy to
+there. But that might be annoying to users, who want full control over what
+goes where when using a proxy. Seems it would need a config setting. But
+since clusters will support fanout, it seems unnecessary to make proxies
+also support it.
+
+A command like `git-annex push` would see all the instantiated remotes and
+would pick ones to send content to. If fanout is done, this would
+lead to `git-annex push` doing extra work iterating over instantiated
+remotes that have already received content via fanout. Could this extra
+work be avoided?
+
+## cluster configuration lockdown
+
+If some organization is running a cluster, and giving others access to it,
+they may want to prevent those others from making changes to the
+configuration of the cluster. But the cluster is configured via the
+git-annex branch, particularly preferred content, and the proxy log, and
+the cluster log.
+
+A user could, for example, make the cluster's frontend want all
+content, and so fill up its small disk. They could make a particular node
+not want any content. They could remove nodes from the cluster.
+
+One way to deal with this is for the cluster to reject git-annex branch
+pushes that make such changes. Or only allow them if they are signed with a
+given gpg key. This seems like a tractable enough set of limitations that
+it could be checked by git-annex, in a git hook, when a git config is set
+to lock down the proxy configuration.
+
+Of course, someone with access to a cluster can also drop all data from
+it! Unless git-annex-shell is run with `GIT_ANNEX_SHELL_APPENDONLY` set.
+

(Diff truncated)
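
The cluster UUID rules in the design diff above can be sketched as follows. This is only an illustration under stated assumptions, not git-annex's implementation: it reads "first octet set to 'a' and the second to 'c'" as the ASCII octets 0x61 and 0x63, which is one possible interpretation, and every function name here is hypothetical. It also shows adding the cluster UUID when reading a location log, filtering it out when writing, and leaving the cluster and its nodes out of a drop proof.

```python
import os
import uuid


def gen_cluster_uuid():
    """Make a version 8 UUID, random except for the first two octets."""
    raw = bytearray(os.urandom(16))
    raw[0] = ord("a")  # assumption: 'a' is the ASCII octet 0x61
    raw[1] = ord("c")  # assumption: 'c' is the ASCII octet 0x63
    raw[6] = (raw[6] & 0x0F) | 0x80  # version nibble = 8
    raw[8] = (raw[8] & 0x3F) | 0x80  # RFC variant bits = 10
    return uuid.UUID(bytes=bytes(raw))


def is_cluster_uuid(u):
    """Tell a cluster UUID apart from a regular repository UUID."""
    return u.version == 8 and u.bytes[:2] == b"ac"


def read_locations(logged, clusters):
    """Add a cluster's UUID when any of its nodes has the content.

    `clusters` maps a cluster UUID to the set of its node UUIDs.
    """
    logged = set(logged)
    present = set(logged)
    for cluster_uuid, nodes in clusters.items():
        if logged & nodes:
            present.add(cluster_uuid)
    return present


def write_locations(uuids):
    """Cluster UUIDs are never written to the location log."""
    return {u for u in uuids if not is_cluster_uuid(u)}


def drop_proof_candidates(present, dropping_from, clusters):
    """UUIDs worth locking content on while dropping from `dropping_from`.

    Cluster UUIDs are left out, and when the drop targets a cluster its
    own nodes are left out as well.
    """
    candidates = {u for u in present if not is_cluster_uuid(u)}
    candidates.discard(dropping_from)
    candidates -= clusters.get(dropping_from, set())
    return candidates
```

Being able to classify a UUID by inspection is what keeps the read and write filters this small; no lookup in the cluster log is needed to decide whether a UUID may be written.
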
add my distribits talk
diff --git a/doc/videos/distribits.mdwn b/doc/videos/distribits.mdwn
new file mode 100644
index 0000000000..9657313ea0
--- /dev/null
+++ b/doc/videos/distribits.mdwn
@@ -0,0 +1,10 @@
+Joey Hess presented at [Distribits 2024](https://distribits.live/)
+a talk titled "git annex is complete, right?"
+
+- [on youtube](https://www.youtube.com/watch?v=pp8IeGXpRRI&list=PLEQHbPfpVqU6esVrgqjfYybY394XD2qf2&index=3)
+- [a mirror](https://downloads.kitenet.net/talks/distribits_2024__git-annex_is_complete,_right.mkv)
+
+Many of the other talks at Distribits also involved git-annex.
+[Playlist](https://www.youtube.com/playlist?list=PLEQHbPfpVqU6esVrgqjfYybY394XD2qf2)
+
+[[!meta title="git-annex presentation by Joey Hess at Distribits 2024"]]

Added a comment
diff --git a/doc/todo/Metadata_on_regular_git_objects___40__blob__44___trees__41____63__/comment_2_95aa2375cbc9789ff87b343e10e7ca67._comment b/doc/todo/Metadata_on_regular_git_objects___40__blob__44___trees__41____63__/comment_2_95aa2375cbc9789ff87b343e10e7ca67._comment
new file mode 100644
index 0000000000..04db76329e
--- /dev/null
+++ b/doc/todo/Metadata_on_regular_git_objects___40__blob__44___trees__41____63__/comment_2_95aa2375cbc9789ff87b343e10e7ca67._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joris"
+ avatar="http://cdn.libravatar.org/avatar/6fb83f8f62afd4adac91fb14b60928c6"
+ subject="comment 2"
+ date="2024-06-20T09:58:05Z"
+ content="""
+Was this ever explored more? This would be very interesting to be able to use the metadata functionality on regular git files that are not in the annex.
+"""]]

diff --git a/doc/forum/Repository_on_large_disk_server__44___browse_on_client.mdwn b/doc/forum/Repository_on_large_disk_server__44___browse_on_client.mdwn
new file mode 100644
index 0000000000..1a001bf4f8
--- /dev/null
+++ b/doc/forum/Repository_on_large_disk_server__44___browse_on_client.mdwn
@@ -0,0 +1,8 @@
+Hi, I have some large repositories on a separate disk server that I would like to be able to browse on my desktop pc or laptop.
+The repositories do not fit on my client's disk, so I cannot just use `git annex get .`
+One solution would be a readonly NFS mount. However, adding new files is now more complicated: I have to clone the repo (via ssh) to my desktop/laptop, add new files, use `git annex copy` to get them on the server and then update the working copy there.
+In addition, the readonly mount does not allow me to modify text files which are not managed by git annex.
+
+I've been thinking about using some kind of union fs (overlayfs / mergerfs) but the dead symlinks of the local copy would probably hide the files of the NFS mount. I could probably also just symlink .git/annex/objects to the NFS mount but that sounds like a pretty unsafe and bad idea.
+
+Any suggestions how I might solve this problem?

original report / question
diff --git a/doc/bugs/git_annex_find_dies_of_signal_11_some_times_on_OSX.mdwn b/doc/bugs/git_annex_find_dies_of_signal_11_some_times_on_OSX.mdwn
new file mode 100644
index 0000000000..bb2e8d22f5
--- /dev/null
+++ b/doc/bugs/git_annex_find_dies_of_signal_11_some_times_on_OSX.mdwn
@@ -0,0 +1,25 @@
+### Please describe the problem.
+
+Very rarely, a unit test errors out with something like
+
+```
+2024-06-18T03:18:50.5586670Z E           datalad.runner.exception.CommandError: CommandError: 'git -c diff.ignoreSubmodules=none -c core.quotepath=false annex find --anything --include '*' --json --json-error-messages -c annex.dotfiles=true' failed with exitcode 139 under /private/var/folders/b3/2xm02wpd21qgrpkck5q1c6k40000gn/T/datalad_temp_test_path_diff49gfi408 [info keys: stdout_json] [err: 'error: git-annex died of signal 11']
+2024-06-18T03:18:50.5588210Z 
+2024-06-18T03:18:50.5588730Z /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/datalad/runner/runner.py:242: CommandError
+```
+
+Unfortunately, no more information is captured. I just wanted to seek ideas on what could lead to git-annex dying of signal 11, and maybe on what data to collect.
+
+original report: [datalad/issues/7490](https://github.com/datalad/datalad/issues/7490)
+
+### What steps will reproduce the problem?
+
+Not sure yet how feasible it would be to reproduce, since it happens really rarely.
+
+### What version of git-annex are you using? On what operating system?
+
+OSX. Last time - from brew git-annex--10.20240531.arm64_sonoma.bottle.tar.gz
+
+
+[[!meta author=yoh]]
+[[!tag projects/repronim]]

Added a comment
diff --git a/doc/tuning/comment_7_628c75470167ad5d11dff23b230c6452._comment b/doc/tuning/comment_7_628c75470167ad5d11dff23b230c6452._comment
new file mode 100644
index 0000000000..ce5e0139b0
--- /dev/null
+++ b/doc/tuning/comment_7_628c75470167ad5d11dff23b230c6452._comment
@@ -0,0 +1,20 @@
+[[!comment format=mdwn
+ username="beryllium@5bc3c32eb8156390f96e363e4ba38976567425ec"
+ nickname="beryllium"
+ avatar="http://cdn.libravatar.org/avatar/62b67d68e918b381e7e9dd6a96c16137"
+ subject="comment 7"
+ date="2024-06-15T07:37:06Z"
+ content="""
+I have found one way to graft in the S3 bucket. And that involves performing git-annex initremote cloud type=S3 <params>, which unavoidably creates a new dummybucket (can use bucket=dummy to identify it). Then performing git-annex enableremote cloud bucket=cloud-<origuuid> to utilise the original bucket without having to copy/move over all the files.
+
+I did try it in one shot with git-annex initremote cloud type=S3 bucket=cloud-<origuuid> <params>, but unfortunately it fails because the creation of the bucket step appears mandatory, and the S3 api errors out with an \"already created bucket\" type of error.
+
+However, if there is a general guidance somewhere for... I guess importing/exporting the special remote metadata (including stored encryption keys), that would be very much appreciated.
+
+Sorry, I should just clarify. Trying to do this via sync from the old, non-tuned git-annex repo fails with:
+
+	git-annex: Remote repository is tuned in incompatible way; cannot be merged with local repository.
+
+Which I understand for the wider branch data implications... but I don't know enough to understand why just the special remote data can't be merged in.
+
+"""]]

Added a comment: Grafting? a special remote for tuned migration
diff --git a/doc/tuning/comment_6_6d202923948509737cc43831dca2c827._comment b/doc/tuning/comment_6_6d202923948509737cc43831dca2c827._comment
new file mode 100644
index 0000000000..1346676ec7
--- /dev/null
+++ b/doc/tuning/comment_6_6d202923948509737cc43831dca2c827._comment
@@ -0,0 +1,22 @@
+[[!comment format=mdwn
+ username="beryllium@5bc3c32eb8156390f96e363e4ba38976567425ec"
+ nickname="beryllium"
+ avatar="http://cdn.libravatar.org/avatar/62b67d68e918b381e7e9dd6a96c16137"
+ subject="Grafting? a special remote for tuned migration"
+ date="2024-06-15T00:57:26Z"
+ content="""
+Naively, I put myself in a position where my rather large, untuned git-annex had to be recovered due to not appreciating the effect of case-insensitive filesystems.
+
+Specifically, NTFS-3G is deadly in this case. Because, whilst Windows has advanced, and with WSL added the ability to add case-sensitivity on a folder, which is also inheritable to folders under it... NTFS-3G does not do this.
+
+So beware if you try to work in an \"interoperable\" way. NTFS-3G will do mixed case, but will create child folders that are not case-sensitive.
+
+To that end, I want to migrate this rather large git-annex to be tuned to annex.tune.objecthashlower. I already have a good strategy around this. I'll just create a completely new stream of git-annex'es originating from a newly formed one. I will also be able to create new type=directory special remotes for my \"tape-out\" existing git-annex. I will just use git annex fsck --fast --from $remote to rebuild the location data for it.
+
+I've also tested this with an S3 git-annex as a proof-of-concept. So in the new git-annex, I ran git-annex initremote cloud type=S3... to create a new bucket, copied over a file from the old bucket, and rebuilt the location data for that file.
+
+But I really really would like to be able to avoid creating a new bucket. I am happy to lose the file presence/location data for the old bucket, but I'd like to graft back in, or initremote the cloud bucket with matching parameters. So too I guess, with an encrypted special remote, ie. import over the encryption keys, etc.
+
+Are there \"plumbing\" commands that can do this? Or does it require knowing about the low-level storage of this metadata to achieve it, which seems to just send me back to the earlier comment of using a filter-branch... which I am hoping to avoid (because of all the potential pit-falls)
+
+"""]]

update
diff --git a/doc/thanks/list b/doc/thanks/list
index 9089bb87af..5682043b13 100644
--- a/doc/thanks/list
+++ b/doc/thanks/list
@@ -118,3 +118,5 @@ Stephen Seo,
 Antoine Balaine, 
 mycroft, 
 Lerrr, 
+Eve, 
+Marco, 

comment
diff --git a/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_5_7a14589a7ca4957ae758e342cc7b4596._comment b/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_5_7a14589a7ca4957ae758e342cc7b4596._comment
new file mode 100644
index 0000000000..f9abe7780e
--- /dev/null
+++ b/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_5_7a14589a7ca4957ae758e342cc7b4596._comment
@@ -0,0 +1,91 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 5"""
+ date="2024-06-13T18:01:01Z"
+ content="""
+Looking at the behavior of `git-annex get`, the first one leaves the index
+in a diff state:
+
+	joey@darkstar:~/tmp/b2/x>git-annex get funky
+	get funky (from origin...)
+	ok
+	(recording state in git...)
+	joey@darkstar:~/tmp/b2/x>git diff --cached
+	diff --git a/funky b/funky
+	index a8813f1..9488a18 100644
+	--- a/funky
+	+++ b/funky
+	@@ -1 +1 @@
+	-/annex/objects/WORM--foo
+	+/annex/objects/SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4
+
+To the second `git-annex get`, this is indistinguishable from a different
+unlocked file having been moved over top of funky. So the behavior of the
+second one is fine. 
+
+The problem is with the first `git-annex get` leaving the index in that state.
+
+What's happening is, it doesn't restage the index, because the restage
+itself can't tell the difference between this state and an unlocked file having
+been moved over top of funky. In particular, `git update-index --refresh --stdin`
+when run after the first `git-annex get`, and fed "funky", leaves the index in diff state.
+
+	joey@darkstar:~/tmp/b2/x>touch funky
+	joey@darkstar:~/tmp/b2/x>echo funky | GIT_TRACE=1 git update-index --refresh --stdin
+	14:14:33.911458 git.c:465               trace: built-in: git update-index --refresh --stdin
+	14:14:33.911759 run-command.c:657       trace: run_command: 'git-annex filter-process'
+	14:14:33.917118 git.c:465               trace: built-in: git config --null --list
+	14:14:33.919641 git.c:465               trace: built-in: git show-ref git-annex
+	14:14:33.921390 git.c:465               trace: built-in: git show-ref --hash refs/heads/git-annex
+	14:14:33.925579 git.c:465               trace: built-in: git cat-file --batch
+	14:14:33.927011 run-command.c:50        trace: run_command: running exit handler for pid 1164525
+	joey@darkstar:~/tmp/b2/x>git status --short
+	M  funky
+
+So git update-index is running `git-annex filter-process`, which is doing
+the same as `git-annex smudge --clean funky` in this case.
+And in Command.Smudge.clean, there is a `parseLinkTargetOrPointerLazy'` call
+which is intended to avoid storing a pointer file in the annex... The very
+thing that the assistant is somehow incorrectly doing. In this case
+though, that notices that funky's content looks like an annex pointer file,
+so it outputs that pointer. So git stages that pointer.
+
+To avoid this, the first `git-annex get` would need to notice that the
+content it got looks like a pointer file. And it would need to communicate
+that through the `git update-index` somehow to `git-annex filter-process`. Then
+when that saw the same pointer file, it could output the original key, and
+this situation would be avoided. Also bear in mind that the 
+`git update-index` can be interrupted and get restarted later and
+it would still need to remember that it was dealing with this case then.
+This seems... doable, but it will not be easy.
+
+PS, Full script to synthesize a repository with this situation follows:
+
+	git init z
+	cd z
+	git-annex init
+	git commit --allow-empty -m created
+	cd ..
+	git clone z y
+	cd y
+	git-annex init
+	echo 'Thu Jun 13 12:30:17 JEST 2024' > foo
+	git-annex add foo
+	git commit -m added
+	git-annex move foo --to origin
+	git rm foo
+	git commit -m removed
+	echo '/annex/objects/SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4' > funkyobj
+	git-annex setkey WORM--foo funkyobj
+	echo '/annex/objects/WORM--foo' > funky
+	git add funky
+	git commit -m add\ funky
+	git annex find --format='${key}\n' funky
+	git-annex get funky
+	cd ..
+	git clone y x
+	cd x
+	git remote add z ../z
+	git-annex get funky
+	git-annex get funky
+"""]]

comment
diff --git a/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_4_5ec1ab77318889c1545f4881ab6e44e9._comment b/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_4_5ec1ab77318889c1545f4881ab6e44e9._comment
new file mode 100644
index 0000000000..17387f0088
--- /dev/null
+++ b/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_4_5ec1ab77318889c1545f4881ab6e44e9._comment
@@ -0,0 +1,38 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2024-06-13T17:07:02Z"
+ content="""
+`git-annex add` (and smudge) use `isPointerFile` to check if a file that is
+being added is an annex pointer file. And in that case they stage the
+pointer file, rather than injecting it into the annex.
+
+The assistant also checks `isPointerFile` though. And in the simple case,
+it also commits a newly added pointer file correctly:
+
+	joey@darkstar:~/tmp/b2/a>git-annex assistant
+	joey@darkstar:~/tmp/b2/a>echo '/annex/objects/SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4' > new
+	joey@darkstar:~/tmp/b2/a>git show|tail -n 1
+	+/annex/objects/SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4
+
+So this makes me think of a race condition. What if the file is not a pointer
+file when the assistant checks `isPointerFile`, but then it gets turned into
+one before the assistant ingests it?
+
+In `git-annex add`, it first stats the file before checking if it's a pointer
+file, and later it checks if the file has changed while it was being added,
+which should avoid such races.
+
+Looking at the assistant, I'm not at all confident it handles such a race.
+
+It might even be another thread of the assistant that triggered the race.
+Could be that something caused the assistant to drop the file,
+then get it again, then drop it again. (Eg something wrong with
+configuration causing a non-stable state... like "not present" in preferred
+content).
+
+I've tried running a get/drop/get/drop loop while the assistant is running,
+and have not seen this happen to a file yet. But the race window is probably small.
+An interesting thing I did notice is that sometimes when such a loop runs for a while,
+the file will be left as a pointer file after `git-annex get`.
+"""]]
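
The stat-then-recheck guard described in the comment above can be sketched like this. It is a simplified illustration, not git-annex's `isPointerFile`: it assumes a pointer file is a file whose first line is `/annex/objects/` followed by a key, and the size cap, exception, and function names are all hypothetical.

```python
import os

POINTER_PREFIX = b"/annex/objects/"
MAX_POINTER_BYTES = 32768  # hypothetical cap on how much of the file to read


class FileChanged(Exception):
    """The file was modified while it was being examined."""


def parse_pointer(content):
    """Return the key if the content looks like a pointer file, else None."""
    first_line = content.split(b"\n", 1)[0]
    if first_line.startswith(POINTER_PREFIX) and len(first_line) > len(POINTER_PREFIX):
        return first_line[len(POINTER_PREFIX):].decode("utf-8", "replace")
    return None


def _fingerprint(path):
    """Cheap identity of the file's current state."""
    st = os.stat(path)
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def pointer_key_if_stable(path):
    """Stat first, inspect the content, then verify nothing changed."""
    before = _fingerprint(path)
    with open(path, "rb") as f:
        key = parse_pointer(f.read(MAX_POINTER_BYTES))
    if _fingerprint(path) != before:
        raise FileChanged(path)  # caller should retry rather than ingest
    return key
```

A caller that ingests the file only after `pointer_key_if_stable` returns `None`, and that compares the fingerprint once more after ingestion, narrows the window in which a file could turn into a pointer file unnoticed.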

partial reproducer
diff --git a/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_3_c68cdec52b134a775cc9d84daa75b4f8._comment b/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_3_c68cdec52b134a775cc9d84daa75b4f8._comment
new file mode 100644
index 0000000000..4ea531db32
--- /dev/null
+++ b/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_3_c68cdec52b134a775cc9d84daa75b4f8._comment
@@ -0,0 +1,66 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2024-06-13T16:31:57Z"
+ content="""
+First I wanted to see if I could get this to happen without the assistant.
+
+	joey@darkstar:~/tmp/y>echo '/annex/objects/SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4' > new
+	joey@darkstar:~/tmp/y>git annex add new
+	add new ok
+	joey@darkstar:~/tmp/y>git annex find --format='${key}\n' new
+	SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4
+
+	joey@darkstar:~/tmp/y>git config annex.largefiles anything
+	joey@darkstar:~/tmp/y>echo '/annex/objects/SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4' > new2
+	joey@darkstar:~/tmp/y>git add new2
+	joey@darkstar:~/tmp/y>git annex find --format='${key}\n' new2
+	SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4
+
+So no, it must be only the assistant that can mess up and add an annexed
+link to the annex.
+
+Secondly, here's a way to manually create a repository with this behavior
+w/o using the assistant.
+
+	joey@darkstar:~/tmp/y>git remote add z ../z
+	joey@darkstar:~/tmp/y>git-annex move --key SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4 --to z
+	joey@darkstar:~/tmp/y>echo '/annex/objects/SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4' > funkyobj
+	joey@darkstar:~/tmp/y>git-annex setkey WORM--foo funkyobj
+	setkey funkyobj ok
+	joey@darkstar:~/tmp/y>echo '/annex/objects/WORM--foo' > funky
+	joey@darkstar:~/tmp/y>git add funky
+	git-annex: git status will show funky to be modified, since content availability has changed and git-annex was unable to update the index. This is only a cosmetic problem affecting git status; git add, git commit, etc won't be affected. To fix the git status display, you can run: git-annex restage
+	joey@darkstar:~/tmp/y>git commit -m add funky
+	joey@darkstar:~/tmp/y>git annex find --format='${key}\n' funky
+	WORM--foo
+	joey@darkstar:~/tmp/y>cat funky
+	/annex/objects/SHA256E-s30--93c16dbf65b7b66e479bd484398c09c920338e4a1df1fe352b245078d04645f4
+	joey@darkstar:~/tmp/y>git-annex get funky
+	joey@darkstar:~/tmp/y>	
+
+Nothing has gone wrong yet, funky is an unlocked file and it happens to have
+the content of an annex pointer file, but git-annex is not treating that
+content *as* an annex pointer file. If it were, the `git-annex get funky` above
+would get the SHA256 key from remote z.
+
+But in a fresh clone, it's another story:
+
+	joey@darkstar:~/tmp>git clone y x
+	joey@darkstar:~/tmp>cd x
+	joey@darkstar:~/tmp/x>git remote add z ../z
+	joey@darkstar:~/tmp/x>cat funky
+	/annex/objects/WORM--foo
+	joey@darkstar:~/tmp/x>git-annex get funky
+	get funky (from origin...)
+	ok
+	(recording state in git...)
+	joey@darkstar:~/tmp/x>git-annex get funky
+	get funky (from z...)
+	ok
+	(recording state in git...)
+	joey@darkstar:~/tmp/x>cat funky
+	Thu Jun 13 12:30:17 JEST 2024
+
+Which reproduces what you showed. I think this on its own is a bug, leaving aside whatever caused the assistant to generate this.
+"""]]

copied over some changes from proxy branch
diff --git a/doc/design/passthrough_proxy.mdwn b/doc/design/passthrough_proxy.mdwn
index 9742cb3686..4b86037471 100644
--- a/doc/design/passthrough_proxy.mdwn
+++ b/doc/design/passthrough_proxy.mdwn
@@ -15,7 +15,7 @@ existing remotes to keep up with changes are made on the server side.
 A proxy would avoid this complexity. It also allows limiting network
 ingress to a single point.
 
-Ideally a proxy would look like any other git-annex remote. All the files
+A proxy can be the frontend to a cluster. All the files
 stored anywhere in the cluster would be available to retrieve from the
 proxy. When a file is sent to the proxy, it would store it somewhere in the
 cluster.
@@ -108,55 +108,169 @@ The only real difference seems to be that the UUID of a remote is cached,
 so A could only do this the first time we accessed it, and not later.
 With UUID discovery, A can do that at any time.
 
-## user interface
+## proxied remote names
 
 What to name the instantiated remotes? Probably the best that could
 be done is to use the proxy's own remote names as suffixes on the client.
 Eg, the proxy's "node1" remote is "proxy-node1".
 
-But the user probably doesn't want to pick which node to send content to.
-They don't necessarily know anything about the nodes. Ideally the user
-would `git-annex copy --to proxy` or `git-annex push` and let it pick
-which instantiated remote(s) to send to.
-
-To make `git-annex copy --to proxy` work, `storeKey` could be changed to
-allow returning a UUID (or UUIDs) where the content was actually stored.
-That would also allow a single upload to the proxy to fan out and be stored
-in multiple nodes. The proxy would use preferred content to pick which of
-its nodes to store on.
-
-Instantiated remotes would still be needed for `git-annex get` and similar
-to work.
-
-To make `git-annex copy --from proxy` work, the proxy would need to pick
-a node and stream content from it. That's doable, but how to handle a case
-where a node gets corrupted? The best it could do is mark that node as no
-longer containing the content (as if a fsck failed) and try another one
-next time. This complication might not be necessary. Consider that
-while `git-annex copy --to foo` followed later by `git-annex copy --from foo`
-will usually work, it doesn't work when eg first copying to a transfer
-remote, which then sends the content elsewhere and drops its copy.
-
-What about dropping? `git-annex drop --from proxy` could be made to work,
-by having `removeKey` return a list of UUIDs that the content was dropped
-from. What should that do if it's able to drop from some nodes but not
-others? Perhaps it would need to be able to return a list of UUIDs that
-content was dropped from but still indicate it overall failed to drop.
-(Note that it's entirely possible that dropping from one node of the proxy
-involves lockContent on another node of the proxy in order to satisfy
-numcopies.)
+But, the user might have their own "proxy-node1" remote configured that
+points to something else. To avoid a proxy changing the configuration of
+the user's remote to point to its remote, git-annex must avoid
+instantiating a proxied remote when there's already a configuration for a
+remote with that same name.
+
+That does mean that, if a user wants to set a git config for a proxy
+remote, they will need to manually set its annex-uuid and its url.
+Which is awkward. Many git configs of the proxy remote can be inherited by
+the instantiated remotes, so users won't often need to do that.
+
+A user can also set up a remote with another name that they
+prefer, that points at a remote behind a proxy. They just need to set
+its annex-uuid and its url. Perhaps there should be a git-annex command
+that eases setting up a remote like that?
+
+## proxied remotes in git remote list
+
+Should instantiated remotes have enough configured in git so that
+`git remote list` will list them? This would make things like tab
+completion of proxied remotes work, and would generally let the user
+discover that there *are* proxied remotes.
+
+This could be done by a config like remote.name.annex-proxied = true.
+That makes other configs of the remote not prevent it being used as an
+instantiated remote. So remote.name.annex-uuid can be changed when
+the uuid behind a proxy changes. And it allows updating remote.name.url
+to keep it the same as the proxy remote's url. (Or possibly to set it to
+something else?)
+
+Configuring the instantiated remotes like that would let anyone who can
+write to the git-annex branch flood other people's repos with configs
+for any number of git remotes. Which might be obnoxious.
+
+## single upload with fanout
+
+If we want to send a file to multiple repositories that are behind the same
+proxy, it would be wasteful to upload it through the proxy repeatedly.
+
+Perhaps a good user interface to this is `git-annex copy --to proxy`.
+The proxy could fan out the upload and store it in one or more nodes behind
+it. Using preferred content to select which nodes to use.
+This would need `storeKey` to be changed to allow returning a UUID (or UUIDs)
+where the content was actually stored.
+
+Alternatively, `git-annex copy --to proxy-foo` could notice that proxy-bar
+also wants the content, and fan out a copy to there. Then it could 
+record in its git-annex branch that the content is present in proxy-bar.
+If the user later does `git-annex copy --to proxy-bar`, it would avoid
+another upload (and the user would learn at that point that it was in
+proxy-bar). This avoids needing to change the `storeKey` interface.
+
+Should a proxy always fanout? If `git-annex copy --to proxy` is what does
+fanout, and `git-annex copy --to proxy-foo` doesn't, then the user has
+control. But if the latter does fanout, that might be annoying to users who
+want to use proxies, but want full control over what lands where, and don't
+want to use preferred content to do it. So probably fanout should be
+configurable. But it can't be configured client side, because the fanout
+happens on the proxy. Seems like remote.name.annex-fanout could be set to
+false to prevent fanout to a specific remote. (This is analogous to a
+remote having `git-annex assistant` running on it, it might fan out uploads
+to it to other repos, and only the owner of that repo can control it.)
 
 A command like `git-annex push` would see all the instantiated remotes and
-would pick one to send content to. Seems like the proxy might choose to
-`storeKey` the content on other node(s) than the requested one. Which would
-be fine. But, `git-annex push` would still do considerable extra work in
-iterating over all the instantiated remotes. So it might be better to make
-such commands not operate on instantiated remotes for sending content but
-only on the proxy. 
+would pick ones to send content to. If the proxy does fanout, this would
+lead to `git-annex push` doing extra work iterating over instantiated
+remotes that have already received content via fanout. Could this extra
+work be avoided?
+
+## clusters
+
+One way to use a proxy is just as a convenient way to access a group of
+remotes that are behind it. Some remotes may only be reachable by the
+proxy, but you still know what the individual remotes are. Eg, one might be
+a S3 bucket that can only be written via the proxy, but is globally
+readable without going through the proxy. Another might be a drive that is
+sometimes located behind the proxy, but other times connected directly.
+Using a proxy this way just involves using the instantiated proxied remotes.
+
+Or a proxy can be the frontend for a cluster. In this situation, the user
+doesn't know anything much about the nodes in the cluster, perhaps not even
+that they exist, or perhaps what keys are stored on which nodes.
+
+In the cluster case, the user would like to not need to pick a specific
+node to send content to. While they could use preferred content to pick a
+node, or nodes, they would prefer to be able to say `git-annex copy --to cluster` 
+and let it pick which nodes to send to. And similarly,
+`git-annex drop --from cluster` should drop the content from every node in
+the cluster.
+
+For this we need a UUID for the cluster. But it is not like a usual UUID.
+It does not need to actually be recorded in the location tracking logs, and
+it is not counted as a copy for numcopies purposes. The only point of this
+UUID is to make commands like `git-annex drop --from cluster` and
+`git-annex get --from cluster` talk to the cluster's frontend proxy, which
+has as its UUID the cluster's UUID.
+
+The cluster UUID is recorded in the git-annex branch, along with a list of
+the UUIDs of nodes of the cluster (which can change at any time).
+
+When reading a location log, if any UUID where content is present is part
+of the cluster, the cluster's UUID is added to the list of UUIDs.
+
+When writing a location log, the cluster's UUID is filtered out of the list
+of UUIDs.
+
+The cluster's frontend proxy fans out uploads to nodes according to
+preferred content. And `storeKey` is extended to be able to return a list
+of additional UUIDs where the content was stored. So an upload to the
+cluster will end up writing to the location log the actual nodes that it
+was fanned out to. 
+
+Note that to support clusters that are nodes of clusters, when a cluster's
+frontend proxy fans out an upload to a node, and `storeKey` returns
+additional UUIDs, it should pass those UUIDs along. Of course, no cluster
+can be a node of itself, and cycles have to be broken (as described in a
+section below).
+
+When a file is requested from the cluster's frontend proxy, it can send its
+own local copy if it has one, but otherwise it will proxy to one of its
+nodes. (How to pick which node to use? Load balancing?) This behavior will
+need to be added to git-annex-shell, and to Remote.Git for local paths to a
+cluster.
 
-Commands like `git-annex push` and `git-annex pull`
-should also skip the instantiated remotes when pushing or pulling the git
-repo, because that would be extra work that accomplishes nothing.
+The cluster's frontend proxy also fans out drops to all nodes, attempting
+to drop content from the whole cluster, and only indicating success if it
+can. Also needs changes to git-annex-shell and Remote.Git.
+
+It does not fan out lockcontent, instead the client will lock content
+on specific nodes. In fact, the cluster UUID should probably be omitted
+when constructing a drop proof, since trying to lockcontent on it will
+usually fail.
+
+Some commands like `git-annex whereis` will list content as being stored in
+the cluster, as well as on whichever of its nodes, and whereis currently
+says "n copies", but since the cluster doesn't count as a copy, that

(Diff truncated)
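The "clusters" section in the diff above describes how a cluster's UUID is added when a location log is read and filtered out when it is written. The following Python sketch illustrates just that rule. It is an assumption-laden illustration, not git-annex code: the function names, the use of plain strings for UUIDs, and the `clusters` mapping (cluster UUID to node UUIDs) are all hypothetical.

```python
from typing import Dict, Set


def add_cluster_uuids(present: Set[str], clusters: Dict[str, Set[str]]) -> Set[str]:
    """Reading a location log: add the UUID of every cluster that has
    at least one node among the UUIDs where content is present."""
    result = set(present)
    for cluster_uuid, nodes in clusters.items():
        if nodes & present:
            result.add(cluster_uuid)
    return result


def strip_cluster_uuids(uuids: Set[str], clusters: Dict[str, Set[str]]) -> Set[str]:
    """Writing a location log: cluster UUIDs are never recorded."""
    return {u for u in uuids if u not in clusters}


if __name__ == "__main__":
    # Placeholder identifiers standing in for real UUIDs.
    clusters = {"cluster-1": {"node-a", "node-b"}}
    seen = add_cluster_uuids({"node-a", "laptop"}, clusters)
    print(sorted(seen))
    print(sorted(strip_cluster_uuids(seen, clusters)))
```

This single pass does not handle clusters that are nodes of other clusters; supporting that would need repeated expansion plus the cycle breaking that the design text mentions.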
diff --git a/doc/bugs/VURL_verification_failure_on_first_download.mdwn b/doc/bugs/VURL_verification_failure_on_first_download.mdwn
new file mode 100644
index 0000000000..7cadb61389
--- /dev/null
+++ b/doc/bugs/VURL_verification_failure_on_first_download.mdwn
@@ -0,0 +1,93 @@
+### Please describe the problem.
+
+With an external special remote that handles a custom URL scheme, I receive a "Verification of content failed" on the first `git annex get` of a file (i.e. when git-annex cannot know a checksum for the file, yet).
+
+Sorry that this is hidden in a bit of indirection in a datalad extension, what it does is effectively just implement an external special remote that handles `cds:` URLs and then `git annex addurl --fast --verifiable` those URLs. I get the same verification error even with `--relaxed` instead of `--fast` (though I would like to have the semantics of `--fast`, i.e. record checksum on first download and then always check against that).
+
+### What steps will reproduce the problem?
+
+Install datalad, and datalad-cds from this PR: <https://github.com/matrss/datalad-cds/pull/16>. Then:
+[[!format sh """
+datalad create test-ds
+cd test-ds/
+datalad download-cds --lazy --path download.grib '{
+    "dataset": "reanalysis-era5-pressure-levels",
+    "sub-selection": {
+        "variable": "temperature",
+        "pressure_level": "1000",
+        "product_type": "reanalysis",
+        "date": "2017-12-01/2017-12-31",
+        "time": "12:00",
+        "format": "grib" 
+    } 
+}'
+git annex get download.grib
+"""]]
+
+
+### What version of git-annex are you using? On what operating system?
+
+```
+git-annex version: 10.20240430
+build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
+dependency versions: aws-0.24.1 bloomfilter-2.0.1.2 crypton-0.34 DAV-1.3.4 feed-1.3.2.1 ghc-9.6.5 http-client-0.7.17 persistent-sqlite-2.13.3.0 torrent-10000.1.3 uuid-1.3.15 yesod-1.6.2.1
+key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL VURL X*
+remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg rclone hook external
+operating system: linux x86_64
+supported repository versions: 8 9 10
+upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10
+```
+
+on Ubuntu, installed from a recent version of nixpkgs. Also happens in CI (see PR in datalad-cds) where git-annex is installed from NeuroDebian.
+
+
+### Please provide any additional information below.
+
+[[!format sh """
+# If you can, paste a complete transcript of the problem occurring here.
+# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
+
+$ datalad create test-ds
+create(ok): <...> (dataset)
+$ cd test-ds/
+$ datalad download-cds --lazy --path download.grib '{
+    "dataset": "reanalysis-era5-pressure-levels",
+    "sub-selection": {
+        "variable": "temperature",
+        "pressure_level": "1000",
+        "product_type": "reanalysis",
+        "date": "2017-12-01/2017-12-31",
+        "time": "12:00",
+        "format": "grib" 
+    } 
+}'
+save(ok): . (dataset)                                                                                                                                                                                                    
+cds(ok): <...> (dataset)                                                                                                                                                             
+$ git annex info download.grib
+file: download.grib
+size: 0 bytes (+ 1 unknown size)
+key: VURL--cds:v1-eyJkYXRhc2V0IjoicmVhbmFs-77566133ebfe9220aefbeed5a58b6972
+present: false
+$ git annex get download.grib
+get download.grib (from cds...) 
+
+  CDS request is submitted
+
+  CDS request is completed
+
+  Starting download from CDS
+(checksum...)                  
+  Verification of content failed
+
+  Unable to access these remotes: cds
+
+  No other repository is known to contain the file.
+failed
+get: 1 failed
+
+# End of transcript or log.
+"""]]
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+

expanding on the exporttree=yes design
diff --git a/doc/design/passthrough_proxy.mdwn b/doc/design/passthrough_proxy.mdwn
index 76e6c2cc18..9742cb3686 100644
--- a/doc/design/passthrough_proxy.mdwn
+++ b/doc/design/passthrough_proxy.mdwn
@@ -272,9 +272,9 @@ Could the proxy be in front of a special remote that uses exporttree=yes?
 
 Some possible approaches:
 
-* Proxy caches files until all the files in the configured
+* Proxy caches files somewhere until all the files in the configured
   annex-tracking-branch are available, then exports them all to the special
-  remote. Not ideal at all.
+  remote.
 * Proxy exports each file to the special remote as it is received.
   It records an incomplete tree export after each export.
   Once all files in the configured annex-tracking-branch have been sent,
@@ -288,9 +288,55 @@ The first two approaches need some way to communicate the
 configured annex-tracking-branch over the P2P protocol. Or to communicate
 the tree that it currently points to.
 
+A proxy for a git repo does not proxy access to the git repo itself, so
+`git push origin-foo master` actually pushes the ref to the proxy's own git
+repo. Perhaps this points in a direction of how the proxy could learn what
+tree to export to exporttree=yes remotes. But only vaguely since how would
+it pick which of multiple branches to export?
+
+Perhaps configure the annex-tracking-branch in the git-annex branch?
+That might be generally useful when working with exporttree=yes remotes.
+
 The first two approaches also have a complication when a key is sent to
 the proxy that is not part of the configured annex-tracking-branch. What
-does the proxy do with it?
+does the proxy do with it? There seem to be three possibilities:
+
+1. Reject the transfer of the key.
+2. Send the key to another proxied remote that is not exporttree=yes
+   (and get it from there later if needed to finish populating an export)
+3. Store the key locally. (Not desirable because proxy repos may be on
+   small disks as they don't usually need to hold any files.)
+  
+The third approach would mean the user needs to use `git-annex export --to`
+in order to update proxied exporttree remotes. Which gets in the way of the
+other proxy workflows and requires them to know that the proxy has an
+exporttree remote behind it.
+
+Tentative design for exporttree=yes with proxies:
+
+* Configure annex-tracking-branch for the proxy in the git-annex branch.
+  (For the proxy as a whole, or for specific exporttree=yes repos behind
+  it?)
+* Then the user's workflow is simply: `git-annex push proxy`
+* sync/push need to first push any updated annex-tracking-branch to the
+  proxy before sending content to it. (Currently sync only pushes at the
+  end.)
+* If proxied remotes are all exporttree=yes, the proxy rejects any
+  transfers of a key that is not in the annex-tracking-branch that it
+  currently knows about. If there is any other proxied remote, the proxy
+  can direct such transfers to it.
+* Upon receiving a new annex-tracking-branch or any transfer of a key
+  used in the current annex-tracking-branch, the proxy can update
+  the exporttree=yes remotes. This needs to happen incrementally,
+  eg upon receiving a key, just proxy it on to the exporttree=yes remote,
+  and update the export database. Once all keys are received, update
+  the git-annex branch to indicate a new tree has been exported.
+* Upon receiving a git push of the annex-tracking-branch, a proxy might
+  be able to get all the changed objects from non-exporttree=yes proxied
+  remotes that contain them. If so it can update the exporttree=yes
+  remote automatically and inexpensively. At the same time, a
+  `git-annex push` will be attempting to send those same objects.
+  So somehow the proxy will need to manage this situation.
 
 ## possible enhancement: indirect uploads
 

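The tentative design above says that a proxy rejects a key outside the annex-tracking-branch when every proxied remote is exporttree=yes, and otherwise can direct it to another proxied remote. That routing decision can be sketched in Python as follows. Everything here is hypothetical and for illustration only: `ProxiedRemote`, `route_key`, and the choice to send tracked keys only to the exporttree remotes are assumptions, not part of the design text or of git-annex.

```python
from typing import List, NamedTuple, Optional, Set


class ProxiedRemote(NamedTuple):
    name: str
    exporttree: bool  # True when the remote is configured with exporttree=yes


def route_key(key: str, tracked_keys: Set[str],
              remotes: List[ProxiedRemote]) -> Optional[List[str]]:
    """Return the names of remotes that should receive `key`,
    or None when the transfer should be rejected."""
    exporting = [r.name for r in remotes if r.exporttree]
    others = [r.name for r in remotes if not r.exporttree]
    if key in tracked_keys and exporting:
        # Key is used in the tracking branch: pass it on to the exports.
        return exporting
    if others:
        # Otherwise hold it on one proxied remote that is not exporttree=yes.
        return others[:1]
    # Only exporttree=yes remotes and the key is not tracked: reject.
    return None


if __name__ == "__main__":
    remotes = [ProxiedRemote("export1", True), ProxiedRemote("node1", False)]
    print(route_key("KEY-in-branch", {"KEY-in-branch"}, remotes))
    print(route_key("KEY-elsewhere", {"KEY-in-branch"}, remotes))
    print(route_key("KEY-elsewhere", {"KEY-in-branch"}, remotes[:1]))
```

The sketch leaves out the harder parts of the design, such as updating the export database incrementally and coordinating with a concurrent `git-annex push`.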
TODO for log --key
diff --git a/doc/todo/add_--key_to___34__annex_log__34__.mdwn b/doc/todo/add_--key_to___34__annex_log__34__.mdwn
new file mode 100644
index 0000000000..fdeb6062b6
--- /dev/null
+++ b/doc/todo/add_--key_to___34__annex_log__34__.mdwn
@@ -0,0 +1,15 @@
+```
+NAME
+       git-annex-log - shows location log information
+
+SYNOPSIS
+       git annex log [path ...]
+
+```
+
+However, it is quite often desirable to check by a key, which might not even be in the tree. `whereis` (a sister command for similar investigations) has `--key`, so I thought it would be great to get it here too.
+
+In my case -- doing archaeology on AFNI's test data in [https://github.com/afni/afni/pull/656](https://github.com/afni/afni/pull/656).
+
+[[!meta author=yoh]]
+[[!tag projects/repronim]]

Added a comment
diff --git a/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_2_5dfa78ee6436020596f4b2efe678f05b._comment b/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_2_5dfa78ee6436020596f4b2efe678f05b._comment
new file mode 100644
index 0000000000..1801ec2d2a
--- /dev/null
+++ b/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_2_5dfa78ee6436020596f4b2efe678f05b._comment
@@ -0,0 +1,88 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="comment 2"
+ date="2024-06-11T17:36:51Z"
+ content="""
+Interestingly, on the client `git restore --staged PATH` managed to recover the link to become \"proper\". And `git-annex restage` did nothing to fix the situation with the `Modified` file:
+
+```
+[bids@rolando VIDS] > git merge --ff-only synced/master
+Updating b4f3af57..263dad67
+Updating files: 100% (871/871), done.
+Fast-forward
+ .gitattributes                                                           |  1 +
+ .gitignore                          
+...
+create mode 100644 logs/2024-05-24T07:35-04:00.log
+ create mode 100644 logs/2024-05-24T07:35-04:00.logpwd
+
+
+
+git-annex: git status will show Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log to be modified, since content availability has changed and git-annex was unable to update the index. This is only a cosmetic problem affecting git status; git add, git commit, etc won't be affected. To fix the git status display, you can run: git-annex restage
+[bids@rolando VIDS] > 
+[bids@rolando VIDS] > 
+[bids@rolando VIDS] > 
+[bids@rolando VIDS] > git-annex restage
+restage  ok
+[bids@rolando VIDS] > git status
+On branch master
+Changes to be committed:
+  (use \"git restore --staged <file>...\" to unstage)
+	modified:   Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log
+
+[bids@rolando VIDS] > git-annex restage 
+restage  ok
+[bids@rolando VIDS] > git status
+On branch master
+Changes to be committed:
+  (use \"git restore --staged <file>...\" to unstage)
+	modified:   Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log
+
+[bids@rolando VIDS] > git-annex restage  Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log
+git-annex: This command takes no parameters.
+[bids@rolando VIDS] > git status
+On branch master
+Changes to be committed:
+  (use \"git restore --staged <file>...\" to unstage)
+	modified:   Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log
+
+[bids@rolando VIDS] > git restore --staged Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log
+[bids@rolando VIDS] > git status
+On branch master
+Changes not staged for commit:
+  (use \"git add <file>...\" to update what will be committed)
+  (use \"git restore <file>...\" to discard changes in working directory)
+	modified:   Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log
+
+no changes added to commit (use \"git add\" and/or \"git commit -a\")
+[bids@rolando VIDS] > git diff
+diff --git a/Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log b/Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log
+index 92b79020..fc930f54 100644
+--- a/Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log
++++ b/Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log
+@@ -1 +1 @@
+-/annex/objects/MD5E-s69--08983cc11522233e5d4815e4ef62275a.mkv.log
++/annex/objects/MD5E-s68799--29541299bea3691f430d855d2fb432fb.mkv.log
+diff --git a/Videos/2024/04/2024.04.04.06.01.22.647_.mkv.log b/Videos/2024/04/2024.04.04.06.01.22.647_.mkv.log
+--- a/Videos/2024/04/2024.04.04.06.01.22.647_.mkv.log
++++ b/Videos/2024/04/2024.04.04.06.01.22.647_.mkv.log
+@@ -1 +0,0 @@
+-/annex/objects/MD5E-s0--d41d8cd98f00b204e9800998ecf8427e.mkv.log
+[bids@rolando VIDS] > git log Videos/2024/03/2024.03.17.14.09.12.550_2024.03.17.14.09.18.818.mkv.log
+commit ef5549f74dfea19c11bf963a7ec9789bce0d925d
+Author: ReproStim User <changeme@example.com>
+Date:   Wed Apr 17 09:38:23 2024 -0400
+
+    Move files under subfolders
+
+```
+
+
+```
+[bids@rolando VIDS] > git --version
+git version 2.39.2
+[bids@rolando VIDS] > git annex version --raw
+10.20231129+git83-g86dbe9a825-1~ndall+1
+```
+"""]]

update
diff --git a/doc/todo/git-annex_proxies.mdwn b/doc/todo/git-annex_proxies.mdwn
index 2e8bad27cd..69257fcb9e 100644
--- a/doc/todo/git-annex_proxies.mdwn
+++ b/doc/todo/git-annex_proxies.mdwn
@@ -36,11 +36,21 @@ For June's work on [[design/passthrough_proxy]], implementation plan:
 
 2. Remote instantiation for proxies. (done)
 
+2. Bug: In a repo cloned with ssh from a proxy repo,
+   running `git-annex init` sets annex-uuid for the instantiated remotes.
+   This prevents them being used, because instantiation is not done
+   when there's any config set for a remote.
+
 3. Implement proxying in git-annex-shell.
+   (Partly done, still need it for GET, PUT, CONNECT, and NOTIFYCHANGES
+   messages.)
 
 4. Either implement proxying for local path remotes, or prevent
    listProxied from operating on them.
 
+4. Either implement proxying for tor-annex remotes, or prevent
+   listProxied from operating on them.
+
 4. Let `storeKey` return a list of UUIDs where content was stored,
    and make proxies accept uploads directed at them, rather than a specific
    instantiated remote, and fan out the upload to whatever nodes behind

diff --git a/doc/forum/Control_socket_connect__40__..__47__.git__47__annex__47__ssh__47__server.lo.mdwn b/doc/forum/Control_socket_connect__40__..__47__.git__47__annex__47__ssh__47__server.lo.mdwn
new file mode 100644
index 0000000000..de7d8a9275
--- /dev/null
+++ b/doc/forum/Control_socket_connect__40__..__47__.git__47__annex__47__ssh__47__server.lo.mdwn
@@ -0,0 +1,70 @@
+I have this error.
+
+    $ git annex --debug enableremote server
+    [2024-06-11 08:16:48.356839038] (Utility.Process) process [17496] read: git ["--git-dir=../.git","--work-tree=..","--literal-pa
+    thspecs","-c","annex.debug=true","show-ref","git-annex"]
+    [2024-06-11 08:16:48.377496927] (Utility.Process) process [17496] done ExitSuccess
+    [2024-06-11 08:16:48.377922696] (Utility.Process) process [17501] read: git ["--git-dir=../.git","--work-tree=..","--literal-pa
+    thspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"]
+    [2024-06-11 08:16:48.397529156] (Utility.Process) process [17501] done ExitSuccess
+    [2024-06-11 08:16:48.399718045] (Utility.Process) process [17507] chat: git ["--git-dir=../.git","--work-tree=..","--literal-pa
+    thspecs","-c","annex.debug=true","cat-file","--batch"]
+    enableremote (normal) server [2024-06-11 08:16:48.415631528] (Utility.Process) process [17509] call: git ["--git-dir=../.git","-
+    -work-tree=..","--literal-pathspecs","-c","annex.debug=true","config","remote.server.annex-ignore","false"]
+    [2024-06-11 08:16:48.425103598] (Utility.Process) process [17509] done ExitSuccess
+    [2024-06-11 08:16:48.425415775] (Utility.Process) process [17510] read: git ["config","--null","--list"] in ".."
+    [2024-06-11 08:16:48.433272117] (Git.Config) git config read: [("",[""]),("annex.backend",["SHA256"]),("annex.tune.objecthashlo
+    wer",["true"]),("annex.uuid",["b1510484-6489-4351-9876-993041f22cb3"]),("annex.version",["10"]),("core.bare",["false"]),("core.
+    filemode",["true"]),("core.logallrefupdates",["true"]),("core.repositoryformatversion",["0"]),("filter.annex.clean",["git-annex
+     smudge --clean -- %f"]),("filter.annex.process",["git-annex filter-process"]),("filter.annex.smudge",["git-annex smudge -- %f"
+    ]),("init.defaultbranch",["master"]),("remote.server.annex-ignore",["false"]),("remote.server.fetch",["+refs/heads/*:refs/remotes
+    /server/*"]),("remote.server.url",["ssh://server.local:/mnt/user/data"]),("safe.directory",["/mnt/user/data"]),("user.email",["roo
+    t","root@delta.local"]),("user.name",["root","root"])]
+    [2024-06-11 08:16:48.433479676] (Utility.Process) process [17510] done ExitSuccess
+    [2024-06-11 08:16:48.435182799] (Utility.Process) process [17511] read: ssh ["server.local","-S","../.git/annex/ssh/server.local"
+    ,"-o","ControlMaster=auto","-o","ControlPersist=yes","-n","-T","git-annex-shell 'configlist' '/mnt/user/data' '--debug'"]
+    [2024-06-11 08:16:48.619602925] (Utility.Process) process [17511] done ExitFailure 255
+    
+      Unable to parse git config from server
+    [2024-06-11 08:16:48.619932626] (Utility.Process) process [17516] call: git ["--git-dir=../.git","--work-tree=..","--literal-pa
+    thspecs","-c","annex.debug=true","fetch","--quiet","server"]
+    [2024-06-11 08:16:49.018922661] (Utility.Process) process [17516] done ExitSuccess
+    
+      Remote server does not have git-annex installed; setting annex-ignore
+    
+      This could be a problem with the git-annex installation on the remote. Please make sure that git-annex-shell is available in
+    PATH when you ssh into the remote. Once you have fixed the git-annex installation, run: git annex enableremote server
+    [2024-06-11 08:16:49.019278841] (Utility.Process) process [17520] call: git ["--git-dir=../.git","--work-tree=..","--literal-pa
+    thspecs","-c","annex.debug=true","config","remote.server.annex-ignore","true"]
+    [2024-06-11 08:16:49.028550677] (Utility.Process) process [17520] done ExitSuccess
+    [2024-06-11 08:16:49.028909964] (Utility.Process) process [17521] read: git ["config","--null","--list"] in ".."
+    [2024-06-11 08:16:49.036666793] (Git.Config) git config read: [("",[""]),("annex.backend",["SHA256"]),("annex.tune.objecthashlo
+    wer",["true"]),("annex.uuid",["b1510484-6489-4351-9876-993041f22cb3"]),("annex.version",["10"]),("core.bare",["false"]),("core.
+    filemode",["true"]),("core.logallrefupdates",["true"]),("core.repositoryformatversion",["0"]),("filter.annex.clean",["git-annex
+     smudge --clean -- %f"]),("filter.annex.process",["git-annex filter-process"]),("filter.annex.smudge",["git-annex smudge -- %f"
+    ]),("init.defaultbranch",["master"]),("remote.server.annex-ignore",["true"]),("remote.server.fetch",["+refs/heads/*:refs/remotes/
+    server/*"]),("remote.server.url",["ssh://server.local:/mnt/user/data"]),("safe.directory",["/mnt/user/data"]),("user.email",["root
+    ","root@delta.local"]),("user.name",["root","root"])]
+    [2024-06-11 08:16:49.036812734] (Utility.Process) process [17521] done ExitSuccess
+    failed
+    [2024-06-11 08:16:49.03837688] (Utility.Process) process [17522] read: ssh ["-O","stop","-S","server.local","-o","ControlMaster=
+    auto","-o","ControlPersist=yes","localhost"] in "../.git/annex/ssh/"
+    [2024-06-11 08:16:49.042787993] (Utility.Process) process [17522] done ExitFailure 255
+    [2024-06-11 08:16:49.043822645] (Utility.Process) process [17507] done ExitSuccess
+    enableremote: 1 failed
+
+I can reproduce it by calling `ssh` myself like this.
+
+    ssh server.local -S ../.git/annex/ssh/server.local -o ControlMaster=auto -o ControlPersist=yes -n -T git-annex-shell 'configlist' '/mnt/user/data' '--debug'
+    Control socket connect(../.git/annex/ssh/server.local): Connection refused
+    Failed to connect to new control master
+
+If I change the location of the socket file to use my home folder then it works.
+
+    ssh server.local -S $HOME/server.local -o ControlMaster=auto -o ControlPersist=yes -n -T git-annex-shell 'configlist' '/mnt/user/data' '--debug'
+    annex.uuid=23568973-b0e8-493f-9404-cce91346a818
+    core.gcrypt-id=
+
+Why isn't enableremote working?
+
+Thanks!

notes on behavior
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_10_df68709f0b9cdc265bdf37056af4edcc._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_10_df68709f0b9cdc265bdf37056af4edcc._comment
new file mode 100644
index 0000000000..e55a239721
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_10_df68709f0b9cdc265bdf37056af4edcc._comment
@@ -0,0 +1,30 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 10"""
+ date="2024-06-10T14:36:37Z"
+ content="""
+While I don't think this affects the ds002144 repository
+(because the repository with the missing tree is dead), here's what happens
+if the export.log's tree is missing, master has been reset to a previous tree,
+which was exported earlier, and in a clone we try to get a file that is present
+in both trees from the remote:
+
+	get foo (from d...) fatal: bad object f4815823941716de0f0fdf85e8aaba98d024d488
+	
+	  unknown export location
+
+Note that the "bad object" message only appears the first time it is run.
+Afterwards it only says "unknown export location".
+
+Even if the tree object later somehow gets pulled in, it will keep failing,
+because the exportdb at this point contains the tree sha and it won't try
+to update from it again.
+
+To recover from this situation, the user can make a change to
+the tree (eg add a file), and export. It will complain one last time about
+the bad object, and then the export.log gets fixed to contain an available
+tree. However, any files that were in the missing tree that do not get
+overwritten by that export will remain in the remote, without git-annex
+knowing about them. If the remote has importtree=yes, importing from it
+is another way to recover.
+"""]]

diff --git a/doc/bugs/git-annex_hogs_up_all_memory_oom-killer_kills_it.mdwn b/doc/bugs/git-annex_hogs_up_all_memory_oom-killer_kills_it.mdwn
new file mode 100644
index 0000000000..b26a45c4f2
--- /dev/null
+++ b/doc/bugs/git-annex_hogs_up_all_memory_oom-killer_kills_it.mdwn
@@ -0,0 +1,75 @@
+### Please describe the problem.
+Immediately after starting the git-annex web application, the git-annex process uses all available memory. After some time, the Linux oom-killer stops git-annex.
+
+### What steps will reproduce the problem?
+1. create a git-annex repo
+2. start git-annex webapp
+
+### What version of git-annex are you using? On what operating system?
+ii  git-annex                                      10.20240430-1                            amd64 
+debian trixie/sid
+
+
+### Please provide any additional information below.
+
+
+[[!format sh """
+# If you can, paste a complete transcript of the problem occurring here.
+# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
+
+syslog 
+
+
+root@hwarang:/var/log# grep oom *
+grep: cups: Is a directory
+grep: gdm3: Is a directory
+kern.log:2024-06-10T15:44:54.288491+02:00 hwarang kernel: systemd invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
+kern.log:2024-06-10T15:44:54.296307+02:00 hwarang kernel:  oom_kill_process+0xfa/0x200
+kern.log:2024-06-10T15:44:54.305858+02:00 hwarang kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
+kern.log:2024-06-10T15:44:54.317261+02:00 hwarang kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/app.slice/app-gnome-git\x2dannex-7364.scope,task=git-annex,pid=7388,uid=1000
+kern.log:2024-06-10T15:44:54.317262+02:00 hwarang kernel: Out of memory: Killed process 7388 (git-annex) total-vm:83979364kB, anon-rss:30780216kB, file-rss:1792kB, shmem-rss:0kB, UID:1000 pgtables:122696kB oom_score_adj:100
+kern.log:2024-06-10T15:44:56.485580+02:00 hwarang kernel: oom_reaper: reaped process 7388 (git-annex), now anon-rss:240kB, file-rss:336kB, shmem-rss:0kB
+kern.log:2024-06-10T15:53:38.057774+02:00 hwarang kernel: teamviewerd invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
+kern.log:2024-06-10T15:53:38.059980+02:00 hwarang kernel:  oom_kill_process+0xfa/0x200
+kern.log:2024-06-10T15:53:38.062710+02:00 hwarang kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
+kern.log:2024-06-10T15:53:38.066977+02:00 hwarang kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/app.slice/app-gnome-git\x2dannex-7833.scope,task=git-annex,pid=7856,uid=1000
+kern.log:2024-06-10T15:53:38.066978+02:00 hwarang kernel: Out of memory: Killed process 7856 (git-annex) total-vm:83979364kB, anon-rss:31243884kB, file-rss:1664kB, shmem-rss:0kB, UID:1000 pgtables:122900kB oom_score_adj:100
+kern.log:2024-06-10T15:53:40.337624+02:00 hwarang kernel: oom_reaper: reaped process 7856 (git-annex), now anon-rss:540kB, file-rss:128kB, shmem-rss:0kB
+grep: postgresql: Is a directory
+grep: private: Is a directory
+syslog:2024-06-10T15:44:54.288491+02:00 hwarang kernel: systemd invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
+syslog:2024-06-10T15:44:54.296307+02:00 hwarang kernel:  oom_kill_process+0xfa/0x200
+syslog:2024-06-10T15:44:54.305858+02:00 hwarang kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
+syslog:2024-06-10T15:44:54.317261+02:00 hwarang kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/app.slice/app-gnome-git\x2dannex-7364.scope,task=git-annex,pid=7388,uid=1000
+syslog:2024-06-10T15:44:54.317262+02:00 hwarang kernel: Out of memory: Killed process 7388 (git-annex) total-vm:83979364kB, anon-rss:30780216kB, file-rss:1792kB, shmem-rss:0kB, UID:1000 pgtables:122696kB oom_score_adj:100
+syslog:2024-06-10T15:44:56.485580+02:00 hwarang kernel: oom_reaper: reaped process 7388 (git-annex), now anon-rss:240kB, file-rss:336kB, shmem-rss:0kB
+syslog:2024-06-10T15:44:56.489365+02:00 hwarang systemd[3185]: app-gnome-git\x2dannex-7364.scope: Failed with result 'oom-kill'.
+syslog:2024-06-10T15:53:38.057774+02:00 hwarang kernel: teamviewerd invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
+syslog:2024-06-10T15:53:38.059980+02:00 hwarang kernel:  oom_kill_process+0xfa/0x200
+syslog:2024-06-10T15:53:38.062710+02:00 hwarang kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
+syslog:2024-06-10T15:53:38.066977+02:00 hwarang kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/app.slice/app-gnome-git\x2dannex-7833.scope,task=git-annex,pid=7856,uid=1000
+syslog:2024-06-10T15:53:38.066978+02:00 hwarang kernel: Out of memory: Killed process 7856 (git-annex) total-vm:83979364kB, anon-rss:31243884kB, file-rss:1664kB, shmem-rss:0kB, UID:1000 pgtables:122900kB oom_score_adj:100
+syslog:2024-06-10T15:53:40.337624+02:00 hwarang kernel: oom_reaper: reaped process 7856 (git-annex), now anon-rss:540kB, file-rss:128kB, shmem-rss:0kB
+syslog:2024-06-10T15:53:40.365942+02:00 hwarang systemd[3185]: app-gnome-git\x2dannex-7833.scope: Failed with result 'oom-kill'.
+grep: tomcat9: Is a directory
+
+End of syslog extract
+
+
+daemon.log
+(scanning...) (started...)
+
+daemon.status
+
+lastRunning:1718027123.570568257s
+scanComplete:False
+sanityCheckRunning:False
+lastSanityCheck:
+
+
+# End of transcript or log.
+"""]]
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+

Avoid grafting in export tree objects that are missing
They could be missing due to an interrupted git-annex at just the wrong
time during a prior graft, after which the tree objects got garbage
collected.
Or they could be missing because of manual messing with the git-annex
branch, eg resetting it to back before the graft commit.
Sponsored-by: Dartmouth College's OpenNeuro project
diff --git a/Annex/Branch.hs b/Annex/Branch.hs
index 2fb9e030de..806eabb99a 100644
--- a/Annex/Branch.hs
+++ b/Annex/Branch.hs
@@ -889,9 +889,13 @@ performTransitionsLocked jl ts neednewlocalbranch transitionedrefs = do
 		return c
 	  where
 		regraft [] c = pure c
-		regraft (et:ets) c = 
-			prepRememberTreeish et graftpoint c
-				>>= regraft ets
+		regraft (et:ets) c =
+			-- Verify that the tree object exists.
+			catObjectDetails et >>= \case
+				Just _ ->
+					prepRememberTreeish et graftpoint c
+						>>= regraft ets
+				Nothing -> regraft ets c
 		graftpoint = asTopFilePath exportTreeGraftPoint
 
 checkBranchDifferences :: Git.Ref -> Annex ()
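The `regraft` change above amounts to walking the list of exported trees and skipping any tree whose object no longer exists. A minimal standalone sketch of that idea, assuming a toy in-memory object store; `Sha` as a plain string, the list-based store, and `graft` are hypothetical stand-ins for illustration, not git-annex's actual types or functions:

```haskell
-- Toy stand-ins (assumptions, not git-annex types): a sha is a String,
-- and the object store is simply the list of shas that exist.
type Sha = String

-- Hypothetical graft step: derive a new commit name recording that tree
-- `et` was grafted onto commit `c`. The real code creates git commits.
graft :: Sha -> Sha -> Sha
graft et c = c ++ "+" ++ et

-- Walk the exported trees in order, grafting only those present in the
-- store and passing the commit through unchanged for missing ones.
regraft :: [Sha] -> [Sha] -> Sha -> Sha
regraft _ [] c = c
regraft store (et:ets) c
    | et `elem` store = regraft store ets (graft et c)
    | otherwise = regraft store ets c

main :: IO ()
main = do
    let store = ["tree1", "tree3"]
    -- "tree2" is missing from the store, so it is skipped rather than
    -- causing the whole regraft to fail.
    putStrLn (regraft store ["tree1", "tree2", "tree3"] "base")
```

In this model a missing tree simply leaves the commit untouched, which mirrors the `Nothing -> regraft ets c` branch in the diff.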
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__.mdwn b/doc/bugs/annex_merge__breaks_git_repository__33__.mdwn
index c453f3b39c..d4b5486fc1 100644
--- a/doc/bugs/annex_merge__breaks_git_repository__33__.mdwn
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__.mdwn
@@ -53,3 +53,4 @@ there are good and there are some bad days ;)
 [[!meta author=yoh]]
 [[!tag projects/openneuro]]
 
+> [[fixed|done]] --[[Joey]]
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_9_67dfc2b1444bd345f911ed779cb98bcc._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_9_67dfc2b1444bd345f911ed779cb98bcc._comment
new file mode 100644
index 0000000000..267e31599a
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_9_67dfc2b1444bd345f911ed779cb98bcc._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 9"""
+ date="2024-06-07T20:25:27Z"
+ content="""
+Note that at least in the case of ds002144, its git-annex branch does not
+contain grafts of the missing trees. The grafts only get created in the
+clone when dealing with a transition.
+
+So, it seems that to recover from the problem, at least in the case of this
+repository, it will be sufficient for git-annex to avoid regrafting trees
+if the object is missing.
+
+Done that, and so I suppose this bug can be closed. I'd be more satisfied if
+I knew how this repository was produced though.
+"""]]

atomic git-annex branch update when regrafting in transition
Fix a bug where interrupting git-annex while it is updating the git-annex
branch could lead to git fsck complaining about missing tree objects.
Interrupting git-annex while regraftexports is running in a transition
that is forgetting git-annex branch history would leave the
repository with a git-annex branch that did not contain the tree shas
listed in export.log. That lets those trees be garbage collected.
A subsequent run of the same transition then regrafts the trees listed
in export.log into the git-annex branch. But those trees have been lost.
Note that both sides of `if neednewlocalbranch` are atomic now. I had
thought only the True side needed to be, but I do think there may be
cases where the False side needs to be as well.
Sponsored-by: Dartmouth College's OpenNeuro project
diff --git a/Annex/Branch.hs b/Annex/Branch.hs
index 717cbc0400..2fb9e030de 100644
--- a/Annex/Branch.hs
+++ b/Annex/Branch.hs
@@ -818,12 +818,18 @@ performTransitionsLocked jl ts neednewlocalbranch transitionedrefs = do
 		if neednewlocalbranch
 			then do
 				cmode <- annexCommitMode <$> Annex.getGitConfig
-				committedref <- inRepo $ Git.Branch.commitAlways cmode message fullname transitionedrefs
-				setIndexSha committedref
+				-- Creating a new empty branch must happen
+				-- atomically, so if this is interrupted,
+				-- it will not leave the new branch created
+				-- but without exports grafted in.
+				c <- inRepo $ Git.Branch.commitShaAlways
+					cmode message transitionedrefs
+				void $ regraftexports c
 			else do
 				ref <- getBranch
-				commitIndex jl ref message (nub $ fullname:transitionedrefs)
-	regraftexports
+				ref' <- regraftexports ref
+				commitIndex jl ref' message
+					(nub $ fullname:transitionedrefs)
   where
 	message
 		| neednewlocalbranch && null transitionedrefs = "new branch for transition " ++ tdesc
@@ -872,13 +878,21 @@ performTransitionsLocked jl ts neednewlocalbranch transitionedrefs = do
 					apply rest file content'
 
 	-- Trees mentioned in export.log were grafted into the old
-	-- git-annex branch to make sure they remain available. Re-graft
-	-- the trees into the new branch.
-	regraftexports = do
+	-- git-annex branch to make sure they remain available.
+	-- Re-graft the trees.
+	regraftexports parent = do
 		l <- exportedTreeishes . M.elems . parseExportLogMap
 			<$> getStaged exportLog
-		forM_ l $ \t ->
-			rememberTreeishLocked t (asTopFilePath exportTreeGraftPoint) jl
+		c <- regraft l parent
+		inRepo $ Git.Branch.update' fullname c
+		setIndexSha c
+		return c
+	  where
+		regraft [] c = pure c
+		regraft (et:ets) c = 
+			prepRememberTreeish et graftpoint c
+				>>= regraft ets
+		graftpoint = asTopFilePath exportTreeGraftPoint
 
 checkBranchDifferences :: Git.Ref -> Annex ()
 checkBranchDifferences ref = do
@@ -935,26 +949,29 @@ getMergedRefs' = do
  - Returns the sha of the git commit made to the git-annex branch.
  -}
 rememberTreeish :: Git.Ref -> TopFilePath -> Annex Git.Sha
-rememberTreeish treeish graftpoint = lockJournal $
-	rememberTreeishLocked treeish graftpoint
-rememberTreeishLocked :: Git.Ref -> TopFilePath -> JournalLocked -> Annex Git.Sha
-rememberTreeishLocked treeish graftpoint jl = do
+rememberTreeish treeish graftpoint = lockJournal $ \jl -> do
 	branchref <- getBranch
 	updateIndex jl branchref
+	c <- prepRememberTreeish treeish graftpoint branchref
+	inRepo $ Git.Branch.update' fullname c
+	-- The tree in c is the same as the tree in branchref,
+	-- and the index was updated to that above, so it's safe to
+	-- say that the index contains c.
+	setIndexSha c
+	return c
+
+{- Create a series of commits that graft a tree onto the parent commit,
+ - and then remove it. -}
+prepRememberTreeish :: Git.Ref -> TopFilePath -> Git.Ref -> Annex Git.Sha
+prepRememberTreeish treeish graftpoint parent = do
 	origtree <- fromMaybe (giveup "unable to determine git-annex branch tree") <$>
-		inRepo (Git.Ref.tree branchref)
+		inRepo (Git.Ref.tree parent)
 	addedt <- inRepo $ Git.Tree.graftTree treeish graftpoint origtree
 	cmode <- annexCommitMode <$> Annex.getGitConfig
 	c <- inRepo $ Git.Branch.commitTree cmode
-		["graft"] [branchref] addedt
-	c' <- inRepo $ Git.Branch.commitTree cmode
+		["graft"] [parent] addedt
+	inRepo $ Git.Branch.commitTree cmode
 		["graft cleanup"] [c] origtree
-	inRepo $ Git.Branch.update' fullname c'
-	-- The tree in c' is the same as the tree in branchref,
-	-- and the index was updated to that above, so it's safe to
-	-- say that the index contains c'.
-	setIndexSha c'
-	return c'
 
 {- Runs an action on the content of selected files from the branch.
  - This is much faster than reading the content of each file in turn,
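The restructuring in this diff separates preparing the graft commits from moving the branch ref, so the ref is updated only once, after every graft is in place. A rough sketch of that pattern, assuming an `IORef` as a stand-in for the branch ref and strings for commits; `prepRemember` and `prepAll` are hypothetical names, and the real `prepRememberTreeish` runs in the `Annex` monad and writes git objects:

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)

-- Assumption for illustration: commits and trees are plain strings.
type Sha = String

-- Hypothetical model of one graft: a "graft" commit on top of the parent,
-- followed by a "cleanup" commit. No branch ref is touched here.
prepRemember :: Sha -> Sha -> Sha
prepRemember tree parent = "cleanup(graft(" ++ parent ++ ", " ++ tree ++ "))"

-- Chain the grafts for every exported tree, threading the newest commit
-- through as the parent of the next graft.
prepAll :: [Sha] -> Sha -> Sha
prepAll trees parent = foldl (\c t -> prepRemember t c) parent trees

main :: IO ()
main = do
    -- The IORef plays the role of the git-annex branch ref.
    branch <- newIORef "old-head"
    let newHead = prepAll ["tree1", "tree2"] "transition-commit"
    -- Single update: the ref points either at the old head or at a commit
    -- whose history already contains all the grafts.
    writeIORef branch newHead
    readIORef branch >>= putStrLn
```

If the process is interrupted before the final write in this model, the ref still names the old head, which is the property the commit message describes.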
diff --git a/CHANGELOG b/CHANGELOG
index 693f55a8ab..e42f967b5b 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,5 +1,8 @@
 git-annex (10.20240532) UNRELEASED; urgency=medium
 
+  * Fix a bug where interrupting git-annex while it is updating the
+    git-annex branch for an export could later lead to git fsck
+    complaining about missing tree objects.
   * Fix Windows build with Win32 2.13.4+
     Thanks, Oleg Tolmatcev
 
diff --git a/Git/Branch.hs b/Git/Branch.hs
index 8569f5d249..9d0ba56384 100644
--- a/Git/Branch.hs
+++ b/Git/Branch.hs
@@ -178,13 +178,25 @@ commitCommand' runner commitmode commitquiet ps =
  - in any way, or output a summary.
  -}
 commit :: CommitMode -> Bool -> String -> Branch -> [Ref] -> Repo -> IO (Maybe Sha)
-commit commitmode allowempty message branch parentrefs repo = do
-	tree <- writeTree repo
-	ifM (cancommit tree)
-		( do
-			sha <- commitTree commitmode [message] parentrefs tree repo
+commit commitmode allowempty message branch parentrefs repo =
+	commitSha commitmode allowempty message parentrefs repo >>= \case
+		Just sha -> do
 			update' branch sha repo
 			return $ Just sha
+		Nothing -> return Nothing
+  where
+	cancommit tree
+		| allowempty = return True
+		| otherwise = case parentrefs of
+			[p] -> maybe False (tree /=) <$> Git.Ref.tree p repo
+			_ -> return True
+
+{- Same as commit but without updating any branch. -}
+commitSha :: CommitMode -> Bool -> String -> [Ref] -> Repo -> IO (Maybe Sha)
+commitSha commitmode allowempty message parentrefs repo = do
+	tree <- writeTree repo
+	ifM (cancommit tree)
+		( Just <$> commitTree commitmode [message] parentrefs tree repo
 		, return Nothing
 		)
   where
@@ -198,6 +210,10 @@ commitAlways :: CommitMode -> String -> Branch -> [Ref] -> Repo -> IO Sha
 commitAlways commitmode message branch parentrefs repo = fromJust
 	<$> commit commitmode True message branch parentrefs repo
 
+commitShaAlways :: CommitMode -> String -> [Ref] -> Repo -> IO Sha
+commitShaAlways commitmode message parentrefs repo = fromJust
+	<$> commitSha commitmode True message parentrefs repo
+
 -- Throws exception if the index is locked, with an error message output by
 -- git on stderr.
 writeTree :: Repo -> IO Sha
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_8_678151d78d145da6d249184ac212f935._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_8_678151d78d145da6d249184ac212f935._comment
new file mode 100644
index 0000000000..d815f62ca9
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_8_678151d78d145da6d249184ac212f935._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 8"""
+ date="2024-06-07T17:59:43Z"
+ content="""
+Fixed performTransitionsLocked to create the new git-annex branch
+atomically.
+
+Found another way this could happen, interrupting `git-annex export` after
+it writes export.log but before it grafts the tree into the git-annex
+branch. Fixed that one too.
+
+So hopefully this won't happen to any more repositories with these fixes.
+Still leaves the question of how to recover from the problem.
+"""]]

update status and design work on proxy encryption and chunking
diff --git a/doc/design/passthrough_proxy.mdwn b/doc/design/passthrough_proxy.mdwn
index e943363369..76e6c2cc18 100644
--- a/doc/design/passthrough_proxy.mdwn
+++ b/doc/design/passthrough_proxy.mdwn
@@ -189,24 +189,60 @@ The remote interface operates on object files stored on disk. See
 [[todo/transitive_transfers]] for discussion of that problem. If proxies
 get implemented, that problem should be revisited.
 
+## chunking
+
+When the proxy is in front of a special remote that is chunked,
+where does the chunking happen? It could happen on the client, or on the
+proxy.
+
+Git remotes don't ever do chunking currently, so chunking on the client
+would need changes there.
+
+Also, a given upload via a proxy may get sent to several special remotes,
+each with different chunk sizes, or perhaps some not chunked and some
+chunked. For uploads to be efficient, chunking needs to happen on the proxy.
+
 ## encryption
 
 When the proxy is in front of a special remote that uses encryption, where
 does the encryption happen? It could either happen on the client before
 sending to the proxy, or the proxy could do the encryption since it
-communicates with the special remote. For security, doing the encryption on
-the client seems like the best choice by far.
+communicates with the special remote.
+
+If the client does not want the proxy to see unencrypted data,
+they would obviously prefer encryption happens locally.
 
-But, git-annex's git remotes don't currently ever do encryption. And
-special remotes don't communicate via the P2P protocol with a git remote.
-So none of git-annex's existing remote implementations would be able to handle
-this case. Something will need to be changed in the remote
-implementation for this.
+But, the proxy could be the only thing that has access to a security key
+that is used in encrypting a special remote that's located behind it.
+There's a security benefit there too.
 
-(Chunking has the same problem.)
+So there are kind of two different perspectives here that can have
+different opinions.
+
+Also if encryption for a special remote behind a proxy happened
+client-side, and the client relied on that, nothing would stop the proxy
+from replacing that encrypted special remote with an unencrypted remote.
+Then the client side encryption would not happen, the user would not
+notice, and the proxy could see their unencrypted content.
+
+Of course, if a client really wanted to, they could make a special remote
+that uses the remote behind the proxy as a key/value backend.
+Then the client could encrypt locally.
+
+On the implementation side, git-annex's git remotes don't currently ever do
+encryption. And special remotes don't communicate via the P2P protocol with
+a git remote. So none of git-annex's existing remote implementations would
+be able to handle client-side encryption.
 
 There's potentially a layering problem here, because exactly how encryption
-(or chunking) works can vary depending on the type of special remote.
+works can vary depending on the type of special remote.
+
+Encrypted and chunked special remotes first chunk, then encrypt.
+So if chunking happens on the proxy, encryption *must* also happen there.
+
+So overall, it seems better to do proxy-side encryption. But it may be
+worth adding a special remote that does its own client-side encryption
+in front of the proxy.
 
 ## cycles
 
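The ordering argument above (chunk first, then encrypt each chunk) can be illustrated with a small sketch. Everything here is an assumption made for illustration: `chunksOf'`, `toyEncrypt`, and `storeChunkedEncrypted` are hypothetical names, and the XOR "cipher" is a toy that is not secure and is not how git-annex encrypts.

```haskell
import Data.Bits (xor)
import Data.Char (chr, ord)

-- Split the content into chunks of at most n characters.
chunksOf' :: Int -> String -> [String]
chunksOf' n xs
    | n <= 0 = [xs]
    | null xs = []
    | otherwise = let (a, b) = splitAt n xs in a : chunksOf' n b

-- Toy cipher (NOT secure): XOR every character with a small key.
-- Applying it twice with the same key restores the input.
toyEncrypt :: Int -> String -> String
toyEncrypt key = map (chr . xor key . ord)

-- Chunk first, then encrypt each chunk. Whoever runs this step needs the
-- plaintext in order to chunk it, so it also has to do the encrypting.
storeChunkedEncrypted :: Int -> Int -> String -> [String]
storeChunkedEncrypted size key = map (toyEncrypt key) . chunksOf' size

main :: IO ()
main = do
    let stored = storeChunkedEncrypted 4 42 "hello world!"
    print (map length stored)
    -- Decrypting each chunk and joining them recovers the content.
    putStrLn (concatMap (toyEncrypt 42) stored)
```

Because the chunk boundaries are chosen on the plaintext, a proxy that does the chunking cannot receive already-encrypted content from the client and still produce the same stored layout.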
diff --git a/doc/todo/git-annex_proxies.mdwn b/doc/todo/git-annex_proxies.mdwn
index 90dc9c614d..2e8bad27cd 100644
--- a/doc/todo/git-annex_proxies.mdwn
+++ b/doc/todo/git-annex_proxies.mdwn
@@ -34,16 +34,13 @@ For June's work on [[design/passthrough_proxy]], implementation plan:
 1. Add `git-annex updateproxy` command and remote.name.annex-proxy
    configuration. (done)
 
-2. Remote instantiation for proxies almost works, but fails at:
-   "git-annex: cannot determine uuid for origin-foo"
-
-   getRepoUUID does not look at the Repo's UUID setting, but reads it
-   from git-config. It's not set there for a proxied remote.
-
-   So: Add annex-uuid parsing to RemoteConfig.
+2. Remote instantiation for proxies. (done)
 
 3. Implement proxying in git-annex-shell.
 
+4. Either implement proxying for local path remotes, or prevent
+   listProxied from operating on them.
+
 4. Let `storeKey` return a list of UUIDs where content was stored,
    and make proxies accept uploads directed at them, rather than a specific
    instantiated remote, and fan out the upload to whatever nodes behind

comment
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_7_7f3db7b5a47c3021f82a49024a8c7e43._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_7_7f3db7b5a47c3021f82a49024a8c7e43._comment
new file mode 100644
index 0000000000..87fdabd944
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_7_7f3db7b5a47c3021f82a49024a8c7e43._comment
@@ -0,0 +1,33 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 7"""
+ date="2024-06-04T15:15:36Z"
+ content="""
+Decoding the export.log, we have these events:
+
+Tue Aug  4 13:44:10 2020 (PST): An export is run on an openneuro worker
+sending to `s3-PRIVATE`, of b78b723042e6d7a967c806b52258e8554caa1696 which
+is now lost to history. After that export completed, there was a subsequent
+started but not completed export of
+ae2937297eb1b4f6c9bfdfcf9d7a41b1adcea32e, also lost to history.
+
+Fri Jan 19 21:04:26 2024: An export run on the same worker, sending to
+a `s3-PUBLIC` (not the current one, one that has been marked dead and
+forgotten), of ae2937297eb1b4f6c9bfdfcf9d7a41b1adcea32e. After that export
+completed, there was a subsequent started but not completed export of
+28b655e8207f916122bbcbd22c0369d86bb4ffc1.
+
+Later the same day, an export run on the same worker, sending to
+`s3-PUBLIC` (the current one), of 28b655e8207f916122bbcbd22c0369d86bb4ffc1.
+This export completed.
+
+Interesting that two exports were apparently started but left incomplete.
+This could have been because git-annex was interrupted, which would go some
+way toward confirming my analysis of this bug. But it is also possible that
+there was an error exporting one or more files.
+
+According to Nell, the git history of main was rewritten to remove a large
+file from git. The tree 28b655e8207f916122bbcbd22c0369d86bb4ffc1 appears
+to still contain the large binary file. No commit in main references it.
+It did get grafted into the git-annex branch which is why it was not lost.
+"""]]

next step identified
diff --git a/doc/todo/git-annex_proxies.mdwn b/doc/todo/git-annex_proxies.mdwn
index 84b012368f..90dc9c614d 100644
--- a/doc/todo/git-annex_proxies.mdwn
+++ b/doc/todo/git-annex_proxies.mdwn
@@ -34,7 +34,13 @@ For June's work on [[design/passthrough_proxy]], implementation plan:
 1. Add `git-annex updateproxy` command and remote.name.annex-proxy
    configuration. (done)
 
-2. Test implementation of remote instantiation for proxies.
+2. Remote instantiation for proxies almost works, but fails at:
+   "git-annex: cannot determine uuid for origin-foo"
+
+   getRepoUUID does not look at the Repo's UUID setting, but reads it
+   from git-config. It's not set there for a proxied remote.
+
+   So: Add annex-uuid parsing to RemoteConfig.
 
 3. Implement proxying in git-annex-shell.
 

update
diff --git a/doc/todo/git-annex_proxies.mdwn b/doc/todo/git-annex_proxies.mdwn
index ddac3b9cad..84b012368f 100644
--- a/doc/todo/git-annex_proxies.mdwn
+++ b/doc/todo/git-annex_proxies.mdwn
@@ -34,21 +34,9 @@ For June's work on [[design/passthrough_proxy]], implementation plan:
 1. Add `git-annex updateproxy` command and remote.name.annex-proxy
    configuration. (done)
 
-1. getProxies should be cached to avoid repeatedly reading the log and
-   parsing.
+2. Test implementation of remote instantiation for proxies.
 
-1. Remote names coming from the git-annex branch need to be
-   limited to what's legal in git remote names. If a remote name is not
-   legal, munge it until it is.
-   This will also prevent remote names being a security hazard
-   via eg escape characters.
-
-2. Remote instantiation for proxies. When a remote "foo" is a proxy,
-   and has a remote "bar", instantiate a remote "foo-bar" that has the UUID
-   of bar but is of the same type and configuration of remote "foo".
-
-3. Implement proxying in git-annex-shell so connections with the UUID
-   of one of the proxy's 
+3. Implement proxying in git-annex-shell.
 
 4. Let `storeKey` return a list of UUIDs where content was stored,
    and make proxies accept uploads directed at them, rather than a specific
@@ -73,4 +61,4 @@ For June's work on [[design/passthrough_proxy]], implementation plan:
 
 11. indirect uploads (to be considered). See design.
 
-
+12. Support using a proxy when its url is a P2P address.

removed
diff --git a/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_3_1a6f7ef00e8cdabf8b52dfd01a1f6148._comment b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_3_1a6f7ef00e8cdabf8b52dfd01a1f6148._comment
deleted file mode 100644
index 74c78fe2bf..0000000000
--- a/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_3_1a6f7ef00e8cdabf8b52dfd01a1f6148._comment
+++ /dev/null
@@ -1,15 +0,0 @@
-[[!comment format=mdwn
- username="ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd"
- nickname="ruslan"
- avatar="http://cdn.libravatar.org/avatar/37d3c852372d96daa8a99629755ed1f9"
- subject="comment 3"
- date="2024-06-06T11:23:55Z"
- content="""
-Thank you for the heads up! 
-
-I've actually looked into DataLad, and have been using git annex with submodules.
-
-Problem I found with submodules is that they required a lot of additional steps as far as adding/moving/deleting/syncing them. A very manual process, with a lot of complexity and some rough edge cases. They also interfere with some of Git-Annex functionality like metadata driven views I believe. So I'm using submodules very sparingly, only when I really need them.
-
-As far as DataLad - it looks like a mature and well supported project, would love to see more feedback/reviews on it.
-"""]]

Added a comment
diff --git a/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_3_1a6f7ef00e8cdabf8b52dfd01a1f6148._comment b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_3_1a6f7ef00e8cdabf8b52dfd01a1f6148._comment
new file mode 100644
index 0000000000..74c78fe2bf
--- /dev/null
+++ b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_3_1a6f7ef00e8cdabf8b52dfd01a1f6148._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd"
+ nickname="ruslan"
+ avatar="http://cdn.libravatar.org/avatar/37d3c852372d96daa8a99629755ed1f9"
+ subject="comment 3"
+ date="2024-06-06T11:23:55Z"
+ content="""
+Thank you for the heads up! 
+
+I've actually looked into DataLad, and have been using git annex with submodules.
+
+Problem I found with submodules is that they required a lot of additional steps as far as adding/moving/deleting/syncing them. A very manual process, with a lot of complexity and some rough edge cases. They also interfere with some of Git-Annex functionality like metadata driven views I believe. So I'm using submodules very sparingly, only when I really need them.
+
+As far as DataLad - it looks like a mature and well supported project, would love to see more feedback/reviews on it.
+"""]]

Added a comment
diff --git a/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_2_efe39f86b7ab71a64cd6ce4770f39d42._comment b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_2_efe39f86b7ab71a64cd6ce4770f39d42._comment
new file mode 100644
index 0000000000..e6a32f5a0a
--- /dev/null
+++ b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_2_efe39f86b7ab71a64cd6ce4770f39d42._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd"
+ nickname="ruslan"
+ avatar="http://cdn.libravatar.org/avatar/37d3c852372d96daa8a99629755ed1f9"
+ subject="comment 2"
+ date="2024-06-06T11:23:34Z"
+ content="""
+Thank you for the heads up! 
+
+I've actually looked into DataLad, and have been using git annex with submodules.
+
+Problem I found with submodules is that they required a lot of additional steps as far as adding/moving/deleting/syncing them. A very manual process, with a lot of complexity and some rough edge cases. They also interfere with some of Git-Annex functionality like metadata driven views I believe. So I'm using submodules very sparingly, only when I really need them.
+
+As far as DataLad - it looks like a mature and well supported project, would love to see more feedback/reviews on it.
+"""]]

diff --git a/doc/forum/How_to_add_git_annex_metadata_to_directories__63__.mdwn b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__.mdwn
index f5920e2b55..e67ec99e77 100644
--- a/doc/forum/How_to_add_git_annex_metadata_to_directories__63__.mdwn
+++ b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__.mdwn
@@ -1,3 +1,3 @@
-As I understand - there is currently now way to track metadata for directories with `git annex metadata`, and it only works for files. Is that indeed the case?
+As I understand - there is currently no way to track metadata for directories with `git annex metadata` (it only works for files). Is that indeed the case?
 
 One workaround I'm looking at is to add a metadata placeholder file for directory metadata inside the directory. As I understand - each directory would need to have such file with some unique content (perhaps UUID), otherwise metadata between files for different directories will actually collide. Are there alternatives/better solutions for tracking datasets metadata (groups of files in a folder)?

Added a comment
diff --git a/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_1_3eee0688143bbd0696cde16c7fca8d06._comment b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_1_3eee0688143bbd0696cde16c7fca8d06._comment
new file mode 100644
index 0000000000..fd9ecec870
--- /dev/null
+++ b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__/comment_1_3eee0688143bbd0696cde16c7fca8d06._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="nobodyinperson"
+ avatar="http://cdn.libravatar.org/avatar/736a41cd4988ede057bae805d000f4f5"
+ subject="comment 1"
+ date="2024-06-06T09:09:03Z"
+ content="""
+You are absolutely right. You might be interested in [DataLad](https://datalad.org), which provides a lot of convenience around git-annex, has the concept of datasets (git submodules) and also an extended approach to metadata.
+"""]]

diff --git a/doc/forum/How_to_add_git_annex_metadata_to_directories__63__.mdwn b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__.mdwn
new file mode 100644
index 0000000000..f5920e2b55
--- /dev/null
+++ b/doc/forum/How_to_add_git_annex_metadata_to_directories__63__.mdwn
@@ -0,0 +1,3 @@
+As I understand - there is currently now way to track metadata for directories with `git annex metadata`, and it only works for files. Is that indeed the case?
+
+One workaround I'm looking at is to add a metadata placeholder file for directory metadata inside the directory. As I understand - each directory would need to have such file with some unique content (perhaps UUID), otherwise metadata between files for different directories will actually collide. Are there alternatives/better solutions for tracking datasets metadata (groups of files in a folder)?

Added a comment
diff --git a/doc/bugs/git_annex_unannex_-_some_files_still_symlinked/comment_1_a6c2c4e87743da11dcc2ed718a350bb4._comment b/doc/bugs/git_annex_unannex_-_some_files_still_symlinked/comment_1_a6c2c4e87743da11dcc2ed718a350bb4._comment
new file mode 100644
index 0000000000..c507684212
--- /dev/null
+++ b/doc/bugs/git_annex_unannex_-_some_files_still_symlinked/comment_1_a6c2c4e87743da11dcc2ed718a350bb4._comment
@@ -0,0 +1,22 @@
+[[!comment format=mdwn
+ username="ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd"
+ nickname="ruslan"
+ avatar="http://cdn.libravatar.org/avatar/37d3c852372d96daa8a99629755ed1f9"
+ subject="comment 1"
+ date="2024-06-05T17:34:32Z"
+ content="""
+Solution with running `git annex add` is also described at the link below:
+
+https://git-annex.branchable.com/forum/git_annex_add_crash_and_subsequent_recovery/#comment-4f5af644597a055624009c5bbb9aca3f
+
+---
+
+So I need to find the files that are symlinks into the git-annex objects folder and then run `git annex add` / `git annex unused`. I can handle that with a script, though it would be nice to have a built-in method.
+
+---
+
+Additional notes:
+
+1. There should be a way to find files that were added to git annex folder but are not tracked by git annex. Is this something that can be done with existing commands?
+2. It's desirable to have a way to abort `git annex add` gracefully on long-running jobs. Is there a way to do it now? It looks like Ctrl-C resulted in a broken state. Would Ctrl-Z work better?
+"""]]

diff --git a/doc/bugs/git_annex_unannex_-_some_files_still_symlinked.mdwn b/doc/bugs/git_annex_unannex_-_some_files_still_symlinked.mdwn
new file mode 100644
index 0000000000..14c1a4cf89
--- /dev/null
+++ b/doc/bugs/git_annex_unannex_-_some_files_still_symlinked.mdwn
@@ -0,0 +1,35 @@
+### Please describe the problem.
+
+1. Some files remain symlinked after an aborted `git annex add` and a completed `git annex unannex`.
+2. These files are present in `.git/annex/objects`, but `git annex unused` does not find them. Running `git annex whereused --key=SHA256E...` returns nothing.
+
+Restoring the files and removing them from the git-annex objects folder requires manual workarounds or hacks, like adding a file again with `git annex add` and then trying to remove it again.
+
+### What steps will reproduce the problem?
+
+1. Run `git annex add` and abort the operation mid-way (this was on a directory with a large number of files, ~3K, running with the jobs switch set to 12).
+2. Run `git annex unannex` until done.
+3. Find that some of the added files were restored, while others are still symlinked but not tracked by git-annex.
+
+
+### What version of git-annex are you using? On what operating system?
+
+Debian Bookworm / git-annex version: 10.20240227-1
+
+### Please provide any additional information below.
+
+Similar report from another user here:
+https://git-annex.branchable.com/forum/File_still_symlinked_after_git_annex_unannex/
+
+[[!format sh """
+# If you can, paste a complete transcript of the problem occurring here.
+# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
+
+
+# End of transcript or log.
+"""]]
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+
+Yes, using it extensively for a few years with terabytes of data
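
The script-based workaround mentioned in this report could look roughly like the following. This is only a sketch under stated assumptions, not a git-annex feature: it assumes leftover files are symlinks whose target passes through `.git/annex/objects`, and it uses `git ls-files --error-unmatch` to decide whether git knows about a path. The function names are hypothetical.

```python
import os
import subprocess
from pathlib import Path

# Assumption: annexed symlinks always have this fragment in their target.
ANNEX_OBJECTS = ".git/annex/objects"


def points_into_annex(link_target: str) -> bool:
    """True if a symlink target passes through .git/annex/objects."""
    return ANNEX_OBJECTS in link_target.replace("\\", "/")


def annex_symlinks(root: Path):
    """Yield symlinks under root (skipping .git) that point into the annex."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into the git directory itself.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink() and points_into_annex(os.readlink(path)):
                yield path


def tracked_by_git(root: Path, path: Path) -> bool:
    """Ask git whether the path is present in the index."""
    result = subprocess.run(
        ["git", "ls-files", "--error-unmatch", "--", str(path.relative_to(root))],
        cwd=root,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def leftover_symlinks(root: Path):
    """Annex-style symlinks that git does not track (candidates for cleanup)."""
    return [p for p in annex_symlinks(root) if not tracked_by_git(root, p)]
```

Calling `leftover_symlinks(Path("."))` from the top of a repository would list candidates; what to do with each one (re-add it, or copy the content back) is left to the user.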

Added a comment
diff --git a/doc/todo/wherewas/comment_1_23260a8010a9dc707783408ac1663b00._comment b/doc/todo/wherewas/comment_1_23260a8010a9dc707783408ac1663b00._comment
new file mode 100644
index 0000000000..7c91260bad
--- /dev/null
+++ b/doc/todo/wherewas/comment_1_23260a8010a9dc707783408ac1663b00._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd"
+ nickname="ruslan"
+ avatar="http://cdn.libravatar.org/avatar/37d3c852372d96daa8a99629755ed1f9"
+ subject="comment 1"
+ date="2024-06-05T16:53:50Z"
+ content="""
+Yes, limiting it to a single file would be sufficient for the use case I encountered, and would keep it simple from the usage / user interface standpoint IMHO.
+Would look forward to this!
+"""]]

status update after day 1 of new project
diff --git a/doc/todo/git-annex_proxies.mdwn b/doc/todo/git-annex_proxies.mdwn
index 4ae8bc1f39..ddac3b9cad 100644
--- a/doc/todo/git-annex_proxies.mdwn
+++ b/doc/todo/git-annex_proxies.mdwn
@@ -22,22 +22,27 @@ Planned schedule of work:
 
 # work notes
 
+In development on the `proxy` branch.
+
 For June's work on [[design/passthrough_proxy]], implementation plan:
 
 1. UUID discovery via git-annex branch. Add a log file listing UUIDs
    accessible via proxy UUIDs. It also will contain the names
    of the remotes that the proxy is a proxy for, 
-   from the perspective of the proxy.
+   from the perspective of the proxy. (done)
+
+1. Add `git-annex updateproxy` command and remote.name.annex-proxy
+   configuration. (done)
 
-   Note that remote names coming from the git-annex branch need to be
-   limited to what's legal in git remote names. 
+1. getProxies should be cached to avoid repeatedly reading the log and
+   parsing.
+
+1. Remote names coming from the git-annex branch need to be
+   limited to what's legal in git remote names. If a remote name is not
+   legal, munge it until it is.
    This will also prevent remote names being a security hazard
    via eg escape characters.
 
-1. Add a command that is run on the proxy to update the proxy log file.
-   This is how the user sets it up as a proxy, and selects the remotes it's
-   proxying for.
-
 2. Remote instantiation for proxies. When a remote "foo" is a proxy,
    and has a remote "bar", instantiate a remote "foo-bar" that has the UUID
    of bar but is of the same type and configuration of remote "foo".
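
The "munge it until it is" step for remote names could be sketched as below. This is an illustration only: the function name is hypothetical, and the allowed character set is an assumption that is deliberately stricter than what git actually accepts in remote names, so that escape characters and similar hazards never survive.

```python
import re


def munge_remote_name(name: str) -> str:
    """Map an arbitrary string to a conservative git remote name.

    Assumption: only ASCII letters, digits, "_" and "-" are kept. Git itself
    allows more, but a narrow whitelist also removes escape characters.
    """
    munged = re.sub(r"[^A-Za-z0-9_-]", "_", name)
    # An empty name, or one that looks like a command-line option, is
    # prefixed so that it is still usable as a remote name.
    if not munged or munged.startswith("-"):
        munged = "_" + munged
    return munged
```

Two different names from the git-annex branch can munge to the same result; a real implementation would need to decide how to handle such collisions.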

implementation plan
diff --git a/doc/design/passthrough_proxy.mdwn b/doc/design/passthrough_proxy.mdwn
index 36ffbc1bd4..e943363369 100644
--- a/doc/design/passthrough_proxy.mdwn
+++ b/doc/design/passthrough_proxy.mdwn
@@ -87,6 +87,27 @@ to store data when eg, all the repositories that is knows about are full.
 Just getting the git-annex back in sync should recover from either
 situation.
 
+> This seems like the clear winner.
+
+## UUID discovery security
+
+Are there any security concerns with adding UUID discovery?
+
+Suppose that repository A claims to be a proxy for repository B, but it's
+not connected to B, and is actually evil. Then git-annex would instantiate
+a remote A-B with the UUID of B. If files were sent to A-B, git-annex would
+consider them present on B, and not send them to B by other remotes.
+
+Well, in this situation, A wrote to the git-annex branch (or used a P2P
+protocol extension) in order to pose as B. Without a proxy feature A could
+just as well falsify location logs to claim that B contains things it did
+not. Also, without a proxy feature, A could set its UUID to be the same as
+B, and so trick us into sending files to it rather than B.
+
+The only real difference seems to be that the UUID of a remote is cached,
+so A could only do this the first time we accessed it, and not later.
+With UUID discovery, A can do that at any time.
+
 ## user interface
 
 What to name the instantiated remotes? Probably the best that could
@@ -129,7 +150,7 @@ A command like `git-annex push` would see all the instantiated remotes and
 would pick one to send content to. Seems like the proxy might choose to
 `storeKey` the content on other node(s) than the requested one. Which would
 be fine. But, `git-annex push` would still do considerable extra work in
-interating over all the instantiated remotes. So it might be better to make
+iterating over all the instantiated remotes. So it might be better to make
 such commands not operate on instantiated remotes for sending content but
 only on the proxy. 
 
@@ -192,7 +213,7 @@ There's potentially a layering problem here, because exactly how encryption
 What if repo A is a proxy and has repo B as a remote. Meanwhile, repo B is
 a proxy and has repo A as a remote?
 
-An upload to repo A will start by checkin if repo B wants the content and if so,
+An upload to repo A will start by checking if repo B wants the content and if so,
 start an upload to repo B. Then the same happens on repo B, leading it to
 start an upload to repo A. 
 
diff --git a/doc/todo/git-annex_proxies.mdwn b/doc/todo/git-annex_proxies.mdwn
index 9c41bdb15a..4ae8bc1f39 100644
--- a/doc/todo/git-annex_proxies.mdwn
+++ b/doc/todo/git-annex_proxies.mdwn
@@ -19,3 +19,53 @@ Planned schedule of work:
 * October: proving behavior of balanced preferred content with proxies
 
 [[!tag projects/openneuro]]
+
+# work notes
+
+For June's work on [[design/passthrough_proxy]], implementation plan:
+
+1. UUID discovery via git-annex branch. Add a log file listing UUIDs
+   accessible via proxy UUIDs. It also will contain the names
+   of the remotes that the proxy is a proxy for, 
+   from the perspective of the proxy.
+
+   Note that remote names coming from the git-annex branch need to be
+   limited to what's legal in git remote names. 
+   This will also prevent remote names being a security hazard
+   via eg escape characters.
+
+1. Add a command that is run on the proxy to update the proxy log file.
+   This is how the user sets it up as a proxy, and selects the remotes it's
+   proxying for.
+
+2. Remote instantiation for proxies. When a remote "foo" is a proxy,
+   and has a remote "bar", instantiate a remote "foo-bar" that has the UUID
+   of bar but is of the same type and configuration of remote "foo".
+
+3. Implement proxying in git-annex-shell so connections with the UUID
+   of one of the proxy's 
+
+4. Let `storeKey` return a list of UUIDs where content was stored,
+   and make proxies accept uploads directed at them, rather than a specific
+   instantiated remote, and fan out the upload to whatever nodes behind
+   the proxy want it. This will need P2P protocol extensions.
+
+5. Make `git-annex copy --from $proxy` pick a node that contains each
+   file, and use the instantiated remote for getting the file. Same for
+   similar commands.
+
+6. Make `git-annex drop --from $proxy` drop, when possible, from every
+   remote accessible by the proxy. Communicate partial drops somehow.
+
+7. Make commands like `git-annex push` not iterate over instantiated
+   remotes, and instead just send content to the proxy for fanout.
+
+8. Optimise proxy speed. See design for idea.
+
+9. Encryption and chunking. See design for issues.
+
+10. Cycle prevention. See design.
+
+11. indirect uploads (to be considered). See design.
+
+
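
The remote instantiation step of the plan (a proxy "foo" with a remote "bar" yields a remote "foo-bar" that has bar's UUID but foo's type and configuration) can be sketched as follows. The `Remote` record and the shape of the proxy log (a mapping from remote name to UUID) are assumptions made for illustration, not git-annex's actual data structures.

```python
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Remote:
    name: str
    uuid: str
    kind: str  # hypothetical stand-in for "type and configuration"


def instantiate(proxy: Remote, behind: Mapping[str, str]) -> List[Remote]:
    """Build one instantiated remote per remote listed behind the proxy.

    `behind` maps remote names, as seen from the proxy, to their UUIDs.
    """
    return [
        Remote(name=f"{proxy.name}-{name}", uuid=uuid, kind=proxy.kind)
        for name, uuid in sorted(behind.items())
    ]
```

For example, `instantiate(Remote("foo", "uuid-foo", "ssh"), {"bar": "uuid-bar"})` gives a single remote named `foo-bar` with UUID `uuid-bar` and the proxy's kind.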

received funding to work on this, which comes with a schedule
diff --git a/doc/todo/git-annex_proxies.mdwn b/doc/todo/git-annex_proxies.mdwn
index 8f4fe3c3de..9c41bdb15a 100644
--- a/doc/todo/git-annex_proxies.mdwn
+++ b/doc/todo/git-annex_proxies.mdwn
@@ -8,4 +8,14 @@ repositories.
 4. [[todo/track_free_space_in_repos_via_git-annex_branch]]
 5. [[todo/proving_preferred_content_behavior]]
 
+Joey has received funding to work on this.
+Planned schedule of work:
+
+* June: git-annex proxy
+* July, part 1: git-annex proxy support for exporttree
+* July, part 2: p2p protocol over http
+* August: balanced preferred content
+* September: streaming through proxy to special remotes (especially S3)
+* October: proving behavior of balanced preferred content with proxies
+
 [[!tag projects/openneuro]]

comment
diff --git a/doc/forum/Strange_symlinkPointsToGitDir_error/comment_11_22a1e72fdd2008a03f40677590f51567._comment b/doc/forum/Strange_symlinkPointsToGitDir_error/comment_11_22a1e72fdd2008a03f40677590f51567._comment
deleted file mode 100644
index 610f760939..0000000000
--- a/doc/forum/Strange_symlinkPointsToGitDir_error/comment_11_22a1e72fdd2008a03f40677590f51567._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="joey"
- subject="""comment 11"""
- date="2024-06-04T10:39:27Z"
- content="""
-A fixed git was released on Friday. I don't know when GitLab will upgrade
-to it, perhaps they have already?
-"""]]
diff --git a/doc/forum/Strange_symlinkPointsToGitDir_error/comment_13_8da4bcf2ebfd986e45e4b3a732bfdfaf._comment b/doc/forum/Strange_symlinkPointsToGitDir_error/comment_13_8da4bcf2ebfd986e45e4b3a732bfdfaf._comment
new file mode 100644
index 0000000000..c8c054bf93
--- /dev/null
+++ b/doc/forum/Strange_symlinkPointsToGitDir_error/comment_13_8da4bcf2ebfd986e45e4b3a732bfdfaf._comment
@@ -0,0 +1,7 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 13"""
+ date="2024-06-04T10:40:44Z"
+ content="""
+Very happy the git developers fixed this in their release on Friday.
+"""]]

update
diff --git a/doc/forum/Strange_symlinkPointsToGitDir_error/comment_11_22a1e72fdd2008a03f40677590f51567._comment b/doc/forum/Strange_symlinkPointsToGitDir_error/comment_11_22a1e72fdd2008a03f40677590f51567._comment
new file mode 100644
index 0000000000..610f760939
--- /dev/null
+++ b/doc/forum/Strange_symlinkPointsToGitDir_error/comment_11_22a1e72fdd2008a03f40677590f51567._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 11"""
+ date="2024-06-04T10:39:27Z"
+ content="""
+A fixed git was released on Friday. I don't know when GitLab will upgrade
+to it, perhaps they have already?
+"""]]

fixed by git release
diff --git a/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir.mdwn b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir.mdwn
index 77fb4e4885..31648dc593 100644
--- a/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir.mdwn
+++ b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir.mdwn
@@ -25,6 +25,8 @@ configs.
 Is it at all common to set `git config fetch.fsckObjects true` or 
 `git config receive.fsckObjects` true?
 
+> [[fixed|done]] in git --[[Joey]] 
+
 ----
 
 There is also potential breakage from git fsck now warning about symlink
diff --git a/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_4_3da394c6155827fb6d9e64f1fdc214a1._comment b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_4_3da394c6155827fb6d9e64f1fdc214a1._comment
new file mode 100644
index 0000000000..97b6c04816
--- /dev/null
+++ b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_4_3da394c6155827fb6d9e64f1fdc214a1._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2024-06-04T10:35:09Z"
+ content="""
+Fixed git was released. All these versions fix it: v2.45.2, v2.39.5, v2.40.3, v2.41.2, v2.42.3, v2.43.5, and v2.44.2
+
+That does leave the fsck.symlinkTargetLength as potentially still a problem
+on Windows, but I think I'll consider this closed and if that does actually
+lead to problems on Windows we can revisit.
+"""]]
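
A check of whether a given git version is one of the fixed releases listed in this comment might be sketched like this. The table is taken from the list above; the function name is hypothetical, and the treatment of series outside the list is an assumption spelled out in the code.

```python
import re

# Minimum patch level carrying the fix, per release series, from the comment.
FIXED_PATCH = {
    (2, 39): 5,
    (2, 40): 3,
    (2, 41): 2,
    (2, 42): 3,
    (2, 43): 5,
    (2, 44): 2,
    (2, 45): 2,
}


def has_fsck_fix(version: str) -> bool:
    """True if `version` (e.g. "2.45.2" or "v2.39.5") is a fixed release."""
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", version)
    if not match:
        raise ValueError(f"unrecognised git version: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    series = (major, minor)
    if series in FIXED_PATCH:
        return patch >= FIXED_PATCH[series]
    # Assumption: series newer than 2.45 carry the fix. Older series have no
    # listed fix release; whether they are affected at all is not stated.
    return series > (2, 45)
```

The input is expected to be the bare version number, not the full output of `git --version`.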

Added a comment: Yes, GitLab fixed!
diff --git a/doc/forum/Strange_symlinkPointsToGitDir_error/comment_12_0a40b66561f52ac0895b593a56b973d2._comment b/doc/forum/Strange_symlinkPointsToGitDir_error/comment_12_0a40b66561f52ac0895b593a56b973d2._comment
new file mode 100644
index 0000000000..3dcfe972d5
--- /dev/null
+++ b/doc/forum/Strange_symlinkPointsToGitDir_error/comment_12_0a40b66561f52ac0895b593a56b973d2._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="nobodyinperson"
+ avatar="http://cdn.libravatar.org/avatar/736a41cd4988ede057bae805d000f4f5"
+ subject="Yes, GitLab fixed!"
+ date="2024-06-04T07:38:47Z"
+ content="""
+I can confirm, `git annex assist` to GitLab.com works again 👍
+"""]]

Added a comment: GitLab fixed?
diff --git a/doc/forum/Strange_symlinkPointsToGitDir_error/comment_11_c49399ed889112cb37bf85362731870d._comment b/doc/forum/Strange_symlinkPointsToGitDir_error/comment_11_c49399ed889112cb37bf85362731870d._comment
new file mode 100644
index 0000000000..f9bea776fc
--- /dev/null
+++ b/doc/forum/Strange_symlinkPointsToGitDir_error/comment_11_c49399ed889112cb37bf85362731870d._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="datamanager"
+ avatar="http://cdn.libravatar.org/avatar/7d4ca7c5e571d4740ef072b83a746c12"
+ subject="GitLab fixed?"
+ date="2024-06-04T01:18:25Z"
+ content="""
+I believe the problem has been corrected, at least on GitLab; I was just able to synchronize my changes there. 
+"""]]

root cause analysis
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_4_c818e240e0d0ed8da3733e574694cc44._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_4_c818e240e0d0ed8da3733e574694cc44._comment
index 4e25ea3ec6..39a8f71865 100644
--- a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_4_c818e240e0d0ed8da3733e574694cc44._comment
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_4_c818e240e0d0ed8da3733e574694cc44._comment
@@ -6,8 +6,4 @@
 Spotchecked a few other OpenNeuro datasets in the same numeric range and
 they seem ok, so this may have been a 1-off problem. It would be good to
 check all 1.1k datasets.
-
-One possible way this might happen: git-annex is performing a transition,
-and gets interrupted after making the commit starting the new git-annex
-branch, but before it can graft in the export trees.
 """]]
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_5_0998f6d67fc6327ecbbb3c23ad7f5275._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_5_0998f6d67fc6327ecbbb3c23ad7f5275._comment
new file mode 100644
index 0000000000..5e9fb1bc0d
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_5_0998f6d67fc6327ecbbb3c23ad7f5275._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 5"""
+ date="2024-06-03T17:28:02Z"
+ content="""
+performTransitionsLocked, when `neednewlocalbranch = True`,
+first writes the new git-annex branch, and then calls
+regraftexports which adds a second commit onto it.
+
+In the window before regraftexports finishes, interrupting git-annex will
+leave the repository in this state.
+
+There may be some other way this could happen, but that seems like a likely
+cause. It needs to avoid updating the git-annex branch ref until it's
+grafted the exports into it.
+"""]]

Added a comment
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_5_1cd00e099b48d26557268924029709bc._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_5_1cd00e099b48d26557268924029709bc._comment
new file mode 100644
index 0000000000..876de47088
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_5_1cd00e099b48d26557268924029709bc._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="comment 5"
+ date="2024-06-03T17:54:42Z"
+ content="""
+That is how I detected this one: I am going through all of them so they can be forked/pushed to https://datasets.datalad.org/?dir=/openneuro, so I think it is indeed just this one so far.
+"""]]

comment
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_4_c818e240e0d0ed8da3733e574694cc44._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_4_c818e240e0d0ed8da3733e574694cc44._comment
new file mode 100644
index 0000000000..4e25ea3ec6
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_4_c818e240e0d0ed8da3733e574694cc44._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2024-06-03T16:30:34Z"
+ content="""
+Spotchecked a few other OpenNeuro datasets in the same numeric range and
+they seem ok, so this may have been a 1-off problem. It would be good to
+check all 1.1k datasets.
+
+One possible way this might happen: git-annex is performing a transition,
+and gets interrupted after making the commit starting the new git-annex
+branch, but before it can graft in the export trees.
+"""]]

a small clarification
diff --git a/doc/bugs/git-remote-annex_doesn__39__t_work_on_Windows___40__perms__41__.mdwn b/doc/bugs/git-remote-annex_doesn__39__t_work_on_Windows___40__perms__41__.mdwn
index c67097ff66..df2d612a32 100644
--- a/doc/bugs/git-remote-annex_doesn__39__t_work_on_Windows___40__perms__41__.mdwn
+++ b/doc/bugs/git-remote-annex_doesn__39__t_work_on_Windows___40__perms__41__.mdwn
@@ -1,8 +1,8 @@
 ### Please describe the problem.
 
-The new git-remote-annex functionality doesn't appear to work due to not being able
-to delete temporary files (problem with thawing content, perhaps). Somehow, the
-temp files (e.g. GITBUNDLE objects) seem to have their readonly attribute stubbornly
+The new git-remote-annex functionality doesn't appear to work on Windows due to git-annex
+not being able to delete temporary files (problem with thawing content, perhaps). Somehow, the
+temp files (e.g. GITBUNDLE objects) seem to have their read-only attribute stubbornly
 set (from git-annex's view).
 
 On a happier note, git-remote-annex works just fine on WSL2 operating annexes and

report on git-remote-annex on Windows not quite working
diff --git a/doc/bugs/git-remote-annex_doesn__39__t_work_on_Windows___40__perms__41__.mdwn b/doc/bugs/git-remote-annex_doesn__39__t_work_on_Windows___40__perms__41__.mdwn
new file mode 100644
index 0000000000..c67097ff66
--- /dev/null
+++ b/doc/bugs/git-remote-annex_doesn__39__t_work_on_Windows___40__perms__41__.mdwn
@@ -0,0 +1,498 @@
+### Please describe the problem.
+
+The new git-remote-annex functionality doesn't appear to work due to not being able
+to delete temporary files (problem with thawing content, perhaps). Somehow, the
+temp files (e.g. GITBUNDLE objects) seem to have their readonly attribute stubbornly
+set (from git-annex's view).
+
+On a happier note, git-remote-annex works just fine on WSL2 operating annexes and
+directory special remotes on Windows' side (via drvfs/9p). :)
+
+### What steps will reproduce the problem?
+
+[[!format sh """
+E:\git-annex-tests\test-git-remote-annex> ls ~\bin\git-remote-annex
+
+    Directory: C:\Users\jkniiv\bin
+
+Mode                 LastWriteTime         Length Name
+----                 -------------         ------ ----
+la---           12.5.2024    23:03              0 git-remote-annex -> ..\.local\bin\git-annex.exe
+
+E:\git-annex-tests\test-git-remote-annex> ls
+
+    Directory: E:\git-annex-tests\test-git-remote-annex
+
+Mode                 LastWriteTime         Length Name
+----                 -------------         ------ ----
+d----            1.6.2024    23:08                annex-a
+d----            1.6.2024    23:22                annex-b
+d----            1.6.2024    23:13                dirremote-a
+d----            1.6.2024    23:24                dirremote-b
+
+E:\git-annex-tests\test-git-remote-annex> mkdir annex-c,dirremote-c
+
+    Directory: E:\git-annex-tests\test-git-remote-annex
+
+Mode                 LastWriteTime         Length Name
+----                 -------------         ------ ----
+d----            1.6.2024    23:26                annex-c
+d----            1.6.2024    23:26                dirremote-c
+
+E:\git-annex-tests\test-git-remote-annex> cd annex-c
+E:\git-annex-tests\test-git-remote-annex\annex-c> git init
+Initialized empty Git repository in E:/git-annex-tests/test-git-remote-annex/annex-c/.git/
+E:\git-annex-tests\test-git-remote-annex\annex-c [master]> git annex init annex-c
+init annex-c
+  Detected a filesystem without fifo support.
+
+  Disabling ssh connection caching.
+
+  Detected a crippled filesystem.
+
+  Entering an adjusted branch where files are unlocked as this filesystem does not support locked files.
+
+Switched to branch 'adjusted/master(unlocked)'
+ok
+(recording state in git...)
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> git annex initremote test-dir type=directory encryption=none directory=E:\git-annex-tests\test-git-remote-annex\dirremote-c --with-url
+initremote test-dir ok
+(recording state in git...)
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> cat .git\config
+[core]
+        repositoryformatversion = 0
+        filemode = false
+        bare = false
+        logallrefupdates = true
+        symlinks = false
+        ignorecase = true
+[annex]
+        uuid = 0b9fd211-24fe-403f-a143-e2a676f9bc64
+        sshcaching = false
+        crippledfilesystem = true
+        version = 10
+[filter "annex"]
+        smudge = git-annex smudge -- %f
+        clean = git-annex smudge --clean -- %f
+        process = git-annex filter-process
+[remote "test-dir"]
+        annex-directory = E:\\git-annex-tests\\test-git-remote-annex\\dirremote-c
+        annex-uuid = cf20a53e-1773-4fe8-ad44-f8c825a6d42f
+        url = annex::
+        fetch = +refs/heads/*:refs/remotes/test-dir/*
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> echo one > a-1
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +1 ~0 -0 !]> echo two > b-2
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +2 ~0 -0 !]> echo three > c-3
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +3 ~0 -0 !]> ls
+
+    Directory: E:\git-annex-tests\test-git-remote-annex\annex-c
+
+Mode                 LastWriteTime         Length Name
+----                 -------------         ------ ----
+-a---            1.6.2024    23:28              5 a-1
+-a---            1.6.2024    23:28              5 b-2
+-a---            1.6.2024    23:28              7 c-3
+
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +3 ~0 -0 !]> git annex add a-1 b-2
+add a-1
+ok
+add b-2
+ok
+(recording state in git...)
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +2 ~0 -0 | +1 ~0 -0 !]> git add c-3
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +3 ~0 -0 ~]> git status
+On branch adjusted/master(unlocked)
+Changes to be committed:
+  (use "git restore --staged <file>..." to unstage)
+        new file:   a-1
+        new file:   b-2
+        new file:   c-3
+
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked) +3 ~0 -0 ~]> git commit -m 'Added some files'
+[adjusted/master(unlocked) 088853f] Added some files
+ 3 files changed, 3 insertions(+)
+ create mode 100644 a-1
+ create mode 100644 b-2
+ create mode 100644 c-3
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> git log
+commit 088853f033ef46688dd4ded801e506a035e0f65b (HEAD -> adjusted/master(unlocked))
+Author: Jarkko Kniivilä <jkniiv@REDACTED>
+Date:   Sat Jun 1 23:29:44 2024 +0300
+
+    Added some files
+
+commit 38b32aed3f1985f92c943904ed64bce7006c728c
+Author: Jarkko Kniivilä <jkniiv@REDACTED>
+Date:   Sat Jun 1 23:27:47 2024 +0300
+
+    git-annex adjusted branch
+
+commit 1b7e0f9e661e78fee6e0262c857c5e57c0b0ee47 (master)
+Author: Jarkko Kniivilä <jkniiv@REDACTED>
+Date:   Sat Jun 1 23:27:47 2024 +0300
+
+    commit before entering adjusted branch
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> git log master
+commit 1b7e0f9e661e78fee6e0262c857c5e57c0b0ee47 (master)
+Author: Jarkko Kniivilä <jkniiv@REDACTED>
+Date:   Sat Jun 1 23:27:47 2024 +0300
+
+    commit before entering adjusted branch
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> git annex adjust --unlock
+adjust  ok
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> git log master
+commit c275f9426ca2f587702c1212566bbdc5d09b620c (master)
+Author: Jarkko Kniivilä <jkniiv@REDACTED>
+Date:   Sat Jun 1 23:29:44 2024 +0300
+
+    Added some files
+
+commit 1b7e0f9e661e78fee6e0262c857c5e57c0b0ee47
+Author: Jarkko Kniivilä <jkniiv@REDACTED>
+Date:   Sat Jun 1 23:27:47 2024 +0300
+
+    commit before entering adjusted branch
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> git push test-dir master
+Full remote url: annex::cf20a53e-1773-4fe8-ad44-f8c825a6d42f?encryption=none&type=directory
+To annex::
+ * [new branch]      master -> master
+E:\git-annex-tests\test-git-remote-annex\annex-c [adjusted/master(unlocked)]> git -c annex.debug annex sync --no-commit --content test-dir
+[2024-06-01 23:39:17.4962365] (Utility.Process) process [24708] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","git-annex"]
+[2024-06-01 23:39:17.5342304] (Utility.Process) process [24708] done ExitSuccess
+[2024-06-01 23:39:17.5442429] (Utility.Process) process [22804] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/git-annex"]
+[2024-06-01 23:39:17.5812289] (Utility.Process) process [22804] done ExitSuccess
+[2024-06-01 23:39:17.6172278] (Utility.Process) process [12588] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","hash-object","-w","--no-filters","--stdin-paths"]
+[2024-06-01 23:39:17.6262323] (Utility.Process) process [31796] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
+[2024-06-01 23:39:17.6392313] (Utility.Process) process [18380] feed: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","update-index","-z","--index-info"]
+[2024-06-01 23:39:17.6492327] (Utility.Process) process [31736] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","diff-index","--raw","-z","-r","--no-renames","-l0","--cached","refs/heads/git-annex","--"]
+[2024-06-01 23:39:17.6902288] (Utility.Process) process [31736] done ExitSuccess
+[2024-06-01 23:39:17.6952283] (Utility.Process) process [18380] done ExitSuccess
+[2024-06-01 23:39:17.7112279] (Utility.Process) process [20832] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","symbolic-ref","-q","HEAD"]
+[2024-06-01 23:39:17.7422277] (Utility.Process) process [20832] done ExitSuccess
+[2024-06-01 23:39:17.7482276] (Utility.Process) process [28880] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","refs/heads/adjusted/master(unlocked)"]
+[2024-06-01 23:39:17.7822246] (Utility.Process) process [28880] done ExitSuccess
+[2024-06-01 23:39:17.7912266] (Utility.Process) process [29536] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--verify","-q","refs/heads/synced/master"]
+[2024-06-01 23:39:17.8202293] (Utility.Process) process [29536] done ExitFailure 1
+[2024-06-01 23:39:17.8262314] (Utility.Process) process [15136] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","log","refs/heads/adjusted/master(unlocked)..refs/heads/master","--pretty=%H","-n1"]
+[2024-06-01 23:39:17.862228] (Utility.Process) process [15136] done ExitSuccess
+merge master [2024-06-01 23:39:17.8712316] (Utility.Process) process [24664] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/basis/adjusted/master(unlocked)"]
+[2024-06-01 23:39:17.9032575] (Utility.Process) process [24664] done ExitSuccess
+[2024-06-01 23:39:17.9082289] (Utility.Process) process [8688] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
+[2024-06-01 23:39:17.940232] (Utility.Process) process [4772] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","log","refs/basis/adjusted/master(unlocked)..refs/heads/adjusted/master(unlocked)","--pretty=%H","--reverse"]
+[2024-06-01 23:39:17.9832314] (Utility.Process) process [4772] done ExitSuccess
+[2024-06-01 23:39:17.9952296] (Utility.Process) process [20876] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show","-z","--raw","--no-renames","-l0","--no-abbrev","--pretty=","--raw","088853f033ef46688dd4ded801e506a035e0f65b"]
+[2024-06-01 23:39:18.0272323] (Utility.Process) process [7580] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)"]
+[2024-06-01 23:39:18.0782375] (Utility.Process) process [21852] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","mktree","--missing","--batch","-z"]
+[2024-06-01 23:39:18.093233] (Utility.Process) process [33064] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","ls-tree","--full-tree","-z","-r","-t","--","c275f9426ca2f587702c1212566bbdc5d09b620c"]
+[2024-06-01 23:39:18.1302264] (Utility.Process) process [33064] done ExitSuccess
+[2024-06-01 23:39:18.1402283] (Utility.Process) process [21852] done ExitSuccess
+[2024-06-01 23:39:18.1402283] (Utility.Process) process [20876] done ExitSuccess
+[2024-06-01 23:39:18.1472316] (Utility.Process) process [18108] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","commit-tree","fdd574ad3eff386ce645e69b0160895f6633a131","--no-gpg-sign","-p","c275f9426ca2f587702c1212566bbdc5d09b620c","-m","Added some files\n"]
+[2024-06-01 23:39:18.1872297] (Utility.Process) process [18108] done ExitSuccess
+[2024-06-01 23:39:18.1932381] (Utility.Process) process [20276] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","update-ref","refs/basis/adjusted/master(unlocked)","864609e5e29506f6376c01a180aff55632be2bee"]
+[2024-06-01 23:39:18.2302311] (Utility.Process) process [20276] done ExitSuccess
+[2024-06-01 23:39:18.24023] (Utility.Process) process [24636] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/master"]

(Diff truncated)
add news item for git-annex 10.20240531
diff --git a/doc/news/version_10.20231129.mdwn b/doc/news/version_10.20231129.mdwn
deleted file mode 100644
index 103697888d..0000000000
--- a/doc/news/version_10.20231129.mdwn
+++ /dev/null
@@ -1,22 +0,0 @@
-git-annex 10.20231129 released with [[!toggle text="these changes"]]
-[[!toggleable text="""  * Fix bug in git-annex copy --from --to that skipped files that were
-    locally present.
-  * Make git-annex copy --from --to --fast actually fast.
-  * Fix crash of enableremote when the special remote has embedcreds=yes.
-  * Ignore directories and other unusual files in .git/annex/journal/
-  * info: Added calculation of combined annex size of all repositories.
-  * log: Added options --sizesof, --sizes and --totalsizes that
-    display how the size of repositories changed over time.
-  * log: Added options --interval, --bytes, --received, and --gnuplot
-    to tune the output of the above added options.
-  * findkeys: Support --largerthan and --smallerthan.
-  * importfeed: Use caching database to avoid needing to list urls
-    on every run, and avoid using too much memory.
-  * Improve memory use of --all when using annex.private.
-  * lookupkey: Sped up --batch.
-  * Windows: Consistently avoid ending standard output lines with CR.
-    This matches the behavior of git on Windows.
-  * Windows: Fix CRLF handling in some log files.
-  * Windows: When git-annex init is installing hook scripts, it will
-    avoid ending lines with CR for portability. Existing hook scripts
-    that do have CR line endings will not be changed."""]]
\ No newline at end of file
diff --git a/doc/news/version_10.20240531.mdwn b/doc/news/version_10.20240531.mdwn
new file mode 100644
index 0000000000..3ee5149461
--- /dev/null
+++ b/doc/news/version_10.20240531.mdwn
@@ -0,0 +1,16 @@
+git-annex 10.20240531 released with [[!toggle text="these changes"]]
+[[!toggleable text="""  * git-remote-annex: New program which allows pushing a git repo to a
+    git-annex special remote, and cloning from a special remote.
+    (Based on Michael Hanke's git-remote-datalad-annex.)
+  * initremote, enableremote: Added --with-url to enable using
+    git-remote-annex.
+  * When building an adjusted unlocked branch, make pointer files
+    executable when the annex object file is executable.
+  * group: Added --list option.
+  * fsck: Fix recent reversion that made it say it was checksumming files
+    whose content is not present.
+  * Avoid the --fast option preventing checksumming in some cases it
+    was not supposed to.
+  * testremote: Really fsck downloaded objects.
+  * Typo fixes.
+    Thanks, Yaroslav Halchenko"""]]
\ No newline at end of file

some analysis
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_1_d80d61a68d20813a4bf3a8e7e7a8ca9f._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_1_d80d61a68d20813a4bf3a8e7e7a8ca9f._comment
new file mode 100644
index 0000000000..aab4ae2c9c
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_1_d80d61a68d20813a4bf3a8e7e7a8ca9f._comment
@@ -0,0 +1,32 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2024-05-31T14:52:17Z"
+ content="""
+Reproduction recipe works, thanks!
+
+Happens back to 10.20240129 at least, this is not recent breakage.
+
+There are some interesting things in the git-annex history.
+Including some git-annex:export.tree grafting, and also
+a continued transition.
+
+I made a new empty repo, initialized and annexed some files. Running the
+same script but cloning that, this problem does not occur. I also tried
+exporting a tree in that repo, and still the problem doesn't occur. I even
+tried running `git-annex forget` in there and still can't cause the
+problem.
+
+So something about this specific repo's git-annex branch history is
+triggering the problem and I don't know what. I've archived the current
+state of this repo in my big repo as git-annex-test-repos/ds002144.tar.gz
+to make sure I can continue to reproduce this.
+
+The first git-annex branch commit that is missing its tree object
+is a git-annex:export.tree graft commit. That is 3 commits above
+the git-annex branch pulled from github:
+
+Very interesting. Especially since the point of those export.tree graft commits
+is to make sure that the exported tree objects are referenced and so don't get
+gced out from under us.
+"""]]
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_2_c615b185b48d0ac08c0b332fe8e5760a._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_2_c615b185b48d0ac08c0b332fe8e5760a._comment
new file mode 100644
index 0000000000..00bfab72fd
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_2_c615b185b48d0ac08c0b332fe8e5760a._comment
@@ -0,0 +1,54 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2024-05-31T15:26:26Z"
+ content="""
+Resetting the repo's git-annex branch all the way back to the 1st commit in it
+is sufficient to reproduce this bug.
+
+	joey@darkstar:~/tmp/ds002144#main>git log git-annex
+	commit 2e24112747f3742c5426138def93fd3219574df7 (git-annex)
+	Author: Git Worker <git@openneuro.org>
+	Date:   Fri Jan 19 21:04:18 2024 +0000
+	
+	    new branch for transition ["forget git history"]
+
+Hmm. That ref contains an export.log that references some tree shas.
+
+	1596548650.649001679s 2606f878-85c6-459a-8402-5f4b98720bbd:58a4efbe-8fb4-4cb3-8be3-b982a4673947 b78b723042e6d7a967c806b52258e8554caa1696 ae2937297eb1b4f6c9bfdfcf9d7a41b1adcea32e
+	1705698180.15617956s 2606f878-85c6-459a-8402-5f4b98720bbd:8af9d961-216f-47ec-b052-31696fc2f12d ae2937297eb1b4f6c9bfdfcf9d7a41b1adcea32e 28b655e8207f916122bbcbd22c0369d86bb4ffc1
+
+Those seem familiar:
+
+	missing tree b78b723042e6d7a967c806b52258e8554caa1696
+	missing tree ae2937297eb1b4f6c9bfdfcf9d7a41b1adcea32e
+
+So ok.. We have here a transition that forgot git history. But it kept an
+export.log that referenced 2 trees in that now-forgotten git history.
+
+Everything else seems to follow from that. Grafting those trees back into the
+git-annex branch in order to not forget them is a bad move since they're
+already forgotten. So it could just avoid doing that, if the tree object
+is missing, I suppose.
+
+There might be a deeper bug though: If we want to `git-annex export`, in either
+the original repo with forgotten history, or in a clone, it won't be able to
+refer to those tree objects. So it won't know what has been written to the
+special remote. So eg, if we export a tree that deletes a file compared to one
+of these trees, it wouldn't delete the file from the special remote.
+I think this problem might not happen when exporting in the original repo,
+because there the export database also records the same information. More likely
+it will happen in a clone.
+
+So, action items:
+
+* When performing a transition, the trees mentioned in export.log need to be
+  grafted back in, in order not to lose them. I think it already is supposed to
+  do that, but it clearly didn't work in this case. So I need to find a way to
+  reproduce the situation in commit 2e24112747f3742c5426138def93fd3219574df7 in
+  a new repository to find out why that didn't happen. And fix that.
+* When encountering a git-annex branch with this situation in it, avoid
+  grafting missing trees back into the branch. And probably `git-annex export`
+  needs to refuse to touch the affected special remote, or warn the user
+  that it's lost track of what files were sent to the special remote.
+"""]]
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__/comment_3_a4ecfaa2f8a050a179398c4b01f018a9._comment b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_3_a4ecfaa2f8a050a179398c4b01f018a9._comment
new file mode 100644
index 0000000000..da8af4031c
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__/comment_3_a4ecfaa2f8a050a179398c4b01f018a9._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2024-05-31T15:42:17Z"
+ content="""
+Occurs to me that one way to get a repository into this situation would be
+to do a `git-annex export`, then `git-annex forget`, and then manually
+reset the git-annex branch to `git-annex^^` (or similarly push
+`git-annex^^` to origin).
+
+There is a commit after the transition commit that re-grafts the exported
+tree back into the git-annex branch, and a manual reset would cause exactly
+this situation.
+
+I doubt OpenNeuro is manually resetting the git-annex branch when creating
+these repos, but stranger things have happened...
+"""]]
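
A minimal sketch of the check that comment 2 proposes: find trees that export.log mentions but that are no longer in the repository, so they can be skipped or warned about. It assumes, based only on the two export.log lines quoted above, that each line is a timestamp, a `uuid:uuid` pair, and then one or more tree shas. The function names `export_log_trees` and `missing_export_trees` are hypothetical and not part of git-annex.

```shell
# Hypothetical helper: print the unique tree shas named in export.log
# lines read from stdin. Assumes fields 3 and onward are tree shas.
export_log_trees() {
	awk '{ for (i = 3; i <= NF; i++) print $i }' | LC_ALL=C sort -u
}

# Hypothetical helper: report trees that the git-annex branch's
# export.log mentions but that are absent from the object store.
missing_export_trees() {
	git show git-annex:export.log 2>/dev/null | export_log_trees |
		while read -r sha; do
			git cat-file -e "${sha}^{tree}" 2>/dev/null ||
				echo "missing tree $sha"
		done
}
```

Running `missing_export_trees` in a repository in the state described above should name the same trees that `git fsck` reports as missing, though that has not been verified against the affected dataset.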

report on git repo getting broken
diff --git a/doc/bugs/annex_merge__breaks_git_repository__33__.mdwn b/doc/bugs/annex_merge__breaks_git_repository__33__.mdwn
new file mode 100644
index 0000000000..c453f3b39c
--- /dev/null
+++ b/doc/bugs/annex_merge__breaks_git_repository__33__.mdwn
@@ -0,0 +1,55 @@
+### Please describe the problem.
+
+References of the struggle with more background:
+- [https://github.com/datalad/datalad/issues/7608](https://github.com/datalad/datalad/issues/7608)
+- [https://github.com/datalad/datalad/issues/7609](https://github.com/datalad/datalad/issues/7609)
+
+### What steps will reproduce the problem?
+
+```
+$> (set -e; cd /tmp/; rm -rf ds002144*; git clone http://github.com/OpenNeuroDatasets/ds002144 ; cd ds002144; git fsck; mkdir /tmp/ds002144-2; (cd /tmp/ds002144-2; git init; git annex init; ); git remote add --fetch datalad-public /tmp/ds002144-2; git fsck; git annex merge; git fsck; )
+
+```
+
+### What version of git-annex are you using? On what operating system?
+
+```
+10.20240430+git26-g5f61667f27-1~ndall+1
+```
+
+### Please provide any additional information below.
+
+[[!format sh """
+
+$> (set -ex; cd /tmp/; rm -rf ds002144*; git clone http://github.com/OpenNeuroDatasets/ds002144 ; cd ds002144; git fsck; mkdir /tmp/ds002144-2; (cd /tmp/ds002144-2; git init; git annex init; ); git remote add --fetch datalad-public /tmp/ds002144-2; git fsck; git annex merge; git fsck; )
+...
++/bin/zsh:80> git fsck
+Checking object directories: 100% (256/256), done.
+Checking objects: 100% (4759/4759), done.
++/bin/zsh:80> git annex merge
+(merging datalad-public/git-annex into git-annex...)
+(recording state in git...)
+
+  Remote origin not usable by git-annex; setting annex-ignore
+
+  http://github.com/OpenNeuroDatasets/ds002144/config download failed: Not Found
+merge git-annex ok
++/bin/zsh:80> git fsck
+Checking object directories: 100% (256/256), done.
+Checking objects: 100% (4759/4759), done.
+broken link from    tree 4089998623737d39cd3f5d6fdfa89b164898e464
+              to    tree ae2937297eb1b4f6c9bfdfcf9d7a41b1adcea32e
+broken link from    tree 8ba58233cd121b97d5c918a6ba7c3a8c56fd38b1
+              to    tree b78b723042e6d7a967c806b52258e8554caa1696
+missing tree b78b723042e6d7a967c806b52258e8554caa1696
+missing tree ae2937297eb1b4f6c9bfdfcf9d7a41b1adcea32e
+
+"""]]
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+there are good and there are some bad days ;)
+
+[[!meta author=yoh]]
+[[!tag projects/openneuro]]
+

reporting that annex merge should not merge into main branch
diff --git a/doc/bugs/annex_merge_mustn__39__t_merge_into_non-git-annex_branc.mdwn b/doc/bugs/annex_merge_mustn__39__t_merge_into_non-git-annex_branc.mdwn
new file mode 100644
index 0000000000..f96eee5042
--- /dev/null
+++ b/doc/bugs/annex_merge_mustn__39__t_merge_into_non-git-annex_branc.mdwn
@@ -0,0 +1,157 @@
+### Please describe the problem.
+
+Somehow I ended up with the case that git remote (over ssh) had no "main" branch pushed, only "git-annex" branch which was not yet merged into local git-annex branch.
+More details at [https://github.com/datalad/datalad/issues/7608](https://github.com/datalad/datalad/issues/7608).
+
+git-annex info:
+
+<details>
+<summary>git-annex info shows datalad-public remote and all looks ok</summary> 
+
+```shell
+(git)smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro/ds002144[main]git
+$> git annex info
+trusted repositories: 0
+semitrusted repositories: 7
+        00000000-0000-0000-0000-000000000001 -- web
+        00000000-0000-0000-0000-000000000002 -- bittorrent
+        2606f878-85c6-459a-8402-5f4b98720bbd -- root@openneuro-prod-dataset-worker-14:/datalad/ds002144
+        58a4efbe-8fb4-4cb3-8be3-b982a4673947 -- s3-PRIVATE
+        72279570-1519-43aa-aea8-6f9c3a6f72f4 -- yoh@smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro/ds002144 [here]
+        ebcbc36d-4230-46b8-8654-87876ed1af0f -- yoh@falkor:/srv/datasets.datalad.org/www/openneuro/ds002144 [datalad-public]
+        fb99edfc-26aa-40ac-b770-996e91421e88 -- [s3-PUBLIC]
+untrusted repositories: 0
+transfers in progress: none
+available local disk space: 23.15 terabytes (+100 megabytes reserved)
+local annex keys: 0
+local annex size: 0 bytes
+annexed files in working tree: 288
+size of annexed files in working tree: 84.49 gigabytes
+combined annex size of all repositories: 257.38 gigabytes (+ 90 unknown size)
+annex sizes of repositories:
+        88.35 GB: 2606f878-85c6-459a-8402-5f4b98720bbd -- root@openneuro-prod-dataset-worker-14:/datalad/ds002144
+        85.93 GB: fb99edfc-26aa-40ac-b770-996e91421e88 -- [s3-PUBLIC]
+         83.1 GB: 58a4efbe-8fb4-4cb3-8be3-b982a4673947 -- s3-PRIVATE
+backend usage:
+        MD5E: 288
+bloom filter size: 32 mebibytes (0% full)
+
+```
+</details>
+
+But the datalad-public/git-annex branch is not yet merged into the local git-annex branch, and `git annex merge`, even though it says "ok", in reality does nothing - it remains unmerged.
+
+```
+(git)smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro/ds002144[main]git
+$> git annex merge --debug
+merge git-annex [2024-05-31 09:46:03.819131714] (Utility.Process) process [2601180] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","git-annex"]
+[2024-05-31 09:46:03.821638775] (Utility.Process) process [2601180] done ExitSuccess
+[2024-05-31 09:46:03.822255445] (Utility.Process) process [2601181] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"]
+[2024-05-31 09:46:03.825268573] (Utility.Process) process [2601181] done ExitSuccess
+ok
+[2024-05-31 09:46:03.826008662] (Utility.Process) process [2601184] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","symbolic-ref","-q","HEAD"]
+[2024-05-31 09:46:03.829119829] (Utility.Process) process [2601184] done ExitSuccess
+[2024-05-31 09:46:03.829536231] (Utility.Process) process [2601187] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","refs/heads/main"]
+[2024-05-31 09:46:03.833612799] (Utility.Process) process [2601187] done ExitSuccess
+[2024-05-31 09:46:03.834740598] (Utility.Process) process [2601188] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch"]
+[2024-05-31 09:46:03.842123731] (Utility.Process) process [2601192] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--verify","-q","refs/heads/synced/main"]
+[2024-05-31 09:46:03.846432269] (Utility.Process) process [2601192] done ExitFailure 1
+[2024-05-31 09:46:03.846903191] (Utility.Process) process [2601188] done ExitSuccess
+(dev3) 2 12816.....................................:Fri 31 May 2024 09:46:03 AM EDT:.
+
+$> echo $?
+0
+
+$> git br -a
+  git-annex
+* main
+  remotes/datalad-public/git-annex
+  remotes/datalad-public/main
+  remotes/origin/HEAD -> origin/main
+  remotes/origin/git-annex
+  remotes/origin/main
+  remotes/origin/master
+```
+
+so I thought to force the annex merge manually expecting that git-annex would do its merge into `git-annex` branch, but it does it into current `main`:
+
+```
+(git)smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro/ds002144[main]git
+$> git br -a
+  git-annex
+* main
+  remotes/datalad-public/git-annex
+  remotes/datalad-public/main
+  remotes/origin/HEAD -> origin/main
+  remotes/origin/git-annex
+  remotes/origin/main
+  remotes/origin/master
+
+$> git branch
+  git-annex
+* main
+
+$> git annex merge --allow-unrelated-histories datalad-public/git-annex
+merge datalad-public/git-annex
+Merge made by the 'ort' strategy.
+ uuid.log | 1 +
+ 1 file changed, 1 insertion(+)
+ create mode 100644 uuid.log
+ok
+
+$> git show
+commit ce8ade971ead660c4dccc5cc1214a894fbfd65a2 (HEAD -> main)
+Merge: 99006dd a7cd458
+Author: Yaroslav Halchenko <debian@onerussian.com>
+Date:   Fri May 31 09:47:12 2024 -0400
+
+    Merge remote-tracking branch 'datalad-public/git-annex'
+
+    * datalad-public/git-annex:
+      update
+      branch created
+
+```
+
+<details>
+<summary>FWIW here is again with --debug</summary> 
+
+```shell
+(git)smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro/ds002144[main]git
+$> git reset --hard origin/main
+HEAD is now at 99006dd [DATALAD] added content
+(dev3) 2 12822.....................................:Fri 31 May 2024 09:47:57 AM EDT:.
+(git)smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro/ds002144[main]git
+$> git annex merge --debug --allow-unrelated-histories datalad-public/git-annex
+merge datalad-public/git-annex [2024-05-31 09:48:03.27116017] (Utility.Process) process [2606124] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","symbolic-ref","-q","HEAD"]
+[2024-05-31 09:48:03.275932771] (Utility.Process) process [2606124] done ExitSuccess
+[2024-05-31 09:48:03.276628862] (Utility.Process) process [2606125] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","refs/heads/main"]
+[2024-05-31 09:48:03.281249999] (Utility.Process) process [2606125] done ExitSuccess
+
+[2024-05-31 09:48:03.282011289] (Utility.Process) process [2606126] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/main"]
+[2024-05-31 09:48:03.284998952] (Utility.Process) process [2606126] done ExitSuccess
+[2024-05-31 09:48:03.286186327] (Utility.Process) process [2606127] read: git ["--version"]
+[2024-05-31 09:48:03.28861691] (Utility.Process) process [2606127] done ExitSuccess
+[2024-05-31 09:48:03.289258819] (Utility.Process) process [2606128] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","merge","--no-edit","datalad-public/git-annex","--allow-unrelated-histories"]
+Merge made by the 'ort' strategy.
+ uuid.log | 1 +
+ 1 file changed, 1 insertion(+)
+ create mode 100644 uuid.log
+[2024-05-31 09:48:03.398901654] (Utility.Process) process [2606147] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","filter.annex.smudge=","-c","filter.annex.clean=","-c","filter.annex.process=","write-tree"]
+[2024-05-31 09:48:03.40183409] (Utility.Process) process [2606147] done ExitSuccess
+[2024-05-31 09:48:03.402786193] (Utility.Process) process [2606148] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/annex/last-index"]
+[2024-05-31 09:48:03.406085518] (Utility.Process) process [2606148] done ExitSuccess
+[2024-05-31 09:48:03.424223834] (Utility.Process) process [2606128] done ExitSuccess
+ok
+
+```
+</details>
+
+### What version of git-annex are you using? On what operating system?
+
+
+10.20240430+git26-g5f61667f27-1~ndall+1
+
+[[!meta author=yoh]]
+[[!tag projects/openneuro]]
+

tweak
diff --git a/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn b/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
index a25eee8c4c..0e79e3d337 100644
--- a/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
+++ b/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
@@ -84,8 +84,8 @@ to access it:
 
 	git-annex initremote foohttp --with-url --sameas=foo type=httpalso url=https://example.com/foo/
 
-Be sure to remember to include exporttree=yes if the remote is configured
-that way.
+Be sure to remember to include exporttree=yes if the special remote is
+configured that way.
 
 Once a httpalso remote is set up like this, `git fetch` from it to display
 its full annex:: url. That url can be shared with others to let them clone

tweak
diff --git a/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn b/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
index f8605ca6f1..a25eee8c4c 100644
--- a/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
+++ b/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
@@ -76,6 +76,7 @@ But this won't work with special remotes that are configured with
 If the content of a special remote gets written to some location that is
 published to http, you can use the [[special_remotes/httpalso]] special
 remote with [[git-remote-annex]] to `git clone` and `git pull` over http.
+(It's readonly, so no pushing.)
 
 For example, if your directory special remote named "foo" is published
 at `https://example.com/foo/`, set up the httpalso remote like this

tweak
diff --git a/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn b/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
index 83ee670889..f8605ca6f1 100644
--- a/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
+++ b/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
@@ -75,7 +75,7 @@ But this won't work with special remotes that are configured with
 
 If the content of a special remote gets written to some location that is
 published to http, you can use the [[special_remotes/httpalso]] special
-remote with [[git-remote-annex]] to `git clone` and `git fetch` over http.
+remote with [[git-remote-annex]] to `git clone` and `git pull` over http.
 
 For example, if your directory special remote named "foo" is published
 at `https://example.com/foo/`, set up the httpalso remote like this

tweaks
diff --git a/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn b/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
index 31e6828389..83ee670889 100644
--- a/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
+++ b/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
@@ -5,7 +5,7 @@ repository alongside the annexed files. So you can `git pull`, `git push`, and
 even `git clone` from a special remote.
 
 In order to use [[git-remote-annex]], the special remote needs to have
-its url set to an url starting with "annex::".
+its url configured to something starting with "annex::".
 
 Special remotes are not configured with such an url by default,
 but it can easily be set by using the `--with-url` parameter
@@ -23,7 +23,7 @@ Or you could configure it manually:
     git config remote.foo.fetch '+refs/heads/*:refs/remotes/foo/*'
 
 Now you can `git push foo` and `git pull foo`. And commands like
-`git-annex sync` will also use foo as a git repository.
+`git-annex sync` will also use foo as a git remote.
 
 You can even `git clone` from the special remote. To do that, you need
 an url that tells git-annex all about the special remote's configuration.

tip
diff --git a/doc/git-remote-annex.mdwn b/doc/git-remote-annex.mdwn
index 3238f7f15a..35bdd8751e 100644
--- a/doc/git-remote-annex.mdwn
+++ b/doc/git-remote-annex.mdwn
@@ -78,25 +78,6 @@ time, for one of the pushes to be overwritten by the other one. In this
 situation, the overwritten push will appear to have succeeded, but pulling
 later will show the true situation.
 
-# HTTPALSO
-
-If the content of a special remote is published via http, a httpalso
-special remote can be initialized, and used to `git clone` and `git fetch`
-over http.
-
-For example, a directory special remote named "foo" is published
-at `https://example.com/foo/`, set up the httpalso remote like this
-to access it:
-
-    git-annex initremote foohttp --with-url --sameas=foo type=httpalso url=https://example.com/foo/
-
-Be sure to remember to include exporttree=yes if the remote is configured
-that way.
-
-Once a httpalso remote is set up like this, `git fetch` from it to display
-its full annex:: url. That url can be shared with others to let them clone
-the repository.
-
 # REPOSITORY FORMAT
 
 The git repository is stored in the special remote using special annex objects
diff --git a/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn b/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
new file mode 100644
index 0000000000..31e6828389
--- /dev/null
+++ b/doc/tips/storing_a_git_repository_on_any_special_remote.mdwn
@@ -0,0 +1,97 @@
+Usually a [[special remote|special_remotes]] stores the content of annexed
+files, but you need another git remote to store your git repository. But now we
+have [[git-remote-annex]], which lets most any special remote store a git
+repository alongside the annexed files. So you can `git pull`, `git push`, and
+even `git clone` from a special remote.
+
+In order to use [[git-remote-annex]], the special remote needs to have
+its url set to an url starting with "annex::".
+
+Special remotes are not configured with such an url by default,
+but it can easily be set by using the `--with-url` parameter
+when running [[git-annex-initremote]] or [[git-annex-enableremote]].
+
+Let's say you have a special remote named "foo" you want to use
+with [[git-remote-annex]]. This command will configure remote.foo.url
+and also remote.foo.fetch:
+
+    git-annex enableremote foo --with-url
+
+Or you could configure it manually:
+
+    git config remote.foo.url annex::
+    git config remote.foo.fetch '+refs/heads/*:refs/remotes/foo/*'
+
+Now you can `git push foo` and `git pull foo`. And commands like
+`git-annex sync` will also use foo as a git repository.
+
+You can even `git clone` from the special remote. To do that, you need
+an url that tells git-annex all about the special remote's configuration.
+The easy way to get that url is to run `git fetch foo`,
+and look at its output, which might look like this if it's an S3 bucket.
+
+	Full remote url: annex::13c2500f-a302-4331-9720-6ec43cb8da2b?type=S3&encryption=none&bucket=foo
+
+## does it work just like any other git remote?
+
+Very close, but not completely the same.
+
+If two people make conflicting pushes into a special remote at the same time,
+one of the pushes will overwrite the other one. In this situation, the
+overwritten push will appear to have succeeded, but pulling later will show the
+true situation.
+
+While pushes are mostly done incrementally, and so are fast, sometimes it will
+do a full re-upload of the contents of your repository, which is slower.
+
+If you `git push --delete foo badbranch`, the branch will be deleted, but
+not all traces of it will be immediately removed from the special remote.
+
+See the "REPOSITORY FORMAT" section of [[git-remote-annex]] for details.
+
+## what special remotes does this work with?
+
+Some types of special remotes already have an url that points at a git
+repository, so it can't also be set to an annex:: url.
+For example, [[special_remotes/git-lfs]].
+
+A few other types of special remotes can't be used for other reasons,
+including [[special_remotes/web]] and [[special_remotes/borg]].
+
+Encrypted special remotes *can* be used as git remotes. But, the git repository
+contains information that is needed to decrypt the files that are stored
+on the special remote. This means it's not possible to clone from an encrypted
+special remote. So you will be prompted to set a git config before using
+[[git-remote-annex]] with an encrypted special remote, to avoid shooting
+yourself in the foot.
+
+You *can* use this with special remotes that are configured with 
+"exporttree=yes". (The git repository is written under `.git/annex/objects/`
+on the remote to keep it separate from the exported files.) 
+But this won't work with special remotes that are configured with
+ "importtree=yes" but without "exporttree=yes".
+
+## httpalso special remotes
+
+If the content of a special remote gets written to some location that is
+published to http, you can use the [[special_remotes/httpalso]] special
+remote with [[git-remote-annex]] to `git clone` and `git fetch` over http.
+
+For example, if your directory special remote named "foo" is published
+at `https://example.com/foo/`, set up the httpalso remote like this
+to access it:
+
+	git-annex initremote foohttp --with-url --sameas=foo type=httpalso url=https://example.com/foo/
+
+Be sure to remember to include exporttree=yes if the remote is configured
+that way.
+
+Once a httpalso remote is set up like this, `git fetch` from it to display
+its full annex:: url. That url can be shared with others to let them clone
+the repository.
+
+The url will be rather long and ugly. There's a way to make a shorter url
+that you can tell someone to let them clone your httpalso repository.
+Just write the url to a file on your website. Then wrap the url to that
+file in an annex:: url, for example "annex::https://example.com/foo-repo".
+Currently this only works for httpalso urls.
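
The tip above suggests running `git fetch foo` and reading the full url out of its output. A small sketch of doing that from a script follows. It assumes the line begins with `Full remote url: ` exactly as in the example output shown in the tip, and the function name `full_remote_url` is hypothetical.

```shell
# Hypothetical helper: print the annex:: url from a
# "Full remote url: ..." line read on stdin.
full_remote_url() {
	sed -n 's/^[[:space:]]*Full remote url: \(annex::.*\)$/\1/p'
}
```

It might be used as `git fetch foo 2>&1 | full_remote_url`. The `2>&1` is there because the tip does not say which stream the line is printed to.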

git-remote-annex: Support urls like annex::https://example.com/foo-repo
Using the usual url download machinery even allows these urls to need
http basic auth, which is prompted for with git-credential. That opens
the possibility of using urls that contain a secret, eg the cipher
for encryption=shared. Although the user is currently on their own
constructing such an url, I do think it would work.
Limited to httpalso for now, for security reasons. Since both httpalso
and retrieving this very url are limited by the usual
annex.security.allowed-ip-addresses configs, it's not possible for an
attacker to make one of these urls that sets up a httpalso url that
opens the garage door. That is one class of attacks to keep in mind
with this thing.
It seems that there could be either a git-config that allows other types
of special remotes to be set up this way, or special remotes could
indicate when they are safe. I do worry that the git-config would
encourage users to set it without thinking through the security
implications. One remote config might be safe to access this way, but
another config, for a remote of the same type, might not be. This will
need further thought, and real-world examples to decide what to do.
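
The rules described above can be restated as a rough shell sketch. This is
only an illustration, not the implementation (that is the Haskell in the
diff that follows). The function names are hypothetical, and the sketch
assumes the parameters in the annex:: url are not percent-escaped and that
the shorthand "annex::" form is handled elsewhere.

```shell
#!/bin/sh
# Illustrative sketch only; these helper names are hypothetical and
# are not part of git-annex.

# Succeeds when the url (with "annex::" already stripped off) is a
# web url. The scheme is compared case-insensitively.
is_web_url() {
	lc=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
	case "$lc" in
		http://*|https://*) return 0 ;;
		*) return 1 ;;
	esac
}

# Prints the first line of a downloaded file. That line is what gets
# used as the complete annex:: url.
first_line() {
	head -n 1 "$1"
}

# Succeeds only when the parameters of a complete annex:: url include
# type=httpalso. Assumes the parameters are not percent-escaped.
is_httpalso_url() {
	case "$1" in
		*\?*) params=${1#*\?} ;;
		*) return 1 ;;
	esac
	case "&${params}&" in
		*"&type=httpalso&"*) return 0 ;;
		*) return 1 ;;
	esac
}
```

With these helpers, a url like "https://example.com/foo-repo" passes
is_web_url, its downloaded first line becomes the url to parse, and
that url is only accepted when is_httpalso_url succeeds.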
diff --git a/CmdLine/GitRemoteAnnex.hs b/CmdLine/GitRemoteAnnex.hs
index 9ec4acc4e3..7666f5da03 100644
--- a/CmdLine/GitRemoteAnnex.hs
+++ b/CmdLine/GitRemoteAnnex.hs
@@ -24,6 +24,7 @@ import qualified Git.Version
 import qualified Annex.SpecialRemote as SpecialRemote
 import qualified Annex.Branch
 import qualified Annex.BranchState
+import qualified Annex.Url as Url
 import qualified Types.Remote as Remote
 import qualified Logs.Remote
 import qualified Remote.External
@@ -57,6 +58,7 @@ import Utility.FileMode
 
 import Network.URI
 import Data.Either
+import Data.Char
 import qualified Data.ByteString as B
 import qualified Data.ByteString.Char8 as B8
 import qualified Data.Map.Strict as M
@@ -65,21 +67,25 @@ import qualified Utility.RawFilePath as R
 import qualified Data.Set as S
 
 run :: [String] -> IO ()
-run (remotename:url:[]) =
-	-- git strips the "annex::" prefix of the url
-	-- when running this command, so add it back
-	let url' = "annex::" ++ url
-	in case parseSpecialRemoteNameUrl remotename url' of
-		Left e -> giveup e
-		Right src -> do
-			repo <- getRepo
-			state <- Annex.new repo
-			Annex.eval state (run' src url')
+run (remotename:url:[]) = do
+	repo <- getRepo
+	state <- Annex.new repo
+	Annex.eval state $
+		resolveSpecialRemoteWebUrl url >>= \case
+			-- git strips the "annex::" prefix of the url
+			-- when running this command, so add it back
+			Nothing -> parseurl ("annex::" ++ url) pure
+			Just url' -> parseurl url' checkAllowedFromSpecialRemoteWebUrl
+  where
+	parseurl u checkallowed =
+		case parseSpecialRemoteNameUrl remotename u of
+			Right src -> checkallowed src >>= run' u
+			Left e -> giveup e
 run (_remotename:[]) = giveup "remote url not configured"
 run _ = giveup "expected remote name and url parameters"
 
-run' :: SpecialRemoteConfig -> String -> Annex ()
-run' src url = do
+run' :: String -> SpecialRemoteConfig -> Annex ()
+run' url src = do
 	sab <- startAnnexBranch
 	whenM (Annex.getRead Annex.debugenabled) $
 		enableDebugOutput
@@ -477,7 +483,36 @@ parseSpecialRemoteUrl url remotename = case parseURI url of
 		let (k, sv) = break (== '=') kv
 		    v = if null sv then sv else drop 1 sv
 		in (Proposed (unEscapeString k), Proposed (unEscapeString v))
-			
+
+-- Handles an url that contains a http address, by downloading
+-- the web page and using it as the full annex:: url.
+-- The passed url has already had "annex::" stripped off.
+resolveSpecialRemoteWebUrl :: String -> Annex (Maybe String)
+resolveSpecialRemoteWebUrl url
+	| "http://" `isPrefixOf` lcurl || "https://" `isPrefixOf` lcurl =
+		Url.withUrlOptionsPromptingCreds $ \uo ->
+			withTmpFile "git-remote-annex" $ \tmp h -> do
+				liftIO $ hClose h
+				Url.download' nullMeterUpdate Nothing url tmp uo >>= \case
+					Left err -> giveup $ url ++ " " ++ err
+					Right () -> liftIO $
+						(headMaybe . lines)
+							<$> readFileStrict tmp
+	| otherwise = return Nothing
+  where
+	lcurl = map toLower url
+
+-- Only some types of special remotes are allowed to come from
+-- resolveSpecialRemoteWebUrl. Throws an error if this one is not.
+checkAllowedFromSpecialRemoteWebUrl :: SpecialRemoteConfig -> Annex SpecialRemoteConfig
+checkAllowedFromSpecialRemoteWebUrl src@(ExistingSpecialRemote {}) = pure src
+checkAllowedFromSpecialRemoteWebUrl src@(SpecialRemoteConfig {}) =
+	case M.lookup typeField (specialRemoteConfig src) of
+		Nothing -> giveup "Web URL did not include a type field."
+		Just t
+			| t == Proposed "httpalso" -> return src
+			| otherwise -> giveup "Web URL can only be used for a httpalso special remote."
+
 getSpecialRemoteUrl :: Remote -> Annex (Maybe String)
 getSpecialRemoteUrl rmt = do
 	rcp <- Remote.configParser (Remote.remotetype rmt)
diff --git a/doc/git-remote-annex.mdwn b/doc/git-remote-annex.mdwn
index 4b2ac1288e..3238f7f15a 100644
--- a/doc/git-remote-annex.mdwn
+++ b/doc/git-remote-annex.mdwn
@@ -10,31 +10,61 @@ git fetch annex::uuid?param=value&param=value...
 
 This is a git remote helper program that allows git to clone,
 pull and push from a git repository that is stored in a git-annex
-special remote.
+special remote with an URL that starts with "annex::"
 
-The format of the remote URL is "annex::" followed by the UUID of the
-special remote, and then followed by all of the configuration parameters of
-the special remote.
+The special remote needs to have a `remote.<name>.url` 
+configured to use this. That is set up automatically when git
+cloning from a special remote.
 
-For example, to clone from a directory special remote:
+To make [[git-annex-initremote]](1) and [[git-annex-enableremote]](1)
+configure the url, pass them the `--with-url` option.
 
-    git clone annex::358ff77e-0bc3-11ef-bc49-872e6695c0e3?type=directory&encryption=none&directory=/mnt/foo/
+Or, to configure an existing special remote with a shorthand URL, run:
 
-But you don't need to generate such an url yourself. Instead, you can use
-the shorthand url of "annex::" with an existing special remote.
+    git config remote.name.url annex::
 
-    git-annex initremote foo type=directory encryption=none directory=/mnt/foo
-    git config remote.foo.url annex::
-	git push foo master
+Once the URL is configured, you can use `git pull`, `git push`, etc
+with the special remote much like with any other git remote.
+But see CONFLICTING PUSHES below for some situations where it behaves
+slightly differently.
 
-Configuring the url like that is automatically done when cloning from a
-special remote. To make [[git-annex-initremote]](1) and
-[[git-annex-enableremote]](1) configure the url, pass them the `--with-url`
-option.
+# URL FORMAT
 
-When using the shorthand "annex::" url, the full url will be displayed
-each time you git pull or push, when it's possible for git-annex to
-determine it.
+This uses an URL that starts with "annex::". There are three forms of such
+URLs:
+
+* Complete URL
+
+  This contains the UUID and all configuration parameters 
+  of the special remote that were passed when using 
+  `git-annex initremote`. 
+  
+  For example, to clone from a directory special remote:
+  
+    git clone annex::358ff77e-0bc3-11ef-bc49-872e6695c0e3?type=directory&encryption=none&directory=/mnt/foo/ 
+
+* Shorthand URL
+
+  This makes it easy to configure an existing special remote with an URL
+  without having to come up with the complete URL.
+  
+    annex::
+  
+  When using this shorthand URL, the full URL will be displayed each time you
+  git pull or push, when it's possible for git-annex to determine it. 
+  (Although in some cases, like the directory special remote, some 
+  parameters may be left off of the displayed URL.)
+
+* Web URL
+
+  This URL points at a file on the web, which contains the complete annex::
+  URL.
+  
+    annex::https://example.com/foo-repo
+  
+  Not all special remotes can be accessed by such an URL,
+  for security reasons. Currently, this is limited to httpalso special
+  remotes.
 
 # CONFLICTING PUSHES
 
@@ -48,13 +78,13 @@ time, for one of the pushes to be overwritten by the other one. In this
 situation, the overwritten push will appear to have succeeded, but pulling
 later will show the true situation.
 
-# HTTP ACCESS
+# HTTPALSO
 
 If the content of a special remote is published via http, a httpalso
 special remote can be initialized, and used to `git clone` and `git fetch`
 over http.
 
-For example, if the directory special remote set up above is published
+For example, a directory special remote named "foo" is published
 at `https://example.com/foo/`, set up the httpalso remote like this
 to access it:
 
diff --git a/doc/todo/annex_url_redirects.mdwn b/doc/todo/annex_url_redirects.mdwn
index 9749f8ceb5..0807daacd0 100644
--- a/doc/todo/annex_url_redirects.mdwn
+++ b/doc/todo/annex_url_redirects.mdwn

(Diff truncated)
Added a comment
diff --git a/doc/todo/make_annex___34__respect__34___.git__47__hooks__47__prepare-commit-msg/comment_8_b89d83fbf3f8a1e88fa607f01b182d0f._comment b/doc/todo/make_annex___34__respect__34___.git__47__hooks__47__prepare-commit-msg/comment_8_b89d83fbf3f8a1e88fa607f01b182d0f._comment
new file mode 100644
index 0000000000..a6dfe57d7e
--- /dev/null
+++ b/doc/todo/make_annex___34__respect__34___.git__47__hooks__47__prepare-commit-msg/comment_8_b89d83fbf3f8a1e88fa607f01b182d0f._comment
@@ -0,0 +1,41 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="comment 8"
+ date="2024-05-30T14:34:32Z"
+ content="""
+FWIW the trick with `.git/last-commit-msg` did not really work for me in direct use of git-annex as I guess annex introduces changes to git-annex when I do `git-annex add` first, so it takes then PRIOR commit message:
+
+```
+❯ git annex add random2
+(recording state in git...)
+add random2 
+ok                                
+(recording state in git...)
+❯ git commit -a
+[master 16107604] Will be committing changes to following files: random2
+ 1 file changed, 1 insertion(+)
+ create mode 120000 random2
+❯ git show git-annex
+commit dfb41b9b170e4d504e1e494538362e20bd73943a (git-annex)
+Author: Yaroslav Halchenko <debian@onerussian.com>
+Date:   Thu May 30 10:31:46 2024 -0400
+
+    Will be committing changes to following files: random
+
+diff --git a/a37/1c8/MD5E-s1000--288b6b2b44800acf433b76dc5889695c.log b/a37/1c8/MD5E-s1000--288b6b2b44800acf433b76dc5889695c.log
+new file mode 100644
+index 00000000..4a3787e6
+--- /dev/null
++++ b/a37/1c8/MD5E-s1000--288b6b2b44800acf433b76dc5889695c.log
+@@ -0,0 +1 @@
++1717079506s 1 fff52b70-2aa4-4d16-8377-97fee7b2de1c
+❯ cat .git/last-commit-msg
+Will be committing changes to following files: random2
+
+❯ git lg HEAD^^..HEAD
+* 16107604 - (HEAD -> master) Will be committing changes to following files: random2 (3 minutes ago) [Yaroslav Halchenko]
+* d4840167 - Will be committing changes to following files: random (4 minutes ago) [Yaroslav Halchenko]
+
+```
+"""]]

Added a comment
diff --git a/doc/todo/make_annex___34__respect__34___.git__47__hooks__47__prepare-commit-msg/comment_7_a766358f9d5635589379d9be16ef8092._comment b/doc/todo/make_annex___34__respect__34___.git__47__hooks__47__prepare-commit-msg/comment_7_a766358f9d5635589379d9be16ef8092._comment
new file mode 100644
index 0000000000..8bc1b86bd9
--- /dev/null
+++ b/doc/todo/make_annex___34__respect__34___.git__47__hooks__47__prepare-commit-msg/comment_7_a766358f9d5635589379d9be16ef8092._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="comment 7"
+ date="2024-05-30T14:29:43Z"
+ content="""
+fixed in [10.20240129-86-g3475b09c3e AKA 10.20240227~29](https://git.kitenet.net/index.cgi/git-annex.git/commit/?id=3475b09c3e93ae7e5f21de02e7ada8f460e490a4) --[[yarikoptic]]
+"""]]

security
diff --git a/doc/todo/annex_url_redirects.mdwn b/doc/todo/annex_url_redirects.mdwn
index 13da557ad9..9749f8ceb5 100644
--- a/doc/todo/annex_url_redirects.mdwn
+++ b/doc/todo/annex_url_redirects.mdwn
@@ -4,3 +4,16 @@ short url
 How about supporting an url like "annex::https://example.com/foo",
 where the http url redirects to the full annex url. Then any url
 shortener can be used. --[[Joey]]
+
+> This might be a security problem. An arbitrary annex:: url can access an
+> arbitrary resource. Eg, it might be a directory special remote, using any
+> directory on the user's computer, and they won't know if it's hidden
+> behind a http redirect.
+> 
+> Perhaps that could be dealt with by displaying information about the
+> special remote and prompting if it's ok to use. But users generally
+> say "yes" without thinking.
+> 
+> Perhaps it could be limited to safe special remotes. httpalso is surely
+> safe in this context. Would anything else be? Any external special
+> remotes? --[[Joey]]

todo
diff --git a/doc/todo/annex_url_redirects.mdwn b/doc/todo/annex_url_redirects.mdwn
new file mode 100644
index 0000000000..13da557ad9
--- /dev/null
+++ b/doc/todo/annex_url_redirects.mdwn
@@ -0,0 +1,6 @@
+annex:: urls are long, and in some situations it's useful to have a nice
+short url
+
+How about supporting an url like "annex::https://example.com/foo",
+where the http url redirects to the full annex url. Then any url
+shortener can be used. --[[Joey]]

Added a comment
diff --git a/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0/comment_3_6a9abad163d6cc3c1c0cc90ed3a3f256._comment b/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0/comment_3_6a9abad163d6cc3c1c0cc90ed3a3f256._comment
new file mode 100644
index 0000000000..5fc7172268
--- /dev/null
+++ b/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0/comment_3_6a9abad163d6cc3c1c0cc90ed3a3f256._comment
@@ -0,0 +1,18 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="comment 3"
+ date="2024-05-29T18:31:16Z"
+ content="""
+might be (part of) mystery resolved -- dmesg on server shows bunch of
+
+```
+[Wed May 29 14:12:32 2024] lockd: server dbic-mrinbox not responding, still trying
+[Wed May 29 14:12:32 2024] lockd: server dbic-mrinbox not responding, still trying
+[Wed May 29 14:21:33 2024] lockd: server dbic-mrinbox not responding, still trying
+[Wed May 29 14:23:34 2024] lockd: server dbic-mrinbox not responding, still trying
+[Wed May 29 14:26:34 2024] lockd: server dbic-mrinbox not responding, still trying
+
+```
+although I can still read content just fine, so it is more specifically about locking I guess (despite using pidlock?)?
+"""]]

Added a comment: odd odd odd
diff --git a/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0/comment_2_cd9e001bc07ab51e6ff1f9abbbb043ac._comment b/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0/comment_2_cd9e001bc07ab51e6ff1f9abbbb043ac._comment
new file mode 100644
index 0000000000..76d50e988a
--- /dev/null
+++ b/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0/comment_2_cd9e001bc07ab51e6ff1f9abbbb043ac._comment
@@ -0,0 +1,43 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="odd odd odd"
+ date="2024-05-29T18:25:23Z"
+ content="""
+and I cannot copy from the server to the client:
+
+```
+[bids@rolando 1076_spacetop.git] > git annex copy --debug --to typhon --fast --not --in typhon
+yoh@typhon.dartmouth.edu's password: 
+[2024-05-29 14:21:40.601369476] (Utility.Process) process [28828] read: git [\"--git-dir=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"ls-tree\",\"--full-tree\",\"-z\",\"-r\",\"--\",\"refs/heads/git-annex\"]
+[2024-05-29 14:21:40.602439011] (Utility.Process) process [28829] chat: git [\"--git-dir=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"cat-file\",\"--batch=%(objectname) %(objecttype) %(objectsize)\",\"--buffer\"]
+[2024-05-29 14:21:40.775458494] (Messages.explain) [ MD5E-s292602526--2069b86f10e1a39bda5a6ed8996078d4.nii.gz does not match:not in=typhon[TRUE] ]
+
+[2024-05-29 14:21:40.793913739] (Messages.explain) [ MD5E-s2615170--589b64f4cd3067f468408d809c7f1037.nii.gz does not match:not in=typhon[TRUE] ]
+
+[2024-05-29 14:21:40.811856562] (Messages.explain) [ MD5E-s597779--1108c4d91a2386573c50c251a0e967c2.tgz matches:not in=typhon[FALSE] ]
+
+copy MD5E-s597779--1108c4d91a2386573c50c251a0e967c2.tgz (to typhon...) [2024-05-29 14:21:40.974993044] (Utility.Process) process [28817] done ExitSuccess
+[2024-05-29 14:21:40.975527841] (Utility.Process) process [28816] done ExitSuccess
+
+
+  You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time.
+
+  annex.sshcaching is not set to true
+[2024-05-29 14:21:40.977267402] (Utility.Process) process [28831] chat: ssh [\"yoh@typhon.dartmouth.edu\",\"-T\",\"git-annex-shell 'p2pstdio' '/mnt/DATA/data/yoh/1076_spacetop' '--debug' '40795e62-527c-4d26-ae8c-af42a6e2da5a' --uuid 97b6f5e4-4642-43a7-988a-c483caf553c5\"]
+yoh@typhon.dartmouth.edu's password: 
+[2024-05-29 14:21:57.87732028] (P2P.IO) [ThreadId 4] P2P > AUTH-SUCCESS 97b6f5e4-4642-43a7-988a-c483caf553c5
+[2024-05-29 14:21:57.879974047] (P2P.IO) [ssh connection Just 28831] [ThreadId 4] P2P < AUTH-SUCCESS 97b6f5e4-4642-43a7-988a-c483caf553c5
+[2024-05-29 14:21:57.880228709] (P2P.IO) [ssh connection Just 28831] [ThreadId 4] P2P > VERSION 1
+[2024-05-29 14:21:57.878927215] (P2P.IO) [ThreadId 4] P2P < VERSION 1
+[2024-05-29 14:21:57.878991142] (P2P.IO) [ThreadId 4] P2P > VERSION 1
+[2024-05-29 14:21:57.881305277] (P2P.IO) [ssh connection Just 28831] [ThreadId 4] P2P < VERSION 1
+[2024-05-29 14:21:57.881487685] (P2P.IO) [ssh connection Just 28831] [ThreadId 4] P2P > PUT  MD5E-s597779--1108c4d91a2386573c50c251a0e967c2.tgz
+[2024-05-29 14:21:57.880214299] (P2P.IO) [ThreadId 4] P2P < PUT  MD5E-s597779--1108c4d91a2386573c50c251a0e967c2.tgz
+[2024-05-29 14:21:57.880467298] (P2P.IO) [ThreadId 4] P2P > PUT-FROM 0
+[2024-05-29 14:21:57.882841829] (P2P.IO) [ssh connection Just 28831] [ThreadId 4] P2P < PUT-FROM 0
+
+```
+
+NB we had issue with a recent 10GB switch update which was worked around by me reducing MTU on my servers from Jumbo frames to regular 1500 . But it manifested that scp did not work.  Here scp worked (I did scp on typhon from rolando). And for heejung it does not work to `pull` data from roland to discovery.
+"""]]

Added a comment: odd odd odd
diff --git a/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0/comment_1_360cee1c036eca92a25fe92bedd2e2af._comment b/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0/comment_1_360cee1c036eca92a25fe92bedd2e2af._comment
new file mode 100644
index 0000000000..b4fa142437
--- /dev/null
+++ b/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0/comment_1_360cee1c036eca92a25fe92bedd2e2af._comment
@@ -0,0 +1,43 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="odd odd odd"
+ date="2024-05-29T18:25:11Z"
+ content="""
+and I cannot copy from the server to the client:
+
+```
+[bids@rolando 1076_spacetop.git] > git annex copy --debug --to typhon --fast --not --in typhon
+yoh@typhon.dartmouth.edu's password: 
+[2024-05-29 14:21:40.601369476] (Utility.Process) process [28828] read: git [\"--git-dir=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"ls-tree\",\"--full-tree\",\"-z\",\"-r\",\"--\",\"refs/heads/git-annex\"]
+[2024-05-29 14:21:40.602439011] (Utility.Process) process [28829] chat: git [\"--git-dir=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"cat-file\",\"--batch=%(objectname) %(objecttype) %(objectsize)\",\"--buffer\"]
+[2024-05-29 14:21:40.775458494] (Messages.explain) [ MD5E-s292602526--2069b86f10e1a39bda5a6ed8996078d4.nii.gz does not match:not in=typhon[TRUE] ]
+
+[2024-05-29 14:21:40.793913739] (Messages.explain) [ MD5E-s2615170--589b64f4cd3067f468408d809c7f1037.nii.gz does not match:not in=typhon[TRUE] ]
+
+[2024-05-29 14:21:40.811856562] (Messages.explain) [ MD5E-s597779--1108c4d91a2386573c50c251a0e967c2.tgz matches:not in=typhon[FALSE] ]
+
+copy MD5E-s597779--1108c4d91a2386573c50c251a0e967c2.tgz (to typhon...) [2024-05-29 14:21:40.974993044] (Utility.Process) process [28817] done ExitSuccess
+[2024-05-29 14:21:40.975527841] (Utility.Process) process [28816] done ExitSuccess
+
+
+  You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time.
+
+  annex.sshcaching is not set to true
+[2024-05-29 14:21:40.977267402] (Utility.Process) process [28831] chat: ssh [\"yoh@typhon.dartmouth.edu\",\"-T\",\"git-annex-shell 'p2pstdio' '/mnt/DATA/data/yoh/1076_spacetop' '--debug' '40795e62-527c-4d26-ae8c-af42a6e2da5a' --uuid 97b6f5e4-4642-43a7-988a-c483caf553c5\"]
+yoh@typhon.dartmouth.edu's password: 
+[2024-05-29 14:21:57.87732028] (P2P.IO) [ThreadId 4] P2P > AUTH-SUCCESS 97b6f5e4-4642-43a7-988a-c483caf553c5
+[2024-05-29 14:21:57.879974047] (P2P.IO) [ssh connection Just 28831] [ThreadId 4] P2P < AUTH-SUCCESS 97b6f5e4-4642-43a7-988a-c483caf553c5
+[2024-05-29 14:21:57.880228709] (P2P.IO) [ssh connection Just 28831] [ThreadId 4] P2P > VERSION 1
+[2024-05-29 14:21:57.878927215] (P2P.IO) [ThreadId 4] P2P < VERSION 1
+[2024-05-29 14:21:57.878991142] (P2P.IO) [ThreadId 4] P2P > VERSION 1
+[2024-05-29 14:21:57.881305277] (P2P.IO) [ssh connection Just 28831] [ThreadId 4] P2P < VERSION 1
+[2024-05-29 14:21:57.881487685] (P2P.IO) [ssh connection Just 28831] [ThreadId 4] P2P > PUT  MD5E-s597779--1108c4d91a2386573c50c251a0e967c2.tgz
+[2024-05-29 14:21:57.880214299] (P2P.IO) [ThreadId 4] P2P < PUT  MD5E-s597779--1108c4d91a2386573c50c251a0e967c2.tgz
+[2024-05-29 14:21:57.880467298] (P2P.IO) [ThreadId 4] P2P > PUT-FROM 0
+[2024-05-29 14:21:57.882841829] (P2P.IO) [ssh connection Just 28831] [ThreadId 4] P2P < PUT-FROM 0
+
+```
+
+NB we had issue with a recent 10GB switch update which was worked around by me reducing MTU on my servers from Jumbo frames to regular 1500 . But it manifested that scp did not work.  Here scp worked (I did scp on typhon from rolando). And for heejung it does not work to `pull` data from roland to discovery.
+"""]]

get is silently stuck.
diff --git a/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0.mdwn b/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0.mdwn
new file mode 100644
index 0000000000..811281ed9a
--- /dev/null
+++ b/doc/bugs/git_annex_get_is_silently_stuck_on__P2P___62___GET_0.mdwn
@@ -0,0 +1,83 @@
+### Please describe the problem.
+
+Trying to get fresh data from a server but `annex get` is just stuck without any output.  Here is how it looks with `--debug`:
+
+```
+yoh@typhon:/mnt/DATA/data/yoh/1076_spacetop$ git annex get --debug sub-0002/ses-01/dwi/sub-0002_ses-01_acq-96dirX6b0Xmb_dwi.nii.gz
+[2024-05-29 14:08:30.203689795] (Utility.Process) process [3922981] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","ls-files","--stage","-z","--error-unmatch","--","sub-0002/ses-01/dwi/sub-0002_ses-01_acq-96dirX6b0Xmb_dwi.nii.gz"]
+[2024-05-29 14:08:30.204159722] (Utility.Process) process [3922982] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)","--buffer"]
+[2024-05-29 14:08:30.204574378] (Utility.Process) process [3922983] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch=%(objectname) %(objecttype) %(objectsize)","--buffer"]
+[2024-05-29 14:08:30.205075804] (Utility.Process) process [3922984] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","git-annex"]
+[2024-05-29 14:08:30.207328301] (Utility.Process) process [3922984] done ExitSuccess
+[2024-05-29 14:08:30.207536385] (Utility.Process) process [3922985] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","show-ref","--hash","refs/heads/git-annex"]
+[2024-05-29 14:08:30.209355606] (Utility.Process) process [3922985] done ExitSuccess
+[2024-05-29 14:08:30.209800913] (Utility.Process) process [3922986] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch=%(objectname) %(objecttype) %(objectsize)","--buffer"]
+get sub-0002/ses-01/dwi/sub-0002_ses-01_acq-96dirX6b0Xmb_dwi.nii.gz [2024-05-29 14:08:30.215082484] (Utility.Process) process [3922987] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch"]
+(from origin...) 
+[2024-05-29 14:08:30.225575902] (Utility.Process) process [3922988] read: ssh ["-O","stop","-S","bids@rolando.cns.dartmouth.edu","-o","ControlMaster=auto","-o","ControlPersist=yes","localhost"] in ".git/annex/ssh/"
+[2024-05-29 14:08:30.230512383] (Utility.Process) process [3922988] done ExitSuccess
+[2024-05-29 14:08:30.231078838] (Utility.Process) process [3922989] read: ssh ["-o","BatchMode=true","-S",".git/annex/ssh/bids@rolando.cns.dartmouth.edu","-o","ControlMaster=auto","-o","ControlPersist=yes","-n","-T","bids@rolando.cns.dartmouth.edu","true"]
+[2024-05-29 14:08:31.042361711] (Utility.Process) process [3922989] done ExitSuccess
+[2024-05-29 14:08:31.043219566] (Utility.Process) process [3923085] chat: ssh ["bids@rolando.cns.dartmouth.edu","-S",".git/annex/ssh/bids@rolando.cns.dartmouth.edu","-o","ControlMaster=auto","-o","ControlPersist=yes","-T","git-annex-shell 'p2pstdio' '/inbox/BIDS/Wager/Wager/1076_spacetop' '--debug' '97b6f5e4-4642-43a7-988a-c483caf553c5' --uuid 590b4fd0-0142-4e9d-8964-d1158c242c6a"]
+[2024-05-29 14:08:31.540468659] (P2P.IO) [ThreadId 4] P2P > AUTH-SUCCESS 590b4fd0-0142-4e9d-8964-d1158c242c6a
+[2024-05-29 14:08:31.539643152] (P2P.IO) [ssh connection Just 3923085] [ThreadId 4] P2P < AUTH-SUCCESS 590b4fd0-0142-4e9d-8964-d1158c242c6a
+[2024-05-29 14:08:31.539846045] (P2P.IO) [ssh connection Just 3923085] [ThreadId 4] P2P > VERSION 1
+[2024-05-29 14:08:31.542774097] (P2P.IO) [ThreadId 4] P2P < VERSION 1
+[2024-05-29 14:08:31.543007073] (P2P.IO) [ThreadId 4] P2P > VERSION 1
+[2024-05-29 14:08:31.541288127] (P2P.IO) [ssh connection Just 3923085] [ThreadId 4] P2P < VERSION 1
+[2024-05-29 14:08:31.541362199] (P2P.IO) [ssh connection Just 3923085] [ThreadId 4] P2P > GET 0 sub-0002/ses-01/dwi/sub-0002_ses-01_acq-96dirX6b0Xmb_dwi.nii.gz MD5E-s239384952--c3aaaebbed3ef5932b4390ddb47d2150.nii.gz
+[2024-05-29 14:08:31.544298446] (P2P.IO) [ThreadId 4] P2P < GET 0 sub-0002/ses-01/dwi/sub-0002_ses-01_acq-96dirX6b0Xmb_dwi.nii.gz MD5E-s239384952--c3aaaebbed3ef5932b4390ddb47d2150.nii.gz
+
+```
+
+and on the server it looks like ( I killed one, client reinitiated, this is current one)
+
+```
+bids     27577  0.3  0.0 1074226428 11120 ?    Ssl  14:10   0:00 /data/home/bids/git-annexes/10.20231129+git83-g86dbe9a825/usr/lib/git-annex.linux/exe/git-annex-shell --library-path /data/home/bids/git-annexes/10.20231129+git83-g86dbe9a825/usr/lib/git-annex.linux//lib/x86_64-linux-gnu: /data/home/bids/git-annexes/10.20231129+git83-g86dbe9a825/usr/lib/git-annex.linux/shimmed/git-annex-shell/git-annex-shell p2pstdio /inbox/BIDS/Wager/Wager/1076_spacetop.git --debug 97b6f5e4-4642-43a7-988a-c483caf553c5 --uuid 40795e62-527c-4d26-ae8c-af42a6e2da5a
+```
+
+and on server config is 
+
+``` 
+[annex]
+        sshcaching = false
+        autoupgraderepository = false
+        pidlock = true
+
+[safe]
+        directory = /inbox/BIDS
+
+```
+
+and that directory `/inbox/BIDS` is NFS mounted.
+
+Not sure what knobs to turn to help diagnose the issue.
+
+The git-annex on client side is `10.20240129-1~ndall+1`
+
+
+[[!meta author=yoh]]
+[[!tag projects/repronim]]
+
+
+
+
+### What steps will reproduce the problem?
+
+
+### What version of git-annex are you using? On what operating system?
+
+
+### Please provide any additional information below.
+
+[[!format sh """
+# If you can, paste a complete transcript of the problem occurring here.
+# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
+
+
+# End of transcript or log.
+"""]]
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+

group: Added --list option
Seemed to make sense to exclude groups used only by dead repositories.
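
As a rough sketch of the idea (not the implementation, which is the
Haskell in the diff that follows), the listing amounts to taking the union
of the groups of every repository that is not dead. The input format
here, one line per repository holding a uuid, a trust level and then its
groups, is invented purely for illustration, as is the function name.

```shell
#!/bin/sh
# Illustrative sketch only. Reads hypothetical lines of the form
#   <uuid> <trustlevel> <group>...
# on stdin and prints the union of the groups used by repositories
# that are not dead, sorted, on a single line.
list_groups() {
	awk '$2 != "dead" { for (i = 3; i <= NF; i++) seen[$i] = 1 }
	     END { for (g in seen) print g }' | sort | paste -sd ' ' -
}
```

A group that is only used by a dead repository never makes it into the
output, while a group shared between a dead and a live repository does.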
diff --git a/CHANGELOG b/CHANGELOG
index 4ee02472c3..cc9ef7f688 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -7,6 +7,7 @@ git-annex (10.20240431) UNRELEASED; urgency=medium
     git-remote-annex.
   * When building an adjusted unlocked branch, make pointer files
     executable when the annex object file is executable.
+  * group: Added --list option.
   * fsck: Fix recent reversion that made it say it was checksumming files
     whose content is not present.
   * Avoid the --fast option preventing checksumming in some cases it
diff --git a/Command/Group.hs b/Command/Group.hs
index 41717961a5..e01539ff5f 100644
--- a/Command/Group.hs
+++ b/Command/Group.hs
@@ -1,6 +1,6 @@
 {- git-annex command
  -
- - Copyright 2012 Joey Hess <id@joeyh.name>
+ - Copyright 2012-2024 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -9,18 +9,38 @@ module Command.Group where
 
 import Command
 import qualified Remote
-import Logs.Group
 import Types.Group
+import Logs.Group
+import Logs.UUID
+import Logs.Trust
 import Utility.SafeOutput
 
 import qualified Data.Set as S
+import qualified Data.Map as M
 
 cmd :: Command
 cmd = noMessages $ command "group" SectionSetup "add a repository to a group"
-	(paramPair paramRemote paramDesc) (withParams seek)
+	(paramPair paramRemote paramDesc) (seek <$$> optParser)
+
+data GroupOptions = GroupOptions
+	{ cmdparams :: CmdParams
+	, listOption :: Bool
+	}
 
-seek :: CmdParams -> CommandSeek
-seek = withWords (commandAction . start)
+optParser :: CmdParamsDesc -> Parser GroupOptions
+optParser desc = GroupOptions
+	<$> cmdParams desc
+	<*> switch
+		( long "list"
+		<> help "list all currently defined groups"
+		)
+
+seek :: GroupOptions -> CommandSeek
+seek o
+	| listOption o = if null (cmdparams o)
+		then commandAction startList
+		else giveup "Cannot combine --list with other options"
+	| otherwise = commandAction $ start (cmdparams o)
 
 start :: [String] -> CommandStart
 start ps@(name:g:[]) = do
@@ -33,12 +53,21 @@ start ps@(name:g:[]) = do
 start (name:[]) = do
 	u <- Remote.nameToUUID name
 	startingCustomOutput (ActionItemOther Nothing) $ do
-		liftIO . putStrLn . safeOutput . unwords . map fmt . S.toList
-			=<< lookupGroups u
+		liftIO . listGroups =<< lookupGroups u
 		next $ return True
+start _ = giveup "Specify a repository and a group."
+
+startList :: CommandStart
+startList = startingCustomOutput (ActionItemOther Nothing) $ do
+	us <- trustExclude DeadTrusted =<< M.keys <$> uuidDescMap
+	gs <- foldl' S.union mempty <$> mapM lookupGroups us
+	liftIO $ listGroups gs
+	next $ return True
+
+listGroups :: S.Set Group -> IO ()
+listGroups = liftIO . putStrLn . safeOutput . unwords . map fmt . S.toList
   where
 	fmt (Group g) = decodeBS g
-start _ = giveup "Specify a repository and a group."
 
 setGroup :: UUID -> Group -> CommandPerform
 setGroup uuid g = do
diff --git a/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_4_781a39ea819887335f65a31198220b44._comment b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_4_781a39ea819887335f65a31198220b44._comment
index 0dda511669..5079a9a297 100644
--- a/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_4_781a39ea819887335f65a31198220b44._comment
+++ b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_4_781a39ea819887335f65a31198220b44._comment
@@ -12,8 +12,8 @@ improvements on querying in this area.
 its preferred content expression, as well as any groupwanted expression,
 and the standard preferred content expression.
 
-There could be a command that just outputs a list of groups, one per line.
-Maybe `git-annex group --list`
+There could be a command that just outputs a list of groups.
+Maybe `git-annex group --list` (update: implemented this)
 
 Then you could get your dump of the groupwanted configurations for each
 group:
diff --git a/doc/git-annex-group.mdwn b/doc/git-annex-group.mdwn
index afe5aee47f..32dafabee7 100644
--- a/doc/git-annex-group.mdwn
+++ b/doc/git-annex-group.mdwn
@@ -8,7 +8,7 @@ git annex group `repository [groupname]`
 
 # DESCRIPTION
 
-Adds a repository to a group, such as "archival", "enduser", or "transfer".
+Adds a repository to a group, such as "archive" or "transfer".
 The groupname must be a single word.
   
 Omit the groupname to show the current groups that a repository is in.
@@ -20,7 +20,11 @@ A repository can be in multiple groups at the same time.
 
 # OPTIONS
 
-* The [[git-annex-common-options]](1) can be used.
+* `--list`
+
+  Outputs a list of all groups that are used by at least one repository.
+
+* Also the [[git-annex-common-options]](1) can be used.
 
 # SEE ALSO
 

split off todo, comment
diff --git a/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_4_781a39ea819887335f65a31198220b44._comment b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_4_781a39ea819887335f65a31198220b44._comment
new file mode 100644
index 0000000000..0dda511669
--- /dev/null
+++ b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_4_781a39ea819887335f65a31198220b44._comment
@@ -0,0 +1,25 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2024-05-29T17:04:41Z"
+ content="""
+Opened todo for my idea, [[git-annex_list_with_want_get_and_drop]].
+
+Back to what you were wanting.. I think there is a lot of room for
+improvements on querying in this area.
+
+`git-annex info repo` could display the groups the repo is in, as well as
+its preferred content expression, as well as any groupwanted expression,
+and the standard preferred content expression.
+
+There could be a command that just outputs a list of groups, one per line.
+Maybe `git-annex group --list`
+
+Then you could get your dump of the groupwanted configurations for each
+group:
+
+	for g in $(git-annex group --list); do git-annex groupwanted $g; done
+
+There could also be a command that lists the repositories that are in a
+group. Maybe `git-annex group --members-of=group`
+"""]]
diff --git a/doc/todo/git-annex_list_with_want_get_and_drop.mdwn b/doc/todo/git-annex_list_with_want_get_and_drop.mdwn
new file mode 100644
index 0000000000..d763005fea
--- /dev/null
+++ b/doc/todo/git-annex_list_with_want_get_and_drop.mdwn
@@ -0,0 +1,6 @@
+`git-annex list` could be extended with information about whether each
+repository wants to get or drop a file.
+
+For example, it could use "-" when a repository has a file, but wants to
+drop it. And "+" when it does not have a file, but wants to get it.
+Although perhaps something more clear can be found. --[[Joey]]
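
The loop from the forum comment in this change can be wrapped in a small helper. This is only a sketch, not part of git-annex: `dump_groupwanted` is a hypothetical name, and it assumes `git-annex group --list` prints group names separated by whitespace, as the `unwords` call in `listGroups` suggests. The lookup command is a parameter only so that the function can be tried without a repository.

```shell
# Hypothetical helper: read group names from stdin (whitespace separated)
# and print "group: expression" for each one.
# $1 is the command used to look up a group's expression; it defaults to
# `git-annex groupwanted`, which is used the same way in the forum comment.
dump_groupwanted() {
    lookup="${1:-git-annex groupwanted}"
    # Put each group name on its own line, squeezing repeated separators.
    tr -s ' \t' '\n\n' | while read -r g; do
        [ -n "$g" ] || continue
        # $lookup is deliberately unquoted so a command with arguments works.
        printf '%s: %s\n' "$g" "$($lookup "$g" </dev/null)"
    done
}
```

Inside a repository it would be used as `git-annex group --list | dump_groupwanted`.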

add config-uuid to annex:: url for --sameas remotes
And use it to set annex-config-uuid in git config. This makes
using the origin special remote work after cloning.
Without the added Logs.Remote.configSet, instantiating the remote will
look at the annex-config-uuid's config in the remote log, which will be
empty, and so it will fail to find a special remote.
The added deletion of files in the alternatejournaldir is just to make
100% sure they don't get committed to the git-annex branch, now that
they contain things that definitely should not be committed.
diff --git a/CmdLine/GitRemoteAnnex.hs b/CmdLine/GitRemoteAnnex.hs
index 41d048a3da..93922a069d 100644
--- a/CmdLine/GitRemoteAnnex.hs
+++ b/CmdLine/GitRemoteAnnex.hs
@@ -500,12 +500,16 @@ genSpecialRemoteUrl rmt rcp
 	conv = escapeURIString isUnescapedInURIComponent
 		. fromProposedAccepted
 	
-	cs = M.toList $ M.filterWithKey (\k _ -> k `elem` safefields) c
+	cs = M.toList (M.filterWithKey (\k _ -> k `elem` safefields) c)
+		++ case remoteAnnexConfigUUID (Remote.gitconfig rmt) of
+			Nothing -> []
+			Just cu -> [(Accepted "config-uuid", Accepted (fromUUID cu))]
+
 	c = unparsedRemoteConfig $ Remote.config rmt
 	
 	-- Hidden fields are used for internal stuff like ciphers
 	-- that should not be included in the url.
-	safefields = map parserForField $
+	safefields = map parserForField $ 
 		filter (\p -> fieldDesc p /= HiddenField) ps
 
 	knownfields = map parserForField ps
@@ -544,8 +548,9 @@ withSpecialRemote cfg@(SpecialRemoteConfig {}) sab a = case specialRemoteName cf
 	-- Initialize a new special remote with the provided configuration
 	-- and name.
 	initremote remotename = do
-		let c = M.insert SpecialRemote.nameField (Proposed remotename)
-			(specialRemoteConfig cfg)
+		let c = M.insert SpecialRemote.nameField (Proposed remotename) $
+			M.delete (Accepted "config-uuid") $
+			specialRemoteConfig cfg
 		t <- either giveup return (SpecialRemote.findType c)
 		dummycfg <- liftIO dummyRemoteGitConfig
 		(c', u) <- Remote.setup t Remote.Init (Just (specialRemoteUUID cfg)) 
@@ -553,6 +558,17 @@ withSpecialRemote cfg@(SpecialRemoteConfig {}) sab a = case specialRemoteName cf
 			`onException` cleanupremote remotename
 		Logs.Remote.configSet u c'
 		setConfig (remoteConfig c' "url") (specialRemoteUrl cfg)
+		case M.lookup (Accepted "config-uuid") (specialRemoteConfig cfg) of
+			Just cu -> do
+				setConfig (remoteAnnexConfig c' "config-uuid")
+					(fromProposedAccepted cu)
+				-- This is not quite the same as what is
+				-- usually stored to the git-annex branch
+				-- for the config-uuid, but it will work.
+				-- This change will never be committed to the
+				-- git-annex branch.
+				Logs.Remote.configSet (toUUID (fromProposedAccepted cu)) c'
+			Nothing -> noop
 		remotesChanged
 		getEnabledSpecialRemoteByName remotename >>= \case
 			Just rmt -> return rmt
@@ -1070,15 +1086,17 @@ specialRemoteFromUrl sab a = withTmpDir "journal" $ \tmpdir -> do
 		c { annexAlwaysCommit = False }
 	Annex.BranchState.changeState $ \st -> 
 		st { alternateJournal = Just (toRawFilePath tmpdir) }
-	a `finally` cleanupInitialization sab
+	a `finally` cleanupInitialization sab tmpdir
 
 -- If the git-annex branch did not exist when this command started,
 -- it was created empty by this command, and this command has avoided
--- making any other commits to it. If nothing else has written to the
--- branch while this command was running, the branch will be deleted.
--- That allows for the git-annex branch that is fetched from the special
--- remote to contain Differences, which would prevent it from being merged
--- with the git-annex branch created by this command.
+-- making any other commits to it, writing any temporary annex branch
+-- changes to the alternateJournal, which can now be discarded.
+-- 
+-- If nothing else has written to the branch while this command was running,
+-- the branch will be deleted. That allows for the git-annex branch that is
+-- fetched from the special remote to contain Differences, which would prevent
+-- it from being merged with the git-annex branch created by this command.
 --
 -- If there is still not a sibling git-annex branch, this deletes all annex
 -- objects for git bundles from the annex objects directory, and deletes
@@ -1096,8 +1114,9 @@ specialRemoteFromUrl sab a = withTmpDir "journal" $ \tmpdir -> do
 -- does not contain any hooks. Since initialization installs
 -- hooks, have to work around that by not initializing, and 
 -- delete the git bundle objects.
-cleanupInitialization :: StartAnnexBranch -> Annex ()
-cleanupInitialization sab = void $ tryNonAsync $ do
+cleanupInitialization :: StartAnnexBranch -> FilePath -> Annex ()
+cleanupInitialization sab alternatejournaldir = void $ tryNonAsync $ do
+	liftIO $ mapM_ removeFile =<< dirContents alternatejournaldir
 	case sab of
 		AnnexBranchExistedAlready _ -> noop
 		AnnexBranchCreatedEmpty r ->
diff --git a/doc/todo/git-remote-annex_web_special_remote_support.mdwn b/doc/todo/git-remote-annex_web_special_remote_support.mdwn
index 2b9d720682..10ce9f9c65 100644
--- a/doc/todo/git-remote-annex_web_special_remote_support.mdwn
+++ b/doc/todo/git-remote-annex_web_special_remote_support.mdwn
@@ -20,3 +20,5 @@ works, after cloning, fetching again fails:
 
     joey@darkstar:~/tmp/newp2>git fetch origin
     git-annex: no url configured for httpalso special remote
+
+> fixed that, [[done]] --[[Joey]]

Added a comment
diff --git a/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_3_c8938dbe2a78001298e7ed7e9fa90746._comment b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_3_c8938dbe2a78001298e7ed7e9fa90746._comment
new file mode 100644
index 0000000000..24002bc2e1
--- /dev/null
+++ b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_3_c8938dbe2a78001298e7ed7e9fa90746._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="derphysiker"
+ avatar="http://cdn.libravatar.org/avatar/80623354932109c891c2e0ebf523b38f"
+ subject="comment 3"
+ date="2024-05-29T06:58:15Z"
+ content="""
+Well, that is an awesome idea with the more detailed info about each file. I had not thought about this yet.
+
+But what I was talking about is dumping - in a more aesthetic way - the remote configuration with its groups and wanteds (for example \"standard\" or \"groupwanted\" or some custom ones).
+
+Furthermore it would be great to even dump the groupwanted configuration for each group and for reference all the standard definitions.
+
+git annex vicfg is more for editing the configuration, but it is visually less appealing for just dumping and reading the information.
+"""]]

clarify which rclone special remote
Now that there are several.
diff --git a/doc/tips/multiple_remotes_accessing_the_same_data_store.mdwn b/doc/tips/multiple_remotes_accessing_the_same_data_store.mdwn
index d332e43700..2900655835 100644
--- a/doc/tips/multiple_remotes_accessing_the_same_data_store.mdwn
+++ b/doc/tips/multiple_remotes_accessing_the_same_data_store.mdwn
@@ -57,5 +57,6 @@ If you find combinations that work, please edit this page to list them.
 * directory and rsync
 * httpalso and directory
 * httpalso and rsync
-* httpalso and rclone (any layout except for frankencase)
+* httpalso and rclone (using git-remote-rclone) 
+  (any layout except for frankencase)
 * httpalso and any special remote that uses exporttree=yes

document using git-remote-annex with httpalso
diff --git a/doc/git-remote-annex.mdwn b/doc/git-remote-annex.mdwn
index f3d9fbc142..4b2ac1288e 100644
--- a/doc/git-remote-annex.mdwn
+++ b/doc/git-remote-annex.mdwn
@@ -48,6 +48,25 @@ time, for one of the pushes to be overwritten by the other one. In this
 situation, the overwritten push will appear to have succeeded, but pulling
 later will show the true situation.
 
+# HTTP ACCESS
+
+If the content of a special remote is published via http, a httpalso
+special remote can be initialized, and used to `git clone` and `git fetch`
+over http.
+
+For example, if the directory special remote set up above is published
+at `https://example.com/foo/`, set up the httpalso remote like this
+to access it:
+
+    git-annex initremote foohttp --with-url --sameas=foo type=httpalso url=https://example.com/foo/
+
+Be sure to remember to include exporttree=yes if the remote is configured
+that way.
+
+Once a httpalso remote is set up like this, `git fetch` from it to display
+its full annex:: url. That url can be shared with others to let them clone
+the repository.
+
 # REPOSITORY FORMAT
 
 The git repository is stored in the special remote using special annex objects

httpalso just worked, with one small issue to fix
diff --git a/doc/todo/git-remote-annex_web_special_remote_support.mdwn b/doc/todo/git-remote-annex_web_special_remote_support.mdwn
index d5f418d0d4..2b9d720682 100644
--- a/doc/todo/git-remote-annex_web_special_remote_support.mdwn
+++ b/doc/todo/git-remote-annex_web_special_remote_support.mdwn
@@ -6,20 +6,17 @@ special remote.
 
 Supporting something like this in git-remote-annex would be good.
 
-While to the user this might be considered part of the web special remote,
-it would really be a separate download code path in git-remote-annex that
-downloads from the urls.
-
-datalad-annex assumes that the url uses the exporttree=yes layout.
-If git-annex did the same, it would look in "$url/.git/annex/objects/".
-But it could instead try both that and the regular hash directories
-and use whichever it found.
-
-How should the annex:: url look for this? It needs to contain the UUID of
-the special remote (not the web special remote) because the MANIFEST key
-includes the UUID. Perhaps "annex::https://example.com/?type=web&uuid=..."
-or "annex::uuid?type=web&url=..." (in either case the inner url will need
-to be URI-encoded)
-
-What should be recorded in .git/config for such a remote? I suppose the
-annex:: url and no annex-uuid. --[[Joey]]
+The httpalso special remote already exists to handle this kind of thing.
+
+In fact, it just works with git-remote-annex!
+
+Eg, this url on my laptop is a directory special remote
+accessed via the web server:
+
+    annex::13c2500f-a302-4331-9720-6ec43cb8da2b?encryption=none&exporttree=yes&type=httpalso&url=http%3A%2F%2Flocalhost%2F~joey%2Ftmp%2Fd
+
+But, while fetching from a httpalso special remote works, and cloning
+works, after cloning, fetching again fails:
+
+    joey@darkstar:~/tmp/newp2>git fetch origin
+    git-annex: no url configured for httpalso special remote
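
The shape of the `annex::` url shown in this change can be reproduced with a few lines of shell. This is a rough sketch under stated assumptions, not how git-annex builds the url: `urlencode` and `annex_url` are hypothetical helpers, only ASCII input is handled, and the set of characters left unescaped here is a simplification that may differ slightly from what git-annex itself leaves unescaped.

```shell
# Percent-encode a string, leaving only A-Z a-z 0-9 . _ ~ - unescaped.
# Simplification: ASCII input only.
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

# Hypothetical helper: annex_url UUID key=value...
# Joins the parameters in the order given, encoding each value.
annex_url() {
    url="annex::$1"
    shift
    sep='?'
    for kv in "$@"; do
        url="$url$sep${kv%%=*}=$(urlencode "${kv#*=}")"
        sep='&'
    done
    printf '%s\n' "$url"
}
```

Called with the UUID and parameters from the example above, and `url=http://localhost/~joey/tmp/d`, this produces the same string as the url quoted there.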

clean up man page
diff --git a/doc/git-remote-annex.mdwn b/doc/git-remote-annex.mdwn
index 9c218d96f3..f3d9fbc142 100644
--- a/doc/git-remote-annex.mdwn
+++ b/doc/git-remote-annex.mdwn
@@ -36,35 +36,34 @@ When using the shorthand "annex::" url, the full url will be displayed
 each time you git pull or push, when it's possible for git-annex to
 determine it.
 
-When a special remote needs some credentials to be used, they are not
-included in the URL, and will need to be provided when cloning from the
-special remote. That is typically done by setting environment variables.
-Some special remotes may also need environment variables to be set when
-pulling or pushing.
+# CONFLICTING PUSHES
 
 Like any git repository, a git repository stored on a special remote can
 have conflicting things pushed to it from different places. This mostly
 works the same as any other git repository, eg a push that overwrites other
-work will be prevented unless forced. However, it is possible, when
-conflicting pushes are being done at the same time, for one of the pushes
-to be overwritten by the other one. In this situation, the overwritten 
-push will appear to have succeeded, but pulling later will show the true
-situation.
+work will be prevented unless forced. 
+
+However, it is possible, when conflicting pushes are being done at the same
+time, for one of the pushes to be overwritten by the other one. In this
+situation, the overwritten push will appear to have succeeded, but pulling
+later will show the true situation.
+
+# REPOSITORY FORMAT
 
 The git repository is stored in the special remote using special annex objects
 with names starting with "GITMANIFEST" and "GITBUNDLE". For details, see:
 <https://git-annex.branchable.com/internals/git-remote-annex/>
 
 Pushes to a special remote are usually done incrementally. However,
-sometimes the whole git repository (but not the annex) needs to be
-re-uploaded. That is done when force pushing a ref, or deleting a
-ref from the remote. It's also done when too many git bundles
-accumulate in the special remote, as configured by the
+sometimes the whole git repository is re-uploaded. That is done when force
+pushing a ref, or deleting a ref from the remote. It's also done when too
+many git bundles accumulate in the special remote, as configured by the
 `remote.<name>.annex-max-git-bundles` git config.
 
 Note that a re-upload of the repository does not delete old GITBUNDLE
 objects from it. This means that refs pushed to the special
 remote can still be accessed even after deleting or overwriting them.
+
 A push that deletes every ref from the special remote will delete all
 the accumulated GITBUNDLE objects. But of course, making such a push
 means that someone who clones from the special remote at that point in time

Added a comment
diff --git a/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_1_37a78ae43c990f80fabc661d98b8cc48._comment b/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_1_37a78ae43c990f80fabc661d98b8cc48._comment
new file mode 100644
index 0000000000..5f0869ba67
--- /dev/null
+++ b/doc/bugs/assistant___40__webapp__41___commited_unlocked_link_to_annex/comment_1_37a78ae43c990f80fabc661d98b8cc48._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="comment 1"
+ date="2024-05-28T17:51:02Z"
+ content="""
+any ideas?  now I got to another box where I have tried to \"sync\" this repo using assistant and found that it didn't sync anything since Apr 17th but also git history is quite a show
+: [tig view of history](http://www.oneukrainian.com/tmp/git-annex-blowup.png)
+
+with a bunch of commits moving files from annex into git etc.  Need to kill assistant for now. System has `10.20240129-1~ndall+1`
+"""]]

update
diff --git a/doc/internals/git-remote-annex.mdwn b/doc/internals/git-remote-annex.mdwn
index f32f08875c..8d15d28d3c 100644
--- a/doc/internals/git-remote-annex.mdwn
+++ b/doc/internals/git-remote-annex.mdwn
@@ -14,7 +14,7 @@ GITBUNDLE--$UUID-$sha256 is a git bundle.
 An ordered list of bundle keys, one per line. 
 
 Additionally, there may be bundle keys that are prefixed with "-".
-These keys are not part of the current content of the git remote
+These keys are not part of the current content of the git remote,
 and are in the process of being deleted.
 
 (Lines end with unix `"\n"`, not `"\r\n"`.)
@@ -43,12 +43,14 @@ stored in such a special remote, this procedure will work:
    (Note that later bundles can update refs from the versions in previous
    bundles.)
 
-Note that, if a GITBUNDLE listed in the GITMANIFEST turns out not to exist,
+If any GITBUNDLE listed in the GITMANIFEST turns out not to exist,
 a clone should treat this the same as if the GITMANIFEST were empty.
-bundle objects are deleted when a push is made to the remote that
-deletes all refs from it, and in a race between such a push and another
-push of some refs, it is possible for the GITMANIFEST to refer to deleted
-bundles.
+Bundle objects are deleted only when a push is made to the remote that
+deletes all refs from it, and when there was a race between such a push
+and another push of some refs, it is possible for the GITMANIFEST to
+refer to deleted bundles. In such a situation, the push that deleted all
+refs wins. (This race condition is why old GITBUNDLE objects are listed in
+the manifest rather than being immediately deleted.)
 
 When the special remote is encrypted, both the names and content of
 the GITMANIFEST and GITBUNDLE will also be encrypted. To
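
The manifest layout described in this change (one bundle key per line, with keys that are being deleted prefixed by "-") can be split apart with `sed`. This is a minimal sketch that assumes an unencrypted manifest has already been downloaded to a file or pipe; the function names are hypothetical.

```shell
# Print the bundle keys that make up the current content of the remote:
# non-empty lines that do not start with "-".
manifest_current() {
    sed -n '/^[^-]/p'
}

# Print the keys that are marked with "-" (in the process of being
# deleted), with the marker removed.
manifest_pending_delete() {
    sed -n 's/^-//p'
}
```

For example, `manifest_current < manifest` lists, in order, the bundles a clone would need to fetch.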

git-remote-annex: brought back max-git-bundles config
An incremental push that gets converted to a full push due to this
config results in the inManifest having just one bundle in it, and the
outManifest listing every other bundle. So it actually takes up more
space on the special remote. But, it speeds up clone and fetch to not
have to download a long series of bundles for incremental pushes.
diff --git a/CmdLine/GitRemoteAnnex.hs b/CmdLine/GitRemoteAnnex.hs
index e6cdbc50c9..2667e92f8c 100644
--- a/CmdLine/GitRemoteAnnex.hs
+++ b/CmdLine/GitRemoteAnnex.hs
@@ -273,6 +273,10 @@ fullPush :: State -> Remote -> [Ref] -> Annex (Bool, State)
 fullPush st rmt refs = guardPush st $ do
 	oldmanifest <- maybe (downloadManifestWhenPresent rmt) pure
 		(manifestCache st)
+	fullPush' oldmanifest st rmt refs
+
+fullPush' :: Manifest -> State -> Remote -> [Ref] -> Annex (Bool, State)
+fullPush' oldmanifest st rmt refs = do
 	let bs = map Git.Bundle.fullBundleSpec refs
 	(bundlekey, uploadbundle) <- generateGitBundle rmt bs oldmanifest
 	let manifest = mkManifest [bundlekey] $
@@ -297,14 +301,19 @@ guardPush st a = catchNonAsync a $ \ex -> do
 incrementalPush :: State -> Remote -> M.Map Ref Sha -> M.Map Ref Sha -> Annex (Bool, State)
 incrementalPush st rmt oldtrackingrefs newtrackingrefs = guardPush st $ do
 	oldmanifest <- maybe (downloadManifestWhenPresent rmt) pure (manifestCache st)
-	bs <- calc [] (M.toList newtrackingrefs)
-	(bundlekey, uploadbundle) <- generateGitBundle rmt bs oldmanifest
-	let manifest = oldmanifest <> mkManifest [bundlekey] mempty
-	manifest' <- startPush rmt manifest
-	uploadbundle
-	uploadManifest rmt manifest'
-	return (True, st { manifestCache = Nothing })
+	if length (inManifest oldmanifest) + 1 > remoteAnnexMaxGitBundles (Remote.gitconfig rmt)
+		then fullPush' oldmanifest st rmt (M.keys newtrackingrefs)
+		else go oldmanifest
   where
+	go oldmanifest = do
+		bs <- calc [] (M.toList newtrackingrefs)
+		(bundlekey, uploadbundle) <- generateGitBundle rmt bs oldmanifest
+		let manifest = oldmanifest <> mkManifest [bundlekey] mempty
+		manifest' <- startPush rmt manifest
+		uploadbundle
+		uploadManifest rmt manifest'
+		return (True, st { manifestCache = Nothing })
+	
 	calc c [] = return (reverse c)
 	calc c ((ref, sha):refs) = case M.lookup ref oldtrackingrefs of
 		Just oldsha
diff --git a/Types/GitConfig.hs b/Types/GitConfig.hs
index e1090a1121..42f1811997 100644
--- a/Types/GitConfig.hs
+++ b/Types/GitConfig.hs
@@ -373,6 +373,7 @@ data RemoteGitConfig = RemoteGitConfig
 	, remoteAnnexBwLimitDownload :: Maybe BwRate
 	, remoteAnnexAllowUnverifiedDownloads :: Bool
 	, remoteAnnexConfigUUID :: Maybe UUID
+	, remoteAnnexMaxGitBundles :: Int
 	, remoteAnnexAllowEncryptedGitRepo :: Bool
 	, remoteUrl :: Maybe String
 
@@ -453,6 +454,8 @@ extractRemoteGitConfig r remotename = do
 			readBwRatePerSecond =<< getmaybe "bwlimit-download"
 		, remoteAnnexAllowUnverifiedDownloads = (== Just "ACKTHPPT") $
 			getmaybe ("security-allow-unverified-downloads")
+		, remoteAnnexMaxGitBundles =
+			fromMaybe 100 (getmayberead  "max-git-bundles")
 		, remoteAnnexConfigUUID = toUUID <$> getmaybe "config-uuid"
 		, remoteAnnexShell = getmaybe "shell"
 		, remoteAnnexSshOptions = getoptions "ssh-options"
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index 65750ab917..19570dcfb8 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -1648,6 +1648,17 @@ Remotes are configured using these settings in `.git/config`.
   remotes, and is set when using [[git-annex-initremote]](1) with the
   `--private` option.
 
+* `remote.<name>.annex-max-git-bundles`, `annex.max-git-bundles`
+
+  When using [[git-remote-annex]] to store a git repository in a special
+  remote, this configures how many separate git bundle objects to store
+  in the special remote before re-uploading a single git bundle that contains
+  the entire git repository.
+  
+  The default is 100, which aims to avoid needing to re-upload often,
+  while preventing a clone or fetch needing to download too many objects.
+  Set to 0 to disable re-uploading.
+
 * `remote.<name>.annex-allow-encrypted-gitrepo`
 
   Setting this to true allows using [[git-remote-annex]] to push the git
diff --git a/doc/git-remote-annex.mdwn b/doc/git-remote-annex.mdwn
index 2d08e33b0a..9c218d96f3 100644
--- a/doc/git-remote-annex.mdwn
+++ b/doc/git-remote-annex.mdwn
@@ -36,39 +36,40 @@ When using the shorthand "annex::" url, the full url will be displayed
 each time you git pull or push, when it's possible for git-annex to
 determine it.
 
-When a special remote needs some additional credentials to be provided,
-they are not included in the URL, and need to be provided when cloning from
-the special remote. That is typically done by setting environment
-variables. Some special remotes may also need environment variables to be
-set when pulling or pushing.
+When a special remote needs some credentials to be used, they are not
+included in the URL, and will need to be provided when cloning from the
+special remote. That is typically done by setting environment variables.
+Some special remotes may also need environment variables to be set when
+pulling or pushing.
+
+Like any git repository, a git repository stored on a special remote can
+have conflicting things pushed to it from different places. This mostly
+works the same as any other git repository, eg a push that overwrites other
+work will be prevented unless forced. However, it is possible, when
+conflicting pushes are being done at the same time, for one of the pushes
+to be overwritten by the other one. In this situation, the overwritten 
+push will appear to have succeeded, but pulling later will show the true
+situation.
 
 The git repository is stored in the special remote using special annex objects
-with names starting with "GITMANIFEST" and "GITBUNDLE". For details about
-how the git repository is stored, see
+with names starting with "GITMANIFEST" and "GITBUNDLE". For details, see:
 <https://git-annex.branchable.com/internals/git-remote-annex/>
 
 Pushes to a special remote are usually done incrementally. However,
 sometimes the whole git repository (but not the annex) needs to be
 re-uploaded. That is done when force pushing a ref, or deleting a
-ref from the remote.
+ref from the remote. It's also done when too many git bundles
+accumulate in the special remote, as configured by the
+`remote.<name>.annex-max-git-bundles` git config.
 
-The special remote accumulates one GITBUNDLE object per push, and old
-objects are usually not deleted. This means that refs pushed to the special
+Note that a re-upload of the repository does not delete old GITBUNDLE
+objects from it. This means that refs pushed to the special
 remote can still be accessed even after deleting or overwriting them.
-A push that deletes every ref from the special remote does delete all
+A push that deletes every ref from the special remote will delete all
 the accumulated GITBUNDLE objects. But of course, making such a push
-means that someone clones from the special remote at that point in time
+means that someone who clones from the special remote at that point in time
 will see an empty remote.
 
-Like any git repository, a git repository stored on a special remote can
-have conflicting things pushed to it from different places. This mostly
-works the same as any other git repository, eg a push that overwrites other
-work will be prevented unless forced. However, it is possible, when
-conflicting pushes are being done at the same time, for one of the pushes
-to be overwritten by the other one. In this situation, the overwritten 
-push will appear to have succeeded, but pulling later will show the true
-situation.
-
 # SEE ALSO
 
 gitremote-helpers(1)
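
The check that `incrementalPush` gains in this change can be restated in shell to make the threshold concrete. This is only a sketch of the comparison as written in the diff; `needs_full_push` is a hypothetical name, and the bundle count is passed in rather than read from a real manifest.

```shell
# Succeeds (exit status 0) when an incremental push would be converted to
# a full push: the bundles already in the manifest, plus the one about to
# be added, exceed the configured limit.
needs_full_push() {
    current_bundles=$1
    max_bundles=${2:-100}   # 100 is the default used in the diff
    [ $((current_bundles + 1)) -gt "$max_bundles" ]
}

# The limit itself is a per-remote git config; "foo" is a placeholder
# remote name:
#   git config remote.foo.annex-max-git-bundles 20
```

With the default limit, the push that would add bundle number 101 is the first one to become a full push. Note that with a limit of 0 this comparison is always true, so the sketch does not model the documented "Set to 0 to disable re-uploading" behavior.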

comment
diff --git a/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_2_7124a6b1f076c411d45c6e964ea1f5fe._comment b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_2_7124a6b1f076c411d45c6e964ea1f5fe._comment
new file mode 100644
index 0000000000..a57d181868
--- /dev/null
+++ b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_2_7124a6b1f076c411d45c6e964ea1f5fe._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2024-05-28T16:53:00Z"
+ content="""
+Are you talking about listing information about the configuration of
+repositories, or listing information about which repos want a file?
+
+I can imagine extending `git-annex list` to display eg, a "-" if the 
+repository currently has, but does not want a file, and a "+" 
+if the repository wants, but does not currently have a file.
+Or something like that.
+
+If the goal is an overview of configuration from the git-annex branch, I
+agree `git-annex vicfg` is not perfect, and it would be nice to have
+something that dumped out all the configuration.
+"""]]

update
diff --git a/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_3_927812cb2d4a1789ba842c1fa55df13e._comment b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_3_927812cb2d4a1789ba842c1fa55df13e._comment
index b7f039d209..accb44c0bc 100644
--- a/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_3_927812cb2d4a1789ba842c1fa55df13e._comment
+++ b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_3_927812cb2d4a1789ba842c1fa55df13e._comment
@@ -9,11 +9,11 @@ Revert "fsck: warn about symlink pointing inside a gitdir"
 
 I don't know if it will be backported to the other affected git versions.
 
-Currently, the fsck.symlinkTargetLength check has not been reverted,
-so might want to still do something about that.
+As well as removing the symlink to .git check, that also removes the
+symlink target too long check.
 
 Also, git-remote-annex is affected by the same git clone check about hooks
-getting installed as git-lfs. That check is also part of the reversion set.
+getting installed as git-lfs. That check is also going to be reverted.
 git-remote-annex contains a workaround, but it currently only checks for
 the specific git versions that added that check, so if any new git point
 releases don't revert that check it will need to update its version list.

update
diff --git a/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir.mdwn b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir.mdwn
index 9c2f2e0a06..77fb4e4885 100644
--- a/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir.mdwn
+++ b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir.mdwn
@@ -40,3 +40,13 @@ If git-annex wanted to also avoid this breakage, it could set:
     git config fsck.symlinkTargetLength ignore
     git config receive.fsck.symlinkTargetLength ignore
     git config fetch.fsck.symlinkTargetLength ignore
+
+Of course, that would not help when the bare repo is not git-annex
+initialized.
+
+If a git-annex repo is checked out on Windows and has a longer symlink,
+this will cause fsck to complain about it, even though git-annex will of
+course use an adjusted unlocked branch and so the symlink won't actually be
+followed. That seems like a good reason to set these configs. OTOH, there's
+no benefit in doing it on Linux, unless some other OS has a longer
+`PATH_MAX` than 4096 (Hurd?)

adjust unlocked execute bit handling
When building an adjusted unlocked branch, make pointer files executable
when the annex object file is executable.
This slows down git-annex adjust --unlock/--unlock-present by needing to
stat all annex object files in the tree. Probably not a significant
slowdown compared to other work they do, but I have not benchmarked.
I chose to leave git-annex adjust --unlock marked as stable, even though
get or drop of an object file can change whether it would make the pointer
file executable. Partly because making it unstable would slow down
re-adjustment, and partly for symmetry with the handling of an unlocked
pointer file that is executable when the content is dropped, which does not
remove its execute bit.
diff --git a/Annex/AdjustedBranch.hs b/Annex/AdjustedBranch.hs
index 6aedaa29ed..4ce101d8f9 100644
--- a/Annex/AdjustedBranch.hs
+++ b/Annex/AdjustedBranch.hs
@@ -1,6 +1,6 @@
 {- adjusted branch
  -
- - Copyright 2016-2023 Joey Hess <id@joeyh.name>
+ - Copyright 2016-2024 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -68,9 +68,12 @@ import qualified Database.Keys
 import Config
 import Logs.View (is_branchView)
 import Logs.AdjustedBranchUpdate
+import Utility.FileMode
+import qualified Utility.RawFilePath as R
 
 import Data.Time.Clock.POSIX
 import qualified Data.Map as M
+import System.PosixCompat.Files (fileMode)
 
 class AdjustTreeItem t where
 	-- How to perform various adjustments to a TreeItem.
@@ -155,8 +158,13 @@ adjustToPointer :: TreeItem -> Annex (Maybe TreeItem)
 adjustToPointer ti@(TreeItem f _m s) = catKey s >>= \case
 	Just k -> do
 		Database.Keys.addAssociatedFile k f
-		Just . TreeItem f (fromTreeItemType TreeFile)
-			<$> hashPointerFile k
+		exe <- catchDefaultIO False $
+			(isExecutable . fileMode) <$> 
+				(liftIO . R.getFileStatus
+					=<< calcRepo (gitAnnexLocation k))
+		let mode = fromTreeItemType $ 
+			if exe then TreeExecutable else TreeFile
+		Just . TreeItem f mode <$> hashPointerFile k
 	Nothing -> return (Just ti)
 
 adjustToSymlink :: TreeItem -> Annex (Maybe TreeItem)
diff --git a/CHANGELOG b/CHANGELOG
index 9c0a26d0ba..4ee02472c3 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -5,6 +5,8 @@ git-annex (10.20240431) UNRELEASED; urgency=medium
     (Based on Michael Hanke's git-remote-datalad-annex.)
   * initremote, enableremote: Added --with-url to enable using
     git-remote-annex.
+  * When building an adjusted unlocked branch, make pointer files
+    executable when the annex object file is executable.
   * fsck: Fix recent reversion that made it say it was checksumming files
     whose content is not present.
   * Avoid the --fast option preventing checksumming in some cases it
diff --git a/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission.mdwn b/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission.mdwn
index 516197c1a1..c996f5b519 100644
--- a/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission.mdwn
+++ b/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission.mdwn
@@ -34,3 +34,5 @@ annex.alwayscommit = false.
 
 PS: git-annex is so solid that this is the first data-related issue I've 
 ever seen. Kudos!
+
+> [[fixed|done]] --[[Joey]]
diff --git a/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission/comment_1_c081e6824b116ca4960dc31a0a20b81a._comment b/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission/comment_1_c081e6824b116ca4960dc31a0a20b81a._comment
index 4f5b009918..b8c33fbedc 100644
--- a/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission/comment_1_c081e6824b116ca4960dc31a0a20b81a._comment
+++ b/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission/comment_1_c081e6824b116ca4960dc31a0a20b81a._comment
@@ -37,4 +37,11 @@ than stat.
 
 Overall, I think this is probably worth doing, just to be symmetric with
 `git-annex unlock`.
+
+There's also an argument that, if I have a large executable (LLM models
+come to mind for some ungodly reason), and I annex it and enter an adjusted
+branch, I should still be able to run it. Although it's really better to
+add it unlocked in the first place, since then you're tracking the execute
+bit in git permanently and not relying on best-effort execute bit
+preservation when copying objects around.
 """]]
diff --git a/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission/comment_2_39b5177e7666c7a3680458749cb0f600._comment b/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission/comment_2_39b5177e7666c7a3680458749cb0f600._comment
new file mode 100644
index 0000000000..19d7bdc1e3
--- /dev/null
+++ b/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission/comment_2_39b5177e7666c7a3680458749cb0f600._comment
@@ -0,0 +1,18 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2024-05-28T16:27:10Z"
+ content="""
+If the object file is executable at `git-annex unlock` time, the pointer
+file is made executable. If the object is then dropped, the pointer file
+remains executable.
+
+So shouldn't it be the case for symmetry that `git-annex adjust --unlock`
+should make the pointer file executable, and a drop followed by re-doing
+the same adjustment should leave the pointer file executable? That would
+argue for leaving it stable.
+
+I don't think there's a perfect solution to that question, both behaviors
+seem perhaps wanted at different times. But since leaving it stable avoids
+extra work, I'm leaning toward that.
+"""]]

retitle and comment
diff --git a/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission.mdwn b/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission.mdwn
index 5dd73cc7a0..516197c1a1 100644
--- a/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission.mdwn
+++ b/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission.mdwn
@@ -1,3 +1,5 @@
+[[!meta title="git-annex adjust --unlock does not copy execute bit of object files"]]
+
 It seems that performing `git annex adjust --unlock-present` or `sync` 
 will remove the +x permission from files.
 
diff --git a/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission/comment_1_c081e6824b116ca4960dc31a0a20b81a._comment b/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission/comment_1_c081e6824b116ca4960dc31a0a20b81a._comment
new file mode 100644
index 0000000000..4f5b009918
--- /dev/null
+++ b/doc/bugs/adjust_or_sync_in_unlock-present_repo_removes_+x_permission/comment_1_c081e6824b116ca4960dc31a0a20b81a._comment
@@ -0,0 +1,40 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2024-05-28T15:46:45Z"
+ content="""
+Notice that `git-annex unlock` does preserve the execute bit when the
+object file has it set.
+
+Currently, generating an adjusted branch does not look at permissions of
+object files.
+
+Now git-annex generally does not preserve execute bit on object files (eg
+when storing in a special remote), and of course doesn't know the
+permissions of an object file that's not currently present. So, if adjusted
+branch generation did look at the permissions, running it twice in two
+different repositories, or at different times in the same repository could
+result in different adjusted branch trees being generated.
+
+That might or might not be a problem for interoperability? Adjusted branches
+are not usually pushed anywhere, so might not matter much.
+
+It seems that in `instance AdjustTreeItem LinkAdjustment` it would no
+longer be able to have `adjustmentIsStable` return True. Well, it could, but
+then `git-annex adjust --unlock` followed by `git-annex get foo` when it
+happens to get an object with the execute bit, followed by `git-annex
+adjust --unlock` would not reflect the execute bit in the adjusted branch.
+
+So handling that case matters, re-adjusting would get slower. This might
+impact users who have a large tree they are adjusting with --unlock.
+(`git-annex adjust --unlock-present` is already not stable of course, so no
+additional performance penalty there)
+
+Of course, statting every object file to check for execute bits would also
+make adjusting a large tree somewhat slower. Probably on the order of less
+than 10% slower I'd guess, because it currently has to catKey, which is slower
+than stat.
+
+Overall, I think this is probably worth doing, just to be symmetric with
+`git-annex unlock`.
+"""]]

update
diff --git a/doc/todo/git-remote-annex_web_special_remote_support.mdwn b/doc/todo/git-remote-annex_web_special_remote_support.mdwn
index f31b41646b..d5f418d0d4 100644
--- a/doc/todo/git-remote-annex_web_special_remote_support.mdwn
+++ b/doc/todo/git-remote-annex_web_special_remote_support.mdwn
@@ -15,9 +15,11 @@ If git-annex did the same, it would look in "$url/.git/annex/objects/".
 But it could instead try both that and the regular hash directories
 and use whichever it found.
 
-How should the annex:: url look for this? The UUID is not really relevant
-in this case, because the web only has a dummy single UUID. So
-it would work to use "annex::https://example.com/?type=web"
+How should the annex:: url look for this? It needs to contain the UUID of
+the special remote (not the web special remote) because the MANIFEST key
+includes the UUID. Perhaps "annex::https://example.com/?type=web&uuid=..."
+or "annex::uuid?type=web&url=..." (in either case the inner url will need
+to be URI-encoded)
 
 What should be recorded in .git/config for such a remote? I suppose the
 annex:: url and no annex-uuid. --[[Joey]]

Added a comment: Yep, would be nice!
diff --git a/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_1_911c0ff3327f1ce418588cfb7dca487f._comment b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_1_911c0ff3327f1ce418588cfb7dca487f._comment
new file mode 100644
index 0000000000..df8ed262e0
--- /dev/null
+++ b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants/comment_1_911c0ff3327f1ce418588cfb7dca487f._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="nobodyinperson"
+ avatar="http://cdn.libravatar.org/avatar/736a41cd4988ede057bae805d000f4f5"
+ subject="Yep, would be nice!"
+ date="2024-05-28T12:18:58Z"
+ content="""
+This doesn't exist and you need to script it yourself, e.g. with `git annex info --fast --json` and then subsequent calls to `git annex group ...` etc. I want to include this in [`git annex control-center`](https://gitlab.com/nobodyinperson/git-annex-control-center) eventually, but didn't have the time yet.
+"""]]

Added a comment: Re: worktree provisioning
diff --git a/doc/todo/compute_special_remote/comment_8_bace0128b326dba6394e0f23b743f049._comment b/doc/todo/compute_special_remote/comment_8_bace0128b326dba6394e0f23b743f049._comment
new file mode 100644
index 0000000000..7a65fd222e
--- /dev/null
+++ b/doc/todo/compute_special_remote/comment_8_bace0128b326dba6394e0f23b743f049._comment
@@ -0,0 +1,22 @@
+[[!comment format=mdwn
+ username="m.risse@77eac2c22d673d5f10305c0bade738ad74055f92"
+ nickname="m.risse"
+ avatar="http://cdn.libravatar.org/avatar/59541f50d845e5f81aff06e88a38b9de"
+ subject="Re: worktree provisioning"
+ date="2024-05-28T12:06:39Z"
+ content="""
+(I forgot to tick \"email replies to me\", sorry for the late reply)
+
+My reasoning for suggesting to always stay in HEAD is this:
+Let's assume we have a file \"data.grib\" that we want to convert into \"data.nc\" using this compute special remote. We use its facilities to make it do exactly that.
+Now, if there was a bug in \"data.grib\" that necessitates an update, we would replace the file. The special remote could do two things then:
+
+1. Try to convert \"data.grib\" from current HEAD to \"data.nc\", possibly failing if the checksums no longer match (if git-annex is instructed to check those).
+2. Silently use the old version of \"data.grib\", creating a mismatch between \"data.nc\" and \"data.grib\" as available on HEAD (and in this case using a buggy version of the data).
+
+I think the first error is preferable over the second, because the second one is much more subtle and easy to miss.
+
+This same reasoning extends to software as well, if it is somehow tracked in git: for the above mentioned conversion one could use \"cdo\" (climate data operators). One could pin a specific version of \"cdo\" with nix and its flake.lock file, meaning that there is an exact version of cdo associated with every commit sha of the git-annex/DataLad repository. If I update that lock file to get a new version of cdo, then as a user I would naively assume that re-converting \"data.grib\" to \"data.nc\" would now use this new version of cdo. With worktree provisioning it would silently use the old one instead.
+
+IMO worktree provisioning would create an explosion of potential inputs to consider for the computation (the entire git history so far), which would create a lot of subtle pitfalls. Always using stuff from HEAD would be an easier implementation, easier to reason about, and make the user explicitly responsible for keeping the repository contents consistent.
+"""]]

update
diff --git a/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_1_0cbe5ed96c9fa9dea13bdf6b52243243._comment b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_1_0cbe5ed96c9fa9dea13bdf6b52243243._comment
index 2bcea465e8..e951ea555b 100644
--- a/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_1_0cbe5ed96c9fa9dea13bdf6b52243243._comment
+++ b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_1_0cbe5ed96c9fa9dea13bdf6b52243243._comment
@@ -3,19 +3,6 @@
  subject="""comment 1"""
  date="2024-05-21T21:47:38Z"
  content="""
-BTW, I have to mention that I'm deeply unhappy for git for making this
-change, with such a 
-[weak justification](https://github.com/git/git/commit/a33fea0886cfa016d313d2bd66bdd08615bffbc9),
-and so little care for breakage.
-
- The change came after a security fix which involved symlinks and
- `.git/objects`, but that was a symlink *inside* `.git/objects`, 
-which is entirely different than a symlink pointing into the
-`.git` directory. 
-
-While it's understandable that someone encountering a
-symlink related security hole may want to throw out the baby with the
-bathwater, what they have actually done here is to only throw out the
-baby. This change will not prevent the class of security hole that
-motivated it.
+[Deleted this comment. I was annoyed and rightfully so, but I also
+misunderstood which security hole led them down this path.]
 """]]
diff --git a/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_3_927812cb2d4a1789ba842c1fa55df13e._comment b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_3_927812cb2d4a1789ba842c1fa55df13e._comment
new file mode 100644
index 0000000000..b7f039d209
--- /dev/null
+++ b/doc/todo/deal_with_git_fsck_symlinkPointsToGitDir/comment_3_927812cb2d4a1789ba842c1fa55df13e._comment
@@ -0,0 +1,20 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2024-05-28T02:39:52Z"
+ content="""
+Junio has now queued for git 2.45.2, coming early in June:
+
+Revert "fsck: warn about symlink pointing inside a gitdir"
+
+I don't know if it will be backported to the other affected git versions.
+
+Currently, the fsck.symlinkTargetLength check has not been reverted,
+so might want to still do something about that.
+
+Also, git-remote-annex is affected by the same git clone check about hooks
+getting installed as git-lfs. That check is also part of the reversion set.
+git-remote-annex contains a workaround, but it currently only checks for
+the specific git versions that added that check, so if any new git point
+releases don't revert that check it will need to update its version list.
+"""]]

diff --git a/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants.mdwn b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants.mdwn
new file mode 100644
index 0000000000..9faa71542a
--- /dev/null
+++ b/doc/forum/Feature_request__58___List_all_locations__44___groups__38__wants.mdwn
@@ -0,0 +1,7 @@
+Hi there!
+
+Maybe I have overlooked this feature. I would like to have something like 'git annex list' which lists all basic info about the locations, such as name, group and wanted.
+
+Basically I need to go into 'git annex vicfg' to have some kind of overview of the remotes.
+
+Or did I miss something?

git-remote-annex: Fix error display on clone
cleanupInitialization gets run when an exception is thrown, so needs to
avoid throwing exceptions itself, as that would hide the error message
that the user needs to see.
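
A rough sketch of why the cleanup must swallow its own exceptions, using only `Control.Exception` from base. `quietly` is a hypothetical helper that plays the role `void . tryNonAsync` plays in the patch; `failingClone` and `brokenCleanup` are made-up actions for illustration.

```haskell
import Control.Exception

-- Run an action, discarding any synchronous exception it throws.
-- Asynchronous exceptions are rethrown so the program can still be
-- interrupted.
quietly :: IO () -> IO ()
quietly a = a `catch` handler
  where
    handler :: SomeException -> IO ()
    handler e = case fromException e of
        Just (SomeAsyncException _) -> throwIO e
        Nothing -> return ()

main :: IO ()
main = do
    -- onException runs the cleanup and then rethrows the original error.
    -- If the cleanup itself threw, its exception would replace the
    -- original one, so it is wrapped in quietly.
    r <- try (failingClone `onException` quietly brokenCleanup)
    case r of
        Left e -> putStrLn ("error shown to user: " ++ show (e :: ErrorCall))
        Right () -> putStrLn "clone succeeded"
  where
    failingClone = throwIO (ErrorCall "missing directory= in url")
    brokenCleanup = throwIO (ErrorCall "cannot delete branch")
```

Without `quietly`, the exception from `brokenCleanup` would be the one that propagates, and the message about the actual clone failure would be lost.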
diff --git a/CmdLine/GitRemoteAnnex.hs b/CmdLine/GitRemoteAnnex.hs
index 53c7274a12..e6cdbc50c9 100644
--- a/CmdLine/GitRemoteAnnex.hs
+++ b/CmdLine/GitRemoteAnnex.hs
@@ -1078,14 +1078,25 @@ specialRemoteFromUrl sab a = withTmpDir "journal" $ \tmpdir -> do
 -- hooks, have to work around that by not initializing, and 
 -- delete the git bundle objects.
 cleanupInitialization :: StartAnnexBranch -> Annex ()
-cleanupInitialization sab = do
+cleanupInitialization sab = void $ tryNonAsync $ do
 	case sab of
 		AnnexBranchExistedAlready _ -> noop
-		AnnexBranchCreatedEmpty r -> 
+		AnnexBranchCreatedEmpty r ->
 			whenM ((r ==) <$> Annex.Branch.getBranch) $ do
-				inRepo $ Git.Branch.delete Annex.Branch.fullname
 				indexfile <- fromRepo gitAnnexIndex
 				liftIO $ removeWhenExistsWith R.removeLink indexfile
+				-- When cloning failed and this is being
+				-- run as an exception is thrown, HEAD will
+				-- not be set to a valid value, which will
+				-- prevent deleting the git-annex branch.
+				-- But that's ok, git will delete the 
+				-- repository it failed to clone into.
+				-- So skip deleting to avoid an ugly
+				-- message.
+				inRepo Git.Branch.currentUnsafe >>= \case
+					Nothing -> return ()
+					Just _ -> void $ tryNonAsync $
+						inRepo $ Git.Branch.delete Annex.Branch.fullname
 	ifM (Annex.Branch.hasSibling <&&> nonbuggygitversion)
 		( do
 			autoInitialize' (pure True) remoteList
diff --git a/doc/todo/git-remote-annex.mdwn b/doc/todo/git-remote-annex.mdwn
index 269cf0772b..f492858f1f 100644
--- a/doc/todo/git-remote-annex.mdwn
+++ b/doc/todo/git-remote-annex.mdwn
@@ -6,13 +6,6 @@ It will be a safer implementation, will support incremental pushes, and
 will be available to users who don't use datalad. 
 --[[Joey]]
 
----
-
-This is implememented and working. Remaining todo list for it:
-
-* When git clone is used with an annex:: url that is for a directory
-  special remote and is missing directory=, for example, it does
-  not display any useful error message. git fetch does, but it seems
-  git clone eats git-remote-annex stderr.
+> [[done!]] --[[Joey]]
 
 See also: [[git-remote-annex_web_special_remote_support]]

split out a todo
diff --git a/doc/todo/git-remote-annex.mdwn b/doc/todo/git-remote-annex.mdwn
index 09708814c5..269cf0772b 100644
--- a/doc/todo/git-remote-annex.mdwn
+++ b/doc/todo/git-remote-annex.mdwn
@@ -15,8 +15,4 @@ This is implememented and working. Remaining todo list for it:
   not display any useful error message. git fetch does, but it seems
   git clone eats git-remote-annex stderr.
 
-* datalad-annex supports cloning from the web special remote,
-  using an url that contains the result of pushing to eg, a directory
-  special remote.
-  `datalad-annex::https://example.com?type=web&url={noquery}`
-  Supporting something like this would be good.
+See also: [[git-remote-annex_web_special_remote_support]]
diff --git a/doc/todo/git-remote-annex_web_special_remote_support.mdwn b/doc/todo/git-remote-annex_web_special_remote_support.mdwn
new file mode 100644
index 0000000000..f31b41646b
--- /dev/null
+++ b/doc/todo/git-remote-annex_web_special_remote_support.mdwn
@@ -0,0 +1,23 @@
+datalad-annex supports cloning from the web special remote,
+using an url that contains the result of pushing to eg, a directory
+special remote.
+
+`datalad-annex::https://example.com?type=web&url={noquery}`
+
+Supporting something like this in git-remote-annex would be good.
+
+While to the user this might be considered part of the web special remote,
+it would really be a separate download code path in git-remote-annex that
+downloads from the urls.
+
+datalad-annex assumes that the url uses the exporttree=yes layout.
+If git-annex did the same, it would look in "$url/.git/annex/objects/".
+But it could instead try both that and the regular hash directories
+and use whichever it found.
+
+How should the annex:: url look for this? The UUID is not really relevant
+in this case, because the web only has a dummy single UUID. So
+it would work to use "annex::https://example.com/?type=web"
+
+What should be recorded in .git/config for such a remote? I suppose the
+annex:: url and no annex-uuid. --[[Joey]]

git-remote-annex: support importtree=yes remotes
When exporttree=yes is also set. Probably it would also be possible to
support ones with only importtree=yes, by enabling exporttree=yes for
the remote only when using git-remote-annex, but let's keep this
simple... I'm not sure what gets recorded in .git/annex/ state
differently in the two cases that might cause a problem when doing that.
Note that the full annex:: urls generated and displayed for such a
remote omit the importtree=yes. Which is ok, cloning from such an url
uses an exporttree=yes remote, but the git-annex branch doesn't get written
by this program, so once the real config is available from the git-annex
branch, it will still function as an importtree=yes remote.
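
The config adjustment can be pictured with this simplified sketch. It assumes `RemoteConfig` is a plain map of strings, which is not the real git-annex type, and `configFor` and `configFor'` are hypothetical names that mirror how `remoteGen` becomes `remoteGen' id`.

```haskell
import qualified Data.Map as M

-- Simplified stand-in for git-annex's RemoteConfig (an assumption).
type RemoteConfig = M.Map String String

-- Look up a remote's stored config by UUID, then let the caller adjust
-- it before it is used to instantiate the remote.
configFor'
    :: (RemoteConfig -> RemoteConfig)
    -> M.Map String RemoteConfig
    -> String
    -> RemoteConfig
configFor' adjustconfig m uuid =
    adjustconfig (M.findWithDefault M.empty uuid m)

-- The ordinary variant applies no adjustment.
configFor :: M.Map String RemoteConfig -> String -> RemoteConfig
configFor = configFor' id

main :: IO ()
main = do
    let stored = M.fromList
            [ ("uuid-1", M.fromList [("exporttree", "yes"), ("importtree", "yes")]) ]
    -- The stored config is left untouched.
    print (configFor stored "uuid-1")
    -- Only the instantiated copy loses the importtree setting.
    print (configFor' (M.delete "importtree") stored "uuid-1")
```

Passing the adjustment as a function keeps the stored config unchanged, so only the remote instantiated for git-remote-annex sees the modified settings.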
diff --git a/CmdLine/GitRemoteAnnex.hs b/CmdLine/GitRemoteAnnex.hs
index 0d6ddaf604..53c7274a12 100644
--- a/CmdLine/GitRemoteAnnex.hs
+++ b/CmdLine/GitRemoteAnnex.hs
@@ -22,7 +22,6 @@ import qualified Git.Remote
 import qualified Git.Remote.Remove
 import qualified Git.Version
 import qualified Annex.SpecialRemote as SpecialRemote
-import qualified Annex.SpecialRemote.Config as SpecialRemote
 import qualified Annex.Branch
 import qualified Annex.BranchState
 import qualified Types.Remote as Remote
@@ -576,7 +575,7 @@ getEnabledSpecialRemoteByName remotename =
 			| unparsedRemoteConfig (Remote.config rmt) == mempty ->
 				return Nothing
 			| otherwise -> 
-				maybe (return (Just rmt)) giveup
+				maybe (Just <$> importTreeWorkAround rmt) giveup
 					(checkSpecialRemoteProblems rmt)
 
 checkSpecialRemoteProblems :: Remote -> Maybe String
@@ -586,9 +585,6 @@ checkSpecialRemoteProblems rmt
 	| Remote.thirdPartyPopulated (Remote.remotetype rmt) =
 		Just $ "Cannot use this thirdparty-populated special"
 			++ " remote as a git remote."
-	| importTree (Remote.config rmt) = 
-		Just $ "Using importtree=yes special remotes as git remotes"
-			++ " is not yet supported."
 	| parseEncryptionMethod (unparsedRemoteConfig (Remote.config rmt)) /= Right NoneEncryption
 		&& not (remoteAnnexAllowEncryptedGitRepo (Remote.gitconfig rmt)) =
 			Just $ "Using an encrypted special remote as a git"
@@ -600,6 +596,27 @@ checkSpecialRemoteProblems rmt
   where
 	ConfigKey allowencryptedgitrepo = remoteAnnexConfig rmt "allow-encrypted-gitrepo"
 
+-- Using importTree remotes needs the content identifier database to be
+-- populated, but it is not when cloning, and cannot be updated when
+-- pushing since git-annex branch updates by this program are prevented.
+--
+-- So, generate instead a version of the remote that uses exportTree actions,
+-- which do not need content identifiers. Since Remote.Helper.exportImport
+-- replaces the exportActions in exportActionsForImport with ones that use
+-- import actions, have to instantiate a new remote with a modified config.
+importTreeWorkAround :: Remote -> Annex Remote
+importTreeWorkAround rmt
+	| not (importTree (Remote.config rmt)) = pure rmt
+	| not (exportTree (Remote.config rmt)) = giveup "Using special remotes with importtree=yes but without exporttree=yes as git remotes is not supported."
+	| otherwise = do
+		m <- Logs.Remote.remoteConfigMap
+		r <- Remote.getRepo rmt
+		remoteGen' adjustconfig m (Remote.remotetype rmt) r >>= \case
+			Just rmt' -> return rmt'
+			Nothing -> giveup "Failed to use importtree=yes remote."
+  where
+	adjustconfig = M.delete importTreeField
+
 -- Downloads the Manifest when present in the remote. When not present,
 -- returns an empty Manifest.
 downloadManifestWhenPresent :: Remote -> Annex Manifest
diff --git a/Remote/List.hs b/Remote/List.hs
index e25d09b957..71e33f7763 100644
--- a/Remote/List.hs
+++ b/Remote/List.hs
@@ -87,12 +87,20 @@ remoteList' autoinit = do
 
 {- Generates a Remote. -}
 remoteGen :: M.Map UUID RemoteConfig -> RemoteType -> Git.Repo -> Annex (Maybe Remote)
-remoteGen m t g = do
+remoteGen = remoteGen' id
+
+remoteGen'
+	:: (RemoteConfig -> RemoteConfig)
+	-> M.Map UUID RemoteConfig
+	-> RemoteType
+	-> Git.Repo
+	-> Annex (Maybe Remote)
+remoteGen' adjustconfig m t g = do
 	u <- getRepoUUID g
 	gc <- Annex.getRemoteGitConfig g
 	let cu = fromMaybe u $ remoteAnnexConfigUUID gc
 	let rs = RemoteStateHandle cu
-	let c = fromMaybe M.empty $ M.lookup cu m
+	let c = adjustconfig (fromMaybe M.empty $ M.lookup cu m)
 	generate t g u c gc rs >>= \case
 		Nothing -> return Nothing
 		Just r -> Just <$> adjustExportImport (adjustReadOnly (addHooks r)) rs
diff --git a/doc/todo/git-remote-annex.mdwn b/doc/todo/git-remote-annex.mdwn
index 6d635cc492..09708814c5 100644
--- a/doc/todo/git-remote-annex.mdwn
+++ b/doc/todo/git-remote-annex.mdwn
@@ -15,10 +15,6 @@ This is implememented and working. Remaining todo list for it:
   not display any useful error message. git fetch does, but it seems
   git clone eats git-remote-annex stderr.
 
-* Cloning from an annex:: url with importtree=yes doesn't work
-  (with or without exporttree=yes). This is because the ContentIdentifier
-  db is not populated. It should be possible to work around this.
-
 * datalad-annex supports cloning from the web special remote,
   using an url that contains the result of pushing to eg, a directory
   special remote.