Recent changes to this wiki:

diff --git a/doc/bugs/git_commit_smudges_unncessarily.mdwn b/doc/bugs/git_commit_smudges_unncessarily.mdwn
new file mode 100644
index 000000000..b0aae6b49
--- /dev/null
+++ b/doc/bugs/git_commit_smudges_unncessarily.mdwn
@@ -0,0 +1,20 @@
+### Please describe the problem.
+
+git commit smudges unlocked files in the index when they were already smudged with git status.
+
+### What steps will reproduce the problem?
+
+```
+for i in {1..200}; do echo $i > $i; done
+git init; git annex init; git config annex.thin true; git config annex.crippledfilesystem true; git config annex.addunlocked true; git annex add .; git status; time git commit -m .
+```
+
+git commit should take less than a second but takes several seconds to smudge the files in the index.
+
+### What version of git-annex are you using? On what operating system?
+
+Through bisection, the problem was found to be introduced in [[!commit 428c91606b434512d1986622e751c795edf4df44]]. Problem occurs both in Linux and WSL1.
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+git-annex has worked wonderfully for managing my files across different machines and cloud storage services.

Added a comment
diff --git a/doc/bugs/git-annex_died_of_signal_11_when_syncing_content/comment_4_678a0136f143d2d7874226b3fc744eb2._comment b/doc/bugs/git-annex_died_of_signal_11_when_syncing_content/comment_4_678a0136f143d2d7874226b3fc744eb2._comment
new file mode 100644
index 000000000..2665ce27e
--- /dev/null
+++ b/doc/bugs/git-annex_died_of_signal_11_when_syncing_content/comment_4_678a0136f143d2d7874226b3fc744eb2._comment
@@ -0,0 +1,24 @@
+[[!comment format=mdwn
+ username="git-annex.visiteur@e9d364191d2ffc1b163c8d9e4c57dbadf58aad8e"
+ nickname="git-annex.visiteur"
+ avatar="http://cdn.libravatar.org/avatar/59640df9d44f100f0bf98c1cbb430037"
+ subject="comment 4"
+ date="2021-10-21T08:24:55Z"
+ content="""
+Yes, the problem occurs each time I run `git annex get .` on a repository with about 20,000 files.
+
+As I said, I've run memtest86 (several times) and memtester (several times) without any problem. Do you think it could be a hardware problem despite these results?
+
+You suggested using gdb. I get
+
+Thread 1 \"git-annex\" received signal SIGSEGV, Segmentation fault.
+0x0000000003c6be67 in ?? ()
+
+(gdb) backtrace
+
+#0  0x0000000003c6be67 in ?? ()
+#1  0x0000000000000000 in ?? ()
+
+which is not very helpful. Do you have any advice on how to investigate further?
+
+"""]]

diff --git a/doc/bugs/metadata_--batch_--json_should_fail_on_bad_fields.mdwn b/doc/bugs/metadata_--batch_--json_should_fail_on_bad_fields.mdwn
new file mode 100644
index 000000000..0441664fc
--- /dev/null
+++ b/doc/bugs/metadata_--batch_--json_should_fail_on_bad_fields.mdwn
@@ -0,0 +1,3 @@
+When setting file metadata using `git-annex metadata --batch --json --json-error-messages`, if the "fields" field of an input line is not 100% an object whose values are arrays of strings, then git-annex will silently ignore the "fields" field and act as though the user simply requested the metadata for the given file/key.  It would be more useful if, whenever the input contains a "fields" field that does not match the required schema, git annex treats it as an error.  This would make it easier for users to figure out that they are doing something wrong.
+
+[[!meta author=jwodder]]

Added a comment
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_12_76f765e3befd5e263a8863b56cad139b._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_12_76f765e3befd5e263a8863b56cad139b._comment
new file mode 100644
index 000000000..adaae56df
--- /dev/null
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_12_76f765e3befd5e263a8863b56cad139b._comment
@@ -0,0 +1,45 @@
+[[!comment format=mdwn
+ username="asakurareiko@f3d908c71c009580228b264f63f21c7274df7476"
+ nickname="asakurareiko"
+ avatar="http://cdn.libravatar.org/avatar/a865743e357add9d15081840179ce082"
+ subject="comment 12"
+ date="2021-10-21T02:09:02Z"
+ content="""
+Forgot to add in the previous comment. The index looks fine afterwards
+
+```
+On branch master
+
+No commits yet
+
+Changes to be committed:
+  (use \"git rm --cached <file>...\" to unstage)
+        new file:   a
+        new file:   b
+        new file:   c
+```
+
+```
+diff --git a/a b/a
+new file mode 100755
+index 0000000..f8e47b9
+--- /dev/null
++++ b/a
+@@ -0,0 +1 @@
++/annex/objects/SHA256E-s0--e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
+diff --git a/b b/b
+new file mode 100755
+index 0000000..f8e47b9
+--- /dev/null
++++ b/b
+@@ -0,0 +1 @@
++/annex/objects/SHA256E-s0--e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
+diff --git a/c b/c
+new file mode 100755
+index 0000000..f8e47b9
+--- /dev/null
++++ b/c
+@@ -0,0 +1 @@
++/annex/objects/SHA256E-s0--e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
+```
+"""]]

Added a comment
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_11_94d00ed84dfebebd7889fa861172df61._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_11_94d00ed84dfebebd7889fa861172df61._comment
new file mode 100644
index 000000000..b01e11584
--- /dev/null
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_11_94d00ed84dfebebd7889fa861172df61._comment
@@ -0,0 +1,46 @@
+[[!comment format=mdwn
+ username="asakurareiko@f3d908c71c009580228b264f63f21c7274df7476"
+ nickname="asakurareiko"
+ avatar="http://cdn.libravatar.org/avatar/a865743e357add9d15081840179ce082"
+ subject="comment 11"
+ date="2021-10-21T02:06:50Z"
+ content="""
+I found a new type of failure which occurs when there are new unlocked files in the index.
+
+```
+git init
+git annex init
+git config annex.crippledfilesystem true
+git config annex.addunlocked true
+touch a
+git annex add . # OK
+touch b
+git annex add . # 1 error
+touch c
+git annex add . # 2 errors
+```
+
+Something is happening to the files already in the index and the error is triggered once per file in the index.
+
+```
+add c
+ok
+sqlite worker thread crashed: user error (SQLite3 returned ErrorProtocol while attempting to perform prepare \"SELECT null from content limit 1\": locking protocol(while opening database connection))
+git-annex: sqlite query crashed: thread blocked indefinitely in an MVar operation
+CallStack (from HasCallStack):
+  error, called at ./Database/Handle.hs:102:40 in main:Database.Handle
+error: external filter 'git-annex smudge --clean -- %f' failed 1
+error: external filter 'git-annex smudge --clean -- %f' failed
+add a
+ok
+sqlite worker thread crashed: user error (SQLite3 returned ErrorProtocol while attempting to perform prepare \"SELECT null from content limit 1\": locking protocol(while opening database connection))
+git-annex: sqlite query crashed: thread blocked indefinitely in an MVar operation
+CallStack (from HasCallStack):
+  error, called at ./Database/Handle.hs:102:40 in main:Database.Handle
+error: external filter 'git-annex smudge --clean -- %f' failed 1
+error: external filter 'git-annex smudge --clean -- %f' failed
+add b
+ok
+(recording state in git...)
+```
+"""]]

diff --git a/doc/bugs/metadata_cmd._vs._--json-error-messages.mdwn b/doc/bugs/metadata_cmd._vs._--json-error-messages.mdwn
new file mode 100644
index 000000000..9f5524636
--- /dev/null
+++ b/doc/bugs/metadata_cmd._vs._--json-error-messages.mdwn
@@ -0,0 +1,6 @@
+(Sorry about the title; I was trying to work within the character limit.)
+
+When invoking `git-annex metadata --batch --json --json-error-messages`, if an error occurs in response to some input — say, because the name of a nonexistent file was supplied (or, in my case, because the name of a file downloaded milliseconds ago in a parallel addurl process was supplied) — then `git-annex metadata` will output "git-annex: not an annexed file: {filepath}" to standard error and immediately exit.  Not only is this in contrast to what it seems `--json-error-messages` should do, but the "exiting immediately" bit is in contrast to my understanding of how batch mode is supposed to work.  Surely this should be fixed?
+
+[[!meta author=jwodder]]
+[[!tag projects/dandi]]

Added a comment
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_10_93c9d3ea6d7a1f2d035f8a52c81790ad._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_10_93c9d3ea6d7a1f2d035f8a52c81790ad._comment
new file mode 100644
index 000000000..15b06ccc7
--- /dev/null
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_10_93c9d3ea6d7a1f2d035f8a52c81790ad._comment
@@ -0,0 +1,71 @@
+[[!comment format=mdwn
+ username="asakurareiko@f3d908c71c009580228b264f63f21c7274df7476"
+ nickname="asakurareiko"
+ avatar="http://cdn.libravatar.org/avatar/a865743e357add9d15081840179ce082"
+ subject="comment 10"
+ date="2021-10-20T22:54:39Z"
+ content="""
+The error I got previously (before [[!commit 0f38ad9a6]]) with my test case is
+
+```
+init
+  Detected a filesystem without fifo support.
+
+  Disabling ssh connection caching.
+ok
+(recording state in git...)
+get a (from origin...)
+ok
+get b (from origin...)
+ok
+(recording state in git...)
+sqlite worker thread crashed: user error (SQLite3 returned ErrorProtocol while attempting to perform prepare \"SELECT null from content limit 1\": locking protocol(while opening database connection))
+git-annex: sqlite query crashed: thread blocked indefinitely in an MVar operation
+CallStack (from HasCallStack):
+  error, called at ./Database/Handle.hs:102:40 in main:Database.Handle
+error: external filter 'git-annex smudge --clean -- %f' failed 1
+error: external filter 'git-annex smudge --clean -- %f' failed
+sqlite worker thread crashed: user error (SQLite3 returned ErrorProtocol while attempting to perform prepare \"SELECT null from content limit 1\": locking protocol(while opening database connection))
+git-annex: sqlite query crashed: thread blocked indefinitely in an MVar operation
+CallStack (from HasCallStack):
+  error, called at ./Database/Handle.hs:102:40 in main:Database.Handle
+error: external filter 'git-annex smudge --clean -- %f' failed 1
+error: external filter 'git-annex smudge --clean -- %f' failed
+```
+
+With [[!commit d0ef8303c]], the test case still works, but adjusted branches still have the same error. 
+
+```
+git init
+git annex init
+git config annex.crippledfilesystem true
+echo aaa > a
+cp a b
+git annex add .
+git commit -m .
+git annex adjust --unlock
+```
+
+produces
+
+```
+adjust
+sqlite worker thread crashed: user error (SQLite3 returned ErrorProtocol while attempting to perform prepare \"SELECT null from content limit 1\": locking protocol(while opening database connection))
+git-annex: thread blocked indefinitely in an MVar operation
+error: external filter 'git-annex smudge -- %f' failed 1
+error: external filter 'git-annex smudge -- %f' failed
+sqlite worker thread crashed: user error (SQLite3 returned ErrorProtocol while attempting to perform prepare \"SELECT null from content limit 1\": locking protocol(while opening database connection))
+git-annex: thread blocked indefinitely in an MVar operation
+error: external filter 'git-annex smudge -- %f' failed 1
+error: external filter 'git-annex smudge -- %f' failed
+Switched to branch 'adjusted/master(unlocked)'
+sqlite worker thread crashed: user error (SQLite3 returned ErrorProtocol while attempting to perform prepare \"SELECT null from content limit 1\": locking protocol(while opening database connection))
+git-annex: sqlite query crashed: thread blocked indefinitely in an MVar operation
+CallStack (from HasCallStack):
+  error, called at ./Database/Handle.hs:79:40 in main:Database.Handle
+failed
+adjust: 1 failed
+```
+
+About `git-annex version`, I'm using `make install-home` to do an incremental build but the version does not update.
+"""]]

comment
diff --git a/doc/bugs/git-annex_died_of_signal_11_when_syncing_content/comment_3_373e63d55e07cb5cdecc4e01d5d52b79._comment b/doc/bugs/git-annex_died_of_signal_11_when_syncing_content/comment_3_373e63d55e07cb5cdecc4e01d5d52b79._comment
new file mode 100644
index 000000000..d73ed01f4
--- /dev/null
+++ b/doc/bugs/git-annex_died_of_signal_11_when_syncing_content/comment_3_373e63d55e07cb5cdecc4e01d5d52b79._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2021-10-20T19:18:17Z"
+ content="""
+Since the vast number of places git-annex runs do not have this problem, I
+don't think we can draw any conclusions about differences between your 2
+machines.
+
+Memory issue still seems like the most likely bet.
+
+Are you able to reproduce the problem repeatedly?
+"""]]

Added a comment: Same problem with git-annex 8.20210223
diff --git a/doc/bugs/git-annex_died_of_signal_11_when_syncing_content/comment_2_e920f624ccf2e3a9c71a00105ebc3595._comment b/doc/bugs/git-annex_died_of_signal_11_when_syncing_content/comment_2_e920f624ccf2e3a9c71a00105ebc3595._comment
new file mode 100644
index 000000000..83bbe81af
--- /dev/null
+++ b/doc/bugs/git-annex_died_of_signal_11_when_syncing_content/comment_2_e920f624ccf2e3a9c71a00105ebc3595._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="git-annex.visiteur@e9d364191d2ffc1b163c8d9e4c57dbadf58aad8e"
+ nickname="git-annex.visiteur"
+ avatar="http://cdn.libravatar.org/avatar/59640df9d44f100f0bf98c1cbb430037"
+ subject="Same problem with git-annex 8.20210223"
+ date="2021-10-20T19:09:29Z"
+ content="""
+I have exactly the same problem with Debian 11 and git-annex 8.20210223. In kernel.log I can see 
+
+> traps: git-annex[91341] general protection fault ip:3c6be67 sp:7ffd7afd26f8 error:0 in git-annex[400000+3a78000]
+
+I've run memtest86 on my memory and it reported no errors.
+
+Oddly, I have another machine with the same versions of the OS and git-annex, but it does not reproduce the problem. The biggest difference between the two machines is that the machine with the problem runs on ZFS. Is it possible that the problem comes from that?
+
+"""]]

comment
diff --git a/doc/forum/git_annex_sync_destroys_data_on_shallow_clones/comment_1_431fef51e00f19cdf206a5ab97570be5._comment b/doc/forum/git_annex_sync_destroys_data_on_shallow_clones/comment_1_431fef51e00f19cdf206a5ab97570be5._comment
new file mode 100644
index 000000000..072ffbd49
--- /dev/null
+++ b/doc/forum/git_annex_sync_destroys_data_on_shallow_clones/comment_1_431fef51e00f19cdf206a5ab97570be5._comment
@@ -0,0 +1,84 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-10-20T18:14:38Z"
+ content="""
+Well, I tried to reproduce this, following your instructions to the extent
+they were clear.
+
+I made a repository, added some files to git, and committed. 
+Then `git rm --cached` the files, and `git-annex add` to add them to
+git-annex, and committed. So master had 2 commits, an older commit
+where the files were in git, and a newer commit where the files are in
+git-annex. The git-annex branch had a couple of commits as well.
+
+Then I cloned:
+
+	git clone --depth 1 localhost:/tmp/path/to/repo clone
+
+In the clone, that made master be a single commit. There was no
+origin/git-annex branch in the clone, like there normally would be
+in a non-shallow clone.
+
+Then I ran `git-annex init; git annex sync`:
+
+	joey@darkstar:/tmp/clone>git annex init
+	init  ok
+	(recording state in git...)
+	joey@darkstar:/tmp/clone>git annex sync
+	commit 
+	On branch master
+	Your branch is up to date with 'origin/master'.
+	
+	nothing to commit, working tree clean
+	ok
+	pull origin 
+	ok
+	push origin 
+	Enumerating objects: 6, done.
+	Counting objects: 100% (6/6), done.
+	Delta compression using up to 4 threads
+	Compressing objects: 100% (3/3), done.
+	Writing objects: 100% (5/5), 450 bytes | 450.00 KiB/s, done.
+	Total 5 (delta 0), reused 0 (delta 0), pack-reused 0
+	To localhost:/tmp/repo
+	 * [new branch]      master -> synced/master
+	 * [new branch]      git-annex -> synced/git-annex
+	ok
+
+At this point, the git-annex branch in the remote repository is not
+destroyed. It contains a merge between the branch that was there before
+the clone and the git-annex branch that was synced from the clone.
+Looks just fine.
+
+In the clone, there is still no origin/git-annex branch, and the git-annex
+branch has only the changes that git-annex committed to it in the clone.
+
+So, the clone still doesn't know that it can get the annexed files from origin.
+But nothing is "destroyed".
+
+This does not seem like the best possible behavior; it would be better if,
+after git-annex sync, it fetched origin/git-annex (either the latest
+commit or all of them) and merged it into the local git-annex branch.
+
+What's going on is, the shallow clone gets remote.origin.fetch set
+to "+refs/heads/master:refs/remotes/origin/master". So, attempting
+to fetch any other branch from origin will always skip creating
+a tracking branch.
+
+All you have to do, then, is
+
+	git config remote.origin.fetch '+refs/heads/*:refs/remotes/origin/*'
+
+That preserves master as a shallow clone, while letting
+the git-annex branch be fetched. Or, alternatively, `git fetch --unshallow`.
+
+Maybe git-annex sync could detect this situation and force fetch 
+the git-annex branch. (eg, git fetch origin git-annex, which does
+actually fetch the refs, followed by manually setting
+origin/git-annex to `FETCH_HEAD`) That would leave workflows using
+`git push` and `git pull` still with the problem. And it might be that
+someone who wants a shallow clone also wants the git-annex branch to be
+cloned shallowly and would object if its full history was fetched by that.
+I have not found a way yet to fetch the git-annex branch shallowly.
+"""]]
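
The refspec behavior described in the comment above can be checked without git-annex. This is a minimal sketch under stated assumptions, not a reproduction of the original report: the repository names `src` and `clone` and the branch `side` (standing in for the git-annex branch) are hypothetical, and the clone uses a local `file://` URL in a temporary directory.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical source repository; "side" stands in for the git-annex branch.
git init -q src
git -C src -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m first
git -C src branch side

# --depth implies --single-branch, so the fetch refspec names one branch only.
git clone -q --depth 1 "file://$work/src" clone
before=$(git -C clone config --get remote.origin.fetch)

# Widen the refspec and fetch again; the other branch now gets a tracking ref.
git -C clone config remote.origin.fetch '+refs/heads/*:refs/remotes/origin/*'
git -C clone fetch -q origin
after=$(git -C clone branch -r --list 'origin/side')

echo "refspec before: $before"
echo "tracking branch after fetch: $after"
```

The checked-out branch stays shallow; only the set of tracked branches changes.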

comment
diff --git a/doc/todo/More_space_savings_for_annex.thin/comment_3_15cc8f9e428df04199d139e2677afd8a._comment b/doc/todo/More_space_savings_for_annex.thin/comment_3_15cc8f9e428df04199d139e2677afd8a._comment
new file mode 100644
index 000000000..6114e5a41
--- /dev/null
+++ b/doc/todo/More_space_savings_for_annex.thin/comment_3_15cc8f9e428df04199d139e2677afd8a._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2021-10-20T18:01:31Z"
+ content="""
+If your filesystem supports reflinks, you should not need to enable
+annex.thin, just let git-annex make copies. It makes the copy
+using `cp --reflink=auto`, so when reflinks are supported, you'll get a
+nice cheap reflink.
+
+WRT annex.thinmode=forcehardlink, this would be something that aims the
+gun right at the user's foot and then waits for the trigger to be pulled
+by any program that might ever write to a file.
+"""]]

comment
diff --git a/doc/bugs/Host_resolution_error_on_Android_when_adding_RSync.net_repo/comment_2_97bf976e0e04fb27916e245eb9c7709d._comment b/doc/bugs/Host_resolution_error_on_Android_when_adding_RSync.net_repo/comment_2_97bf976e0e04fb27916e245eb9c7709d._comment
new file mode 100644
index 000000000..706e54e99
--- /dev/null
+++ b/doc/bugs/Host_resolution_error_on_Android_when_adding_RSync.net_repo/comment_2_97bf976e0e04fb27916e245eb9c7709d._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-10-20T17:52:23Z"
+ content="""
+This is not limited to rsync.net, other DNS lookups by git-annex also fail.
+
+Without investigating, my strong hunch is that this is because git-annex is
+linked to glibc, and so it expects there to be /etc/nsswitch.conf,
+/etc/resolv.conf, /etc/services etc that glibc uses. And
+some/all of those files are not present on Android. If you were able to set
+up the files, it would probably work.
+"""]]
diff --git a/doc/forum/is_it_possible_to_use_github_with_git-lfs_special_remote_within_the_assistant___40__android__41____63__/comment_2_a73cbde4297d1be734c768d7b8f9d597._comment b/doc/forum/is_it_possible_to_use_github_with_git-lfs_special_remote_within_the_assistant___40__android__41____63__/comment_2_a73cbde4297d1be734c768d7b8f9d597._comment
new file mode 100644
index 000000000..28f7880b1
--- /dev/null
+++ b/doc/forum/is_it_possible_to_use_github_with_git-lfs_special_remote_within_the_assistant___40__android__41____63__/comment_2_a73cbde4297d1be734c768d7b8f9d597._comment
@@ -0,0 +1,7 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-10-20T17:51:53Z"
+ content="""
+See [[bugs/Host_resolution_error_on_Android_when_adding_RSync.net_repo]]
+"""]]

update
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
index f7a78a6a5..636823964 100644
--- a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
@@ -1,24 +1,20 @@
 [[!comment format=mdwn
  username="joey"
- subject="""comment 7"""
+ subject="""comment 8"""
  date="2021-10-20T17:04:09Z"
  content="""
 @asakurareiko oh that's encouraging that I seem to be on the right track.
 
-Adjusted branches still have the same error?
+Although I was not aware that this test case in your comment #8 failed
+before?
 
-I noticed that
-git-annex opened a second connection to the database for writes, in
-addition to the connection it used for reads. That seems likely to be
-involved in whatever locking problem there is on WSL.
+I noticed that git-annex opened a second connection to the database for
+writes, in addition to the connection it used for reads. That seems likely
+to be involved in whatever locking problem there is on WSL.
 
 Commit [[!commit d0ef8303cf8c4f40a1d17bd134af961fd9917ca4]] eliminates that
 second connection. But there's some chance I'll have to revert it.
 
 If you test, please include `git-annex version` output
 so I can make sure you have a version with that change.
-
-I also wonder if this means that running 2 git-annex commands
-simulantaneously will sometimes result in the same sqlite problem, despite
-these fixes.
 """]]

oops, I misread, still happens for adjusted branches
diff --git a/CHANGELOG b/CHANGELOG
index 9d8052c2e..d47b374dd 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -17,8 +17,7 @@ git-annex (8.20211012) UNRELEASED; urgency=medium
   * git-annex get when run as the first git-annex command in a new repo
     did not populate all unlocked files.
     (Reversion in version 8.20210621)
-  * Avoid a sqlite crash on Windows SubSystem for Linux (WSL)
-    when entering an adjusted branch.
+  * Avoid some sqlite crashes on Windows SubSystem for Linux (WSL).
 
  -- Joey Hess <id@joeyh.name>  Mon, 11 Oct 2021 14:09:13 -0400
 
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
index 88a1727d0..f7a78a6a5 100644
--- a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
@@ -3,14 +3,14 @@
  subject="""comment 7"""
  date="2021-10-20T17:04:09Z"
  content="""
-@asakurareiko oh excellent news!
+@asakurareiko oh that's encouraging that I seem to be on the right track.
 
-Before I saw that it's apparently fixed, I noticed that
+Adjusted branches still have the same error?
+
+I noticed that
 git-annex opened a second connection to the database for writes, in
 addition to the connection it used for reads. That seems likely to be
-involved in whatever locking problem there is on WSL. While maybe I already
-fixed the main one, the fact that fix works makes me even more suspicious
-about situations where there are multiple database connections.
+involved in whatever locking problem there is on WSL.
 
 Commit [[!commit d0ef8303cf8c4f40a1d17bd134af961fd9917ca4]] eliminates that
 second connection. But there's some chance I'll have to revert it.
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_9_b0700fdf101f6cc883857b293cd35267._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_9_b0700fdf101f6cc883857b293cd35267._comment
index d86b66e14..3761132d4 100644
--- a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_9_b0700fdf101f6cc883857b293cd35267._comment
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_9_b0700fdf101f6cc883857b293cd35267._comment
@@ -17,7 +17,7 @@ which would not be helpful. So perhaps it would be better to handle it
 like Sqlite.ErrorIO is handled, waiting for up to 1/10th of a second.
 But perhaps that would not be enough of a wait.
 
-Anyway, this is a note to myself: If concurrent git-annex processes on Windows
-still have this problem, try catching Sqlite.ErrorProtocol and experiment
+Anyway, this is a note to myself: If all else fails,
+try catching Sqlite.ErrorProtocol and experiment
 with different ways to handle it.
 """]]

linked bugs
diff --git a/doc/bugs/WSL1__58___git-annex-add_fails_in_DrvFs_filesystem/comment_2_5634af4cef0d0fe0d2affe40d9c0d5ea._comment b/doc/bugs/WSL1__58___git-annex-add_fails_in_DrvFs_filesystem/comment_2_5634af4cef0d0fe0d2affe40d9c0d5ea._comment
new file mode 100644
index 000000000..1d4dd4a59
--- /dev/null
+++ b/doc/bugs/WSL1__58___git-annex-add_fails_in_DrvFs_filesystem/comment_2_5634af4cef0d0fe0d2affe40d9c0d5ea._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-10-20T17:31:07Z"
+ content="""
+These look like very similar problems:
+<https://git-annex.branchable.com/bugs/__34__rename__58___permission_denied__34__/>
+<https://git-annex.branchable.com/bugs/still_seeing_errors_with_parallel_git-annex-add/>
+"""]]
diff --git a/doc/bugs/__34__rename__58___permission_denied__34__/comment_2_e70c8b8192eb0f8868e52dc7a287e526._comment b/doc/bugs/__34__rename__58___permission_denied__34__/comment_2_e70c8b8192eb0f8868e52dc7a287e526._comment
new file mode 100644
index 000000000..064317335
--- /dev/null
+++ b/doc/bugs/__34__rename__58___permission_denied__34__/comment_2_e70c8b8192eb0f8868e52dc7a287e526._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-10-20T17:30:24Z"
+ content="""
+These look like very similar problems:
+<https://git-annex.branchable.com/bugs/WSL1__58___git-annex-add_fails_in_DrvFs_filesystem/>
+<https://git-annex.branchable.com/bugs/still_seeing_errors_with_parallel_git-annex-add/>
+"""]]
diff --git a/doc/bugs/still_seeing_errors_with_parallel_git-annex-add/comment_3_e76ddebaf9d248c02d16a6914973d7d0._comment b/doc/bugs/still_seeing_errors_with_parallel_git-annex-add/comment_3_e76ddebaf9d248c02d16a6914973d7d0._comment
new file mode 100644
index 000000000..14afe6cec
--- /dev/null
+++ b/doc/bugs/still_seeing_errors_with_parallel_git-annex-add/comment_3_e76ddebaf9d248c02d16a6914973d7d0._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2021-10-20T17:33:03Z"
+ content="""
+This seems to be the same error message as in these 2 bugs:
+
+OSX: <https://git-annex.branchable.com/bugs/__34__rename__58___permission_denied__34__/>
+
+Windows: <https://git-annex.branchable.com/bugs/WSL1__58___git-annex-add_fails_in_DrvFs_filesystem/>
+
+So it would be especially helpful if I could reproduce it on Linux. Is
+anything else needed other than running git-annex add concurrently? And is that
+multiple processes, or with -J? 
+"""]]

comment
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_9_b0700fdf101f6cc883857b293cd35267._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_9_b0700fdf101f6cc883857b293cd35267._comment
new file mode 100644
index 000000000..d86b66e14
--- /dev/null
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_9_b0700fdf101f6cc883857b293cd35267._comment
@@ -0,0 +1,23 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 9"""
+ date="2021-10-20T17:15:27Z"
+ content="""
+The crash shows that runSqliteRobustly called `rethrow "while opening
+database connection"`, and I think it was in the "| otherwise" branch
+because the error is not Sqlite.ErrorIO.
+
+So, it may also possibly help to handle Sqlite.ErrorProtocol,
+which seems like what the actual error is from the message.
+Handling it the same as Sqlite.ErrorBusy would make opening the db
+be retried until whatever else had it open closes it, or finishes
+the operation that is causing the problem. On the other hand, 
+that might make git-annex hang until another git-annex process exits,
+which would not be helpful. So perhaps it would be better to handle it
+like Sqlite.ErrorIO is handled, waiting for up to 1/10th of a second.
+But perhaps that would not be enough of a wait.
+
+Anyway, this is a note to myself: If concurrent git-annex processes on Windows
+still have this problem, try catching Sqlite.ErrorProtocol and experiment
+with different ways to handle it.
+"""]]

update per other comment
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_7_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
similarity index 52%
rename from doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_7_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
rename to doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
index c63d1c266..88a1727d0 100644
--- a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_7_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_8_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
@@ -3,13 +3,22 @@
  subject="""comment 7"""
  date="2021-10-20T17:04:09Z"
  content="""
+@asakurareiko oh excellent news!
+
+Before I saw that it's apparently fixed, I noticed that
 git-annex opened a second connection to the database for writes, in
 addition to the connection it used for reads. That seems likely to be
-involved in whatever locking problem there is on WSL.
+involved in whatever locking problem there is on WSL. While maybe I already
+fixed the main one, the fact that fix works makes me even more suspicious
+about situations where there are multiple database connections.
 
 Commit [[!commit d0ef8303cf8c4f40a1d17bd134af961fd9917ca4]] eliminates that
 second connection. But there's some chance I'll have to revert it.
 
 If you test, please include `git-annex version` output
 so I can make sure you have a version with that change.
+
+I also wonder if this means that running 2 git-annex commands
+simultaneously will sometimes result in the same sqlite problem, despite
+these fixes.
 """]]

comment
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_7_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_7_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
new file mode 100644
index 000000000..c63d1c266
--- /dev/null
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_7_cb7ce88ae3d77b9ba0a4e33c2321a3e1._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 7"""
+ date="2021-10-20T17:04:09Z"
+ content="""
+git-annex opened a second connection to the database for writes, in
+addition to the connection it used for reads. That seems likely to be
+involved in whatever locking problem there is on WSL.
+
+Commit [[!commit d0ef8303cf8c4f40a1d17bd134af961fd9917ca4]] eliminates that
+second connection. But there's some chance I'll have to revert it.
+
+If you test, please include `git-annex version` output
+so I can make sure you have a version with that change.
+"""]]

Added a comment
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_7_276be047d18b2e20a8e3114abe3132ee._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_7_276be047d18b2e20a8e3114abe3132ee._comment
new file mode 100644
index 000000000..0a267591d
--- /dev/null
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_7_276be047d18b2e20a8e3114abe3132ee._comment
@@ -0,0 +1,36 @@
+[[!comment format=mdwn
+ username="asakurareiko@f3d908c71c009580228b264f63f21c7274df7476"
+ nickname="asakurareiko"
+ avatar="http://cdn.libravatar.org/avatar/a865743e357add9d15081840179ce082"
+ subject="comment 7"
+ date="2021-10-20T14:42:33Z"
+ content="""
+I tested 0f38ad9a6 with the test case below as well as with the repo I use and sqlite errors no longer occur. Adjusted branches still do not work but everything else with unlocked files seems to be ok now. Thank you Joey.
+
+Setup:
+
+```
+# on NTFS volume mounted with metadata option
+git init annex
+cd annex
+git annex init
+git config annex.addunlocked true
+git config annex.crippledfilesystem true
+echo aaa > a
+cp a b
+git annex add .
+git commit -m .
+```
+
+Test:
+
+```
+git clone annex annex2
+cd annex2
+git annex init
+git config annex.crippledfilesystem true
+git annex get .
+```
+
+
+"""]]

Added a comment: Still seeing the issue.
diff --git a/doc/bugs/Host_resolution_error_on_Android_when_adding_RSync.net_repo/comment_1_a7a613c3fa1bee42442a446366027381._comment b/doc/bugs/Host_resolution_error_on_Android_when_adding_RSync.net_repo/comment_1_a7a613c3fa1bee42442a446366027381._comment
new file mode 100644
index 000000000..71c50228e
--- /dev/null
+++ b/doc/bugs/Host_resolution_error_on_Android_when_adding_RSync.net_repo/comment_1_a7a613c3fa1bee42442a446366027381._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="gitannexuser2021"
+ avatar="http://cdn.libravatar.org/avatar/e0d5a0aa4ca494b520a971c53865edcd"
+ subject="Still seeing the issue."
+ date="2021-10-19T22:37:44Z"
+ content="""
+I'm using Termux to install git-annex on my Android device. When I try to add an Rsync.net repo, the provided hostname cannot be resolved. Adding the repo on my laptop (Linux) works normally. I can ping the hostname from Termux successfully.
+"""]]

improve sqlite MultiWriter handling of read after write
This removes a messy caveat that was easy to forget and caused at least one
bug. The price paid is that, after a write to a MultiWriter db, it has to
close the db connection that it had been using to read, and open a new
connection. So it might be a little bit slower. But, writes are usually
batched together, so there's often only a single write, and so there should
not be much of a slowdown. Notice that SingleWriter already closed the db
connection after a write, so paid the same overhead.
This is the second try at fixing a bug: git-annex get when run as the first
git-annex command in a new repo did not populate all unlocked files.
(Reversion in version 8.20210621)
Sponsored-by: Boyd Stephen Smith Jr. on Patreon
diff --git a/CHANGELOG b/CHANGELOG
index e8e4ad1d0..8c86d935a 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -14,6 +14,9 @@ git-annex (8.20211012) UNRELEASED; urgency=medium
     occurred when downloading the chunk, rather than the error that
     occurred when trying to download the unchunked content, which is less
     likely to actually be stored in the remote.
+  * git-annex get when run as the first git-annex command in a new repo
+    did not populate all unlocked files.
+    (Reversion in version 8.20210621)
 
  -- Joey Hess <id@joeyh.name>  Mon, 11 Oct 2021 14:09:13 -0400
 
diff --git a/Database/Handle.hs b/Database/Handle.hs
index d7f1822dc..349dbca30 100644
--- a/Database/Handle.hs
+++ b/Database/Handle.hs
@@ -45,19 +45,13 @@ type TableName = String
 {- Sqlite only allows a single write to a database at a time; a concurrent
  - write will crash. 
  - 
- - MultiWrter works around this limitation.
- - The downside of using MultiWriter is that after writing a change to the
- - database, the a query using the same DbHandle will not immediately see
- - the change! This is because the change is actually written using a
- - separate database connection, and caching can prevent seeing the change.
- - Also, consider that if multiple processes are writing to a database,
- - you can't rely on seeing values you've just written anyway, as another
- - process may change them.
+ - MultiWriter works around this limitation. It uses additional resources
+ - when writing, because it needs to open the database multiple times. And
+ - writes to the database may block for some time, if other processes are also
+ - writing to it.
  -
  - When a database can only be written to by a single process (enforced by
- - a lock file), use SingleWriter. Changes written to the database will
- - always be immediately visible then. Multiple threads can write; their
- - writes will be serialized.
+ - a lock file), use SingleWriter. (Multiple threads can still write.)
  -}
 data DbConcurrency = SingleWriter | MultiWriter
 
@@ -89,9 +83,6 @@ closeDb (DbHandle _ worker jobs) = do
  - Only one action can be run at a time against a given DbHandle.
  - If called concurrently in the same process, this will block until
  - it is able to run.
- -
- - Note that when the DbHandle was opened in MultiWriter mode, recent
- - writes may not be seen by queryDb.
  -}
 queryDb :: DbHandle -> SqlPersistM a -> IO a
 queryDb (DbHandle _ _ jobs) a = do
@@ -165,7 +156,7 @@ workerThread db tablename jobs = go
 			Right (QueryJob a) -> a >> loop
 			Right (ChangeJob a) -> do
 				a
-				-- Exit this sqlite transaction so the
+				-- Exit this sqlite connection so the
 				-- database gets updated on disk.
 				return True
 			-- Change is run in a separate database connection
@@ -174,7 +165,11 @@ workerThread db tablename jobs = go
 			-- that the write is made to.
 			Right (RobustChangeJob a) -> do
 				liftIO (a (runSqliteRobustly tablename db))
-				loop
+				-- Exit this sqlite connection so the
+				-- change that was just written, using 
+				-- a different db handle, is immediately
+				-- visible to queries.
+				return True
 	
 -- Like runSqlite, but more robust.
 --
diff --git a/Database/Queue.hs b/Database/Queue.hs
index 68e3e42f8..434acfc9a 100644
--- a/Database/Queue.hs
+++ b/Database/Queue.hs
@@ -64,9 +64,6 @@ flushDbQueue (DQ hdl qvar) = do
  -
  - Queries will not see changes that have been recently queued,
  - so use with care.
- -
- - Also, when the database was opened in MultiWriter mode,
- - queries may not see changes even after flushDbQueue.
  -}
 queryDbQueue :: DbQueue -> SqlPersistM a -> IO a
 queryDbQueue (DQ hdl _) = queryDb hdl
diff --git a/doc/bugs/initial_get_of_unlocked_file_fails_to_populate_pointer.mdwn b/doc/bugs/initial_get_of_unlocked_file_fails_to_populate_pointer.mdwn
index 885205068..74806b695 100644
--- a/doc/bugs/initial_get_of_unlocked_file_fails_to_populate_pointer.mdwn
+++ b/doc/bugs/initial_get_of_unlocked_file_fails_to_populate_pointer.mdwn
@@ -43,3 +43,43 @@ This outputs 1 for foo, followed by annex pointer files for files bar and baz.
 
 The previous fix attempt did make foo get populated, before that none
 of the files were populated.
+
+----
+
+`GIT_TRACE=1` shows that git only runs the smudge filter on the first
+file, not the other two. And indeed, restagePointerFile is only called
+on the first file.
+
+Added debugging to Database.Keys.reconcileStaged, and it adds all 3 files to
+the associated files table, but only adds the inode cache of foo.
+And that's what I see in the db after the fact too. Which is
+not itself a problem, to the extent that the other files are not
+populated, and only populated files have an inode cache recorded.
+
+So, Database.Keys.reconcileStaged is called after it gets foo,
+but before the other files are present, and in reconcilepointerfile it
+calls populatePointerFile and records the inode cache for foo.
+That is how foo gets populated.
+
+But, the other 2 files do not have populatePointerFile run on them.
+In moveAnnex, it calls getAssociatedFiles and somehow that returns
+`[]`, for all 3 files. This does not matter for foo, because it gets
+populated by reconcileStaged as explained above. But for the other 2, with
+no known associated files of course it fails to populate them.
+
+So: Why is getAssociatedFiles returning `[]`? Those calls come
+after Database.Keys.reconcileStaged has added the associated files,
+but are somehow not seeing the changes it made.
+
+Ah.. The keys db is opened in MultiWriter mode. 
+See the comment above the definition of MultiWriter,
+which explains that a write to a MultiWriter database,
+followed by a flushDbQueue may not be visible when reading
+from that same database.
+
+Verified this by making it re-open the db after reconcileStaged,
+which did fix the problem.
+
+A better fix is possible: Make MultiWriter mode not have this hidden
+gotcha, by re-opening the db after writing to it always. [[done]]
+--[[Joey]]
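
The read-after-write behaviour described above can be illustrated with a toy model. This is a minimal sketch and not git-annex's code: `Disk`, `Conn`, `openConn`, `writeRow` and `runJobs` are all hypothetical names, and a read connection is modelled as a snapshot taken when it is opened, which only approximates the caching that hid the write in MultiWriter mode. The `reopenAfterWrite` flag stands in for the change from `loop` to `return True` in `workerThread`.

```haskell
import Data.IORef

-- Hypothetical stand-in for the on-disk database: a shared mutable cell.
type Disk = IORef [String]

-- Hypothetical stand-in for a read connection: a snapshot taken at open time.
newtype Conn = Conn [String]

openConn :: Disk -> IO Conn
openConn disk = Conn <$> readIORef disk

query :: Conn -> [String]
query (Conn rows) = rows

-- Writes bypass the read connection, as a write made through a
-- separate database connection would.
writeRow :: Disk -> String -> IO ()
writeRow disk row = modifyIORef' disk (++ [row])

data Job = QueryJob | ChangeJob String

-- Run jobs in order, collecting the result of each query.
-- When reopenAfterWrite is True, the read connection is replaced
-- after every write, so later queries see the change.
runJobs :: Bool -> Disk -> [Job] -> IO [[String]]
runJobs reopenAfterWrite disk jobs = openConn disk >>= go jobs
  where
    go [] _ = return []
    go (QueryJob : rest) conn = (query conn :) <$> go rest conn
    go (ChangeJob row : rest) conn = do
        writeRow disk row
        conn' <- if reopenAfterWrite then openConn disk else return conn
        go rest conn'

main :: IO ()
main = do
    d1 <- newIORef []
    stale <- runJobs False d1 [ChangeJob "foo", QueryJob]
    print stale  -- [[]]: the query does not see the write
    d2 <- newIORef []
    fresh <- runJobs True d2 [ChangeJob "foo", QueryJob]
    print fresh  -- [["foo"]]: the reopened connection sees it
```

In this model the cost of the fix is one extra `openConn` per write, which matches the commit message's point that batched writes keep the overhead small.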

Added a comment
diff --git a/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_19_854531569edbb5c152c4d6b0d764a6c5._comment b/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_19_854531569edbb5c152c4d6b0d764a6c5._comment
new file mode 100644
index 000000000..83a122f2e
--- /dev/null
+++ b/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_19_854531569edbb5c152c4d6b0d764a6c5._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="Lukey"
+ avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
+ subject="comment 19"
+ date="2021-10-19T17:17:47Z"
+ content="""
+Again, you can just run `git annex import --from=CD1 CD1`, which will import everything on the CD1 special remote to a branch named \"CD1\" which is completely independent of the master branch (it won't even share the same history).
+"""]]

reopen
diff --git a/doc/bugs/initial_get_of_unlocked_file_fails_to_populate_pointer.mdwn b/doc/bugs/initial_get_of_unlocked_file_fails_to_populate_pointer.mdwn
index d2eea4c95..885205068 100644
--- a/doc/bugs/initial_get_of_unlocked_file_fails_to_populate_pointer.mdwn
+++ b/doc/bugs/initial_get_of_unlocked_file_fails_to_populate_pointer.mdwn
@@ -14,4 +14,32 @@ empty list. So something to do with database write caching.
 
 Somehow, not having init call `scanAnnexedFiles` makes this bug go away.
 
-> [[fixed|done]] --[[Joey]]
+> fixed --[[Joey]]
+
+----
+
+I have reopened this bug, it seems the previous fix was not right.
+See [[!commit b3c4579c7907147a496bdf2c73b42238d8b239d6]] for that
+fix, which had doubts at the time in the commit message. --[[Joey]] 
+
+Here is a test case:
+
+	git init foo
+	cd foo
+	git annex init
+	echo 1 > foo
+	echo 2 > bar
+	echo 3 > baz
+	git annex add
+	git annex unlock
+	git commit -m add
+	cd ..
+	git clone foo bar
+	cd bar
+	git-annex get
+	cat *
+
+This outputs 1 for foo, followed by annex pointer files for files bar and baz.
+
+The previous fix attempt did make foo get populated, before that none
+of the files were populated.

close keys db to possibly work around WSL1 issue
diff --git a/Annex/Link.hs b/Annex/Link.hs
index c305b39bd..878228788 100644
--- a/Annex/Link.hs
+++ b/Annex/Link.hs
@@ -203,7 +203,12 @@ restagePointerFile (Restage True) f orig = withTSDelta $ \tsd ->
 	-- updated index file.
 	runner :: Git.Queue.InternalActionRunner Annex
 	runner = Git.Queue.InternalActionRunner "restagePointerFile" $ \r l -> do
-		liftIO . Database.Keys.Handle.flushDbQueue
+		-- Flush any queued changes to the keys database, so they
+		-- are visible to child processes.
+		-- The database is closed because that may improve behavior
+		-- when run in Windows's WSL1, which has issues with
+		-- multiple writers to SQL databases.
+		liftIO . Database.Keys.Handle.closeDbHandle
 			=<< Annex.getRead Annex.keysdbhandle
 		realindex <- liftIO $ Git.Index.currentIndexFile r
 		let lock = fromRawFilePath (Git.Index.indexFileLock realindex)
diff --git a/Database/Keys.hs b/Database/Keys.hs
index 4aeee67bd..afd6048ad 100644
--- a/Database/Keys.hs
+++ b/Database/Keys.hs
@@ -37,7 +37,7 @@ import qualified Annex
 import Annex.LockFile
 import Annex.Content.PointerFile
 import Annex.Content.Presence.LowLevel
-import Annex.Link
+import Annex.Link (Restage(..), maxPointerSz, parseLinkTargetOrPointerLazy)
 import Utility.InodeCache
 import Annex.InodeSentinal
 import Git
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_6_b7a3837fd6af236e9ecf6d5bae077fd0._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_6_b7a3837fd6af236e9ecf6d5bae077fd0._comment
new file mode 100644
index 000000000..5f46a0be3
--- /dev/null
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_6_b7a3837fd6af236e9ecf6d5bae077fd0._comment
@@ -0,0 +1,25 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 6"""
+ date="2021-10-19T16:44:13Z"
+ content="""
+@asakurareiko it makes sense it would fail that way with WAL disabled,
+since the sqlite database cannot support multiple writers then. And
+there are probably several situations where multiple git-annex processes
+end up using the database, even when you are only running a single
+git-annex command at a time.
+
+> Without this patch other than adjusted branches, unlocked files generally do
+> work in WSL1. Sqlite error may occur at the end of commands such as `git annex get/drop`
+
+Sounds like `restagePointerFile`, which tends to run at the
+end of such an operation to handle all the files that have been updated. 
+That runs `git update-index`, which then runs `git-annex smudge`.
+So both the parent and child git-annex process can have the database open
+for write, which WAL mode normally supports, but something in WSL prevents
+it from working right.
+
+Following this theory, I've made `restagePointerFile` close the database
+first. Perhaps that will avoid the problem, at least in those cases. Your
+testing is appreciated.
+"""]]

diff --git a/doc/forum/git_annex_sync_destroys_data_on_shallow_clones.mdwn b/doc/forum/git_annex_sync_destroys_data_on_shallow_clones.mdwn
new file mode 100644
index 000000000..1b88cd0a7
--- /dev/null
+++ b/doc/forum/git_annex_sync_destroys_data_on_shallow_clones.mdwn
@@ -0,0 +1,2 @@
+I just migrated binaries from native git file tracking to git annex. Then I went on to clone the repo on a different device. Because older commits still contain the binaries I did a shallow clone. After that I wanted to fetch a few binaries from annex and ran ```git annex init; git annex sync```. 
+To my surprise instead of somehow working out the annex metadata with the remote it just force pushed an empty git-annex branch to the remote. Luckily I was just testing things out and had a backup available but for people who rely on a service like gitlab to access their annex repository when traveling this can end up being a very nasty surprise. I don't exactly know how this could best be fixed but force pushing without asking isn't a good solution in my opinion. Maybe git-annex-init could check if a remote already has annex metadata and pull that. git-annex-sync could fail and give you the option to add a force flag or work out how to merge things (which shouldn't be too hard when the local metadata is completely empty)

comment
diff --git a/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_3_b9de19ba8382225ac8c65ee1ad8110a8._comment b/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_4_b9de19ba8382225ac8c65ee1ad8110a8._comment
similarity index 96%
rename from doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_3_b9de19ba8382225ac8c65ee1ad8110a8._comment
rename to doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_4_b9de19ba8382225ac8c65ee1ad8110a8._comment
index 051cfb4d2..3bb4de0e2 100644
--- a/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_3_b9de19ba8382225ac8c65ee1ad8110a8._comment
+++ b/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_4_b9de19ba8382225ac8c65ee1ad8110a8._comment
@@ -1,6 +1,6 @@
 [[!comment format=mdwn
  username="joey"
- subject="""comment 3"""
+ subject="""comment 4"""
  date="2021-10-19T13:58:24Z"
  content="""
 Sequoia-PGP could be another contender (OpenPGP in rust). There is a sq
diff --git a/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_5_c81d61a221fd93eb25765998b67bbde7._comment b/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_5_c81d61a221fd93eb25765998b67bbde7._comment
new file mode 100644
index 000000000..d3aee173b
--- /dev/null
+++ b/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_5_c81d61a221fd93eb25765998b67bbde7._comment
@@ -0,0 +1,32 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 5"""
+ date="2021-10-19T16:33:01Z"
+ content="""
+> I'm not sure I follow, couldn't you just generate an age keypair and simply store that in the repo?
+> 
+> Does the current gpg-based implementation not do it just like that?
+
+No, it uses gpg --symmetric which is much simpler and also likely more
+secure.
+
+As far as gpg's UI complexity, it's a problem to some extent (although every
+one of those options presumably has a user), but notice that a git-annex user
+who uses encryption=shared never has to touch gpg's interface at all.
+This is by design. It's only with encryption=hybrid and pubkey that the user
+is exposed to the complexities of public key crypto, and I expect that mostly
+users who already are familiar with that and need the inherent complexity of it
+will use those.
+
+> age seems like the most obvious alternative for use-cases like
+> git-annex. Only time can tell whether it actually becomes the new file encryption
+> standard but it seems like the most likely candidate right now.
+
+I don't follow this reasoning; the openpgp standard is a well-established
+standard with many implementations, and so it seems likely that an implementation
+of that standard will be what replaces gpg, if anything.
+
+(It also is possible that gpg eventually ends up being reimplemented using
+something like Sequoia-PGP under the hood to gain the protections from C-level
+security holes, which are certainly a real concern.)
+"""]]

remove xmpp mention
diff --git a/doc/install/fromsource.mdwn b/doc/install/fromsource.mdwn
index f024974da..c8a75779e 100644
--- a/doc/install/fromsource.mdwn
+++ b/doc/install/fromsource.mdwn
@@ -64,11 +64,11 @@ yielding a less reliable build. Stack also only installs the binary,
 and not other files.)
 
 Note that this build produces a git-annex without the build flags
-XMPP, DBUS, and MagicMime.
+DBUS and MagicMime.
 These optional features require installing additional C libraries.
 To try to build with these features 
 enabled, pass extra parameters when running `stack build`: 
-`--flag git-annex:XMPP --flag git-annex:DBUS --flag git-annex:MagicMime`
+`--flag git-annex:DBUS --flag git-annex:MagicMime`
 
 ## minimal build from source with cabal
 

comment
diff --git a/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_3_b9de19ba8382225ac8c65ee1ad8110a8._comment b/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_3_b9de19ba8382225ac8c65ee1ad8110a8._comment
new file mode 100644
index 000000000..051cfb4d2
--- /dev/null
+++ b/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_3_b9de19ba8382225ac8c65ee1ad8110a8._comment
@@ -0,0 +1,18 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2021-10-19T13:58:24Z"
+ content="""
+Sequoia-PGP could be another contender (OpenPGP in Rust). There is a sq
+command line interface that is equivalent to gpg.
+
+It should be able to decrypt objects in repositories encrypted using gpg.
+If that works well, git-annex could support it in parallel with gpg for
+some time, then deprecate the gpg support and eventually remove it. This is
+a more appealing path than supporting multiple encryption tools
+indefinitely.
+
+This issue is a pre-requisite for using sq, although presumably the
+low-level library could be used directly to avoid the issue.
+<https://gitlab.com/sequoia-pgp/sequoia/-/issues/766>
+"""]]
diff --git a/stack.yaml b/stack.yaml
index efdf931f2..06668431c 100644
--- a/stack.yaml
+++ b/stack.yaml
@@ -19,7 +19,7 @@ extra-deps:
 - IfElse-0.85
 - aws-0.22
 - bloomfilter-2.0.1.0
-- git-lfs-1.1.1
+- git-lfs-1.1.2
 - http-client-restricted-0.0.4
 - network-multicast-0.3.2
 - sandi-0.5

Added a comment
diff --git a/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_3_09b89df28a3a3dab2f0fbf2f56feafe1._comment b/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_3_09b89df28a3a3dab2f0fbf2f56feafe1._comment
new file mode 100644
index 000000000..e6946ee5f
--- /dev/null
+++ b/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_3_09b89df28a3a3dab2f0fbf2f56feafe1._comment
@@ -0,0 +1,36 @@
+[[!comment format=mdwn
+ username="Atemu"
+ avatar="http://cdn.libravatar.org/avatar/d1f0f4275931c552403f4c6707bead7a"
+ subject="comment 3"
+ date="2021-10-19T13:05:50Z"
+ content="""
+> Mostly people pick on gpg's public key crypto implementation, and mostly I think, because public key crypto is easy to pick at. I have not seen many such arguments about gpg that seem very convincing to me.
+
+Ah, this is not about the underlying algorithms and theories. Those are more than sufficient in our pre-quantum world and, as you said, AGE isn't fundamentally different here.
+
+What I'm mostly concerned about is the quality of the implementation; especially the non-functional aspects like speed, UI and simplicity (some of which also have security implications).
+
+> The way git-annex uses gpg for encryption=shared does not involve public key crypto at all, but uses AES-128, which is about as close as there is to a standard for encrypting things.
+
+The actual encryption is done symmetrically but the encryption of the symmetric key is done asymmetrically if I'm not mistaken.
+
+This is not what age aims to replace though, it functions the exact same from a high level AFAIK; just with a different implementation that satisfies different goals from GPG.
+
+> git-annex already links to an AES implementation for other purposes and could probably bypass gpg and use that, but that smells of implementing your own crypto.
+
+That definitely smells of homebrewed crypto. This is why I would love to see something like age used: It's a pre-made, (supposedly) secure, standardised crypto system/library that you can feed files into and it simply gives you encrypted files back. No faffing about with complex key setups or other brilliant UX anno 1999.
+
+> AGE does not seem to expose any non-public key encryption, so it could not be used for encryption=shared, I think. (Unless perhaps the public/private key pair were stored in the repo as the shared cipher?)
+
+I'm not sure I follow, couldn't you just generate an age keypair and simply store that in the repo?
+
+Does the current gpg-based implementation not do it just like that?
+
+> I don't like the idea of git-annex supporting more than 2 encryption programs, and even 2 seems like 1 too many. Every one will be an ongoing cost. It's not clear to me that there's enough of a benefit to support AGE, or that it would be the best choice for a +1.
+
+Me neither and I understand your point but the problem with that approach is that it leaves absolutely no way to migrate away from GPG.
+
+GPG is a complex beast that will likely need replacement within the next decade and age seems like the most obvious alternative for use-cases like git-annex.  
+Only time can tell whether it actually becomes the new file encryption standard but it seems like the most likely candidate right now.
+
+"""]]

Added a comment
diff --git a/doc/todo/File_deletion_workflow/comment_2_60b54e6bbb37464697b4f5c4ebe561da._comment b/doc/todo/File_deletion_workflow/comment_2_60b54e6bbb37464697b4f5c4ebe561da._comment
new file mode 100644
index 000000000..6c43e974c
--- /dev/null
+++ b/doc/todo/File_deletion_workflow/comment_2_60b54e6bbb37464697b4f5c4ebe561da._comment
@@ -0,0 +1,23 @@
+[[!comment format=mdwn
+ username="Atemu"
+ avatar="http://cdn.libravatar.org/avatar/d1f0f4275931c552403f4c6707bead7a"
+ subject="comment 2"
+ date="2021-10-19T12:26:33Z"
+ content="""
+> Since this was posted, fsck has stopped complaining about files dropped with `dropunused`.
+
+Thank you!
+
+> I could imagine formalizing this ad-hoc tag into something standard in git-annex.
+
+This is the best option I think; some sort of flag you can set on a key that marks it as unwanted and propagates via the git-annex branch.
+The actual deletions could then be carried out on the individual repos by using a dedicated command (`dropdeleted`?), by the assistant or perhaps even using `sync --content`. 
+
+The important bit is that it shouldn't be synchronous or depend on the repository being reachable directly though; it should be recorded and propagated asynchronously.  
+E.g. in any tree of repos with assistants running (so, no transitive connections), marking a key as deleted in any one of them should result in the key being deleted from all of them.
+
+> But one problem with it is it may not play well in multiuser environments where people have different ideas about what files they want to delete all copies of.
+
+Ah, I didn't consider that. A recycle bin with tracked files is likely infeasible but an untracked one could still be valuable. That's a topic for another issue though.
+
+"""]]

Added a comment
diff --git a/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_18_679ad639cbba7f9a4d28131219d20178._comment b/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_18_679ad639cbba7f9a4d28131219d20178._comment
new file mode 100644
index 000000000..fdf131035
--- /dev/null
+++ b/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_18_679ad639cbba7f9a4d28131219d20178._comment
@@ -0,0 +1,25 @@
+[[!comment format=mdwn
+ username="username"
+ avatar="http://cdn.libravatar.org/avatar/3c17ce77d299219a458fc2eff973239a"
+ subject="comment 18"
+ date="2021-10-18T19:54:29Z"
+ content="""
+I wasn't clear enough, or maybe I did something wrong when testing the directory special remote.
+
+When using standard git-annex repositories for everything as explained in comment 14 I get this:
+
+```
+$ git log --graph --oneline
+*   5555555555 (HEAD -> master) amended commit 2
+|\  
+| * 4444444444 (DVD1) DVD1
+*   3333333333  amended commit 1
+|\  
+| * 2222222222 (CD1) CD1
+*   1111111111  init
+```
+
+The master branch tracks only the contents of my local HDD (note that in step 4 I edit the working directory to my liking and amend the automatic ```git-annex sync``` commit), and the CD1 branch contains only what's in that disc and nothing else, such that ```git checkout 2222222222``` or ```git checkout CD1``` replaces everything in the working directory with the contents of CD1, similar to mounting the physical disc and browsing its filesystem.
+
+Using the directory special remote, the contents of my master branch are combined with the contents of the special remote, so the same ```git checkout CD1``` command wouldn't replicate exactly what's on CD1, the master branch directory tree would still be present after the branch switch.
+"""]]

Added a comment
diff --git a/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_17_455ab8b72c477769d68eb4348e911234._comment b/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_17_455ab8b72c477769d68eb4348e911234._comment
new file mode 100644
index 000000000..7e4e9dc4c
--- /dev/null
+++ b/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_17_455ab8b72c477769d68eb4348e911234._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="Lukey"
+ avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
+ subject="comment 17"
+ date="2021-10-18T15:36:06Z"
+ content="""
+>Thanks for the suggestion. I've looked into directory special remotes and using them for the optical media appears to undermine the intent of step 3 in my previous reply, of \"mounting\" each disc using git checkout DISC_LABEL, because the master branch contents are combined with the imported directory special remote contents.
+
+Huh? Of course you can import to a separate branch.
+"""]]

note about encryption/chunking
diff --git a/doc/git-annex-inprogress.mdwn b/doc/git-annex-inprogress.mdwn
index 25739ae71..744cebf84 100644
--- a/doc/git-annex-inprogress.mdwn
+++ b/doc/git-annex-inprogress.mdwn
@@ -13,6 +13,9 @@ it is still being downloaded. It outputs to standard output the
 name of the temporary file that is being used to download the specified
 annexed file.
 
+Nothing will be output when the download is from an encrypted or chunked 
+special remote.
+
 This can sometimes be used to stream a file before it's been fully
 downloaded, for example:
 

Added a comment: directory special remote
diff --git a/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_16_c3f6cb58dd2328b7af8dd2657c2e2a1e._comment b/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_16_c3f6cb58dd2328b7af8dd2657c2e2a1e._comment
new file mode 100644
index 000000000..ba211048f
--- /dev/null
+++ b/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_16_c3f6cb58dd2328b7af8dd2657c2e2a1e._comment
@@ -0,0 +1,31 @@
+[[!comment format=mdwn
+ username="username"
+ avatar="http://cdn.libravatar.org/avatar/3c17ce77d299219a458fc2eff973239a"
+ subject="directory special remote"
+ date="2021-10-17T22:15:51Z"
+ content="""
+Thanks for the suggestion.
+I've looked into directory special remotes and using them for the optical media appears to undermine the intent of step 3 in my previous reply, of \"mounting\" each disc using ```git checkout DISC_LABEL```, because the master branch contents are combined with the imported directory special remote contents.
+
+The ```git checkout``` should leave the working directory with a 1:1 copy of the directory tree of the imported disc, except with all files replaced by broken annex symlinks.
+
+But I'm considering the opposite now: using the directory special remote not for the optical discs but for the master branch of the repo instead, the one that tracks the local HDD tree:
+
+    git-annex initremote HDD type=directory directory=/path/to/HDD encryption=none importtree=yes
+
+The local dataset I want to use as the seed for the catalogue has multiple hardlinks so making a git-annex repository directly within it is out of the question as it would lead to duplicated data.
+
+The initial plan to work around that was making a reflink copy of the directory tree, initialising the git-annex repo therein, and regularly updating its master branch by replacing the git working directory with a brand new reflink clone and ```git-annex add```'ing it.
+
+If I understood git-annex right, this would imply a full re-read of the whole dataset because of the changed inode numbers of the new reflink clone, despite the contents, filenames, and mtimes of most files being 100% identical.
+
+However, it seems that using a directory special remote would neatly circumvent that (at least until the current HDD dies and I'm forced to ```mkfs``` in the replacement) because git-annex would be smart enough to detect renames by looking at the stable inode and mtime of the moved files.
+
+The local dataset is around 250K files and 4TiB in real size, ballooning to over 8TiB if hardlinked files were counted as copies. The updates (using ```git-annex import master --from HDD --no-content```) to the catalogue master branch would happen at a frequency somewhere between monthly and every 2 years.
+
+2 questions:
+
+1. Am I correct in assuming that re-importing a special remote would only read the newly added files and correctly detect all renames and deletions without re-reading, no matter how much time passes between re-imports of the master branch?
+
+2. Are there any downsides (scalability, memory use, etc) to using a directory special remote for this use case instead of a regular git-annex repository?
+"""]]
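The inode concern raised in the comment above can be checked directly on a scratch directory. This is a minimal sketch, assuming GNU coreutils (`cp --reflink=auto`, `stat -c`); the file names are hypothetical. Note that `--reflink=auto` falls back to an ordinary copy on filesystems without reflink support, and that also yields a new inode. The sketch only shows what happens to inode numbers; it says nothing about how git-annex reacts to them.

```shell
#!/bin/sh
# Compare the inode numbers of a file, a hard link to it, and a (reflink) copy.

workdir=$(mktemp -d)
printf 'example data\n' > "$workdir/orig"

# A hard link is just another name for the same inode.
ln "$workdir/orig" "$workdir/hardlink"

# A reflink copy may share data blocks, but it is a new file with its own inode.
cp --reflink=auto "$workdir/orig" "$workdir/clone"

orig_inode=$(stat -c %i "$workdir/orig")
hardlink_inode=$(stat -c %i "$workdir/hardlink")
clone_inode=$(stat -c %i "$workdir/clone")

echo "orig=$orig_inode hardlink=$hardlink_inode clone=$clone_inode"

# Remove the scratch directory again.
rm -rf "$workdir"
```

The hard link reports the same inode as the original, while the copy reports a different one, which is why a freshly made clone of a tree looks like a set of new files to anything that keys on inode numbers.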

Added a comment
diff --git a/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_5_f8ee3d06a79bdc429a114b5256290206._comment b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_5_f8ee3d06a79bdc429a114b5256290206._comment
new file mode 100644
index 000000000..aa6ace7e1
--- /dev/null
+++ b/doc/bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol/comment_5_f8ee3d06a79bdc429a114b5256290206._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="asakurareiko@f3d908c71c009580228b264f63f21c7274df7476"
+ nickname="asakurareiko"
+ avatar="http://cdn.libravatar.org/avatar/a865743e357add9d15081840179ce082"
+ subject="comment 5"
+ date="2021-10-17T01:14:18Z"
+ content="""
+Since WSL2 has terrible performance with the NTFS volumes already mounted in Windows, consumes more memory, and has higher hardware requirements, I'm still interested in using WSL1. I applied the patch to disable WAL from [this comment](http://git-annex.branchable.com/bugs/crippled_fs___40__pidlock__41___leads_to_git-annex__58___SQLite3_error/#comment-46e7f3e4052eec268ae72ead4afc3cea), however now I get a different sqlite error that happens more often as well.
+
+```
+  failed to commit changes to sqlite database: Just user error (SQLite3 returned ErrorBusy while attempting to perform step: database is locked(after successful open))
+  CallStack (from HasCallStack):
+    error, called at ./Database/Handle.hs:116:26 in main:Database.Handle
+```
+
+Without this patch, unlocked files generally do work in WSL1, apart from adjusted branches. An sqlite error may occur at the end of commands such as `git annex get/drop`; it can be fixed by manually removing `.git/index.lock` and then running `git annex add` or `git reset`.
+"""]]

Added a comment
diff --git a/doc/todo/More_space_savings_for_annex.thin/comment_2_95d195f2bad7e6a912e5d201928a94b9._comment b/doc/todo/More_space_savings_for_annex.thin/comment_2_95d195f2bad7e6a912e5d201928a94b9._comment
new file mode 100644
index 000000000..704c7407c
--- /dev/null
+++ b/doc/todo/More_space_savings_for_annex.thin/comment_2_95d195f2bad7e6a912e5d201928a94b9._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="asakurareiko@f3d908c71c009580228b264f63f21c7274df7476"
+ nickname="asakurareiko"
+ avatar="http://cdn.libravatar.org/avatar/a865743e357add9d15081840179ce082"
+ subject="comment 2"
+ date="2021-10-16T15:46:32Z"
+ content="""
+Then in such cases I don't think annex.thin=true needs to create hardlinks.
+"""]]

Added a comment
diff --git a/doc/todo/More_space_savings_for_annex.thin/comment_1_cd571b1b3b6e73041ee629d031481d09._comment b/doc/todo/More_space_savings_for_annex.thin/comment_1_cd571b1b3b6e73041ee629d031481d09._comment
new file mode 100644
index 000000000..67a083681
--- /dev/null
+++ b/doc/todo/More_space_savings_for_annex.thin/comment_1_cd571b1b3b6e73041ee629d031481d09._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="Lukey"
+ avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
+ subject="comment 1"
+ date="2021-10-16T15:27:35Z"
+ content="""
+git-annex already uses reflinks by default if the filesystem supports it.
+"""]]

Note about case sensitivity dirs
diff --git a/doc/todo/windows_support.mdwn b/doc/todo/windows_support.mdwn
index 98ec2d454..f68aee856 100644
--- a/doc/todo/windows_support.mdwn
+++ b/doc/todo/windows_support.mdwn
@@ -144,7 +144,9 @@ Do the following:
 
 2. Mount the NTFS drive with metadata option. [`/etc/wsl.conf`](https://docs.microsoft.com/en-us/windows/wsl/wsl-config) can be used or a line such as `C: /mnt/c drvfs metadata` can be added in `/etc/fstab`.
 
-3. Create an empty directory where your repo will be. Then enable case sensitivity `setfattr -n system.wsl_case_sensitive -v 1 <path>`. This attribute will be automatically and recursively applied to any future subdirectories. If setfattr(1) errs out with permission denied, you can also effect the same change in CMD.EXE / Windows Powershell as admin with `fsutil file setCaseSensitiveInfo <path> enable`.[^1] You can check that the setting is enabled with `getfattr -n system.wsl_case_sensitive <path>` under WSL1.
+3. Create an empty directory where your repo will be. Then enable case sensitivity `setfattr -n system.wsl_case_sensitive -v 1 <path>`. This attribute will be inherited by new subdirectories. If setfattr(1) errs out with permission denied, you can also effect the same change in CMD.EXE / Windows Powershell as admin with `fsutil file setCaseSensitiveInfo <path> enable`.[^1] You can check that the setting is enabled with `getfattr -n system.wsl_case_sensitive <path>` under WSL1.
+
+    If you do not have files whose names differ only in case, you do not need to set this on the entire repository, only on `.git/annex`. If in addition you have repository [[tuning]] set to use only lowercase hash directories, you do not need to set this at all.
 
 4. Create the repo however you like (see steps below for cloning a repo with ssh). Immediately after `git annex init`, do `git config annex.crippledfilesystem true`. If you set `crippledfilesystem` before init, then git annex will try to enter an adjusted branch and trigger the first bug. If you do not set `crippledfilesystem` after init, you will trigger the second bug when doing `git annex add`.
 

Update WSL1 instructions
diff --git a/doc/todo/windows_support.mdwn b/doc/todo/windows_support.mdwn
index 243c536d2..98ec2d454 100644
--- a/doc/todo/windows_support.mdwn
+++ b/doc/todo/windows_support.mdwn
@@ -141,8 +141,11 @@ The following steps are tested on Windows 10 21h1 with Ubuntu 18.04/20.04 and ar
 Do the following:
 
 1. Enable Developer mode in Windows settings so that symlinks can be created without elevated privileges.
-2. Mount the NTFS drive with metadata option. This line can be added in `/etc/fstab`: `C: /mnt/c drvfs metadata`. I prefer to also add `uid=1000,gid=1000,fmask=0133,dmask=0022`.
+
+2. Mount the NTFS drive with metadata option. [`/etc/wsl.conf`](https://docs.microsoft.com/en-us/windows/wsl/wsl-config) can be used or a line such as `C: /mnt/c drvfs metadata` can be added in `/etc/fstab`.
+
 3. Create an empty directory where your repo will be. Then enable case sensitivity `setfattr -n system.wsl_case_sensitive -v 1 <path>`. This attribute will be automatically and recursively applied to any future subdirectories. If setfattr(1) errs out with permission denied, you can also effect the same change in CMD.EXE / Windows Powershell as admin with `fsutil file setCaseSensitiveInfo <path> enable`.[^1] You can check that the setting is enabled with `getfattr -n system.wsl_case_sensitive <path>` under WSL1.
+
 4. Create the repo however you like (see steps below for cloning a repo with ssh). Immediately after `git annex init`, do `git config annex.crippledfilesystem true`. If you set `crippledfilesystem` before init, then git annex will try to enter an adjusted branch and trigger the first bug. If you do not set `crippledfilesystem` after init, you will trigger the second bug when doing `git annex add`.
 
 [^1]: This works because Administrators usually have Full Control over most files. What Windows actually looks for is "Write attributes", "Create files", "Create folders" and "Delete subfolders and files" permissions on the directory required for changing case-sensitivity. As a regular user (or without UAC) you might not have those permissions by default for instance on external drives, so adjust accordingly. For more info about the `system.wsl_case_sensitive` attribute see this blog post: [[https://devblogs.microsoft.com/commandline/improved-per-directory-case-sensitivity-support-in-wsl/]]

diff --git a/doc/todo/More_space_savings_for_annex.thin.mdwn b/doc/todo/More_space_savings_for_annex.thin.mdwn
new file mode 100644
index 000000000..a852dd530
--- /dev/null
+++ b/doc/todo/More_space_savings_for_annex.thin.mdwn
@@ -0,0 +1,11 @@
+Currently with `annex.thin=true`, only one copy will be hardlinked so that duplicate copies do not get silently modified. It would be good to have an option such as `annex.thinmode` for alternative ways of unlocking files, especially for cases when all files need to be kept unlocked.
+
+1. `annex.thinmode=forcehardlink`
+
+    In some cases most of the files in the repository will never be modified, and if a file does need to be modified, the hardlink can be first broken by making a copy. This can save a lot of space if git-annex is also used for file level deduplication.
+
+2. `annex.thinmode=reflink`
+
+    For some copy-on-write filesystems such as BTRFS, reflink copies can be made, such as with `cp --reflink`. This both saves space and also prevents files in .git/annex/objects from being modified.
+
+git-annex-fix can be used to apply these settings to existing repositories.

Added a comment: GitHub Actions
diff --git a/doc/forum/Submodule_.git_not_converted_to_symlink/comment_3_636fe0ff273cd8987168e3095929aa7f._comment b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_3_636fe0ff273cd8987168e3095929aa7f._comment
new file mode 100644
index 000000000..ab1ae150c
--- /dev/null
+++ b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_3_636fe0ff273cd8987168e3095929aa7f._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="athas@60e56fd42a78bbbce444d175865ce4d66ba1a779"
+ nickname="athas"
+ avatar="http://cdn.libravatar.org/avatar/f6ddda1fabf459f90ca590f9499033c4"
+ subject="GitHub Actions"
+ date="2021-10-15T20:53:06Z"
+ content="""
+It appears the `checkout` action on GitHub Actions mangles submodules that use `git-annex`.  The solution is to ask the checkout action to do a deep clone:
+
+    - uses: actions/checkout@v2
+      with:
+        submodules: recursive # for git-annex to work.
+        fetch-depth: 0        #
+
+"""]]

update
diff --git a/doc/forum/Submodule_.git_not_converted_to_symlink/comment_2_351c58b21d36770729c52991901d668c._comment b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_2_351c58b21d36770729c52991901d668c._comment
index 5129d45da..1a9bc1203 100644
--- a/doc/forum/Submodule_.git_not_converted_to_symlink/comment_2_351c58b21d36770729c52991901d668c._comment
+++ b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_2_351c58b21d36770729c52991901d668c._comment
@@ -5,4 +5,8 @@
  content="""
 The conversion is made when you run `git annex init`, which you have
 apparently not done..
+
+If the git-annex branch were available, it would automatically initialize
+and convert, so you may be on to something if you meant to say it did not
+fetch the git-annex branch.
 """]]

comment
diff --git a/doc/forum/Submodule_.git_not_converted_to_symlink/comment_2_351c58b21d36770729c52991901d668c._comment b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_2_351c58b21d36770729c52991901d668c._comment
new file mode 100644
index 000000000..5129d45da
--- /dev/null
+++ b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_2_351c58b21d36770729c52991901d668c._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-10-15T17:49:00Z"
+ content="""
+The conversion is made when you run `git annex init`, which you have
+apparently not done..
+"""]]

comment
diff --git a/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_2_18ae790f59a9dbeda745e6f6f45384cb._comment b/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_2_18ae790f59a9dbeda745e6f6f45384cb._comment
new file mode 100644
index 000000000..a3a92baa3
--- /dev/null
+++ b/doc/todo/whishlist__58___GPG_alternatives_like_AGE/comment_2_18ae790f59a9dbeda745e6f6f45384cb._comment
@@ -0,0 +1,35 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-10-15T16:45:46Z"
+ content="""
+Mostly people pick on gpg's public key crypto implementation, and mostly
+I think, because public key crypto is easy to pick at. I have not seen
+many such arguments about gpg that seem very convincing to me.
+I see that AGE also implements public key crypto.
+
+The way git-annex uses gpg for encryption=shared does not involve public
+key crypto at all, but uses AES-128, which is about as close as there is
+to a standard for encrypting things. (If you prefer AES-256, gpg can be
+configured to use that instead.)
+
+git-annex already links to an AES implementation for other purposes
+and could probably bypass gpg and use that, but that smells of implementing
+your own crypto. (AES ECB penguin comes to mind.) 
+AGE does not seem to expose any non-public key encryption, 
+so it could not be used for encryption=shared, I think. (Unless perhaps
+the public/private key pair were stored in the repo as the shared
+cipher?)
+
+While encryption=hybrid uses public key crypto, it's only for encrypting
+a cipher file, which then gets used as an AES key. So if encryption=shared
+can't be done with AGE, encryption=hybrid can't either.
+
+That leaves encryption=pubkey and encryption=sharedpubkey, which I
+suppose could have variants implemented using AGE.
+
+I don't like the idea of git-annex supporting more than 2 encryption
+programs, and even 2 seems like 1 too many. Every one will be an ongoing
+cost. It's not clear to me that there's enough of a benefit to support AGE,
+or that it would be the best choice for a +1.
+"""]]

comment
diff --git a/doc/todo/File_deletion_workflow/comment_1_7b80d9a202ede51730cbd443731e1fa7._comment b/doc/todo/File_deletion_workflow/comment_1_7b80d9a202ede51730cbd443731e1fa7._comment
new file mode 100644
index 000000000..2a42f8d31
--- /dev/null
+++ b/doc/todo/File_deletion_workflow/comment_1_7b80d9a202ede51730cbd443731e1fa7._comment
@@ -0,0 +1,39 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-10-15T16:21:19Z"
+ content="""
+Since this was posted, fsck has stopped complaining about files dropped
+with `dropunused`.
+
+> What's important though is that this workflow doesn't involve manually
+> running magic git annex commands for every deleted file on all possible
+> remotes (especially not hard to reach ones).
+
+If a repository is not accessible, this is difficult to implement.
+
+It seems that the closest we can get to implementing it is
+something like `dropunused`, which can be run in that inaccessible
+repository at some later point when it's accessible, and catch up on
+dropping all the files that have become unwanted while it was inaccessible.
+
+One way to do that without relying on the idea of "unused" would be to tag
+a file with metadata saying its content ought to be deleted from
+everywhere. That is possible to do now, eg:
+
+	git annex metadata --tag deletethis foo
+	git annex drop --all --metadata tag=deletethis --force
+
+That drop can be run in every clone over time to delete all the tagged
+files.
+
+I could imagine formalizing this ad-hoc tag into something standard in
+git-annex. Perhaps similar to how dead files are currently indicated.
+But one problem with it is it may not play well in multiuser
+environments where people have different ideas about what files they want
+to delete all copies of. If two users are using dropunused and have a
+disagreement, they will have 2 different branches, which are forked, and
+neither will step on the other's toes when they run dropunused against
+their branches and drop content that is still used on the other person's
+branch. But a tag like "deletethis" is repository global.
+"""]]

Added a comment: Update
diff --git a/doc/forum/Submodule_.git_not_converted_to_symlink/comment_1_5d95da69235eb3c6b60227544380a4e8._comment b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_1_5d95da69235eb3c6b60227544380a4e8._comment
new file mode 100644
index 000000000..841d1e146
--- /dev/null
+++ b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_1_5d95da69235eb3c6b60227544380a4e8._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="athas@60e56fd42a78bbbce444d175865ce4d66ba1a779"
+ nickname="athas"
+ avatar="http://cdn.libravatar.org/avatar/f6ddda1fabf459f90ca590f9499033c4"
+ subject="Update"
+ date="2021-10-15T16:37:34Z"
+ content="""
+Ignore the remark about different behaviour on the same machine; I was looking at the wrong thing.
+
+It does appear that GitHub Actions does not correctly fetch the `annex` directory for submodules, which must be the source of the error.
+"""]]

removed
diff --git a/doc/forum/Submodule_.git_not_converted_to_symlink/comment_1_8cd28022fb5cc201e00c11b4f109fc12._comment b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_1_8cd28022fb5cc201e00c11b4f109fc12._comment
deleted file mode 100644
index f092dd982..000000000
--- a/doc/forum/Submodule_.git_not_converted_to_symlink/comment_1_8cd28022fb5cc201e00c11b4f109fc12._comment
+++ /dev/null
@@ -1,9 +0,0 @@
-[[!comment format=mdwn
- username="athas@60e56fd42a78bbbce444d175865ce4d66ba1a779"
- nickname="athas"
- avatar="http://cdn.libravatar.org/avatar/f6ddda1fabf459f90ca590f9499033c4"
- subject="Update"
- date="2021-10-15T16:13:32Z"
- content="""
-I have found a difference.  The name of the submodule is `futhark-benchmarks`.  On the checkout where `git-annex` works, a directory `.git/modules/futhark-benchmarks/annex` exists.  On the checkout where `git-annex` does not work, this directory is missing.  What could cause it to be lost?  The repository that fails is checked out by Buildbot (and a similar issue occurs on GitHub Actions for that matter).  Maybe they prune things they do not understand, somehow?
-"""]]

Added a comment: Update
diff --git a/doc/forum/Submodule_.git_not_converted_to_symlink/comment_1_8cd28022fb5cc201e00c11b4f109fc12._comment b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_1_8cd28022fb5cc201e00c11b4f109fc12._comment
new file mode 100644
index 000000000..f092dd982
--- /dev/null
+++ b/doc/forum/Submodule_.git_not_converted_to_symlink/comment_1_8cd28022fb5cc201e00c11b4f109fc12._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="athas@60e56fd42a78bbbce444d175865ce4d66ba1a779"
+ nickname="athas"
+ avatar="http://cdn.libravatar.org/avatar/f6ddda1fabf459f90ca590f9499033c4"
+ subject="Update"
+ date="2021-10-15T16:13:32Z"
+ content="""
+I have found a difference.  The name of the submodule is `futhark-benchmarks`.  On the checkout where `git-annex` works, a directory `.git/modules/futhark-benchmarks/annex` exists.  On the checkout where `git-annex` does not work, this directory is missing.  What could cause it to be lost?  The repository that fails is checked out by Buildbot (and a similar issue occurs on GitHub Actions for that matter).  Maybe they prune things they do not understand, somehow?
+"""]]

diff --git a/doc/forum/Submodule_.git_not_converted_to_symlink.mdwn b/doc/forum/Submodule_.git_not_converted_to_symlink.mdwn
new file mode 100644
index 000000000..52ecfe53e
--- /dev/null
+++ b/doc/forum/Submodule_.git_not_converted_to_symlink.mdwn
@@ -0,0 +1,13 @@
+According to [this page](https://git-annex.branchable.com/submodules/), `git-annex` should automatically convert the `.git` file of submodules into a symlink.  However, I have a repository where on some machines, this doesn't happen.
+
+    $ file .git
+    .git: ASCII text
+
+
+    $ git-annex info
+    git-annex: First run: git-annex init
+
+    $ file .git
+    .git: ASCII text
+
+Even more mysteriously, it works on some *other* checkouts of the repository *on the same machine*.

When retrieval from a chunked remote fails, display the error that occurred when downloading the chunk
Rather than the error that occurred when trying to download the unchunked
content, which is less likely to actually be stored in the remote.
Sponsored-by: Boyd Stephen Smith Jr. on Patreon
diff --git a/CHANGELOG b/CHANGELOG
index dfd6b555c..e8e4ad1d0 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -10,6 +10,10 @@ git-annex (8.20211012) UNRELEASED; urgency=medium
   * test: Put gpg temp home directory in system temp directory,
     not filesystem being tested.
   * Avoid crashing tilde expansion on user who does not exist.
+  * When retrieval from a chunked remote fails, display the error that
+    occurred when downloading the chunk, rather than the error that
+    occurred when trying to download the unchunked content, which is less
+    likely to actually be stored in the remote.
 
  -- Joey Hess <id@joeyh.name>  Mon, 11 Oct 2021 14:09:13 -0400
 
diff --git a/Remote/Helper/Chunked.hs b/Remote/Helper/Chunked.hs
index ade47f3f4..b56d43389 100644
--- a/Remote/Helper/Chunked.hs
+++ b/Remote/Helper/Chunked.hs
@@ -286,9 +286,9 @@ retrieveChunks retriever u vc chunkconfig encryptor basek dest basep enc encc
 	firstavail Nothing _ [] = giveup "unable to determine the chunks to use for this remote"
 	firstavail (Just e) _ [] = throwM e
 	firstavail pe currsize ([]:ls) = firstavail pe currsize ls
-	firstavail _ currsize ((k:ks):ls)
+	firstavail pe currsize ((k:ks):ls)
 		| k == basek = getunchunked
-			`catchNonAsync` (\e -> firstavail (Just e) currsize ls)
+			`catchNonAsync` (\e -> firstavail (Just (pickerr e)) currsize ls)
 		| otherwise = do
 			let offset = resumeOffset currsize k
 			let p = maybe basep
@@ -302,10 +302,15 @@ retrieveChunks retriever u vc chunkconfig encryptor basek dest basep enc encc
 							fromMaybe 0 $ fromKey keyChunkSize k
 						getrest p h iv sz sz ks
 			case v of
-				Left e
-					| null ls -> throwM e
-					| otherwise -> firstavail (Just e) currsize ls
+				Left e -> firstavail (Just (pickerr e)) currsize ls
 				Right r -> return r
+	  where
+		-- Prefer an earlier exception to a later one, because the
+		-- more probable location is tried first and less probable
+		-- ones later.
+		pickerr e = case pe of
+			Just pe' -> pe'
+			Nothing -> e
 
 	getrest _ _ iv _ _ [] = return (Right iv)
 	getrest p h iv sz bytesprocessed (k:ks) = do
diff --git a/doc/bugs/Improvements_to_S3_glacier_integration.mdwn b/doc/bugs/Improvements_to_S3_glacier_integration.mdwn
index 233acfdbe..983869528 100644
--- a/doc/bugs/Improvements_to_S3_glacier_integration.mdwn
+++ b/doc/bugs/Improvements_to_S3_glacier_integration.mdwn
@@ -115,3 +115,5 @@ ongoing-request="false", expiry-date="Mon, 26 Apr 2021 00:00:00 GMT"
 ### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
 
 Yes, for loads of stuff. It's awesome, thanks!
+
+> [[closed|done]], see my comment --[[Joey]]
diff --git a/doc/bugs/Improvements_to_S3_glacier_integration/comment_5_9cbe83bbade15b9146d033ceb5d8b05d._comment b/doc/bugs/Improvements_to_S3_glacier_integration/comment_5_9cbe83bbade15b9146d033ceb5d8b05d._comment
new file mode 100644
index 000000000..b0dfda330
--- /dev/null
+++ b/doc/bugs/Improvements_to_S3_glacier_integration/comment_5_9cbe83bbade15b9146d033ceb5d8b05d._comment
@@ -0,0 +1,19 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 5"""
+ date="2021-10-14T16:17:14Z"
+ content="""
+I have added a note to the S3 documentation about `DEEP_ARCHIVE` and the
+glacier special remote.
+
+I have made git-annex display the exception for the more likely chunked 
+location, rather than the less likely unchunked location, when retrieving
+from both locations fails. Although it's still possible for there to be
+situations where the exception it displays is not for the location where
+the content actually is. Eg, if the chunk size of the remote has
+changed over time.
+
+I think that todo is basically talking about the same desire to make the S3
+remote support these glacier-style storage classes, in one way or another, 
+and so I think this bug report can be closed as otherwise a duplicate of it.
+"""]]
diff --git a/doc/special_remotes/S3.mdwn b/doc/special_remotes/S3.mdwn
index 71d74a533..e34cbb7f9 100644
--- a/doc/special_remotes/S3.mdwn
+++ b/doc/special_remotes/S3.mdwn
@@ -44,14 +44,17 @@ the S3 remote.
   
   When using Amazon S3,
   if the remote will be used for backup or archival,
-  and so its files are Infrequently Accessed, "STANDARD_IA" is a
+  and so its files are Infrequently Accessed, `STANDARD_IA` is a
   good choice to save money (requires a git-annex built with aws-0.13.0).
   If you have configured git-annex to preserve
-  multiple [[copies]], also consider setting this to "ONEZONE_IA"
+  multiple [[copies]], also consider setting this to `ONEZONE_IA`
   to save even more money.
 
+  Amazon S3's `DEEP_ARCHIVE` is similar to Amazon Glacier. For that,
+  use the [[glacier]] special remote, rather than this one.
+
   When using Google Cloud Storage, to make a nearline bucket, set this to
-  "NEARLINE". (Requires a git-annex built with aws-0.13.0)
+  `NEARLINE`. (Requires a git-annex built with aws-0.13.0)
 
   Note that changing the storage class of an existing S3 remote will
   affect new objects sent to the remote, but not objects already

followup
diff --git a/doc/todo/Fsck_remote_files_in-flight/comment_2_3baff888fbc3068571cdc9fee73fbe36._comment b/doc/todo/Fsck_remote_files_in-flight/comment_2_3baff888fbc3068571cdc9fee73fbe36._comment
new file mode 100644
index 000000000..35c53cd76
--- /dev/null
+++ b/doc/todo/Fsck_remote_files_in-flight/comment_2_3baff888fbc3068571cdc9fee73fbe36._comment
@@ -0,0 +1,29 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-10-14T15:55:14Z"
+ content="""
+Checksum during transfer is now implemented for as many remotes as it
+reasonably can be, which is almost all of them. But not 100% of all
+remotes in all circumstances. And there's no way to know if a remote
+will support it before doing the transfer.
+
+To avoid changing the API, it occurs to me that retrieveKeyFile could be
+passed `/dev/null`. But any remote that does not support resuming and tries
+to overwrite the existing destination file would fail.
+
+Also some kinds of remotes download to the file in one process or thread
+and while the download is happening, git-annex checksums the file as new
+data appears in it. External special remotes in particular do this.
+That would break with `/dev/null` too.
+
+Putting the temp file on some other medium seems like the only way to
+address this. If there were a config of a directory to use, you could point
+it at a disk rather than the SSD, or even at a ram disk, if you have
+sufficient memory. Unsure if it's worth adding such an option though,
+probably few people would use it. And cloning the repository onto the other
+medium and running the remote fsck from there would have the same result
+without needing an option.
+
+I'm inclined to close this, since I don't think it can be addressed.
+"""]]

remove 3 comments that turned out to be about an unrelated problem which got its own bug report
diff --git a/doc/todo/OPT__58_____34__bundle__34___get_+_check___40__of_checksum__41___in_a_single_operation/comment_17_225c1890457835884f9a46359935c0b0._comment b/doc/todo/OPT__58_____34__bundle__34___get_+_check___40__of_checksum__41___in_a_single_operation/comment_17_225c1890457835884f9a46359935c0b0._comment
deleted file mode 100644
index dd25285d7..000000000
--- a/doc/todo/OPT__58_____34__bundle__34___get_+_check___40__of_checksum__41___in_a_single_operation/comment_17_225c1890457835884f9a46359935c0b0._comment
+++ /dev/null
@@ -1,18 +0,0 @@
-[[!comment format=mdwn
- username="jkniiv@b330fc3a602d36a37a67b2a2d99d4bed3bb653cb"
- nickname="jkniiv"
- avatar="http://cdn.libravatar.org/avatar/419f2eee8b0c37256488fabcc2737ff2"
- subject="`git annex sync --no-commit --content` takes double the time of `git annex get .`"
- date="2021-08-20T02:05:53Z"
- content="""
-Hi Joey! Could you think of a reason why a `git annex sync --no-commit --content` takes pretty
-much double the time to retrieve an annexed file than a `git annex get .` in the same directory /
-git remote with this one file added in origin? Naturally I drop the file inbetween measurements
-and my files are of multi-gigabyte size so the delay is really noticeable between two HDDs.
-It seems that the extra time (and disk activity) takes place after the retrieval has hit 100% so
-to me it feels like git-annex is doing an extra verification pass after copying the file. This is
-on Windows and git-annex is a fresh `8.20210804-g492036622` but I seem to have noticed it
-happening even before your latest development activities above (as in months ago). Should I
-file a bug report about this?
-
-"""]]
diff --git a/doc/todo/OPT__58_____34__bundle__34___get_+_check___40__of_checksum__41___in_a_single_operation/comment_18_ea896de521803d4862a886a5ddb9f505._comment b/doc/todo/OPT__58_____34__bundle__34___get_+_check___40__of_checksum__41___in_a_single_operation/comment_18_ea896de521803d4862a886a5ddb9f505._comment
deleted file mode 100644
index 5415a2fc2..000000000
--- a/doc/todo/OPT__58_____34__bundle__34___get_+_check___40__of_checksum__41___in_a_single_operation/comment_18_ea896de521803d4862a886a5ddb9f505._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="joey"
- subject="""comment 18"""
- date="2021-08-24T15:09:12Z"
- content="""
-@jkniiv I think you are seeing something unrelated, such as other scanning that
-git-annex sync has to do.
-"""]]
diff --git a/doc/todo/OPT__58_____34__bundle__34___get_+_check___40__of_checksum__41___in_a_single_operation/comment_19_ecf3206253148199815fe0ed1e9c25b5._comment b/doc/todo/OPT__58_____34__bundle__34___get_+_check___40__of_checksum__41___in_a_single_operation/comment_19_ecf3206253148199815fe0ed1e9c25b5._comment
deleted file mode 100644
index 726a9e78f..000000000
--- a/doc/todo/OPT__58_____34__bundle__34___get_+_check___40__of_checksum__41___in_a_single_operation/comment_19_ecf3206253148199815fe0ed1e9c25b5._comment
+++ /dev/null
@@ -1,14 +0,0 @@
-[[!comment format=mdwn
- username="jkniiv@b330fc3a602d36a37a67b2a2d99d4bed3bb653cb"
- nickname="jkniiv"
- avatar="http://cdn.libravatar.org/avatar/419f2eee8b0c37256488fabcc2737ff2"
- subject="it turns out I had to file this as a bug"
- date="2021-08-27T01:38:29Z"
- content="""
-@joey I don't know what you mean by scanning in this case but I now have proof that git-annex really is
-doing an extra read pass over the whole file after it's been transferred from one regular remote to the
-other in the case of `sync --content[-of _file_]`. I filed a bug with details:
-[`sync -C` takes longer to get file than `get`](https://git-annex.branchable.com/bugs/__96__sync_-C__96___takes_longer_to_get_file_than___96__get__96__/)
-(btw. I can't seem to be able to make this into a WikiLink as ikiwiki thinks of the underscores of the page name as denoting strong emphasis).
-
-"""]]

close per comment
diff --git a/doc/bugs/Windows__58___drop_claims_that___34__content_is_locked__34__.mdwn b/doc/bugs/Windows__58___drop_claims_that___34__content_is_locked__34__.mdwn
index a8c4682cf..3b1416d03 100644
--- a/doc/bugs/Windows__58___drop_claims_that___34__content_is_locked__34__.mdwn
+++ b/doc/bugs/Windows__58___drop_claims_that___34__content_is_locked__34__.mdwn
@@ -95,3 +95,5 @@ so it seems to happily create that file.
 
 [[!meta author=yoh]]
 [[!tag projects/datalad]]
+
+> [[done]] apparently, reopen if necessary --[[Joey]]

open todo
diff --git a/doc/bugs/shared_setting_of_git_causes_annex__39__ed_files_to_be_writeable__33__/comment_7_a82064795fdbdb0187763aeee4a308ff._comment b/doc/bugs/shared_setting_of_git_causes_annex__39__ed_files_to_be_writeable__33__/comment_7_a82064795fdbdb0187763aeee4a308ff._comment
new file mode 100644
index 000000000..add0385dd
--- /dev/null
+++ b/doc/bugs/shared_setting_of_git_causes_annex__39__ed_files_to_be_writeable__33__/comment_7_a82064795fdbdb0187763aeee4a308ff._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 7"""
+ date="2021-10-12T17:40:51Z"
+ content="""
+I've opened [[todo/v9_changes]] and this can be taken care of if/when a v9
+happens. It's unfortunate I didn't think of this bug when doing v8, so
+hopefully that new todo will remind me about it for v9.
+"""]]
diff --git a/doc/todo/v9_changes.mdwn b/doc/todo/v9_changes.mdwn
new file mode 100644
index 000000000..75b5648bc
--- /dev/null
+++ b/doc/todo/v9_changes.mdwn
@@ -0,0 +1,13 @@
+This is a todo for collecting changes that could lead to a v9 repository
+version.
+
+Currently, there does not seem to be enough reason to warrant one, but that
+could change and if it does, these things could be included.
+
+* Change locking of annexed files to use a separate lock file
+  rather than posix locking the file itself.
+
+  This would let write bits be removed from the file when
+  core.sharedRepository is set. See <https://git-annex.branchable.com/bugs/shared_setting_of_git_causes_annex__39__ed_files_to_be_writeable__33__/>
+
+  Note that windows already uses a separate lock file.
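
The separate-lock-file idea from this todo can be sketched roughly as follows. This is a minimal Python illustration on a POSIX system, not git-annex's implementation: the `.lck` suffix and the helper names `lock_content_shared` and `demo` are assumptions invented for the example.

```python
import fcntl
import os
import stat
import tempfile
from contextlib import contextmanager


@contextmanager
def lock_content_shared(content_path):
    """Take a shared POSIX lock on a sidecar lock file, not on the content.

    The ".lck" suffix is a hypothetical naming choice for this sketch.
    """
    lock_path = content_path + ".lck"
    # Only the lock file is opened read-write; the content file is never
    # opened at all, so it can stay read-only.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_SH)
        yield lock_path
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)


def demo():
    """Lock a read-only content file and return its permission bits."""
    with tempfile.TemporaryDirectory() as d:
        content = os.path.join(d, "content")
        with open(content, "w") as f:
            f.write("annexed data")
        # Remove every write bit, as the todo hopes to allow.
        os.chmod(content, 0o444)
        with lock_content_shared(content):
            return stat.S_IMODE(os.stat(content).st_mode)
```

Because the lock lives on a different file, the content's permissions play no part in taking the lock, which is the property the todo is after.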

move gpg tmp home to system temp dir
test: Put gpg temp home directory in system temp directory, not filesystem
being tested.
I've found indications that gpg can fail to talk to the agent when the
socket ends up on, eg, FAT. This will hopefully also fix the bug report
I've followed up on.
The main risk in using the system temp dir is that TMPDIR could be set to a
long directory path, which is too long to put a unix socket in. To
partially ameliorate that risk, it uses either an absolute or a relative
path, whichever is shorter. (Hopefully gpg will not convert it to a longer
form of the path..)
If the user sets TMPDIR to something so long a path to it +
"S.gpg-agent" is too long, I suppose that's their issue to deal with.
Sponsored-by: Dartmouth College's Datalad project
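
The "whichever is shorter" path choice described in this commit message can be illustrated with a small Python sketch. It is not the Haskell code from the commit; the function names and the 108-byte limit (the size of `sun_path` on Linux, which differs on other systems) are assumptions made for the example.

```python
import os

# Assumption: Linux's sun_path holds 108 bytes including the trailing NUL.
SUN_PATH_MAX = 108


def shortest_path_form(path, start=None):
    """Return the absolute or the relative form of path, whichever is shorter."""
    abs_path = os.path.abspath(path)
    rel_path = os.path.relpath(abs_path, start or os.getcwd())
    # Prefer the relative form only when it is strictly shorter.
    return rel_path if len(rel_path) < len(abs_path) else abs_path


def socket_path_fits(directory, name="S.gpg-agent"):
    """Check whether a socket called name inside directory fits in sun_path."""
    return len(os.path.join(directory, name).encode()) < SUN_PATH_MAX
```

For example, `shortest_path_form("/tmp/gpgtmp", start="/tmp")` gives `gpgtmp`, while a directory far away from the working directory keeps its absolute form because the chain of `../` components would be longer.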
diff --git a/CHANGELOG b/CHANGELOG
index c9202beb2..a4ff6fb61 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -7,6 +7,8 @@ git-annex (8.20211012) UNRELEASED; urgency=medium
   * Negotiate P2P protocol version with tor remotes, allowing
     use of protocol version 1. This negotiation is not supported
     by versions of git-annex older than 6.20180312.
+  * test: Put gpg temp home directory in system temp directory,
+    not filesystem being tested.
 
  -- Joey Hess <id@joeyh.name>  Mon, 11 Oct 2021 14:09:13 -0400
 
diff --git a/Test.hs b/Test.hs
index eb2f7e397..d0647c37f 100644
--- a/Test.hs
+++ b/Test.hs
@@ -1821,13 +1821,20 @@ test_crypto = do
 	testscheme "pubkey"
   where
 	gpgcmd = Utility.Gpg.mkGpgCmd Nothing
-	testscheme scheme = do
-		abstmp <- fromRawFilePath <$> absPath (toRawFilePath tmpdir)
-		testscheme' scheme abstmp
-	testscheme' scheme abstmp = intmpclonerepo $ do
-		gpgtmp <- (</> "gpgtmp") . fromRawFilePath
-			<$> relPathCwdToFile (toRawFilePath abstmp)
-		createDirectoryIfMissing False gpgtmp
+	testscheme scheme = Utility.Tmp.Dir.withTmpDir "gpgtmp" $ \gpgtmp -> do
+		-- Use the system temp directory as gpg temp directory because 
+		-- it needs to be able to store the agent socket there,
+		-- which can be problematic when testing some filesystems.
+		absgpgtmp <- fromRawFilePath <$> absPath (toRawFilePath gpgtmp)
+		testscheme' scheme absgpgtmp
+	testscheme' scheme absgpgtmp = intmpclonerepo $ do
+		-- Since gpg uses a unix socket, which is limited to a
+		-- short path, use whichever is shorter of absolute
+		-- or relative path.
+		relgpgtmp <- fromRawFilePath <$> relPathCwdToFile (toRawFilePath absgpgtmp)
+		let gpgtmp = if length relgpgtmp < length absgpgtmp
+			then relgpgtmp 
+			else absgpgtmp
 		Utility.Gpg.testTestHarness gpgtmp gpgcmd
 			@? "test harness self-test failed"
 		void $ Utility.Gpg.testHarness gpgtmp gpgcmd $ do
diff --git a/doc/bugs/gpgconf__58___invalid_option___34__--kill__34_____40__gpg_2.0.22__41___/comment_3_d3a478e3f7da91ab14c9b1e24d3e07d6._comment b/doc/bugs/gpgconf__58___invalid_option___34__--kill__34_____40__gpg_2.0.22__41___/comment_3_d3a478e3f7da91ab14c9b1e24d3e07d6._comment
new file mode 100644
index 000000000..6d53afe66
--- /dev/null
+++ b/doc/bugs/gpgconf__58___invalid_option___34__--kill__34_____40__gpg_2.0.22__41___/comment_3_d3a478e3f7da91ab14c9b1e24d3e07d6._comment
@@ -0,0 +1,58 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2021-10-12T16:42:02Z"
+ content="""
+From the log:
+
+	    crypto:                                               gpgconf: invalid option "--kill"
+	gpgconf: invalid option "--kill"
+	FAIL (22.13s)
+	      ./Test/Framework.hs:57:
+	      copy --to encrypted remote failed (transcript follows)
+	      copy foo (to foo...)
+	      gpg: can't connect to the agent: Invalid value passed to IPC
+	      gpg: problem with the agent: No agent running
+
+That is not really a problem with gpgconf --kill, but a problem
+talking to gpg-agent.
+
+The same crypto test fails a couple more times in that log, like this:
+
+	    crypto:                                               gpgconf: invalid option "--kill"
+	gpgconf: invalid option "--kill"
+	FAIL (12.00s)
+	      ./Test/Framework.hs:57:
+	      get of file failed (transcript follows)
+	      get foo (not available) 
+	        No other repository is known to contain the file.
+	      failed
+	      get: 1 failed
+
+That is also not a problem with gpgconf --kill, it's actually due to an
+earlier test failure, unrelated to this. That earlier failure was
+the one [the other issue](https://git-annex.branchable.com/bugs/__34__357_out_of_984_tests_failed__34___on_NFS_lustre_mount/)
+was about, which has since been fixed. So we can ignore these I think,
+leaving only the one above as an unexplained failure.
+
+"gpg: can't connect to the agent: Invalid value passed to IPC" could 
+be some kind of gpg bug. I found some other instances of gpg failing that way.
+One involved using --homedir (similar to the test suite's
+use of GNUPGHOME) but on windows.
+<https://lists.gnupg.org/pipermail/gnupg-users/2016-October/056817.html>
+And here's another one, in WSL when apt runs
+gpg. <https://github.com/microsoft/WSL/issues/5125>
+
+Perhaps this is a problem with the location of the gpg agent socket in the
+filesystem that git-annex test is running in. That somehow messes up not the
+creation of that socket, but the later use of it. It seems that the earlier
+self-test of the test harness did not trigger the problem though, which is
+odd because it sets up a gpg private key and I'd think would use the agent
+too.
+
+In [[!commit b426ff682570d8600dc8025bbcd20aa95819a7e4]] I considered
+putting the gpg directory inside the system temp dir, which would perhaps
+avoid the problem here. I've made that change.
+
+Please test a fresh build on this system again, if you can..
+"""]]

negotiate P2P protocol version for tor remotes
This negotiation is not supported by versions of git-annex older
than 6.20180312. Well, maybe really 6.20180227 or so, but using that in
the changelog simplifies things since it was the version for the other
changes as well.
See commit c81768d425725f868ddf23333d49eed0f3fda011 for the back story.
As well as allowing for future protocol improvements, this will result
in negotiating protocol version 1, which is an improvement over default
version 0.
In fact, it looks like no supported version of git-annex will use
protocol version 0, since version 1 was introduced in 6.20180227.
Still, removing the code for version 0 seems unnecessary.
See commit 31e1adc005e8181c5190a9f39a3f148e10f4d364.
Sponsored-by: Brett Eisenberg on Patreon.
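
The outcome of the negotiation described above can be sketched in a few lines of Python. This only models which version two peers end up using; it is not the P2P wire format, and `negotiate_version` is a hypothetical helper, not a git-annex function.

```python
def negotiate_version(client_max, server_supported):
    """Pick the highest protocol version that both sides support.

    Falls back to version 0, the default, when nothing newer is shared.
    """
    common = [v for v in server_supported if v <= client_max]
    return max(common, default=0)
```

Under this model a client whose maximum is 1, talking to a server that supports versions 0 and 1, settles on version 1, which is the improvement the commit message mentions.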
diff --git a/CHANGELOG b/CHANGELOG
index 0d8ce53f3..c9202beb2 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -4,6 +4,9 @@ git-annex (8.20211012) UNRELEASED; urgency=medium
     git-annex older than 6.20180312.
   * git-annex-shell: Removed several commands that were only needed to
     support git-annex versions older than 6.20180312.
+  * Negotiate P2P protocol version with tor remotes, allowing
+    use of protocol version 1. This negotiation is not supported
+    by versions of git-annex older than 6.20180312.
 
  -- Joey Hess <id@joeyh.name>  Mon, 11 Oct 2021 14:09:13 -0400
 
diff --git a/Remote/P2P.hs b/Remote/P2P.hs
index 2c6056a91..15c2ea1f3 100644
--- a/Remote/P2P.hs
+++ b/Remote/P2P.hs
@@ -148,14 +148,7 @@ openConnection u addr = do
 			authtoken <- fromMaybe nullAuthToken
 				<$> loadP2PRemoteAuthToken addr
 			let proto = P2P.auth myuuid authtoken $
-				-- Before 6.20180312, the protocol server
-				-- had a bug that made negotiating the
-				-- protocol version terminate the
-				-- connection. So, this must stay disabled
-				-- until the old version is not in use
-				-- anywhere.
-				--P2P.negotiateProtocolVersion P2P.maxProtocolVersion
-				return ()
+				P2P.negotiateProtocolVersion P2P.maxProtocolVersion
 			runst <- liftIO $ mkRunState Client
 			res <- liftIO $ runNetProto runst conn proto
 			case res of
diff --git a/doc/todo/p2p_protocol_flag_days.mdwn b/doc/todo/p2p_protocol_flag_days.mdwn
index 967295b77..0b0abcd9e 100644
--- a/doc/todo/p2p_protocol_flag_days.mdwn
+++ b/doc/todo/p2p_protocol_flag_days.mdwn
@@ -10,6 +10,8 @@ historical bug, the version is not currently negotiated when using the
 protocol over tor. At some point in the future, when all peers can be
 assumed to be upgraded, this should be changed.
 
+> [[done]] --[[Joey]]
+
 ## git-annex-shell fallbacks
 
 When using git-annex-shell p2pio, git-annex assumes that if it exits 1,
@@ -21,6 +23,6 @@ can be assumed to be upgraded to 6.20180312, this fallback can be removed.
 It will allow removing a lot of code from git-annex-shell and a lot of
 fallback code from Remote.Git.
 
-> This part is done now. --[[Joey]]
+> [[done]] --[[Joey]]
 
 [[!tag confirmed]]

remove git-annex-shell compat code
* Removed support for accessing git remotes that use versions of
git-annex older than 6.20180312.
* git-annex-shell: Removed several commands that were only needed to
support git-annex versions older than 6.20180312.
(lockcontent, recvkey, sendkey, transferinfo, commit)
The P2P protocol was added in that version, and used ever since, so
this code was only needed for interop with older versions.
"git-annex-shell commit" is used by newer git-annex versions, though
unnecessarily so, because the p2pstdio command makes a single commit at
shutdown. Luckily, it was run with stderr and stdout sent to /dev/null,
and non-zero exit status or other exceptions are caught and ignored. So,
that was able to be removed from git-annex-shell too.
git-annex-shell inannex, recvkey, sendkey, and dropkey are still used by
gcrypt special remotes accessed over ssh, so those had to be kept.
It would probably be possible to convert that to using the P2P protocol,
but it would be another multi-year transition.
Some git-annex-shell fields were able to be removed. I hoped to remove
all of them, and the very concept of them, but unfortunately autoinit
is used by git-annex sync, and gcrypt uses remoteuuid.
The main win here is really in Remote.Git, removing piles of hairy fallback
code.
Sponsored-by: Luke Shumaker
diff --git a/CHANGELOG b/CHANGELOG
index 58bfe59d5..9e61ac495 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,13 @@
+git-annex (8.20211012) UNRELEASED; urgency=medium
+
+  * Removed support for accessing git remotes that use versions of
+    git-annex older than 6.20180312.
+  * git-annex-shell: Removed several commands that were only needed to
+    support git-annex versions older than 6.20180312.
+    (lockcontent, recvkey, sendkey, transferinfo, commit)
+
+ -- Joey Hess <id@joeyh.name>  Mon, 11 Oct 2021 14:09:13 -0400
+
 git-annex (8.20211011) upstream; urgency=medium
 
   * Added annex.bwlimit and remote.name.annex-bwlimit config to limit
diff --git a/CmdLine/GitAnnexShell.hs b/CmdLine/GitAnnexShell.hs
index 55c3ae1ee..ba4f402c1 100644
--- a/CmdLine/GitAnnexShell.hs
+++ b/CmdLine/GitAnnexShell.hs
@@ -20,16 +20,13 @@ import Remote.GCrypt (getGCryptUUID)
 import P2P.Protocol (ServerMode(..))
 
 import qualified Command.ConfigList
-import qualified Command.InAnnex
-import qualified Command.LockContent
-import qualified Command.DropKey
-import qualified Command.RecvKey
-import qualified Command.SendKey
-import qualified Command.TransferInfo
-import qualified Command.Commit
 import qualified Command.NotifyChanges
 import qualified Command.GCryptSetup
 import qualified Command.P2PStdIO
+import qualified Command.InAnnex
+import qualified Command.RecvKey
+import qualified Command.SendKey
+import qualified Command.DropKey
 
 import qualified Data.Map as M
 
@@ -42,18 +39,15 @@ cmdsMap = M.fromList $ map mk
   where
 	readonlycmds = map addGlobalOptions
 		[ Command.ConfigList.cmd
-		, gitAnnexShellCheck Command.InAnnex.cmd
-		, gitAnnexShellCheck Command.LockContent.cmd
-		, gitAnnexShellCheck Command.SendKey.cmd
-		, gitAnnexShellCheck Command.TransferInfo.cmd
 		, gitAnnexShellCheck Command.NotifyChanges.cmd
 		-- p2pstdio checks the environment variables to
 		-- determine the security policy to use
 		, gitAnnexShellCheck Command.P2PStdIO.cmd
+		, gitAnnexShellCheck Command.InAnnex.cmd
+		, gitAnnexShellCheck Command.SendKey.cmd
 		]
 	appendcmds = readonlycmds ++ map addGlobalOptions
 		[ gitAnnexShellCheck Command.RecvKey.cmd
-		, gitAnnexShellCheck Command.Commit.cmd
 		]
 	allcmds = map addGlobalOptions
 		[ gitAnnexShellCheck Command.DropKey.cmd
@@ -166,9 +160,6 @@ parseFields = map (separate (== '='))
 checkField :: (String, String) -> Bool
 checkField (field, val)
 	| field == fieldName remoteUUID = fieldCheck remoteUUID val
-	| field == fieldName associatedFile = fieldCheck associatedFile val
-	| field == fieldName unlocked = fieldCheck unlocked val
-	| field == fieldName direct = fieldCheck direct val
 	| field == fieldName autoInit = fieldCheck autoInit val
 	| otherwise = False
 
diff --git a/CmdLine/GitAnnexShell/Fields.hs b/CmdLine/GitAnnexShell/Fields.hs
index 639adf347..1e416cdb3 100644
--- a/CmdLine/GitAnnexShell/Fields.hs
+++ b/CmdLine/GitAnnexShell/Fields.hs
@@ -9,7 +9,6 @@ module CmdLine.GitAnnexShell.Fields where
 
 import Annex.Common
 import qualified Annex
-import Git.FilePath
 
 import Data.Char
 
@@ -27,14 +26,6 @@ remoteUUID = Field "remoteuuid" $
 	-- does it look like a UUID?
 	all (\c -> isAlphaNum c || c == '-')
 
-associatedFile :: Field
-associatedFile = Field "associatedfile" $ \f ->
-	-- is the file a safe relative filename?
-	not (absoluteGitPath (toRawFilePath f)) && not ("../" `isPrefixOf` f)
-
-direct :: Field
-direct = Field "direct" $ \f -> f == "1"
-
 unlocked :: Field
 unlocked = Field "unlocked" $ \f -> f == "1"
 
diff --git a/Command/Commit.hs b/Command/Commit.hs
deleted file mode 100644
index 1175a0d52..000000000
--- a/Command/Commit.hs
+++ /dev/null
@@ -1,32 +0,0 @@
-{- git-annex command
- -
- - Copyright 2012 Joey Hess <id@joeyh.name>
- -
- - Licensed under the GNU AGPL version 3 or higher.
- -}
-
-module Command.Commit where
-
-import Command
-import qualified Annex.Branch
-import qualified Git
-import Git.Types
-
-cmd :: Command
-cmd = command "commit" SectionPlumbing 
-	"commits any staged changes to the git-annex branch"
-	paramNothing (withParams seek)
-
-seek :: CmdParams -> CommandSeek
-seek = withNothing (commandAction start)
-
-start :: CommandStart
-start = starting "commit" ai si $ do
-	Annex.Branch.commit =<< Annex.Branch.commitMessage
-	_ <- runhook <=< inRepo $ Git.hookPath "annex-content"
-	next $ return True
-  where
-	runhook (Just hook) = liftIO $ boolSystem hook []
-	runhook Nothing = return True
-	ai = ActionItemOther (Just (fromRef Annex.Branch.name))
-	si = SeekInput []
diff --git a/Command/LockContent.hs b/Command/LockContent.hs
deleted file mode 100644
index c57801880..000000000
--- a/Command/LockContent.hs
+++ /dev/null
@@ -1,41 +0,0 @@
-{- git-annex-shell command
- -
- - Copyright 2015 Joey Hess <id@joeyh.name>
- -
- - Licensed under the GNU AGPL version 3 or higher.
- -}
-
-module Command.LockContent where
-
-import Command
-import Annex.Content
-import Remote.Helper.Ssh (contentLockedMarker)
-import Utility.SimpleProtocol
-
-cmd :: Command
-cmd = noCommit $ 
-	command "lockcontent" SectionPlumbing 
-		"locks key's content in the annex, preventing it being dropped"
-		paramKey
-		(withParams seek)
-
-seek :: CmdParams -> CommandSeek
-seek = withWords (commandAction . start)
-
--- First, lock the content, then print out "OK". 
--- Wait for the caller to send a line before dropping the lock.
-start :: [String] -> CommandStart
-start [ks] = do
-	ok <- lockContentShared k (const locksuccess)
-		`catchNonAsync` (const $ return False)
-	liftIO $ if ok
-		then exitSuccess
-		else exitFailure
-  where
-	k = fromMaybe (giveup "bad key") (deserializeKey ks)
-	locksuccess = liftIO $ do
-		putStrLn contentLockedMarker
-		hFlush stdout
-		_ <- getProtocolLine stdin
-		return True
-start _ = giveup "Specify exactly 1 key."
diff --git a/Command/RecvKey.hs b/Command/RecvKey.hs
index 2b49ca84a..e6832e32e 100644
--- a/Command/RecvKey.hs
+++ b/Command/RecvKey.hs
@@ -15,7 +15,6 @@ import Utility.Rsync
 import Types.Transfer
 import Logs.Location
 import Command.SendKey (fieldTransfer)
-import qualified CmdLine.GitAnnexShell.Fields as Fields
 
 cmd :: Command
 cmd = noCommit $ command "recvkey" SectionPlumbing 
@@ -27,14 +26,9 @@ seek = withKeys (commandAction . start)
 

(Diff truncated)
add news item for git-annex 8.20211011
diff --git a/doc/news/version_8.20210621.mdwn b/doc/news/version_8.20210621.mdwn
deleted file mode 100644
index 0a5797bd7..000000000
--- a/doc/news/version_8.20210621.mdwn
+++ /dev/null
@@ -1,35 +0,0 @@
-git-annex 8.20210621 released with [[!toggle text="these changes"]]
-[[!toggleable text="""  * New matching options --excludesamecontent and --includesamecontent
-  * When two files have the same content, and a required content expression
-    matches one but not the other, dropping the latter file will fail as it
-    would also remove the content of the required file.
-  * drop, move, mirror: When two files have the same content, and
-    different numcopies or requiredcopies values, use the higher value.
-  * drop --auto: When two files have the same content, and a preferred content
-    expression matches one but not the other, do not drop the content.
-  * sync --content, assistant: When two unlocked files have the same
-    content, and a preferred content expression matches one but not the
-    other, do not drop the content. (This was already the case for locked
-    files.)
-  * sync --content, assistant: Fix an edge case where a file that is not
-    preferred content did not get dropped.
-  * filter-branch: New command, useful to produce a filtered version of the
-    git-annex branch, eg when splitting a repository.
-  * fromkey: Create an unlocked file when used in an adjusted branch
-    where the file should be unlocked, or when configured by annex.addunlocked.
-  * Fix behavior of several commands, including reinject, addurl, and rmurl
-    when given an absolute path to an unlocked file, or a relative path
-    that leaves and re-enters the repository.
-  * smudge: Fix a case where an unlocked annexed file that annex.largefiles
-    does not match could get its unchanged content checked into git,
-    due to git running the smudge filter unnecessarily.
-  * reinject: Error out when run on a file that is not annexed, rather
-    than silently skipping it.
-  * assistant: Fix a crash on startup by avoiding using forkProcess.
-  * init: When annex.commitmessage is set, use that message for the commit
-    that creates the git-annex branch.
-  * Added annex.adviceNoSshCaching config.
-  * Added --size-limit option.
-  * Future proof activity log parsing.
-  * Fix an exponential slowdown when large numbers of duplicate files are
-    being added in unlocked form."""]]
\ No newline at end of file
diff --git a/doc/news/version_8.20211011.mdwn b/doc/news/version_8.20211011.mdwn
new file mode 100644
index 000000000..caf17585e
--- /dev/null
+++ b/doc/news/version_8.20211011.mdwn
@@ -0,0 +1,19 @@
+git-annex 8.20211011 released with [[!toggle text="these changes"]]
+[[!toggleable text="""  * Added annex.bwlimit and remote.name.annex-bwlimit config to limit
+    the bandwidth of transfers. It works for git remotes and many
+    but not all special remotes.
+  * Bug fix: Git configs such as annex.verify were incorrectly overriding
+    per-remote git configs such as remote.name.annex-verify.
+    (Reversion in version 4.20130323)
+  * borg: Significantly improved memory use when a borg repository
+    contains many archives.
+  * borg: Avoid trying to extract xattrs, ACLS, and bsdflags when
+    retrieving from a borg repository.
+  * Sped up git-annex smudge --clean by 25%.
+  * Resume where it left off when copying a file to/from a local git remote
+    was interrupted.
+  * sync --content: Avoid a redundant checksum of a file that was
+    incrementally verified, when used on NTFS and perhaps other filesystems.
+  * reinject: Fix crash when reinjecting a file from outside the repository.
+    (Reversion in version 8.20210621)
+  * Avoid cursor jitter when updating progress display."""]]
\ No newline at end of file

document how to resume downloads
diff --git a/doc/design/external_special_remote_protocol.mdwn b/doc/design/external_special_remote_protocol.mdwn
index fc6c5f110..365ee6cf4 100644
--- a/doc/design/external_special_remote_protocol.mdwn
+++ b/doc/design/external_special_remote_protocol.mdwn
@@ -125,12 +125,16 @@ The following requests *must* all be supported by the special remote.
   * `PREPARE-FAILURE ErrorMsg`  
     Sent as a response to PREPARE if the special remote cannot be used.
 * `TRANSFER STORE|RETRIEVE Key File`  
-  Requests the transfer of a key. For STORE, the File is the file to upload;
-  for RETRIEVE the File is where to store the download.  
-  Note that the File should not influence the filename used on the remote.  
-  Note that in some cases, the File may contain whitespace.  
-  It's important that, while a Key is being stored, `CHECKPRESENT`
-  not indicate it's present until all the data has been transferred.  
+  Requests the transfer of a key. This is the main thing a special remote
+  does. For STORE, the File contains the content to upload;
+  for RETRIEVE the File is where to store the content you download.  
+  When retrieving, the File may already exist, if its retrieval was
+  interrupted before. That lets the remote resume downloading, if it's able
+  to.  
+  Note that the File should not influence the filename used on the remote;
+  that filename should be based on the Key.  
+  Note that in some cases, the File's name may include whitespace or other
+  special characters.  
   While the transfer is running, the remote can send any number of
   `PROGRESS` messages to indicate its progress. It can also send any of the
   other special remote messages. Once the transfer is done, it finishes by
@@ -140,7 +144,10 @@ The following requests *must* all be supported by the special remote.
   * `TRANSFER-FAILURE STORE|RETRIEVE Key ErrorMsg`  
     Indicates the transfer failed.
 * `CHECKPRESENT Key`  
-  Requests the remote to check if a key is present in it.
+  Requests the remote to check if a key is present in it.  
+  It's important that, while a key is being transferred to a remote,
+  `CHECKPRESENT` not indicate it's present in the remote until all
+  the data has been sent.
   * `CHECKPRESENT-SUCCESS Key`  
     Indicates that a key has been positively verified to be present in the
     remote.
@@ -270,13 +277,18 @@ handling a request.
   (git-annex does not send a reply to this message, but may give up if it
   doesn't support the necessary protocol version.)
 * `PROGRESS Int`  
-  Indicates the current progress of the transfer (in bytes). May be repeated
-  any number of times during the transfer process, but it's wasteful to
-  update the progress too frequently. Bear in mind that this is used both
+  Indicates the current progress of the transfer. The Int is the
+  number of bytes from the beginning of the file that have been
+  transferred.  
+  May be repeated any number of times during the transfer
+  process, but it's wasteful to update the progress too frequently.
+  Bear in mind that this is used both
   to display a progress meter for the user, and for annex.stalldetection.
   So, sending an update on each 1% of the file may not be frequent enough,
   as it could appear to be a stall when transferring a large file.  
-  This is highly recommended for STORE. (It is optional but good for RETRIEVE.)  
+  This is highly recommended for STORE.
+  (It is optional but good for RETRIEVE; git-annex will fall back to
+  tracking the size of the file as it grows.)  
   (git-annex does not send a reply to this message.)
 * `DIRHASH Key`  
   Gets a two level hash associated with a Key. Something like "aB/Cd".
@@ -291,7 +303,6 @@ handling a request.
   creating hash directory structures to store Keys in. This is the same
   directory hash that is used by eg, the directory special remote.  
   (git-annex replies with VALUE followed by the value.)  
-  (First supported by git-annex version 6.20160511.)
 * `SETCONFIG Setting Value`  
   Sets one of the special remote's configuration settings.  
   Normally this is sent during INITREMOTE, which allows these settings
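
The resumable `TRANSFER RETRIEVE` behaviour and the `PROGRESS` semantics documented in this diff could look roughly like this inside an external special remote written in Python. It is a sketch under stated assumptions: `fetch_range` and `emit` are hypothetical callbacks standing in for the remote's backend and for writing a protocol line to git-annex, and the partial file is assumed to be a valid prefix of the content.

```python
import os


def retrieve_with_resume(fetch_range, total_size, dest_path, emit,
                         chunk_size=65536):
    """Download into dest_path, resuming if a partial file already exists.

    fetch_range(offset, length) is a hypothetical callback returning bytes
    from the remote; emit(line) sends one protocol line to git-annex.
    """
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    if offset > total_size:
        # A partial file larger than the content cannot be resumed.
        offset = 0
    # "r+b" keeps the existing bytes; "wb" starts from an empty file.
    with open(dest_path, "r+b" if offset else "wb") as out:
        out.seek(offset)
        while offset < total_size:
            data = fetch_range(offset, min(chunk_size, total_size - offset))
            if not data:
                raise IOError("remote returned no data")
            out.write(data)
            offset += len(data)
            # PROGRESS counts bytes from the beginning of the file, so a
            # resumed transfer reports values above the resumed offset.
            emit("PROGRESS %d" % offset)
```

Once the function returns, the remote would reply with `TRANSFER-SUCCESS RETRIEVE Key`. A real remote would probably send `PROGRESS` less often than once per chunk, since the protocol text calls overly frequent updates wasteful.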

comment
diff --git a/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_15_9ddd6278ea00c5594083347b5b9b8405._comment b/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_15_9ddd6278ea00c5594083347b5b9b8405._comment
new file mode 100644
index 000000000..1d46c5150
--- /dev/null
+++ b/doc/forum/Managing_a_large_number_of_files_archived_on_many_pieces_of_read-only_medium___40__E.G._DVDs__41__/comment_15_9ddd6278ea00c5594083347b5b9b8405._comment
@@ -0,0 +1,24 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 15"""
+ date="2021-10-11T16:07:38Z"
+ content="""
+git-annex now has the ability to import a tree of files from a
+directory special remote, which results in a remote tracking branch, the
+same as you'd have after fetching a git remote.
+
+	git-annex initremote dvd type=directory directory=/path/to/DVD encryption=none importtree=yes
+	git-annex import master --from dvd --no-content
+
+The --no-content option avoids copying files to the local disk, although
+their content still will have to be read to hash them. If you want to
+copy the files from the disk at the same time, omit that option.
+
+After that, you can use the dvd/master branch it created in whatever way
+you desire. Also if you want the disc's files to end up in a subdirectory,
+that can be specified when you import, eg "master:dvd" will put the files
+into a dvd/ subdirectory.
+
+Using this with multiple discs would probably work best if there was a way
+to mount each DVD to its own unique location.
+"""]]

update
diff --git a/doc/todo/optimise_by_converting_Map_to_HashMap.mdwn b/doc/todo/optimise_by_converting_Map_to_HashMap.mdwn
index 448a26b2f..4db7ec8e2 100644
--- a/doc/todo/optimise_by_converting_Map_to_HashMap.mdwn
+++ b/doc/todo/optimise_by_converting_Map_to_HashMap.mdwn
@@ -8,5 +8,5 @@ parts with HashMap. The uses in AnnexRead especially.
 
 > Note that HashMap performance can degrade if an attacker provides keys
 > that collide. This has been used to DOS aeson parsing. (Which could
-> affect a few parts of git-annex in theory). So if converting to HashMap,
-> need to consider this. --[[Joey]]
+> affect a few parts of git-annex in theory; fixed in aeson-2.0.1.0). 
+> So if converting to HashMap, need to consider this. --[[Joey]]

todo
diff --git a/doc/todo/avoid_storing_contentidentifier_log_for_borg.mdwn b/doc/todo/avoid_storing_contentidentifier_log_for_borg.mdwn
new file mode 100644
index 000000000..453d31291
--- /dev/null
+++ b/doc/todo/avoid_storing_contentidentifier_log_for_borg.mdwn
@@ -0,0 +1,20 @@
+Borg uses an empty ContentIdentifier for everything; it does not need to
+record anything. But that empty value gets stored in the log for each key
+that is stored in borg. This unnecessarily bloats the size of the git-annex
+branch, by one content identifier per key stored in borg.
+
+I think that it also slows down importing many archives from borg,
+because for each of them it has to record the content identifier,
+which is always the same, but still results in a db write.
+
+Omitting storing any ContentIdentifier would break code such as
+Remote.Helper.ExportImport's retrieveKeyFileFromImport.
+
+If the borg Remote could indicate with a flag that it does not use
+ContentIdentifiers, then code like that could pass it a null
+ContentIdentifier without needing to read it from the db.
+
+Annex.Import uses getContentIdentifierKeys, but only when it's not
+thirdpartypopulated. So this change would not break that for borg,
+but a clean way to handle that would be to make it also return a null
+ContentIdentifier when the remote has the flag set. --[[Joey]]

Added a comment: Re: Resuming an interrupted download
diff --git a/doc/design/external_special_remote_protocol/comment_49_9045ad13ad2c2f998173a870174eb3ee._comment b/doc/design/external_special_remote_protocol/comment_49_9045ad13ad2c2f998173a870174eb3ee._comment
new file mode 100644
index 000000000..3f66a9990
--- /dev/null
+++ b/doc/design/external_special_remote_protocol/comment_49_9045ad13ad2c2f998173a870174eb3ee._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="alex@f04d0d3c452a2a99b27ccc93c1543bee4a1bf5be"
+ nickname="alex"
+ avatar="http://cdn.libravatar.org/avatar/9d97e9bcb1cf7680309e37cd69fab408"
+ subject="Re: Resuming an interrupted download"
+ date="2021-10-08T03:06:57Z"
+ content="""
+Thanks, that works great!
+"""]]

Added a comment
diff --git a/doc/todo/windows_support/comment_25_5eb1a4693db88a9e05500ad66edce61d._comment b/doc/todo/windows_support/comment_25_5eb1a4693db88a9e05500ad66edce61d._comment
new file mode 100644
index 000000000..b9befe759
--- /dev/null
+++ b/doc/todo/windows_support/comment_25_5eb1a4693db88a9e05500ad66edce61d._comment
@@ -0,0 +1,18 @@
+[[!comment format=mdwn
+ username="jkniiv"
+ avatar="http://cdn.libravatar.org/avatar/05fd8b33af7183342153e8013aa3713d"
+ subject="comment 25"
+ date="2021-10-07T18:27:19Z"
+ content="""
+> _asakurareiko_: One of the requirements in particular is that the link target must exist. ...
+
+That's useful info. :) Many thanks for your input regarding WSL1 use! For me that opens up new avenues by way of
+borg special remotes (borg support on Windows is very lacking and there are no recent binaries for it).
+I still think git -- and to a lesser extent git-annex -- is performant enough on native Windows, so I'm
+going to continue using it that way for those repos that need to be foolproof and in a supported configuration.
+But for more experimental uses WSL1 still holds quite some promise wrt git-annex. DrvFs is still many times faster
+than 9p of WSL2 *for Windows native files on NTFS* which is an important use case for exchanging data between
+Windows and POSIXland. In fact while WSL2 boots incredibly fast for a Hyper-V virtual machine, in many file-level
+use cases it can't hold a candle to the almost seamless nature of WSL1. The latter has its warts in API compatibility
+but in my mind it's quite an undervalued little performer. I hope Microsoft doesn't remove it too soon in favor of WSL2.
+"""]]

Added a comment
diff --git a/doc/todo/windows_support/comment_24_f50de72a082476e0a2e7587368191788._comment b/doc/todo/windows_support/comment_24_f50de72a082476e0a2e7587368191788._comment
new file mode 100644
index 000000000..2ae8a5be3
--- /dev/null
+++ b/doc/todo/windows_support/comment_24_f50de72a082476e0a2e7587368191788._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="asakurareiko@f3d908c71c009580228b264f63f21c7274df7476"
+ nickname="asakurareiko"
+ avatar="http://cdn.libravatar.org/avatar/a865743e357add9d15081840179ce082"
+ subject="comment 24"
+ date="2021-10-07T16:27:39Z"
+ content="""
+One of the requirements in particular is that the link target must exist. In a freshly cloned repo the link targets do not exist so it's important to recreate the symlinks after you get the annexed files.
+
+NT symlinks must have either file or directory target type. If the link target does not exist at the time when the symlink is created, it's not possible to determine the target type.
+"""]]
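The symlink recreation described in the comment above can be sketched as a small shell helper. This is only an illustration, not part of git-annex: the function name `recreate_symlinks` is hypothetical, and it assumes it is run from the top of the work tree, that paths contain no newlines or characters git would quote, and that a plain `git checkout` produces a working link once the target exists.

```shell
#!/bin/sh
# Hypothetical helper: delete and check out again every tracked symlink
# whose target now exists (for example after `git annex get`).
recreate_symlinks() {
    tab=$(printf '\t')
    # `git ls-files -s` prints "<mode> <object> <stage><TAB><path>".
    git ls-files -s | while IFS= read -r entry; do
        mode=${entry%% *}            # index mode; 120000 marks a symlink
        path=${entry#*"$tab"}        # everything after the tab is the path
        [ "$mode" = "120000" ] || continue
        [ -e "$path" ] || continue   # target still missing, leave the link alone
        rm -f -- "$path"
        git checkout -q -- "$path"   # restores the symlink from the index
        printf 'recreated %s\n' "$path"
    done
}
```

Links whose targets are still absent are skipped, since the target type cannot be determined until the content is present; running the helper again after a later `git annex get` picks those up.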

add footnote about the permissions required and Microsoft blog post
diff --git a/doc/todo/windows_support.mdwn b/doc/todo/windows_support.mdwn
index 1ffb76638..243c536d2 100644
--- a/doc/todo/windows_support.mdwn
+++ b/doc/todo/windows_support.mdwn
@@ -142,9 +142,11 @@ Do the following:
 
 1. Enable Developer mode in Windows settings so that symlinks can be created without elevated privileges.
 2. Mount the NTFS drive with metadata option. This line can be added in `/etc/fstab`: `C: /mnt/c drvfs metadata`. I prefer to also add `uid=1000,gid=1000,fmask=0133,dmask=0022`.
-3. Create an empty directory where your repo will be. Then enable case sensitivity `setfattr -n system.wsl_case_sensitive -v 1 <path>`. This attribute will be automatically and recursively applied to any future subdirectories. If setfattr(1) errs out with permission denied, you can also effect the same change in CMD.EXE / Windows Powershell as admin with `fsutil file setCaseSensitiveInfo <path> enable`. You can check that the setting is enabled with `getfattr -n system.wsl_case_sensitive <path>` under WSL1.
+3. Create an empty directory where your repo will be. Then enable case sensitivity `setfattr -n system.wsl_case_sensitive -v 1 <path>`. This attribute will be automatically and recursively applied to any future subdirectories. If setfattr(1) errs out with permission denied, you can also effect the same change in CMD.EXE / Windows Powershell as admin with `fsutil file setCaseSensitiveInfo <path> enable`.[^1] You can check that the setting is enabled with `getfattr -n system.wsl_case_sensitive <path>` under WSL1.
 4. Create the repo however you like (see steps below for cloning a repo with ssh). Immediately after `git annex init`, do `git config annex.crippledfilesystem true`. If you set `crippledfilesystem` before init, then git annex will try to enter an adjusted branch and trigger the first bug. If you do not set `crippledfilesystem` after init, you will trigger the second bug when doing `git annex add`.
 
+[^1]: This works because Administrators usually have Full Control over most files. What Windows actually looks for is "Write attributes", "Create files", "Create folders" and "Delete subfolders and files" permissions on the directory required for changing case-sensitivity. As a regular user (or without UAC) you might not have those permissions by default for instance on external drives, so adjust accordingly. For more info about the `system.wsl_case_sensitive` attribute see this blog post: [[https://devblogs.microsoft.com/commandline/improved-per-directory-case-sensitivity-support-in-wsl/]]
+
 ### Cloning a repo with ssh
 
 When cloning a repo with ssh, `git annex init` will fail to enable ssh remotes if `crippledfilesystem` is not set, but you also cannot set it before init. Follow these steps to avoid unrelated history in the `git-annex` branch.

remove mention of mount option `case=dir` (turned out to be unnecessary)
diff --git a/doc/todo/windows_support.mdwn b/doc/todo/windows_support.mdwn
index 1d930b5e7..1ffb76638 100644
--- a/doc/todo/windows_support.mdwn
+++ b/doc/todo/windows_support.mdwn
@@ -141,7 +141,7 @@ The following steps are tested on Windows 10 21h1 with Ubuntu 18.04/20.04 and ar
 Do the following:
 
 1. Enable Developer mode in Windows settings so that symlinks can be created without elevated privileges.
-2. Mount the NTFS drive with metadata option. This line can be added in `/etc/fstab`: `C: /mnt/c drvfs metadata,case=dir`. I prefer to also add `uid=1000,gid=1000,fmask=0133,dmask=0022`.
+2. Mount the NTFS drive with metadata option. This line can be added in `/etc/fstab`: `C: /mnt/c drvfs metadata`. I prefer to also add `uid=1000,gid=1000,fmask=0133,dmask=0022`.
 3. Create an empty directory where your repo will be. Then enable case sensitivity `setfattr -n system.wsl_case_sensitive -v 1 <path>`. This attribute will be automatically and recursively applied to any future subdirectories. If setfattr(1) errs out with permission denied, you can also effect the same change in CMD.EXE / Windows Powershell as admin with `fsutil file setCaseSensitiveInfo <path> enable`. You can check that the setting is enabled with `getfattr -n system.wsl_case_sensitive <path>` under WSL1.
 4. Create the repo however you like (see steps below for cloning a repo with ssh). Immediately after `git annex init`, do `git config annex.crippledfilesystem true`. If you set `crippledfilesystem` before init, then git annex will try to enter an adjusted branch and trigger the first bug. If you do not set `crippledfilesystem` after init, you will trigger the second bug when doing `git annex add`.
 

Added a comment
diff --git a/doc/todo/windows_support/comment_23_616a7d579730e5a4f5cc314d6328bfd0._comment b/doc/todo/windows_support/comment_23_616a7d579730e5a4f5cc314d6328bfd0._comment
new file mode 100644
index 000000000..ba8529ba5
--- /dev/null
+++ b/doc/todo/windows_support/comment_23_616a7d579730e5a4f5cc314d6328bfd0._comment
@@ -0,0 +1,23 @@
+[[!comment format=mdwn
+ username="jkniiv"
+ avatar="http://cdn.libravatar.org/avatar/05fd8b33af7183342153e8013aa3713d"
+ subject="comment 23"
+ date="2021-10-07T11:36:48Z"
+ content="""
+@asakurareiko: Oh, based on [[https://docs.microsoft.com/en-us/windows/wsl/case-sensitivity#case-sensitivity-options-for-mounting-a-drive-in-wsl-configuration-file]]
+I got the impression that `case=dir` was a prerequisite to make the attribute `system.wsl_case_sensitive`
+work as per the heading \"Default setting: dir for enabling case sensitivity per directory\". I guess
+I was wrong, `case=off` works too. I'll remove `case=dir` from the instructions once again.
+
+Also the NT symlink requirements are fulfilled in my case simply by way of me having developer mode
+enabled in 21H1 (Windows 10 Pro). I create symlinks all the time with
+\"cmd /c mklink ...\" in Powershell without elevation. Git-annex also creates only relative symlinks
+which was also a requirement in the WSL release news you mentioned. No, this must be one of those
+miscellaneous problems mentioned in [[WSL issue 353|https://github.com/microsoft/WSL/issues/353]].
+
+_Edit_: Oh, in fact recreating the symlink after getting the file with `git-annex get` by deleting the symlink
+with rm and then checking it out again with `git checkout -- <file>` seems to allow me to access
+the file in Windows just fine. Interesting. Maybe it's git-annex itself that creates the wrong
+kind of symlink that a later call to plain git can repair. Quite convoluted it seems.
+
+"""]]

Added a comment
diff --git a/doc/bugs/problems_with_SSH_and_relative_paths/comment_12_25dcce6513505d1f2472b24b19d4af73._comment b/doc/bugs/problems_with_SSH_and_relative_paths/comment_12_25dcce6513505d1f2472b24b19d4af73._comment
new file mode 100644
index 000000000..a744bb734
--- /dev/null
+++ b/doc/bugs/problems_with_SSH_and_relative_paths/comment_12_25dcce6513505d1f2472b24b19d4af73._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="asakurareiko@f3d908c71c009580228b264f63f21c7274df7476"
+ nickname="asakurareiko"
+ avatar="http://cdn.libravatar.org/avatar/a865743e357add9d15081840179ce082"
+ subject="comment 12"
+ date="2021-10-07T06:21:25Z"
+ content="""
+The problem I had was actually specific to WSL1. I added a section in [[todo/windows_support]] about cloning a repo with ssh in WSL1.
+"""]]

Added a comment
diff --git a/doc/todo/windows_support/comment_22_8e5b8ec9e6a63bcde7390511c92d2940._comment b/doc/todo/windows_support/comment_22_8e5b8ec9e6a63bcde7390511c92d2940._comment
new file mode 100644
index 000000000..7f47b03b8
--- /dev/null
+++ b/doc/todo/windows_support/comment_22_8e5b8ec9e6a63bcde7390511c92d2940._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="asakurareiko@f3d908c71c009580228b264f63f21c7274df7476"
+ nickname="asakurareiko"
+ avatar="http://cdn.libravatar.org/avatar/a865743e357add9d15081840179ce082"
+ subject="comment 22"
+ date="2021-10-07T06:17:30Z"
+ content="""
+For me I tried git on windows at first but when checking out the working tree after a clone it slowed down so much that it would not complete in a reasonable amount of time. So that's why I decided to try using WSL1.
+
+In my opinion it's not necessary to use `case=dir`. `case=dir` was at one point the default but was removed as the default due to the potential to cause problems with windows programs ([[https://devblogs.microsoft.com/commandline/improved-per-directory-case-sensitivity-support-in-wsl]]). But if you do have `case=dir` then it is not necessary to set the attribute.
+
+If your symlinks are not working, make sure to have deleted and recreated the symlinks after doing `git annex get` and that the NT symlink requirements listed [here](https://github.com/MicrosoftDocs/WSL/releases/tag/17046) have been met. If the symlink target has changed from file to directory or vice versa the symlink also has to be recreated. However there are other reports of symlinks not working despite following these requirements:
+
+* [[https://github.com/microsoft/WSL/issues/353#issuecomment-544857020]] Commenter is using Windows 10 Home 1903. (I'm using Windows 10 Enterprise)
+* [[https://github.com/microsoft/WSL/issues/353#issuecomment-478953190]] Commenter updated to Windows 10 1809 and it stopped working.
+* [[https://github.com/microsoft/WSL/issues/353#issuecomment-478048780]] Commenter mounted with UNC rather than drive letter.
+"""]]

diff --git a/doc/todo/windows_support.mdwn b/doc/todo/windows_support.mdwn
index 0c365c805..1d930b5e7 100644
--- a/doc/todo/windows_support.mdwn
+++ b/doc/todo/windows_support.mdwn
@@ -143,7 +143,20 @@ Do the following:
 1. Enable Developer mode in Windows settings so that symlinks can be created without elevated privileges.
 2. Mount the NTFS drive with metadata option. This line can be added in `/etc/fstab`: `C: /mnt/c drvfs metadata,case=dir`. I prefer to also add `uid=1000,gid=1000,fmask=0133,dmask=0022`.
 3. Create an empty directory where your repo will be. Then enable case sensitivity `setfattr -n system.wsl_case_sensitive -v 1 <path>`. This attribute will be automatically and recursively applied to any future subdirectories. If setfattr(1) errs out with permission denied, you can also effect the same change in CMD.EXE / Windows Powershell as admin with `fsutil file setCaseSensitiveInfo <path> enable`. You can check that the setting is enabled with `getfattr -n system.wsl_case_sensitive <path>` under WSL1.
-4. Create the repo however you like. Immediately after `git annex init`, do `git config annex.crippledfilesystem true`. If you set `crippledfilesystem` before init, then git annex will try to enter an adjusted branch and trigger the first bug. If you do not set `crippledfilesystem` after init, you will trigger the second bug when doing `git annex add`.
+4. Create the repo however you like (see steps below for cloning a repo with ssh). Immediately after `git annex init`, do `git config annex.crippledfilesystem true`. If you set `crippledfilesystem` before init, then git annex will try to enter an adjusted branch and trigger the first bug. If you do not set `crippledfilesystem` after init, you will trigger the second bug when doing `git annex add`.
+
+### Cloning a repo with ssh
+
+When cloning a repo with ssh, `git annex init` will fail to enable ssh remotes if `crippledfilesystem` is not set, but you also cannot set it before init. Follow these steps to avoid unrelated history in the `git-annex` branch.
+
+    git clone <sshpath> annex
+    cd annex
+    git branch git-annex origin/git-annex
+    git remote remove origin
+    git annex init
+    git config annex.crippledfilesystem true
+    git remote add origin <sshpath>
+    git annex sync
 
 ### Using symlinks and locked files
 

Added a comment: the WSL1 use case
diff --git a/doc/todo/windows_support/comment_21_49ef5997d36ade4797ef77b1b62f5405._comment b/doc/todo/windows_support/comment_21_49ef5997d36ade4797ef77b1b62f5405._comment
new file mode 100644
index 000000000..47e7471e9
--- /dev/null
+++ b/doc/todo/windows_support/comment_21_49ef5997d36ade4797ef77b1b62f5405._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="jkniiv"
+ avatar="http://cdn.libravatar.org/avatar/05fd8b33af7183342153e8013aa3713d"
+ subject="the WSL1 use case"
+ date="2021-10-07T04:12:48Z"
+ content="""
+User asakurareiko added some instructions to this page how to use git-annex in Ubuntu 20 (presumably
+20.04) in WSL1 and I tested them out in my older Ubuntu 18.04 installation and found out that they
+amazingly work although my Windows apps couldn't access any of the files that were still locked.
+Somehow the symlinks were only in a form that only WSL1 and Cygwin/MSYS2/Git Bash could access
+(mind you I have developer mode active and my Windows 10 is also version 21H1 so the environment
+was otherwise similar). An interesting use case nonetheless. I edited the section with my corrections
+ -- I hope asakurareiko doesn't mind. :)
+
+"""]]

rename WSL1 section to highlight date, add wording about being experimental, reword some awkwardness, add further directions
diff --git a/doc/todo/windows_support.mdwn b/doc/todo/windows_support.mdwn
index d684936fc..0c365c805 100644
--- a/doc/todo/windows_support.mdwn
+++ b/doc/todo/windows_support.mdwn
@@ -131,18 +131,18 @@ Seems like this would need Windows 10.
 > > > > But here's a bug about sqlite in WSL:
 > > > > [[bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol]] --[[Joey]]
 
-## Steps for using git-annex on NTFS with WSL1
+## Update Oct 2021: Steps for using git-annex on NTFS with WSL1 (an experimental setup for those adventurous enough)
 
-These steps are tested on Windows 10 21h1 with Ubuntu 20 and are specifically designed to work around these two bugs:
+The following steps are tested on Windows 10 21h1 with Ubuntu 18.04/20.04 and are specifically designed to work around these two bugs:
 
 * [[bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol]]
 * [[bugs/WSL1__58___git-annex-add_fails_in_DrvFs_filesystem]]
 
-I put this line here so that the following lines do not become bullet points.
+Do the following:
 
 1. Enable Developer mode in Windows settings so that symlinks can be created without elevated privileges.
-2. Mount the NTFS drive with metadata option. This line can be added in `/etc/fstab`: `C: /mnt/c drvfs metadata`. I prefer to also add `uid=1000,gid=1000,fmask=0133,dmask=0022`.
-3. Create an empty directory where your repo will be. Then enable case sensitivity `setfattr -n system.wsl_case_sensitive -v 1 <path>`. This attribute will be automatically and recursively applied to any future subdirectories.
+2. Mount the NTFS drive with metadata option. This line can be added in `/etc/fstab`: `C: /mnt/c drvfs metadata,case=dir`. I prefer to also add `uid=1000,gid=1000,fmask=0133,dmask=0022`.
+3. Create an empty directory where your repo will be. Then enable case sensitivity `setfattr -n system.wsl_case_sensitive -v 1 <path>`. This attribute will be automatically and recursively applied to any future subdirectories. If setfattr(1) errs out with permission denied, you can also effect the same change in CMD.EXE / Windows Powershell as admin with `fsutil file setCaseSensitiveInfo <path> enable`. You can check that the setting is enabled with `getfattr -n system.wsl_case_sensitive <path>` under WSL1.
 4. Create the repo however you like. Immediately after `git annex init`, do `git config annex.crippledfilesystem true`. If you set `crippledfilesystem` before init, then git annex will try to enter an adjusted branch and trigger the first bug. If you do not set `crippledfilesystem` after init, you will trigger the second bug when doing `git annex add`.
 
 ### Using symlinks and locked files

Add steps for WSL1
diff --git a/doc/todo/windows_support.mdwn b/doc/todo/windows_support.mdwn
index da049b2f1..d684936fc 100644
--- a/doc/todo/windows_support.mdwn
+++ b/doc/todo/windows_support.mdwn
@@ -130,3 +130,23 @@ Seems like this would need Windows 10.
 
 > > > > But here's a bug about sqlite in WSL:
 > > > > [[bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol]] --[[Joey]]
+
+## Steps for using git-annex on NTFS with WSL1
+
+These steps are tested on Windows 10 21h1 with Ubuntu 20 and are specifically designed to work around these two bugs:
+
+* [[bugs/WSL_adjusted_braches__58___smudge_fails_with_sqlite_thread_crashed_-_locking_protocol]]
+* [[bugs/WSL1__58___git-annex-add_fails_in_DrvFs_filesystem]]
+
+I put this line here so that the following lines do not become bullet points.
+
+1. Enable Developer mode in Windows settings so that symlinks can be created without elevated privileges.
+2. Mount the NTFS drive with metadata option. This line can be added in `/etc/fstab`: `C: /mnt/c drvfs metadata`. I prefer to also add `uid=1000,gid=1000,fmask=0133,dmask=0022`.
+3. Create an empty directory where your repo will be. Then enable case sensitivity `setfattr -n system.wsl_case_sensitive -v 1 <path>`. This attribute will be automatically and recursively applied to any future subdirectories.
+4. Create the repo however you like. Immediately after `git annex init`, do `git config annex.crippledfilesystem true`. If you set `crippledfilesystem` before init, then git annex will try to enter an adjusted branch and trigger the first bug. If you do not set `crippledfilesystem` after init, you will trigger the second bug when doing `git annex add`.
+
+### Using symlinks and locked files
+
+* You can now use symlinks and locked files but please remember that locked files can still be overwritten. So make sure to unlock them before you edit them.
+* After you `git annex get` files, the symlinks for those files will still be broken. Recreate the symlinks to fix them. You can make a script or delete them and `git checkout`.
+* It can be difficult to use symlinks on Windows because programs will see the link target rather than the link, which makes it impossible to do things like navigating between files in the same directory or using opened file history. You can unlock the files or access them through another filesystem layer such as SMB.

Added a comment
diff --git a/doc/bugs/problems_with_SSH_and_relative_paths/comment_11_a2a778bd1787068cc621b17dd64b31c2._comment b/doc/bugs/problems_with_SSH_and_relative_paths/comment_11_a2a778bd1787068cc621b17dd64b31c2._comment
new file mode 100644
index 000000000..9191f400a
--- /dev/null
+++ b/doc/bugs/problems_with_SSH_and_relative_paths/comment_11_a2a778bd1787068cc621b17dd64b31c2._comment
@@ -0,0 +1,21 @@
+[[!comment format=mdwn
+ username="asakurareiko@f3d908c71c009580228b264f63f21c7274df7476"
+ nickname="asakurareiko"
+ avatar="http://cdn.libravatar.org/avatar/a865743e357add9d15081840179ce082"
+ subject="comment 11"
+ date="2021-10-06T21:10:34Z"
+ content="""
+I'm still experiencing this bug with 8.20200226. If the repo is cloned by ssh before doing `git annex init`, then the result is:
+
+```
+  Unable to parse git config from origin
+
+  Remote origin does not have git-annex installed; setting annex-ignore
+```
+
+
+Creating an empty repo and doing `git annex init` first then adding the remote and pulling in changes works fine, but this creates unrelated history on the `git-annex` branch.
+
+
+This affects ssh remotes but not local remotes.
+"""]]

branch
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_11_fe04d3da8859101ba1649fdd9d5ee39e._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_11_fe04d3da8859101ba1649fdd9d5ee39e._comment
index b234f769c..ed5c7e2b1 100644
--- a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_11_fe04d3da8859101ba1649fdd9d5ee39e._comment
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_11_fe04d3da8859101ba1649fdd9d5ee39e._comment
@@ -24,16 +24,5 @@ committed.
 For borg, each archive would be a subtree; 500k filenames will fit in memory
 or at least fit better than `365*500k`.
 
-The interface I'm thinking about is something like this:
-
-	data ChunkedImportableContents info
-		= ImportableContentsChunk
-			{ importableContentsRoot :: ImportLocation
-			, importableContentsSubTree :: [(ImportLocation, info)]
-			-- ^ locations are relative to importableContentsRoot
-			, importableContentsContinuation :: Annex (ChunkedImportableContents info)
-			}
-		| ImportableContentsComplete (ImportableContents info)
-
-This is a promising idea!
+This is a promising idea! Started working on it in a `borgchunks` branch.
 """]]

ImportableContentsChunkable
This improves the borg special remote memory usage, by
letting it only load one archive's worth of filenames into memory at a
time, and building up a larger tree out of the chunks.
When a borg repository has many archives, git-annex could easily OOM
before. Now, it will use only memory proportional to the number of
annexed keys in an archive.
Minor implementation wart: Each new chunk re-opens the content
identifier database, and also a new vector clock is used for each chunk.
This is a minor inefficiency only; the use of continuations makes
it hard to avoid, although putting the database handle into a Reader
monad would be one way to fix it.
It may later be possible to extend the ImportableContentsChunkable
interface to remotes that are not third-party populated. However, that
would perhaps need an interface that does not use continuations.
The ImportableContentsChunkable interface currently does not allow
populating the top of the tree with anything other than subtrees. It
would be easy to extend it to allow putting files in that tree, but borg
doesn't need that so I left it out for now.
Sponsored-by: Noam Kremen on Patreon
diff --git a/Annex/Import.hs b/Annex/Import.hs
index 2d15c11b9..2e4275fa9 100644
--- a/Annex/Import.hs
+++ b/Annex/Import.hs
@@ -1,6 +1,6 @@
 {- git-annex import from remotes
  -
- - Copyright 2019-2020 Joey Hess <id@joeyh.name>
+ - Copyright 2019-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -98,7 +98,7 @@ buildImportCommit
 	:: Remote
 	-> ImportTreeConfig
 	-> ImportCommitConfig
-	-> ImportableContents (Either Sha Key)
+	-> ImportableContentsChunkable Annex (Either Sha Key)
 	-> Annex (Maybe Ref)
 buildImportCommit remote importtreeconfig importcommitconfig importable =
 	case importCommitTracking importcommitconfig of
@@ -123,7 +123,7 @@ buildImportCommit remote importtreeconfig importcommitconfig importable =
 recordImportTree
 	:: Remote
 	-> ImportTreeConfig
-	-> ImportableContents (Either Sha Key)
+	-> ImportableContentsChunkable Annex (Either Sha Key)
 	-> Annex (History Sha, Annex ())
 recordImportTree remote importtreeconfig importable = do
 	imported@(History finaltree _) <- buildImportTrees basetree subdir importable
@@ -262,27 +262,77 @@ buildImportCommit' remote importcommitconfig mtrackingcommit imported@(History t
  - that location, replacing any object that was there.
  -}
 buildImportTrees
+	:: Ref
+	-> Maybe TopFilePath
+	-> ImportableContentsChunkable Annex (Either Sha Key)
+	-> Annex (History Sha)
+buildImportTrees basetree msubdir (ImportableContentsComplete importable) = do
+	repo <- Annex.gitRepo
+	withMkTreeHandle repo $ buildImportTrees' basetree msubdir importable
+buildImportTrees basetree msubdir importable@(ImportableContentsChunked {}) = do
+	repo <- Annex.gitRepo
+	withMkTreeHandle repo $ \hdl ->
+		History
+			<$> go hdl
+			<*> buildImportTreesHistory basetree msubdir
+				(importableHistoryComplete importable) hdl
+  where
+	go hdl = do
+		tree <- gochunks [] (importableContentsChunk importable) hdl
+		importtree <- liftIO $ recordTree' hdl tree
+		graftImportTree basetree msubdir importtree hdl
+
+	gochunks l c hdl = do
+		let subdir = importChunkSubDir $ importableContentsSubDir c
+		-- Full directory prefix where the sub tree is located.
+		let fullprefix = asTopFilePath $ case msubdir of
+			Nothing -> subdir
+			Just d -> getTopFilePath d Posix.</> subdir
+		Tree ts <- convertImportTree (Just fullprefix) $
+			map (\(p, i) -> (mkImportLocation p, i))
+				(importableContentsSubTree c)
+		-- Record this subtree before getting next chunk, this
+		-- avoids buffering all the chunks into memory.
+		tc <- liftIO $ recordSubTree hdl $
+			NewSubTree (asTopFilePath subdir) ts
+		importableContentsNextChunk c >>= \case
+			Nothing -> return (Tree (tc:l))
+			Just c' -> gochunks (tc:l) c' hdl
+
+buildImportTrees'
 	:: Ref
 	-> Maybe TopFilePath
 	-> ImportableContents (Either Sha Key)
+	-> MkTreeHandle
 	-> Annex (History Sha)
-buildImportTrees basetree msubdir importable = History
-	<$> (buildtree (importableContents importable) =<< Annex.gitRepo)
-	<*> buildhistory
+buildImportTrees' basetree msubdir importable hdl = History
+	<$> buildImportTree basetree msubdir (importableContents importable) hdl
+	<*> buildImportTreesHistory basetree msubdir (importableHistory importable) hdl
+
+buildImportTree
+	:: Ref
+	-> Maybe TopFilePath
+	-> [(ImportLocation, Either Sha Key)]
+	-> MkTreeHandle
+	-> Annex Sha
+buildImportTree basetree msubdir ls hdl = do
+	importtree <- liftIO . recordTree' hdl =<< convertImportTree msubdir ls
+	graftImportTree basetree msubdir importtree hdl
+
+graftImportTree
+	:: Ref
+	-> Maybe TopFilePath
+	-> Sha
+	-> MkTreeHandle
+	-> Annex Sha
+graftImportTree basetree msubdir tree hdl = case msubdir of
+	Nothing -> return tree
+	Just subdir -> inRepo $ \repo ->
+		graftTree' tree subdir basetree repo hdl
+
+convertImportTree :: Maybe TopFilePath -> [(ImportLocation, Either Sha Key)] -> Annex Tree
+convertImportTree msubdir ls = treeItemsToTree <$> mapM mktreeitem ls
   where
-	buildhistory = S.fromList
-		<$> mapM (buildImportTrees basetree msubdir)
-			(importableHistory importable)
-	
-	buildtree ls repo = withMkTreeHandle repo $ \hdl -> do
-		importtree <- liftIO . recordTree' hdl 
-			. treeItemsToTree
-			=<< mapM mktreeitem ls
-		case msubdir of
-			Nothing -> return importtree
-			Just subdir -> liftIO $ 
-				graftTree' importtree subdir basetree repo hdl
-	
 	mktreeitem (loc, v) = case v of
 		Right k -> do
 			relf <- fromRepo $ fromTopFilePath topf
@@ -297,6 +347,15 @@ buildImportTrees basetree msubdir importable = History
 		topf = asTopFilePath $
 			maybe lf (\sd -> getTopFilePath sd P.</> lf) msubdir
 
+buildImportTreesHistory
+	:: Ref
+	-> Maybe TopFilePath
+	-> [ImportableContents (Either Sha Key)]
+	-> MkTreeHandle
+	-> Annex (S.Set (History Sha))
+buildImportTreesHistory basetree msubdir history hdl = S.fromList
+	<$> mapM (\ic -> buildImportTrees' basetree msubdir ic hdl) history
+
 canImportKeys :: Remote -> Bool -> Bool
 canImportKeys remote importcontent =
 	importcontent || isJust (Remote.importKey ia)
@@ -324,8 +383,8 @@ importKeys
 	-> ImportTreeConfig
 	-> Bool
 	-> Bool
-	-> ImportableContents (ContentIdentifier, ByteSize)
-	-> Annex (Maybe (ImportableContents (Either Sha Key)))
+	-> ImportableContentsChunkable Annex (ContentIdentifier, ByteSize)
+	-> Annex (Maybe (ImportableContentsChunkable Annex (Either Sha Key)))
 importKeys remote importtreeconfig importcontent thirdpartypopulated importablecontents = do
 	unless (canImportKeys remote importcontent) $
 		giveup "This remote does not support importing without downloading content."
@@ -339,40 +398,82 @@ importKeys remote importtreeconfig importcontent thirdpartypopulated importablec
 	-- When concurrency is enabled, this set is needed to
 	-- avoid two threads both importing the same content identifier.
 	importing <- liftIO $ newTVarIO S.empty
-	withExclusiveLock gitAnnexContentIdentifierLock $
-		bracket CIDDb.openDb CIDDb.closeDb $ \db -> do
-			CIDDb.needsUpdateFromLog db
-				>>= maybe noop (CIDDb.updateFromLog db)
-			(run (go False cidmap importing importablecontents db))
+	withciddb $ \db -> do
+		CIDDb.needsUpdateFromLog db
+			>>= maybe noop (CIDDb.updateFromLog db)
+		(prepclock (run cidmap importing db))
   where
 	-- When not importing content, reuse the same vector
 	-- clock for all state that's recorded. This can save
 	-- a little bit of disk space. Individual file downloads
 	-- while downloading take too long for this optimisation
 	-- to be safe to do.
-	run a
+	prepclock a
 		| importcontent = a
 		| otherwise = reuseVectorClockWhile a
 
-	go oldversion cidmap importing (ImportableContents l h) db = do
+	withciddb = withExclusiveLock gitAnnexContentIdentifierLock .
+		bracket CIDDb.openDb CIDDb.closeDb
+
+	run cidmap importing db = do
 		largematcher <- largeFilesMatcher
+		case importablecontents of
+			ImportableContentsComplete ic ->
+				go False largematcher cidmap importing db ic >>= return . \case
+					Nothing -> Nothing
+					Just v -> Just $ ImportableContentsComplete v
+			ImportableContentsChunked {} -> do
+				c <- gochunked db (importableContentsChunk importablecontents)
+				gohistory largematcher cidmap importing db (importableHistoryComplete importablecontents) >>= return . \case
+					Nothing -> Nothing
+					Just h -> Just $ ImportableContentsChunked
+						{ importableContentsChunk = c
+						, importableHistoryComplete = h
+						}
+
+	go oldversion largematcher cidmap importing db (ImportableContents l h) = do
 		jobs <- forM l $ \i ->
 			if thirdpartypopulated
-				then thirdpartypopulatedimport cidmap db i
+				then Left <$> thirdpartypopulatedimport db i

(Diff truncated)
progress in my head
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_10_40a8fbf3c4140e955f7e1503db824aaf._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_10_40a8fbf3c4140e955f7e1503db824aaf._comment
new file mode 100644
index 000000000..30d024a3a
--- /dev/null
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_10_40a8fbf3c4140e955f7e1503db824aaf._comment
@@ -0,0 +1,35 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 10"""
+ date="2021-10-06T17:09:50Z"
+ content="""
+There is still a big PINNED spike though. I measured this memory use:
+
+	115344 post listContents
+	133816 post importKeys
+	236676 post recordImportTree
+
+listContents produces an `ImportableContents (ContentIdentifier, ByteSize)`
+and that gets transformed through importKeys 
+to `ImportableContents (Either Sha Key)`. The GC should be able to
+free up the first as it's being traversed, but PINNED still goes up during
+that, and memory increases by 20% or so.
+
+Then recordImportTree calls mktreeitem and treeItemsToTree, which between
+them double the memory.
+
+So I think I understand where the memory use is, although why it's PINNED
+is still not clear, and unpinning could still help. I did try converting
+TopFilePath to ShortByteString, since TreeItems contain them, but it didn't
+reduce the amount PINNED and actually used more memory.
+
+To avoid the allocation entirely, it seems that borg's
+listImportableContents would need to generate a Tree itself, rather than
+using ImportableContents. And it could, probably fairly efficiently, but it
+would not be able to reuse the tree import interface as it does now.
+
+(borg could return an `ImportableContents (Either Sha Key)` more easily,
+and still reuse part of the interface, but the conversion to that only
+uses 20% or so of memory so it's not a big enough win. Also when I looked
+at it, it was still not going to be an easy refactoring.)
+"""]]
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_11_fe04d3da8859101ba1649fdd9d5ee39e._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_11_fe04d3da8859101ba1649fdd9d5ee39e._comment
new file mode 100644
index 000000000..b234f769c
--- /dev/null
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_11_fe04d3da8859101ba1649fdd9d5ee39e._comment
@@ -0,0 +1,39 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 11"""
+ date="2021-10-06T18:03:23Z"
+ content="""
+@tomdhunt the tree is being stored in git, so the natural way
+to do something like a difference encoding would be a series of trees
+in a commit sequence.
+
+The tree import interface does support that, but the borg remote
+doesn't bother and puts all the items in a single tree. But even if it did,
+it would still populate the same ImportableContents data structure with
+the same amount of data, just in a different layout.
+
+But maybe this line of thinking does point toward a solution.. Suppose that
+there was a way for listImportableContents to generate an
+ImportableContentsChunk that contained a subtree, and a continuation to get
+the next subtree. Then each subtree's worth of ImportableContents would be
+passed through to recordImportTree (a version omitting the parts of it that
+commit the tree), and only one subtree at a time would occupy memory. At
+the end a tree would be constructed containing all the subtrees, and
+committed. 
+
+For borg, each archive would be a subtree; 500k filenames will fit in memory
+or at least fit better than `365*500k`.
+
+The interface I'm thinking about is something like this:
+
+	data ChunkedImportableContents info
+		= ImportableContentsChunk
+			{ importableContentsRoot :: ImportLocation
+			, importableContentsSubTree :: [(ImportLocation, info)]
+			-- ^ locations are relative to importableContentsRoot
+			, importableContentsContinuation :: Annex (ChunkedImportableContents info)
+			}
+		| ImportableContentsComplete (ImportableContents info)
+
+This is a promising idea!
+"""]]
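
The chunked interface proposed in comment 11 can be sketched roughly as follows. This is a minimal illustration under assumptions, not git-annex code: `Chunked`, `foldChunks`, and the `FilePath` alias for `ImportLocation` are hypothetical stand-ins, the monad is left generic where the comment uses `Annex`, and the final case carries a plain list instead of a full `ImportableContents`.

```haskell
-- Hypothetical, simplified stand-ins; not git-annex's real types.
type ImportLocation = FilePath

data Chunked m info
    = Chunk
        { chunkRoot :: ImportLocation
        , chunkSubTree :: [(ImportLocation, info)]
        -- ^ locations are relative to chunkRoot
        , chunkContinuation :: m (Chunked m info)
        }
    | Complete [(ImportLocation, info)]

-- Run an action on each subtree as it arrives and keep only its
-- (presumably small) result, so that one subtree's item list at a
-- time needs to be live.
foldChunks
    :: Monad m
    => (ImportLocation -> [(ImportLocation, info)] -> m r)
    -> Chunked m info
    -> m [r]
foldChunks f (Complete l) = do
    r <- f "" l
    return [r]
foldChunks f (Chunk root l next) = do
    r <- f root l
    rest <- next >>= foldChunks f
    return (r : rest)
```

In the borg case described above, each archive would be one `Chunk`, and the per-subtree action would record that subtree and return something compact, such as its tree sha.
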
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_7_f59d9c51716892240ebd12fa80a2e58b._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_7_f59d9c51716892240ebd12fa80a2e58b._comment
index 0d8cad31c..a3a88f1ca 100644
--- a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_7_f59d9c51716892240ebd12fa80a2e58b._comment
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_7_f59d9c51716892240ebd12fa80a2e58b._comment
@@ -8,16 +8,4 @@ and the -hc profile is unchanged. So the pinned memory is not in refs.
 
 Also tried converting Key to use ShortByteString. That was a win!
 My 20 borg archive test case is down from 320 mb to 242 mb.
-
-Looking at Command.SyncpullThirdPartyPopulated,
-it calls listContents, which calls borg's listImportableContents,
-and produces an `ImportableContents (ContentIdentifier, ByteSize)`
-then that gets passed through importKeys to produce
-an `ImportableContents (Either Sha Key)`. Probably
-double memory is used while doing that conversion, unless
-the GC manages to free the first one while it's traversed.
-
-If borg's listImportableContents included a Key (which it does
-produce already only to throw away!) that might 
-eliminate the big spike just before treeItemsToTree.
 """]]

Added a comment
diff --git a/doc/forum/config_to_make_git_annex_sync_only_sync_metadata__63__/comment_3_07e6f767e38d9812b176ee3fdb00adf0._comment b/doc/forum/config_to_make_git_annex_sync_only_sync_metadata__63__/comment_3_07e6f767e38d9812b176ee3fdb00adf0._comment
new file mode 100644
index 000000000..a9b271d36
--- /dev/null
+++ b/doc/forum/config_to_make_git_annex_sync_only_sync_metadata__63__/comment_3_07e6f767e38d9812b176ee3fdb00adf0._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="falsifian"
+ avatar="http://cdn.libravatar.org/avatar/59c3c23c500d20d83ecb9d1f149be9ae"
+ subject="comment 3"
+ date="2021-10-06T03:36:01Z"
+ content="""
+Clarification: `git annex sync --no-content ...` is working fine for me. So I don't need the script.
+
+I guess I can define a shell alias to fill in the `--no-content` part.
+"""]]

convert Key to ShortByteString
This adds the overhead of a copy when serializing and deserializing keys.
I have not benchmarked much, but runtimes seem barely changed at all by that.
When a lot of keys are in memory, it improves memory use.
And, it prevents keys sometimes getting PINNED in memory and failing to GC,
which is a problem ByteString has sometimes. In particular, git-annex sync
from a borg special remote had that problem and this improved its memory
use by a large amount.
Sponsored-by: Shae Erisson on Patreon
diff --git a/Annex/Export.hs b/Annex/Export.hs
index a01d263cf..3ab68ad53 100644
--- a/Annex/Export.hs
+++ b/Annex/Export.hs
@@ -18,6 +18,7 @@ import qualified Types.Remote as Remote
 import Messages
 
 import Data.Maybe
+import qualified Data.ByteString.Short as S (fromShort, toShort)
 
 -- From a sha pointing to the content of a file to the key
 -- to use to export it. When the file is annexed, it's the annexed key.
@@ -39,7 +40,7 @@ exportKey sha = mk <$> catKey sha
 -- only checksum the content.
 gitShaKey :: Git.Sha -> Key
 gitShaKey (Git.Ref s) = mkKey $ \kd -> kd
-	{ keyName = s
+	{ keyName = S.toShort s
 	, keyVariety = OtherKey "GIT"
 	}
 
@@ -47,7 +48,7 @@ gitShaKey (Git.Ref s) = mkKey $ \kd -> kd
 keyGitSha :: Key -> Maybe Git.Sha
 keyGitSha k
 	| fromKey keyVariety k == OtherKey "GIT" =
-		Just (Git.Ref (fromKey keyName k))
+		Just (Git.Ref (S.fromShort (fromKey keyName k)))
 	| otherwise = Nothing
 
 -- Is a key storing a git sha, and not used for an annexed file?
diff --git a/Backend.hs b/Backend.hs
index d327fde3d..3a0011536 100644
--- a/Backend.hs
+++ b/Backend.hs
@@ -33,6 +33,7 @@ import qualified Backend.URL
 import qualified Backend.External
 
 import qualified Data.Map as M
+import qualified Data.ByteString.Short as S (toShort, fromShort)
 import qualified Data.ByteString.Char8 as S8
 
 {- Build-in backends. Does not include externals. -}
@@ -67,7 +68,7 @@ genKey source meterupdate preferredbackend = do
   where
 	-- keyNames should not contain newline characters.
 	makesane k = alterKey k $ \d -> d
-		{ keyName = S8.map fixbadchar (fromKey keyName k)
+		{ keyName = S.toShort (S8.map fixbadchar (S.fromShort (fromKey keyName k)))
 		}
 	fixbadchar c
 		| c == '\n' = '_'
diff --git a/Backend/External.hs b/Backend/External.hs
index c353e049c..fe7449f1f 100644
--- a/Backend/External.hs
+++ b/Backend/External.hs
@@ -20,6 +20,7 @@ import Utility.Metered
 import qualified Utility.SimpleProtocol as Proto
 
 import qualified Data.ByteString as S
+import qualified Data.ByteString.Short as S (toShort, fromShort)
 import qualified Data.Map.Strict as M
 import Data.Char
 import Control.Concurrent
@@ -285,7 +286,7 @@ toProtoKey k = ProtoKey $ alterKey k $ \d -> d
 	-- The extension can be easily removed, because the protocol
 	-- documentation does not allow '.' to be used in the keyName,
 	-- so the first one is the extension.
-	{ keyName = S.takeWhile (/= dot) (keyName d)
+	{ keyName = S.toShort (S.takeWhile (/= dot) (S.fromShort (keyName d)))
 	, keyVariety = setHasExt (HasExt False) (keyVariety d)
 	}
   where
diff --git a/Backend/Hash.hs b/Backend/Hash.hs
index bd66cb698..4ffbcbbde 100644
--- a/Backend/Hash.hs
+++ b/Backend/Hash.hs
@@ -24,6 +24,7 @@ import Utility.Metered
 import qualified Utility.RawFilePath as R
 
 import qualified Data.ByteString as S
+import qualified Data.ByteString.Short as S (toShort, fromShort)
 import qualified Data.ByteString.Char8 as S8
 import qualified Data.ByteString.Lazy as L
 import Control.DeepSeq
@@ -106,7 +107,7 @@ keyValue hash source meterupdate = do
 	filesize <- liftIO $ getFileSize file
 	s <- hashFile hash file meterupdate
 	return $ mkKey $ \k -> k
-		{ keyName = encodeBS s
+		{ keyName = S.toShort (encodeBS s)
 		, keyVariety = hashKeyVariety hash (HasExt False)
 		, keySize = Just filesize
 		}
@@ -160,7 +161,7 @@ needsUpgrade :: Key -> Bool
 needsUpgrade key = or
 	[ "\\" `S8.isPrefixOf` keyHash key
 	, S.any (not . validInExtension) (snd $ splitKeyNameExtension key)
-	, not (hasExt (fromKey keyVariety key)) && keyHash key /= fromKey keyName key
+	, not (hasExt (fromKey keyVariety key)) && keyHash key /= S.fromShort (fromKey keyName key)
 	]
 
 trivialMigrate :: Key -> Backend -> AssociatedFile -> Annex (Maybe Key)
@@ -171,14 +172,14 @@ trivialMigrate' :: Key -> Backend -> AssociatedFile -> Maybe Int -> Maybe Key
 trivialMigrate' oldkey newbackend afile maxextlen
 	{- Fast migration from hashE to hash backend. -}
 	| migratable && hasExt oldvariety = Just $ alterKey oldkey $ \d -> d
-		{ keyName = keyHash oldkey
+		{ keyName = S.toShort (keyHash oldkey)
 		, keyVariety = newvariety
 		}
 	{- Fast migration from hash to hashE backend. -}
 	| migratable && hasExt newvariety = case afile of
 		AssociatedFile Nothing -> Nothing
 		AssociatedFile (Just file) -> Just $ alterKey oldkey $ \d -> d
-			{ keyName = keyHash oldkey 
+			{ keyName = S.toShort $ keyHash oldkey 
 				<> selectExtension maxextlen file
 			, keyVariety = newvariety
 			}
@@ -186,9 +187,9 @@ trivialMigrate' oldkey newbackend afile maxextlen
 	 - non-extension preserving key, with an extension
 	 - in its keyName. -}
 	| newvariety == oldvariety && not (hasExt oldvariety) &&
-		keyHash oldkey /= fromKey keyName oldkey = 
+		keyHash oldkey /= S.fromShort (fromKey keyName oldkey) = 
 			Just $ alterKey oldkey $ \d -> d
-				{ keyName = keyHash oldkey
+				{ keyName = S.toShort (keyHash oldkey)
 				}
 	| otherwise = Nothing
   where
diff --git a/Backend/Utilities.hs b/Backend/Utilities.hs
index 7121d4f2f..58ba880f9 100644
--- a/Backend/Utilities.hs
+++ b/Backend/Utilities.hs
@@ -16,6 +16,7 @@ import Types.Key
 import Types.KeySource
 
 import qualified Data.ByteString as S
+import qualified Data.ByteString.Short as S (ShortByteString, toShort)
 import qualified Data.ByteString.Lazy as L
 import qualified System.FilePath.ByteString as P
 import Data.Char
@@ -25,13 +26,13 @@ import Data.Word
  - If it's not too long, the full string is used as the keyName.
  - Otherwise, it's truncated, and its md5 is prepended to ensure a unique
  - key. -}
-genKeyName :: String -> S.ByteString
+genKeyName :: String -> S.ShortByteString
 genKeyName s
 	-- Avoid making keys longer than the length of a SHA256 checksum.
-	| bytelen > sha256len = encodeBS $
+	| bytelen > sha256len = S.toShort $ encodeBS $
 		truncateFilePath (sha256len - md5len - 1) s' ++ "-" ++ 
 			show (md5 bl)
-	| otherwise = encodeBS s'
+	| otherwise = S.toShort $ encodeBS s'
   where
 	s' = preSanitizeKeyName s
 	bl = encodeBL s
@@ -47,7 +48,7 @@ addE source sethasext k = do
 	maxlen <- annexMaxExtensionLength <$> Annex.getGitConfig
 	let ext = selectExtension maxlen (keyFilename source)
 	return $ alterKey k $ \d -> d
-		{ keyName = keyName d <> ext
+		{ keyName = keyName d <> S.toShort ext
 		, keyVariety = sethasext (keyVariety d)
 		}
 
diff --git a/Backend/WORM.hs b/Backend/WORM.hs
index af116a807..233ca92e6 100644
--- a/Backend/WORM.hs
+++ b/Backend/WORM.hs
@@ -17,6 +17,7 @@ import Utility.Metered
 
 import qualified Data.ByteString.Char8 as S8
 import qualified Utility.RawFilePath as R
+import qualified Data.ByteString.Short as S (toShort, fromShort)
 
 backends :: [Backend]
 backends = [backend]
@@ -53,12 +54,13 @@ keyValue source _ = do
 {- Old WORM keys could contain spaces and carriage returns, 
  - and can be upgraded to remove them. -}
 needsUpgrade :: Key -> Bool
-needsUpgrade key = any (`S8.elem` fromKey keyName key) [' ', '\r']
+needsUpgrade key =
+	any (`S8.elem` S.fromShort (fromKey keyName key)) [' ', '\r']
 
 removeProblemChars :: Key -> Backend -> AssociatedFile -> Annex (Maybe Key)
 removeProblemChars oldkey newbackend _
 	| migratable = return $ Just $ alterKey oldkey $ \d -> d
-		{ keyName = encodeBS $ reSanitizeKeyName $ decodeBS $ keyName d }
+		{ keyName = S.toShort $ encodeBS $ reSanitizeKeyName $ decodeBS $ S.fromShort $ keyName d }
 	| otherwise = return Nothing
   where
 	migratable = oldvariety == newvariety
diff --git a/Command/Find.hs b/Command/Find.hs
index d89ff2b96..0a5544e43 100644
--- a/Command/Find.hs

(Diff truncated)
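
The pattern that recurs throughout the Key conversion above (store the name as a `ShortByteString`, convert to `ByteString` only to manipulate it) can be shown in isolation. A minimal sketch, assuming a hypothetical `KeyName` newtype in place of git-annex's real `Key` record; `sanitize` loosely mirrors the `makesane` change in Backend.hs.

```haskell
import qualified Data.ByteString.Char8 as S8
import qualified Data.ByteString.Short as SBS

-- Hypothetical miniature of a key name; the real Key has more fields.
newtype KeyName = KeyName SBS.ShortByteString
    deriving (Eq, Show)

-- toShort copies the bytes into unpinned memory, which is the
-- copy overhead the commit message mentions.
mkKeyName :: S8.ByteString -> KeyName
mkKeyName = KeyName . SBS.toShort

-- Replace newlines with underscores. The name is converted to a
-- ByteString to use S8.map, then converted back for storage.
sanitize :: KeyName -> KeyName
sanitize (KeyName n) = KeyName (SBS.toShort (S8.map fix (SBS.fromShort n)))
  where
    fix '\n' = '_'
    fix c = c
```

Each `fromShort`/`toShort` pair costs a copy, which matches the trade-off described in the commit message: slightly more work when converting, in exchange for values the GC can move and free.
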
comment and correct incorrect info in previous comment
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment
index dbe330df0..977c0eaca 100644
--- a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment
@@ -30,14 +30,4 @@ values:
 	209704 before mktreeitem
 	261724 before treeItemsToTree
 	327260 after treeItemsToTree
-
-Also, compare above profile with this (-c) profile:
-
-<img src="https://tmp.joeyh.name/prof2.png">
-
-This shows PINNED is increasing all the way to the end, which seems to
-rule out any of the functions shown in the first profile. 
-
-What the first profile shows running up until the end is export db updates.
-But I tried disabling the db updates and the memory use didn't change.
 """]]
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_7_f59d9c51716892240ebd12fa80a2e58b._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_7_f59d9c51716892240ebd12fa80a2e58b._comment
new file mode 100644
index 000000000..d308db752
--- /dev/null
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_7_f59d9c51716892240ebd12fa80a2e58b._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 7"""
+ date="2021-10-05T23:00:18Z"
+ content="""
+I tried converting Ref to use ShortByteString. Memory use did not improve
+and the -hc profile is unchanged. So the pinned memory is not in refs. My
+guess is it must be filenames in the tree then.
+"""]]

Added a comment
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_8_c5e3d0c826de72eb0ca9dff51104a0ab._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_8_c5e3d0c826de72eb0ca9dff51104a0ab._comment
new file mode 100644
index 000000000..a8a2134f8
--- /dev/null
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_8_c5e3d0c826de72eb0ca9dff51104a0ab._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="tomdhunt"
+ avatar="http://cdn.libravatar.org/avatar/02694633d0fb05bb89f025cf779218a3"
+ subject="comment 8"
+ date="2021-10-05T22:07:44Z"
+ content="""
+If it's just a matter of storing the whole set of keys present in each individual archive, you might be able to handle it via difference encoding. The whole list for the first archive, then just sets of added/removed for each archive after that.
+
+This adds a runtime cost to getting the whole set for any archive after the first one, but even with a few thousand archives it seems that should be relatively small. (I assume that it's more likely to have huge numbers of items in an archive, than huge numbers of archives in a repository.)
+"""]]
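
The difference encoding suggested in comment 8 could look something like this. It is only a sketch under assumptions: `Delta`, `encode`, and `decode` are hypothetical names, and keys are modelled as any `Ord` type held in a `Data.Set`.

```haskell
import qualified Data.Set as Set

-- What changed between one archive and the next.
data Delta k = Delta
    { added :: Set.Set k
    , removed :: Set.Set k
    }

diffSets :: Ord k => Set.Set k -> Set.Set k -> Delta k
diffSets old new = Delta (new `Set.difference` old) (old `Set.difference` new)

applyDelta :: Ord k => Set.Set k -> Delta k -> Set.Set k
applyDelta s (Delta a r) = (s `Set.difference` r) `Set.union` a

-- Store the first archive whole, then one delta per later archive.
encode :: Ord k => [Set.Set k] -> Maybe (Set.Set k, [Delta k])
encode [] = Nothing
encode (first : rest) = Just (first, zipWith diffSets (first : rest) rest)

-- Rebuild every archive's key set by replaying the deltas in order.
decode :: Ord k => Set.Set k -> [Delta k] -> [Set.Set k]
decode = scanl applyDelta
```

Recovering the set for archive n means replaying n deltas, which is the runtime cost the comment mentions.
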

Added a comment
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_7_2316ba67144849988632c79e5a59a3f6._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_7_2316ba67144849988632c79e5a59a3f6._comment
new file mode 100644
index 000000000..f8202e9a4
--- /dev/null
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_7_2316ba67144849988632c79e5a59a3f6._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="tomdhunt"
+ avatar="http://cdn.libravatar.org/avatar/02694633d0fb05bb89f025cf779218a3"
+ subject="comment 7"
+ date="2021-10-05T21:30:03Z"
+ content="""
+Yeah, I'm not familiar with the internal architecture but both borg and git-annex handle this dataset fine on their own, so it seems that the intersection between the two should also be doable.
+"""]]

comment
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_6_4b71b012153a71e03c57ae3ed3ce2272._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_6_4b71b012153a71e03c57ae3ed3ce2272._comment
new file mode 100644
index 000000000..7d8809850
--- /dev/null
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_6_4b71b012153a71e03c57ae3ed3ce2272._comment
@@ -0,0 +1,24 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 6"""
+ date="2021-10-05T20:53:24Z"
+ content="""
+@tomdhunt so your repo has on the order of 182 million
+items for git-annex to track. I do think that is probably too many to be
+practical even if this memory problem gets resolved. A list of that many
+items is at least 25 gigabytes in size. Add some memory for data structures
+and it's hard to see it working with even your enviable 64 gb. 
+
+This brings me back to the idea of only including one item for each key...  
+Only the item from the most recent archive.
+If the oldest archives always are deleted first, that would never leave a
+key present in the borg repo without git-annex having a record of the
+archive that contained it.
+
+But if you used borg prune to delete some
+intermediate archives, git-annex could no longer know of any existing
+archive that contains a key, so getting from the borg repo would fail,
+until it re-scanned the whole repo.
+git-annex sync could notice when such an intermediate archive
+has been deleted, and trigger the re-scan.
+"""]]
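
The estimate in comment 6 can be reproduced with a short calculation. The inputs are assumptions pieced together from the thread: 365 daily archives, about 500k keys per archive, and the 142 bytes per item measured on the test repo in comment 2, which may not match this repo.

```haskell
-- Assumed inputs, taken from figures quoted elsewhere in the thread.
archives, keysPerArchive, bytesPerItem :: Integer
archives = 365
keysPerArchive = 500000
bytesPerItem = 142

-- One item per key per archive.
totalItems :: Integer
totalItems = archives * keysPerArchive

-- Raw size of the item list, before any data structure overhead.
totalBytes :: Integer
totalBytes = totalItems * bytesPerItem
```

Under these assumptions `totalItems` is 182,500,000 and `totalBytes` is 25,915,000,000, in line with the "182 million items" and "at least 25 gigabytes" figures.
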

Added a comment: Thanks!
diff --git a/doc/tips/what_to_do_when_a_repository_is_corrupted/comment_4_d4178d99179e8761162c74673042e28a._comment b/doc/tips/what_to_do_when_a_repository_is_corrupted/comment_4_d4178d99179e8761162c74673042e28a._comment
new file mode 100644
index 000000000..0f09dfaba
--- /dev/null
+++ b/doc/tips/what_to_do_when_a_repository_is_corrupted/comment_4_d4178d99179e8761162c74673042e28a._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="bjornw@6a7d7d0413efc7ed3bb44922586f040bb768b71c"
+ nickname="bjornw"
+ avatar="http://cdn.libravatar.org/avatar/676a7c403af0627c9bde84ee1ab3975c"
+ subject="Thanks!"
+ date="2021-10-05T21:09:44Z"
+ content="""
+This worked like a charm.
+"""]]

Added a comment
diff --git a/doc/forum/config_to_make_git_annex_sync_only_sync_metadata__63__/comment_2_11b9261bc5c32d3e36357afc021ac4e1._comment b/doc/forum/config_to_make_git_annex_sync_only_sync_metadata__63__/comment_2_11b9261bc5c32d3e36357afc021ac4e1._comment
new file mode 100644
index 000000000..ec390af96
--- /dev/null
+++ b/doc/forum/config_to_make_git_annex_sync_only_sync_metadata__63__/comment_2_11b9261bc5c32d3e36357afc021ac4e1._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="falsifian"
+ avatar="http://cdn.libravatar.org/avatar/59c3c23c500d20d83ecb9d1f149be9ae"
+ subject="comment 2"
+ date="2021-10-05T20:55:32Z"
+ content="""
+Thanks joey, your script is useful and I can use it as a workaround if necessary.
+
+I still don't understand the reason behind the behaviour. Why does `annex.synconlyannex = true` cause content to be synced? Wouldn't it be simpler to just say content is synced iff `annex.synccontent` is set to true? (I leave it at the default, `false`, because I don't want content to be synced, but that does not help.)
+
+"""]]

update
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment
index 401bcccc8..dbe330df0 100644
--- a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment
@@ -31,5 +31,13 @@ values:
 	261724 before treeItemsToTree
 	327260 after treeItemsToTree
 
+Also, compare above profile with this (-c) profile:
 
+<img src="https://tmp.joeyh.name/prof2.png">
+
+This shows PINNED is increasing all the way to the end, which seems to
+rule out any of the functions shown in the first profile. 
+
+What the first profile shows running up until the end is export db updates.
+But I tried disabling the db updates and the memory use didn't change.
 """]]

comment
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_2_3af1b1dd4c1dea54639baac90c60452d._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_2_3af1b1dd4c1dea54639baac90c60452d._comment
index 753576646..62d1ef4a9 100644
--- a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_2_3af1b1dd4c1dea54639baac90c60452d._comment
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_2_3af1b1dd4c1dea54639baac90c60452d._comment
@@ -8,11 +8,8 @@ repo. The length of each item was 142 bytes, so all the items should
 need about 15 mb of memory. git-annex sync used more than 2 gb
 of memory. So that's a test case for this bug.
 
-Looks like around 500 mb is used listing the repo contents, and
-then after all the borg list is complete, it uses much more memory
-building the git tree.
+Looks like around 500 mb is used listing the repo contents.
 
-I was not including building the git tree in my estimates. I see
-that Annex.Import uses recordTree, which does have to buffer the whole
-tree in memory, but this seems much more memory than that.
+Then after all the borg list is complete, it uses much more memory
+building the git tree.
 """]]
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment
new file mode 100644
index 000000000..401bcccc8
--- /dev/null
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_4_be583237b6edff71763eda1fab2d5992._comment
@@ -0,0 +1,35 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2021-10-05T19:26:49Z"
+ content="""
+I've tried most types of heap profiles and saw only PINNED.
+But a retainer profile (-hr) told more.
+
+<img src="https://tmp.joeyh.name/prof.png">
+
+Note that 8602 is really getImportableContents, and 14913 is importKeys.
+(Found in git-annex.prof which tells the call stack for each set.) 
+
+I think that buildImportTrees's allocation is due to needing to hash
+git-annex symlinks and retain the shas. (mktreeitem) Unless there's also memory
+fragmentation happening there.
+
+treeItemsToTree might be the real problem, but it's hard to see how to 
+improve it. Maybe stop using it and use a temporary index file to build
+up the tree?
+
+Notice that the 30mb spike shown in the profile is only a fraction of the
+300+ mb that run actually grew to consume. Which gets back to PINNED and fragmentation,
+I'm afraid..
+
+Looking at git-annex from outside, I collected these RSS
+values:
+
+	101508 early borg list
+	209704 before mktreeitem
+	261724 before treeItemsToTree
+	327260 after treeItemsToTree
+
+
+"""]]

Added a comment
diff --git a/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_3_97dff7adb32a087fbc9f546fdea28bbe._comment b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_3_97dff7adb32a087fbc9f546fdea28bbe._comment
new file mode 100644
index 000000000..9958893fd
--- /dev/null
+++ b/doc/bugs/borg_special_remote_memory_usage_high_for_large_borg_repo/comment_3_97dff7adb32a087fbc9f546fdea28bbe._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="tomdhunt"
+ avatar="http://cdn.libravatar.org/avatar/02694633d0fb05bb89f025cf779218a3"
+ subject="comment 3"
+ date="2021-10-05T19:08:23Z"
+ content="""
+The repo in question is my daily backup repository. It keeps an archive for each day going back a year or so, so on the order of hundreds of archives. The underlying data is about 8TB, but it only changes by small amounts, so the whole borg repo is also about 8TB. Each archive has a git-annex folder in it. (I specified the subdir option to point directly to the folder.) The annex has many small files; the total number of keys is about 500k.
+"""]]