Recent changes to this wiki:

diff --git a/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn b/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn
index 3573a9106..9e965967c 100644
--- a/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn
+++ b/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn
@@ -69,7 +69,7 @@ All that needs to be done is to add an equation for `expandt` to handle the case
 See the following patch:
 
 ```
-From 4d8febfd5ec64516d3f77577a498f96b87ec9c9c Mon Sep 17 00:00:00 2001
+From 680873923197f5eec15365b3e47e3fa05b9573be Mon Sep 17 00:00:00 2001
 From: Grond <grond66@riseup.net>
 Date: Thu, 14 Jan 2021 18:16:31 -0800
 Subject: [PATCH] Fix expandTilde so that it can handle tildes at the end of
@@ -80,14 +80,14 @@ Subject: [PATCH] Fix expandTilde so that it can handle tildes at the end of
  1 file changed, 1 insertion(+)
 
 diff --git a/Git/Construct.hs b/Git/Construct.hs
-index 8b63ac480..b7f018944 100644
+index 8b63ac480..a369bc4a6 100644
 --- a/Git/Construct.hs
 +++ b/Git/Construct.hs
 @@ -187,6 +187,7 @@ expandTilde = expandt True
  	expandt True ('~':'/':cs) = do
  		h <- myHomeDir
  		return $ h </> cs
-+        expandt True "~" = myHomeDir
++	expandt True "~" = myHomeDir
  	expandt True ('~':cs) = do
  		let (name, rest) = findname "" cs
  		u <- getUserEntryForName name

diff --git a/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn b/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn
index 6c8e7cde8..3573a9106 100644
--- a/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn
+++ b/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn
@@ -65,7 +65,7 @@ git-annex: get: 1 failed
 ```
 
 Fixing the problem is simple enough.
-All that needs to be done is to add a case for `expandt` to handle the case where `~` appears at the end of a string.
+All that needs to be done is to add an equation for `expandt` to handle the case where `~` appears at the end of a string.
 See the following patch:
 
 ```
@@ -100,13 +100,14 @@ index 8b63ac480..b7f018944 100644
 
 1. Create `testfile` in a git-annex repo of your home directory on host `A`
 2. Run `git annex add testfile` in the repo on `A`
-3. Clone your home directory on `A` onto host `B` using `git clone ssh://me@A/~ homedir_A`
-4. `cd` into `homedir_A`
-5. Run `git annex get testfile`
-6. Watch git-annex fail to fetch the file
-7. Run `git remote set-url origin ssh://me@A/~/` to set the remote URL to be something git-annex can deal with
-8. Run `git annex get testfile` again
-9. Watch git-annex suddenly succeed
+3. Run `git commit`
+4. Clone your home directory on `A` onto host `B` using `git clone ssh://me@A/~ homedir_A`
+5. `cd` into `homedir_A`
+6. Run `git annex get testfile`
+7. Watch git-annex fail to fetch the file
+8. Run `git remote set-url origin ssh://me@A/~/` to set the remote URL to be something git-annex can deal with
+9. Run `git annex get testfile` again
+10. Watch git-annex suddenly succeed
 
 ### What version of git-annex are you using? On what operating system?
 

Added a comment
diff --git a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_4_301ac4a37a0f2da6b9c394509a369484._comment b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_4_301ac4a37a0f2da6b9c394509a369484._comment
new file mode 100644
index 000000000..501afa9f5
--- /dev/null
+++ b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_4_301ac4a37a0f2da6b9c394509a369484._comment
@@ -0,0 +1,26 @@
+[[!comment format=mdwn
+ username="mih"
+ avatar="http://cdn.libravatar.org/avatar/f881df265a423e4f24eff27c623148fd"
+ subject="comment 4"
+ date="2021-01-15T08:32:58Z"
+ content="""
+> Is the relative path actually valid?
+
+As far as I can tell, it is.
+
+```
+(venv3.8.6) worker-629-003:datalad_temp_test_basic_scenariocm66i4fm appveyor$ ls -id ../../../../../../../var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_tree_test_basic_scenariodi3ady04/.git/annex /var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_tree_test_basic_scenariodi3ady04/.git/annex
+12889649026 ../../../../../../../var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_tree_test_basic_scenariodi3ady04/.git/annex
+12889649026 /var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_tree_test_basic_scenariodi3ady04/.git/annex
+```
+
+I cannot come up with an explanation for why `ls` behaves consistently for a relative path vs. an absolute path, but `mkdir` does not.
+
+```
+(venv3.8.6) worker-629-003:datalad_temp_test_basic_scenariocm66i4fm appveyor$ mkdir ../../../../../../../var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_tree_test_basic_scenariodi3ady04/.git/annex/testdummy
+mkdir: ../../../../../../../var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_tree_test_basic_scenariodi3ady04/.git/annex: No such file or directory
+
+(venv3.8.6) worker-629-003:datalad_temp_test_basic_scenariocm66i4fm appveyor$ mkdir /var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_tree_test_basic_scenariodi3ady04/.git/annex/testdummy
+-> exit 0
+```
+"""]]

diff --git a/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn b/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn
new file mode 100644
index 000000000..6c8e7cde8
--- /dev/null
+++ b/doc/bugs/__91__PATCH__93___incorrect_behaviour_in_expandTilde.mdwn
@@ -0,0 +1,141 @@
+### Please describe the problem.
+
+git-annex has issues when trying to deal with SSH (and possibly other kinds of) URLs which have the form:
+
+```
+ssh://user@host/~
+```
+
+When git-annex tries to perform tilde-expansion on the path part of the URL on the remote side,
+it runs into problems because the function responsible for doing this (`expandTilde` in `Git/Construct.hs`)
+does not correctly handle the expansion of home directory paths which do not end in a slash,
+such as `~` or `/~`. It will correctly handle strings like `/~/` or `~/`, which is why SSH
+URLs of the form `ssh://user@host/~/` *will* work.
+
+Examining the definition of `expandTilde` makes it clear why this is true:
+
+```haskell
+expandTilde :: FilePath -> IO FilePath
+#ifdef mingw32_HOST_OS
+expandTilde = return
+#else
+expandTilde = expandt True
+  where
+        expandt _ [] = return ""
+        expandt _ ('/':cs) = do
+                v <- expandt True cs
+                return ('/':v)
+        expandt True ('~':'/':cs) = do
+                h <- myHomeDir
+                return $ h </> cs
+        expandt True ('~':cs) = do
+                let (name, rest) = findname "" cs
+                u <- getUserEntryForName name
+                return $ homeDirectory u </> rest
+        expandt _ (c:cs) = do
+                v <- expandt False cs
+                return (c:v)
+        findname n [] = (n, "") 
+        findname n (c:cs)
+                | c == '/' = (n, cs) 
+                | otherwise = findname (n++[c]) cs
+```
+
+The expression `expandTilde "~"` will eventually match the fourth pattern for `expandt`.
+Since `cs == ""` in this context, `name` will also evaluate to `""`.
+This means that `getUserEntryForName` will be called with the null string as an argument.
+Since there is no user on the system with the null string as a username,
+`getUserEntryForName` will throw an exception.
+This will cause git-annex to spit out an error message:
+
+```
+get testfile (from origin...) 
+git-annex-shell: getUserEntryForName: does not exist (no such user)
+rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
+rsync error: error in rsync protocol data stream (code 12) at io.c(235) [Receiver=3.1.3]
+
+  rsync failed -- run git annex again to resume file transfer
+
+  Unable to access these remotes: origin
+
+  Try making some of these repositories available:
+  	1f5118ff-a50e-4bf1-a372-960774bce0ab -- user@A:~/ [origin]
+failed
+git-annex: get: 1 failed
+```
+
+Fixing the problem is simple enough.
+All that needs to be done is to add a case for `expandt` to handle the case where `~` appears at the end of a string.
+See the following patch:
+
+```
+From 4d8febfd5ec64516d3f77577a498f96b87ec9c9c Mon Sep 17 00:00:00 2001
+From: Grond <grond66@riseup.net>
+Date: Thu, 14 Jan 2021 18:16:31 -0800
+Subject: [PATCH] Fix expandTilde so that it can handle tildes at the end of
+ it's input
+
+---
+ Git/Construct.hs | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/Git/Construct.hs b/Git/Construct.hs
+index 8b63ac480..b7f018944 100644
+--- a/Git/Construct.hs
++++ b/Git/Construct.hs
+@@ -187,6 +187,7 @@ expandTilde = expandt True
+ 	expandt True ('~':'/':cs) = do
+ 		h <- myHomeDir
+ 		return $ h </> cs
++        expandt True "~" = myHomeDir
+ 	expandt True ('~':cs) = do
+ 		let (name, rest) = findname "" cs
+ 		u <- getUserEntryForName name
+-- 
+2.20.1
+
+```
+
+### What steps will reproduce the problem?
+
+1. Create `testfile` in a git-annex repo of your home directory on host `A`
+2. Run `git annex add testfile` in the repo on `A`
+3. Clone your home directory on `A` onto host `B` using `git clone ssh://me@A/~ homedir_A`
+4. `cd` into `homedir_A`
+5. Run `git annex get testfile`
+6. Watch git-annex fail to fetch the file
+7. Run `git remote set-url origin ssh://me@A/~/` to set the remote URL to be something git-annex can deal with
+8. Run `git annex get testfile` again
+9. Watch git-annex suddenly succeed
+
+### What version of git-annex are you using? On what operating system?
+
+I'm running Debian 10.7.
+
+The output of `git annex version` is:
+
+```
+git-annex version: 7.20190129
+build flags: Assistant Webapp Pairing S3(multipartupload)(storageclasses) WebDAV Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite
+dependency versions: aws-0.20 bloomfilter-2.0.1.0 cryptonite-0.25 DAV-1.3.3 feed-1.0.0.0 ghc-8.4.4 http-client-0.5.13.1 persistent-sqlite-2.8.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.0
+key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL
+remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar hook external
+operating system: linux x86_64
+supported repository versions: 5 7
+upgrade supported from repository versions: 0 1 2 3 4 5 6
+local repository version: 5
+```
+
+### Please provide any additional information below.
+
+[[!format sh """
+# If you can, paste a complete transcript of the problem occurring here.
+# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
+
+
+# End of transcript or log.
+"""]]
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+Definitely! I'm currently writing some personal file synchronization software that uses git-annex for myself, which is how I noticed this bug.
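
The pattern-matching problem described in the report above can be sketched outside of git-annex. This is a minimal illustration, not the code in `Git/Construct.hs`: `expandTildeWith` is a hypothetical name, and the two lookups that git-annex performs with `myHomeDir` and `getUserEntryForName` are passed in as parameters (an assumption made so the sketch runs without the `unix` package).

```haskell
import System.FilePath ((</>))

-- | Hypothetical, monad-generic variant of expandTilde, for illustration.
-- 'home' yields the current user's home directory; 'userHome' looks up
-- the home directory of a named user and may fail.
expandTildeWith
    :: Monad m
    => m FilePath
    -> (String -> m FilePath)
    -> FilePath
    -> m FilePath
expandTildeWith home userHome = expandt True
  where
    expandt _ [] = return ""
    expandt _ ('/':cs) = ('/' :) <$> expandt True cs
    expandt True ('~':'/':cs) = (</> cs) <$> home
    -- The equation proposed in the report: a lone tilde at the end of
    -- the input. Without it, "~" falls through to the next equation
    -- and 'userHome' is called with the empty string.
    expandt True "~" = home
    expandt True ('~':cs) = do
        let (name, rest) = break (== '/') cs
        h <- userHome name
        return (h </> drop 1 rest)
    expandt _ (c:cs) = (c :) <$> expandt False cs
```

Run in the `Either String` monad with a `userHome` that rejects the empty name, `expandTildeWith` applied to `"~"` returns the home directory. If the `"~"` equation is removed, the same call returns the lookup error, which mirrors the `getUserEntryForName: does not exist` failure quoted in the report.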

removed
diff --git a/doc/git-annex-uninit/comment_4_c3df10cdd19a15a80d3aa95373f1d071._comment b/doc/git-annex-uninit/comment_4_c3df10cdd19a15a80d3aa95373f1d071._comment
deleted file mode 100644
index 2373c1614..000000000
--- a/doc/git-annex-uninit/comment_4_c3df10cdd19a15a80d3aa95373f1d071._comment
+++ /dev/null
@@ -1,9 +0,0 @@
-[[!comment format=mdwn
- username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
- nickname="eric.w"
- avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
- subject="comment 4"
- date="2021-01-11T22:42:27Z"
- content="""
-actually --fast seems to have no affect on the speed. (each file seems to be getting hashed)
-"""]]

removed
diff --git a/doc/git-annex-uninit/comment_3_2cf58739c14b8aae6fc12fb1463a303f._comment b/doc/git-annex-uninit/comment_3_2cf58739c14b8aae6fc12fb1463a303f._comment
deleted file mode 100644
index e3f9a10e4..000000000
--- a/doc/git-annex-uninit/comment_3_2cf58739c14b8aae6fc12fb1463a303f._comment
+++ /dev/null
@@ -1,9 +0,0 @@
-[[!comment format=mdwn
- username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
- nickname="eric.w"
- avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
- subject="git-annex uninit --fast"
- date="2021-01-11T22:41:04Z"
- content="""
-are there any caveats to using --fast with this command? I assume it will just skip the hash validation.
-"""]]

Added a comment
diff --git a/doc/git-annex-uninit/comment_5_53358bdd6093e2fb787df8de7190c5fd._comment b/doc/git-annex-uninit/comment_5_53358bdd6093e2fb787df8de7190c5fd._comment
new file mode 100644
index 000000000..b9ca1f8f6
--- /dev/null
+++ b/doc/git-annex-uninit/comment_5_53358bdd6093e2fb787df8de7190c5fd._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 5"
+ date="2021-01-14T22:15:12Z"
+ content="""
+A faster way of doing uninit is the following:
+
+```cp --no-clobber --dereference --recursive --preserve=all --reflink=auto --verbose ./git_annex_repo/your_symlinks/ ./target_dir/```
+
+This will simply copy (thin COW copy) symlinks (dereferenced) as normal files, preserving the mtime, etc. The resulting ./target_dir/ will have your files if they existed in this annex, or broken symlinks if the files were not here.
+
+
+
+
+"""]]

close as not a git-annex bug
diff --git a/doc/bugs/error__58___invalid_object__while_setting_metadata.mdwn b/doc/bugs/error__58___invalid_object__while_setting_metadata.mdwn
index 084e38216..0020caf26 100644
--- a/doc/bugs/error__58___invalid_object__while_setting_metadata.mdwn
+++ b/doc/bugs/error__58___invalid_object__while_setting_metadata.mdwn
@@ -24,3 +24,5 @@ I think nothing odd was done besides trying to make this file saved unlocked for
 
 [[!meta author=yoh]]
 [[!tag projects/datalad]]
+
+> [[notabug|done]] --[[Joey]]

Bug fix: export with -J could fail when two files had the same content.
Exporting is done inside a call to writeLockDbWhile which guarantees there
is only one process uploading to a given ExportLocation.
diff --git a/CHANGELOG b/CHANGELOG
index b126a5222..bf34053c9 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -24,6 +24,7 @@ git-annex (8.20201130) UNRELEASED; urgency=medium
     include deletions of submodules.
     Thanks, Kyle Meyer for the patch.
   * Windows: Work around win32 length limits when dealing with lock files.
+  * Bug fix: export with -J could fail when two files had the same content.
 
  -- Joey Hess <id@joeyh.name>  Mon, 04 Jan 2021 12:52:41 -0400
 
diff --git a/Command/Export.hs b/Command/Export.hs
index 3973d3c10..2eae95af8 100644
--- a/Command/Export.hs
+++ b/Command/Export.hs
@@ -283,7 +283,11 @@ performExport r db ek af contentsha loc allfilledvar = do
 	sent <- tryNonAsync $ case ek of
 		AnnexKey k -> ifM (inAnnex k)
 			( notifyTransfer Upload af $
-				upload' (uuid r) k af stdRetry $ \pm -> do
+				-- alwaysUpload because the same key
+				-- could be used for more than one export
+				-- location, and concurrently uploading
+				-- of the content should still be allowed.
+				alwaysUpload (uuid r) k af stdRetry $ \pm -> do
 					let rollback = void $
 						performUnexport r db [ek] loc
 					sendAnnex k rollback $ \f ->
diff --git a/doc/bugs/export_-J_6__to_S3__58___transfer_already_in_progress.mdwn b/doc/bugs/export_-J_6__to_S3__58___transfer_already_in_progress.mdwn
index d599afe38..5f8a21da9 100644
--- a/doc/bugs/export_-J_6__to_S3__58___transfer_already_in_progress.mdwn
+++ b/doc/bugs/export_-J_6__to_S3__58___transfer_already_in_progress.mdwn
@@ -94,3 +94,5 @@ Besides reporting the issue, I also have a question:  could I just rerun `export
 
 [[!meta author=yoh]]
 [[!tag projects/datalad]]
+
+> [[fixed|done]] bypassed this unnecessary locking for exports. --[[Joey]]

Windows: Work around win32 length limits when dealing with lock files
diff --git a/CHANGELOG b/CHANGELOG
index 2d5a7f545..b126a5222 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -23,6 +23,7 @@ git-annex (8.20201130) UNRELEASED; urgency=medium
   * When syncing changes back from an adjusted branch to the basis branch,
     include deletions of submodules.
     Thanks, Kyle Meyer for the patch.
+  * Windows: Work around win32 length limits when dealing with lock files.
 
  -- Joey Hess <id@joeyh.name>  Mon, 04 Jan 2021 12:52:41 -0400
 
diff --git a/Utility/LockFile/Windows.hs b/Utility/LockFile/Windows.hs
index 100fa854f..c4b9b6740 100644
--- a/Utility/LockFile/Windows.hs
+++ b/Utility/LockFile/Windows.hs
@@ -1,10 +1,12 @@
 {- Windows lock files
  -
- - Copyright 2014 Joey Hess <id@joeyh.name>
+ - Copyright 2014,2021 Joey Hess <id@joeyh.name>
  -
  - License: BSD-2-clause
  -}
 
+{-# LANGUAGE OverloadedStrings #-}
+
 module Utility.LockFile.Windows (
 	lockShared,
 	lockExclusive,
@@ -16,8 +18,12 @@ module Utility.LockFile.Windows (
 import System.Win32.Types
 import System.Win32.File
 import Control.Concurrent
+import qualified Data.ByteString as B
+import qualified System.FilePath.Windows.ByteString as P
 
 import Utility.FileSystemEncoding
+import Utility.Split
+import Utility.Path.AbsRel
 
 type LockFile = RawFilePath
 
@@ -53,7 +59,8 @@ lockExclusive = openLock fILE_SHARE_NONE
  -}
 openLock :: ShareMode -> LockFile -> IO (Maybe LockHandle)
 openLock sharemode f = do
-	h <- withTString (fromRawFilePath f) $ \c_f ->
+	f' <- convertToNativeNamespace f
+	h <- withTString (fromRawFilePath f') $ \c_f ->
 		c_CreateFile c_f gENERIC_READ sharemode security_attributes
 			oPEN_ALWAYS fILE_ATTRIBUTE_NORMAL (maybePtr Nothing)
 	return $ if h == iNVALID_HANDLE_VALUE
@@ -62,6 +69,32 @@ openLock sharemode f = do
   where
 	security_attributes = maybePtr Nothing
 
+{- Convert a filepath to use Windows's native namespace.
+ - This avoids filesystem length limits.
+ -
+ - This is similar to the way base converts filenames on windows,
+ - but as that is implemented in C (create_device_name) and not
+ - exported, it cannot be used here. Several edge cases are not handled,
+ - including network shares and dos short paths. 
+ -}
+convertToNativeNamespace :: RawFilePath -> IO RawFilePath
+convertToNativeNamespace f
+	| win32_dev_namespace `B.isPrefixOf` f = return f
+	| win32_file_namespace `B.isPrefixOf` f = return f
+	| nt_device_namespace `B.isPrefixOf` f = return f
+	| otherwise = do
+		-- Make absolute because any '.' and '..' in the path
+		-- will not be resolved once it's converted.
+		p <- absPath f
+		-- Normalize slashes.
+		let p' = P.normalise p
+		return (win32_file_namespace <> p')
+  where
+ 
+	win32_dev_namespace = "\\\\.\\"
+	win32_file_namespace = "\\\\?\\"
+	nt_device_namespace = "\\Device\\"
+
 dropLock :: LockHandle -> IO ()
 dropLock = closeHandle
 
diff --git a/doc/bugs/Windows__58___drop_claims_that___34__content_is_locked__34__/comment_2_00fa93b8a064bd1e6ca6c35b1a10d5aa._comment b/doc/bugs/Windows__58___drop_claims_that___34__content_is_locked__34__/comment_2_00fa93b8a064bd1e6ca6c35b1a10d5aa._comment
new file mode 100644
index 000000000..bad4075d5
--- /dev/null
+++ b/doc/bugs/Windows__58___drop_claims_that___34__content_is_locked__34__/comment_2_00fa93b8a064bd1e6ca6c35b1a10d5aa._comment
@@ -0,0 +1,33 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-01-13T17:25:08Z"
+ content="""
+It seems that Utility.LockFile.Windows.openLock
+must be returning Nothing for this message
+to be displayed.
+
+Which it does when CreateFile returns `INVALID_HANDLE_VALUE`. Which
+it makes sense it would do for a filename that's too long. Except
+that's taken to mean the locking failed due to it being locked.
+
+(So it seems `CreateFile` with the too-long filename is creating the file but
+then failing that way. Which is weird.)
+
+It might be possible to use windows's `GetLastError` to find out that it
+failed due to length, but the API docs don't seem
+to say what the error value is in that case.
+
+Normally ghc modifies filenames on windows to not use the
+compatibility layer that has this filename length limit. But since
+this is using the low-level CreateFile, that does not happen here.
+
+The code that does that is not exposed (`create_device_name` in base's
+cbits/fs.c)
+It's basically a matter of prepending `\\?\` to the path, but it also
+has to be made absolute and cannot contain '/'. 
+
+I've implemented something similar in git-annex, which I hope will solve
+this. I have not tried it on windows yet so leaving the bug open for
+confirmation.
+"""]]
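
The conversion described in the comment above (prepend `\\?\`, with no `/` left in the path) can be sketched as a pure function on `String`. This is only an illustration under stated assumptions: `toNativeNamespace` is a hypothetical name, the input is assumed to be absolute already (the committed `convertToNativeNamespace` calls `absPath` in `IO` and works on `RawFilePath`), and network shares and DOS short paths are not handled, as in the commit.

```haskell
import Data.List (isPrefixOf)
import qualified System.FilePath.Windows as W

-- | Hypothetical pure sketch: put an already-absolute Windows path into
-- the \\?\ namespace, which is not subject to the legacy length limit.
toNativeNamespace :: FilePath -> FilePath
toNativeNamespace f
    -- Paths already in one of the native namespaces are left alone.
    | any (`isPrefixOf` f) [devNamespace, fileNamespace, ntDeviceNamespace] = f
    -- normalise turns '/' into '\\', since the \\?\ namespace does not
    -- interpret forward slashes.
    | otherwise = fileNamespace ++ W.normalise f
  where
    devNamespace = "\\\\.\\"
    fileNamespace = "\\\\?\\"
    ntDeviceNamespace = "\\Device\\"
```

For example, `toNativeNamespace "C:/foo/bar.lck"` yields the Haskell string `"\\\\?\\C:\\foo\\bar.lck"`, and applying the function to that result leaves it unchanged.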

rewrite prop_relPathDirToFileAbs_basics
This was not a good test: it broke the requirement that
relPathDirToFileAbs take absolute paths, and it failed when the two
input paths were, e.g., the same but differently normalized.
Replaced with some tests of the real basics of that function.
diff --git a/Utility/Path.hs b/Utility/Path.hs
index 6bd407e60..b1f7a5f4d 100644
--- a/Utility/Path.hs
+++ b/Utility/Path.hs
@@ -189,8 +189,7 @@ splitShortExtensions' maxextension = go []
 		(base, ext) = splitExtension f
 		len = B.length ext
 
-{- This requires the first path to be absolute, and the
- - second path cannot contain ../ or ./
+{- This requires both paths to be absolute and normalized.
  -
  - On Windows, if the paths are on different drives,
  - a relative path is not possible and the path is simply
diff --git a/Utility/Path/Tests.hs b/Utility/Path/Tests.hs
index ba0330c7f..2d9c6152a 100644
--- a/Utility/Path/Tests.hs
+++ b/Utility/Path/Tests.hs
@@ -35,16 +35,14 @@ prop_upFrom_basics tdir
 	p = fromRawFilePath <$> upFrom (toRawFilePath dir)
 	dir = fromTestableFilePath tdir
 
-prop_relPathDirToFileAbs_basics :: TestableFilePath -> TestableFilePath -> Bool
-prop_relPathDirToFileAbs_basics fromt tot
-	| from == to = null r
-	| otherwise = not (null r)
+prop_relPathDirToFileAbs_basics :: TestableFilePath -> Bool
+prop_relPathDirToFileAbs_basics pt = and
+	[ relPathDirToFileAbs p (p </> "bar") == "bar"
+	, relPathDirToFileAbs (p </> "bar") p == ".."
+	, relPathDirToFileAbs p p == ""
+	]
   where
-	from = fromTestableFilePath fromt
-	to = fromTestableFilePath tot
-	r = fromRawFilePath $ relPathDirToFileAbs
-		(toRawFilePath from)
-		(toRawFilePath to)
+	p = pathSeparator `B.cons` toRawFilePath (fromTestableFilePath pt)
 
 prop_relPathDirToFileAbs_regressionTest :: Bool
 prop_relPathDirToFileAbs_regressionTest = same_dir_shortcurcuits_at_difference
diff --git a/doc/bugs/prop__95__relPathDirToFileAbs__95__basics_fail_on_crippled___126__.mdwn b/doc/bugs/prop__95__relPathDirToFileAbs__95__basics_fail_on_crippled___126__.mdwn
index 9a8cba3a6..0282ec96d 100644
--- a/doc/bugs/prop__95__relPathDirToFileAbs__95__basics_fail_on_crippled___126__.mdwn
+++ b/doc/bugs/prop__95__relPathDirToFileAbs__95__basics_fail_on_crippled___126__.mdwn
@@ -14,3 +14,5 @@ Fresh build of 8.20201129+git100-g2d84bf992-1~ndall+1 when having HOME (base sys
 
 [[!meta author=yoh]]
 [[!tag projects/datalad]]
+
+> [[fixed|done]] --[[Joey]]
diff --git a/doc/bugs/prop__95__relPathDirToFileAbs__95__basics_fail_on_crippled___126__/comment_1_23bf4756af808d1b2cf89f7da119031f._comment b/doc/bugs/prop__95__relPathDirToFileAbs__95__basics_fail_on_crippled___126__/comment_1_23bf4756af808d1b2cf89f7da119031f._comment
new file mode 100644
index 000000000..b388e7415
--- /dev/null
+++ b/doc/bugs/prop__95__relPathDirToFileAbs__95__basics_fail_on_crippled___126__/comment_1_23bf4756af808d1b2cf89f7da119031f._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-01-13T17:01:07Z"
+ content="""
+This is a pure test, and the filesystem does not affect it in any way.
+
+The test is slightly broken, in that when two paths
+are the same except slightly differently normalized (eg, "A" vs "A/"),
+it fails. Really not a great test overall, rewriting.
+"""]]
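
The normalization mismatch described in the comment above can be illustrated with the `filepath` library alone. This sketch does not call git-annex's `relPathDirToFileAbs`; `sameString` and `samePath` are hypothetical names that stand in for the old property's `from == to` guard and for a comparison that ignores normalization differences.

```haskell
import System.FilePath.Posix (equalFilePath)

-- | The guard used by the old property: plain string equality.
sameString :: FilePath -> FilePath -> Bool
sameString = (==)

-- | Equality after normalisation, ignoring a trailing path separator.
samePath :: FilePath -> FilePath -> Bool
samePath = equalFilePath
```

For the reported counterexample, `sameString "A" "A/"` is `False` while `samePath "A" "A/"` is `True`. The old property therefore expected a non-empty relative path between two inputs that name the same directory.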

close
diff --git a/doc/bugs/Build__47__OSXMkLibs.hs_does_not_resolve___64__loader__95__path.mdwn b/doc/bugs/Build__47__OSXMkLibs.hs_does_not_resolve___64__loader__95__path.mdwn
index b09cabafd..60d406b83 100644
--- a/doc/bugs/Build__47__OSXMkLibs.hs_does_not_resolve___64__loader__95__path.mdwn
+++ b/doc/bugs/Build__47__OSXMkLibs.hs_does_not_resolve___64__loader__95__path.mdwn
@@ -11,3 +11,5 @@ I attempted to patch `Build/OSXMkLibs.hs` to handle this myself, but the file co
 
 [[!meta author=jwodder]]
 [[!tag projects/datalad]]
+
+> [[done]] per comments --[[Joey]]

comment
diff --git a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_3_f533ffdb2b37f98094fc7633cd686e5c._comment b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_3_f533ffdb2b37f98094fc7633cd686e5c._comment
new file mode 100644
index 000000000..f3eaee6d9
--- /dev/null
+++ b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_3_f533ffdb2b37f98094fc7633cd686e5c._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2021-01-13T16:57:10Z"
+ content="""
+Also it seems common for directories under /var/folders/xx/yyyyy/ to be
+mode 700, so it could somehow involve permissions.
+"""]]

improve comment
diff --git a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_2_020d7c6292c2a8238a5ec891fba9eddd._comment b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_2_020d7c6292c2a8238a5ec891fba9eddd._comment
index 4621b3fb0..83eacb503 100644
--- a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_2_020d7c6292c2a8238a5ec891fba9eddd._comment
+++ b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_2_020d7c6292c2a8238a5ec891fba9eddd._comment
@@ -5,24 +5,13 @@
  content="""
 Is the relative path actually valid? There may be some case involving
 symlinks in the parent directories where the relative path could be wrong.
+This part seems to suggest there are symlinks involved:
 
-I am not familiar with /private on OSX. Is the problem something like,
-the user can access /private/foo/bar but not /private/foo, and so a
-relative path traversing that can't find the ".." in that directory?
-
-If so, then a relative path from /private/foo/bar to /private/foo/baz would
-also fail and that case would not be helped by using an absolute
-path to a different top level directory.
+> some of these folders actually live in /private/var, despite being accessible via /var
 
 Of course, it's a bit surprising that a relative path is used when the path
 is into an entirely different top-level directory. But without
 understanding the problem I don't know if switching to using an absolute
 path in that case would only happen to fix this case of the problem and not
 the general case, whatever that is.
-
-It's worth noting that git fails if a parent directory does not have
-the x bit set:
-
-	joey@darkstar:/tmp/foo/bar/baz>git init
-	fatal: Invalid path '/tmp/foo/bar': Permission denied
 """]]

comment
diff --git a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_2_020d7c6292c2a8238a5ec891fba9eddd._comment b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_2_020d7c6292c2a8238a5ec891fba9eddd._comment
new file mode 100644
index 000000000..4621b3fb0
--- /dev/null
+++ b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_2_020d7c6292c2a8238a5ec891fba9eddd._comment
@@ -0,0 +1,28 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2021-01-13T16:32:06Z"
+ content="""
+Is the relative path actually valid? There may be some case involving
+symlinks in the parent directories where the relative path could be wrong.
+
+I am not familiar with /private on OSX. Is the problem something like,
+the user can access /private/foo/bar but not /private/foo, and so a
+relative path traversing that can't find the ".." in that directory?
+
+If so, then a relative path from /private/foo/bar to /private/foo/baz would
+also fail and that case would not be helped by using an absolute
+path to a different top level directory.
+
+Of course, it's a bit surprising that a relative path is used when the path
+is into an entirely different top-level directory. But without
+understanding the problem I don't know if switching to using an absolute
+path in that case would only happen to fix this case of the problem and not
+the general case, whatever that is.
+
+It's worth noting that git fails if a parent directory does not have
+the x bit set:
+
+	joey@darkstar:/tmp/foo/bar/baz>git init
+	fatal: Invalid path '/tmp/foo/bar': Permission denied
+"""]]

include bugs/todos tagged datalad
sometimes they are not signed with an author
diff --git a/doc/projects/datalad.mdwn b/doc/projects/datalad.mdwn
index ca5939359..badc916fa 100644
--- a/doc/projects/datalad.mdwn
+++ b/doc/projects/datalad.mdwn
@@ -2,14 +2,14 @@ TODOs for DataLad
 =================
 
 [[!inline pages="todo/* and !todo/done and !link(todo/done) and
-(author(yoh) or author(mih) or author(ben) or author(yarikoptic) or author(kyle))" sort=mtime feeds=no actions=yes archive=yes show=0 template=buglist]]
+(author(yoh) or author(mih) or author(ben) or author(yarikoptic) or author(kyle) or tagged(projects/datalad))" sort=mtime feeds=no actions=yes archive=yes show=0 template=buglist]]
 
 
 <details>
 <summary>Done</summary>
 
 [[!inline pages="todo/* and !todo/done and link(todo/done) and
-(author(yoh) or author(mih) or author(ben) or author(yarikoptic) or author(kyle))" feeds=no actions=yes archive=yes show=0 template=buglist]]
+(author(yoh) or author(mih) or author(ben) or author(yarikoptic) or author(kyle) or tagged(projects/datalad))" feeds=no actions=yes archive=yes show=0 template=buglist]]
 
 </details>
 
@@ -17,7 +17,7 @@ My bugs
 =======
 
 [[!inline pages="bugs/* and !bugs/done and !link(bugs/done) and
-(author(yoh) or author(mih) or author(ben) or author(yarikoptic) or author(kyle))" sort=mtime feeds=no actions=yes archive=yes show=0  template=buglist template=buglist]]
+(author(yoh) or author(mih) or author(ben) or author(yarikoptic) or author(kyle) or tagged(projects/datalad))" sort=mtime feeds=no actions=yes archive=yes show=0  template=buglist template=buglist]]
 
 
 
@@ -25,6 +25,6 @@ My bugs
 <summary>Fixed</summary>
 
 [[!inline pages="bugs/* and !bugs/done and link(bugs/done) and
-(author(yoh) or author(mih) or author(ben) or author(yarikoptic) or author(kyle))" feeds=no actions=yes archive=yes show=0  template=buglist template=buglist]]
+(author(yoh) or author(mih) or author(ben) or author(yarikoptic) or author(kyle) or tagged(projects/datalad))" feeds=no actions=yes archive=yes show=0  template=buglist template=buglist]]
 
 </details>

initial report about fresh test fail
diff --git a/doc/bugs/prop__95__relPathDirToFileAbs__95__basics_fail_on_crippled___126__.mdwn b/doc/bugs/prop__95__relPathDirToFileAbs__95__basics_fail_on_crippled___126__.mdwn
new file mode 100644
index 000000000..9a8cba3a6
--- /dev/null
+++ b/doc/bugs/prop__95__relPathDirToFileAbs__95__basics_fail_on_crippled___126__.mdwn
@@ -0,0 +1,16 @@
+### Please describe the problem.
+
+Fresh build of 8.20201129+git100-g2d84bf992-1~ndall+1 when having HOME (base system is Ubuntu) on a crippled FS leads to
+
+```
+2021-01-13T03:46:45.0619292Z     prop_relPathDirToFileAbs_basics:                      FAIL (0.01s)
+2021-01-13T03:46:45.0620328Z       *** Failed! Falsifiable (after 702 tests):
+2021-01-13T03:46:45.0621015Z       TestableFilePath {fromTestableFilePath = "A"}
+2021-01-13T03:46:45.0621819Z       TestableFilePath {fromTestableFilePath = "A/"}
+2021-01-13T03:46:45.0623170Z       Use --quickcheck-replay=895840 to reproduce.
+```
+
+[https://github.com/datalad/git-annex/runs/1692622842?check_suite_focus=true](https://github.com/datalad/git-annex/runs/1692622842?check_suite_focus=true)
+
+[[!meta author=yoh]]
+[[!tag projects/datalad]]

Added a comment
diff --git a/doc/tips/splitting_a_repository/comment_7_0d11c72712da0cf36b8cc91ba7501d52._comment b/doc/tips/splitting_a_repository/comment_7_0d11c72712da0cf36b8cc91ba7501d52._comment
new file mode 100644
index 000000000..a2099f11a
--- /dev/null
+++ b/doc/tips/splitting_a_repository/comment_7_0d11c72712da0cf36b8cc91ba7501d52._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 7"
+ date="2021-01-11T22:48:27Z"
+ content="""
+> ```
+> # Regenerate the git annex metadata
+> git annex fsck --fast
+> ```
+
+Exactly what does this command do? What metadata is regenerated and what does \"regenerated\" in this context mean?
+
+"""]]

Added a comment
diff --git a/doc/git-annex-uninit/comment_4_c3df10cdd19a15a80d3aa95373f1d071._comment b/doc/git-annex-uninit/comment_4_c3df10cdd19a15a80d3aa95373f1d071._comment
new file mode 100644
index 000000000..2373c1614
--- /dev/null
+++ b/doc/git-annex-uninit/comment_4_c3df10cdd19a15a80d3aa95373f1d071._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 4"
+ date="2021-01-11T22:42:27Z"
+ content="""
+actually --fast seems to have no effect on the speed. (each file seems to be getting hashed)
+"""]]

Added a comment: git-annex uninit --fast
diff --git a/doc/git-annex-uninit/comment_3_2cf58739c14b8aae6fc12fb1463a303f._comment b/doc/git-annex-uninit/comment_3_2cf58739c14b8aae6fc12fb1463a303f._comment
new file mode 100644
index 000000000..e3f9a10e4
--- /dev/null
+++ b/doc/git-annex-uninit/comment_3_2cf58739c14b8aae6fc12fb1463a303f._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="git-annex uninit --fast"
+ date="2021-01-11T22:41:04Z"
+ content="""
+are there any caveats to using --fast with this command? I assume it will just skip the hash validation.
+"""]]

removed myself as the author, damn cut/paste ;)
diff --git a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn
index d9f9703ae..0b64b968e 100644
--- a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn
+++ b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn
@@ -28,5 +28,4 @@ More system details are here https://ci.appveyor.com/project/mih/datalad/build/j
 git-annex is critical infrastructure for me. There is no day without it. Thx much!
 
 
-[[!meta author=yoh]]
 [[!tag projects/datalad]]

assigned to datalad project
diff --git a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn
index 19a873a3b..d9f9703ae 100644
--- a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn
+++ b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn
@@ -26,3 +26,7 @@ More system details are here https://ci.appveyor.com/project/mih/datalad/build/j
 ### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
 
 git-annex is critical infrastructure for me. There is no day without it. Thx much!
+
+
+[[!meta author=yoh]]
+[[!tag projects/datalad]]

fixed
diff --git a/doc/bugs/fresh_test_fails_for___34__trust__58____34___-_trust_failed_.mdwn b/doc/bugs/fresh_test_fails_for___34__trust__58____34___-_trust_failed_.mdwn
index fc21f4d47..add6c98df 100644
--- a/doc/bugs/fresh_test_fails_for___34__trust__58____34___-_trust_failed_.mdwn
+++ b/doc/bugs/fresh_test_fails_for___34__trust__58____34___-_trust_failed_.mdwn
@@ -13,3 +13,5 @@ with 6 of similar fails. See e.g. [https://github.com/datalad/git-annex/runs/166
 
 [[!meta author=yoh]]
 [[!tag projects/datalad]]
+
+> [[fixed|done]] earlier today --[[Joey]]

comment
diff --git a/doc/bugs/windows__58___commits_created_despite_alwayscommit__61__fals/comment_4_65647b8423127d450b0d1753767943d6._comment b/doc/bugs/windows__58___commits_created_despite_alwayscommit__61__fals/comment_4_65647b8423127d450b0d1753767943d6._comment
new file mode 100644
index 000000000..5733ae85c
--- /dev/null
+++ b/doc/bugs/windows__58___commits_created_despite_alwayscommit__61__fals/comment_4_65647b8423127d450b0d1753767943d6._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2021-01-11T18:47:54Z"
+ content="""
+The script only makes one commit of the metadata when I run it on linux.
+(Including in an adjusted unlocked branch like windows uses.)
+
+Since you're using `git -c`, git uses an env var to propagate that setting
+on to child processes. So it's possible something on windows prevents that
+working.
+"""]]

close
diff --git a/doc/todo/provide_windows_build_with_MagicMime.mdwn b/doc/todo/provide_windows_build_with_MagicMime.mdwn
index 4e8c82abd..79de2f64f 100644
--- a/doc/todo/provide_windows_build_with_MagicMime.mdwn
+++ b/doc/todo/provide_windows_build_with_MagicMime.mdwn
@@ -5,4 +5,5 @@ Without such functionality we cannot consistently (cross-platform) use git-annex
 
 [[!meta author=yoh]]
 [[!tag projects/datalad]]
-[[!tag unlikely]]
+
+> [[done]] --[[Joey]] 

idea
diff --git a/doc/todo/dynamic_stall_detection.mdwn b/doc/todo/dynamic_stall_detection.mdwn
new file mode 100644
index 000000000..7a1fc638e
--- /dev/null
+++ b/doc/todo/dynamic_stall_detection.mdwn
@@ -0,0 +1,21 @@
+annex.stalldetection lets remotes be configured with a minimum throughput
+to detect and retry stalls. But most users are not going to configure this. 
+Could something be done to dynamically detect a stall, without configuration?
+
+Eg, wait until data starts to flow, and then check if there's at least some
+data being sent each minute. If so, the progress display is being updated
+at least every minute. So then if 2 minutes go by without more data
+flowing, it's almost certainly stalled. And if the progress display is
+updated less frequently, see if it's updated every 2 minutes, etc. Although
+realistically, progress displays are updated every chunk, and there's
+typically more than 1 chunk per minute. So longer durations than 1 minute
+may be an unnecessary complication. And a couple of minutes to detect a
+stall is fine.
+
+It may still need a config to turn it on, because running
+transfers in separate processes can lead to more resource use, or even
+password prompting, which could be annoying to existing users. Also, if it
+gets it wrong and the remote does not support resuming transfers,
+defaulting to on could lead to bad waste of resources. It could
+detect stalls even when not turned on, but only display a message
+suggesting enabling the config. --[[Joey]]
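
The heuristic described in this todo item can be sketched as a small pure function. This is only an illustration under stated assumptions, not git-annex code: the names `verdict` and `maxGap`, the 60 and 120 second constants, and the representation of progress updates as an ascending list of timestamps are all hypothetical. The fallback to longer windows mentioned in the text is not modeled.

```haskell
module Main where

-- | Seconds since the transfer started (hypothetical representation).
type Seconds = Int

data Verdict = NotEnoughData | Flowing | Stalled
    deriving (Eq, Show)

-- | Largest gap between consecutive progress updates, if there are
-- at least two updates. The list is assumed to be in ascending order.
maxGap :: [Seconds] -> Maybe Seconds
maxGap us = case zipWith (-) (drop 1 us) us of
    [] -> Nothing
    gaps -> Just (maximum gaps)

-- | Once updates have been seen arriving at least every minute,
-- two minutes without a further update is treated as a stall.
verdict :: Seconds -> [Seconds] -> Verdict
verdict now us = case maxGap us of
    Just g | g <= 60 ->
        if now - last us >= 120 then Stalled else Flowing
    _ -> NotEnoughData

main :: IO ()
main = do
    print (verdict 300 [])            -- NotEnoughData
    print (verdict 100 [10, 40, 70])  -- Flowing
    print (verdict 200 [10, 40, 70])  -- Stalled
```

Returning `NotEnoughData` rather than guessing mirrors the caution in the text: when the progress display updates too rarely, the sketch declines to report a stall.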

Added a comment: Specific to repository location in /private
diff --git a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_1_88439da4b9df6e07aa0300700b66d910._comment b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_1_88439da4b9df6e07aa0300700b66d910._comment
new file mode 100644
index 000000000..324a2fbe7
--- /dev/null
+++ b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist/comment_1_88439da4b9df6e07aa0300700b66d910._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="michael.hanke@c60e12358aa3fc6060531bdead1f530ac4d582ec"
+ nickname="michael.hanke"
+ avatar="http://cdn.libravatar.org/avatar/f881df265a423e4f24eff27c623148fd"
+ subject="Specific to repository location in /private"
+ date="2021-01-10T16:10:42Z"
+ content="""
+The issue goes away, when I place the directory in which repositories are created into the user's HOME directory (instead of `/tmp`, which maps to some place under `/private`).
+"""]]

diff --git a/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn
new file mode 100644
index 000000000..19a873a3b
--- /dev/null
+++ b/doc/bugs/git-annex_get__58___createDirectory__58___does_not_exist.mdwn
@@ -0,0 +1,28 @@
+### Please describe the problem.
+
+A `git-annex get` fails with `createDirectory: does not exist (No such file or directory)` on MacOSX.
+
+### What steps will reproduce the problem?
+
+I can trigger the condition as part of a CI run of the DataLad test suite. Here is an example
+run that shows the failure: https://ci.appveyor.com/project/mih/datalad/build/job/k5u263619e6erk8t
+However, the exact conditions required to trigger the issue are not yet known (c.f. https://github.com/datalad/datalad/issues/5291).
+
+A protocol of an exploration of this issue with debug output is here: https://github.com/datalad/datalad/issues/5301#issuecomment-757467813
+
+In the end I can trigger the error with a `mkdir` performed manually in the shell using the path reported by git-annex (e.g. `../../../../../../../var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_tree_test_basic_scenariodi3ady04/.git/annex/`). But a `mkdir` is successful using a "normalized" variant of the path
+pointing to the same physical directory (e.g. `/var/folders/5s/g225f6nd6jl4g8tshbh1ltk40000gn/T/datalad_temp_tree_test_basic_scenariodi3ady04/.git/annex`).
+
+I do not understand enough of this platform to understand what is happening, but it seems that some of these folders actually live in `/private/var`, despite being accessible via `/var`, but I do not see how `mkdir` would error on a relative path and succeed on an absolute one.
+
+
+### What version of git-annex are you using? On what operating system?
+
+8.20201129 on darwin/19.6.0 10.15.7/x86_64
+
+More system details are here https://ci.appveyor.com/project/mih/datalad/build/job/k5u263619e6erk8t#L422
+
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+git-annex is critical infrastructure for me. There is no day without it. Thx much!

Added a comment
diff --git a/doc/forum/Backup_of_whole_Linux_system/comment_2_af49edff9174c2ee17391e78ac97d5f3._comment b/doc/forum/Backup_of_whole_Linux_system/comment_2_af49edff9174c2ee17391e78ac97d5f3._comment
new file mode 100644
index 000000000..e34b8665e
--- /dev/null
+++ b/doc/forum/Backup_of_whole_Linux_system/comment_2_af49edff9174c2ee17391e78ac97d5f3._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="comment 2"
+ date="2021-01-09T20:14:21Z"
+ content="""
+Yes, I know, but I was basically asking whether anyone has already developed an extension to store such metadata (per path, not content), or done something similar. Or maybe you could also interpret it as a feature request. Or basically I was just curious whether it makes sense to add some feature like that. I guess this would not be too complicated. So I'm mainly curious whether there are other problems which I don't see right now, or if this is a good or bad idea in general.
+
+Using the Borg special remote sounds like an interesting workaround.
+
+"""]]

Added a comment: checkout borg remote
diff --git a/doc/forum/Backup_of_whole_Linux_system/comment_1_bf48d93872bfadb5daaf02a10e699b79._comment b/doc/forum/Backup_of_whole_Linux_system/comment_1_bf48d93872bfadb5daaf02a10e699b79._comment
new file mode 100644
index 000000000..908599abd
--- /dev/null
+++ b/doc/forum/Backup_of_whole_Linux_system/comment_1_bf48d93872bfadb5daaf02a10e699b79._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="andrew"
+ avatar="http://cdn.libravatar.org/avatar/acc0ece1eedf07dd9631e7d7d343c435"
+ subject="checkout borg remote"
+ date="2021-01-09T15:25:05Z"
+ content="""
+git and git-annex do not store a lot of filesystem metadata. [git-annex metadata](https://git-annex.branchable.com/metadata/) actually stores metadata attached to the file key, which means the metadata is attached to the file content not the file path. Filesystems attach metadata to the file location (within a file tree), so if you have two files with the same content but different permissions in different folders you couldn't represent that information using git-annex metadata.
+
+You might check out the `git-annex` [borg special remote](https://git-annex.branchable.com/special_remotes/borg/). You could back up your whole linux system to a borg repository (using standard borg commands). Then you can add that borg repo as a git-annex borg special remote so you could access the files from a git-annex perspective.
+"""]]

report on fresh test fails
diff --git a/doc/bugs/fresh_test_fails_for___34__trust__58____34___-_trust_failed_.mdwn b/doc/bugs/fresh_test_fails_for___34__trust__58____34___-_trust_failed_.mdwn
new file mode 100644
index 000000000..fc21f4d47
--- /dev/null
+++ b/doc/bugs/fresh_test_fails_for___34__trust__58____34___-_trust_failed_.mdwn
@@ -0,0 +1,15 @@
+### Please describe the problem.
+
+Started to happen with today's build of annex for  8.20201129+git80-g1e65d1b9a-1~ndall+1 and was ok for  8.20201129+git73-g0e10402ef-1~ndall+1
+
+```
+    trust:                                                FAIL (0.28s)
+      ./Test/Framework.hs:57:
+      trust failed (transcript follows)
+      trust origin git-annex: Trusting a repository can lead to data loss.If you're sure you know what you're doing, use --force tomake this take effect.If you choose to do so, bear in mind that any time you dropcontent from origin, you will risk losing data.failedgit-annex: trust: 1 failed
+```
+
+with 6 of similar fails. See e.g. [https://github.com/datalad/git-annex/runs/1666536034?check_suite_focus=true](https://github.com/datalad/git-annex/runs/1666536034?check_suite_focus=true) for more detail
+
+[[!meta author=yoh]]
+[[!tag projects/datalad]]

merged fix from kyle
diff --git a/CHANGELOG b/CHANGELOG
index d7992f8c9..2d5a7f545 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -20,6 +20,9 @@ git-annex (8.20201130) UNRELEASED; urgency=medium
     be used.
   * Fix --time-limit, which got broken in several ways by some optimisations
     in version 8.20201007.
+  * When syncing changes back from an adjusted branch to the basis branch,
+    include deletions of submodules.
+    Thanks, Kyle Meyer for the patch.
 
  -- Joey Hess <id@joeyh.name>  Mon, 04 Jan 2021 12:52:41 -0400
 
diff --git a/doc/bugs/Submodule_deletion_not_synced_from_adjusted_branch.mdwn b/doc/bugs/Submodule_deletion_not_synced_from_adjusted_branch.mdwn
index 31dda72e2..4c47a35a4 100644
--- a/doc/bugs/Submodule_deletion_not_synced_from_adjusted_branch.mdwn
+++ b/doc/bugs/Submodule_deletion_not_synced_from_adjusted_branch.mdwn
@@ -121,3 +121,4 @@ base-commit: f3546976483aa4c29e1050081af6d5a03290e25b
 [[!meta author=kyle]]
 [[!tag projects/datalad]]
 
+> [[done]] thanks! --[[Joey]]
diff --git a/doc/bugs/Submodule_deletion_not_synced_from_adjusted_branch/comment_1_6473d7eceae5141e36c87156044b9d06._comment b/doc/bugs/Submodule_deletion_not_synced_from_adjusted_branch/comment_1_6473d7eceae5141e36c87156044b9d06._comment
new file mode 100644
index 000000000..951954194
--- /dev/null
+++ b/doc/bugs/Submodule_deletion_not_synced_from_adjusted_branch/comment_1_6473d7eceae5141e36c87156044b9d06._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-01-07T17:41:49Z"
+ content="""
+Clearly correct. The `_` pattern match hid this when
+[[!commit a13c0ce66c]] added TreeCommit.
+"""]]
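
How a `_` pattern can hide a missing case is shown below with a simplified stand-in type. The constructors here carry only a path, `OtherItem` is hypothetical, and a plain list replaces the set used in `Git/Tree.hs`; this is not the real definition.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}
module Main where

-- | Simplified stand-in for a tree item (assumption: the real type
-- has more fields and constructors).
data TreeItem
    = TreeBlob FilePath
    | TreeCommit FilePath  -- a submodule entry
    | OtherItem FilePath   -- hypothetical extra constructor
    deriving (Show)

-- | With a wildcard, TreeCommit silently falls through to False,
-- and the compiler has nothing to warn about.
removedWildcard :: [FilePath] -> TreeItem -> Bool
removedWildcard rs (TreeBlob f) = f `elem` rs
removedWildcard _ _ = False

-- | With every constructor listed, adding a new constructor later
-- produces an incomplete-patterns warning instead of a silent default.
removedExplicit :: [FilePath] -> TreeItem -> Bool
removedExplicit rs (TreeBlob f) = f `elem` rs
removedExplicit rs (TreeCommit f) = f `elem` rs
removedExplicit _ (OtherItem _) = False

main :: IO ()
main = do
    let rs = ["sub"]
    print (removedWildcard rs (TreeCommit "sub"))  -- False
    print (removedExplicit rs (TreeCommit "sub"))  -- True
```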

Behavior change: --trust-glacier option no longer overrides trust
Since that can lead to data loss, which should never be enabled by an
option other than --force.
This commit was sponsored by Jake Vosloo on Patreon.
diff --git a/CHANGELOG b/CHANGELOG
index 4c16316a2..d7992f8c9 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -10,9 +10,9 @@ git-annex (8.20201130) UNRELEASED; urgency=medium
     behavior, mincopies also needs to be set to 0.
   * Behavior change: git-annex trust now needs --force, since unconsidered
     use of trusted repositories can lead to data loss.
-  * Behavior change: --trust option no longer overrides trust, since
-    that can lead to data loss, which should never be enabled by an option
-    other than --force.
+  * Behavior change: --trust and --trust-glacier options no longer override
+    trust, since that can lead to data loss, which should never be enabled
+    by an option other than --force.
   * add: Significantly speed up adding lots of non-large files to git,
     by disabling the annex smudge filter when running git add.
   * add --force-small: Run git add rather than updating the index itself,
diff --git a/CmdLine/GitAnnex/Options.hs b/CmdLine/GitAnnex/Options.hs
index d015cdeb7..ecb206b2c 100644
--- a/CmdLine/GitAnnex/Options.hs
+++ b/CmdLine/GitAnnex/Options.hs
@@ -81,9 +81,9 @@ gitAnnexGlobalOptions = commonGlobalOptions ++
 		<> help "override default User-Agent"
 		<> hidden
 		)
-	, globalFlag (Annex.setFlag "trustglacier")
+	, globalFlag (toplevelWarning False "--trust-glacier no longer has any effect")
 		( long "trust-glacier"
-		<> help "Trust Amazon Glacier inventory"
+		<> help "deprecated, does not trust Amazon Glacier inventory"
 		<> hidden
 		)
 	, globalFlag (setdesktopnotify mkNotifyFinish)
diff --git a/Remote/Glacier.hs b/Remote/Glacier.hs
index 5fbebc8bd..5b6f1ce93 100644
--- a/Remote/Glacier.hs
+++ b/Remote/Glacier.hs
@@ -1,6 +1,6 @@
 {- Amazon Glacier remotes.
  -
- - Copyright 2012-2020 Joey Hess <id@joeyh.name>
+ - Copyright 2012-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -23,7 +23,6 @@ import Remote.Helper.ExportImport
 import qualified Remote.Helper.AWS as AWS
 import Creds
 import Utility.Metered
-import qualified Annex
 import Annex.UUID
 import Utility.Env
 import Types.ProposedAccepted
@@ -233,8 +232,7 @@ checkKey r k = do
 		s <- liftIO $ readProcessEnv "glacier" (toCommand params) (Just e)
 		let probablypresent = serializeKey k `elem` lines s
 		if probablypresent
-			then ifM (Annex.getFlag "trustglacier")
-				( return True, giveup untrusted )
+			then giveup untrusted
 			else return False
 
 	params = glacierParams (config r)
@@ -248,8 +246,6 @@ checkKey r k = do
 	untrusted = unlines
 			[ "Glacier's inventory says it has a copy."
 			, "However, the inventory could be out of date, if it was recently removed."
-			, "(Use --trust-glacier if you're sure it's still in Glacier.)"
-			, ""
 			]
 
 glacierAction :: Remote -> [CommandParam] -> Annex Bool
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index 7d4c9b34f..513c3755e 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -813,14 +813,9 @@ may not be explicitly listed on their individual man pages.
 
 * `--trust-glacier`
 
-  Amazon Glacier inventories take hours to retrieve, and may not represent
-  the current state of a repository. So git-annex does not trust that
-  files that the inventory claims are in Glacier are really there.
-  This switch can be used to allow it to trust the inventory.
-
-  Be careful using this, especially if you or someone else might have recently
-  removed a file from Glacier. If you try to drop the only other copy of the
-  file, and this switch is enabled, you could lose data!
+  This used to override trust settings for Glacier special remotes,
+  but now will not do so, because it could lead to data loss,
+  and data loss is now only enabled when using the `--force` option.
 
 * `--backend=name`
 
diff --git a/doc/tips/using_Amazon_Glacier.mdwn b/doc/tips/using_Amazon_Glacier.mdwn
index 402e50a9d..501f4d005 100644
--- a/doc/tips/using_Amazon_Glacier.mdwn
+++ b/doc/tips/using_Amazon_Glacier.mdwn
@@ -59,13 +59,12 @@ So, git-annex plays it safe, and avoids trusting the inventory:
 	drop important_file (gpg) (checking glacier...)
 	  Glacier's inventory says it has a copy.
 	  However, the inventory could be out of date, if it was recently removed.
-	  (Use --trust-glacier if you're sure it's still in Glacier.)
 	
 	(unsafe) 
 	  Could only verify the existence of 0 out of 1 necessary copies
 
-Like it says, you can use `--trust-glacier` if you're sure
-Glacier's inventory is correct and up-to-date.
+To avoid this problem, you can either use `git annex move` to move
+content to Glacier, or you can set the remote to be [[trusted]].
 
 A final potential gotcha with Glacier is that glacier-cli keeps a local
 mapping of file names to Glacier archives. If this cache is lost, or

Behavior change: --trust option no longer overrides trust
Since that can lead to data loss, which should never be enabled by an
option other than --force.
I suppose that using --trust was, in some situations, safer than --force,
because it doesn't entirely disable checking for data loss, but only
disables checking involving data that is on the specified repository.
But it seems better to be able to say that data loss only happens with
--force.
This commit was sponsored by Graham Spencer on Patreon.
diff --git a/CHANGELOG b/CHANGELOG
index ff2fa9adb..4c16316a2 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -10,6 +10,9 @@ git-annex (8.20201130) UNRELEASED; urgency=medium
     behavior, mincopies also needs to be set to 0.
   * Behavior change: git-annex trust now needs --force, since unconsidered
     use of trusted repositories can lead to data loss.
+  * Behavior change: --trust option no longer overrides trust, since
+    that can lead to data loss, which should never be enabled by an option
+    other than --force.
   * add: Significantly speed up adding lots of non-large files to git,
     by disabling the annex smudge filter when running git add.
   * add --force-small: Run git add rather than updating the index itself,
diff --git a/CmdLine/GitAnnex/Options.hs b/CmdLine/GitAnnex/Options.hs
index 568fc7de4..d015cdeb7 100644
--- a/CmdLine/GitAnnex/Options.hs
+++ b/CmdLine/GitAnnex/Options.hs
@@ -55,7 +55,7 @@ gitAnnexGlobalOptions = commonGlobalOptions ++
 		)
 	, globalSetter (Remote.forceTrust Trusted) $ strOption
 		( long "trust" <> metavar paramRemote
-		<> help "override trust setting"
+		<> help "deprecated, does not override trust setting"
 		<> hidden
 		<> completeRemotes
 		)
diff --git a/Remote.hs b/Remote.hs
index 1d6250f9e..588457f72 100644
--- a/Remote.hs
+++ b/Remote.hs
@@ -384,8 +384,10 @@ listRemoteNames remotes = intercalate ", " (map name remotes)
 forceTrust :: TrustLevel -> String -> Annex ()
 forceTrust level remotename = do
 	u <- nameToUUID remotename
-	Annex.changeState $ \s ->
-		s { Annex.forcetrust = M.insert u level (Annex.forcetrust s) }
+	if level >= Trusted
+		then toplevelWarning False "Ignoring request to trust repository, because that can lead to data loss."
+		else Annex.changeState $ \s ->
+			s { Annex.forcetrust = M.insert u level (Annex.forcetrust s) }
 
 {- Used to log a change in a remote's having a key. The change is logged
  - in the local repo, not on the remote. The process of transferring the
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index 7571fdfd3..7d4c9b34f 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -797,7 +797,6 @@ may not be explicitly listed on their individual man pages.
   Also, note that if the time limit prevents git-annex from doing all it
   was asked to, it will exit with a special code, 101.
 
-* `--trust=repository`
 * `--semitrust=repository`
 * `--untrust=repository`
 
@@ -806,6 +805,12 @@ may not be explicitly listed on their individual man pages.
   The repository should be specified using the name of a configured remote,
   or the UUID or description of a repository.
 
+* `--trust=repository`
+
+  This used to override trust settings for a repository, but now will
+  not do so, because trusting a repository can lead to data loss,
+  and data loss is now only enabled when using the `--force` option.
+
 * `--trust-glacier`
 
   Amazon Glacier inventories take hours to retrieve, and may not represent
diff --git a/doc/trust.mdwn b/doc/trust.mdwn
index 75781b7ac..b348a6dc0 100644
--- a/doc/trust.mdwn
+++ b/doc/trust.mdwn
@@ -44,10 +44,6 @@ information for a repository. For example, it may be an offline
 archival drive, from which you rarely or never remove content. Deciding
 when it makes sense to trust the tracking info is up to you.
 
-One way to handle this is just to use `--force` when a command cannot
-access a remote you trust. Or to use `--trust` to specify a repository to
-trust temporarily.
-
 To configure a repository as fully and permanently trusted,
 use the [[git-annex-trust]] command.
 

Behavior change: git-annex trust now needs --force
Since unconsidered use of trusted repositories can lead to data loss.
Trusted has always been this way, but it used to be acceptable for
git-annex to be set up so that data could be lost without using --force,
and most or all other ways that can happen have already been eliminated.
This commit was sponsored by Mark Reidenbach on Patreon.
diff --git a/CHANGELOG b/CHANGELOG
index 9006934c4..ff2fa9adb 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -8,6 +8,8 @@ git-annex (8.20201130) UNRELEASED; urgency=medium
   * Behavior change: When numcopies is set to 0, git-annex used to drop
     content without requiring any copies. Now to get that (highly unsafe)
     behavior, mincopies also needs to be set to 0.
+  * Behavior change: git-annex trust now needs --force, since unconsidered
+    use of trusted repositories can lead to data loss.
   * add: Significantly speed up adding lots of non-large files to git,
     by disabling the annex smudge filter when running git add.
   * add --force-small: Run git add rather than updating the index itself,
diff --git a/Command/Trust.hs b/Command/Trust.hs
index b056c566e..9eb538de8 100644
--- a/Command/Trust.hs
+++ b/Command/Trust.hs
@@ -1,6 +1,6 @@
 {- git-annex command
  -
- - Copyright 2010, 2014 Joey Hess <id@joeyh.name>
+ - Copyright 2010-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -9,6 +9,7 @@ module Command.Trust where
 
 import Command
 import qualified Remote
+import qualified Annex
 import Types.TrustLevel
 import Logs.Trust
 import Logs.Group
@@ -29,8 +30,11 @@ trustCommand c level = withWords (commandAction . start)
 		let name = unwords ws
 		u <- Remote.nameToUUID name
 		let si = SeekInput ws
-		starting c (ActionItemOther (Just name)) si (perform u)
-	perform uuid = do
+		starting c (ActionItemOther (Just name)) si (perform name u)
+	perform name uuid = do
+		when (level >= Trusted) $
+			unlessM (Annex.getState Annex.force) $
+				giveup $ trustedNeedsForce name
 		trustSet uuid level
 		when (level == DeadTrusted) $
 			groupSet uuid S.empty
@@ -38,3 +42,14 @@ trustCommand c level = withWords (commandAction . start)
 		when (l /= level) $
 			warning $ "This remote's trust level is overridden to " ++ showTrustLevel l ++ "."
 		next $ return True
+
+trustedNeedsForce :: String -> String
+trustedNeedsForce name = unlines
+	[ "Trusting a repository can lead to data loss."
+	, ""
+	, "If you're sure you know what you're doing, use --force to"
+	, "make this take effect."
+	, ""
+	, "If you choose to do so, bear in mind that any time you drop"
+	, "content from " ++ name ++ ", you will risk losing data."
+	]
diff --git a/doc/git-annex-trust.mdwn b/doc/git-annex-trust.mdwn
index d8adbba0b..f29ced7ed 100644
--- a/doc/git-annex-trust.mdwn
+++ b/doc/git-annex-trust.mdwn
@@ -14,6 +14,13 @@ content. Use with care.
 Repositories can be specified using their remote name, their
 description, or their UUID. To trust the current repository, use "here".
 
+Before trusting a repository, consider this scenario. Repository A
+is trusted and B is not; both contain the same content. `git-annex drop`
+is run on repository A, which checks that B still contains the content,
+and so the drop proceeds. Then `git-annex drop` is run on repository B,
+which trusts A to still contain the content, so the drop succeeds. Now
+the content has been lost.
+
 # SEE ALSO
 
 [[git-annex]](1)
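
The scenario added to the man page above can be replayed with a toy model. It rests on loud assumptions: one required copy, two repositories, and a `believed` list standing in for location tracking information that has not been updated between the two drops. None of these names come from git-annex.

```haskell
module Main where

data Repo = A | B deriving (Eq, Show)

data State = State
    { present :: [Repo]   -- repositories that really hold the content
    , believed :: [Repo]  -- what the (possibly stale) tracking info says
    , trusted :: [Repo]   -- repositories whose tracking info is trusted
    }

-- | A drop from r is allowed if some other repository either is
-- trusted and believed to hold the content, or is verified to hold it.
canDrop :: State -> Repo -> Bool
canDrop s r = any ok (filter (/= r) [A, B])
  where
    ok o
        | o `elem` trusted s = o `elem` believed s
        | otherwise = o `elem` present s

-- | Drop the content from r if the check passes. The tracking info
-- is deliberately left stale to model the scenario.
dropFrom :: Repo -> State -> State
dropFrom r s
    | r `elem` present s && canDrop s r =
        s { present = filter (/= r) (present s) }
    | otherwise = s

main :: IO ()
main = do
    let start = State { present = [A, B], believed = [A, B], trusted = [A] }
    -- Drop on A verifies B; drop on B then trusts stale info about A.
    print (present (dropFrom B (dropFrom A start)))  -- []
    -- Without trust, the second drop is refused.
    print (present (dropFrom B (dropFrom A start { trusted = [] })))  -- [B]
```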

diff --git a/doc/bugs/Submodule_deletion_not_synced_from_adjusted_branch.mdwn b/doc/bugs/Submodule_deletion_not_synced_from_adjusted_branch.mdwn
new file mode 100644
index 000000000..31dda72e2
--- /dev/null
+++ b/doc/bugs/Submodule_deletion_not_synced_from_adjusted_branch.mdwn
@@ -0,0 +1,123 @@
+This is conceptually a continuation of a previous issue:
+https://git-annex.branchable.com/bugs/Sync_of_adjusted_branch_does_not_propagate_changed_submodule_commit/
+
+That was about `git annex sync` propagating submodule modifications
+back to the main branch.  When looking into debugging a DataLad test
+failure involving adjusted branches, I realized that there is still an
+issue with submodule *deletions* not being propagated back.
+
+Here's a demo:
+
+[[!format sh """
+set -eu
+
+cd "$(mktemp -d "${TMPDIR:-/tmp}"/ga-sync-sub-del-XXXX)"
+
+git init
+git commit -mc0 --allow-empty
+git init sub
+git -C sub commit -m'c0 sub' --allow-empty
+git submodule add --name sub ./sub
+git commit -m'add sub'
+
+git annex init
+git annex adjust --unlock
+
+git rm sub
+git commit -m'remove sub'
+git annex sync
+
+git log --format="* %s %d" -p -2
+"""]]
+
+With a git-annex built from 8.20201129-72-gf35469764, the submodule
+deletion remains on the adjusted branch:
+
+```
+[...]
+* git-annex adjusted branch  (HEAD -> adjusted/master(unlocked))
+
+diff --git a/sub b/sub
+deleted file mode 160000
+index 95031de..0000000
+--- a/sub
++++ /dev/null
+@@ -1 +0,0 @@
+-Subproject commit 95031de9306714e09dc535246ef77b9e155999be
+* remove sub  (synced/master, master, refs/basis/adjusted/master(unlocked))
+
+diff --git a/.gitmodules b/.gitmodules
+index c489803..e69de29 100644
+--- a/.gitmodules
++++ b/.gitmodules
+@@ -1,3 +0,0 @@
+-[submodule "sub"]
+-	path = sub
+-	url = ./sub
+```
+
+I tried my hand at fixing this with the attached patch.  I'm not
+confident that it's the right fix, but with it the output of the above
+demo looks as I'd expect (and the git-annex tests still pass):
+
+```
+[...]
+* git-annex adjusted branch  (HEAD -> adjusted/master(unlocked))
+* remove sub  (synced/master, master, refs/basis/adjusted/master(unlocked))
+
+diff --git a/.gitmodules b/.gitmodules
+index c489803..e69de29 100644
+--- a/.gitmodules
++++ b/.gitmodules
+@@ -1,3 +0,0 @@
+-[submodule "sub"]
+-	path = sub
+-	url = ./sub
+diff --git a/sub b/sub
+deleted file mode 160000
+index 102197a..0000000
+--- a/sub
++++ /dev/null
+@@ -1 +0,0 @@
+-Subproject commit 102197a8b5692dc07dde7c1f6dd2f51c7ec6834e
+```
+
+Thanks for taking a look.
+
+[[!format patch """
+From e7abf01499fbd5593044889da529834e1b2999bc Mon Sep 17 00:00:00 2001
+From: Kyle Meyer <kyle@kyleam.com>
+Date: Wed, 6 Jan 2021 19:16:30 -0500
+Subject: [PATCH] adjustTree: Consider submodule deletions
+
+In addition to regular file deletions, the removefiles argument passed
+to adjustTree may contain removed submodules.  When making the new
+tree, filter these out in the same way that is done for regular files
+so that the deletion is propagated.
+---
+ Git/Tree.hs | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/Git/Tree.hs b/Git/Tree.hs
+index 491314fff..d5ac59ea7 100644
+--- a/Git/Tree.hs
++++ b/Git/Tree.hs
+@@ -259,6 +259,7 @@ adjustTree adjusttreeitem addtreeitems resolveaddconflict removefiles r repo =
+ 
+ 	removeset = S.fromList $ map (normalise . gitPath) removefiles
+ 	removed (TreeBlob f _ _) = S.member (normalise (gitPath f)) removeset
++	removed (TreeCommit f _ _) = S.member (normalise (gitPath f)) removeset
+ 	removed _ = False
+ 
+ 	addoldnew [] new = new
+
+base-commit: f3546976483aa4c29e1050081af6d5a03290e25b
+-- 
+2.30.0.284.gd98b1dd5ea
+
+"""]]
+
+
+[[!meta author=kyle]]
+[[!tag projects/datalad]]
+

improve wording
diff --git a/doc/git-annex-mincopies.mdwn b/doc/git-annex-mincopies.mdwn
index 9f2ad9251..57daf0932 100644
--- a/doc/git-annex-mincopies.mdwn
+++ b/doc/git-annex-mincopies.mdwn
@@ -18,16 +18,12 @@ by all clones of the repository. It can be overridden on a per-file basis
 by the annex.mincopies setting in .gitattributes files, or can be
 overridden temporarily with the --mincopies option.
 
-When git-annex is asked to drop a file, it first verifies that the
-number of copies can be satisfied among all the other
-repositories that have a copy of the file.
-
 This supplements the [[git-annex-numcopies]](1) setting. 
 In unusual situations, involving special remotes that do not support
 locking, and concurrent drops of the same content from multiple
 repositories, git-annex may violate the numcopies setting.
-In these unusual situations, git-annex ensures that
-the mincopies setting is not violated.
+In these unusual situations, git-annex ensures that the number of copies
+never goes below mincopies.
 
 # SEE ALSO
 

Merge branch 'master' into requirednumcopies
mincopies
This is conceptually very simple, just making a 1 that was hard coded be
exposed as a config option. The hard part was plumbing all that, and
dealing with complexities like reading it from git attributes at the
same time that numcopies is read.
Behavior change: When numcopies is set to 0, git-annex used to drop
content without requiring any copies. Now to get that (highly unsafe)
behavior, mincopies also needs to be set to 0. It seemed better to
remove that edge case, than complicate mincopies by ignoring it when
numcopies is 0.
This commit was sponsored by Denis Dzyubenko on Patreon.
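
The behavior change described in this commit message can be illustrated with a small sketch. It is a deliberately simplified model, not the actual drop logic: the `Verified` record and `canDrop` are hypothetical names, and the rule shown (enough verified copies for numcopies, enough locked copies for mincopies) is an assumption about how the two settings combine.

```haskell
newtype NumCopies = NumCopies Int deriving (Show, Eq)
newtype MinCopies = MinCopies Int deriving (Show, Eq)

-- Hypothetical summary of what was learned about other repositories.
data Verified = Verified
    { verifiedCopies :: Int  -- copies confirmed to exist elsewhere
    , lockedCopies :: Int    -- of those, copies locked in place
    }

-- Simplified rule (assumption): numcopies must be met by verified
-- copies, and mincopies must be met by locked copies.
canDrop :: NumCopies -> MinCopies -> Verified -> Bool
canDrop (NumCopies n) (MinCopies m) v =
    verifiedCopies v >= n && lockedCopies v >= m

main :: IO ()
main = do
    let nothingVerified = Verified { verifiedCopies = 0, lockedCopies = 0 }
    -- numcopies 0 alone no longer permits an unchecked drop.
    print (canDrop (NumCopies 0) (MinCopies 1) nothingVerified)  -- False
    -- Both settings must be 0 to get the old (highly unsafe) behavior.
    print (canDrop (NumCopies 0) (MinCopies 0) nothingVerified)  -- True
```
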
diff --git a/Annex.hs b/Annex.hs
index bbdee2f06..dc3466e3e 100644
--- a/Annex.hs
+++ b/Annex.hs
@@ -133,7 +133,9 @@ data AnnexState = AnnexState
 	, checkignorehandle :: Maybe (ResourcePool CheckIgnoreHandle)
 	, forcebackend :: Maybe String
 	, globalnumcopies :: Maybe NumCopies
+	, globalmincopies :: Maybe MinCopies
 	, forcenumcopies :: Maybe NumCopies
+	, forcemincopies :: Maybe MinCopies
 	, limit :: ExpandableMatcher Annex
 	, timelimit :: Maybe (Duration, POSIXTime)
 	, uuiddescmap :: Maybe UUIDDescMap
@@ -202,7 +204,9 @@ newState c r = do
 		, checkignorehandle = Nothing
 		, forcebackend = Nothing
 		, globalnumcopies = Nothing
+		, globalmincopies = Nothing
 		, forcenumcopies = Nothing
+		, forcemincopies = Nothing
 		, limit = BuildingMatcher []
 		, timelimit = Nothing
 		, uuiddescmap = Nothing
diff --git a/Annex/CheckAttr.hs b/Annex/CheckAttr.hs
index a532c76df..2de4fbc8a 100644
--- a/Annex/CheckAttr.hs
+++ b/Annex/CheckAttr.hs
@@ -1,4 +1,4 @@
-{- git check-attr interface, with handle automatically stored in the Annex monad
+{- git check-attr interface
  -
  - Copyright 2012-2020 Joey Hess <id@joeyh.name>
  -
@@ -7,6 +7,7 @@
 
 module Annex.CheckAttr (
 	checkAttr,
+	checkAttrs,
 	checkAttrStop,
 	mkConcurrentCheckAttrHandle,
 ) where
@@ -22,14 +23,19 @@ import Annex.Concurrent.Utility
 annexAttrs :: [Git.Attr]
 annexAttrs =
 	[ "annex.backend"
-	, "annex.numcopies"
 	, "annex.largefiles"
+	, "annex.numcopies"
+	, "annex.mincopies"
 	]
 
 checkAttr :: Git.Attr -> RawFilePath -> Annex String
 checkAttr attr file = withCheckAttrHandle $ \h -> 
 	liftIO $ Git.checkAttr h attr file
 
+checkAttrs :: [Git.Attr] -> RawFilePath -> Annex [String]
+checkAttrs attrs file = withCheckAttrHandle $ \h -> 
+	liftIO $ Git.checkAttrs h attrs file
+
 withCheckAttrHandle :: (Git.CheckAttrHandle -> Annex a) -> Annex a
 withCheckAttrHandle a = 
 	maybe mkpool go =<< Annex.getState Annex.checkattrhandle
diff --git a/Annex/Drop.hs b/Annex/Drop.hs
index 00ca4d88a..08654ff22 100644
--- a/Annex/Drop.hs
+++ b/Annex/Drop.hs
@@ -1,6 +1,6 @@
 {- dropping of unwanted content
  -
- - Copyright 2012-2014 Joey Hess <id@joeyh.name>
+ - Copyright 2012-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -63,23 +63,30 @@ handleDropsFrom locs rs reason fromhere key afile si preverified runner = do
   where
 	getcopies fs = do
 		(untrusted, have) <- trustPartition UnTrusted locs
-		numcopies <- if null fs
-			then getNumCopies
-			else maximum <$> mapM getFileNumCopies fs
-		return (NumCopies (length have), numcopies, S.fromList untrusted)
+		(numcopies, mincopies) <- if null fs
+			then (,) <$> getNumCopies <*> getMinCopies
+			else do
+				l <- mapM getFileNumMinCopies fs
+				return (maximum $ map fst l, maximum $ map snd l)
+		return (NumCopies (length have), numcopies, mincopies, S.fromList untrusted)
 
 	{- Check that we have enough copies still to drop the content.
 	 - When the remote being dropped from is untrusted, it was not
 	 - counted as a copy, so having only numcopies suffices. Otherwise,
-	 - we need more than numcopies to safely drop. -}
-	checkcopies (have, numcopies, _untrusted) Nothing = have > numcopies
-	checkcopies (have, numcopies, untrusted) (Just u)
+	 - we need more than numcopies to safely drop.
+	 -
+	 - This is not the final check that it's safe to drop, but it
+	 - avoids doing extra work to do that check later in cases where it
+	 - will surely fail.
+	 -}
+	checkcopies (have, numcopies, _mincopies, _untrusted) Nothing = have > numcopies
+	checkcopies (have, numcopies, _mincopies, untrusted) (Just u)
 		| S.member u untrusted = have >= numcopies
 		| otherwise = have > numcopies
 	
-	decrcopies (have, numcopies, untrusted) Nothing =
-		(NumCopies (fromNumCopies have - 1), numcopies, untrusted)
-	decrcopies v@(_have, _numcopies, untrusted) (Just u)
+	decrcopies (have, numcopies, mincopies, untrusted) Nothing =
+		(NumCopies (fromNumCopies have - 1), numcopies, mincopies, untrusted)
+	decrcopies v@(_have, _numcopies, _mincopies, untrusted) (Just u)
 		| S.member u untrusted = v
 		| otherwise = decrcopies v Nothing
 
@@ -105,8 +112,8 @@ handleDropsFrom locs rs reason fromhere key afile si preverified runner = do
 				, return n
 				)
 
-	dodrop n@(have, numcopies, _untrusted) u a = 
-		ifM (safely $ runner $ a numcopies)
+	dodrop n@(have, numcopies, mincopies, _untrusted) u a = 
+		ifM (safely $ runner $ a numcopies mincopies)
 			( do
 				liftIO $ debugM "drop" $ unwords
 					[ "dropped"
@@ -121,12 +128,12 @@ handleDropsFrom locs rs reason fromhere key afile si preverified runner = do
 			, return n
 			)
 
-	dropl fs n = checkdrop fs n Nothing $ \numcopies ->
+	dropl fs n = checkdrop fs n Nothing $ \numcopies mincopies ->
 		stopUnless (inAnnex key) $
-			Command.Drop.startLocal afile ai si numcopies key preverified
+			Command.Drop.startLocal afile ai si numcopies mincopies key preverified
 
-	dropr fs r n  = checkdrop fs n (Just $ Remote.uuid r) $ \numcopies ->
-		Command.Drop.startRemote afile ai si numcopies key r
+	dropr fs r n  = checkdrop fs n (Just $ Remote.uuid r) $ \numcopies mincopies ->
+		Command.Drop.startRemote afile ai si numcopies mincopies key r
 
 	ai = mkActionItem (key, afile)
 
diff --git a/Annex/NumCopies.hs b/Annex/NumCopies.hs
index 7b80e4c48..b76d71bda 100644
--- a/Annex/NumCopies.hs
+++ b/Annex/NumCopies.hs
@@ -1,6 +1,6 @@
 {- git-annex numcopies configuration and checking
  -
- - Copyright 2014-2015 Joey Hess <id@joeyh.name>
+ - Copyright 2014-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -10,10 +10,11 @@
 module Annex.NumCopies (
 	module Types.NumCopies,
 	module Logs.NumCopies,
-	getFileNumCopies,
-	getAssociatedFileNumCopies,
+	getFileNumMinCopies,
+	getAssociatedFileNumMinCopies,
 	getGlobalFileNumCopies,
 	getNumCopies,
+	getMinCopies,
 	deprecatedNumCopies,
 	defaultNumCopies,
 	numCopiesCheck,
@@ -41,8 +42,11 @@ import Data.Typeable
 defaultNumCopies :: NumCopies
 defaultNumCopies = NumCopies 1
 
-fromSources :: [Annex (Maybe NumCopies)] -> Annex NumCopies
-fromSources = fromMaybe defaultNumCopies <$$> getM id
+defaultMinCopies :: MinCopies
+defaultMinCopies = MinCopies 1
+
+fromSourcesOr :: v -> [Annex (Maybe v)] -> Annex v
+fromSourcesOr v = fromMaybe v <$$> getM id
 
 {- The git config annex.numcopies is deprecated. -}
 deprecatedNumCopies :: Annex (Maybe NumCopies)
@@ -52,41 +56,93 @@ deprecatedNumCopies = annexNumCopies <$> Annex.getGitConfig
 getForcedNumCopies :: Annex (Maybe NumCopies)
 getForcedNumCopies = Annex.getState Annex.forcenumcopies
 
-{- Numcopies value from any of the non-.gitattributes configuration
+{- Value forced on the command line by --mincopies. -}
+getForcedMinCopies :: Annex (Maybe MinCopies)
+getForcedMinCopies = Annex.getState Annex.forcemincopies
+
+{- NumCopies value from any of the non-.gitattributes configuration
  - sources. -}
 getNumCopies :: Annex NumCopies
-getNumCopies = fromSources
+getNumCopies = fromSourcesOr defaultNumCopies
 	[ getForcedNumCopies
 	, getGlobalNumCopies

(Diff truncated)
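
The `fromSourcesOr` helper introduced in `Annex/NumCopies.hs` above generalises the old `fromSources` over the value type: try each configuration source in order and fall back to a default. A self-contained sketch of that pattern follows. It runs in any monad rather than `Annex`, and `firstJustM` is a hypothetical stand-in for the `getM id` utility used in the real code.

```haskell
import Data.Maybe (fromMaybe)

-- Run each action in order and return the first Just result.
-- Later actions are not run once a value has been found.
firstJustM :: Monad m => [m (Maybe a)] -> m (Maybe a)
firstJustM [] = return Nothing
firstJustM (x:xs) = x >>= maybe (firstJustM xs) (return . Just)

-- Use the first source that is configured, else the given default.
fromSourcesOr :: Monad m => v -> [m (Maybe v)] -> m v
fromSourcesOr def sources = fromMaybe def <$> firstJustM sources

main :: IO ()
main = do
    -- Sources listed from highest to lowest priority.
    n <- fromSourcesOr 1 [return Nothing, return (Just 3), return (Just 5)]
    print (n :: Int)  -- 3, the first configured source wins
    d <- fromSourcesOr 1 [return Nothing, return Nothing]
    print (d :: Int)  -- 1, nothing configured so the default is used
```

Taking the default as an argument is what lets one helper serve both `NumCopies` and `MinCopies`, each with its own default value.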
diff --git a/doc/forum/Backup_of_whole_Linux_system.mdwn b/doc/forum/Backup_of_whole_Linux_system.mdwn
new file mode 100644
index 000000000..afafc55a0
--- /dev/null
+++ b/doc/forum/Backup_of_whole_Linux_system.mdwn
@@ -0,0 +1,16 @@
+If I wanted to back up my whole Linux system, here is what is unclear to me, or maybe missing from Git Annex:
+
+I'm not exactly sure about the best way to import the files. Should I just copy over all the files (e.g. using `cp -ax /* .`, or maybe `rsync -a /* .` or so) to the repo, and then use `git annex add`? (Let's skip `/dev` and maybe other special files for now.)
+
+Let's say I have now added all files to the annex.
+
+I would also want to store the owning user, group, and access attributes, and maybe other extended attributes (ACL, xattr).
+This is not yet covered by Git Annex (by default), right?
+This could be stored as annex metadata. Or maybe better in some other way, because this would be per file path, and not per file content.
+Has anyone already done something like this? It should not be too hard to do this, right?
+
+I'm also not exactly sure how Git Annex handles symlinks. Would it store the original symlink? Or would it not handle them at all, and just add them to Git itself?
+
+There will be some overlap of the files with other Git Annex repos (e.g. this could contain a subset of pictures I have elsewhere).
+I would want the annexed data files to be shared with my much bigger Annex repo which contains all my main data (pictures and lots of other stuff).
+This is actually the main reason why I consider using Git Annex as well for this purpose, and not some other solution, so that I don't need to store data separately, and get other benefits (to simplify my backups).

Added a comment
diff --git a/doc/forum/Import_existing_files/comment_6_be5331f1203b9610259a84c783b429b0._comment b/doc/forum/Import_existing_files/comment_6_be5331f1203b9610259a84c783b429b0._comment
new file mode 100644
index 000000000..1bbdc572a
--- /dev/null
+++ b/doc/forum/Import_existing_files/comment_6_be5331f1203b9610259a84c783b429b0._comment
@@ -0,0 +1,21 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="comment 6"
+ date="2021-01-06T14:59:00Z"
+ content="""
+Thanks for the answer.
+
+Maybe the forum post title here was chosen badly. It's not just about how to import existing files, but also/mainly I was trying to figure out whether Git Annex fits my needs (for a quite big archive of data). That's why I had all these questions. Also because this was not exactly clear to me after reading the docs.
+
+What's still not exactly clear to me is whether it would be a better idea to keep the Annex repo separate from the checked out files. I don't like all the symlinks too much, and a couple of applications behave strangely (because they follow the symlinks). I would prefer a solution where the (maybe bare) repo is separate from the checked out tree.
+
+That is why I asked about Git Worktree. But this is still not clear to me.
+
+I also read about [Git Annex Direct mode](https://git-annex.branchable.com/git-annex-direct/), which sounds like it is exactly that? But apparently this is not supported anymore? Why?
+
+I also read about the [Git Annex Assistant](https://git-annex.branchable.com/assistant/), which also sounds like this? But the docs are somewhat sparse, and it's not totally clear how this is done, and why the main Git Annex cannot do that, while Git Annex Assistant can do that. But discussions like [this](https://git-annex.branchable.com/design/assistant/desymlink/) sound very relevant (that describes many of the issues I have with symlinks). But I would not specifically want to do it all automatically (I think that's the purpose of the assistant) but do it explicitly (like adding files to the annex, i.e. using the commands `git annex add` etc).
+
+I think this should be possible without having to watch live for changes (via inotify or so) (where it anyway would be easy to miss changes). E.g. `git status` seems to be very fast at such checks. I'm not exactly sure how it does it but I assume it does some fast checks for changed mtime or maybe other things. Some filesystems might also provide other means. E.g. if the file was copied with a reflink (`cp --reflink`) (which anyway makes sense to not store the data twice, and which is much more efficient), it could check whether the reflink has changed. Or otherwise using hardlinks and locking the files (readonly), and unlocking them would make them writeable (that's ok if unlocked files are less efficient to handle, as this would be a rare action).
+
+"""]]

Added a comment
diff --git a/doc/forum/Reverse_index_key_to_list_of_file_paths/comment_2_716efb07d216d9a063bb9498c8e76ea1._comment b/doc/forum/Reverse_index_key_to_list_of_file_paths/comment_2_716efb07d216d9a063bb9498c8e76ea1._comment
new file mode 100644
index 000000000..45f73bfee
--- /dev/null
+++ b/doc/forum/Reverse_index_key_to_list_of_file_paths/comment_2_716efb07d216d9a063bb9498c8e76ea1._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="comment 2"
+ date="2021-01-06T11:15:42Z"
+ content="""
+Thanks for the answer.
+
+How does `git status` check for changes? I feel it is quite fast at that.
+
+So you could update the persistent database by post-commit hook, and have a temporary virtual overlay when used which takes current staged changes also into account. And maybe you can also add a `--fast` option, which would skip this part, because the user probably knows when to expect staged changes.
+
+I think this would be pretty useful. This would also change somewhat the whole way how I would use the Annex. I expect that I have this case quite often, that some file content is referenced from multiple file paths.
+
+"""]]

branch
diff --git a/doc/todo/lockContent_for_special_remotes.mdwn b/doc/todo/lockContent_for_special_remotes.mdwn
index 1c2c1545c..59dba9e84 100644
--- a/doc/todo/lockContent_for_special_remotes.mdwn
+++ b/doc/todo/lockContent_for_special_remotes.mdwn
@@ -60,3 +60,5 @@ by adding a requirednumcopies. (Analagous to required content configs.)
 Defaulting to 1 as now, but if the user wants to they can set it higher,
 perhaps as high as their numcopies (or even just set it to 1000 and make
 it be treated the same value as numcopies when it's >= numcopies.)
+
+> Started on requirednumcopies branch --[[Joey]]

docs for requirednumcopies
Not implemented yet.
diff --git a/CHANGELOG b/CHANGELOG
index 80859fac9..ca841d47d 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,5 +1,11 @@
 git-annex (8.20201130) UNRELEASED; urgency=medium
 
+  * Added requirednumcopies configuration. This is like numcopies, but is
+    enforced even more strictly. While numcopies can be violated in
+    concurrent drop situations involving special remotes that do not
+    support locking, requirednumcopies cannot be. The default value is 1,
+    which is not a behavior change, but now it can be set to higher
+    values if desired.
   * add: Significantly speed up adding lots of non-large files to git,
     by disabling the annex smudge filter when running git add.
   * add --force-small: Run git add rather than updating the index itself,
diff --git a/doc/copies.mdwn b/doc/copies.mdwn
index 3e0ebc6c6..73482da7b 100644
--- a/doc/copies.mdwn
+++ b/doc/copies.mdwn
@@ -1,35 +1,49 @@
 Annexed data is stored inside  your git repository's `.git/annex` directory.
 Some [[special_remotes]] can store annexed data elsewhere.
 
-It's important that data not get lost by an ill-considered `git annex drop`
-command.  So, git-annex can be configured to try
-to keep N copies of a file's content available across all repositories. 
-(Although [[untrusted_repositories|trust]] don't count toward this total.)
-
-By default, N is 1; it is configured by running `git annex numcopies N`.
-This default can be overridden on a per-file-type basis by the annex.numcopies
-setting in `.gitattributes` files. The --numcopies switch allows
-temporarily using a different value.
-
-`git annex drop` attempts to check with other git remotes, to check that N
-copies of the file exist. If enough repositories cannot be verified to have
-it, it will retain the file content to avoid data loss. Note that
-[[trusted_repositories|trust]] are not explicitly checked.
-
-For example, consider three repositories: Server, Laptop, and USB. Both Server
-and USB have a copy of a file, and N=1. If on Laptop, you `git annex get
-$file`, this will transfer it from either Server or USB (depending on which
-is available), and there are now 3 copies of the file.
-
-Suppose you want to free up space on Laptop again, and you `git annex drop` the file
-there. If USB is connected, or Server can be contacted, git-annex can check
-that it still has a copy of the file, and the content is removed from
-Laptop. But if USB is currently disconnected, and Server also cannot be
-contacted, it can't verify that it is safe to drop the file, and will
-refuse to do so.
+It's important that data not get lost by an ill-considered `git-annex drop`
+command.  So, git-annex can be configured to try to keep a number of copies
+of a file's content available across all repositories. 
+
+By default, it keeps 1 copy; this is configured by running `git-annex
+numcopies N`, or can be overridden on a per-file-type basis by the
+annex.numcopies setting in `.gitattributes` files. The --numcopies switch
+allows temporarily using a different value.
+
+When dropping content, git-annex checks with remotes to make sure
+that enough copies exist. If enough repositories cannot be verified
+to have it, it will retain the file content to avoid data loss.
 
-With N=2, in order to drop the file content from Laptop, it would need access
-to both USB and Server.
+When it can, git-annex locks enough copies on other repositories, to allow
+it to safely drop a copy without any possibility that numcopies will be
+violated. There are some exceptions, including special remotes not
+supporting locking, and [[trusted repositories|trust]] that are not
+accessible, where locking is not done. 
 
-For more complicated requirements about which repositories contain which
+If such a repository is being relied on to contain a copy and drops it at
+the wrong time, the configured numcopies setting can be violated. To avoid
+losing the last copy in such an unusual situation, git-annex requires that
+at least 1 copy is locked in place when dropping content. If 1 does not
+seem like enough, you can override this default by running `git-annex
+requirednumcopies` or setting annex.requirednumcopies in `.gitattributes`
+files.
+
+To express more detailed requirements about which repositories contain which
 content, see [[required_content]].
+
+## example
+
+For example, consider three repositories: Server, Laptop, and USB. Both
+Server and USB have a copy of a file, and numcopies is 1. If on Laptop, you
+`git-annex get $file`, this will transfer it from either Server or USB
+(depending on which is available), and there are now 3 copies of the file.
+
+Suppose you want to free up space on Laptop again, and you `git-annex drop`
+the file there. If USB is connected, or Server can be contacted, git-annex
+can check that it still has a copy of the file, and the content is removed
+from Laptop. But if USB is currently disconnected, and Server also cannot
+be contacted, it can't verify that it is safe to drop the file, and will
+refuse to do so.
+
+With numcopies of 2, in order to drop the file content from Laptop, it
+would need access to both USB and Server.
diff --git a/doc/git-annex-numcopies.mdwn b/doc/git-annex-numcopies.mdwn
index 96701e1ef..15ddb06fb 100644
--- a/doc/git-annex-numcopies.mdwn
+++ b/doc/git-annex-numcopies.mdwn
@@ -14,18 +14,25 @@ repositories. The default is 1.
 Run without a number to get the current value.
 
 This configuration is stored in the git-annex branch, so it will be seen
-by all clones of the repository.
+by all clones of the repository. It can be overridden on a per-file basis
+by the annex.numcopies setting in .gitattributes files, or can be
+overridden temporarily with the --numcopies option.
 
 When git-annex is asked to drop a file, it first verifies that the
-required number of copies can be satisfied among all the other
+number of copies can be satisfied among all the other
 repositories that have a copy of the file.
-  
-This can be overridden on a per-file basis by the annex.numcopies setting
-in .gitattributes files.
+
+In situations involving trusted repositories or special remotes that
+cannot lock content in place, the numcopies setting may be violated
+when the same file is being dropped at the same time from multiple
+repositories. In these unusual situations, git-annex ensures that
+the requirednumcopies setting (default 1) is not violated. See
+[[git-annex-requirednumcopies]](1) for more about this setting.
 
 # SEE ALSO
 
 [[git-annex]](1)
+[[git-annex-requirednumcopies]](1)
 
 # AUTHOR
 
diff --git a/doc/git-annex-requirednumcopies.mdwn b/doc/git-annex-requirednumcopies.mdwn
new file mode 100644
index 000000000..7e25b7ec7
--- /dev/null
+++ b/doc/git-annex-requirednumcopies.mdwn
@@ -0,0 +1,43 @@
+# NAME
+
+git-annex requirednumcopies - configure required number of copies
+
+# SYNOPSIS
+
+git annex requirednumcopies `N`
+
+# DESCRIPTION
+
+Tells git-annex how many copies it is required to preserve of files, over all
+repositories. The default is 1.
+
+Run without a number to get the current value.
+
+This configuration is stored in the git-annex branch, so it will be seen
+by all clones of the repository. It can be overridden on a per-file basis
+by the annex.requirednumcopies setting in .gitattributes files, or can be
+overridden temporarily with the --requirednumcopies option.
+
+When git-annex is asked to drop a file, it makes sure
+that the required number of copies will still exist in other
+repositories, by locking the content in them, preventing it from
+being dropped.
+
+This supplements the [[git-annex-numcopies]](1) setting. git-annex
+checks that numcopies is met before dropping. But in situations
+involving trusted repositories or special remotes that
+cannot lock content in place, the numcopies setting may be violated
+when the same file is being dropped at the same time from multiple
+repositories. In these unusual situations, git-annex ensures that
+the requirednumcopies setting is not violated.
+
+# SEE ALSO
+
+[[git-annex]](1)
+[[git-annex-numcopies]](1)
+
+# AUTHOR
+
+Joey Hess <id@joeyh.name>
+
+Warning: Automatically converted into a man page by mdwn2man. Edit with care.
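
The Server, Laptop, and USB example from `doc/copies.mdwn` above can be worked through as a small sketch. This only models reachability, under the assumption that a copy counts when its repository both holds the content and can be contacted. `canDropFrom` and the `Repo` type are hypothetical names, and trust levels and locking are left out.

```haskell
import qualified Data.Set as S

type Repo = String

-- Dropping from `here` is allowed only when at least numcopies *other*
-- repositories that hold the content can be reached to verify it.
canDropFrom :: Int -> S.Set Repo -> S.Set Repo -> Repo -> Bool
canDropFrom numcopies holders reachable here =
    S.size verifiable >= numcopies
  where
    verifiable = S.delete here (S.intersection holders reachable)

main :: IO ()
main = do
    let holders = S.fromList ["Server", "Laptop", "USB"]
    -- numcopies 1: one reachable copy elsewhere is enough.
    print (canDropFrom 1 holders (S.fromList ["Laptop", "USB"]) "Laptop")  -- True
    -- numcopies 1: USB disconnected and Server unreachable, so refuse.
    print (canDropFrom 1 holders (S.fromList ["Laptop"]) "Laptop")  -- False
    -- numcopies 2: both USB and Server must be reachable.
    print (canDropFrom 2 holders (S.fromList ["Laptop", "USB"]) "Laptop")  -- False
    print (canDropFrom 2 holders (S.fromList ["Laptop", "USB", "Server"]) "Laptop")  -- True
```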

the author of this forum post deleted it, so remove comments
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_10_42c2754ddb58ab47a442909f9be494a7._comment b/doc/forum/how_to_get_into_git_annex.../comment_10_42c2754ddb58ab47a442909f9be494a7._comment
deleted file mode 100644
index 62a6be917..000000000
--- a/doc/forum/how_to_get_into_git_annex.../comment_10_42c2754ddb58ab47a442909f9be494a7._comment
+++ /dev/null
@@ -1,12 +0,0 @@
-[[!comment format=mdwn
- username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
- nickname="eric.w"
- avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
- subject="comment 10"
- date="2020-12-31T22:45:03Z"
- content="""
-You will be unsurprised to hear that what you suggested worked. not sure what helped other than me cleaning up my working tree and doing a solid git annex add .; git annex sync. I also removed annex.thin since its evidently not helping me. 
-thanks a ton. what got me here was me basically running through the \"splitting a repo\" process of making a new git repo, doing a cp -rl ./.git/annex/objects to the new repo and then running various tests on it. I just want to make sure I don't step on my own feet here. 
-
-thanks a ton. 
-"""]]
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_11_ad9858a76d5a66b2619cec3e3c17abd2._comment b/doc/forum/how_to_get_into_git_annex.../comment_11_ad9858a76d5a66b2619cec3e3c17abd2._comment
deleted file mode 100644
index f6c587003..000000000
--- a/doc/forum/how_to_get_into_git_annex.../comment_11_ad9858a76d5a66b2619cec3e3c17abd2._comment
+++ /dev/null
@@ -1,12 +0,0 @@
-[[!comment format=mdwn
- username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
- nickname="eric.w"
- avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
- subject="comment 11"
- date="2021-01-01T04:08:43Z"
- content="""
-couple of final notes:
-
-* ```--reflog=always``` isn't a cp option, its reflink, and I am a moron. 
-* that same options on btrfs is the bomb. all of the advantages of hardlinks without the disadvantages. 
-"""]]
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_12_53a210eace9359cab2a28510d373bec4._comment b/doc/forum/how_to_get_into_git_annex.../comment_12_53a210eace9359cab2a28510d373bec4._comment
deleted file mode 100644
index 038dd2166..000000000
--- a/doc/forum/how_to_get_into_git_annex.../comment_12_53a210eace9359cab2a28510d373bec4._comment
+++ /dev/null
@@ -1,53 +0,0 @@
-[[!comment format=mdwn
- username="joey"
- subject="""parent post is rife with incorrect and misleading statements"""
- date="2021-01-04T20:03:19Z"
- content="""
-AFAIK there are no circumstances where git-annex will lose data unless you
-use the --force flag, which is clearly documented as allowing data loss.
-If you have a case where it does, *file a bug report*.
-
-git-annex uninit does *not* delete .git/annex/objects if there are
-any objects in there that are not used by files in the repo, so it can't
-have behaved as you claim it did, at least as far as I can tell. Here is an
-example of it not deleting data, in a situation like the one you claimed 
-caused data loss:
-
-	joey@darkstar:/tmp/demo>git annex add foo
-	joey@darkstar:/tmp/demo>git commit -m add
-	joey@darkstar:/tmp/demo>git rm foo
-	joey@darkstar:/tmp/demo>git annex uninit
-	git-annex: Not fully uninitialized
-	Some annexed data is still left in .git/annex/objects/
-	This may include deleted files, or old versions of modified files.
-	
-	If you don't care about preserving the data, just delete the
-	directory.
-	
-	Or, you can move it to another location, in case it turns out
-	something in there is important.
-	
-	Or, you can run `git annex unused` followed by `git annex dropunused`
-	to remove data that is not used by any tag or branch, which might
-	take care of all the data.
-	
-	Then run `git annex uninit` again to finish.
-	joey@darkstar:/tmp/demo>find .git/annex/objects/ -type f
-	.git/annex/objects/Zj/zZ/SHA256E-s30--9d9f1f02932124b06e803a4899068dbc1df00d126447d226bb312861e0b7de83/SHA256E-s30--9d9f1f02932124b06e803a4899068dbc1df00d126447d226bb312861e0b7de83
-
-I document changes before I implement them, and this website is updated
-on every push of changes to git-annex. While some tip somewhere may be
-out of date, the one you mentioned does not appear to be. It looks
-like you misunderstood something about it.
-
-A drive in a safe full of files with and without git-annex has identical
-durability. Using git-annex does *not* cause file to be less accessible or
-add significant roadblocks to accessing them no matter what problems might
-befall that drive. Worst case, fsck of a corrupted filesystem on that drive
-will rescue files to lost+found with the git-annex key name and not the
-original filename. This is easy to recover from though, using `git-annex
-reinject --known`. Which also, conveniently, works if fsck on a badly
-damaged drive restores the file to lost+found using a bare inode number.
-Which, if you're not using git-annex, puts you in a world of hurt to
-determine what file that originally was.
-"""]]
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_13_f96ca3dab408c0aeb92910abf0ab5c49._comment b/doc/forum/how_to_get_into_git_annex.../comment_13_f96ca3dab408c0aeb92910abf0ab5c49._comment
deleted file mode 100644
index 02b3d0c9c..000000000
--- a/doc/forum/how_to_get_into_git_annex.../comment_13_f96ca3dab408c0aeb92910abf0ab5c49._comment
+++ /dev/null
@@ -1,13 +0,0 @@
-[[!comment format=mdwn
- username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
- nickname="eric.w"
- avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
- subject="comment 13"
- date="2021-01-05T05:55:13Z"
- content="""
-Interesting. well I'll have to test that again. thanks for taking the time to read this & respond and thank you for making an awesome piece of software.
-
-
-
-
-"""]]
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_1_5233b2a59bcf2edefdd49dc6f60542c5._comment b/doc/forum/how_to_get_into_git_annex.../comment_1_5233b2a59bcf2edefdd49dc6f60542c5._comment
deleted file mode 100644
index f49a334f9..000000000
--- a/doc/forum/how_to_get_into_git_annex.../comment_1_5233b2a59bcf2edefdd49dc6f60542c5._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="Lukey"
- avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
- subject="comment 1"
- date="2020-12-31T17:32:46Z"
- content="""
-I've now seen multiple people claiming that the documentation is out of date, but couldn't confirm it myself. Can you provide an example?
-"""]]
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_2_cde005ffa911f72e6d37ec8ad9bf76f8._comment b/doc/forum/how_to_get_into_git_annex.../comment_2_cde005ffa911f72e6d37ec8ad9bf76f8._comment
deleted file mode 100644
index 89470ab02..000000000
--- a/doc/forum/how_to_get_into_git_annex.../comment_2_cde005ffa911f72e6d37ec8ad9bf76f8._comment
+++ /dev/null
@@ -1,14 +0,0 @@
-[[!comment format=mdwn
- username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
- nickname="eric.w"
- avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
- subject="comment 2"
- date="2020-12-31T18:55:29Z"
- content="""
-the most recent example I've run across is the use of 
-git.annex=thin 
-in the link here: https://git-annex.branchable.com/tips/unlocked_files/
-it didn't result in a hardlink being made of the content for either git annex unlock or git annex unannex
-instead I ended up getting the same functionality by use --fast.
-
-"""]]
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_3_20d36c31731303291e4e43af1241692f._comment b/doc/forum/how_to_get_into_git_annex.../comment_3_20d36c31731303291e4e43af1241692f._comment
deleted file mode 100644
index 711ba5947..000000000
--- a/doc/forum/how_to_get_into_git_annex.../comment_3_20d36c31731303291e4e43af1241692f._comment
+++ /dev/null
@@ -1,9 +0,0 @@
-[[!comment format=mdwn
- username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
- nickname="eric.w"
- avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
- subject="comment 3"
- date="2020-12-31T19:34:56Z"
- content="""
-right now I am driving myself crazy trying to understand why I have objects that *nothing is pointing to*, yet git annex unused fails to report them. these objects report 1 hardlink and they are from a migrated backend. I'll try git annex forget, but I really don't understand what is keeping these objects from being reported as unused. 
-"""]]
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_5_05f1010f442de6e6e34c1f8ea5c250c7._comment b/doc/forum/how_to_get_into_git_annex.../comment_5_05f1010f442de6e6e34c1f8ea5c250c7._comment
deleted file mode 100644
index 53e42fa5e..000000000
--- a/doc/forum/how_to_get_into_git_annex.../comment_5_05f1010f442de6e6e34c1f8ea5c250c7._comment
+++ /dev/null
@@ -1,9 +0,0 @@
-[[!comment format=mdwn
- username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
- nickname="eric.w"
- avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
- subject="comment 5"
- date="2020-12-31T20:34:55Z"
- content="""
-I am digging into this further, and it looks like git annex uses cp --reflink=auto, confirmed with filefrag -v, but even if the object from the old backend isn't taking up space, it's still frustrating that I can't figure out why git annex is keeping old files around and not reporting them via git annex unused. 
-"""]]
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_5_6ed2e755fb7f738d75468f1c7c4d5555._comment b/doc/forum/how_to_get_into_git_annex.../comment_5_6ed2e755fb7f738d75468f1c7c4d5555._comment
deleted file mode 100644
index 4f14ac711..000000000
--- a/doc/forum/how_to_get_into_git_annex.../comment_5_6ed2e755fb7f738d75468f1c7c4d5555._comment
+++ /dev/null
@@ -1,15 +0,0 @@
-[[!comment format=mdwn
- username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
- nickname="eric.w"
- avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
- subject="comment 5"
- date="2020-12-31T19:40:22Z"
- content="""
-https://git-annex.branchable.com/bugs/migrated_files_not_showing_up_in_unused_list/
-
-according to the link above it should be hardlinked to the new key for the new backend, but this isn't the case. this is on btrfs btw. 
-this is a test repo with no remotes as another data point. 
-also I migrated from SHA256E to SHA256. 
-
-I tried git annex forget --force; git annex sync; git annex unused, still it isn't showing the objects as unused.
-"""]]
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_6_142a83bd4f5b1729635bf891f75bd94b._comment b/doc/forum/how_to_get_into_git_annex.../comment_6_142a83bd4f5b1729635bf891f75bd94b._comment

(Diff truncated)
update
diff --git a/doc/thanks/list b/doc/thanks/list
index 2299f8b88..d98b46001 100644
--- a/doc/thanks/list
+++ b/doc/thanks/list
@@ -106,3 +106,7 @@ Noam Kremen,
 Pluralist Extremist, 
 Shaddy Baddah, 
 Andreas Skielboe, 
+Martin Heistermann, 
+Kevin Mueller, 
+Jarkko Kniivilä, 
+Alex, 

removed
diff --git a/doc/forum/how_to_get_into_git_annex....mdwn b/doc/forum/how_to_get_into_git_annex....mdwn
deleted file mode 100644
index 84660c6aa..000000000
--- a/doc/forum/how_to_get_into_git_annex....mdwn
+++ /dev/null
@@ -1,11 +0,0 @@
-So... I've been flirting with using git annex for literal years now, and if for some reason you are wanting to use it too here are some tips:
-
-* keep backups. seriously. just do it. it's possible to lose data, even though git annex is designed to avoid eating your data it will do it under certain circumstances. you aren't lucky enough to avoid it. trust me. 
-* make a big fat git annex with too many files in it, and kick the tires, hard. run all the commands and try to break it, see what it does under certain circumstances before you run those same commands on your beloved data. (the documentation isn't always up to date, sometimes the options (which are complex) operate differently than the website says and differently than you expect, this is most likely due to code changes that haven't propagated to the website.)
-* git annex bogs down fast when you are dealing with a large number of objects, there are ways to get that under control, but nothing is going to make managing an annex with millions of files "fast" for many operations.
-* now that you are a pro at git annex, STILL *keep* backups. git annex isn't a backup. it just isn't. nothing beats a simple usb hard drive stuffed in your safe with all your files on it and without the complexity that is git annex in the way.
-
-
-
-anyways, git annex (and git) are pretty much game changing cool, just recognize that they bring complexity and complexity brings unpredictability, so back your data up.
-

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_13_f96ca3dab408c0aeb92910abf0ab5c49._comment b/doc/forum/how_to_get_into_git_annex.../comment_13_f96ca3dab408c0aeb92910abf0ab5c49._comment
new file mode 100644
index 000000000..02b3d0c9c
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_13_f96ca3dab408c0aeb92910abf0ab5c49._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 13"
+ date="2021-01-05T05:55:13Z"
+ content="""
+Interesting. well I'll have to test that again. thanks for taking the time to read this & respond and thank you for making an awesome piece of software.
+
+
+
+
+"""]]

promote forum post to bug report
diff --git a/doc/bugs/adjustedbranchrefresh_ignored_by_git_annex_add.mdwn b/doc/bugs/adjustedbranchrefresh_ignored_by_git_annex_add.mdwn
new file mode 100644
index 000000000..025cf887b
--- /dev/null
+++ b/doc/bugs/adjustedbranchrefresh_ignored_by_git_annex_add.mdwn
@@ -0,0 +1,52 @@
+On some occasions `annex.adjustedbranchrefresh` is ignored when `git annex sync` is run in a branch created with `adjust --unlock-present`.
+
+If `annex.adjustedbranchrefresh` is set to 1, one would expect git-annex to automatically adjust the branch once a file has been `git annex add`-ed or the repository is `git annex sync`-ed. However this does not happen and a manual `git annex adjust --unlock-present` is required.
+
+Is this a bug or am I misunderstanding how `annex.adjustedbranchrefresh` is supposed to work?
+
+> It is a bug --[[Joey]]
+
+The following script reproduces this bug.
+
+```
+#!/bin/bash
+
+set -eux
+
+rm -Rvf /tmp/an-repo.git && mkdir /tmp/an-repo.git && cd /tmp/an-repo.git
+git init --bare
+n=$(date +%s) ; mkdir /tmp/ga-$n && cd /tmp/ga-$n
+git clone --no-local --no-hardlinks /tmp/an-repo.git
+cd an-repo/
+
+git config user.email "email@example.com" ; git config user.name "Name Name"
+git config annex.thin true
+git config annex.adjustedbranchrefresh 1
+git config remote.origin.annex-ignore true
+
+# 8.20201117 is the version in the standalone tarball of 8.20201127
+~/Applications/git-annex/8.20201117-ga314537cd/runshell bash -c '
+git annex init foobar
+
+echo "aaaa" > a && echo "bbbb" > b
+git annex add a b
+git annex sync
+
+git annex adjust --unlock-present
+git annex sync
+
+echo "cccc" > c && echo "dddd" > d
+git annex add c d
+
+echo "## before sync"
+stat -c "%n: %F" a b c d
+
+git annex sync
+echo "## after sync"
+stat -c "%n: %F" a b c d # should show four regular files, but shows two files and two symlinks
+
+git annex sync --content;
+echo "## after sync --content"
+stat -c "%n: %F" a b c d # ibid
+'
+```
diff --git a/doc/bugs/adjustedbranchrefresh_ignored_by_git_annex_add/comment_1_2688af420095e2e9aebf6caeb904ba48._comment b/doc/bugs/adjustedbranchrefresh_ignored_by_git_annex_add/comment_1_2688af420095e2e9aebf6caeb904ba48._comment
new file mode 100644
index 000000000..343305da0
--- /dev/null
+++ b/doc/bugs/adjustedbranchrefresh_ignored_by_git_annex_add/comment_1_2688af420095e2e9aebf6caeb904ba48._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-01-04T20:57:22Z"
+ content="""
+So I think the whole problem is that, git-annex add (and import, addurl,
+etc) should add the files unlocked when in an unlockpresent branch.
+
+I don't think git-annex sync needs to deal with this, probably.
+
+Don't think this really has anything to do with adjustedbranchrefresh.
+That's about updates after getting/dropping files, and that's not been done
+in this case.
+"""]]
diff --git a/doc/forum/adjustedbranchrefresh_ignored__63__/comment_1_999bd4a511c7fa39f5455df895c435f3._comment b/doc/forum/adjustedbranchrefresh_ignored__63__/comment_1_999bd4a511c7fa39f5455df895c435f3._comment
new file mode 100644
index 000000000..7e5bc3136
--- /dev/null
+++ b/doc/forum/adjustedbranchrefresh_ignored__63__/comment_1_999bd4a511c7fa39f5455df895c435f3._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-01-04T20:49:40Z"
+ content="""
+I agree, this is a bug. Copied to
+[[bugs/adjustedbranchrefresh_ignored_by_git_annex_add]].
+"""]]

comment
diff --git a/doc/forum/comment_1_2a9e8c859e722a334947bc9b11265c6c._comment b/doc/forum/comment_1_2a9e8c859e722a334947bc9b11265c6c._comment
new file mode 100644
index 000000000..11ebb5d7a
--- /dev/null
+++ b/doc/forum/comment_1_2a9e8c859e722a334947bc9b11265c6c._comment
@@ -0,0 +1,23 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-01-04T20:31:58Z"
+ content="""
+There's not currently anything to automate this. I sometimes do it
+manually, eg noticing that drive A is nearly full and `git-annex move --to
+B` of some of its files to free up space. 
+
+That's generally when I need space on drive A for some other purpose
+than that git-annex repo. If drive A is only being used for the one
+git-annex repo, then it doesn't much matter if it fills up before drive B?
+So maybe setting annex.diskreserve to some sufficiently large size in that
+repo on the drive would be better, so there's always space reserved for the
+other things the drive is being used for.
+
+[[design/balanced_preferred_content]] is an old attempt at designing
+a way to automate balancing between drives. Never quite got implemented,
+because of the limitations documented in it.
+It might be that it would work well enough for some use cases?
+
+I'd welcome thoughts on this topic.
+"""]]

wrong, wrong, wrong
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_12_53a210eace9359cab2a28510d373bec4._comment b/doc/forum/how_to_get_into_git_annex.../comment_12_53a210eace9359cab2a28510d373bec4._comment
new file mode 100644
index 000000000..038dd2166
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_12_53a210eace9359cab2a28510d373bec4._comment
@@ -0,0 +1,53 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""parent post is rife with incorrect and misleading statements"""
+ date="2021-01-04T20:03:19Z"
+ content="""
+AFAIK there are no circumstances where git-annex will lose data unless you
+use the --force flag, which is clearly documented as allowing data loss.
+If you have a case where it does, *file a bug report*.
+
+git-annex uninit does *not* delete .git/annex/objects if there are
+any objects in there that are not used by files in the repo, so it can't
+have behaved as you claim it did, at least as far as I can tell. Here is an
+example of it not deleting data, in a situation like the one you claimed 
+caused data loss:
+
+	joey@darkstar:/tmp/demo>git annex add foo
+	joey@darkstar:/tmp/demo>git commit -m add
+	joey@darkstar:/tmp/demo>git rm foo
+	joey@darkstar:/tmp/demo>git annex uninit
+	git-annex: Not fully uninitialized
+	Some annexed data is still left in .git/annex/objects/
+	This may include deleted files, or old versions of modified files.
+	
+	If you don't care about preserving the data, just delete the
+	directory.
+	
+	Or, you can move it to another location, in case it turns out
+	something in there is important.
+	
+	Or, you can run `git annex unused` followed by `git annex dropunused`
+	to remove data that is not used by any tag or branch, which might
+	take care of all the data.
+	
+	Then run `git annex uninit` again to finish.
+	joey@darkstar:/tmp/demo>find .git/annex/objects/ -type f
+	.git/annex/objects/Zj/zZ/SHA256E-s30--9d9f1f02932124b06e803a4899068dbc1df00d126447d226bb312861e0b7de83/SHA256E-s30--9d9f1f02932124b06e803a4899068dbc1df00d126447d226bb312861e0b7de83
+
+I document changes before I implement them, and this website is updated
+on every push of changes to git-annex. While some tip somewhere may be
+out of date, the one you mentioned does not appear to be. It looks
+like you misunderstood something about it.
+
+A drive in a safe full of files with and without git-annex has identical
+durability. Using git-annex does *not* cause files to be less accessible or
+add significant roadblocks to accessing them no matter what problems might
+befall that drive. Worst case, fsck of a corrupted filesystem on that drive
+will rescue files to lost+found with the git-annex key name and not the
+original filename. This is easy to recover from though, using `git-annex
+reinject --known`. Which also, conveniently, works if fsck on a badly
+damaged drive restores the file to lost+found using a bare inode number.
+Which, if you're not using git-annex, puts you in a world of hurt to
+determine what file that originally was.
+"""]]

fix --time-limit
It got broken in several ways by the streaming seeking optimisations
around version 8.20201007.
Moved time limit checking out of the matcher, which was a hack in the
first place. So everywhere that uses Limit.getMatcher needs to check
time limit. Well, almost everywhere. Command.Info uses it, but it does
not make sense to time limit getting info. And Command.MultiCast uses it
just to build up a list of files that then get passed to a command, so
it would never have hit the timeout in a useful way.
This implementation is a little more expensive when at time limit than
necessary, since it continues seeking only to discard everything after the
time limit. I did try making it close the file handles to force a faster
shutdown, but that didn't work and hung. Could certainly be improved
somehow, but seeking is probably not the expensive bit when a time limit
is hit, so this seems acceptable for now.
diff --git a/Annex.hs b/Annex.hs
index af34393b8..bbdee2f06 100644
--- a/Annex.hs
+++ b/Annex.hs
@@ -1,6 +1,6 @@
 {- git-annex monad
  -
- - Copyright 2010-2020 Joey Hess <id@joeyh.name>
+ - Copyright 2010-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -78,6 +78,7 @@ import qualified Database.Keys.Handle as Keys
 import Utility.InodeCache
 import Utility.Url
 import Utility.ResourcePool
+import Utility.HumanTime
 
 import "mtl" Control.Monad.Reader
 import Control.Concurrent
@@ -85,6 +86,7 @@ import Control.Concurrent.STM
 import qualified Control.Monad.Fail as Fail
 import qualified Data.Map.Strict as M
 import qualified Data.Set as S
+import Data.Time.Clock.POSIX
 
 {- git-annex's monad is a ReaderT around an AnnexState stored in a MVar.
  - The MVar is not exposed outside this module.
@@ -133,6 +135,7 @@ data AnnexState = AnnexState
 	, globalnumcopies :: Maybe NumCopies
 	, forcenumcopies :: Maybe NumCopies
 	, limit :: ExpandableMatcher Annex
+	, timelimit :: Maybe (Duration, POSIXTime)
 	, uuiddescmap :: Maybe UUIDDescMap
 	, preferredcontentmap :: Maybe (FileMatcherMap Annex)
 	, requiredcontentmap :: Maybe (FileMatcherMap Annex)
@@ -201,6 +204,7 @@ newState c r = do
 		, globalnumcopies = Nothing
 		, forcenumcopies = Nothing
 		, limit = BuildingMatcher []
+		, timelimit = Nothing
 		, uuiddescmap = Nothing
 		, preferredcontentmap = Nothing
 		, requiredcontentmap = Nothing
diff --git a/CHANGELOG b/CHANGELOG
index efc62e305..80859fac9 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -5,6 +5,8 @@ git-annex (8.20201130) UNRELEASED; urgency=medium
   * add --force-small: Run git add rather than updating the index itself,
     so any other smudge filters than the annex one that may be enabled will
     be used.
+  * Fix --time-limit, which got broken in several ways by some optimisations
+    in version 8.20201007.
 
  -- Joey Hess <id@joeyh.name>  Mon, 04 Jan 2021 12:52:41 -0400
 
diff --git a/CmdLine/GitAnnex/Options.hs b/CmdLine/GitAnnex/Options.hs
index 87660e50b..2d9ccc518 100644
--- a/CmdLine/GitAnnex/Options.hs
+++ b/CmdLine/GitAnnex/Options.hs
@@ -1,6 +1,6 @@
 {- git-annex command-line option parsing
  -
- - Copyright 2010-2019 Joey Hess <id@joeyh.name>
+ - Copyright 2010-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -11,6 +11,7 @@ module CmdLine.GitAnnex.Options where
 
 import Control.Monad.Fail as Fail (MonadFail(..))
 import Options.Applicative
+import Data.Time.Clock.POSIX
 import qualified Data.Map as M
 
 import Annex.Common
@@ -403,12 +404,17 @@ jobsOption =
 
 timeLimitOption :: [GlobalOption]
 timeLimitOption = 
-	[ globalSetter Limit.addTimeLimit $ option (eitherReader parseDuration)
+	[ globalSetter settimelimit $ option (eitherReader parseDuration)
 		( long "time-limit" <> short 'T' <> metavar paramTime
 		<> help "stop after the specified amount of time"
 		<> hidden
 		)
 	]
+  where
+	settimelimit duration = do
+		start <- liftIO getPOSIXTime
+		let cutoff = start + durationToPOSIXTime duration
+		Annex.changeState $ \s -> s { Annex.timelimit = Just (duration, cutoff) }
 
 data DaemonOptions = DaemonOptions
 	{ foregroundDaemonOption :: Bool
diff --git a/CmdLine/Seek.hs b/CmdLine/Seek.hs
index 25d46f02e..46ec0f67c 100644
--- a/CmdLine/Seek.hs
+++ b/CmdLine/Seek.hs
@@ -4,7 +4,7 @@
  - the values a user passes to a command, and prepare actions operating
  - on them.
  -
- - Copyright 2010-2020 Joey Hess <id@joeyh.name>
+ - Copyright 2010-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -40,15 +40,18 @@ import Annex.Link
 import Annex.InodeSentinal
 import Annex.Concurrent
 import Annex.CheckIgnore
+import Annex.Action
 import qualified Annex.Branch
 import qualified Annex.BranchState
 import qualified Database.Keys
 import qualified Utility.RawFilePath as R
 import Utility.Tuple
+import Utility.HumanTime
 
 import Control.Concurrent.Async
 import System.Posix.Types
 import Data.IORef
+import Data.Time.Clock.POSIX
 import qualified System.FilePath.ByteString as P
 
 data AnnexedFileSeeker = AnnexedFileSeeker
@@ -96,12 +99,17 @@ withFilesNotInGit (CheckGitIgnore ci) ww a l = do
 withPathContents :: ((RawFilePath, RawFilePath) -> CommandSeek) -> CmdParams -> CommandSeek
 withPathContents a params = do
 	matcher <- Limit.getMatcher
-	forM_ params $ \p -> do
-		fs <- liftIO $ get p
-		forM fs $ \f ->
-			whenM (checkmatch matcher f) $
-				a f
+	checktimelimit <- mkCheckTimeLimit
+	go matcher checktimelimit params []
   where
+	go _ _ [] [] = return ()
+	go matcher checktimelimit (p:ps) [] =
+		go matcher checktimelimit ps =<< liftIO (get p)
+	go matcher checktimelimit ps (f:fs) = checktimelimit noop $ do
+		whenM (checkmatch matcher f) $
+			a f
+		go matcher checktimelimit ps fs		
+	
 	get p = ifM (isDirectory <$> getFileStatus p)
 		( map (\f -> 
 			let f' = toRawFilePath f
@@ -237,6 +245,7 @@ withKeyOptions' ko auto mkkeyaction fallbackaction worktreeitems = do
 	-- those. This significantly speeds up typical operations
 	-- that need to look at the location log for each key.
 	runallkeys = do
+		checktimelimit <- mkCheckTimeLimit
 		keyaction <- mkkeyaction
 		config <- Annex.getGitConfig
 		g <- Annex.gitRepo
@@ -246,9 +255,12 @@ withKeyOptions' ko auto mkkeyaction fallbackaction worktreeitems = do
 			LsTree.LsTreeRecursive
 			Annex.Branch.fullname
 		let getk f = fmap (,f) (locationLogFileKey config f)
+		let discard reader = reader >>= \case
+			Nothing -> noop
+			Just _ -> discard reader
 		let go reader = liftIO reader >>= \case
 			Nothing -> return ()
-			Just ((k, f), content) -> do
+			Just ((k, f), content) -> checktimelimit (discard reader) $ do
 				maybe noop (Annex.BranchState.setCache f) content
 				keyaction (SeekInput [], k, mkActionItem k)
 				go reader
@@ -282,14 +294,17 @@ withKeyOptions' ko auto mkkeyaction fallbackaction worktreeitems = do
 seekFiltered :: ((SeekInput, RawFilePath) -> Annex Bool) -> ((SeekInput, RawFilePath) -> CommandSeek) -> Annex ([(SeekInput, RawFilePath)], IO Bool) -> Annex ()
 seekFiltered prefilter a listfs = do
 	matcher <- Limit.getMatcher
+	checktimelimit <- mkCheckTimeLimit
 	(fs, cleanup) <- listfs
-	sequence_ (map (process matcher) fs)
+	go matcher checktimelimit fs
 	liftIO $ void cleanup
   where
-	process matcher v@(_si, f) =
+	go _ _ [] = return ()
+	go matcher checktimelimit (v@(_si, f):rest) = checktimelimit noop $ do
 		whenM (prefilter v) $
 			whenM (matcher $ MatchingFile $ FileInfo (Just f) f Nothing) $
 				a v
+		go matcher checktimelimit rest
 
 data MatcherInfo = MatcherInfo
 	{ matcherAction :: MatchInfo -> Annex Bool
@@ -317,6 +332,7 @@ seekFilteredKeys seeker listfs = do
 		<*> Limit.introspect matchNeedsLocationLog
 	config <- Annex.getGitConfig
 	(l, cleanup) <- listfs
+	checktimelimit <- mkCheckTimeLimit
 	catObjectMetaDataStream g $ \mdfeeder mdcloser mdreader ->
 		catObjectStream g $ \ofeeder ocloser oreader -> do

(Diff truncated)
comment
diff --git a/doc/bugs/git_annex_fsck_--time-limit_broken/comment_1_7367e332f3712b02f1fc25eeecc57d00._comment b/doc/bugs/git_annex_fsck_--time-limit_broken/comment_1_7367e332f3712b02f1fc25eeecc57d00._comment
new file mode 100644
index 000000000..f0066c051
--- /dev/null
+++ b/doc/bugs/git_annex_fsck_--time-limit_broken/comment_1_7367e332f3712b02f1fc25eeecc57d00._comment
@@ -0,0 +1,25 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-01-04T17:50:55Z"
+ content="""
+All of this is explained by --time-limit being implemented as a
+hack that throws an exception. Problem being that, the new streaming seeker
+checks matchers in a separate thread and having that thread die of an
+exception probably causes the hang. Also, since it checks the matcher
+before streaming through git, there's a buffer of perhaps many files
+that builds up before the time limit is reached, so those can go on to be
+processed, even after it's said the time limit is reached. And, since it
+runs cleanup actions, this might leave fsck with its database closed
+but trying to use it.
+
+--time-limit could be removed from git-annex entirely. The `timeout`
+command can be used with git-annex. But fsck db flush and close doesn't
+happen when git-annex gets SIGINT and does with --time-limit. So this would
+need maybe a SIGINT handler that runs cleanup actions? And then git-annex
+would run some perhaps expensive cleanup actions whenever ctrl-c'd, which
+might not be desirable since normally that's not necessary.
+
+Or, it needs to not be implemented in this hackish way, but as another
+check that's done before starting processing a seeked file.
+"""]]
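
The last option in the comment above, checking the limit before each item is processed, can be sketched in shell. This is only an illustration of the idea and not how git-annex implements it; `run_with_limit` and the item names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: compute a cutoff once, then check the clock
# before starting each item, rather than aborting via an exception.
run_with_limit() {
	limit=$1; shift
	cutoff=$(( $(date +%s) + limit ))
	processed=0
	for item in "$@"; do
		if [ "$(date +%s)" -ge "$cutoff" ]; then
			echo "time limit reached; skipping remaining items" >&2
			break
		fi
		# Stand-in for the real per-item work.
		processed=$((processed + 1))
	done
	# Report how many items were started before the cutoff.
	echo "$processed"
}

run_with_limit 60 a b c
```

Because the check runs between items, an item that has already started is allowed to finish, and any cleanup after the loop still runs normally.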

close
diff --git a/doc/bugs/git_annex_fix_broken.mdwn b/doc/bugs/git_annex_fix_broken.mdwn
index 642715c21..98272d33b 100644
--- a/doc/bugs/git_annex_fix_broken.mdwn
+++ b/doc/bugs/git_annex_fix_broken.mdwn
@@ -15,3 +15,5 @@
 
 ### What version of git-annex are you using? On what operating system?
 8.20201127 (I know I know... One year old version :)
+
+> not a bug [[done]] --[[Joey]]
diff --git a/doc/bugs/git_annex_fix_broken/comment_1_e8b7e62c21557e84e344b596d96a0735._comment b/doc/bugs/git_annex_fix_broken/comment_1_e8b7e62c21557e84e344b596d96a0735._comment
new file mode 100644
index 000000000..8c6e32d6f
--- /dev/null
+++ b/doc/bugs/git_annex_fix_broken/comment_1_e8b7e62c21557e84e344b596d96a0735._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-01-04T17:43:31Z"
+ content="""
+git-annex silently skips files that are not checked into git, and after you
+run "mv", the file is not checked into git.
+
+So add the file (or just use `git mv` to move it) and then it will work.
+
+Note that, annex.skipunknown can be set to false to make git-annex not
+silently skip files and instead complain. That is scheduled to become the
+default in 1 year, because I know the current behavior can be confusing.
+"""]]
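
The difference between `mv` and `git mv` described in the comment above can be seen with plain git. A minimal sketch, assuming a throwaway repository; the file names are made up for illustration and git-annex itself is not involved.

```shell
#!/bin/sh
set -eu
# Throwaway repository; all names here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo data > old1
echo data > old2
git add old1 old2

mv old1 new1       # plain mv: new1 is unknown to git
git mv old2 new2   # git mv: the rename is staged in the index

# List which of the new names git knows about.
tracked=$(git ls-files -- new1 new2)
echo "$tracked"
```

Only the file moved with `git mv` is listed, which is why a command that skips files not checked into git would pass over the one moved with plain `mv`.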

comment
diff --git a/doc/git-annex-add/comment_5_f67bd5fdbe9deeeea1d48175a6b5c536._comment b/doc/git-annex-add/comment_5_f67bd5fdbe9deeeea1d48175a6b5c536._comment
new file mode 100644
index 000000000..a9148f423
--- /dev/null
+++ b/doc/git-annex-add/comment_5_f67bd5fdbe9deeeea1d48175a6b5c536._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 5"""
+ date="2021-01-04T17:34:56Z"
+ content="""
+You use `git-annex add` when you have a file located
+inside your git-annex repository, which you want to check in.
+
+You use `git-annex import` when you have some other thing
+storing files and you want git-annex to learn about the files stored there.
+Using git-annex import to move individual files to the repository and add
+them is not significantly different than mv+add and will eventually be
+deprecated.
+"""]]

add pointer to annex.largefiles config docs
diff --git a/doc/git-annex-add.mdwn b/doc/git-annex-add.mdwn
index efcf9e567..7d87a4236 100644
--- a/doc/git-annex-add.mdwn
+++ b/doc/git-annex-add.mdwn
@@ -20,6 +20,8 @@ If annex.largefiles is configured, and does not match a file,
 non-large file directly to the git repository, instead of to the annex.
 (By default dotfiles are assumed to not be large, and are added directly
 to git, but annex.dotfiles can be configured to annex those too.)
+See the git-annex manpage for documentation of these and other
+configuration settings.
 
 Large files are added to the annex in locked form, which prevents further
 modification of their content unless unlocked by [[git-annex-unlock]](1).

comment
diff --git a/doc/forum/Import_existing_files/comment_5_d88cedb3d70342071621c1695c8aeb05._comment b/doc/forum/Import_existing_files/comment_5_d88cedb3d70342071621c1695c8aeb05._comment
new file mode 100644
index 000000000..a1062d424
--- /dev/null
+++ b/doc/forum/Import_existing_files/comment_5_d88cedb3d70342071621c1695c8aeb05._comment
@@ -0,0 +1,34 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 5"""
+ date="2021-01-04T17:17:08Z"
+ content="""
+I think you're really overcomplicating things. Some really basic use of
+git-annex as described in the [[walkthrough]] will work fine in the
+situation you describe. Ie, initialize a git-annex repository in
+~/Pictures. If you have some other servers or hard drives that also have
+pictures, initialize git-annex repositories on those as well. Connect these
+repositories that all hold pictures together, by adding git remotes
+pointing to the other pictures repositories.
+
+Then when you `git push` (or `git-annex sync`), git-annex will automatically
+learn if some picture is stored in multiple of the repositories. You'll be
+able to run commands like `git-annex find --copies 2` or `git-annex drop`
+to operate on that information. Similarly, if Picture/BestPics2020/a.jpg
+and Picture/2020/01/a.jpg were the same content, git-annex will notice that
+when you add them to the annex, and will automatically deduplicate.
+
+If you have readonly DVDs or whatever, yes those can be handled in ways
+like Lukey describes, but why bother trying to deal with all those edge
+cases before you're using git-annex at all?
+
+As far as too many files, git has issues with the index file becoming
+slower with more files, but you need huge numbers of files for this to be a
+significant problem -- think millions. git-annex commands that need to
+operate on all files necessarily take longer when there are more files,
+but git-annex always lets you only operate on a subset of files, such as
+the ones in the current directory, so this is not a significant scalability
+problem. Worrying about speed before something is slow is a kind of
+premature optimisation; git-annex has actually been optimised in cases where
+it was slow.
+"""]]
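
The deduplication mentioned in the comment above follows from keys being derived from file content rather than file names. A rough sketch, assuming a key layout of `SHA256-s<size>--<checksum>`; `sketch_key` is a hypothetical helper for illustration, not git-annex's own code, and it relies on `sha256sum` being available.

```shell
#!/bin/sh
set -eu
# Hypothetical helper: build a SHA256-style key from a file's size and
# checksum. An illustration only, not git-annex's own code.
sketch_key() {
	size=$(( $(wc -c < "$1") ))
	hash=$(sha256sum "$1" | cut -d' ' -f1)
	echo "SHA256-s${size}--${hash}"
}

dir=$(mktemp -d)
mkdir -p "$dir/BestPics2020" "$dir/2020/01"
printf 'same picture bytes' > "$dir/BestPics2020/a.jpg"
printf 'same picture bytes' > "$dir/2020/01/a.jpg"

key1=$(sketch_key "$dir/BestPics2020/a.jpg")
key2=$(sketch_key "$dir/2020/01/a.jpg")
[ "$key1" = "$key2" ] && echo "same key: content would be stored once"
```

Two paths with identical bytes map to the same key, so only one copy of the content needs to be kept.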

add: Significantly speed up adding lots of non-large files to git
* add: Significantly speed up adding lots of non-large files to git,
by disabling the annex smudge filter when running git add.
* add --force-small: Run git add rather than updating the index itself,
so any other smudge filters than the annex one that may be enabled will
be used.
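
The effect of passing empty `filter.annex.smudge` and `filter.annex.clean` values to git can be reproduced with plain git. A minimal sketch, assuming a throwaway repository and a stand-in clean filter (`tr a-z A-Z`) in place of the real git-annex one, chosen only because its effect is easy to see.

```shell
#!/bin/sh
set -eu
# Throwaway repository with a stand-in "annex" filter. The real
# git-annex filter is not used; tr merely makes the effect visible.
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo '*.txt filter=annex' > .gitattributes
git config filter.annex.clean 'tr a-z A-Z'
echo hello > a.txt
echo hello > b.txt

# Normal add: the configured clean filter runs on a.txt.
git add a.txt
# Add with the filter commands overridden to empty for this one invocation.
git -c filter.annex.smudge= -c filter.annex.clean= add -- b.txt

# Inspect the staged blobs directly.
filtered=$(git cat-file -p :a.txt)
bypassed=$(git cat-file -p :b.txt)
echo "a.txt staged as: $filtered"
echo "b.txt staged as: $bypassed"
```

The override applies only to the single `git` invocation, so the repository's stored filter configuration is left untouched.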
diff --git a/Annex/GitOverlay.hs b/Annex/GitOverlay.hs
index 2e441a28b..568fd2881 100644
--- a/Annex/GitOverlay.hs
+++ b/Annex/GitOverlay.hs
@@ -20,6 +20,7 @@ import Git.Index
 import Git.Env
 import qualified Annex
 import qualified Annex.Queue
+import Config.Smudge
 
 {- Runs an action using a different git index file. -}
 withIndexFile :: AltIndexFile -> (FilePath -> Annex a) -> Annex a
@@ -67,16 +68,12 @@ withIndexFile i = withAltRepo usecachedgitenv restoregitenv
  - Smudge and clean filters are disabled in this work tree. -}
 withWorkTree :: FilePath -> Annex a -> Annex a
 withWorkTree d a = withAltRepo
-	(\g -> return $ (g { location = modlocation (location g), gitGlobalOpts = gitGlobalOpts g ++ disableSmudgeConfig }, ()))
+	(\g -> return $ (g { location = modlocation (location g), gitGlobalOpts = gitGlobalOpts g ++ bypassSmudgeConfig }, ()))
 	(\g g' -> g' { location = location g, gitGlobalOpts = gitGlobalOpts g })
 	(const a)
   where
 	modlocation l@(Local {}) = l { worktree = Just (toRawFilePath d) }
 	modlocation _ = error "withWorkTree of non-local git repo"
-	disableSmudgeConfig = map Param
-		[ "-c", "filter.annex.smudge="
-		, "-c", "filter.annex.clean="
-		]
 
 {- Runs an action with the git index file and HEAD, and a few other
  - files that are related to the work tree coming from an overlay
diff --git a/CHANGELOG b/CHANGELOG
index b9f4bc51c..efc62e305 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,13 @@
+git-annex (8.20201130) UNRELEASED; urgency=medium
+
+  * add: Significantly speed up adding lots of non-large files to git,
+    by disabling the annex smudge filter when running git add.
+  * add --force-small: Run git add rather than updating the index itself,
+    so any other smudge filters than the annex one that may be enabled will
+    be used.
+
+ -- Joey Hess <id@joeyh.name>  Mon, 04 Jan 2021 12:52:41 -0400
+
 git-annex (8.20201129) upstream; urgency=medium
 
   * New borg special remote. This is a new kind of remote, that examines
diff --git a/Command/Add.hs b/Command/Add.hs
index 4f848d25b..0835f0946 100644
--- a/Command/Add.hs
+++ b/Command/Add.hs
@@ -1,6 +1,6 @@
 {- git-annex command
  -
- - Copyright 2010-2020 Joey Hess <id@joeyh.name>
+ - Copyright 2010-2021 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -17,13 +17,10 @@ import qualified Database.Keys
 import Annex.FileMatcher
 import Annex.Link
 import Annex.Tmp
-import Annex.HashObject
 import Messages.Progress
-import Git.Types
 import Git.FilePath
 import Config.GitConfig
-import qualified Git.UpdateIndex
-import Utility.FileMode
+import Config.Smudge
 import Utility.OptParse
 import qualified Utility.RawFilePath as R
 
@@ -119,37 +116,26 @@ startSmall o si file =
 addSmall :: CheckGitIgnore -> RawFilePath -> Annex Bool
 addSmall ci file = do
 	showNote "non-large file; adding content to git repository"
-	addFile ci file
+	addFile Small ci file
 
 startSmallOverridden :: AddOptions -> SeekInput -> RawFilePath -> CommandStart
 startSmallOverridden o si file = 
-	starting "add" (ActionItemWorkTreeFile file) si $
-		next $ addSmallOverridden o file
+	starting "add" (ActionItemWorkTreeFile file) si $ next $ do
+		showNote "adding content to git repository"
+		addFile Small (checkGitIgnoreOption o) file
 
-addSmallOverridden :: AddOptions -> RawFilePath -> Annex Bool
-addSmallOverridden o file = do
-	showNote "adding content to git repository"
-	s <- liftIO $ R.getSymbolicLinkStatus file
-	if not (isRegularFile s)
-		then addFile (checkGitIgnoreOption o) file
-		else do
-			-- Can't use addFile because the clean filter will
-			-- honor annex.largefiles and it has been overridden.
-			-- Instead, hash the file and add to the index.
-			sha <- hashFile file
-			let ty = if isExecutable (fileMode s)
-				then TreeExecutable
-				else TreeFile
-			Annex.Queue.addUpdateIndex =<<
-				inRepo (Git.UpdateIndex.stageFile sha ty (fromRawFilePath file))
-			return True
+data SmallOrLarge = Small | Large
 
-addFile :: CheckGitIgnore -> RawFilePath -> Annex Bool
-addFile ci file = do
+addFile :: SmallOrLarge -> CheckGitIgnore -> RawFilePath -> Annex Bool
+addFile smallorlarge ci file = do
 	ps <- gitAddParams ci
-	Annex.Queue.addCommand [] "add" (ps++[Param "--"])
+	Annex.Queue.addCommand cps "add" (ps++[Param "--"])
 		[fromRawFilePath file]
 	return True
+  where
+	cps = case smallorlarge of
+		Large -> []
+		Small -> bypassSmudgeConfig
 
 start :: AddOptions -> SeekInput -> RawFilePath -> AddUnlockedMatcher -> CommandStart
 start o si file addunlockedmatcher = do
@@ -164,7 +150,7 @@ start o si file addunlockedmatcher = do
 			| otherwise -> 
 				starting "add" (ActionItemWorkTreeFile file) si $
 					if isSymbolicLink s
-						then next $ addFile (checkGitIgnoreOption o) file
+						then next $ addFile Small (checkGitIgnoreOption o) file
 						else perform o file addunlockedmatcher
 	addpresent key = 
 		liftIO (catchMaybeIO $ R.getSymbolicLinkStatus file) >>= \case
@@ -180,7 +166,7 @@ start o si file addunlockedmatcher = do
 		starting "add" (ActionItemWorkTreeFile file) si $
 			addingExistingLink file key $ do
 				Database.Keys.addAssociatedFile key =<< inRepo (toTopFilePath file)
-				next $ addFile (checkGitIgnoreOption o) file
+				next $ addFile Large (checkGitIgnoreOption o) file
 
 perform :: AddOptions -> RawFilePath -> AddUnlockedMatcher -> CommandPerform
 perform o file addunlockedmatcher = withOtherTmp $ \tmpdir -> do
diff --git a/Config/Smudge.hs b/Config/Smudge.hs
index d97001885..487f380d5 100644
--- a/Config/Smudge.hs
+++ b/Config/Smudge.hs
@@ -60,3 +60,11 @@ deconfigureSmudgeFilter = do
 		filter (\l -> l `notElem` stdattr && not (null l)) ls
 	unsetConfig (ConfigKey "filter.annex.smudge")
 	unsetConfig (ConfigKey "filter.annex.clean")
+
+-- Params to pass to git to temporarily avoid using the smudge/clean
+-- filters.
+bypassSmudgeConfig :: [CommandParam]
+bypassSmudgeConfig = map Param
+	[ "-c", "filter.annex.smudge="
+	, "-c", "filter.annex.clean="
+	]
diff --git a/Database/Keys.hs b/Database/Keys.hs
index 6b305f060..337f232a3 100644
--- a/Database/Keys.hs
+++ b/Database/Keys.hs
@@ -43,6 +43,7 @@ import Git.FilePath
 import Git.Command
 import Git.Types
 import Git.Index
+import Config.Smudge
 import qualified Utility.RawFilePath as R
 
 import qualified Data.ByteString as S
@@ -237,15 +238,14 @@ reconcileStaged qh = do
 		liftIO $ writeFile indexcache $ showInodeCache cur
 	
 	diff =
-		-- Avoid using external diff command, which would be slow.
-		-- (The -G option may make it be used otherwise.)
-		[ Param "-c", Param "diff.external="
 		-- Avoid running smudge or clean filters, since we want the
 		-- raw output, and they would block trying to access the
 		-- locked database. The --raw normally avoids git diff
 		-- running them, but older versions of git need this.
-		, Param "-c", Param "filter.annex.smudge="
-		, Param "-c", Param "filter.annex.clean="
+		bypassSmudgeConfig ++
+		-- Avoid using external diff command, which would be slow.
+		-- (The -G option may make it be used otherwise.)
+		[ Param "-c", Param "diff.external="
 		, Param "diff"
 		, Param "--cached"
 		, Param "--raw"
diff --git a/doc/todo/speed_up_git_annex_add_of_small_files.mdwn b/doc/todo/speed_up_git_annex_add_of_small_files.mdwn
index 1996e40cc..90567cc09 100644
--- a/doc/todo/speed_up_git_annex_add_of_small_files.mdwn
+++ b/doc/todo/speed_up_git_annex_add_of_small_files.mdwn
@@ -14,3 +14,5 @@ with the existing `--force-small` too, but at least that's not the default.
 
 Possible alternate approach: Unsetting filter.annex.smudge and
 filter.annex.clean when running `git add`?
+

(Diff truncated)
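
The `bypassSmudgeConfig` parameters added in `Config/Smudge.hs` amount to two `-c` overrides placed before the git subcommand. A minimal shell sketch of the equivalent invocation follows. The function names `bypass_smudge_config` and `git_add_small` are illustrative assumptions, not part of git-annex.

```shell
#!/bin/sh
# Emit the per-invocation overrides, one argument per line.
# An empty value overrides the configured filter command for this one
# git invocation; per the comment in Config/Smudge.hs, this temporarily
# avoids using the smudge/clean filters.
bypass_smudge_config() {
    printf '%s\n' -c filter.annex.smudge= -c filter.annex.clean=
}

# Stage files directly in git with the annex filters bypassed.
# (Hypothetical helper; roughly what `addFile Small` queues in the diff.)
git_add_small() {
    # The override arguments contain no whitespace or glob characters,
    # so word splitting of the command substitution is safe here.
    git $(bypass_smudge_config) add -- "$@"
}
```

Calling `git_add_small notes.txt` (a hypothetical file name) would run `git -c filter.annex.smudge= -c filter.annex.clean= add -- notes.txt`. Nothing is written to the repository's config, so later git commands still use the filters.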
fix bad annex.largefiles example syntax
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index 45306545c..5b653e645 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -919,7 +919,7 @@ repository, using [[git-annex-config]]. See its man page for a list.)
 
   Used to configure which files are large enough to be added to the annex.
   It is an expression that matches the large files, eg
-  "`include=*.mp3 or largerthan(500kb)`"
+  "`include=*.mp3 or largerthan=500kb`"
   See [[git-annex-matching-expression]](1) for details on the syntax.
 
   Overrides any annex.largefiles attributes in `.gitattributes` files.
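
As an illustration of the corrected syntax, the expression can be stored with plain `git config`. This is only a sketch: the scratch repository exists so the snippet can run anywhere, and git is assumed to be installed.

```shell
#!/bin/sh
# The corrected matching expression: the size limit uses key=value form.
largefiles_expr='include=*.mp3 or largerthan=500kb'

# Store it in a scratch repository's local config and read it back.
if command -v git >/dev/null 2>&1; then
    repo=$(mktemp -d)
    git init -q "$repo"
    git -C "$repo" config annex.largefiles "$largefiles_expr"
    git -C "$repo" config --get annex.largefiles
    rm -rf "$repo"
fi
```

The single quotes keep the shell from expanding `*.mp3` before git sees it.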

comment and todo
diff --git a/doc/forum/Adding_files_to_git__58___Very_long___34__recording_state_in_git__34___phase/comment_7_a4e2f0e65a9dc92fc4dd85183e8f8090._comment b/doc/forum/Adding_files_to_git__58___Very_long___34__recording_state_in_git__34___phase/comment_7_a4e2f0e65a9dc92fc4dd85183e8f8090._comment
new file mode 100644
index 000000000..367202c85
--- /dev/null
+++ b/doc/forum/Adding_files_to_git__58___Very_long___34__recording_state_in_git__34___phase/comment_7_a4e2f0e65a9dc92fc4dd85183e8f8090._comment
@@ -0,0 +1,21 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 7"""
+ date="2021-01-04T16:13:36Z"
+ content="""
+This will avoid the overhead of the smudge filter, when all the files
+you're adding are ones you want stored in git, *not* in git-annex.
+
+	git-annex add --force-small
+
+I do think it would be possible for `git-annex add` to use the same method
+whenever it adds non-large files. But it might have unwanted other effects,
+since the way that manages to be fast is by avoiding using `git add` and
+having git-annex hash the file and add it to git itself. Opened
+[[todo/speed_up_git_annex_add_of_small_files]] to consider this.
+
+The only way to speed up `git add` is to disable the smudge filter, but then
+all files you `git add` will be stored in git, not in git-annex. And
+disabling the smudge filter also will prevent using unlocked annexed files.
+(See [[todo/git_smudge_clean_interface_suboptiomal]] for background.)
+"""]]
diff --git a/doc/todo/speed_up_git_annex_add_of_small_files.mdwn b/doc/todo/speed_up_git_annex_add_of_small_files.mdwn
new file mode 100644
index 000000000..1996e40cc
--- /dev/null
+++ b/doc/todo/speed_up_git_annex_add_of_small_files.mdwn
@@ -0,0 +1,16 @@
+When adding a lot of small files to git with `git annex add`,
+it is slow because git runs the smudge filter on all files
+and [[that_is_slow|todo/git_smudge_clean_interface_suboptiomal]].
+
+But `git-annex add --force-small` is much much faster, because that
+bypasses git add entirely, hashing the content and staging it in the index
+from git-annex. So could that same method be used to speed up the slow case?
+
+My concern with doing this is that there may be things that `git add`
+does that are not done when bypassing it. The only one I can think of is,
+if the user has other smudge/clean filters than the git-annex one
+installed, they would not be run either. It could be argued that's a bug
+with the existing `--force-small` too, but at least that's not the default.
+
+Possible alternate approach: Unsetting filter.annex.smudge and
+filter.annex.clean when running `git add`?

comment
diff --git a/doc/git-annex-metadata/comment_11_deee9a4c3a812c9c8097d89f2f6c7d76._comment b/doc/git-annex-metadata/comment_11_deee9a4c3a812c9c8097d89f2f6c7d76._comment
new file mode 100644
index 000000000..27042676c
--- /dev/null
+++ b/doc/git-annex-metadata/comment_11_deee9a4c3a812c9c8097d89f2f6c7d76._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 11"""
+ date="2021-01-04T16:06:02Z"
+ content="""
+@AlbertZeyer, this man page says it is "stored in the git-annex branch."
+
+That branch is synced whenever you push it, which git-annex sync does do,
+but git push can also be set up to do. The branch is automatically merged.
+"""]]

comment
diff --git a/doc/git-annex-move/comment_7_e23e2a133db02112ca99aeda0499e841._comment b/doc/git-annex-move/comment_7_e23e2a133db02112ca99aeda0499e841._comment
new file mode 100644
index 000000000..53295014e
--- /dev/null
+++ b/doc/git-annex-move/comment_7_e23e2a133db02112ca99aeda0499e841._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 7"""
+ date="2021-01-04T16:04:45Z"
+ content="""
+Use `git mv` if you want to rename an annexed file. That does not change
+the key, the old key will work fine despite the extension having changed.
+"""]]
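
A bulk rename along the lines asked about in this thread could be sketched as below. The helper names `jpg_name` and `unify_jpg_extensions` are hypothetical, and the loop only covers files in the current directory of the work tree.

```shell
#!/bin/sh
# Map a file name to the same name with a lowercase .jpg extension.
jpg_name() {
    printf '%s\n' "${1%.*}.jpg"
}

# Rename every .JPG/.jpeg file in the current directory with git mv.
unify_jpg_extensions() {
    for f in *.JPG *.jpeg; do
        # Skip unmatched glob patterns; -L also accepts annexed symlinks
        # whose content is not present locally (dangling links).
        [ -e "$f" ] || [ -L "$f" ] || continue
        git mv -- "$f" "$(jpg_name "$f")"
    done
}
```

As the comment above says, the key does not change; only the file name in the work tree does.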

comment
diff --git a/doc/forum/Reverse_index_key_to_list_of_file_paths/comment_1_acae766349a036040a03e75fa8ed34c6._comment b/doc/forum/Reverse_index_key_to_list_of_file_paths/comment_1_acae766349a036040a03e75fa8ed34c6._comment
new file mode 100644
index 000000000..dbcdb86a0
--- /dev/null
+++ b/doc/forum/Reverse_index_key_to_list_of_file_paths/comment_1_acae766349a036040a03e75fa8ed34c6._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2021-01-04T15:58:24Z"
+ content="""
+> How would I make sure that potential moves/renames will update the index?
+
+If I had a good answer to that question I would have built it already.
+
+I mean, a post-commit hook can notice changes after the fact, but noticing
+them when they've just been staged is harder.
+
+It does not make sense to store such an index in the git-annex branch,
+because it's redundant information to what's already stored in git trees.
+
+This is discussed in [[todo/cache_key_info]].
+"""]]

Added a comment
diff --git a/doc/forum/Import_existing_files/comment_4_9c7677838ad28d540a2a514d718f9f1d._comment b/doc/forum/Import_existing_files/comment_4_9c7677838ad28d540a2a514d718f9f1d._comment
new file mode 100644
index 000000000..b0836dd5d
--- /dev/null
+++ b/doc/forum/Import_existing_files/comment_4_9c7677838ad28d540a2a514d718f9f1d._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="comment 4"
+ date="2021-01-04T12:04:04Z"
+ content="""
+That is the best solution with `find`? There is no reverse index? I made a separate forum entry for this question [here](https://git-annex.branchable.com/forum/Reverse_index_key_to_list_of_file_paths/), to discuss that a bit more separately.
+
+Why exactly does `git annex sync` (or other ops) get slower on bigger repos? In principle it could be implemented in a way that it should not get slower (basically always avoiding any need to iterate through all objects, which should always be possible to avoid by having some indices for any operations which needs that).
+
+Does it make sense to split up the repo, but share the Git Annex object files (shared `.git/annex/objects`)?
+
+"""]]

Added a comment
diff --git a/doc/git-annex-metadata/comment_10_278fca1c579d0acdcce819449aec8eee._comment b/doc/git-annex-metadata/comment_10_278fca1c579d0acdcce819449aec8eee._comment
new file mode 100644
index 000000000..c644f06b3
--- /dev/null
+++ b/doc/git-annex-metadata/comment_10_278fca1c579d0acdcce819449aec8eee._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="comment 10"
+ date="2021-01-03T22:07:02Z"
+ content="""
+From this man page, it's not totally clear how/where the metadata is stored. Is it inside the Git repo (i.e. as regular file), or inside the Annex, or somewhere else? Is this information synced when you do `git push` (as part of Git), or via `git annex sync`? 
+
+How does it resolve any conflicts?
+
+Is the metadata itself under version control? (If it is in Git itself, then clearly yes, but that's not clear to me.)
+
+"""]]

diff --git a/doc/forum/Reverse_index_key_to_list_of_file_paths.mdwn b/doc/forum/Reverse_index_key_to_list_of_file_paths.mdwn
new file mode 100644
index 000000000..d6171bab6
--- /dev/null
+++ b/doc/forum/Reverse_index_key_to_list_of_file_paths.mdwn
@@ -0,0 +1,8 @@
+I understand from [here](https://git-annex.branchable.com/forum/Import_existing_files/#comment-29ece0290fa1314ca48caf8f435570d2) that there is no reverse index from a key to list of file paths pointing to that key (i.e. pointing to the value).
+
+`find . -lname '*<key>'` would be an extremely slow operation on a big repo as it would go through the whole repo. And this is probably a common operation I frequently want to do.
+
+What if I would want to build one? How would I make sure that potential moves/renames will update the index?
+
+I understand from [here](https://git-annex.branchable.com/git-annex-metadata/) that you can attach meta information to a key (via `git annex metadata`). This sounds as it would be useful to contain such reverse information, right?
+
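
One way to avoid repeating the `find . -lname '*<key>'` scan for every lookup is to walk the tree once and record each symlink's key next to its path. The sketch below is not a git-annex feature. It assumes locked annexed files are symlinks into `.git/annex/objects` whose final path component is the key, the function name `build_key_index` is hypothetical, and paths containing newlines are not handled.

```shell
#!/bin/sh
# Print "key<TAB>path" for every annex symlink under the given directory,
# sorted by key so that all paths sharing a key are adjacent.
build_key_index() {
    find "$1" -type l | while IFS= read -r path; do
        target=$(readlink "$path")
        case "$target" in
            *.git/annex/objects/*)
                # The key is the last component of the link target.
                printf '%s\t%s\n' "${target##*/}" "$path"
                ;;
        esac
    done | sort
}
```

The output of `build_key_index .` can be saved to a file and searched per key. It goes stale as soon as files are moved or renamed, which is the update problem raised in the post.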

Added a comment
diff --git a/doc/git-annex-move/comment_6_eb62a9020575d89799815f6e4b98b28c._comment b/doc/git-annex-move/comment_6_eb62a9020575d89799815f6e4b98b28c._comment
new file mode 100644
index 000000000..6eb032729
--- /dev/null
+++ b/doc/git-annex-move/comment_6_eb62a9020575d89799815f6e4b98b28c._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="comment 6"
+ date="2021-01-03T21:48:50Z"
+ content="""
+Renaming files is still somewhat unclear then. So I should just use `git mv` to rename files?
+What if the file extension changes? E.g. I have lots of pictures with uppercase `.JPG` and I might want to change them to `.jpg`. I might also have some files as `.jpeg` and to unify them, I might change them as well to `.jpg`. But as far as I understand, that would also change the default hash, as the default hash contains the extension? But doing `git mv` will take care about all of that via pre-commit hooks?
+
+"""]]

Added a comment
diff --git a/doc/forum/Adding_files_to_git__58___Very_long___34__recording_state_in_git__34___phase/comment_6_cf89b44d67752edabfbf577d1212e7ad._comment b/doc/forum/Adding_files_to_git__58___Very_long___34__recording_state_in_git__34___phase/comment_6_cf89b44d67752edabfbf577d1212e7ad._comment
new file mode 100644
index 000000000..ca1f3cebe
--- /dev/null
+++ b/doc/forum/Adding_files_to_git__58___Very_long___34__recording_state_in_git__34___phase/comment_6_cf89b44d67752edabfbf577d1212e7ad._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="comment 6"
+ date="2021-01-02T23:49:14Z"
+ content="""
+I'm having a very similar issue. Adding files is quite slow, and it hangs for several minutes in `(recording state in git...)` now (that started after adding quite a few files already), and the time seems to increase (I fear that it will soon be hours, not minutes, making it basically unusable...).
+
+I have not really configured anything (i.e. it should use all the defaults). I just did `git init` and `git annex init`, and then started to import files using `git annex import`. That's all.
+
+I don't really know about this smudge thing. Is that enabled by default? If that is causing problems, should I maybe disable it?
+
+"""]]

Added a comment
diff --git a/doc/git-annex-add/comment_4_3bbbe94633b6cb4ef93b7942eb36cc6c._comment b/doc/git-annex-add/comment_4_3bbbe94633b6cb4ef93b7942eb36cc6c._comment
new file mode 100644
index 000000000..f13b66319
--- /dev/null
+++ b/doc/git-annex-add/comment_4_3bbbe94633b6cb4ef93b7942eb36cc6c._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="comment 4"
+ date="2021-01-02T16:23:05Z"
+ content="""
+But is there a difference to `git annex import`? What is the difference? Why would you use `git annex add` instead of `git annex import`?
+
+"""]]

diff --git a/doc/bugs/git_annex_fix_broken.mdwn b/doc/bugs/git_annex_fix_broken.mdwn
new file mode 100644
index 000000000..642715c21
--- /dev/null
+++ b/doc/bugs/git_annex_fix_broken.mdwn
@@ -0,0 +1,17 @@
+### Please describe the problem.
+`git annex fix` doesn't fix up broken symlinks after moving a file.
+
+### What steps will reproduce the problem?
+
+    git init
+    git annex init
+    mkdir dir
+    touch dir/a
+    git annex add .
+    git annex sync
+    mv dir/a .
+    git annex fix a
+    ls -alh
+
+### What version of git-annex are you using? On what operating system?
+8.20201127 (I know I know... One year old version :)

Added a comment
diff --git a/doc/git-annex-move/comment_5_e7608b7fac4ec80781cb4281dc2bf596._comment b/doc/git-annex-move/comment_5_e7608b7fac4ec80781cb4281dc2bf596._comment
new file mode 100644
index 000000000..2395e2857
--- /dev/null
+++ b/doc/git-annex-move/comment_5_e7608b7fac4ec80781cb4281dc2bf596._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="Lukey"
+ avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
+ subject="comment 5"
+ date="2021-01-02T15:22:18Z"
+ content="""
+`git annex move` belongs to the same class of commands as `git annex get`, `git annex drop` and `git annex copy` in that it manages file content. Git annex automatically registers a pre-commit hook to fix up symlinks. `git annex fix` can also be used to fix up symlinks, but it is currently broken.
+"""]]

Added a comment
diff --git a/doc/git-annex-reinject/comment_3_2dcdd82efbd6dcac0f3b729d55a09386._comment b/doc/git-annex-reinject/comment_3_2dcdd82efbd6dcac0f3b729d55a09386._comment
new file mode 100644
index 000000000..aa9f7b4aa
--- /dev/null
+++ b/doc/git-annex-reinject/comment_3_2dcdd82efbd6dcac0f3b729d55a09386._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="Lukey"
+ avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
+ subject="comment 3"
+ date="2021-01-02T15:16:04Z"
+ content="""
+The difference of `git annex reinject` to (`git annex import` or `cp/mv; git annex add`) is that only known file contents will be reinjected.
+"""]]

Added a comment
diff --git a/doc/git-annex-add/comment_3_8517f9634d217f731efd704405d3f2ca._comment b/doc/git-annex-add/comment_3_8517f9634d217f731efd704405d3f2ca._comment
new file mode 100644
index 000000000..1cc51ec20
--- /dev/null
+++ b/doc/git-annex-add/comment_3_8517f9634d217f731efd704405d3f2ca._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="Lukey"
+ avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
+ subject="comment 3"
+ date="2021-01-02T15:12:49Z"
+ content="""
+The various configuration options are documented in the main [[git-annex]] manpage, at the bottom.
+
+If it is a one-shot, just use `cp/mv` and `git annex add`. If you want to frequently import from that location, use directory special-remotes with importtree=yes.
+"""]]

Added a comment
diff --git a/doc/forum/Import_existing_files/comment_3_69188f669e6fe5ca1a8c34c3dc3ec201._comment b/doc/forum/Import_existing_files/comment_3_69188f669e6fe5ca1a8c34c3dc3ec201._comment
new file mode 100644
index 000000000..aed6dac42
--- /dev/null
+++ b/doc/forum/Import_existing_files/comment_3_69188f669e6fe5ca1a8c34c3dc3ec201._comment
@@ -0,0 +1,20 @@
+[[!comment format=mdwn
+ username="Lukey"
+ avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
+ subject="comment 3"
+ date="2021-01-02T15:05:01Z"
+ content="""
+You can of course just use `~/Pictures` directly as a repository. So `cd ~/Pictures; git init; git annex init`.
+
+`git annex sync` does a little more things than just `git commit`. For example, it also automatically commits deletion of files.
+
+Sorry, I thought the existing copies of your photos were just backups of your `~/Pictures`. In that case I suggest you `mv` the files into the annex and then just `git annex add` them. For DVDs, import to a sub-directory of your master branch instead of a dummy branch, and without the `--no-content` option.
+
+\"Too many files\" depends on your liking. The more files there are, the slower some operations get, like `git annex sync`. I suggest you set something like `git annex config --set annex.largefiles 'largerthan=32kb'`. This way, small files get added to git itself instead of git-annex, which speeds up git-annex operations if there are a lot of small files. Note that these small files will be in every clone of the repo and can't be `git annex drop`ed.
+
+The various configuration options are documented in the main [[git-annex]] manpage, at the bottom. Without the `annex.dotfiles` option, dotfiles (any file starting with \".\" and anything inside directories starting with \".\") will still be added, but to git itself with the disadvantages mentioned above.
+
+You can get the key/hash for that file with `git annex info <file>`, and then search for other files with the same content with `find . -lname '*<key>'`.
+
+You can just `cp/mv` the files in the annex and `git annex add` them. Note that for duplicate files in the annex, only one copy of the data/file content will be stored.
+"""]]

Added a comment
diff --git a/doc/forum/Import_existing_files/comment_2_362ce3b030970db82c8dd0d98791186b._comment b/doc/forum/Import_existing_files/comment_2_362ce3b030970db82c8dd0d98791186b._comment
new file mode 100644
index 000000000..45887ce22
--- /dev/null
+++ b/doc/forum/Import_existing_files/comment_2_362ce3b030970db82c8dd0d98791186b._comment
@@ -0,0 +1,35 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="comment 2"
+ date="2021-01-01T22:30:34Z"
+ content="""
+Hi, thanks for the answer.
+
+What if I would want to leave `~/Pictures` as-is, and not move it, nor change it? I would prefer that. I just want to add its content to a Git Annex repo, and easily sync future changes as well to the repo (e.g. after I added more files, or renamed some files, or updated some files).
+
+Why `git annex sync` and not `git commit`? I always did only `git commit` so far.
+
+Why `git annex reinject` and not `git annex import` or `cp|mv` & `git annex add`?
+Also, why would I not add files which were/are not part of the original `~/Pictures`? The original `~/Pictures` would not have contained all of the pictures, as they are somewhat distributed. So I want to add unknown files as well.
+
+Why would I import the DVD to a dummy branch? I would want it all in my master/main branch, or not? (I also don't quite understand why I would want branches at all?)
+I also potentially want to `git annex get` such a file at some point.
+
+What are \"too many files\" for a single repo? And why is that a problem?
+I am just adding a Google Takeout archive to Git Annex ([via](https://github.com/albertz/chrome-ext-google-takeout-downloader/)), and it will contain also many of the files of `~/Pictures` (although not all; and sometimes, but not always, in smaller quality, but often also in original quality), but also many other files. So it's already pretty mixed up.
+Or does it make sense to just share the Annex object storage (`.git/annex/objects`) in multiple repos?
+Or do you mean that as the intended use case for branches actually?
+
+What dotfiles does `annex.dotfiles` include? Just all `.*`?
+Why would I not want to add dotfiles? I think I would want to just archive the whole directory as-is.
+
+Also, after reading a bit further, and trying it out a bit, I don't quite understand:
+
+Given some file path (e.g. `Picture/BestPics2020/a.jpg`), how can I find other paths of the same file? (E.g. I would also have the file stored under `Picture/2020/01/a.jpg` or so.) Is that with `git annex list`? I'm not sure this lists all paths. So far I only see a single path always.
+
+I'm not really sure how to use `git annex import` properly, in case the file is already annexed under a different path. In any case, I also want to add the new path (new name).
+
+Sorry for the many follow-up questions, but this is still all somewhat unclear to me.
+
+"""]]

Added a comment: Difference to import/add?
diff --git a/doc/git-annex-reinject/comment_2_d1a04e31fea877ae5fe873fbd01fdcaa._comment b/doc/git-annex-reinject/comment_2_d1a04e31fea877ae5fe873fbd01fdcaa._comment
new file mode 100644
index 000000000..1a65fc6e0
--- /dev/null
+++ b/doc/git-annex-reinject/comment_2_d1a04e31fea877ae5fe873fbd01fdcaa._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="Difference to import/add?"
+ date="2021-01-01T21:49:29Z"
+ content="""
+Considering `git annex reinject /tmp/foo.iso foo.iso`, what is the difference to `git import /tmp/foo.iso` or `cp /tmp/foo.iso; git annex add foo.iso`?
+"""]]

Added a comment: rename or move files
diff --git a/doc/git-annex-move/comment_4_013e20add9a007d3f9a9c2a2ceb6cb06._comment b/doc/git-annex-move/comment_4_013e20add9a007d3f9a9c2a2ceb6cb06._comment
new file mode 100644
index 000000000..8b9f2effc
--- /dev/null
+++ b/doc/git-annex-move/comment_4_013e20add9a007d3f9a9c2a2ceb6cb06._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="rename or move files"
+ date="2021-01-01T21:33:41Z"
+ content="""
+Is this command also for renaming or moving files, like `git mv`?
+
+If not, I think this should be explained more clearly in the documentation.
+
+If not, how would I move/rename files then? As I understand, annexed files are just symlinks. So if I would move the file to another directory (e.g. via `git mv` or just `mv`), the symlink might break.
+
+"""]]

Added a comment: Adding external files
diff --git a/doc/git-annex-add/comment_2_43cf725964c63a2d2545d9f204316a57._comment b/doc/git-annex-add/comment_2_43cf725964c63a2d2545d9f204316a57._comment
new file mode 100644
index 000000000..2b390d980
--- /dev/null
+++ b/doc/git-annex-add/comment_2_43cf725964c63a2d2545d9f204316a57._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="Adding external files"
+ date="2021-01-01T21:30:38Z"
+ content="""
+Let's assume I have some external files in my `~/Pictures` and I want to import them.
+
+Should I use `git annex import ~/Pictures/BestPics2020` or `cp -r ~/Pictures/BestPics2020 .; git annex add BestPics2020`? Is there a difference? Which way would be recommended or preferred?
+
+"""]]

Added a comment: annex.largefiles
diff --git a/doc/git-annex-add/comment_1_0e6b855afb4fba540ea5560df26839c5._comment b/doc/git-annex-add/comment_1_0e6b855afb4fba540ea5560df26839c5._comment
new file mode 100644
index 000000000..d233d96d1
--- /dev/null
+++ b/doc/git-annex-add/comment_1_0e6b855afb4fba540ea5560df26839c5._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="AlbertZeyer"
+ avatar="http://cdn.libravatar.org/avatar/b37d71961a6a5abf9b7184ed77b5a941"
+ subject="annex.largefiles"
+ date="2021-01-01T21:25:43Z"
+ content="""
+Does annex.largefiles have some documentation? It would be nice to link to that on the doc of git-annex-add.
+
+Esp, after reading this, I wonder about the default value of annex.largefiles. (I assume/hope it is disabled?)
+
+"""]]

diff --git a/doc/bugs/git_annex_fsck_--time-limit_broken.mdwn b/doc/bugs/git_annex_fsck_--time-limit_broken.mdwn
new file mode 100644
index 000000000..93c20bba0
--- /dev/null
+++ b/doc/bugs/git_annex_fsck_--time-limit_broken.mdwn
@@ -0,0 +1,13 @@
+### Please describe the problem.
+`git annex fsck --time-limit=` is broken. <br>
+For one, there is a large delay between the specified time limit until something actually happens. With 20 seconds, `git annex fsck` always runs more than 5 minutes. And then something of the following happens: <br>
+Sometimes it works as intended. <br>
+Sometimes it prints "Time limit (20s) reached!" but hangs without exiting. <br>
+Sometimes it prints "Time limit (20s) reached!" but continues fscking. <br>
+
+### What steps will reproduce the problem?
+In a sufficiently large repo run `git annex fsck --time-limit=20s`.
+
+### What version of git-annex are you using? On what operating system?
+8.20201127
+

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_11_ad9858a76d5a66b2619cec3e3c17abd2._comment b/doc/forum/how_to_get_into_git_annex.../comment_11_ad9858a76d5a66b2619cec3e3c17abd2._comment
new file mode 100644
index 000000000..f6c587003
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_11_ad9858a76d5a66b2619cec3e3c17abd2._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 11"
+ date="2021-01-01T04:08:43Z"
+ content="""
+couple of final notes:
+
+* ```--reflog=always``` isn't a cp option, its reflink, and I am a moron. 
+* that same options on btrfs is the bomb. all of the advantages of hardlinks without the disadvantages. 
+"""]]

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_10_42c2754ddb58ab47a442909f9be494a7._comment b/doc/forum/how_to_get_into_git_annex.../comment_10_42c2754ddb58ab47a442909f9be494a7._comment
new file mode 100644
index 000000000..62a6be917
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_10_42c2754ddb58ab47a442909f9be494a7._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 10"
+ date="2020-12-31T22:45:03Z"
+ content="""
+You will be unsurprised to hear that what you suggested worked. not sure what helped other than me cleaning up my working tree and doing a solid git annex add .; git annex sync. I also removed annex.thin since its evidently not helping me. 
+thanks a ton. what got me here was me basically running through the \"splitting a repo\" process of making a new git repo, doing a cp -rl ./.git/annex/objects to the new repo and then running various tests on it. I just want to make sure I don't step on my own feet here. 
+
+thanks a ton. 
+"""]]

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_9_9ccb674318f87a63849e654f789722cf._comment b/doc/forum/how_to_get_into_git_annex.../comment_9_9ccb674318f87a63849e654f789722cf._comment
new file mode 100644
index 000000000..7c70ffb8a
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_9_9ccb674318f87a63849e654f789722cf._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 9"
+ date="2020-12-31T22:05:19Z"
+ content="""
+I'll chew on the rest of your response, I was bent on hardlinks because I haven't messed with btrfs reflog COW thing much, but its likely clearly the way to go here, so all of my consternation with hardlinks is likely getting me nowhere. I am just always at 90% full and so I don't want to do anything that is going to run me out of space in the middle of an expensive operation. 
+
+anyways, thanks. I guess I just wish I had only put big files into my annex at first, though I would never have known how badly it fails at scale (on my hardware, etc.)
+"""]]

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_8_48bddaee71ad5f246199d66be12709d3._comment b/doc/forum/how_to_get_into_git_annex.../comment_8_48bddaee71ad5f246199d66be12709d3._comment
new file mode 100644
index 000000000..8288e6379
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_8_48bddaee71ad5f246199d66be12709d3._comment
@@ -0,0 +1,19 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 8"
+ date="2020-12-31T22:00:07Z"
+ content="""
+thanks for responding...
+
+I used git annex unannex because I tried using git annex uninit and it DELETED my entire multi TB ./.git/annex/objects, even though I only had a handful of symlinks in that repo. I wanted to find another way to unannex files that wouldn't delete my technically \"unused\" data. 
+
+and git annex unannex was what I tried when git annex unlock would not hardlink the files via annex.thin=true. it was only with toying with the 2 commands and finally --fast that I was able to get it to hardlink the files
+
+my end goal was to be able to remove my data reliably from git annex entirely without it purging the object store. 
+
+and now as I read about hardlinks=true or whatever I see that git annex doesn't really love to hardlink multiple files past 2 because then multiple, independent files being modified would corrupt the object store. 
+
+I just want this thing to be reliable at scale. I put all my data into it but the speed is killing me, so I want to be able to get it out or split off data types to secondary git annexes, while having some idea of what it's doing under the covers so I don't get surprised. 
+"""]]

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_7_96147f452229d09d4fa8ed4d606341aa._comment b/doc/forum/how_to_get_into_git_annex.../comment_7_96147f452229d09d4fa8ed4d606341aa._comment
new file mode 100644
index 000000000..f266af8de
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_7_96147f452229d09d4fa8ed4d606341aa._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="Lukey"
+ avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
+ subject="comment 7"
+ date="2020-12-31T21:46:33Z"
+ content="""
+Hmm, you seem to have mixed a lot of things up here: <br>
+1. You are not supposed to use `git annex unannex` to unlock a file. Just pretend this command doesn't exist for now and use `git annex unlock` instead. In general, look at the manpages of the commands. For example `man git-annex-unannex`. <br>
+2. Before doing anything further, clean up your repository from the mistake above. First, add all unannexed files back to the annex with `git annex add .` (from the root of your repo) and then commit everything with `git annex sync`. `git status` should now output `nothing to commit, working tree clean`. <br>
+3. After setting `git config annex.thin true` you are supposed to run `git annex fix`. That's exactly what the link you gave says. But as you are using btrfs, I suggest not using hard links, as git annex makes use of reflinks already. <br>
+4. Now that you have a clean worktree, try `git annex unused` again. If it still doesn't work post the full output of `git annex unused` here.
+"""]]
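A minimal sketch of the cleanup sequence from the comment above, assuming a POSIX shell. The `run` and `annex_cleanup` names and the `DRY_RUN` variable are hypothetical helpers made up for illustration; only the `git` and `git annex` invocations come from the comment itself. The hard link step (`annex.thin` plus `git annex fix`) is left out, following the advice to rely on reflinks on btrfs.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the steps listed in the comment.
# Run it from the root of the repository.
annex_cleanup() {
    run git annex add . &&   # step 2: add unannexed files back to the annex
    run git annex sync &&    # step 2: commit everything
    run git status &&        # should now report a clean working tree
    run git annex unused     # step 4: list unused keys again
}
```

Calling `DRY_RUN=1 annex_cleanup` only prints the four commands, which may be a safer first pass on a nearly full disk than running them directly.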

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_6_142a83bd4f5b1729635bf891f75bd94b._comment b/doc/forum/how_to_get_into_git_annex.../comment_6_142a83bd4f5b1729635bf891f75bd94b._comment
new file mode 100644
index 000000000..57417bb28
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_6_142a83bd4f5b1729635bf891f75bd94b._comment
@@ -0,0 +1,21 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 6"
+ date="2020-12-31T20:51:12Z"
+ content="""
+after (re)reading the following:
+
+https://git-annex.branchable.com/forum/switching_backends/
+
+https://git-annex.branchable.com/bugs/migrated_files_not_showing_up_in_unused_list/
+
+I confirmed again that git annex sync was re-run. there are no remotes, so that isn't a thing here. I checked out each git branch and did a
+
+```find ./???/ -lname '*c0ade___this_is_a_long_hash___566fd3*'```
+
+and nothing in any branch points to this old backend key.
+
+so I am both stymied and befuddled... any tips are appreciated. 
+"""]]
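The symlink search in the comment above can be wrapped in a function so it is easy to repeat after each branch checkout. This is a sketch assuming GNU `find`, since `-lname` is a GNU extension; the `links_to_key` name is hypothetical and not part of git-annex.

```shell
# Hypothetical helper: list symlinks under a directory ($1) whose link target
# contains the given key fragment ($2). The fragment is used inside a glob,
# so any glob characters in it are interpreted as patterns.
links_to_key() {
    find "$1" -type l -lname "*$2*"
}
```

A call such as `links_to_key . "$old_key"`, with `old_key` holding the key from the old backend, prints nothing when no symlink in the checked-out tree points at that key.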

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_5_05f1010f442de6e6e34c1f8ea5c250c7._comment b/doc/forum/how_to_get_into_git_annex.../comment_5_05f1010f442de6e6e34c1f8ea5c250c7._comment
new file mode 100644
index 000000000..53e42fa5e
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_5_05f1010f442de6e6e34c1f8ea5c250c7._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 5"
+ date="2020-12-31T20:34:55Z"
+ content="""
+I am digging into this further, and it looks like git annex uses cp --reflink=auto, confirmed with filefrag -v, but even if the object from the old backend isn't taking up space, it's still frustrating that I can't figure out why git annex is keeping old files around and not reporting them via git annex unused.
+"""]]
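If the old object was reflinked rather than hardlinked, as the comment above suggests, that would be consistent with it reporting a single hard link: a copy made with `cp --reflink=auto` is a separate inode, even when it shares extents with the original. The sketch below assumes GNU coreutils (`cp --reflink`, `stat -c`); the `inode_and_links` name and the file names are made up for illustration.

```shell
# Hypothetical helper: print "<inode> <hard link count>" for a file (GNU stat).
inode_and_links() {
    stat -c '%i %h' "$1"
}

demo_dir=$(mktemp -d)
printf 'example content\n' > "$demo_dir/original"

# Reflink where the filesystem supports it, plain copy otherwise.
cp --reflink=auto "$demo_dir/original" "$demo_dir/reflinked"
# A hard link, for comparison.
ln "$demo_dir/original" "$demo_dir/hardlinked"

inode_and_links "$demo_dir/original"    # link count 2, inode shared with hardlinked
inode_and_links "$demo_dir/reflinked"   # link count 1, its own inode
inode_and_links "$demo_dir/hardlinked"  # same inode and count as original
```

The link count therefore cannot show whether two objects share storage; on btrfs, comparing the extents with `filefrag -v`, as the comment did, is the check that can.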

removed
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_4_9a96e13da383c9b61145dd7ec16421c9._comment b/doc/forum/how_to_get_into_git_annex.../comment_4_9a96e13da383c9b61145dd7ec16421c9._comment
deleted file mode 100644
index ea8f2a036..000000000
--- a/doc/forum/how_to_get_into_git_annex.../comment_4_9a96e13da383c9b61145dd7ec16421c9._comment
+++ /dev/null
@@ -1,15 +0,0 @@
-[[!comment format=mdwn
- username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
- nickname="eric.w"
- avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
- subject="comment 4"
- date="2020-12-31T19:40:00Z"
- content="""
-https://git-annex.branchable.com/bugs/migrated_files_not_showing_up_in_unused_list/
-
-according to the link above it should be hardlinked to the new key for the new backend, but this isn't the case. this is on btrfs btw. 
-this is a test repo with no remotes as another data point. 
-also I migrated from SHA256E to SHA256. 
-
-I tried git annex forget; git annex sync; git annex unused, still it isn't showing the objects as unused.
-"""]]

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_5_6ed2e755fb7f738d75468f1c7c4d5555._comment b/doc/forum/how_to_get_into_git_annex.../comment_5_6ed2e755fb7f738d75468f1c7c4d5555._comment
new file mode 100644
index 000000000..4f14ac711
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_5_6ed2e755fb7f738d75468f1c7c4d5555._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 5"
+ date="2020-12-31T19:40:22Z"
+ content="""
+https://git-annex.branchable.com/bugs/migrated_files_not_showing_up_in_unused_list/
+
+according to the link above it should be hardlinked to the new key for the new backend, but this isn't the case. this is on btrfs btw. 
+this is a test repo with no remotes as another data point. 
+also I migrated from SHA256E to SHA256. 
+
+I tried git annex forget --force; git annex sync; git annex unused, still it isn't showing the objects as unused.
+"""]]

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_4_9a96e13da383c9b61145dd7ec16421c9._comment b/doc/forum/how_to_get_into_git_annex.../comment_4_9a96e13da383c9b61145dd7ec16421c9._comment
new file mode 100644
index 000000000..ea8f2a036
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_4_9a96e13da383c9b61145dd7ec16421c9._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 4"
+ date="2020-12-31T19:40:00Z"
+ content="""
+https://git-annex.branchable.com/bugs/migrated_files_not_showing_up_in_unused_list/
+
+according to the link above it should be hardlinked to the new key for the new backend, but this isn't the case. this is on btrfs btw. 
+this is a test repo with no remotes as another data point. 
+also I migrated from SHA256E to SHA256. 
+
+I tried git annex forget; git annex sync; git annex unused, still it isn't showing the objects as unused.
+"""]]

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_3_20d36c31731303291e4e43af1241692f._comment b/doc/forum/how_to_get_into_git_annex.../comment_3_20d36c31731303291e4e43af1241692f._comment
new file mode 100644
index 000000000..711ba5947
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_3_20d36c31731303291e4e43af1241692f._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 3"
+ date="2020-12-31T19:34:56Z"
+ content="""
+right now I am driving myself crazy trying to understand why I have objects that *nothing is pointing to*, yet git annex unused fails to report them. these objects report 1 hardlink and they are from a migrated backend. I'll try git annex forget, but I really don't understand what is keeping these objects from being reported as unused. 
+"""]]

Added a comment
diff --git a/doc/forum/how_to_get_into_git_annex.../comment_2_cde005ffa911f72e6d37ec8ad9bf76f8._comment b/doc/forum/how_to_get_into_git_annex.../comment_2_cde005ffa911f72e6d37ec8ad9bf76f8._comment
new file mode 100644
index 000000000..89470ab02
--- /dev/null
+++ b/doc/forum/how_to_get_into_git_annex.../comment_2_cde005ffa911f72e6d37ec8ad9bf76f8._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="eric.w@eee65cd362d995ced72640c7cfae388ae93a4234"
+ nickname="eric.w"
+ avatar="http://cdn.libravatar.org/avatar/8d9808c12db3a3f93ff7f9e74c0870fc"
+ subject="comment 2"
+ date="2020-12-31T18:55:29Z"
+ content="""
+the most recent example I've run across is the use of
+annex.thin=true
+in the link here: https://git-annex.branchable.com/tips/unlocked_files/
+it didn't result in a hardlink being made of the content for either git annex unlock or git annex unannex.
+instead I ended up getting the same functionality by using --fast.
+
+"""]]

diff --git a/doc/forum/how_to_get_into_git_annex....mdwn b/doc/forum/how_to_get_into_git_annex....mdwn
index 290428900..84660c6aa 100644
--- a/doc/forum/how_to_get_into_git_annex....mdwn
+++ b/doc/forum/how_to_get_into_git_annex....mdwn
@@ -1,9 +1,9 @@
 So... I've been flirting with using git annex for literal years now, and if for some reason you are wanting to use it too here are some tips:
 
-1) keep backups. seriously. just do it. it's possible to lose data, even though git annex is designed to avoid eating your data it will do it under certain circumstances. you aren't lucky enough to avoid it. trust me. 
-2) make a big fat git annex with too many files in it, and kick the tires, hard. run all the commands and try to break it, see what it does under certain circumstances before you run those same commands on your beloved data. (the documentation isn't always up to date, sometimes the options (which are complex) operate differently than the website says and differently than you expect, this is most likely due to code changes that haven't propagated to the website.
-3) git annex bogs down fast when you are dealing with a large number of objects, there are ways to get that under control, but nothing is going to make managing an annex with millions of files "fast" for many operations.
-4) now that you are a pro at git annex, STILL *keep* backups. git annex isn't a backup. it just isn't. nothing beats a simple usb hard drive stuffed in your safe with all your files on it and without the complexity that is git annex in the way.
+* keep backups. seriously. just do it. it's possible to lose data: even though git annex is designed to avoid eating your data, it will do so under certain circumstances. you aren't lucky enough to avoid it. trust me.
+* make a big fat git annex with too many files in it, and kick the tires, hard. run all the commands and try to break it, see what it does under certain circumstances before you run those same commands on your beloved data. (the documentation isn't always up to date; sometimes the options (which are complex) operate differently than the website says and differently than you expect, most likely due to code changes that haven't propagated to the website.)
+* git annex bogs down fast when you are dealing with a large number of objects, there are ways to get that under control, but nothing is going to make managing an annex with millions of files "fast" for many operations.
+* now that you are a pro at git annex, STILL *keep* backups. git annex isn't a backup. it just isn't. nothing beats a simple usb hard drive stuffed in your safe with all your files on it and without the complexity that is git annex in the way.