Recent changes to this wiki:

devblog
diff --git a/doc/devblog/day_510__v6_get_drop_index.mdwn b/doc/devblog/day_510__v6_get_drop_index.mdwn
new file mode 100644
index 000000000..9cd7e805b
--- /dev/null
+++ b/doc/devblog/day_510__v6_get_drop_index.mdwn
@@ -0,0 +1,14 @@
+I've now fixed the worst problem with v6 mode, which was that get/drop of
+unlocked files would cause git to think that the files were modified.
+
+Since the clean filter now runs quite fast, I was able to fix that by,
+after git-annex updates the worktree, restaging the not-really-modified 
+file in the index.
+
+This approach is not optimal: index file updates have overhead, and only
+one process can update the index file at a time. [[todo/smudge]] has a
+bunch of new todo items for cases where this change causes problems. Still,
+it seems a lot better than the old behavior, which made v6 mode nearly
+unusable IMHO.
+
+This work is supported by the NSF-funded DataLad project.

finally fixed v6 get/drop git status
After updating the worktree for an add/drop, update git's index, so git
status will not show the files as modified.
What actually happens is that the index update removes the inode
information from the index. The next git status (or similar) run
then has to do some work. It runs the clean filter.
So, this depends on the clean filter being reasonably fast and on git
not leaking memory when running it. Both problems were fixed in
a96972015dd76271b46432151e15d5d38d7151ff, but only for git 2.5. Anyone
using an older git will see very expensive git status after an add/drop.
This uses the same git update-index queue as other parts of git-annex, so
the actual index update is fairly efficient. Of course, updating the index
does still have some overhead. The annex.queuesize config will control how
often the index gets updated when working on a lot of files.
This is an imperfect workaround... Added several todos about new
problems this workaround causes. Still, this seems a lot better than the
old behavior.
This commit was supported by the NSF-funded DataLad project.
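
To make the index update described above concrete, here is a minimal sketch of building one input line for `git update-index --index-info`. It assumes the `<mode> SP <sha1> TAB <path>` form from git's update-index documentation. `IndexEntry` and `indexInfoLine` are hypothetical names used only for illustration; this is not git-annex's `stagePointerFile`, and the sha is taken as given rather than computed.

```haskell
import Numeric (showOct)

-- | Hypothetical record for this sketch; not a git-annex type.
data IndexEntry = IndexEntry
  { entryMode :: Int      -- file mode, e.g. 0o100644
  , entrySha  :: String   -- 40 hex digit blob sha of the pointer file
  , entryPath :: FilePath -- path relative to the top of the repository
  }

-- | Render one line in the "<mode> SP <sha1> TAB <path>" form that
-- `git update-index --index-info` reads on stdin (assumed format).
indexInfoLine :: IndexEntry -> String
indexInfoLine e =
  showOct (entryMode e) "" ++ " " ++ entrySha e ++ "\t" ++ entryPath e
```

A line built this way carries only the mode, sha, and path, with no stat data, which is presumably why the next git status has to run the clean filter to check the file.
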
diff --git a/Annex/Content.hs b/Annex/Content.hs
index 8011a8230..76a5454d3 100644
--- a/Annex/Content.hs
+++ b/Annex/Content.hs
@@ -24,6 +24,7 @@ module Annex.Content (
 	checkDiskSpace,
 	needMoreDiskSpace,
 	moveAnnex,
+	Restage(..),
 	populatePointerFile,
 	linkToAnnex,
 	linkFromAnnex,
@@ -545,7 +546,7 @@ moveAnnex key src = ifM (checkSecureHashes key)
 			fs <- map (`fromTopFilePath` g)
 				<$> Database.Keys.getAssociatedFiles key
 			unless (null fs) $ do
-				mapM_ (populatePointerFile key dest) fs
+				mapM_ (populatePointerFile (Restage True) key dest) fs
 				Database.Keys.storeInodeCaches key (dest:fs)
 		)
 	storeindirect = storeobject =<< calcRepo (gitAnnexLocation key)
@@ -586,14 +587,23 @@ checkSecureHashes key
 		, return True
 		)
 
-populatePointerFile :: Key -> FilePath -> FilePath -> Annex ()
-populatePointerFile k obj f = go =<< liftIO (isPointerFile f)
+newtype Restage = Restage Bool
+
+{- Populates a pointer file with the content of a key. -}
+populatePointerFile :: Restage -> Key -> FilePath -> FilePath -> Annex ()
+populatePointerFile (Restage restage) k obj f = go =<< liftIO (isPointerFile f)
   where
 	go (Just k') | k == k' = do
 		destmode <- liftIO $ catchMaybeIO $ fileMode <$> getFileStatus f
 		liftIO $ nukeFile f
 		ifM (linkOrCopy k obj f destmode)
-			( thawContent f
+			( do
+				thawContent f
+	 			-- The pointer file is re-staged,
+				-- so git won't think it's been modified.
+				when restage $ do
+					pointersha <- hashPointerFile k
+					stagePointerFile f destmode pointersha
 			, liftIO $ writePointerFile f k destmode
 			)
 	go _ = return ()
@@ -816,12 +826,6 @@ cleanObjectLoc key cleaner = do
 			<=< catchMaybeIO $ removeDirectory dir
 
 {- Removes a key's file from .git/annex/objects/
- -
- - When a key has associated pointer files, they are checked for
- - modifications, and if unmodified, are reset.
- -
- - In direct mode, deletes the associated files or files, and replaces
- - them with symlinks.
  -}
 removeAnnex :: ContentRemovalLock -> Annex ()
 removeAnnex (ContentRemovalLock key) = withObjectLoc key remove removedirect
@@ -834,22 +838,33 @@ removeAnnex (ContentRemovalLock key) = withObjectLoc key remove removedirect
 			=<< Database.Keys.getAssociatedFiles key
 		Database.Keys.removeInodeCaches key
 		Direct.removeInodeCache key
+ 
+	-- Check associated pointer file for modifications, and reset if
+	-- it's unmodified.
 	resetpointer file = ifM (isUnmodified key file)
 		( do
 			mode <- liftIO $ catchMaybeIO $ fileMode <$> getFileStatus file
 			secureErase file
 			liftIO $ nukeFile file
 			liftIO $ writePointerFile file key mode
-		-- Can't delete the pointer file.
+			-- Re-stage the pointer, so git won't think it's
+			-- been modified.
+			pointersha <- hashPointerFile key
+			stagePointerFile file mode pointersha
+		-- Modified file, so leave it alone.
 		-- If it was a hard link to the annex object,
 		-- that object might have been frozen as part of the
 		-- removal process, so thaw it.
 		, void $ tryIO $ thawContent file
 		)
+ 
+	-- In direct mode, deletes the associated files or files, and replaces
+	-- them with symlinks.
 	removedirect fs = do
 		cache <- Direct.recordedInodeCache key
 		Direct.removeInodeCache key
 		mapM_ (resetfile cache) fs
+	
 	resetfile cache f = whenM (Direct.sameInodeCache f cache) $ do
 		l <- calcRepo $ gitAnnexLink f key
 		secureErase f
diff --git a/Annex/Ingest.hs b/Annex/Ingest.hs
index 1bc081560..aa556b371 100644
--- a/Annex/Ingest.hs
+++ b/Annex/Ingest.hs
@@ -140,14 +140,13 @@ ingestAdd' ld@(Just (LockedDown cfg source)) mk = do
 			return (Just k)
 
 {- Ingests a locked down file into the annex. Does not update the working
- - tree or the index.
- -}
+ - tree or the index. -}
 ingest :: Maybe LockedDown -> Maybe Key -> Annex (Maybe Key, Maybe InodeCache)
-ingest = ingest' Nothing
+ingest ld mk = ingest' Nothing ld mk (Restage True)
 
-ingest' :: Maybe Backend -> Maybe LockedDown -> Maybe Key -> Annex (Maybe Key, Maybe InodeCache)
-ingest' _ Nothing _ = return (Nothing, Nothing)
-ingest' preferredbackend (Just (LockedDown cfg source)) mk = withTSDelta $ \delta -> do
+ingest' :: Maybe Backend -> Maybe LockedDown -> Maybe Key -> Restage -> Annex (Maybe Key, Maybe InodeCache)
+ingest' _ Nothing _ _ = return (Nothing, Nothing)
+ingest' preferredbackend (Just (LockedDown cfg source)) mk restage = withTSDelta $ \delta -> do
 	k <- case mk of
 		Nothing -> do
 			backend <- maybe (chooseBackend $ keyFilename source) (return . Just) preferredbackend
@@ -172,7 +171,7 @@ ingest' preferredbackend (Just (LockedDown cfg source)) mk = withTSDelta $ \delt
 	golocked key mcache s =
 		tryNonAsync (moveAnnex key $ contentLocation source) >>= \case
 			Right True -> do
-				populateAssociatedFiles key source
+				populateAssociatedFiles key source restage
 				success key mcache s		
 			Right False -> giveup "failed to add content to annex"
 			Left e -> restoreFile (keyFilename source) key e
@@ -186,7 +185,7 @@ ingest' preferredbackend (Just (LockedDown cfg source)) mk = withTSDelta $ \delt
 		linkToAnnex key (keyFilename source) (Just cache) >>= \case
 			LinkAnnexFailed -> failure "failed to link to annex"
 			_ -> do
-				finishIngestUnlocked' key source
+				finishIngestUnlocked' key source restage
 				success key (Just cache) s
 	gounlocked _ _ _ = failure "failed statting file"
 
@@ -218,23 +217,23 @@ finishIngestDirect key source = do
 finishIngestUnlocked :: Key -> KeySource -> Annex ()
 finishIngestUnlocked key source = do
 	cleanCruft source
-	finishIngestUnlocked' key source
+	finishIngestUnlocked' key source (Restage True)
 
-finishIngestUnlocked' :: Key -> KeySource -> Annex ()
-finishIngestUnlocked' key source = do
+finishIngestUnlocked' :: Key -> KeySource -> Restage -> Annex ()
+finishIngestUnlocked' key source restage = do
 	Database.Keys.addAssociatedFile key =<< inRepo (toTopFilePath (keyFilename source))
-	populateAssociatedFiles key source
+	populateAssociatedFiles key source restage
 
 {- Copy to any other locations using the same key. -}
-populateAssociatedFiles :: Key -> KeySource -> Annex ()
-populateAssociatedFiles key source = do
+populateAssociatedFiles :: Key -> KeySource -> Restage -> Annex ()
+populateAssociatedFiles key source restage = do
 	obj <- calcRepo (gitAnnexLocation key)
 	g <- Annex.gitRepo
 	ingestedf <- flip fromTopFilePath g
 		<$> inRepo (toTopFilePath (keyFilename source))
 	afs <- map (`fromTopFilePath` g) <$> Database.Keys.getAssociatedFiles key
 	forM_ (filter (/= ingestedf) afs) $
-		populatePointerFile key obj
+		populatePointerFile restage key obj
 
 cleanCruft :: KeySource -> Annex ()
 cleanCruft source = when (contentLocation source /= keyFilename source) $
diff --git a/CHANGELOG b/CHANGELOG
index 7e12578c9..67fdc8b22 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -9,6 +9,8 @@ git-annex (6.20180808) UNRELEASED; urgency=medium
   * v6 add: Take advantage of improved SIGPIPE handler in git 2.5 to
     speed up the clean filter by not reading the file content from the
     pipe. This also avoids git buffering the whole file content in memory.
+  * v6: After updating the worktree for an add/drop, update git's index,
+    so git status will not show the files as modified.
 
  -- Joey Hess <id@joeyh.name>  Wed, 08 Aug 2018 11:24:08 -0400
 
diff --git a/Command/Smudge.hs b/Command/Smudge.hs
index cfd327c8d..c2e74c076 100644
--- a/Command/Smudge.hs
+++ b/Command/Smudge.hs
@@ -97,7 +97,7 @@ clean file = do
 					<$> catKeyFile file
 				liftIO . emitPointer
 					=<< go
-					=<< (\ld -> ingest' currbackend ld Nothing)
+					=<< (\ld -> ingest' currbackend ld Nothing norestage)
 					=<< lockDown cfg file
 			, liftIO $ B.hPut stdout b
 			)
@@ -111,6 +111,9 @@ clean file = do
 		{ lockingFile = False
 		, hardlinkFileTmp = False
 		}
+	-- Can't restage associated files because git add runs this and has

(Diff truncated)
response
diff --git a/doc/forum/Forcing_offline_copies_seem_available/comment_3_3d57e187679b36be62f3918eb996c3df._comment b/doc/forum/Forcing_offline_copies_seem_available/comment_3_3d57e187679b36be62f3918eb996c3df._comment
new file mode 100644
index 000000000..9ede92c73
--- /dev/null
+++ b/doc/forum/Forcing_offline_copies_seem_available/comment_3_3d57e187679b36be62f3918eb996c3df._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2018-08-14T17:40:10Z"
+ content="""
+Indeed, making the proxy remote not accessible will avoid such problems.
+
+You will certainly need to make git-annex trust the remote if you want it
+to count the tape as a copy of a file.
+"""]]

Added a comment
diff --git a/doc/forum/Forcing_offline_copies_seem_available/comment_2_7d0e219628ad19093c6b6410d24c5f9a._comment b/doc/forum/Forcing_offline_copies_seem_available/comment_2_7d0e219628ad19093c6b6410d24c5f9a._comment
new file mode 100644
index 000000000..a63102f62
--- /dev/null
+++ b/doc/forum/Forcing_offline_copies_seem_available/comment_2_7d0e219628ad19093c6b6410d24c5f9a._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="jarno"
+ avatar="http://cdn.libravatar.org/avatar/da157278ab330a9c936dd733c43740f8"
+ subject="comment 2"
+ date="2018-08-14T16:03:39Z"
+ content="""
+Nice, thank you! I think I'll just delete or move the proxy away after feeding the (fake) availability info to the main repo. That should prevent git-annex from accidentally discovering that the data doesn't actually exist on the proxy, right?
+"""]]

response
diff --git a/doc/forum/Forcing_offline_copies_seem_available/comment_1_75e8597a24c925981edf83c517d5a799._comment b/doc/forum/Forcing_offline_copies_seem_available/comment_1_75e8597a24c925981edf83c517d5a799._comment
new file mode 100644
index 000000000..dc37d67c9
--- /dev/null
+++ b/doc/forum/Forcing_offline_copies_seem_available/comment_1_75e8597a24c925981edf83c517d5a799._comment
@@ -0,0 +1,32 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2018-08-14T14:50:51Z"
+ content="""
+You can use `git annex setpresentkey` to tell git-annex that a remote with
+a given uuid contains a given content.
+
+For example, if the proxy remote is named proxy
+and you know it contains all annexed files in the current directory
+and below, you could run this to tell git-annex that the proxy contains
+all the files it thought it didn't contain:
+
+	uuid=$(git config remote.proxy.annex-uuid)
+	git annex find --not --in proxy --format "\${key} $uuid 1\n" | \
+		git annex setpresentkey --batch
+
+There will be some problems using this empty proxy remote, eg if you
+run `git annex move somefile --from proxy`, git-annex will try to
+delete the content from it, see the content is not there, and update its
+location tracking to say that the proxy does not contain the content any
+longer. `git annex fsck --from proxy` will do similar so you'll need to
+avoid it.
+
+And, you'll probably want to use `git annex trust proxy` so that `git-annex
+drop` assumes it contains the content you said it has; by default git-annex
+will double-check and that check will fail.
+
+To avoid all these kinds of issues with the proxy, a better approach might
+be to make a custom special remote that actually accesses the data on the
+tape drive. See [[special remote implementation howto|special_remotes/external]]
+"""]]
diff --git a/doc/todo/smudge.mdwn b/doc/todo/smudge.mdwn
index 9bf5bd783..e3df6b10d 100644
--- a/doc/todo/smudge.mdwn
+++ b/doc/todo/smudge.mdwn
@@ -42,13 +42,14 @@ git-annex should use smudge/clean filters.
   > update-index would add the new content. To avoid this, use 
   > `git update-index --index-info`. The next run of `git status`
   > then runs the clean filter, and will detect if the file has gotten
-  > modified after the get/drop.
+  > modified after the get/drop. TODO
 
-* Implement git's new `filter.<driver>.process` interface, which will
+* Use git's new `filter.<driver>.process` interface, which will
   let only 1 git-annex process be started by git when processing
   multiple files, and so should be faster.
 
-  See [[todo/Long_Running_Filter_Process]]
+  See [[todo/Long_Running_Filter_Process]] .. it's not currently actually a
+  win but might be a good way to improve git to work better with v6.
 
 * Checking out a different branch causes git to smudge all changed files,
   and write their content. This does not honor annex.thin. A warning

diff --git a/doc/forum/Hello_everyone.mdwn b/doc/forum/Hello_everyone.mdwn
new file mode 100644
index 000000000..6735c2499
--- /dev/null
+++ b/doc/forum/Hello_everyone.mdwn
@@ -0,0 +1,4 @@
+Hi Everyone
+I'm a new member and my name is Kullboys.
+I'm happy to be familiar with you.
+Thanks[.](https://lab.louiz.org/snippets/421) 

full plan
diff --git a/doc/todo/smudge.mdwn b/doc/todo/smudge.mdwn
index 3673b3e94..9bf5bd783 100644
--- a/doc/todo/smudge.mdwn
+++ b/doc/todo/smudge.mdwn
@@ -35,11 +35,14 @@ git-annex should use smudge/clean filters.
   And developed a patch set: [[git-patches]]
 
   > Thanks to [[!commit a96972015dd76271b46432151e15d5d38d7151ff]], 
-  > the clean filter is now very quick, so perhaps git update-index would
-  > be ok?
-  > 
-  > Better, `git update-index --index-info` can be used, this avoids
-  > running the clean filter.
+  > the clean filter is now very quick, so, this can be fixed by running
+  > git update-index with files affected by get/drop.
+  >
+  > In case a file's content quickly changes after get/drop, git
+  > update-index would add the new content. To avoid this, use 
+  > `git update-index --index-info`. The next run of `git status`
+  > then runs the clean filter, and will detect if the file has gotten
+  > modified after the get/drop.
 
 * Implement git's new `filter.<driver>.process` interface, which will
   let only 1 git-annex process be started by git when processing

even better idea
diff --git a/doc/todo/smudge.mdwn b/doc/todo/smudge.mdwn
index 1e07602bf..3673b3e94 100644
--- a/doc/todo/smudge.mdwn
+++ b/doc/todo/smudge.mdwn
@@ -37,6 +37,9 @@ git-annex should use smudge/clean filters.
   > Thanks to [[!commit a96972015dd76271b46432151e15d5d38d7151ff]], 
   > the clean filter is now very quick, so perhaps git update-index would
   > be ok?
+  > 
+  > Better, `git update-index --index-info` can be used, this avoids
+  > running the clean filter.
 
 * Implement git's new `filter.<driver>.process` interface, which will
   let only 1 git-annex process be started by git when processing

update
diff --git a/doc/todo/smudge.mdwn b/doc/todo/smudge.mdwn
index 3ebb92b0f..1e07602bf 100644
--- a/doc/todo/smudge.mdwn
+++ b/doc/todo/smudge.mdwn
@@ -5,14 +5,12 @@ git-annex should use smudge/clean filters.
 * Reconcile staged changes into the associated files database, whenever
   the database is queried. This is needed to handle eg:
 
-  ```
-  	git add largefile
+	git add largefile
 	git mv largefile othername
 	git annex move othername --to foo
 	# fails to drop content from associated file othername,
 	# because it doesn't know it has that name
 	# git commit clears up this mess
-  ```
 
 * Dropping a smudged file causes git status (and git annex status)
   to show it as modified,  because the timestamp has changed. 
@@ -36,10 +34,16 @@ git-annex should use smudge/clean filters.
 
   And developed a patch set: [[git-patches]]
 
+  > Thanks to [[!commit a96972015dd76271b46432151e15d5d38d7151ff]], 
+  > the clean filter is now very quick, so perhaps git update-index would
+  > be ok?
+
 * Implement git's new `filter.<driver>.process` interface, which will
   let only 1 git-annex process be started by git when processing
   multiple files, and so should be faster.
 
+  See [[todo/Long_Running_Filter_Process]]
+
 * Checking out a different branch causes git to smudge all changed files,
   and write their content. This does not honor annex.thin. A warning
   message is printed in this case.  
@@ -73,8 +77,7 @@ git-annex should use smudge/clean filters.
 
   Last verified with git 2.18 in 2018. 
 
-  To check: Does the long-running filter process interface have the same
-  problem?
+  Note that the long-running filter process interface has the same problem.
 
 * Eventually (but not yet), make v6 the default for new repositories.
   Note that the assistant forces repos into direct mode; that will need to

status
diff --git a/doc/todo/Long_Running_Filter_Process/comment_3_24d89d0e8eb2da6e43d107caa71e042b._comment b/doc/todo/Long_Running_Filter_Process/comment_3_24d89d0e8eb2da6e43d107caa71e042b._comment
new file mode 100644
index 000000000..f37fdf21f
--- /dev/null
+++ b/doc/todo/Long_Running_Filter_Process/comment_3_24d89d0e8eb2da6e43d107caa71e042b._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2018-08-13T20:24:02Z"
+ content="""
+The "filterdriver" branch implements support for these.
+
+However, it's actually slower than the old interface, because the new
+interface requires git-annex read the whole file content from git when
+adding a file, and the old interface let it not read any content.
+
+Since the new interface does have capabilities, a new capability could
+prevent schlepping the content over the pipe, and let the filter driver
+refer to the worktree file instead, and respond with the path of a file.
+This would be similar to my old patch set for the old interface.
+"""]]

devblog
diff --git a/doc/devblog/day_509__filterdriver.mdwn b/doc/devblog/day_509__filterdriver.mdwn
new file mode 100644
index 000000000..8bc6fc9c4
--- /dev/null
+++ b/doc/devblog/day_509__filterdriver.mdwn
@@ -0,0 +1,16 @@
+Working on a "filterdriver" branch, I've implemented support for the
+long-running smudge/clean process interface.
+
+It works, but not really any better than the old smudge/clean interface.
+Unfortunately git leaks memory just as badly in the new interface as it did
+in the old interface when sending large data to the smudge filter. Also,
+the new interface requires that the clean filter read all the content of the
+file from git, even when it's just going to look at the file on disk, so
+that's worse performance.
+
+So, I don't think I'll be merging that branch yet, but git's interface does
+support adding capabilities, and perhaps a capability could be added that
+avoids it schlepping the file content over the pipe. Same as my old git
+patches tried to do with the old smudge/clean interface.
+
+This work is supported by the NSF-funded DataLad project.
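
For reference, the filter's side of the initial handshake in the long-running interface can be sketched as below. This assumes the packet sequence documented in gitattributes(5) for `filter.<driver>.process`: a welcome line, `version=2`, then a capability list, with each list ended by a flush packet. The names `pktLine`, `flushPkt`, and `filterHandshake` are made up for this sketch and are not taken from the filterdriver branch.

```haskell
import Numeric (showHex)

-- | Encode one pkt-line: 4 hex digits giving the total length
-- (payload plus the 4 length bytes), followed by the payload.
-- Uses String length, so it is only right for ASCII payloads.
pktLine :: String -> String
pktLine payload = pad (showHex (length payload + 4) "") ++ payload
  where
    pad s = replicate (4 - length s) '0' ++ s

-- | The flush packet that ends each list of lines.
flushPkt :: String
flushPkt = "0000"

-- | Everything the filter writes during the handshake: its welcome,
-- then the capabilities it supports. In a real exchange the filter
-- reads git's capability list between these two parts.
filterHandshake :: [String] -> String
filterHandshake caps = concat
  [ pktLine "git-filter-server\n"
  , pktLine "version=2\n"
  , flushPkt
  , concatMap (\c -> pktLine ("capability=" ++ c ++ "\n")) caps
  , flushPkt
  ]
```

A capability that avoids sending file content over the pipe, as suggested above, would need a new capability name agreed with git; none is assumed here.
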

response
diff --git a/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD/comment_1_a959c15b7144acc5d68bb3891107a480._comment b/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD/comment_1_a959c15b7144acc5d68bb3891107a480._comment
new file mode 100644
index 000000000..456fd7796
--- /dev/null
+++ b/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD/comment_1_a959c15b7144acc5d68bb3891107a480._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2018-08-13T16:35:43Z"
+ content="""
+This is caused by this bug in esqueleto: 
+https://github.com/bitemyapp/esqueleto/issues/80
+
+The best way to avoid this kind of transient breakage in the haskell
+dependencies of git-annex is to build it using stack, instead of cabal.
+stack pins packages to a consistent working set.
+
+I don't really see this as something that warrants a change to git-annex.
+Using bleeding edge versions of all build dependencies will break, that's
+why the build docs recommend not using cabal if you don't want to be involved
+in fixing that kind of breakage.
+"""]]

Added a comment: Similar issue again
diff --git a/doc/bugs/Massive_git_add_produces_sqlite_crashes/comment_4_3f44cf6a9251f664cd0e00d168232696._comment b/doc/bugs/Massive_git_add_produces_sqlite_crashes/comment_4_3f44cf6a9251f664cd0e00d168232696._comment
new file mode 100644
index 000000000..7c5d61e96
--- /dev/null
+++ b/doc/bugs/Massive_git_add_produces_sqlite_crashes/comment_4_3f44cf6a9251f664cd0e00d168232696._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="webanck"
+ avatar="http://cdn.libravatar.org/avatar/cd273f76ef8c4218510b4f50ef7e1f3d"
+ subject="Similar issue again"
+ date="2018-08-13T14:18:17Z"
+ content="""
+Hello, it has been a while since I posted here about this issue with sqlite but it keeps following me! I randomly get errors while trying to lock files: 
+```sqlite worker thread crashed: SQLite3 returned ErrorIO while attempting to perform prepare \"SELECT null from content limit 1\": disk I/O error```
+
+Should I worry about the state of my hard drive? And I don't know if it is intended, but when this happens, the process doesn't stop with a failure code, it just freezes.
+I checked with top, and git-annex seems to continue doing stuff as it is still using a full core.
+"""]]

removed
diff --git a/doc/bugs/Massive_git_add_produces_sqlite_crashes/comment_4_854add3cc6b737b752c16269e1e06f45._comment b/doc/bugs/Massive_git_add_produces_sqlite_crashes/comment_4_854add3cc6b737b752c16269e1e06f45._comment
deleted file mode 100644
index f81ed5ef2..000000000
--- a/doc/bugs/Massive_git_add_produces_sqlite_crashes/comment_4_854add3cc6b737b752c16269e1e06f45._comment
+++ /dev/null
@@ -1,13 +0,0 @@
-[[!comment format=mdwn
- username="webanck"
- avatar="http://cdn.libravatar.org/avatar/cd273f76ef8c4218510b4f50ef7e1f3d"
- subject="Similar issue again"
- date="2018-08-13T14:11:48Z"
- content="""
-This issue with sqlite keeps following me!
-I randomly get errors while trying to lock files:
-```sqlite worker thread crashed: SQLite3 returned ErrorIO while attempting to perform prepare \"SELECT null from content limit 1\": disk I/O error```
-
-Should I worry about the state of my hard drive?
-And I don't know if it is intended, but when this happens, the process doesn't stop with a failure code, it just freezes.
-"""]]

Added a comment: Similar issue again
diff --git a/doc/bugs/Massive_git_add_produces_sqlite_crashes/comment_4_854add3cc6b737b752c16269e1e06f45._comment b/doc/bugs/Massive_git_add_produces_sqlite_crashes/comment_4_854add3cc6b737b752c16269e1e06f45._comment
new file mode 100644
index 000000000..f81ed5ef2
--- /dev/null
+++ b/doc/bugs/Massive_git_add_produces_sqlite_crashes/comment_4_854add3cc6b737b752c16269e1e06f45._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="webanck"
+ avatar="http://cdn.libravatar.org/avatar/cd273f76ef8c4218510b4f50ef7e1f3d"
+ subject="Similar issue again"
+ date="2018-08-13T14:11:48Z"
+ content="""
+This issue with sqlite keeps following me!
+I randomly get errors while trying to lock files:
+```sqlite worker thread crashed: SQLite3 returned ErrorIO while attempting to perform prepare \"SELECT null from content limit 1\": disk I/O error```
+
+Should I worry about the state of my hard drive?
+And I don't know if it is intended, but when this happens, the process doesn't stop with a failure code, it just freezes.
+"""]]

Added a comment: Also with embedcreds=yes
diff --git a/doc/forum/Change_or_add_S3_credentials/comment_1_864932114dcc7aaf1d88edb0673f1d86._comment b/doc/forum/Change_or_add_S3_credentials/comment_1_864932114dcc7aaf1d88edb0673f1d86._comment
new file mode 100644
index 000000000..bd9f92c8a
--- /dev/null
+++ b/doc/forum/Change_or_add_S3_credentials/comment_1_864932114dcc7aaf1d88edb0673f1d86._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="Mara"
+ avatar="http://cdn.libravatar.org/avatar/9b8abe5a5b0a41b88fc9970a88c2e317"
+ subject="Also with embedcreds=yes"
+ date="2018-08-13T12:28:09Z"
+ content="""
+Also with `public=no` but `embedcreds=yes`.
+
+It can be useful to embed read-only credentials, but allow users to easily add/store their own credentials (locally) with write-access.
+"""]]

diff --git a/doc/forum/Change_or_add_S3_credentials.mdwn b/doc/forum/Change_or_add_S3_credentials.mdwn
new file mode 100644
index 000000000..736f9e4e1
--- /dev/null
+++ b/doc/forum/Change_or_add_S3_credentials.mdwn
@@ -0,0 +1,6 @@
+How do I change or add S3 credentials, when an S3 special remote is already initialised/enabled?
+
+I have a repository with a `public=yes` S3 remote, such that people can read the data without credentials.
+But then when they need to upload files, how do they add their credentials?
+
+Setting the `AWS_*` environment variables when running `git annex copy --to=s3` works, but then the credentials are not stored.

diff --git a/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn b/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn
index 949ab8745..d8e9fd8f6 100644
--- a/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn
+++ b/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn
@@ -11,24 +11,20 @@ git-annex can't compile on FreeBSD; specifically, the build fails while satisfyi
 ### What version of git-annex are you using? On what operating system?
 
 git-annex HEAD.
+
 FreeBSD 11.1-RELEASE r321309 GENERIC amd64
 
 ### Please provide any additional information below.
 
 The full log is available at [https://gitlab.com/snippets/1743708](https://gitlab.com/snippets/1743708).  Summary below:
 
-[[!format sh """
-cabal: Error: some packages failed to install:
-esqueleto-2.5.3-J2ccnERt7unG9UdXfc7jAa depends on esqueleto-2.5.3 which failed
-to install.
-persistent-2.7.0-IWtmEvQAI3yHscMZvQrE6P failed during the building phase. The
-exception was:
-ExitFailure 1
-persistent-sqlite-2.6.4-3aF88LYjPwqbsHGVQ1VUp depends on
-persistent-sqlite-2.6.4 which failed to install.
-persistent-template-2.5.4-2tn9hCQqx2e2mAPIKgHBFO depends on
-persistent-template-2.5.4 which failed to install.
-"""]]
+    cabal: Error: some packages failed to install:
+    esqueleto-2.5.3-J2ccnERt7unG9UdXfc7jAa depends on esqueleto-2.5.3 which failed to install.
+    persistent-2.7.0-IWtmEvQAI3yHscMZvQrE6P failed during the building phase. The exception was: ExitFailure 1
+    persistent-sqlite-2.6.4-3aF88LYjPwqbsHGVQ1VUp depends on
+    persistent-sqlite-2.6.4 which failed to install.
+    persistent-template-2.5.4-2tn9hCQqx2e2mAPIKgHBFO depends on
+    persistent-template-2.5.4 which failed to install.
 
 ### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
 

diff --git a/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn b/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn
index 7fe915359..949ab8745 100644
--- a/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn
+++ b/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn
@@ -15,7 +15,7 @@ FreeBSD 11.1-RELEASE r321309 GENERIC amd64
 
 ### Please provide any additional information below.
 
-The full log is available at https://gitlab.com/snippets/1743708.  Summary below:
+The full log is available at [https://gitlab.com/snippets/1743708](https://gitlab.com/snippets/1743708).  Summary below:
 
 [[!format sh """
 cabal: Error: some packages failed to install:

diff --git a/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn b/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn
new file mode 100644
index 000000000..7fe915359
--- /dev/null
+++ b/doc/bugs/git-annex_can__39__t_compile_on_FreeBSD.mdwn
@@ -0,0 +1,35 @@
+### Please describe the problem.
+
+git-annex can't compile on FreeBSD; specifically, the build fails while satisfying dependencies.
+
+### What steps will reproduce the problem?
+
+1. git clone git://git-annex.branchable.com/ git-annex
+2. cd git-annex
+3. cabal install -j -f-assistant -webapp -webdav -pairing -xmpp -dns -dbus -magicmime --only-dependencies
+
+### What version of git-annex are you using? On what operating system?
+
+git-annex HEAD.
+FreeBSD 11.1-RELEASE r321309 GENERIC amd64
+
+### Please provide any additional information below.
+
+The full log is available at https://gitlab.com/snippets/1743708.  Summary below:
+
+[[!format sh """
+cabal: Error: some packages failed to install:
+esqueleto-2.5.3-J2ccnERt7unG9UdXfc7jAa depends on esqueleto-2.5.3 which failed
+to install.
+persistent-2.7.0-IWtmEvQAI3yHscMZvQrE6P failed during the building phase. The
+exception was:
+ExitFailure 1
+persistent-sqlite-2.6.4-3aF88LYjPwqbsHGVQ1VUp depends on
+persistent-sqlite-2.6.4 which failed to install.
+persistent-template-2.5.4-2tn9hCQqx2e2mAPIKgHBFO depends on
+persistent-template-2.5.4 which failed to install.
+"""]]
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+No, I'm afraid.  But it looks really good!  I'm trying to use it to add a bunch of high-res images to my Jekyll website, all managed through Git, with the images stored in S3.

diff --git a/doc/forum/Forcing_offline_copies_seem_available.mdwn b/doc/forum/Forcing_offline_copies_seem_available.mdwn
new file mode 100644
index 000000000..078972697
--- /dev/null
+++ b/doc/forum/Forcing_offline_copies_seem_available.mdwn
@@ -0,0 +1,3 @@
+I have a large archive repository copied on an LTO tape, plus a proxy version of it with data dropped on local HDD. I've added this stripped proxy as a remote to a central repository to keep track of what is stored on the tape. Unfortunately, since the proxy has no contents, the central repo thinks there are no copies available anywhere, even though the offline tape contains all the data.
+
+How could I convince the central repo that the copies do in fact exist, even if they are very much offline?

devblog
diff --git a/doc/devblog/day_508__git-protocol.mdwn b/doc/devblog/day_508__git-protocol.mdwn
new file mode 100644
index 000000000..b3de2ffa8
--- /dev/null
+++ b/doc/devblog/day_508__git-protocol.mdwn
@@ -0,0 +1,20 @@
+Spent today implementing the git pkt-line protocol. Git uses it for a bunch
+of internal stuff, but also to talk to long-running filter processes.
+
+This was my first time using attoparsec, which I quite enjoyed aside from
+some difficulty in parsing a 4 byte hex number. Even though parsing to a
+Word16 should naturally only consume 4 bytes, attoparsec will actually
+consume subsequent bytes that look like hex. And it may parse fewer than 4
+bytes too. So my parser had to take 4 bytes and feed them back into a call
+to attoparsec. Which seemed weird, but works. I also used
+bytestring-builder, and between the two libraries, this should be quite a
+fast implementation of the protocol.
+
+With that 300 lines of code written, it should be easy to implement support
+for the rest of the long-running filter process protocol. Which will surely
+speed up v6 a bit, since at least git won't be running git-annex over and
+over again for each file in the worktree. I hope it will also avoid a memory
+leak in git. That'll be the rest of the low-hanging fruit, before v6
+improvements get really interesting.
+
+This work is supported by the NSF-funded DataLad project.
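
As a rough illustration of the framing described in the post above, here is a minimal pkt-line reader and writer. It is a sketch in Python rather than the Haskell and attoparsec code git-annex uses, and every name in it (`encode_pkt_line`, `read_pkt_line`, `FLUSH_PKT`) is made up for this example. It reads exactly four header bytes before interpreting them as hex, which sidesteps the problem the post mentions of a hex parser consuming too many or too few bytes.

```python
import io

MAX_PAYLOAD = 65516  # largest payload git documents for a single pkt-line
HEX_DIGITS = b"0123456789abcdefABCDEF"
FLUSH_PKT = b"0000"  # special packet that marks the end of a list


def encode_pkt_line(payload: bytes) -> bytes:
    """Prefix payload with a 4-digit hex length that also counts the prefix."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for one pkt-line")
    return b"%04x" % (len(payload) + 4) + payload


def read_pkt_line(stream):
    """Return the next payload from stream, or None for a flush packet."""
    # Take exactly 4 bytes first, then parse only those bytes as hex.
    header = stream.read(4)
    if len(header) != 4:
        raise EOFError("truncated pkt-line header")
    if any(byte not in HEX_DIGITS for byte in header):
        raise ValueError("pkt-line length is not 4 hex digits")
    length = int(header.decode("ascii"), 16)
    if length == 0:
        return None
    if length < 4:
        # Lengths 1 to 3 cannot describe a data packet; this sketch rejects them.
        raise ValueError("unsupported pkt-line length")
    payload = stream.read(length - 4)
    if len(payload) != length - 4:
        raise EOFError("truncated pkt-line payload")
    return payload


if __name__ == "__main__":
    wire = encode_pkt_line(b"command=clean\n") + FLUSH_PKT
    stream = io.BytesIO(wire)
    print(read_pkt_line(stream))
    print(read_pkt_line(stream))
```

Reading the header as a fixed-size slice keeps a payload that happens to start with hex-looking characters from being mistaken for part of the length.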

Added a comment: my 1c
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_7_d987bcd1f589cd9f19a4df92461f9e1a._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_7_d987bcd1f589cd9f19a4df92461f9e1a._comment
new file mode 100644
index 000000000..23a34f0e4
--- /dev/null
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_7_d987bcd1f589cd9f19a4df92461f9e1a._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="my 1c"
+ date="2018-08-10T19:17:16Z"
+ content="""
+> Let's please not entangle this bug with that other bug.
+
+Sure!  I just (probably erroneously) felt that they stem from the same lack of clear \"semantics\" on whether conversion should happen or not.  I have yet to fully digest what you are suggesting, and whether and how we should address this at the datalad level, but meanwhile FWIW:
+
+- adding `-n` to the `commit` (and not to `add`) is just as uncommon for me in my daily use of git/git-annex, and I hope that I would never have to use it while performing the regular \"annex unlock file(s); annex add file(s); commit file(s)\" sequence in order to maintain file(s) under annex.
+
+- whether a file was `smallen`ed according to the git-annex largefiles setting is unknown to the user (or to some higher level tool using git-annex, such as datalad) without explicitly checking (not even sure yet how) or `git annex add`-ing it and seeing whether it would now be added to git when it was added to annex before.  So hopefully we do not need to do that either.
+
+"""]]

devblog
diff --git a/doc/devblog/day_507__v6_revisited.mdwn b/doc/devblog/day_507__v6_revisited.mdwn
new file mode 100644
index 000000000..cb8bcaa47
--- /dev/null
+++ b/doc/devblog/day_507__v6_revisited.mdwn
@@ -0,0 +1,23 @@
+Plan is to take some time this August and revisit v6, hoping to move it
+toward being production ready.
+
+Today I studied the "Long Running Filter Process" documentation in
+gitattributes(5), as well as the supplemental documentation in git about
+the protocol they use. This interface was added to git after v6 mode was
+implemented, and hopefully some of v6's issues can be fixed by using it in
+some way. But I don't know how yet; it's not as simple as using this
+interface as-is (it was designed for something different), and will take
+finding a creative trick using it.
+
+So far I have [this idea](http://git-annex.branchable.com/todo/Long_Running_Filter_Process/#comment-7c571c4ed26ce370ccd48db0a4aff4fc)
+to explore. It's promising, might fix the worst of the problems.
+
+Also, reading over all the notes in [[todo/smudge]], I finally
+checked and yes, git doesn't require filters to consume all stdin anymore,
+and when they don't consume stdin, git doesn't leak memory anymore either.
+Which let me massively speed up `git add` in v6 repos. While before `git
+add` of a gigabyte file made git grow to a gigabyte in memory and copied a
+gigabyte through a pipe, it's now just as fast as `git annex add` in v5
+mode is.
+
+This work is supported by the NSF-funded DataLad project.

one way to use this
diff --git a/doc/todo/Long_Running_Filter_Process/comment_2_c2380f19248abf98928743fabd88ed05._comment b/doc/todo/Long_Running_Filter_Process/comment_2_c2380f19248abf98928743fabd88ed05._comment
new file mode 100644
index 000000000..56cb1ab67
--- /dev/null
+++ b/doc/todo/Long_Running_Filter_Process/comment_2_c2380f19248abf98928743fabd88ed05._comment
@@ -0,0 +1,41 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2018-08-09T20:18:46Z"
+ content="""
+One of v6's big problems is that dropping or getting an annexed file
+updates the file in the working tree, which makes git status think
+the file is modified, even though the clean filter will output
+the same pointer as before. Running `git add` to clear it up is quite
+expensive since the large file content has to be read.
+Maybe a long-running filter process could avoid this problem.
+
+----
+
+If git can be coaxed somehow into re-running the smudge filter,
+git-annex could provide the new worktree content to git via it,
+and let git update the working tree.
+
+Git would make a copy, which git-annex currently does, so the only
+added overhead would be sending the file content to git down the pipe.
+(Well and it won't use reflink for the copy on COW filesystems.)
+
+annex.thin is a complication, but it could be handled by hard linking the
+work tree file that git writes back into the annex, overwriting the file that
+was there. (This approach could also make git checkout of a branch honor 
+annex.thin.)
+
+How to make git re-run the smudge filter? It needs to want to update the
+working tree. One way is to touch the worktree files and then run 
+`git checkout`. Although this risks losing modifications the user made
+to the files so would need to be done with care.
+
+That seems like it would defer working tree updates until the git-annex
+get command was done processing all files. Sometimes I want to use a
+file while the same get command is still running for other files.
+It might work to use the "delay" capability of the filter process
+interface. Get git to re-smudge all affected files, and when it
+asks for content for each, send "delayed". Then as git-annex gets
+each file, respond to git's "list_available_blobs" with a single blob,
+which git should request and use to update the working tree.
+"""]]

massive v6 add speed/memory improvement
v6 add: Take advantage of improved SIGPIPE handler in git 2.5 to speed up
the clean filter by not reading the file content from the pipe. This also
avoids git buffering the whole file content in memory.
When built with an older git, still consumes stdin. If built with a newer
git and used with an older one, it breaks, but that's acceptable --
checking the git version every time would make repeated smudge runs slow.
This commit was supported by the NSF-funded DataLad project.
diff --git a/CHANGELOG b/CHANGELOG
index fcc69d1ee..7e12578c9 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -6,6 +6,9 @@ git-annex (6.20180808) UNRELEASED; urgency=medium
     Affected commands: find, add, whereis, drop, copy, move, get
   * Make metadata --batch combined with matching options refuse to run,
     since it does not seem worth supporting that combination.
+  * v6 add: Take advantage of improved SIGPIPE handler in git 2.5 to
+    speed up the clean filter by not reading the file content from the
+    pipe. This also avoids git buffering the whole file content in memory.
 
  -- Joey Hess <id@joeyh.name>  Wed, 08 Aug 2018 11:24:08 -0400
 
diff --git a/Command/Smudge.hs b/Command/Smudge.hs
index 1644ee257..cfd327c8d 100644
--- a/Command/Smudge.hs
+++ b/Command/Smudge.hs
@@ -16,6 +16,7 @@ import Annex.Ingest
 import Annex.CatFile
 import Logs.Location
 import qualified Database.Keys
+import qualified Git.BuildVersion
 import Git.FilePath
 import Backend
 
@@ -68,7 +69,7 @@ smudge file = do
 
 -- Clean filter is fed file content on stdin, decides if a file
 -- should be stored in the annex, and outputs a pointer to its
--- injested content.
+-- ingested content if so. Otherwise, the original content.
 clean :: FilePath -> CommandStart
 clean file = do
 	b <- liftIO $ B.hGetContents stdin
@@ -76,10 +77,18 @@ clean file = do
 		then liftIO $ B.hPut stdout b
 		else ifM (shouldAnnex file)
 			( do
-				-- Even though we ingest the actual file,
-				-- and not stdin, we need to consume all
-				-- stdin, or git will get annoyed.
-				B.length b `seq` return ()
+				-- Before git 2.5, failing to consume all
+				-- stdin here would cause a SIGPIPE and
+				-- crash it.
+				-- Newer git catches the signal and
+				-- stops sending, which is much faster.
+				-- (Also, git seems to forget to free memory
+				-- when sending the file, so the less we
+				-- let it send, the less memory it will
+				-- waste.)
+				if Git.BuildVersion.older "2.5"
+					then B.length b `seq` return ()
+					else liftIO $ hClose stdin
 				-- Look up the backend that was used
 				-- for this file before, so that when
 				-- git re-cleans a file its backend does
diff --git a/doc/todo/smudge.mdwn b/doc/todo/smudge.mdwn
index 3a980b10a..3ebb92b0f 100644
--- a/doc/todo/smudge.mdwn
+++ b/doc/todo/smudge.mdwn
@@ -65,16 +65,16 @@ git-annex should use smudge/clean filters.
 * When git runs the smudge filter, it buffers all its output in ram before
   writing it to a file. So, checking out a branch with large v6 unlocked files
   can cause git to use a lot of memory.
-  (This needs to be fixed in git, but my proposed interface in
+
+  This needs to be fixed in git, but my proposed interface in
   <http://thread.gmane.org/gmane.comp.version-control.git/294425> would
   avoid the problem for git checkout, since it would use the new interface
-  and not the smudge filter.)
+  and not the smudge filter.
+
+  Last verified with git 2.18 in 2018. 
 
-* When `git add` is run with a large file, it allocates memory for
-  the whole file content, even though it's only going
-  to stream it to the clean filter. My proposed smudge/clean
-  interface patch also fixed this problem, since it made git not read
-  the file at all.
+  To check: Does the long-running filter process interface have the same
+  problem?
 
 * Eventually (but not yet), make v6 the default for new repositories.
   Note that the assistant forces repos into direct mode; that will need to
diff --git a/doc/todo/smudge/comment_12_a4712d510432931062406c4e256f04b1._comment b/doc/todo/smudge/comment_12_a4712d510432931062406c4e256f04b1._comment
new file mode 100644
index 000000000..cfa1be8bc
--- /dev/null
+++ b/doc/todo/smudge/comment_12_a4712d510432931062406c4e256f04b1._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""re: Git 2.5 allows smudge filters to not read all of stdin"""
+ date="2018-08-09T22:11:00Z"
+ content="""
+@torarnv thanks for pointing that out.. I finally got around to verifying
+that, and was able to speed up the smudge filter. Also this avoids the
+problem that git for some reason buffers the whole file content in memory
+when it sends it to the smudge filter, which is a pretty bad memory leak in git
+that no longer affects this.
+"""]]
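
The clean filter change above branches on whether the git that git-annex was built against is older than 2.5. As a hedged sketch of that kind of check (written in Python, and not the actual `Git.BuildVersion` code), the helper below compares dotted version strings numerically. The names `parse_version`, `older` and `must_drain_stdin` are hypothetical. A numeric comparison matters because a plain string comparison would wrongly rank "2.18" below "2.5".

```python
def parse_version(version: str) -> tuple:
    """Turn '2.18.0' into (2, 18, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def older(version: str, reference: str) -> bool:
    """True if version sorts strictly before reference, component by component."""
    a, b = parse_version(version), parse_version(reference)
    # Pad the shorter tuple with zeros so that '2.5' and '2.5.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def must_drain_stdin(build_git_version: str) -> bool:
    """Mirror the decision in the diff: only pre-2.5 git needs stdin consumed."""
    return older(build_git_version, "2.5")


if __name__ == "__main__":
    for v in ["1.9.1", "2.4.12", "2.5", "2.18.0"]:
        print(v, must_drain_stdin(v))
```

As the commit message notes, the real check uses the build-time git version rather than probing the installed git on every run, trading correctness with mismatched versions for speed.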

soften
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_6_7081b1386ca8807dff79e8613088b619._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_6_7081b1386ca8807dff79e8613088b619._comment
index 05da91a75..36ae0a0a5 100644
--- a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_6_7081b1386ca8807dff79e8613088b619._comment
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_6_7081b1386ca8807dff79e8613088b619._comment
@@ -9,5 +9,9 @@ in v6 mode too. And `-n` is a flag to `git commit`, not `git add`.
 
 Let's please not entangle this bug with that other bug. Unless your goal is
 that I merge them. Bear in mind that I consider the other bug a snake pit,
-and probably should have closed it as utterly useless some time ago.
+and probably should have closed it as utterly useless some time ago.a
+
+(Maybe that was uncharitable, but the other bug seems pretty well blocked
+on a complete reimplementation of v6 mode leading to a v6 mode that is not
+experimental, and entangling this bug into that does not seem wise.)
 """]]

followup
diff --git a/doc/bugs/Too_difficult_if_not_impossible_to_explicitly_add__47__keep_file_under_git___40__not_annex__41___in_v6_without_employing_.gitattributes/comment_10_5eb0622b326a8094b06f5f0de627e288._comment b/doc/bugs/Too_difficult_if_not_impossible_to_explicitly_add__47__keep_file_under_git___40__not_annex__41___in_v6_without_employing_.gitattributes/comment_10_5eb0622b326a8094b06f5f0de627e288._comment
new file mode 100644
index 000000000..dcba5d142
--- /dev/null
+++ b/doc/bugs/Too_difficult_if_not_impossible_to_explicitly_add__47__keep_file_under_git___40__not_annex__41___in_v6_without_employing_.gitattributes/comment_10_5eb0622b326a8094b06f5f0de627e288._comment
@@ -0,0 +1,20 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 10"""
+ date="2018-08-09T19:52:50Z"
+ content="""
+Ben, I can reproduce that, but the file appearing modified in git status 
+is a known problem documented in [[todo/smudge]]. It's one
+of the primary reasons that v6 mode remains experimental.
+
+While `git commit -a` in that clone does cause the file to be converted
+from git to annex, touching the file and committing has the same effect. If
+you want to juggle annexed and non-annexed files in a v6 repository without
+letting annex.largefiles tell git-annex what to do, you have to manually
+tell it what to do every time the file is staged. When you `git commit -a`,
+you stage the file and so you need to include `-c annex.largefiles=nothing`
+to keep it from transitioning to the annex.
+
+I think it might make sense to get v6 working to the point that it's
+non-experimental before worrying about such a marginal edge case as this.
+"""]]

response
diff --git a/doc/tips/largefiles/comment_11_e7c83e38c92b8fc91bae140c1d7c007d._comment b/doc/tips/largefiles/comment_11_e7c83e38c92b8fc91bae140c1d7c007d._comment
new file mode 100644
index 000000000..a29114eeb
--- /dev/null
+++ b/doc/tips/largefiles/comment_11_e7c83e38c92b8fc91bae140c1d7c007d._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""Re: Adopting "git annex add" as default command in workflow"""
+ date="2018-08-09T19:46:22Z"
+ content="""
+@davicastro yes, using git-annex add for adding both kinds of files is
+the workflow this is about. Other than git add features like `--interactive`
+I see no need to ever use git add once you have this set up.
+"""]]

hm
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_6_7081b1386ca8807dff79e8613088b619._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_6_7081b1386ca8807dff79e8613088b619._comment
new file mode 100644
index 000000000..05da91a75
--- /dev/null
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_6_7081b1386ca8807dff79e8613088b619._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 6"""
+ date="2018-08-09T19:16:10Z"
+ content="""
+In v6, `git add` rather than `git annex add` will also add it to the annex,
+given annex.largefiles setting. Of course, `git annex add` can also be used
+in v6 mode too. And `-n` is a flag to `git commit`, not `git add`.
+
+Let's please not entangle this bug with that other bug. Unless your goal is
+that I merge them. Bear in mind that I consider the other bug a snake pit,
+and probably should have closed it as utterly useless some time ago.
+"""]]

document converting from git to annex and back
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_4_8ab0f0852a1dc6c51a502f07faba61eb._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_4_8ab0f0852a1dc6c51a502f07faba61eb._comment
new file mode 100644
index 000000000..7a3436b63
--- /dev/null
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_4_8ab0f0852a1dc6c51a502f07faba61eb._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2018-08-09T19:04:28Z"
+ content="""
+Documented both conversions in
+<https://git-annex.branchable.com/tips/largefiles>
+"""]]
diff --git a/doc/tips/largefiles.mdwn b/doc/tips/largefiles.mdwn
index ae81e54ef..b0c37c7d1 100644
--- a/doc/tips/largefiles.mdwn
+++ b/doc/tips/largefiles.mdwn
@@ -108,3 +108,34 @@ be stored in the annex, you can temporarily override the configuration like
 this:
 
 	git annex add -c annex.largefiles=anything smallfile
+
+## converting git to annexed
+
+When you have a file that is currently stored in git, and you want to
+convert that to be stored in the annex, here's how to accomplish that:
+
+	git rm --cached file
+	git annex add -c annex.largefiles=anything file
+	git commit file
+
+This first removes the file from git's index cache, and then adds it back
+using git-annex. You can modify the file before the `git-annex add` step,
+perhaps replacing it with new larger content that necessitates git-annex.
+
+## converting annexed to git
+
+When you have a file that is currently stored in the annex, and you want to
+convert that to be stored in git, here's how to accomplish that:
+
+	git annex unlock file
+	git add file
+	git commit -n file
+
+You can modify the file after unlocking it and before adding it to
+git. And this is probably a good idea if it was really a big file,
+so that you can replace its content with something smaller.
+
+Notice the `-n` switch when committing the file. This bypasses the
+[[git-annex-precommit]] hook. In this situation, that hook sees an unlocked
+file and wants to add it back to the annex, so you have to bypass the hook
+to get the file committed to git.

Added a comment: may be one more gitattribute to instruct on either conversion is desired for the file?
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_4_010b0b10ce6f9cd2b9a3f9847ff2a91a._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_4_010b0b10ce6f9cd2b9a3f9847ff2a91a._comment
new file mode 100644
index 000000000..e1ec8152d
--- /dev/null
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_4_010b0b10ce6f9cd2b9a3f9847ff2a91a._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="may be one more gitattribute to instruct on either conversion is desired for the file?"
+ date="2018-08-09T18:59:56Z"
+ content="""
+I will place implementation and possible tech difficulties aside for now.
+I am afraid that here and [there](http://git-annex.branchable.com/bugs/Too_difficult_if_not_impossible_to_explicitly_add__47__keep_file_under_git___40__not_annex__41___in_v6_without_employing_.gitattributes/) we (well, me?) indeed wanted to see two conflicting behaviors somehow happen.  On one hand (in [there](http://git-annex.branchable.com/bugs/Too_difficult_if_not_impossible_to_explicitly_add__47__keep_file_under_git___40__not_annex__41___in_v6_without_employing_.gitattributes/)) we would like to keep the file initially committed to git under git, regardless of what .gitattributes instructs.
+On the other, here I expected the file to automagically jump between git and annex depending on `.gitattributes`.  So, rather than the explicit \"to git\" or \"to annex\" you outlined, to me the question sounds more like \"retain the same storage (git or annex) as before\" or \"possibly perform conversion according to .gitattributes\".  And I see usecases where for some files (directories, e.g. `.datalad/metadata`) we would like to see one strategy (auto-conversion) and for the others (default?) the other (maintain git/annex).   Given that in v6 there would only be `git add` (so no explicit `git` vs `git annex` add), and that `-n` for `git add` is a flag I was not even aware of, maybe it is better to think about being able to explicitly set some additional gitattribute to allow (or disallow?) the conversion for given files, and then have a consistent user-level `git annex add` (and in v6 `git add`) which would perform the necessary actions across the provided files according to `largefiles` and that additional attribute value to decide on the destiny of the file?
+"""]]

one more thought
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
index ab542574e..25f6c61a2 100644
--- a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
@@ -36,7 +36,9 @@ to git as they desire.
 
 Which is better, the implicit conversion or the explicit? I am not
 sure, but lean toward the explicit since it doesn't have this potential
-to confuse users.
+to confuse users. Also, the implicit conversion would only work when
+annex.largefiles is being used, but the explicit conversion can be done
+regardless.
 
 The explicit paths would be:
 

typo
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
index 9bde26d79..ab542574e 100644
--- a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
@@ -42,7 +42,7 @@ The explicit paths would be:
 
 	# annex to git                 # git to annex
 	git annex unlock file		
-	largen file                    smallen file
+	smallen file                   largen file
 	git add file                   git annex add file
 	git commit -n                  git commit
 

rethink and Q
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
index 3c788db56..9bde26d79 100644
--- a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
@@ -14,17 +14,40 @@ to tell that this file is not supposed to be treated as an unlocked file.
 I don't think we want `git annex add` to commit the file. That would be
 very surprising behavior!
 
-Instead, let's improve the pre-commit hook. It
-currently looks for typechanged files and adds them as annexed files.
-Why not make it check annex.largefiles? All it has to do is, if the file
-doesn't match annex.largefiles, leave it typechanged. Git will then commit
-its contents to git.
-
-With that, all the user has to do is unlock the file, modify it to make it
-small, and commit, and it will automatically be converted to an in-git file.
-
-(Since the pre-commit hook has that partial commit blocking
-when there are typechanged files, that will need to be changed to
-not block partial commits when none of typechanged files match
-annex.largefiles.)
+What git-annex could do is have the pre-commit hook notice that the file
+doesn't match annex.largefiles and not re-annex it, allowing the typechange to
+get committed to git. Then the user would only need to unlock the file,
+modify it to make it non-large, and commit it to get it checked into git.
+
+In a way, this is *too* easy, because if the user sees that working, they may
+expect to be able to turn a small file back into an annexed file by
+making the content large and running git commit on it w/o git-annex add.
+Which would be bad because that would commit a large file to git.
+I suppose the pre-commit hook could handle that too, but imagine it replacing
+eg a `configure` script that's expected to be shipped in the git repository
+with an annex symlink, which would be surprising.
+
+So it may be better to keep the conversion from annexed to in-git file
+and back explicit. This could be done by `git annex add` detecting
+this situation and erroring out with a message that suggests running
+`git commit -n` if the user wants to change the annexed file to an in-git
+file. That bypasses the pre-commit hook, so the typechange gets committed
+to git as they desire.
+
+Which is better, the implicit conversion or the explicit? I am not
+sure, but lean toward the explicit since it doesn't have this potential
+to confuse users.
+
+The explicit paths would be:
+
+	# annex to git                 # git to annex
+	git annex unlock file		
+	largen file                    smallen file
+	git add file                   git annex add file
+	git commit -n                  git commit
+
+Seems worth documenting somewhere. Or making a command that handles these
+conversions, but the largen and smallen steps being manual, and the
+possibility to combine multiple of these into a single commit argues
+against a conversion command.
 """]]

a plan
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
new file mode 100644
index 000000000..3c788db56
--- /dev/null
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_3_92b71940fe9d00bf58ee08b327f4a991._comment
@@ -0,0 +1,30 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2018-08-09T16:17:33Z"
+ content="""
+So the root problem is that when we have a typechanged file and want to
+convert that to be not typechanged, we have to git commit it.
+As long as the previous commit is a symlink and the file in the index is
+not, the file will be typechanged by definition.
+
+When git-annex add runs `git add file`, it's doing the only thing it can
+do, but it leaves the file typechanged, and so git-annex later has no way
+to tell that this file is not supposed to be treated as an unlocked file.
+I don't think we want `git annex add` to commit the file. That would be
+very surprising behavior!
+
+Instead, let's improve the pre-commit hook. It
+currently looks for typechanged files and adds them as annexed files.
+Why not make it check annex.largefiles? All it has to do is, if the file
+doesn't match annex.largefiles, leave it typechanged. Git will then commit
+its contents to git.
+
+With that, all the user has to do is unlock the file, modify it to make it
+small, and commit, and it will automatically be converted to an in-git file.
+
+(Since the pre-commit hook has that partial commit blocking
+when there are typechanged files, that will need to be changed to
+not block partial commits when none of typechanged files match
+annex.largefiles.)
+"""]]

same root cause for this too
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_2_19de1d89dd0a266ce43e0f25cd2b74cd._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_2_19de1d89dd0a266ce43e0f25cd2b74cd._comment
index 8401b95d4..50c512e36 100644
--- a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_2_19de1d89dd0a266ce43e0f25cd2b74cd._comment
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_2_19de1d89dd0a266ce43e0f25cd2b74cd._comment
@@ -4,4 +4,9 @@
  date="2018-08-09T16:05:27Z"
  content="""
 The double output from `git-annex add file` is also some kind of minor bug.
+
+The double output seems to have the same root cause too: The file is left
+in typechanged form by the first pass of the add, and so the second pass
+sees it again. When annex.largefiles lets the file be annexed, the doubled
+output does not occur.
 """]]

understand now
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_1_ad19d064dd9c175debe647a6928e4f35._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_1_ad19d064dd9c175debe647a6928e4f35._comment
index b627c5d2e..d6167d789 100644
--- a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_1_ad19d064dd9c175debe647a6928e4f35._comment
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_1_ad19d064dd9c175debe647a6928e4f35._comment
@@ -3,16 +3,14 @@
  subject="""comment 1"""
  date="2018-08-09T15:57:57Z"
  content="""
-The partial commit blocking is explained in
-[[!commit adc5ca70a8095a389273e7c286cb32de6873a5a3]]
+I take this as not being a bug about the partial commit
+blocking (as explained in
+[[!commit adc5ca70a8095a389273e7c286cb32de6873a5a3]]), which is working
+around a git behavior and so can't be fixed other than by going to v6.
 
-That is working around a behavior of git, so it cannot be improved in
-git-annex. If it didn't block partial commits, the repository would be left
-in a broken state. The error message explains what you need to do.
-
-(The best fix for this issue is making v6 mode work well enough to ditch v5
-mode. This is indeed one of the motivations for v6 mode in the first place.)
-
-So I don't see a bug here except possibly with the part where git commit -a
-commits to annex not git.
+Instead, I think this is a bug about git annex add of an unlocked file
+not converting it to an in-git file when annex.largefiles says it ought
+to. If it did that it would not run into the partial commit blocking
+at all. And, the observation about git commit -a committing to the annex
+not to git points at the same problem.
 """]]

and
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_2_19de1d89dd0a266ce43e0f25cd2b74cd._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_2_19de1d89dd0a266ce43e0f25cd2b74cd._comment
new file mode 100644
index 000000000..8401b95d4
--- /dev/null
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_2_19de1d89dd0a266ce43e0f25cd2b74cd._comment
@@ -0,0 +1,7 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2018-08-09T16:05:27Z"
+ content="""
+The double output from `git-annex add file` is also some kind of minor bug.
+"""]]

followup
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_1_ad19d064dd9c175debe647a6928e4f35._comment b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_1_ad19d064dd9c175debe647a6928e4f35._comment
new file mode 100644
index 000000000..b627c5d2e
--- /dev/null
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now/comment_1_ad19d064dd9c175debe647a6928e4f35._comment
@@ -0,0 +1,18 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2018-08-09T15:57:57Z"
+ content="""
+The partial commit blocking is explained in
+[[!commit adc5ca70a8095a389273e7c286cb32de6873a5a3]]
+
+That is working around a behavior of git, so it cannot be improved in
+git-annex. If it didn't block partial commits, the repository would be left
+in a broken state. The error message explains what you need to do.
+
+(The best fix for this issue is making v6 mode work well enough to ditch v5
+mode. This is indeed one of the motivations for v6 mode in the first place.)
+
+So I don't see a bug here except possibly with the part where git commit -a
+commits to annex not git.
+"""]]

initial report about failure to commit
diff --git a/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now.mdwn b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now.mdwn
new file mode 100644
index 000000000..9fdfa2012
--- /dev/null
+++ b/doc/bugs/cannot_commit___34__annex_add__34__ed_modified_file_which_switched_its_largefile_status_to_be_committed_to_git_now.mdwn
@@ -0,0 +1,70 @@
+### Please describe the problem.
+
+`git commit` fails when asked to commit a file which went from "largefile" to small, according to .gitattributes settings, if we `git annex add file` before committing the change.
+
+### What version of git-annex are you using? On what operating system?
+
+6.20180720+gitg03978571f-1~ndall+1
+
+### Please provide any additional information below.
+
+Here is a full script to reproduce it:
+[[!format sh """
+#!/bin/bash
+
+set -ex
+
+builtin cd /tmp; 
+
+if [ -e /tmp/repo ]; then
+    chmod -R +w /tmp/repo; 
+    rm -rf /tmp/repo; 
+fi
+mkdir /tmp/repo; 
+cd /tmp/repo; 
+git init; 
+git annex init; 
+echo '* annex.largefiles=(largerthan=5b)' >.gitattributes; 
+git add .gitattributes; 
+git commit -m 'added .gitattri'; 
+echo 123456 > file; 
+git annex add file; 
+git commit -m add1; 
+ls -l; 
+
+git annex unlock file; 
+echo 123 >| file
+git annex add file
+
+# this would work but commit to git-annex, not git despite .gitattributes settings
+# git commit -m edit -a 
+# This one would fail to commit at all, complaining about "partial commit"
+git commit -m edit file
+ls -l file; 
+
+git status
+
+
+"""]]
+
+which leads to 
+[[!format sh """
+...
++ git annex add file
+add file (non-large file; adding content to git repository) ok
+add file (non-large file; adding content to git repository) ok
+(recording state in git...)
++ git commit -m edit file
+git-annex: Cannot make a partial commit with unlocked annexed files. You should `git annex add` the files you want to commit, and then run git commit.
+"""]]
+
+additional observations:
+
+- works fine if the file remains a large file (e.g. we just append to it)
+- does not fail if we do `git commit -a` instead of `git commit file`, but then it commits the file to annex, not to git, even though the preceding `git annex add` message rightly says "non-large file; adding content to git repository"
+
+Expected behavior:
+- have consistent behavior between `commit -a` and `commit file`
+- commit without a failure, committing to git (since .gitattributes instructs so and even `git annex add` reports that)
+
+[[!meta author=yoh]]

make --batch honor matching options
When --batch is used with matching options like --in, --metadata, etc, only
operate on the provided files when they match those options. Otherwise, a
blank line is output in the batch protocol.
Affected commands: find, add, whereis, drop, copy, move, get
In the case of find, the documentation for --batch already said it honored
the matching options. The docs for the rest didn't, but it makes sense to
have them honor them. While this is a behavior change, why specify the
matching options with --batch if you didn't want them to apply?
Note that the batch output for all of the affected commands could
already output a blank line in other cases, so batch users should
already be prepared to deal with it.
git-annex metadata didn't seem worth making support the matching options;
since all it does is output metadata or set metadata, the use cases for
using it in combination with the matching options seem small. Made it
refuse to run when they're combined, leaving open the possibility for later
support if a use case develops.
This commit was sponsored by Brett Eisenberg on Patreon.
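
A batch consumer has to treat a blank output line as "this input was skipped". The following is a minimal sketch of that handling, not part of git-annex: `classify_batch_output` is a hypothetical helper, the file names are made up, and the output file is simulated rather than produced by a real run. It assumes one output line per input line, which fits `find --batch` but may not hold for commands that print several lines per file.

```shell
#!/bin/sh
# Hypothetical helper (not part of git-annex): pair each input path with the
# corresponding line of batch output, assuming one output line per input.
classify_batch_output() {
    inputs=$1   # file listing the paths fed to --batch, one per line
    outputs=$2  # file holding the batch output, one line per input
    tab=$(printf '\t')
    paste -d "$tab" "$inputs" "$outputs" | while IFS=$tab read -r name result; do
        if [ -z "$result" ]; then
            # Blank line: the file was skipped, e.g. it did not match
            # the matching options.
            echo "skipped: $name"
        else
            echo "matched: $name"
        fi
    done
}

tmp=$(mktemp -d)
printf '%s\n' big.iso notes.txt > "$tmp/inputs"
# Simulated output. With git-annex installed, it could instead come from
# something like:  git annex find --batch --copies 2 < inputs > outputs
printf '%s\n' big.iso '' > "$tmp/outputs"
classify_batch_output "$tmp/inputs" "$tmp/outputs"
```

Pairing by line position only works because every input yields exactly one output line, blank or not, which is the property this change relies on.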
diff --git a/CHANGELOG b/CHANGELOG
index 2b011e011..fcc69d1ee 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,14 @@
+git-annex (6.20180808) UNRELEASED; urgency=medium
+
+  * When --batch is used with matching options like --in, --metadata,
+    etc, only operate on the provided files when they match those options.
+    Otherwise, a blank line is output in the batch protocol.
+    Affected commands: find, add, whereis, drop, copy, move, get
+  * Make metadata --batch combined with matching options refuse to run,
+    since it does not seem worth supporting that combination.
+
+ -- Joey Hess <id@joeyh.name>  Wed, 08 Aug 2018 11:24:08 -0400
+
 git-annex (6.20180807) upstream; urgency=medium
 
   * S3: Support credential-less download from remotes configured
diff --git a/CmdLine/Batch.hs b/CmdLine/Batch.hs
index 82038314c..cea108f12 100644
--- a/CmdLine/Batch.hs
+++ b/CmdLine/Batch.hs
@@ -12,6 +12,8 @@ import Types.Command
 import CmdLine.Action
 import CmdLine.GitAnnex.Options
 import Options.Applicative
+import Limit
+import Types.FileMatcher
 
 data BatchMode = Batch | NoBatch
 
@@ -72,5 +74,18 @@ batchCommandAction a = maybe (batchBadInput Batch) (const noop)
 
 -- Reads lines of batch input and passes the filepaths to a CommandStart
 -- to handle them.
-batchFiles :: (FilePath -> CommandStart) -> Annex ()
-batchFiles a = batchInput Right $ batchCommandAction . a
+--
+-- File matching options are not checked.
+allBatchFiles :: (FilePath -> CommandStart) -> Annex ()
+allBatchFiles a = batchInput Right $ batchCommandAction . a
+
+-- Like allBatchFiles, but checks the file matching options
+-- and skips non-matching files.
+batchFilesMatching :: (FilePath -> CommandStart) -> Annex ()
+batchFilesMatching a = do
+	matcher <- getMatcher
+	allBatchFiles $ \f ->
+		ifM (matcher $ MatchingFile $ FileInfo f f)
+			( a f
+			, return Nothing
+			)
diff --git a/Command/Add.hs b/Command/Add.hs
index 10148ad50..de602a9a7 100644
--- a/Command/Add.hs
+++ b/Command/Add.hs
@@ -62,7 +62,7 @@ seek o = allowConcurrentOutput $ do
 		Batch
 			| updateOnly o ->
 				giveup "--update --batch is not supported"
-			| otherwise -> batchFiles gofile
+			| otherwise -> batchFilesMatching gofile
 		NoBatch -> do
 			l <- workTreeItems (addThese o)
 			let go a = a gofile l
diff --git a/Command/Copy.hs b/Command/Copy.hs
index daf2e66bc..d3248f42c 100644
--- a/Command/Copy.hs
+++ b/Command/Copy.hs
@@ -47,7 +47,7 @@ seek :: CopyOptions -> CommandSeek
 seek o = allowConcurrentOutput $ do
 	let go = whenAnnexed $ start o
 	case batchOption o of
-		Batch -> batchInput Right (batchCommandAction . go)
+		Batch -> batchFilesMatching go
 		NoBatch -> withKeyOptions
 			(keyOptions o) (autoMode o)
 			(Command.Move.startKey (fromToOptions o) Command.Move.RemoveNever)
diff --git a/Command/Drop.hs b/Command/Drop.hs
index baeae66ee..4d7f13f68 100644
--- a/Command/Drop.hs
+++ b/Command/Drop.hs
@@ -54,7 +54,7 @@ parseDropFromOption = parseRemoteOption <$> strOption
 seek :: DropOptions -> CommandSeek
 seek o = allowConcurrentOutput $
 	case batchOption o of
-		Batch -> batchInput Right (batchCommandAction . go)
+		Batch -> batchFilesMatching go
 		NoBatch -> withKeyOptions (keyOptions o) (autoMode o)
 			(startKeys o)
 			(withFilesInGit go)
diff --git a/Command/Find.hs b/Command/Find.hs
index 10eff3527..9d7c040d2 100644
--- a/Command/Find.hs
+++ b/Command/Find.hs
@@ -51,7 +51,7 @@ parseFormatOption =
 seek :: FindOptions -> CommandSeek
 seek o = case batchOption o of
 	NoBatch -> withFilesInGit go =<< workTreeItems (findThese o)
-	Batch -> batchFiles go
+	Batch -> batchFilesMatching go
   where
 	go = whenAnnexed $ start o
 
diff --git a/Command/Get.hs b/Command/Get.hs
index eac8e88a4..fde65c501 100644
--- a/Command/Get.hs
+++ b/Command/Get.hs
@@ -42,7 +42,7 @@ seek o = allowConcurrentOutput $ do
 	from <- maybe (pure Nothing) (Just <$$> getParsed) (getFrom o)
 	let go = whenAnnexed $ start o from
 	case batchOption o of
-		Batch -> batchInput Right (batchCommandAction . go)
+		Batch -> batchFilesMatching go
 		NoBatch -> withKeyOptions (keyOptions o) (autoMode o)
 			(startKeys from)
 			(withFilesInGit go)
diff --git a/Command/MetaData.hs b/Command/MetaData.hs
index 282b7fda0..23f16a53a 100644
--- a/Command/MetaData.hs
+++ b/Command/MetaData.hs
@@ -15,6 +15,7 @@ import Annex.WorkTree
 import Messages.JSON (JSONActionItem(..))
 import Types.Messages
 import Utility.Aeson
+import Limit
 
 import qualified Data.Set as S
 import qualified Data.Map as M
@@ -83,8 +84,11 @@ seek o = case batchOption o of
 			(seeker $ whenAnnexed $ start c o)
 			=<< workTreeItems (forFiles o)
 	Batch -> withMessageState $ \s -> case outputType s of
-		JSONOutput _ -> batchInput parseJSONInput $
-			commandAction . startBatch
+		JSONOutput _ -> ifM limited
+			( giveup "combining --batch with file matching options is not currently supported"
+			, batchInput parseJSONInput $
+				commandAction . startBatch
+			)
 		_ -> giveup "--batch is currently only supported in --json mode"
 
 start :: VectorClock -> MetaDataOptions -> FilePath -> Key -> CommandStart
diff --git a/Command/Move.hs b/Command/Move.hs
index b50c877bc..f5de2c963 100644
--- a/Command/Move.hs
+++ b/Command/Move.hs
@@ -57,7 +57,7 @@ seek :: MoveOptions -> CommandSeek
 seek o = allowConcurrentOutput $ do
 	let go = whenAnnexed $ start (fromToOptions o) (removeWhen o)
 	case batchOption o of
-		Batch -> batchInput Right (batchCommandAction . go)
+		Batch -> batchFilesMatching go
 		NoBatch -> withKeyOptions (keyOptions o) False
 			(startKey (fromToOptions o) (removeWhen o))
 			(withFilesInGit go)
diff --git a/Command/Whereis.hs b/Command/Whereis.hs
index b14e231c1..988c4aaf5 100644
--- a/Command/Whereis.hs
+++ b/Command/Whereis.hs
@@ -40,7 +40,7 @@ seek o = do
 	m <- remoteMap id
 	let go = whenAnnexed $ start m
 	case batchOption o of
-		Batch -> batchFiles go
+		Batch -> batchFilesMatching go
 		NoBatch -> 
 			withKeyOptions (keyOptions o) False
 				(startKeys m)
diff --git a/doc/bugs/find_with_batch_does_not_apply_matching_options.mdwn b/doc/bugs/find_with_batch_does_not_apply_matching_options.mdwn
index 767f1db48..813ae9caf 100644
--- a/doc/bugs/find_with_batch_does_not_apply_matching_options.mdwn
+++ b/doc/bugs/find_with_batch_does_not_apply_matching_options.mdwn
@@ -8,3 +8,6 @@ Using `git annex find --batch` with matching options seems to not apply them.
 
 ### What version of git-annex are you using? On what operating system?
 I'd rather not say ~~because it is ancient~~. Joey says this is reproducible on recent git-annex versions though.
+
+> Not only find, but a bunch of commands supporting --batch had this
+> oversight. Fixed them all. [[done]] --[[Joey]]
diff --git a/doc/git-annex-add.mdwn b/doc/git-annex-add.mdwn
index ff7bc4004..f24ec761d 100644
--- a/doc/git-annex-add.mdwn
+++ b/doc/git-annex-add.mdwn
@@ -77,8 +77,9 @@ annexed content, and other symlinks.
   the file is added, and repeat.
 
   Note that if a file is skipped (due to not existing, being gitignored,
-  already being in git etc), an empty line will be output instead of the
-  normal output produced when adding a file.
+  already being in git, or not meeting the matching options),
+  an empty line will be output instead of the normal output produced
+  when adding a file.
 
 # SEE ALSO
 
diff --git a/doc/git-annex-copy.mdwn b/doc/git-annex-copy.mdwn
index fedeaa067..9a5b48be9 100644
--- a/doc/git-annex-copy.mdwn

(Diff truncated)
moreinfo this
diff --git a/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to.mdwn b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to.mdwn
index 429750ad1..6bd4b4b1e 100644
--- a/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to.mdwn
+++ b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to.mdwn
@@ -23,3 +23,4 @@ git-annex-shell: git-shell failed
 ### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
 Yes.
 
+[[moreinfo]]
diff --git a/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_3_335fea47cef83b6dfad7ce8c53193f82._comment b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_3_335fea47cef83b6dfad7ce8c53193f82._comment
new file mode 100644
index 000000000..714035994
--- /dev/null
+++ b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_3_335fea47cef83b6dfad7ce8c53193f82._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2018-08-08T15:04:09Z"
+ content="""
+In what configuration?
+
+This clearly does not affect the general case, or everyone who is happily
+using git-annex would have reported the bug.
+
+Please provide enough information to reproduce your problem.
+"""]]

close
diff --git a/doc/bugs/git-annex_requires_an_SSH_remote_to_have_an_absolute_path.mdwn b/doc/bugs/git-annex_requires_an_SSH_remote_to_have_an_absolute_path.mdwn
index 0611801b4..23720ebd8 100644
--- a/doc/bugs/git-annex_requires_an_SSH_remote_to_have_an_absolute_path.mdwn
+++ b/doc/bugs/git-annex_requires_an_SSH_remote_to_have_an_absolute_path.mdwn
@@ -33,3 +33,5 @@ operating system: linux x86_64
 
 ### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
 Yes, I use it on my laptop without issues. I'm trying to set up a server on my desktop.
+
+> [[done]], apparently the reporter was mistaken --[[Joey]] 

diff --git a/doc/bugs/find_with_batch_does_not_apply_matching_options.mdwn b/doc/bugs/find_with_batch_does_not_apply_matching_options.mdwn
new file mode 100644
index 000000000..767f1db48
--- /dev/null
+++ b/doc/bugs/find_with_batch_does_not_apply_matching_options.mdwn
@@ -0,0 +1,10 @@
+### Please describe the problem.
+Using `git annex find --batch` with matching options seems to not apply them.
+
+### What steps will reproduce the problem?
+
+    find -type l | git annex find --batch --copies 2
+    ... list of files that include files with only 1 copy ...
+
+### What version of git-annex are you using? On what operating system?
+I'd rather not say ~~because it is ancient~~. Joey says this is reproducible on recent git-annex versions though.

cleanupc
diff --git a/doc/news/version_6.20171124/comment_1_df20feb1a0db999545c1d2d1d544c1bb._comment b/doc/news/version_6.20171124/comment_1_df20feb1a0db999545c1d2d1d544c1bb._comment
deleted file mode 100644
index 3d78f52ac..000000000
--- a/doc/news/version_6.20171124/comment_1_df20feb1a0db999545c1d2d1d544c1bb._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="sunny256"
- avatar="http://cdn.libravatar.org/avatar/8a221001f74d0e8f4dadee3c7d1996e4"
- subject="Version missing from the annex"
- date="2017-11-29T16:15:03Z"
- content="""
-It seems as this version is missing from https://downloads.kitenet.net/.git/ , the newest version there is v6.20171109.
-"""]]
diff --git a/doc/news/version_6.20171124/comment_2_1161cc986af34bfcd3a868e38bedae7e._comment b/doc/news/version_6.20171124/comment_2_1161cc986af34bfcd3a868e38bedae7e._comment
deleted file mode 100644
index b5cecf29d..000000000
--- a/doc/news/version_6.20171124/comment_2_1161cc986af34bfcd3a868e38bedae7e._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="joey"
- subject="""comment 2"""
- date="2017-11-29T21:38:26Z"
- content="""
-Indeed it was. I must have forgotten to push out the files for that
-release. Done so now.
-"""]]
diff --git a/doc/news/version_6.20180316/comment_1_81de8aa3859e65b944027f67bf5c8cc1._comment b/doc/news/version_6.20180316/comment_1_81de8aa3859e65b944027f67bf5c8cc1._comment
deleted file mode 100644
index c8c952878..000000000
--- a/doc/news/version_6.20180316/comment_1_81de8aa3859e65b944027f67bf5c8cc1._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="Michael"
- avatar="http://cdn.libravatar.org/avatar/86811fdafa094c610ec8ef8858a78dbf"
- subject="prebuilt i386 still at old version?"
- date="2018-04-08T01:44:24Z"
- content="""
-https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-i386.tar.gz - as of 2018-04-07 shows 6.20180227-g32d682dd8
-"""]]

add news item for git-annex 6.20180807
diff --git a/doc/news/version_6.20180409.mdwn b/doc/news/version_6.20180409.mdwn
deleted file mode 100644
index 37a4c5b80..000000000
--- a/doc/news/version_6.20180409.mdwn
+++ /dev/null
@@ -1,26 +0,0 @@
-git-annex 6.20180409 released with [[!toggle text="these changes"]]
-[[!toggleable text="""
-   * Added adb special remote which allows exporting files to Android devices.
-   * For url downloads, git-annex now defaults to using a http library,
-     rather than wget or curl. But, if annex.web-options is set, it will
-     use curl. To use the .netrc file, run:
-       git config annex.web-options --netrc
-   * git-annex no longer uses wget (and wget is no longer shipped with
-     git-annex builds).
-   * Enable HTTP connection reuse across multiple files for improved speed.
-   * Fix calculation of estimated completion for progress meter.
-   * OSX app: Work around libz/libPng/ImageIO.framework version skew
-     by not bundling libz, assuming OSX includes a suitable libz.1.dylib.
-   * Added annex.retry, annex.retry-delay, and per-remote versions
-     to configure transfer retries.
-   * Also do forward retrying in cases where no exception is thrown,
-     but the transfer failed.
-   * When adding a new version of a file, and annex.genmetadata is enabled,
-     don't copy the data metadata from the old version of the file,
-     instead use the mtime of the file.
-   * Avoid running annex.http-headers-command more than once.
-   * info: Added "combined size of repositories containing these files"
-     stat when run on a directory.
-   * info: Changed sorting of numcopies stats table, so it's ordered
-     by the variance from the desired number of copies.
-   * Fix resuming a download when using curl."""]]
\ No newline at end of file
diff --git a/doc/news/version_6.20180807.mdwn b/doc/news/version_6.20180807.mdwn
new file mode 100644
index 000000000..e3d84838e
--- /dev/null
+++ b/doc/news/version_6.20180807.mdwn
@@ -0,0 +1,15 @@
+git-annex 6.20180807 released with [[!toggle text="these changes"]]
+[[!toggleable text="""
+   * S3: Support credential-less download from remotes configured
+     with public=yes exporttree=yes.
+   * Fix reversion in display of http 404 errors.
+   * Added remote.name.annex-speculate-present config that can be used to
+     make cache remotes.
+   * Added --accessedwithin matching option.
+   * Added annex.commitmessage config that can specify a commit message
+     for the git-annex branch instead of the usual "update".
+   * Fix wrong sorting of remotes when using -J, it was sorting by uuid,
+     rather than cost.
+   * addurl: Include filename in --json-progress output.
+   * Fix git-annex branch data loss that could occur after
+     git-annex forget --drop-dead."""]]
\ No newline at end of file

Added a comment: Sorry
diff --git a/doc/bugs/git-annex_requires_an_SSH_remote_to_have_an_absolute_path/comment_2_406afc475368b5680be6c99514fbda74._comment b/doc/bugs/git-annex_requires_an_SSH_remote_to_have_an_absolute_path/comment_2_406afc475368b5680be6c99514fbda74._comment
new file mode 100644
index 000000000..341f72912
--- /dev/null
+++ b/doc/bugs/git-annex_requires_an_SSH_remote_to_have_an_absolute_path/comment_2_406afc475368b5680be6c99514fbda74._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="https://openid-provider.appspot.com/iakornfeld"
+ nickname="iakornfeld"
+ avatar="http://cdn.libravatar.org/avatar/c0369f5727cad81d1ecf6c2e657b42a1b756284aad0229351f9027a2cfcb2037"
+ subject="Sorry"
+ date="2018-08-07T14:12:20Z"
+ content="""
+This was me misreading the debug output.
+"""]]

removed
diff --git a/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_2_63326b4e97f2b30529e7e3d2bdabbad7._comment b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_2_63326b4e97f2b30529e7e3d2bdabbad7._comment
deleted file mode 100644
index 1cc98f72d..000000000
--- a/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_2_63326b4e97f2b30529e7e3d2bdabbad7._comment
+++ /dev/null
@@ -1,9 +0,0 @@
-[[!comment format=mdwn
- username="https://openid-provider.appspot.com/iakornfeld"
- nickname="iakornfeld"
- avatar="http://cdn.libravatar.org/avatar/c0369f5727cad81d1ecf6c2e657b42a1b756284aad0229351f9027a2cfcb2037"
- subject="Yes"
- date="2018-08-07T14:07:38Z"
- content="""
-git push does this.
-"""]]

Added a comment: Yes
diff --git a/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_3_abaeb3d5c0322deeec790d80b2948549._comment b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_3_abaeb3d5c0322deeec790d80b2948549._comment
new file mode 100644
index 000000000..a02a75868
--- /dev/null
+++ b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_3_abaeb3d5c0322deeec790d80b2948549._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="https://openid-provider.appspot.com/iakornfeld"
+ nickname="iakornfeld"
+ avatar="http://cdn.libravatar.org/avatar/c0369f5727cad81d1ecf6c2e657b42a1b756284aad0229351f9027a2cfcb2037"
+ subject="Yes"
+ date="2018-08-07T14:07:56Z"
+ content="""
+git-annex does this when checking remotes for git-annex support.
+"""]]

Added a comment: Yes
diff --git a/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_2_63326b4e97f2b30529e7e3d2bdabbad7._comment b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_2_63326b4e97f2b30529e7e3d2bdabbad7._comment
new file mode 100644
index 000000000..1cc98f72d
--- /dev/null
+++ b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_2_63326b4e97f2b30529e7e3d2bdabbad7._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="https://openid-provider.appspot.com/iakornfeld"
+ nickname="iakornfeld"
+ avatar="http://cdn.libravatar.org/avatar/c0369f5727cad81d1ecf6c2e657b42a1b756284aad0229351f9027a2cfcb2037"
+ subject="Yes"
+ date="2018-08-07T14:07:38Z"
+ content="""
+git push does this.
+"""]]

Fix git-annex branch data loss that could occur after git-annex forget --drop-dead
Added getStaged, to get the versions of git-annex branch files staged in its
index, and use during transitions so the result of merging sibling branches
is used.
The catFileStop in performTransitionsLocked is absolutely necessary,
without that the bug still occurred, because git cat-file was already
running and was looking at the old index file.
Note that getLocal still has cat-file look at the git-annex branch, not the
index. It might be faster if it looked at the index, but probably only
marginally so, and I've not benchmarked it to see if it's faster at all. I
didn't want to change unrelated behavior as part of this bug fix. And as
the need for catFileStop shows, using the index file has added
complications.
Anyway, it still seems fine for getLocal to look at the git-annex branch,
because normally the index file is updated just before the git-annex branch
is committed, and so they'll contain the same information. It's only during
a transition that the two diverge.
This commit was sponsored by Paul Walmsley in honor of Mark Phillips.
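
The difference between reading a file from a branch and reading it from the index can be seen with plain git. This is only an illustrative sketch in a throwaway repository: it uses `HEAD` and the default index, whereas git-annex works with the git-annex branch and its own index file, and the file content here is invented.

```shell
#!/bin/sh
# Throwaway repository; nothing here touches a real git-annex branch.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Commit one version of the file.
echo "committed" > trust.log
git add trust.log
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

# Stage a newer version without committing it, standing in for the state
# after sibling branches have been merged into the index.
echo "merged in index only" > trust.log
git add trust.log

# "HEAD:path" reads the last committed version; ":path" reads the index.
committed=$(git cat-file -p HEAD:trust.log)
staged=$(git cat-file -p :trust.log)
echo "HEAD:trust.log -> $committed"
echo ":trust.log     -> $staged"
```

Reading via the ref returns the older content even though the index already holds the newer one, which is the kind of divergence that only shows up during a transition.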
diff --git a/Annex/Branch.hs b/Annex/Branch.hs
index a3945a18c..e465b7532 100644
--- a/Annex/Branch.hs
+++ b/Annex/Branch.hs
@@ -223,10 +223,15 @@ getLocal :: FilePath -> Annex String
 getLocal file = go =<< getJournalFileStale file
   where
 	go (Just journalcontent) = return journalcontent
-	go Nothing = getRaw file
+	go Nothing = getRef fullname file
 
-getRaw :: FilePath -> Annex String
-getRaw = getRef fullname
+{- Gets the content of a file as staged in the branch's index. -}
+getStaged :: FilePath -> Annex String
+getStaged = getRef indexref
+  where
+	-- This makes git cat-file be run with ":file",
+	-- so it looks at the index.
+	indexref = Ref ""
 
 getHistorical :: RefDate -> FilePath -> Annex String
 getHistorical date file =
@@ -533,6 +538,10 @@ performTransitionsLocked jl ts neednewlocalbranch transitionedrefs = do
 	-- update the git-annex branch, while it usually holds changes
 	-- for the head branch. Flush any such changes.
 	Annex.Queue.flush
+	-- Stop any running git cat-files, to ensure that the
+	-- getStaged calls below use the current index, and not some older
+	-- one.
+	catFileStop
 	withIndex $ do
 		prepareModifyIndex jl
 		run $ mapMaybe getTransitionCalculator tlist
@@ -557,15 +566,15 @@ performTransitionsLocked jl ts neednewlocalbranch transitionedrefs = do
 	 - to hold changes to every file in the branch at once.)
 	 -
 	 - When a file in the branch is changed by transition code,
-	 - that value is remembered and fed into the code for subsequent
+	 - its new content is remembered and fed into the code for subsequent
 	 - transitions.
 	 -}
 	run [] = noop
 	run changers = do
-		trustmap <- calcTrustMap <$> getRaw trustLog
+		trustmap <- calcTrustMap <$> getStaged trustLog
 		fs <- branchFiles
 		forM_ fs $ \f -> do
-			content <- getRaw f
+			content <- getStaged f
 			apply changers f content trustmap
 	apply [] _ _ _ = return ()
 	apply (changer:rest) file content trustmap =
diff --git a/CHANGELOG b/CHANGELOG
index 8e1b0eb1f..908c42b03 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -11,6 +11,8 @@ git-annex (6.20180720) UNRELEASED; urgency=medium
   * Fix wrong sorting of remotes when using -J, it was sorting by uuid,
     rather than cost.
   * addurl: Include filename in --json-progress output.
+  * Fix git-annex branch data loss that could occur after
+    git-annex forget --drop-dead.
 
  -- Joey Hess <id@joeyh.name>  Tue, 31 Jul 2018 12:14:11 -0400
 
diff --git a/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information.mdwn b/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information.mdwn
index c9e94389e..5c76e9494 100644
--- a/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information.mdwn
+++ b/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information.mdwn
@@ -152,3 +152,5 @@ B IS IN GROUP:
 ### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
 
 Yes, git-annex is slowly replacing all of my other sync and backup systems I've cobbled together over the years.
+
+> [[fixed|done]] --[[Joey]]
diff --git a/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information/comment_1_f4b78aeaf9163b374254c08057222661._comment b/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information/comment_1_f4b78aeaf9163b374254c08057222661._comment
new file mode 100644
index 000000000..75bd9bd1a
--- /dev/null
+++ b/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information/comment_1_f4b78aeaf9163b374254c08057222661._comment
@@ -0,0 +1,25 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2018-08-06T19:49:06Z"
+ content="""
+Great bug report!
+
+There is a general data loss problem in this scenario, it's not specific to
+group changes at all. Changes that were only present in a sibling git-annex
+branch are not being preserved when the repository updates its git-annex
+branch index file for a transition.
+
+The index file lacking those changes then gets committed with the sibling
+branches as parent(s). So the changes are effectively reverted.
+
+The root cause is that handleTransitions uses getRaw
+to get the contents of files. That uses git cat-file git-annex:$file, which
+gets the version last committed to the git-annex branch,
+not the version from the git-annex branch index file. And handleTransitions is
+run after all sibling branches have been union merged in the index file
+but not committed yet. 
+
+The fix is to instead use git cat-file :$file, so it will get the version
+from the index. 
+"""]]
diff --git a/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information/comment_2_1468e26d2da85bcaf50715fbe3c38490._comment b/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information/comment_2_1468e26d2da85bcaf50715fbe3c38490._comment
new file mode 100644
index 000000000..8fe3936bc
--- /dev/null
+++ b/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information/comment_2_1468e26d2da85bcaf50715fbe3c38490._comment
@@ -0,0 +1,19 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2018-08-06T21:26:36Z"
+ content="""
+If you lost things from the git-annex branch due to this bug,
+you can find the commit that contained them by `git log git-annex`,
+and look for the commit before the "continuing transition" commit.
+
+It's then possible to get those changes applied back to the git-annex
+branch; there should be no permanent data loss due to this bug.
+
+Eg, here the commit that contained the lost group change was
+261d1be6a2093f1e4059ed3030016c365f29413f. To get that back into the
+git-annex branch, I ran:
+
+	git update-ref git-annex 261d1be6a2093f1e4059ed3030016c365f29413f
+	git annex merge
+"""]]

diff --git a/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information.mdwn b/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information.mdwn
new file mode 100644
index 000000000..c9e94389e
--- /dev/null
+++ b/doc/bugs/git_annex_forget_--drop-dead_can_lose_group_information.mdwn
@@ -0,0 +1,154 @@
+### Please describe the problem.
+
+If multiple remotes edit group information and one of them does `git annex forget --force --drop-dead` some of those edits can be lost on sync.
+
+### What steps will reproduce the problem?
+
+Make a temporary directory and `cd` into it. Then run this script:
+
+[[!format sh """
+#!/bin/bash
+
+git annex version
+
+git init a
+cd a
+git annex init
+touch test
+git annex add test
+git annex sync
+cd ..
+git clone a b
+cd b
+git annex sync
+
+cd ../a
+git annex group here ga
+git annex sync
+cd ../b
+git annex group here gb
+git annex forget --force --drop-dead
+git annex sync
+
+cd ../a
+git annex sync
+cd ../b
+git annex sync
+
+cd ../a
+echo "A IS IN GROUP:"
+git annex group .
+cd ../b
+echo "B IS IN GROUP:"
+git annex group .
+
+"""]]
+
+
+### What version of git-annex are you using? On what operating system?
+
+6.20170101-1+deb9u2 on Debian Stretch, but I also confirmed that this occurs in version 6.20180719
+
+### Please provide any additional information below.
+
+Here's the output of the above script. The interesting part is the last two lines which show that remote 'b' is not in any group despite being added to group 'gb'.
+
+[[!format sh """
+git-annex version: 6.20170101.1
+build flags: Assistant Webapp Pairing Testsuite S3(multipartupload)(storageclasses) WebDAV Inotify DBus DesktopNotify ConcurrentOutput TorrentParser MagicMime Feeds Quvi
+key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 SHA1E SHA1 MD5E MD5 WORM URL
+remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav tahoe glacier ddar hook external
+Initialized empty Git repository in /home/matthew/test-git-annex/a/.git/
+init  ok
+(recording state in git...)
+add test ok
+(recording state in git...)
+commit  
+[master (root-commit) ffcf48d] git-annex in matthew@thorium:~/test-git-annex/a
+ 1 file changed, 1 insertion(+)
+ create mode 120000 test
+ok
+Cloning into 'b'...
+done.
+(merging origin/git-annex into git-annex...)
+commit  (recording state in git...)
+
+On branch master
+Your branch is up-to-date with 'origin/master'.
+nothing to commit, working tree clean
+ok
+pull origin 
+ok
+push origin 
+Counting objects: 3, done.
+Delta compression using up to 4 threads.
+Compressing objects: 100% (3/3), done.
+Writing objects: 100% (3/3), 409 bytes | 0 bytes/s, done.
+Total 3 (delta 0), reused 0 (delta 0)
+To /home/matthew/test-git-annex/a
+ * [new branch]      git-annex -> synced/git-annex
+ok
+group here ok
+(recording state in git...)
+commit  
+On branch master
+nothing to commit, working tree clean
+ok
+group here ok
+(recording state in git...)
+forget git-annex (recording state in git...)
+ok
+(recording state in git...)
+commit  
+On branch master
+Your branch is up-to-date with 'origin/master'.
+nothing to commit, working tree clean
+ok
+pull origin 
+remote: Counting objects: 3, done.
+remote: Compressing objects: 100% (3/3), done.
+remote: Total 3 (delta 0), reused 0 (delta 0)
+Unpacking objects: 100% (3/3), done.
+From /home/matthew/test-git-annex/a
+   e9a879d..000eb8e  git-annex  -> origin/git-annex
+ok
+(merging origin/git-annex into git-annex...)
+(recording state in git...)
+(recording state in git...)
+push origin 
+Counting objects: 11, done.
+Delta compression using up to 4 threads.
+Compressing objects: 100% (10/10), done.
+Writing objects: 100% (11/11), 1.03 KiB | 0 bytes/s, done.
+Total 11 (delta 3), reused 0 (delta 0)
+To /home/matthew/test-git-annex/a
+ + e9a879d...00986fa git-annex -> synced/git-annex (forced update)
+ok
+(merging synced/git-annex into git-annex...)
+(recording state in git...)
+commit  
+On branch master
+nothing to commit, working tree clean
+ok
+commit  
+On branch master
+Your branch is up-to-date with 'origin/master'.
+nothing to commit, working tree clean
+ok
+pull origin 
+remote: Counting objects: 3, done.
+remote: Compressing objects: 100% (3/3), done.
+remote: Total 3 (delta 0), reused 0 (delta 0)
+Unpacking objects: 100% (3/3), done.
+From /home/matthew/test-git-annex/a
+ + 000eb8e...965b6af git-annex  -> origin/git-annex  (forced update)
+ok
+(merging origin/git-annex into git-annex...)
+A IS IN GROUP:
+ga
+B IS IN GROUP:
+"""]]
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+Yes, git-annex is slowly replacing all of my other sync and backup systems I've cobbled together over the years.

addurl: Include filename in --json-progress output when known.
diff --git a/CHANGELOG b/CHANGELOG
index 0fb62325f..8e1b0eb1f 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -10,6 +10,7 @@ git-annex (6.20180720) UNRELEASED; urgency=medium
     for the git-annex branch instead of the usual "update".
   * Fix wrong sorting of remotes when using -J, it was sorting by uuid,
     rather than cost.
+  * addurl: Include filename in --json-progress output.
 
  -- Joey Hess <id@joeyh.name>  Tue, 31 Jul 2018 12:14:11 -0400
 
diff --git a/Command/AddUrl.hs b/Command/AddUrl.hs
index 21a65746c..d3f84a7bb 100644
--- a/Command/AddUrl.hs
+++ b/Command/AddUrl.hs
@@ -126,7 +126,7 @@ checkUrl r o u = do
   where
 
 	go _ (Left e) = void $ commandAction $ do
-		showStart' "addurl" (Just u)
+		showStartAddUrl u o
 		warning (show e)
 		next $ next $ return False
 	go deffile (Right (UrlContents sz mf)) = do
@@ -148,7 +148,7 @@ startRemote :: Remote -> AddUrlOptions -> FilePath -> URLString -> Maybe Integer
 startRemote r o file uri sz = do
 	pathmax <- liftIO $ fileNameLengthLimit "."
 	let file' = joinPath $ map (truncateFilePath pathmax) $ splitDirectories file
-	showStart' "addurl" (Just uri)
+	showStartAddUrl uri o
 	showNote $ "from " ++ Remote.name r 
 	showDestinationFile file'
 	next $ performRemote r o uri file' sz
@@ -192,7 +192,7 @@ startWeb o urlstring = go $ fromMaybe bad $ parseURI urlstring
 	bad = fromMaybe (giveup $ "bad url " ++ urlstring) $
 		Url.parseURIRelaxed $ urlstring
 	go url = do
-		showStart' "addurl" (Just urlstring)
+		showStartAddUrl urlstring o
 		pathmax <- liftIO $ fileNameLengthLimit "."
 		urlinfo <- if relaxedOption (downloadOptions o)
 			then pure Url.assumeUrlExists
@@ -311,6 +311,15 @@ downloadWeb o url urlinfo file =
 					warning $ dest ++ " already exists; not overwriting"
 					return Nothing
 
+{- The destination file is not known at start time unless the user provided
+ - a filename. It's not displayed then for output consistency, 
+ - but is added to the json when available. -}
+showStartAddUrl url o = do
+	showStart' "addurl" (Just url)
+	case fileOption (downloadOptions o) of
+		Nothing -> noop
+		Just file -> maybeShowJSON $ JSONChunk [("file", file)]
+
 showDestinationFile :: FilePath -> Annex ()
 showDestinationFile file = do
 	showNote ("to " ++ file)
diff --git a/doc/todo/provide___39__file__39___in_--json-progress_record_for_addurl.mdwn b/doc/todo/provide___39__file__39___in_--json-progress_record_for_addurl.mdwn
index a68af9564..12b4cd91e 100644
--- a/doc/todo/provide___39__file__39___in_--json-progress_record_for_addurl.mdwn
+++ b/doc/todo/provide___39__file__39___in_--json-progress_record_for_addurl.mdwn
@@ -11,3 +11,9 @@ $> git annex addurl --file bigone --json --json-progress https://s3.amazonaws.co
 Thanks in advance
 
 [[!meta author=yoh]]
+
+> In general addurl doesn't know the filename until after it's downloaded
+> the url (due to running youtube-dl on html urls), but when --file
+> or --batch --with-files is used, it does know the filename early.
+> So, made the json-progress include the filename when it's known.
+> [[done]] --[[Joey]]

close
diff --git a/doc/bugs/get_--json_fails_whenever_plain_get_works___40__with_https_urls__41__.mdwn b/doc/bugs/get_--json_fails_whenever_plain_get_works___40__with_https_urls__41__.mdwn
index 1528de30d..6839e77a8 100644
--- a/doc/bugs/get_--json_fails_whenever_plain_get_works___40__with_https_urls__41__.mdwn
+++ b/doc/bugs/get_--json_fails_whenever_plain_get_works___40__with_https_urls__41__.mdwn
@@ -48,3 +48,7 @@ get sub-01/label/lh.aparc.a2009s.annot (from web...)
 """]]
 
 [[!meta author=yoh]]
+
+> git-annex no longer uses either curl or wget by default, and always uses
+> curl when configured to do so, so this kind of surprising behavior will
+> no longer occur [[done]] --[[Joey]]

followup
diff --git a/doc/todo/sync_--branches__to_sync_only_specified_branches___40__e.g._git-annex__41__/comment_2_73bd5d343286f61c9ac753ed3b00c149._comment b/doc/todo/sync_--branches__to_sync_only_specified_branches___40__e.g._git-annex__41__/comment_2_73bd5d343286f61c9ac753ed3b00c149._comment
new file mode 100644
index 000000000..ae5c7ec7e
--- /dev/null
+++ b/doc/todo/sync_--branches__to_sync_only_specified_branches___40__e.g._git-annex__41__/comment_2_73bd5d343286f61c9ac753ed3b00c149._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2018-08-06T15:55:56Z"
+ content="""
+This is quite old, is it still wanted?
+
+git's remote.name.fetch config can make it only fetch a particular branch,
+so that's one way to do this without adding an option.
+"""]]

comment
diff --git a/doc/todo/annex_merge_--remotes/comment_3_35614da544e315529b236a36e1b28e2d._comment b/doc/todo/annex_merge_--remotes/comment_3_35614da544e315529b236a36e1b28e2d._comment
new file mode 100644
index 000000000..be9e73303
--- /dev/null
+++ b/doc/todo/annex_merge_--remotes/comment_3_35614da544e315529b236a36e1b28e2d._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2018-08-06T15:47:56Z"
+ content="""
+This came up again in <https://git-annex.branchable.com/tips/local_caching_of_annexed_files/>
+and there it was sufficient to configure remote.name.fetch so that no
+branches were fetched from the cache remote.
+
+Approach #3 can be implemented using:
+
+	fetch = refs/heads/master:refs/remotes/private/nomerge/master
+
+This prevents git-fetch from fetching the git-annex branch, and
+it makes the remote master branch fetch into a name that
+git-annex won't automatically merge into master.
+"""]]
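
The fetch refspec in the comment above follows a regular pattern, which a small shell sketch can make explicit. `nomerge_refspec` is a hypothetical helper, and the remote name `private` is taken from the example refspec rather than from any git-annex convention.

```shell
# Hypothetical helper: builds a refspec that fetches one branch of a remote
# into a tracking name under <remote>/nomerge/, following the example in
# the comment above.
#   $1 = remote name, $2 = branch name
nomerge_refspec() {
    printf 'refs/heads/%s:refs/remotes/%s/nomerge/%s\n' "$2" "$1" "$2"
}

# Intended usage (left as a comment; run it inside the repository):
#   git config remote.private.fetch "$(nomerge_refspec private master)"
```

Because only the named branch appears in the refspec, the remote's git-annex branch is not fetched at all.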

already implemented
diff --git a/doc/todo/git_annex_info___60__remote__62___does_not_list_all_the_parameters_for_the_remote.mdwn b/doc/todo/git_annex_info___60__remote__62___does_not_list_all_the_parameters_for_the_remote.mdwn
index d83d3b40b..647b0054f 100644
--- a/doc/todo/git_annex_info___60__remote__62___does_not_list_all_the_parameters_for_the_remote.mdwn
+++ b/doc/todo/git_annex_info___60__remote__62___does_not_list_all_the_parameters_for_the_remote.mdwn
@@ -27,3 +27,7 @@ d7e13bf3-0c0e-44c9-a626-c7af6a628df7 chunk=50MiB encryption=none externaltype=rc
 needed to see what is the prefix -- which is stored in remote.log -- but not printed by 'git annex info' neither in --verbose nor --json mode
 
 [[!meta author=yoh]]
+
+> This got fixed already it turns out, GETINFO.
+> Of course this and other special remotes will need changes to use it,
+> but that's outside the scope of git-annex, so [[done]]. --[[Joey]]

update
diff --git a/doc/thanks/list b/doc/thanks/list
index ff6056dad..0339544ff 100644
--- a/doc/thanks/list
+++ b/doc/thanks/list
@@ -101,3 +101,5 @@ Diederik de Haas,
 Nick Piper, 
 Ryan Newton, 
 Brett Eisenberg, 
+paul walmsley, 
+John Lee, 

initial expression of the desire
diff --git a/doc/todo/provide___39__file__39___in_--json-progress_record_for_addurl.mdwn b/doc/todo/provide___39__file__39___in_--json-progress_record_for_addurl.mdwn
new file mode 100644
index 000000000..a68af9564
--- /dev/null
+++ b/doc/todo/provide___39__file__39___in_--json-progress_record_for_addurl.mdwn
@@ -0,0 +1,13 @@
+Would it be sensibly easy to provide "file" field in progress json records for addurl?  I guess in any usecase (provided or deduced from url filename) it should be known at that stage.
+ATM it is just "null" and I guess (didn't try ATM) it would be impossible to associate particular progress reports with corresponding files in the `--batch -J` mode
+
+[[!format sh """
+$> git annex addurl --file bigone --json --json-progress https://s3.amazonaws.com/fcp-indi/data/Projects/ABIDE_Initiative/Outputs/freesurfer/5.1/UCLA_1_0051257/mri/T1.mgz                                              
+{"byte-progress":259645,"action":{"command":"addurl","file":null},"total-size":2459677,"percent-progress":"10.56%"}
+{"byte-progress":1304125,"action":{"command":"addurl","file":null},"total-size":2459677,"percent-progress":"53.02%"}
+{"command":"addurl","note":"to bigone","success":true,"key":"MD5E-s2459677--ad5bf54490212c7e9d88f15e16c4b0c1","file":"bigone"}
+"""]]
+
+Thanks in advance
+
+[[!meta author=yoh]]

Added a comment: no generic solution is possible in indirect mode BUT still would be nice to have a 99% solution
diff --git a/doc/bugs/assistant_doesn__39__t_sync_file_permissions/comment_7_3cc6eeb8eae14ac3727b1e420f96ee7d._comment b/doc/bugs/assistant_doesn__39__t_sync_file_permissions/comment_7_3cc6eeb8eae14ac3727b1e420f96ee7d._comment
new file mode 100644
index 000000000..8b62a7d93
--- /dev/null
+++ b/doc/bugs/assistant_doesn__39__t_sync_file_permissions/comment_7_3cc6eeb8eae14ac3727b1e420f96ee7d._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="no generic solution is possible in indirect mode BUT still would be nice to have a 99% solution"
+ date="2018-08-03T21:51:22Z"
+ content="""
+just ran into this as well, so was looking around. 
+I am afraid that in indirect mode no \"proper\" solution is possible since for the same content (git-annex key) there could originally be multiple files with different permissions -- e.g. one executable and one not. 
+**BUT** IMHO even though no proper solution is possible, it would indeed be very useful to have it resolved to work for 99% of cases, where such collisions aren't likely and a \"union\" of the executable bit across present files in the repo could be used (so if one is executable, all others with the same content are as well).  Since git annex by default inherits/propagates metadata changes across \"editions\" of the files it would already be handy even if e.g. executable shell scripts get modified, which is kind of a neat side effect ;-)
+"""]]

devblog
diff --git a/doc/devblog/day_506__summer_features.mdwn b/doc/devblog/day_506__summer_features.mdwn
new file mode 100644
index 000000000..52eea6a45
--- /dev/null
+++ b/doc/devblog/day_506__summer_features.mdwn
@@ -0,0 +1,38 @@
+After the big security fix push, I've had a bit of a vacation. Several new
+features have also landed in git-annex though.
+
+git-worktree support is a feature I'm fairly excited by.
+It turned out to be possible to make git-annex just work in working trees
+set up by `git worktree`, and they share the same object files. So,
+if you need several checkouts of a repository for whatever reason,
+this makes it really efficient to do. It's much better than the old
+method of using `git clone --shared`.
+
+A new `--accessedwithin` option matches files whose content was accessed
+within a given amount of time. (Using the atime.) Of course it can
+be combined with other options, for example
+`git annex move --to archive --not --accessedwithin=30d`  
+There are a few open requests for other new file matching options that I
+hope to get to soon.
+
+A small configuration addition of remote.name.annex-speculate-present
+to make git-annex try to get content from a remote even if its records
+don't indicate the remote contains the content allows setting up an interesting
+kind of [local cache of annexed files](https://git-annex.branchable.com/tips/local_caching_of_annexed_files/)
+which can even be shared between unrelated git-annex repositories, with
+inter-repository deduplication. 
+
+I suspect that remote.name.annex-speculate-present may also have other
+uses. It warps git-annex's behavior in a small but fundamental way which
+could let it fit into new places. Will be interesting to see.
+
+There's also an annex.commitmessage config, which I am much less excited by,
+but enough people have asked for it over the years.
+
+Also fixed a howler of a bug today: In -J mode, remotes were sorted not
+by cost, but by UUID! How did that not get noticed for 2 years?
+
+Much of this work was sponsored by the NSF-funded DataLad project at Dartmouth
+College, as has been the case for the past 4 years. All told they've
+funded over 1000 hours of work on git-annex. This is the last month of that
+funding.

response
diff --git a/doc/forum/Check_when_your_last_fsck_was__63__/comment_3_8fa3680b084fad45d7a184d1191fedce._comment b/doc/forum/Check_when_your_last_fsck_was__63__/comment_3_8fa3680b084fad45d7a184d1191fedce._comment
new file mode 100644
index 000000000..42faf8846
--- /dev/null
+++ b/doc/forum/Check_when_your_last_fsck_was__63__/comment_3_8fa3680b084fad45d7a184d1191fedce._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2018-08-03T18:28:07Z"
+ content="""
+That was implemented for a different use-case (`git-annex expire`) but
+you can certainly use it to keep track of the last time some fsck, perhaps
+a partial one, was run on a repository.
+"""]]

probably pebak
diff --git a/doc/bugs/git-annex_requires_an_SSH_remote_to_have_an_absolute_path/comment_1_f328d7248e7eac4d69bc543feb90da7c._comment b/doc/bugs/git-annex_requires_an_SSH_remote_to_have_an_absolute_path/comment_1_f328d7248e7eac4d69bc543feb90da7c._comment
new file mode 100644
index 000000000..73bd7de88
--- /dev/null
+++ b/doc/bugs/git-annex_requires_an_SSH_remote_to_have_an_absolute_path/comment_1_f328d7248e7eac4d69bc543feb90da7c._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2018-08-03T18:19:38Z"
+ content="""
+I use home-relative paths in ssh remotes with git-annex all the time.
+It works fine.
+
+Seems to me that this, as well as your other bug report about using
+[git-annex-shell in a very strange way](http://git-annex.branchable.com/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/), suggest that you are doing something
+strange that you need to go into detail about in order for this to be a
+useful bug report.
+"""]]

why?
diff --git a/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_1_b0a4f459d40bca7a0b24d1bec1e0f4a8._comment b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_1_b0a4f459d40bca7a0b24d1bec1e0f4a8._comment
new file mode 100644
index 000000000..de19505f5
--- /dev/null
+++ b/doc/bugs/git-annex-shell_-c_git-annex-shell_doesn__39__t_work__44___while_git-annex_expects_it_to/comment_1_b0a4f459d40bca7a0b24d1bec1e0f4a8._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2018-08-03T18:18:34Z"
+ content="""
+Why are you trying to do this? Is there a normal git-annex use case that
+involves that happening?
+"""]]

prevent fetch/merge/push to cache
diff --git a/doc/tips/local_caching_of_annexed_files.mdwn b/doc/tips/local_caching_of_annexed_files.mdwn
index 5c0809933..460100c0d 100644
--- a/doc/tips/local_caching_of_annexed_files.mdwn
+++ b/doc/tips/local_caching_of_annexed_files.mdwn
@@ -49,6 +49,7 @@ a remote, and configure it as follows:
 	git config remote.cache.annex-cost 10
 	git config remote.cache.annex-pull false
 	git config remote.cache.annex-push false
+	git config remote.cache.fetch do-not-fetch-from-this-remote:
 
 The annex-speculate-present setting is the essential part. It makes
 git-annex know that the cache repository may contain the content of any
@@ -59,11 +60,12 @@ The low annex-cost makes git-annex try to get content from the cache remote
 before any other remotes.
 
 The annex-pull and annex-push settings prevent `git-annex sync` from
-pulling and pushing to the remote. The cache repository will remain an
-empty git repository (except for the content of annexed files). This means
-that the same cache can be used with multiple different git-annex
-repositories, without intermingling their git data. You should also avoid
-manual `git pull` and `git push` to the cache remote.
+pulling and pushing to the remote, and the remote.cache.fetch setting
+further prevents git commands from fetching from it or pushing to it. The
+cache repository will remain an empty git repository (except for the
+content of annexed files). This means that the same cache can be used with
+multiple different git-annex repositories, without intermingling their git
+data.
 
 ## populating the cache
 
diff --git a/doc/tips/local_caching_of_annexed_files/comment_12_cbc8fbd8a98574683008ca363d3ac6b7._comment b/doc/tips/local_caching_of_annexed_files/comment_12_cbc8fbd8a98574683008ca363d3ac6b7._comment
new file mode 100644
index 000000000..c259bafe9
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_12_cbc8fbd8a98574683008ca363d3ac6b7._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 12"""
+ date="2018-08-03T18:09:26Z"
+ content="""
+Found a remote.cache.fetch that will prevent most accidents, though of
+course the determined footgun script may find a way.
+"""]]

response
diff --git a/doc/forum/git-annex__58___unknown_response_from_git_cat-file/comment_1_96ad30294a6f18c0f73be1136de47632._comment b/doc/forum/git-annex__58___unknown_response_from_git_cat-file/comment_1_96ad30294a6f18c0f73be1136de47632._comment
new file mode 100644
index 000000000..b9d423c0a
--- /dev/null
+++ b/doc/forum/git-annex__58___unknown_response_from_git_cat-file/comment_1_96ad30294a6f18c0f73be1136de47632._comment
@@ -0,0 +1,22 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2018-08-03T17:41:16Z"
+ content="""
+I was able to reproduce the error message when I made a git-annex symlink
+that has a newline at the end of its link target. 
+
+So, it seems you must have one in your tree. Find it with:
+
+	find -ls | grep SHA256E-s1287921--0970d35c130c8f678fe9cd7
+
+Deleting the symlink that finds and replacing it with a new symlink without
+the newline at the end of its link target should fix the problem.
+
+It would be interesting to know how this symlink came to be. If you make
+another clone of the repository, do you get the same symlink? Was the file
+originally added to the NTFS repository and the bad symlink somehow came
+from there?
+
+Also, what git-annex version was used to add the bad symlink?
+"""]]

Added a comment: could we just make it "avoidable"?
diff --git a/doc/tips/local_caching_of_annexed_files/comment_11_5bf79eae24dd6b69daa1cd10e0b4d296._comment b/doc/tips/local_caching_of_annexed_files/comment_11_5bf79eae24dd6b69daa1cd10e0b4d296._comment
new file mode 100644
index 000000000..f5ea28992
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_11_5bf79eae24dd6b69daa1cd10e0b4d296._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="could we just make it &quot;avoidable&quot;?"
+ date="2018-08-03T17:54:01Z"
+ content="""
+> There are just too many ways the user could bypass such protections. Including, for example, configuring git to fetch from cache to origin/ tracking branches.
+
+My concern is not really about making it impossible, but about making it unlikely or avoidable.   It is similar to how you cannot completely prevent someone from merging the `git-annex` branch \"manually\" using regular git-merge with some -Sours to \"avoid\" the conflicts. It is unavoidable but very unlikely ;)   ATM my problem is \"likely\" (as likely as me, the first user of the feature, running into this problem right away) and \"unavoidable\" (`annex merge` has no option/mode to avoid merging those).  As long as we could avoid it somehow (e.g. by providing some option to `annex merge`) in those situations, it would be great.  My concern is that we cannot avoid it at all.
+
+> make it dead and use `git-annex forget --drop-dead`
+
+yeap, we will add that information to some FAQ etc, very useful.  But it might be a bit too late if we share that blown up git-annex branch publicly and people merge it into their git-annex'es.  If someone is as advanced as configuring git with alternative fetch settings, they could indeed resort to this.
+"""]]

response
diff --git a/doc/forum/Is_there_a_way_to_override_the_import_destination__63__/comment_1_5f094cc606eebd2aeeaaa4b861cf2c21._comment b/doc/forum/Is_there_a_way_to_override_the_import_destination__63__/comment_1_5f094cc606eebd2aeeaaa4b861cf2c21._comment
new file mode 100644
index 000000000..27d17f99e
--- /dev/null
+++ b/doc/forum/Is_there_a_way_to_override_the_import_destination__63__/comment_1_5f094cc606eebd2aeeaaa4b861cf2c21._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2018-08-03T17:37:59Z"
+ content="""
+`git-annex import` imports files into the current working directory,
+so make whatever subdirectory structure you want, cd into the place
+and import to there.
+"""]]

close
diff --git a/doc/bugs/git-annex-export_treeish_subdir_path_does_not_exist__91____91__done__93____93__.mdwn b/doc/bugs/git-annex-export_treeish_subdir_path_does_not_exist__91____91__done__93____93__.mdwn
index 2450ddd73..83a418e59 100644
--- a/doc/bugs/git-annex-export_treeish_subdir_path_does_not_exist__91____91__done__93____93__.mdwn
+++ b/doc/bugs/git-annex-export_treeish_subdir_path_does_not_exist__91____91__done__93____93__.mdwn
@@ -50,3 +50,4 @@ operating system: linux x86_64
 ### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
 Yep, it has been a long time since I used it but I am back to see what I can do to manage my files properly :)
 
+> [[done]] --[[Joey]]
diff --git a/doc/bugs/git-annex-export_treeish_subdir_path_does_not_exist__91____91__done__93____93__/comment_2_d3808b85328fbbc6f4aef5086e225187._comment b/doc/bugs/git-annex-export_treeish_subdir_path_does_not_exist__91____91__done__93____93__/comment_2_d3808b85328fbbc6f4aef5086e225187._comment
new file mode 100644
index 000000000..68aea9ad7
--- /dev/null
+++ b/doc/bugs/git-annex-export_treeish_subdir_path_does_not_exist__91____91__done__93____93__/comment_2_d3808b85328fbbc6f4aef5086e225187._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2018-08-03T17:31:39Z"
+ content="""
+It wasn't upgrading git that fixed it. The bug you reported was present in
+git-annex versions before 6.20171109. You had an older version listed
+in the bug report, so I assume you also upgraded git-annex to the fixed
+version.
+"""]]

response
diff --git a/doc/tips/local_caching_of_annexed_files/comment_10_c93895a509ab4e458043450bccf930dc._comment b/doc/tips/local_caching_of_annexed_files/comment_10_c93895a509ab4e458043450bccf930dc._comment
new file mode 100644
index 000000000..68143ccc4
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_10_c93895a509ab4e458043450bccf930dc._comment
@@ -0,0 +1,23 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 10"""
+ date="2018-08-03T17:18:04Z"
+ content="""
+I fear that preventing merging of branches fetched from the cache remote
+in git-annex would be a game of whack-a-mole. There are just too many
+ways the user could bypass such protections. Including, for example,
+configuring git to fetch from cache to origin/ tracking branches.
+
+I remember at some point discussing isolating repos from one-another so
+that data from one repo can't leak across a boundary to another repo, while
+still having it be a remote, and it was similarly just not tractable. Can't
+seem to find the thread, but it's basically the same problem.
+
+If you do accidentally merge the git-annex branch from a cache remote,
+you can always make it dead and use git-annex forget --drop-dead.
+
+If you really want to avoid any possibility of git fetching from the caching
+remote, make it a directory special remote! But, there is not currently
+any way to make annex.hardlink work for directory special remotes, so it
+will be less efficient.
+"""]]

Fix wrong sorting of remotes when using -J
It was sorting by uuid, rather than cost!
Avoid future bugs of this kind by changing the Ord to primarily compare
by cost, with uuid only used when the cost is the same.
This commit was supported by the NSF-funded DataLad project.
diff --git a/Annex/Transfer.hs b/Annex/Transfer.hs
index a1cb14a4c..b20700541 100644
--- a/Annex/Transfer.hs
+++ b/Annex/Transfer.hs
@@ -279,4 +279,4 @@ pickRemote l a = go l =<< Annex.getState Annex.concurrency
 lessActiveFirst :: M.Map Remote Integer -> Remote -> Remote -> Ordering
 lessActiveFirst active a b
 	| Remote.cost a == Remote.cost b = comparing (`M.lookup` active) a b
-	| otherwise = compare a b
+	| otherwise = comparing Remote.cost a b
diff --git a/CHANGELOG b/CHANGELOG
index 3dcae535d..0fb62325f 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -8,6 +8,8 @@ git-annex (6.20180720) UNRELEASED; urgency=medium
   * Added --accessedwithin matching option.
   * Added annex.commitmessage config that can specify a commit message
     for the git-annex branch instead of the usual "update".
+  * Fix wrong sorting of remotes when using -J, it was sorting by uuid,
+    rather than cost.
 
  -- Joey Hess <id@joeyh.name>  Tue, 31 Jul 2018 12:14:11 -0400
 
diff --git a/Types/Remote.hs b/Types/Remote.hs
index 9f61f7041..9922b6569 100644
--- a/Types/Remote.hs
+++ b/Types/Remote.hs
@@ -146,8 +146,13 @@ instance Show (RemoteA a) where
 instance Eq (RemoteA a) where
 	x == y = uuid x == uuid y
 
+-- Order by cost since that is the important order of remotes
+-- when deciding which to use. But since remotes often have the same cost
+-- and Ord must be total, do a secondary ordering by uuid.
 instance Ord (RemoteA a) where
-	compare = comparing uuid
+	compare a b
+		| cost a == cost b = comparing uuid a b
+		| otherwise = comparing cost a b
 
 instance ToUUID (RemoteA a) where
 	toUUID = uuid
diff --git a/doc/tips/local_caching_of_annexed_files/comment_9_7db7787a306c70ba3c1687c1d103608d._comment b/doc/tips/local_caching_of_annexed_files/comment_9_7db7787a306c70ba3c1687c1d103608d._comment
new file mode 100644
index 000000000..d539ec97a
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_9_7db7787a306c70ba3c1687c1d103608d._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 9"""
+ date="2018-08-03T16:30:35Z"
+ content="""
+The -J2 web bug was not related to caching remotes at all
+but was an accidental sort by remote uuid rather than cost.
+I've fixed it.
+"""]]
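
The ordering this fix establishes (cost first, uuid only as a tie-breaker) can be illustrated outside Haskell with a small shell sketch. This is not how git-annex implements it; `sort_remotes_by_cost` is a hypothetical helper, and the two-column "uuid cost" input format is an assumption made for the example.

```shell
# Hypothetical helper: reads "uuid cost" pairs on stdin and orders them
# primarily by numeric cost, falling back to uuid only when costs are equal,
# mirroring the intent of the new Ord instance.
sort_remotes_by_cost() {
    LC_ALL=C sort -k2,2n -k1,1
}

# Example input: two remotes share cost 200, so uuid decides between them.
printf '%s\n' 'zzz-web 200' 'aaa-slow 300' 'mmm-cache 10' 'bbb-web2 200' |
    sort_remotes_by_cost
```

Sorting the same input on the first column alone would put `aaa-slow` first despite its cost of 300, which is the kind of misordering the bug caused.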

Added a comment: any update on this?
diff --git a/doc/bugs/Can__39__t_add_remotes_through_the_web_assistant/comment_4_52df3e78719c529987621b212588b785._comment b/doc/bugs/Can__39__t_add_remotes_through_the_web_assistant/comment_4_52df3e78719c529987621b212588b785._comment
new file mode 100644
index 000000000..aa63dc834
--- /dev/null
+++ b/doc/bugs/Can__39__t_add_remotes_through_the_web_assistant/comment_4_52df3e78719c529987621b212588b785._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="AlexP"
+ avatar="http://cdn.libravatar.org/avatar/a5756f1e491fa69cba8c2338ce459ed8"
+ subject="any update on this?"
+ date="2018-08-02T20:01:57Z"
+ content="""
+does anyone have a solution to this?  I can't seem to add remotes through the webapp.  I consistently get the following error:
+
+Internal Server Error
+there is no available git remote named \"blablabla\"
+
+thanks
+"""]]

Added a comment: re: annex merge cache
diff --git a/doc/tips/local_caching_of_annexed_files/comment_8_0e4571cc81ade6510b7a1ddfff9cb055._comment b/doc/tips/local_caching_of_annexed_files/comment_8_0e4571cc81ade6510b7a1ddfff9cb055._comment
new file mode 100644
index 000000000..cc846529c
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_8_0e4571cc81ade6510b7a1ddfff9cb055._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="re: annex merge cache"
+ date="2018-08-02T18:49:51Z"
+ content="""
+> Well, git-annex merge does not fetch, it only merges refs it sees.
+
+That is correct!  My alias to fetch all remotes (useful to quickly update on the current state of development in feature branches of others) fetched the cache as well.  Despite the viral nature of git tags I consider it to be a good general approach.  But fetching is not merging -- I can remove any of those remotes at any moment should some remote become too heavy or something like that (tags are trickier).
+
+IMHO `annex merge` should also not merge those remotes which are not \"pullable\" by default.  Maybe it could take remote name(s) as its argument(s) to merge only the specified ones (ATM arguments seem to be silently ignored), should someone really need to merge any of those.  That would prevent accidental blow-up of the git-annex branch in case a cache remote gets fetched.
+"""]]

response
diff --git a/doc/tips/local_caching_of_annexed_files/comment_7_ffdf07bac38a6c29b799281441fd32c9._comment b/doc/tips/local_caching_of_annexed_files/comment_7_ffdf07bac38a6c29b799281441fd32c9._comment
new file mode 100644
index 000000000..8e4dbbf76
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_7_ffdf07bac38a6c29b799281441fd32c9._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 7"""
+ date="2018-08-02T18:15:13Z"
+ content="""
+Well, git-annex merge does not fetch, it only merges refs it sees. With
+the configuration I gave in the tip, you will not have a cache/git-annex branch for it
+to merge.
+"""]]

added annex.commitmessage
Added annex.commitmessage config that can specify a commit message for the
git-annex branch instead of the usual "update".
This commit was supported by the NSF-funded DataLad project.
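A minimal sketch of setting the new config, assuming a scratch repository
created only for the illustration; the message text is a made-up example.
Going by the diff below, only the default "update" message is replaced;
merge commits keep their "merging ..." message.

```sh
# Scratch repository so the example does not touch real data.
repo=$(mktemp -d)
git init --quiet "$repo"
cd "$repo"

# Hypothetical message text; any string can be used.
git config annex.commitmessage "sync from build host"

# Show the value git-annex would read back from the git config.
git config --get annex.commitmessage
```

Unsetting it again with `git config --unset annex.commitmessage` should
bring back the usual "update" message, since the code falls back to that
default when the config is absent.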
diff --git a/Annex/Branch.hs b/Annex/Branch.hs
index 0aab4db60..a3945a18c 100644
--- a/Annex/Branch.hs
+++ b/Annex/Branch.hs
@@ -1,6 +1,6 @@
 {- management of the git-annex branch
  -
- - Copyright 2011-2017 Joey Hess <id@joeyh.name>
+ - Copyright 2011-2018 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU GPL version 3 or higher.
  -}
@@ -19,6 +19,7 @@ module Annex.Branch (
 	getHistorical,
 	change,
 	maybeChange,
+	commitMessage,
 	commit,
 	forceCommit,
 	getBranch,
@@ -51,7 +52,7 @@ import qualified Git.Tree
 import Git.LsTree (lsTreeParams)
 import qualified Git.HashObject
 import Annex.HashObject
-import Git.Types
+import Git.Types (Ref(..), fromRef, RefDate, TreeItemType(..))
 import Git.FilePath
 import Annex.CatFile
 import Annex.Perms
@@ -177,9 +178,9 @@ updateTo' pairs = do
 	go branchref dirty tomerge jl = withIndex $ do
 		let (refs, branches) = unzip tomerge
 		cleanjournal <- if dirty then stageJournal jl else return noop
-		let merge_desc = if null tomerge
-			then "update"
-			else "merging " ++
+		merge_desc <- if null tomerge
+			then commitMessage
+			else return $ "merging " ++
 				unwords (map Git.Ref.describe branches) ++ 
 				" into " ++ fromRef name
 		localtransitions <- parseTransitionsStrictly "local"
@@ -259,6 +260,11 @@ maybeChange file f = lockJournal $ \jl -> do
 set :: JournalLocked -> FilePath -> String -> Annex ()
 set = setJournalFile
 
+{- Commit message used when making a commit of whatever data has changed
+ - to the git-annex branch. -}
+commitMessage :: Annex String
+commitMessage = fromMaybe "update" . annexCommitMessage <$> Annex.getGitConfig
+
 {- Stages the journal, and commits staged changes to the branch. -}
 commit :: String -> Annex ()
 commit = whenM journalDirty . forceCommit
diff --git a/Annex/Content.hs b/Annex/Content.hs
index 2363793bc..8011a8230 100644
--- a/Annex/Content.hs
+++ b/Annex/Content.hs
@@ -975,7 +975,7 @@ saveState nocommit = doSideAction $ do
 	Annex.Queue.flush
 	unless nocommit $
 		whenM (annexAlwaysCommit <$> Annex.getGitConfig) $
-			Annex.Branch.commit "update"
+			Annex.Branch.commit =<< Annex.Branch.commitMessage
 
 {- Downloads content from any of a list of urls. -}
 downloadUrl :: Key -> MeterUpdate -> [Url.URLString] -> FilePath -> Annex Bool
diff --git a/Annex/MakeRepo.hs b/Annex/MakeRepo.hs
index 189e98c7d..f80e30359 100644
--- a/Annex/MakeRepo.hs
+++ b/Annex/MakeRepo.hs
@@ -82,7 +82,7 @@ initRepo' desc mgroup = unlessM isInitialized $ do
 	maybe noop (defaultStandardGroup u) mgroup
 	{- Ensure branch gets committed right away so it is
 	 - available for merging immediately. -}
-	Annex.Branch.commit "update"
+	Annex.Branch.commit =<< Annex.Branch.commitMessage
 
 {- Checks if a git repo exists at a location. -}
 probeRepoExists :: FilePath -> IO Bool
diff --git a/Assistant/Sync.hs b/Assistant/Sync.hs
index 508b86efa..6792c1303 100644
--- a/Assistant/Sync.hs
+++ b/Assistant/Sync.hs
@@ -124,7 +124,7 @@ pushToRemotes remotes = do
 pushToRemotes' :: UTCTime -> [Remote] -> Assistant [Remote]
 pushToRemotes' now remotes = do
 	(g, branch, u) <- liftAnnex $ do
-		Annex.Branch.commit "update"
+		Annex.Branch.commit =<< Annex.Branch.commitMessage
 		(,,)
 			<$> gitRepo
 			<*> join Command.Sync.getCurrBranch
diff --git a/Assistant/WebApp/Configurators/Fsck.hs b/Assistant/WebApp/Configurators/Fsck.hs
index c70e5269a..ee6ab1d91 100644
--- a/Assistant/WebApp/Configurators/Fsck.hs
+++ b/Assistant/WebApp/Configurators/Fsck.hs
@@ -138,7 +138,7 @@ postConfigFsckR = page "Consistency checks" (Just Configuration) $ do
 changeSchedule :: Handler () -> Handler Html
 changeSchedule a = do
 	a
-	liftAnnex $ Annex.Branch.commit "update"
+	liftAnnex $ Annex.Branch.commit =<< Annex.Branch.commitMessage
 	redirect ConfigFsckR
 
 getRemoveActivityR :: UUID -> ScheduledActivity -> Handler Html
diff --git a/CHANGELOG b/CHANGELOG
index 1fd935164..3dcae535d 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -6,6 +6,8 @@ git-annex (6.20180720) UNRELEASED; urgency=medium
   * Added remote.name.annex-speculate-present config that can be used to
     make cache remotes.
   * Added --accessedwithin matching option.
+  * Added annex.commitmessage config that can specify a commit message
+    for the git-annex branch instead of the usual "update".
 
  -- Joey Hess <id@joeyh.name>  Tue, 31 Jul 2018 12:14:11 -0400
 
diff --git a/Command/Commit.hs b/Command/Commit.hs
index 131169e68..7465f78d3 100644
--- a/Command/Commit.hs
+++ b/Command/Commit.hs
@@ -21,7 +21,7 @@ seek = withNothing start
 
 start :: CommandStart
 start = next $ next $ do
-	Annex.Branch.commit "update"
+	Annex.Branch.commit =<< Annex.Branch.commitMessage
 	_ <- runhook <=< inRepo $ Git.hookPath "annex-content"
 	return True
   where
diff --git a/Command/Merge.hs b/Command/Merge.hs
index 66b519973..1ed669aff 100644
--- a/Command/Merge.hs
+++ b/Command/Merge.hs
@@ -27,7 +27,7 @@ mergeBranch = do
 	next $ do
 		Annex.Branch.update
 		-- commit explicitly, in case no remote branches were merged
-		Annex.Branch.commit "update"
+		Annex.Branch.commit =<< Annex.Branch.commitMessage
 		next $ return True
 
 mergeSynced :: CommandStart
diff --git a/Command/Sync.hs b/Command/Sync.hs
index 4442ed499..52fa929ee 100644
--- a/Command/Sync.hs
+++ b/Command/Sync.hs
@@ -301,7 +301,7 @@ commit :: SyncOptions -> CommandStart
 commit o = stopUnless shouldcommit $ next $ next $ do
 	commitmessage <- maybe commitMsg return (messageOption o)
 	showStart' "commit" Nothing
-	Annex.Branch.commit "update"
+	Annex.Branch.commit =<< Annex.Branch.commitMessage
 	ifM isDirect
 		( do
 			void stageDirect
@@ -544,7 +544,7 @@ pushBranch remote branch g = directpush `after` annexpush `after` syncpush
 
 commitAnnex :: CommandStart
 commitAnnex = do
-	Annex.Branch.commit "update"
+	Annex.Branch.commit =<< Annex.Branch.commitMessage
 	stop
 
 mergeAnnex :: CommandStart
diff --git a/Logs/Web.hs b/Logs/Web.hs
index abea00db6..bfe971e8a 100644
--- a/Logs/Web.hs
+++ b/Logs/Web.hs
@@ -78,7 +78,7 @@ knownUrls = do
 	 - any journaled changes are reflected in it, since we're going
 	 - to query its index directly. -}
 	Annex.Branch.update
-	Annex.Branch.commit "update"
+	Annex.Branch.commit =<< Annex.Branch.commitMessage
 	Annex.Branch.withIndex $ do
 		top <- fromRepo Git.repoPath
 		(l, cleanup) <- inRepo $ Git.LsFiles.stagedDetails [top]
diff --git a/Remote/Git.hs b/Remote/Git.hs
index fd8c05b3c..979e8db44 100644
--- a/Remote/Git.hs
+++ b/Remote/Git.hs
@@ -728,7 +728,7 @@ commitOnCleanup repo r a = go `after` a
 	cleanup
 		| not $ Git.repoIsUrl repo = onLocalFast repo r $
 			doQuietSideAction $
-				Annex.Branch.commit "update"
+				Annex.Branch.commit =<< Annex.Branch.commitMessage
 		| otherwise = void $ do
 			Just (shellcmd, shellparams) <-
 				Ssh.git_annex_shell NoConsumeStdin
diff --git a/Types/GitConfig.hs b/Types/GitConfig.hs
index 4475abf58..31d6fbe41 100644
--- a/Types/GitConfig.hs
+++ b/Types/GitConfig.hs
@@ -62,6 +62,7 @@ data GitConfig = GitConfig
 	, annexBloomAccuracy :: Maybe Int
 	, annexSshCaching :: Maybe Bool

(Diff truncated)
thought
diff --git a/doc/todo/be_able_to_specify_custom_commit_message_for_git-annex_branch_commit/comment_4_caf913b53a54ac010dba253fca1ef76e._comment b/doc/todo/be_able_to_specify_custom_commit_message_for_git-annex_branch_commit/comment_4_caf913b53a54ac010dba253fca1ef76e._comment
new file mode 100644
index 000000000..fa3e71d14
--- /dev/null
+++ b/doc/todo/be_able_to_specify_custom_commit_message_for_git-annex_branch_commit/comment_4_caf913b53a54ac010dba253fca1ef76e._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2018-08-02T17:46:51Z"
+ content="""
+OTOH, maybe merging is never something one would want to add a custom
+message for -- it's entirely a function of the inputs -- and so only
+"update" should get replaced.
+"""]]

followup
diff --git a/doc/todo/be_able_to_specify_custom_commit_message_for_git-annex_branch_commit/comment_3_5241eace21e873678a0fbb353f4ece69._comment b/doc/todo/be_able_to_specify_custom_commit_message_for_git-annex_branch_commit/comment_3_5241eace21e873678a0fbb353f4ece69._comment
new file mode 100644
index 000000000..68432661f
--- /dev/null
+++ b/doc/todo/be_able_to_specify_custom_commit_message_for_git-annex_branch_commit/comment_3_5241eace21e873678a0fbb353f4ece69._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2018-08-02T17:38:23Z"
+ content="""
+There are a few other messages than "update" that show up in some rare
+occasions including (message ++ " (recovery from race #"..) and
+"new branch for transition".
+
+And, there's "merging remote/git-annex into git-annex"
+which is common enough.
+
+I kind of have the feeling that the rare ones are rare enough
+that we might want to always use them, and only override 
+merging and update.
+"""]]

Added a comment: re: annex merge cache
diff --git a/doc/tips/local_caching_of_annexed_files/comment_6_c0d79b07a4c83c9d081d7c8f702fa4dc._comment b/doc/tips/local_caching_of_annexed_files/comment_6_c0d79b07a4c83c9d081d7c8f702fa4dc._comment
new file mode 100644
index 000000000..7b79e27d8
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_6_c0d79b07a4c83c9d081d7c8f702fa4dc._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="re: annex merge cache"
+ date="2018-08-02T17:35:22Z"
+ content="""
+>  but you also have to avoid pulling from it yourself.
+
+I think we do call out to `annex merge` from time to time to update information about annex objects availability from any remote it might want to do so.  Since `sync` does more we avoid using it for those cases.  `git annex merge` doesn't even care about any argument given to it, so we cannot simply avoid calling it on `cache` remotes by specifying all other remotes.  Would it be possible to get some option `--only-pullable` or alike to make it prevent merging \"caches\"?
+"""]]

response
diff --git a/doc/todo/support_ssh__58____47____47___or_sftp__58____47____47___urls_via___34__built-in__34___ssh_support/comment_3_04be2f010aeb792e070f1ff93435fabc._comment b/doc/todo/support_ssh__58____47____47___or_sftp__58____47____47___urls_via___34__built-in__34___ssh_support/comment_3_04be2f010aeb792e070f1ff93435fabc._comment
new file mode 100644
index 000000000..372aa680f
--- /dev/null
+++ b/doc/todo/support_ssh__58____47____47___or_sftp__58____47____47___urls_via___34__built-in__34___ssh_support/comment_3_04be2f010aeb792e070f1ff93435fabc._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2018-08-02T17:21:13Z"
+ content="""
+Well, depends on coreutils and needs a stat(1) parser. Also, will sftp
+servers necessarily let arbitrary commands be run over ssh? Driving sftp
+interactive and parsing its ls -l would add more complexity to getting file
+size.
+
+The url security fixes also mean these uris can't be used without relaxing
+the security policy.. Makes me wonder how much special-casing makes sense
+for such an edge feature.
+"""]]

Added a comment: re: parallel and costs
diff --git a/doc/tips/local_caching_of_annexed_files/comment_5_dc1e5e3d256a2618f6c2a7864b01c08e._comment b/doc/tips/local_caching_of_annexed_files/comment_5_dc1e5e3d256a2618f6c2a7864b01c08e._comment
new file mode 100644
index 000000000..a49d91cc8
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_5_dc1e5e3d256a2618f6c2a7864b01c08e._comment
@@ -0,0 +1,51 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="re: parallel and costs"
+ date="2018-08-02T17:28:46Z"
+ content="""
+Sorry - I am still missing something.  
+I followed your example so the cost for cache is 10, whereas for web it is the default 200:
+[[!format sh \"\"\"
+$> git annex info cache web | grep -e remote: -e cost
+remote: cache
+cost: 10.0
+remote: web
+cost: 200.0
+\"\"\"]]
+but it does download from the web in parallel download case -- so what am I missing?
+[[!format sh \"\"\"
+~/datalad/openfmri/ds000001 > datalad get -J 1 sub-01/anat/sub-*_T1w.nii.gz 
+get(ok): /home/yoh/datalad/openfmri/ds000001/sub-01/anat/sub-01_T1w.nii.gz (file) [from cache...]               
+
+~/datalad/openfmri/ds000001 > git annex drop sub-01/anat/sub-*_T1w.nii.gz
+drop sub-01/anat/sub-01_T1w.nii.gz (checking http://openneuro.s3.amazonaws.com/ds000001/ds000001_R1.1.0/uncompressed/sub001/anatomy/highres001.nii.gz?versionId=8TJ17W9WInNkQPdiQ9vS7wo8ZJ9llF80...) ok
+(recording state in git...)
+
+~/datalad/openfmri/ds000001 > datalad get -J 2 sub-01/anat/sub-*_T1w.nii.gz  
+get(ok): /home/yoh/datalad/openfmri/ds000001/sub-01/anat/sub-01_T1w.nii.gz (file) [from web...] 
+\"\"\"]]
+nothing in --debug output hints on the costs:
+[[!format sh \"\"\"
+~/datalad/openfmri/ds000001 > git annex get -J 2 --debug sub-01/anat/sub-*_T1w.nii.gz
+[2018-08-02 13:28:03.896215705] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"ls-files\",\"--cached\",\"-z\",\"--\",\"sub-01/anat/sub-01_T1w.nii.gz\"]
+[2018-08-02 13:28:03.900141316] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"show-ref\",\"git-annex\"]
+[2018-08-02 13:28:03.904139213] process done ExitSuccess
+[2018-08-02 13:28:03.904230988] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"show-ref\",\"--hash\",\"refs/heads/git-annex\"]
+[2018-08-02 13:28:03.908376239] process done ExitSuccess
+[2018-08-02 13:28:03.908608977] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"log\",\"refs/heads/git-annex..ff8578c5e3bdd1c67b2d9ca8082893fe6425f729\",\"--pretty=%H\",\"-n1\"]
+[2018-08-02 13:28:03.913502761] process done ExitSuccess
+[2018-08-02 13:28:03.914221081] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"cat-file\",\"--batch\"]
+[2018-08-02 13:28:03.914683852] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"cat-file\",\"--batch-check=%(objectname) %(objecttype) %(objectsize)\"]
+[2018-08-02 13:28:03.920509994] read: git [\"config\",\"--null\",\"--list\"]
+[2018-08-02 13:28:03.925910945] process done ExitSuccess
+get sub-01/anat/sub-01_T1w.nii.gz 
+[2018-08-02 13:28:03.926689119] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"cat-file\",\"--batch\"]
+[2018-08-02 13:28:03.9274736] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"cat-file\",\"--batch-check=%(objectname) %(objectty(from web...) ze)\"]
+76%   4.12 MiB        859 KiB/s 1s
+73%   3.96 MiB        842 KiB/s 1s
+...
+\"\"\"]]
+
+
+"""]]

followup
diff --git a/doc/tips/local_caching_of_annexed_files/comment_4_42906e52528907a0960ab8b08c2eedd4._comment b/doc/tips/local_caching_of_annexed_files/comment_4_42906e52528907a0960ab8b08c2eedd4._comment
new file mode 100644
index 000000000..f19586dfc
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_4_42906e52528907a0960ab8b08c2eedd4._comment
@@ -0,0 +1,20 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2018-08-02T16:58:45Z"
+ content="""
+Parallel downloads will use the cache repository for everything if it has a
+lower cost than other repositories. That's why the cost is set to 10 in the
+example. If it has the same cost as another repository, parallel downloads
+will spread the load between them. (This also means you can have multiple
+caches with the same cost and distribute load among them..)
+
+You should never be pulling from the cache repo, so there should be nothing
+to merge from it. That's what the remote.cache.annex-pull is there to prevent
+git-annex sync doing, but you also have to avoid pulling from it yourself.
+
+Using tunables with the cache does seem to work. Since all remotes usually
+have the same tunables as the local repo, there could potentially be
+bugs (or optimisations?) where it applies the local tunables to the
+remote, but in a little testing it seemed to work.
+"""]]

Added a comment: could be taken as a feature! but also annex should avoid merging cache git-annex
diff --git a/doc/tips/local_caching_of_annexed_files/comment_3_2a80533f259cde64662db7e6a1c1742c._comment b/doc/tips/local_caching_of_annexed_files/comment_3_2a80533f259cde64662db7e6a1c1742c._comment
new file mode 100644
index 000000000..0a2240882
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_3_2a80533f259cde64662db7e6a1c1742c._comment
@@ -0,0 +1,28 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="could be taken as a feature! but also annex should avoid merging cache git-annex"
+ date="2018-08-02T14:30:51Z"
+ content="""
+I have two cache repos -- `cache` is just a regular one and `cache2` with those tuned up parameters:
+
+[[!format sh \"\"\"
+$> git annex merge       
+merge git-annex (merging cache/git-annex cache2/git-annex into git-annex...)
+git-annex: Remote repository is tuned in incompatible way; cannot be merged with local repository.
+
+\"\"\"]]
+
+and it didn't merge any of those which is good -- we do not want a possibly monstrous history of the cache to be merged into every repo using it
+
+But then when I remove that `cache2` git-annex does merge it:
+
+[[!format sh \"\"\"
+$> git remote rm cache2
+$> git annex merge     
+merge git-annex (merging cache/git-annex into git-annex...)
+(recording state in git...)
+ok
+\"\"\"]]
+which imho shouldn't happen -- annex shouldn't merge \"cache\" histories into this repository git-annex history.  I guess there should be one more config option to set for those remotes?
+"""]]

Added a comment: is it "safe" to tune?
diff --git a/doc/tips/local_caching_of_annexed_files/comment_2_d3ac760cfee0decfbcb4f81ddbc39a3f._comment b/doc/tips/local_caching_of_annexed_files/comment_2_d3ac760cfee0decfbcb4f81ddbc39a3f._comment
new file mode 100644
index 000000000..3c94fda3d
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_2_d3ac760cfee0decfbcb4f81ddbc39a3f._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="is it &quot;safe&quot; to tune?"
+ date="2018-08-02T13:53:04Z"
+ content="""
+Hi Joey,
+
+Would it be safe to init that repo with those `tunables` such as ` -c annex.tune.objecthash1=true -c annex.tune.branchhash1=true` to save some inodes etc?   Any other tunable which might be of benefit (I still hope that I will see the time whenever the \"KEY/\" directory would be gone ;-))?
+I've tried with those two above (although annex.tune.branchhash1=true is probably irrelevant here) and it seems to do the right thing (at least for the objecthash1), but I just wanted to make sure I am not shooting myself into the foot.
+"""]]

Added a comment: is not going from cache with parallel get e.g. -J 2
diff --git a/doc/tips/local_caching_of_annexed_files/comment_1_ad1ab9d78cc0c45fcda54bf9f03b4f8f._comment b/doc/tips/local_caching_of_annexed_files/comment_1_ad1ab9d78cc0c45fcda54bf9f03b4f8f._comment
new file mode 100644
index 000000000..cde6979a4
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files/comment_1_ad1ab9d78cc0c45fcda54bf9f03b4f8f._comment
@@ -0,0 +1,30 @@
+[[!comment format=mdwn
+ username="yarikoptic"
+ avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
+ subject="is not going from cache with parallel get e.g. -J 2 "
+ date="2018-08-02T13:33:08Z"
+ content="""
+Sweet! Thank you Joey
+
+The main issue so far detected is that if it is a parallel download (we have it as a default in datalad), it doesn't consider cache:
+
+[[!format sh \"\"\"
+$> git annex get -J1 sub-01/anat/sub-*_T1w.nii.gz      
+get sub-01/anat/sub-01_T1w.nii.gz (from cache...) ok
+(recording state in git...)
+
+$> datalad drop sub-01/anat/sub-*_T1w.nii.gz
+drop(ok): /home/yoh/datalad/openfmri/ds000001/sub-01/anat/sub-01_T1w.nii.gz (file)
+
+$> git annex get -J2 sub-01/anat/sub-*_T1w.nii.gz
+get sub-01/anat/sub-01_T1w.nii.gz (from web...) 
+22%   1.2 MiB         880 KiB/s 4s
+16%   891.34 KiB      895 KiB/s 5s
+
+\"\"\"]]
+
+I am still digesting whether having cache operations/state reflected in the git-annex branch is an OK or not-so-OK thing (when the number of files is large, etc.)
+
+[[!meta author=yoh]]
+
+"""]]

diff --git a/doc/forum/Find_unlocked__47__locked_files.mdwn b/doc/forum/Find_unlocked__47__locked_files.mdwn
index 9f5468569..922c8c560 100644
--- a/doc/forum/Find_unlocked__47__locked_files.mdwn
+++ b/doc/forum/Find_unlocked__47__locked_files.mdwn
@@ -1,5 +1,5 @@
 Hello, I would like to know if there is any way to specifically list the locked or unlocked annexed files in a git annex.
-I looked at the git-annex-find and gt-annex-matching-options pages and on Google but I didn't find anything.
+I looked at the [[git-annex-lock]], [[git-annex-find]], and [[git-annex-matching-options]] pages and even the discussion about the [[tips/unlocked_files]] but I didn't find anything.
 
 I know it wouldn't make any sens for the older versions, but in the v6 mode, I think it might be useful to add such a shortcut search.
 I mean, we can already look for local content with:

diff --git a/doc/forum/Find_unlocked__47__locked_files.mdwn b/doc/forum/Find_unlocked__47__locked_files.mdwn
new file mode 100644
index 000000000..9f5468569
--- /dev/null
+++ b/doc/forum/Find_unlocked__47__locked_files.mdwn
@@ -0,0 +1,23 @@
+Hello, I would like to know if there is any way to specifically list the locked or unlocked annexed files in a git annex.
+I looked at the git-annex-find and gt-annex-matching-options pages and on Google but I didn't find anything.
+
+I know it wouldn't make any sens for the older versions, but in the v6 mode, I think it might be useful to add such a shortcut search.
+I mean, we can already look for local content with:
+```git annex find --in=here```
+so why not create something like
+```git annex find --locked/unlocked=yes/no```
+?
+
+Sure it is already more or less doable by looking at symlinks:
+[[!format sh """
+#list all broken symlinks (locked absent files ?)
+find . -xtype l 
+#list all symlinks (locked present files ?)
+find -L . -xtype l
+#list all files that aren't symlinks (unlocked files ?)
+find . -type f
+"""]]
+But it is also possible for any symlink or file not to be part of the annex.
+So, in order to find the locked/unlocked files, it would require to intersect the previous sets of files with the set of annexed ones.
+
+Am I missing any easy tip or command argument to do this ?

hint about when requeststyle=path is needed
diff --git a/doc/special_remotes/S3.mdwn b/doc/special_remotes/S3.mdwn
index cca8e1ce7..f432e6a6b 100644
--- a/doc/special_remotes/S3.mdwn
+++ b/doc/special_remotes/S3.mdwn
@@ -62,6 +62,9 @@ the S3 remote.
 * `requeststyle` - Set to "path" to use path style requests, instead of the
   default DNS style requests. This is needed with some S3 services.
 
+  If you get an error about a host name not existing, it's a good
+  indication that you need to use this.
+
 * `bucket` - S3 requires that buckets have a globally unique name, 
   so by default, a bucket name is chosen based on the remote name
   and UUID. This can be specified to pick a bucket name.

Added --accessedwithin matching option.
Useful for dropping old objects from cache repositories.
But also, quite a generally useful thing to have.
Rather than imitating find's -atime and other options, all of which are
pretty horrible to use, I made this match files accessed within a time
period, using the same duration format used by git-annex schedule and
--time-limit.
In passing, changed the --time-limit option parser to parse the
duration, instead of having it later throw an error.
This commit was supported by the NSF-funded DataLad project.
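The check behind the option can be sketched in shell as a comparison of a
file's access time against the current time. This is only an illustration,
not the code git-annex runs: `accessed_within` is a hypothetical helper, and
`stat -c %X` and `touch -d` assume GNU coreutils.

```sh
# accessed_within FILE SECONDS
# Succeeds when FILE was last accessed no more than SECONDS ago.
# Hypothetical helper; relies on GNU stat to read the access time.
# A file that cannot be examined is treated as not matching.
accessed_within() {
	atime=$(stat -c %X "$1") || return 1
	now=$(date +%s)
	[ $((now - atime)) -le "$2" ]
}

# Demonstration on a scratch file whose access time is set two days back.
f=$(mktemp)
touch -a -d '2 days ago' "$f"
if accessed_within "$f" 86400; then
	echo "accessed within the past day"
else
	echo "not accessed within the past day"
fi
```

Here 86400 is one day in seconds; git-annex itself takes the interval in
the duration format, such as "1d" or "1h5m".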
diff --git a/CHANGELOG b/CHANGELOG
index dd1870c25..1fd935164 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -5,6 +5,7 @@ git-annex (6.20180720) UNRELEASED; urgency=medium
   * Fix reversion in display of http 404 errors.
   * Added remote.name.annex-speculate-present config that can be used to
     make cache remotes.
+  * Added --accessedwithin matching option.
 
  -- Joey Hess <id@joeyh.name>  Tue, 31 Jul 2018 12:14:11 -0400
 
diff --git a/CmdLine/GitAnnex/Options.hs b/CmdLine/GitAnnex/Options.hs
index 97cef88b9..791f499d1 100644
--- a/CmdLine/GitAnnex/Options.hs
+++ b/CmdLine/GitAnnex/Options.hs
@@ -38,6 +38,7 @@ import CmdLine.Usage
 import CmdLine.GlobalSetter
 import qualified Backend
 import qualified Types.Backend as Backend
+import Utility.HumanTime
 
 -- Global options that are accepted by all git-annex sub-commands,
 -- although not always used.
@@ -275,6 +276,12 @@ nonWorkTreeMatchingOptions' =
 		<> help "match files the repository wants to drop"
 		<> hidden
 		)
+	, globalSetter Limit.addAccessedWithin $ option (str >>= parseDuration)
+		( long "accessedwithin"
+		<> metavar paramTime
+		<> help "match files accessed within a time interval"
+		<> hidden
+		)
 	]
 
 -- Options to match files which may not yet be annexed.
@@ -371,7 +378,7 @@ jobsOption =
 
 timeLimitOption :: [GlobalOption]
 timeLimitOption = 
-	[ globalSetter Limit.addTimeLimit $ strOption
+	[ globalSetter Limit.addTimeLimit $ option (str >>= parseDuration)
 		( long "time-limit" <> short 'T' <> metavar paramTime
 		<> help "stop after the specified amount of time"
 		<> hidden
diff --git a/Limit.hs b/Limit.hs
index 5d00e2e68..93b32a89f 100644
--- a/Limit.hs
+++ b/Limit.hs
@@ -298,21 +298,32 @@ limitMetaData s = case parseMetaDataMatcher s of
 		. S.filter matching
 		. metaDataValues f <$> getCurrentMetaData k
 
-addTimeLimit :: String -> Annex ()
-addTimeLimit s = do
-	let seconds = maybe (giveup "bad time-limit") durationToPOSIXTime $
-		parseDuration s
+addTimeLimit :: Duration -> Annex ()
+addTimeLimit duration = do
 	start <- liftIO getPOSIXTime
-	let cutoff = start + seconds
+	let cutoff = start + durationToPOSIXTime duration
 	addLimit $ Right $ const $ const $ do
 		now <- liftIO getPOSIXTime
 		if now > cutoff
 			then do
-				warning $ "Time limit (" ++ s ++ ") reached!"
+				warning $ "Time limit (" ++ fromDuration duration ++ ") reached!"
 				shutdown True
 				liftIO $ exitWith $ ExitFailure 101
 			else return True
 
+addAccessedWithin :: Duration -> Annex ()
+addAccessedWithin duration = do
+	now <- liftIO getPOSIXTime
+	addLimit $ Right $ const $ checkKey $ check now
+  where
+	check now k = inAnnexCheck k $ \f ->
+		liftIO $ catchDefaultIO False $ do
+			s <- getFileStatus f
+			let accessed = realToFrac (accessTime s)
+			let delta = now - accessed
+			return $ delta <= secs
+	secs = fromIntegral (durationSeconds duration)
+
 lookupFileKey :: FileInfo -> Annex (Maybe Key)
 lookupFileKey = lookupFile . currFile
 
diff --git a/doc/git-annex-matching-options.mdwn b/doc/git-annex-matching-options.mdwn
index 2802fe60b..81f705f3a 100644
--- a/doc/git-annex-matching-options.mdwn
+++ b/doc/git-annex-matching-options.mdwn
@@ -145,6 +145,20 @@ in either of two repositories.
   
   Note that this will not match anything when using --all or --unused.
 
+* `--accessedwithin=interval`
+
+  Matches files that were accessed recently, within the specified time
+  interval.
+  
+  The interval can be in the form "5m" or "1h" or "2d" or "1y", or a
+  combination such as "1h5m".
+
+  So for example, `--accessedwithin=1d` matches files that have been
+  accessed within the past day.
+
+  If the OS or filesystem does not support access times, this will not
+  match any files.
+
 * `--not`
 
   Inverts the next matching option. For example, to only act on
diff --git a/doc/tips/local_caching_of_annexed_files.mdwn b/doc/tips/local_caching_of_annexed_files.mdwn
index b7ddb545b..5c0809933 100644
--- a/doc/tips/local_caching_of_annexed_files.mdwn
+++ b/doc/tips/local_caching_of_annexed_files.mdwn
@@ -21,10 +21,10 @@ You'll need git-annex 6.20180802 or newer to follow these instructions.
 ## creating the cache
 
 First let's create a new, empty git-annex repository. It will be put in
-~/.annex-cache in the example, but for best results, it in the same
+~/.annex-cache in the example, but for best results, put it in the same
 filesystem as your other git-annex repositories.
 
-	git init ~/.annex-cache
+	git init --bare ~/.annex-cache
 	cd ~/.annex-cache
 	git annex init
 	git config annex.hardlink true
@@ -79,11 +79,23 @@ enough start.
 
 ## cleaning the cache
 
-XXX find
+You can safely remove content from the cache at any time to free up disk
+space.
+
+To remove everything:
+
+	cd ~/.annex-cache
+	git annex drop --force
+
+To remove files that have not been requested from the cache for the past day:
+
+	cd ~/.annex-cache
+	git annex drop --force --not --accessedwithin=1d
 
 ## automatically populating the cache
 
-XXX
+The assistant can be used to automatically populate the cache with files
+that git-annex downloads into a repository.
 
 ## more caches
 

diff --git a/doc/forum/Combine_tags_in_view_branches.mdwn b/doc/forum/Combine_tags_in_view_branches.mdwn
index 5de355bd6..8dfb8208f 100644
--- a/doc/forum/Combine_tags_in_view_branches.mdwn
+++ b/doc/forum/Combine_tags_in_view_branches.mdwn
@@ -6,7 +6,6 @@ git annex view photos videos
 
 produces a directory tree like so:
 
-```
 + photos
   - a.jpg
   - b.jpg
@@ -15,11 +14,9 @@ produces a directory tree like so:
   - a.mp4
   - b.mp4
   - ...
-```
 
 Is there a way to achieve the following output?:
 
-```
 - a.jpg
 - b.jpg
 - a.mp4

diff --git a/doc/forum/Combine_tags_in_view_branches.mdwn b/doc/forum/Combine_tags_in_view_branches.mdwn
new file mode 100644
index 000000000..5de355bd6
--- /dev/null
+++ b/doc/forum/Combine_tags_in_view_branches.mdwn
@@ -0,0 +1,27 @@
+Is there a way to 'combine' tags in view branches? For example:
+
+```
+git annex view photos videos
+```
+
+produces a directory tree like so:
+
+```
++ photos
+  - a.jpg
+  - b.jpg
+  - ...
++ videos
+  - a.mp4
+  - b.mp4
+  - ...
+```
+
+Is there a way to achieve the following output?:
+
+```
+- a.jpg
+- b.jpg
+- a.mp4
+- b.mp4
+- ...

cache remotes via annex-speculate-present
Added remote.name.annex-speculate-present config that can be used to
make cache remotes.
Implemented it in Remote.keyPossibilities, which is used by the
get/move/copy/mirror commands, and nothing else. This way, things like
whereis will not show content that's speculatively present.
The assistant and sync --content were not using Remote.keyPossibilities,
and were changed to use it.
The efficiency hit should be small; Remote.keyPossibilities is only
used before transferring a file, which is the expensive operation.
And, it's only doing one lookup of the remoteList and a very cheap
filter over it.
Note that git-annex still updates the location log when copying content
to a remote with annex-speculate-present set. In this case, the location
tracking will indicate that content is present in the remote. This may
not be wanted for caches, or may not be a real problem for them. TBD.
This commit was supported by the NSF-funded DataLad project.
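
As a rough illustration of the configuration this commit introduces, the sketch below marks a remote as a speculative cache and reads the settings back with plain `git config`. The remote name `cache`, the scratch directory layout, and the cost value of 10 are assumptions taken from the tip added in this change; git-annex itself is not invoked here, so this only shows what the settings look like, not how `Remote.keyPossibilities` consumes them.

```shell
#!/bin/sh
# Sketch only: uses a throwaway repository so nothing real is modified.
# The remote name "cache" and the paths below are hypothetical.
set -e
tmp=$(mktemp -d)
git init --quiet "$tmp/repo"
cd "$tmp/repo"

# Add the cache as an ordinary git remote (the path need not exist yet).
git remote add cache "$tmp/annex-cache"

# Tell git-annex to speculate that this remote may hold any key,
# and give it a low cost so it is tried before other remotes.
git config remote.cache.annex-speculate-present true
git config remote.cache.annex-cost 10

# Read the values back in normalized form.
speculate=$(git config --bool remote.cache.annex-speculate-present)
cost=$(git config --int remote.cache.annex-cost)
echo "speculate-present=$speculate cost=$cost"
```

Since the setting defaults to false (`getbool "speculate-present" False` in the diff below), remotes that do not set it keep their normal location-log based behavior.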
diff --git a/Assistant/TransferQueue.hs b/Assistant/TransferQueue.hs
index f1df845f4..6a4473262 100644
--- a/Assistant/TransferQueue.hs
+++ b/Assistant/TransferQueue.hs
@@ -92,7 +92,7 @@ queueTransfersMatching matching reason schedule k f direction
 				filter (\r -> not (inset s r || Remote.readonly r))
 					(syncDataRemotes st)
 	  where
-		locs = S.fromList <$> Remote.keyLocations k
+		locs = S.fromList . map Remote.uuid <$> Remote.keyPossibilities k
 		inset s r = S.member (Remote.uuid r) s
 	gentransfer r = Transfer
 		{ transferDirection = direction
diff --git a/CHANGELOG b/CHANGELOG
index 54a244cc1..dd1870c25 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -3,6 +3,8 @@ git-annex (6.20180720) UNRELEASED; urgency=medium
   * S3: Support credential-less download from remotes configured
     with public=yes exporttree=yes.
   * Fix reversion in display of http 404 errors.
+  * Added remote.name.annex-speculate-present config that can be used to
+    make cache remotes.
 
  -- Joey Hess <id@joeyh.name>  Tue, 31 Jul 2018 12:14:11 -0400
 
diff --git a/Command/Sync.hs b/Command/Sync.hs
index 0fb3bdc3f..4442ed499 100644
--- a/Command/Sync.hs
+++ b/Command/Sync.hs
@@ -616,7 +616,7 @@ seekSyncContent o rs = do
  -}
 syncFile :: Either (Maybe (Bloom Key)) (Key -> Annex ()) -> [Remote] -> AssociatedFile -> Key -> Annex Bool
 syncFile ebloom rs af k = onlyActionOn' k $ do
-	locs <- Remote.keyLocations k
+	locs <- map Remote.uuid <$> Remote.keyPossibilities k
 	let (have, lack) = partition (\r -> Remote.uuid r `elem` locs) rs
 
 	got <- anyM id =<< handleget have
diff --git a/Remote.hs b/Remote.hs
index ff891962a..842c3bc60 100644
--- a/Remote.hs
+++ b/Remote.hs
@@ -1,6 +1,6 @@
 {- git-annex remotes
  -
- - Copyright 2011 Joey Hess <id@joeyh.name>
+ - Copyright 2011-2018 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU GPL version 3 or higher.
  -}
@@ -278,13 +278,21 @@ keyLocations key = trustExclude DeadTrusted =<< loggedLocations key
 
 {- Cost ordered lists of remotes that the location log indicates
  - may have a key.
+ -
+ - Also includes remotes with remoteAnnexSpeculatePresent set.
  -}
 keyPossibilities :: Key -> Annex [Remote]
 keyPossibilities key = do
 	u <- getUUID
 	-- uuids of all remotes that are recorded to have the key
 	locations <- filter (/= u) <$> keyLocations key
-	fst <$> remoteLocations locations []
+	speclocations <- map uuid
+		. filter (remoteAnnexSpeculatePresent . gitconfig)
+		<$> remoteList
+	-- there are unlikely to be many speclocations, so building a Set
+	-- is not worth the expense
+	let locations' = speclocations ++ filter (`notElem` speclocations) locations
+	fst <$> remoteLocations locations' []
 
 {- Given a list of locations of a key, and a list of all
  - trusted repositories, generates a cost-ordered list of
diff --git a/Types/GitConfig.hs b/Types/GitConfig.hs
index 26ad354c8..4475abf58 100644
--- a/Types/GitConfig.hs
+++ b/Types/GitConfig.hs
@@ -226,6 +226,7 @@ data RemoteGitConfig = RemoteGitConfig
 	, remoteAnnexStartCommand :: Maybe String
 	, remoteAnnexStopCommand :: Maybe String
 	, remoteAnnexAvailability :: Maybe Availability
+	, remoteAnnexSpeculatePresent :: Bool
 	, remoteAnnexBare :: Maybe Bool
 	, remoteAnnexRetry :: Maybe Integer
 	, remoteAnnexRetryDelay :: Maybe Seconds
@@ -281,6 +282,7 @@ extractRemoteGitConfig r remotename = do
 		, remoteAnnexStartCommand = notempty $ getmaybe "start-command"
 		, remoteAnnexStopCommand = notempty $ getmaybe "stop-command"
 		, remoteAnnexAvailability = getmayberead "availability"
+		, remoteAnnexSpeculatePresent = getbool "speculate-present" False
 		, remoteAnnexBare = getmaybebool "bare"
 		, remoteAnnexRetry = getmayberead "retry"
 		, remoteAnnexRetryDelay = Seconds
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index 163a628c1..3d2f92f32 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -1283,6 +1283,13 @@ Here are all the supported configuration settings.
   Can be used to tell git-annex whether a remote is LocallyAvailable
   or GloballyAvailable. Normally, git-annex determines this automatically.
 
+* `remote.<name>.annex-speculate-present`
+
+  Make git-annex speculate that this remote may contain the content of any
+  file, even though its normal location tracking does not indicate that it
+  does. This will cause git-annex to try to get all file contents from the
+  remote. Can be useful in setting up a caching remote.
+
 * `remote.<name>.annex-bare`
 
   Can be used to tell git-annex if a remote is a bare repository
diff --git a/doc/tips/local_caching_of_annexed_files.mdwn b/doc/tips/local_caching_of_annexed_files.mdwn
new file mode 100644
index 000000000..b7ddb545b
--- /dev/null
+++ b/doc/tips/local_caching_of_annexed_files.mdwn
@@ -0,0 +1,91 @@
+Here's how to set up a local cache of annexed files, that can be used
+to avoid repeated downloads.
+
+An example use case: Your CI system is operating on a git-annex repository,
+so every time it runs it makes a fresh clone of the repository and uses
+`git-annex get` to download a lot of data into it.
+
+We'll create a cache repository, set it as a remote of the other git-annex
+repositories, and configure git-annex to check the cache first before other
+more expensive ways of retrieving content. The cache can be cleaned out
+whenever you like with simple unix commands. 
+
+Some other nice properties -- When used on a system like BTRFS with COW
+support, content from the cache can populate multiple other repositories
+without using any additional disk space. And, git-annex repositories that
+are otherwise unrelated can share use of the cache if they happen to
+contain a common file.
+
+You'll need git-annex 6.20180802 or newer to follow these instructions.
+
+## creating the cache
+
+First let's create a new, empty git-annex repository. It will be put in
+~/.annex-cache in the example, but for best results, it in the same
+filesystem as your other git-annex repositories.
+
+	git init ~/.annex-cache
+	cd ~/.annex-cache
+	git annex init
+	git config annex.hardlink true
+	git annex untrust here
+
+The cache does not need to be a git-annex repository; any kind of special
+remote can be used as a cache too. But, using a git repository lets
+annex.hardlink be used to make hard links between the cache and
+repositories using it.
+
+The cache is made untrusted, because its contents can be cleaned at any
+time; other repositories should not trust it to retain content.
+
+## making repositories use the cache
+
+Now in each git-annex repository that you want to use the cache, add it as
+a remote, and configure it as follows:
+
+	cd my-repository
+	git remote add cache ~/.annex-cache
+	git config remote.cache.annex-speculate-present true
+	git config remote.cache.annex-cost 10
+	git config remote.cache.annex-pull false
+	git config remote.cache.annex-push false
+
+The annex-speculate-present setting is the essential part. It makes
+git-annex know that the cache repository may contain the content of any
+annexed file. So, when getting a file, git-annex will try the cache
+repository first.
+
+The low annex-cost makes git-annex try to get content from the cache remote
+before any other remotes.
+
+The annex-pull and annex-push settings prevent `git-annex sync` from
+pulling and pushing to the remote. The cache repository will remain an
+empty git repository (except for the content of annexed files). This means
+that the same cache can be used with multiple different git-annex
+repositories, without intermingling their git data. You should also avoid
+manual `git pull` and `git push` to the cache remote.
+
+## populating the cache
+
+For the cache to be used, you need to get file contents into it somehow.
+A simple way to do that is, in a git-annex repository that already
+contains the content of files:
+
+	git annex copy --to cache
+
+You could run that anytime after you get content. There are also ways to
+automate it, but getting some files into the cache manually is a good
+enough start.
+
+## cleaning the cache
+
+XXX find

(Diff truncated)