I have two sets of git annex repositories:
A: Contains a tree of some files and then a huge chunk of files
B: Already contains a ton of files
Each set has its own repos on different machines that are connected to one another.
I now want the huge chunk of content in A to be in B instead of A but the rest of the content should remain in A.
On the git side, I think what this boils down to is that I want to move the git sub-tree to the other repo, which ought to be possible by simply git adding the files there. I might want to get fancy and cherry-pick the commit from the other repo in a way that keeps it but moves the content to another subdirectory, but that's the simple part.
On the git-annex side, I want all the keys that were exclusively referenced in this set of files to be marked as dead and dropped from the remotes which contain them. I effectively want to remove those keys from git-annex's consideration.
The keys should be added to B. They should start out with a blank location tracking state as B should not know about A's repositories. They should however retain any other metadata they previously had in A.
How could I achieve this?

I do this quite often because I use a monorepo approach with regular refactoring of subtrees into their own submodules. I have yet to find a bulletproof way to do this on the git-annex side.
The first step is as simple as `git annex unannex` in `A`, or including `--include "*"` if pattern matching is easier.

- On the `git` side, this logs the files as deleted from the main repo (src, let's call her). This is ideal so that you have a record for yourself (with a descriptive commit message) of where you've moved your files to.
- On the `git-annex` side, (once you commit), the file data will eventually become "unused" - you'll have to do some combination of `git annex push` and `git annex sync [--cleanup]` to ensure all branches really don't reference those files (including remote branches and `synced/*` branches).

Now the question is: how do we get the data into the new repo (`dst`) and safely drop from `src`?

1. Add `dst` as a remote of `src` and pull only `dst`'s `git-annex` branch, which (after moving, re-annexing, and committing the unannexed files to `dst`) now shows as having a copy of those files. (Warning: this has bad side-effects.)
2. [...] `dst` to move any (used) files from `src`. (Warning: this has bad side-effects.)
3. Add `dst` as a remote and `move` unused files over. (Requires a clean unused stack already and having to do the push/sync stuff correctly and fully before the files can be released.)
4. [...] `src` first then move them over to `dst`. (Required because per `dst`'s knowledge, it has no record of `src` having any keys. I find it logical albeit sad that `git-annex` can't dynamically poll local repos' annexes for file content.)

Conclusions
- Keep a clean unused stack (`git annex unused` gives nothing) as much as you can, and clean it out before testing out any sort of move/drop operations like this.
- After `gx unannex` in `src`: add `src` as a remote in `dst`, `mv` files into `dst`, `gx add` files in `dst`, `gx copy` files from `dst` back to `src`, then do `gx move -f <src>` in `dst`. If it so happens that one of these files is actually duplicate data with something you want to also be in `src`, this will drop it and leave no record in `src` of where it went (besides your `git` commit message).

As described, there are still side effects with Option 4, but it's so far the best option I've devised. Oh, and if you want to keep `src` around as a remote on `dst` to e.g. remind yourself of various relations, make sure you configure it in `.git/config` with:

- `remote.<name>.annex-sync=false`. This skips it when you do a `git annex sync`.
- [...] `remote.<name>.fetch` spec, or add `remote.<name>.skipFetchAll=true`. This ensures `git fetch` doesn't fetch all the branches and unrelated objects.

Now, what happens if a side-effect does happen and it looks like you lost some content and don't know where it went?
`git annex whereis` is no help. Instead, you have to extract the key from the now broken symlink and run `find <> -type f -iname "<KEY>"`. Easy enough but kind of scary when it happens to you.

Side-Effects of Option 1+2: `git-annex` synchronization

DON'T DEAD OPEN INSIDE
While this is currently the only way to propagate annex key information, it has bad side-effects:
- [...] `git-annex` branch. For me this is a no-go because I have redundant remotes (an exporttree called `dropbox` in my case).
- If you `dead` these remotes or repos and by coincidence the `git-annex` branch is later absorbed in the other direction, chaos ensues (`dead` is propagated, remote annex key history is killed: especially gross for export/importtrees).
- [...] `dead`, `forget --drop-dead` then `semitrust UUID`. Many steps, potentially undefined condition. Gross.

Potential Feature Requests
Ideally, I would wish `git-annex` could intelligently scan another repo's annex and populate information about what keys it has simply by what keys are objectively in `.git/annex/objects`. This pulls in the information we care about without cluttering additional information relevant only to each respective repo. Then, presuming you've set up a remote (`dst`) pointing to this repo (`src`) and run `git annex info`, `src` should have a list of keys that are inside `dst`, `gx whereis` from `src` will identify the keys inside `dst`, and `drop` will happily do so.

- [...] `acquaintance` repo that is not allowed to be synced, pulled, fetched, pushed to.
- [...] `gx forget`, the list of keys is wiped.

Turns out, the answer is simple:
1. `git rm --cached "B"`
2. (in `B`): `git add`
3. `git remote add tmp.parent <relpath/from/B/root/to/A/root>`
4. `git annex get`
5. `git remote remove tmp.parent`

This works if you need just the files moved around.
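The steps above can be sketched as one shell function with a dry-run switch, so the commands can be previewed before anything is touched. This is only a sketch under assumptions: `run`, `split_subtree`, `DRY_RUN` and the three arguments are hypothetical names, `B` is assumed to be a plain subdirectory of `A` that is not yet a repository (hence the added `git init` / `git annex init` and the `-r`), and `--from tmp.parent` is added to name the source explicitly rather than relying on location tracking.

```shell
#!/bin/sh
set -eu

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# $1: root of A's worktree
# $2: subdirectory of A to split off (B)
# $3: relative path from B's root back to A's root (e.g. "..")
split_subtree() {
    a_root=$1
    sub=$2
    rel_to_a=$3
    (
        cd "$a_root"
        # Step 1: stop tracking B in A, leaving the files on disk.
        # -r is an assumption: B is taken to be a plain directory.
        run git rm -r --cached "$sub"
    )
    (
        cd "$a_root/$sub"
        # Assumption: B is not a repository yet, so initialise it.
        run git init
        run git annex init
        # Step 2: track the annexed files in B.
        run git add .
        # Step 3: temporarily point B at A.
        run git remote add tmp.parent "$rel_to_a"
        # Step 4: fetch the content from A (--from is an assumption).
        run git annex get --from tmp.parent .
        # Step 5: forget A again.
        run git remote remove tmp.parent
    )
}
```

Calling `DRY_RUN=1 split_subtree /path/to/A B ..` prints the seven commands without running them; dropping `DRY_RUN` executes them. Committing in `A` and `B` is left out because the steps above do not mention it.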
I haven't used metadata so I can't comment on how to move that around but you might have to rely on something akin to my first comment. In my brief testing, because metadata is stored in the `git-annex` branch on a per-key level, it does in fact require merging of the `git-annex` branch somehow to transfer.

In short: `git-annex` can get file content in both an informed and uninformed way. If `git-annex` knows about content in a repo because of historic moves/copies-to or merging of `git-annex` branches, it has informed knowledge of what's in certain remotes. If it does not, then it can still do an uninformed query for potential file content. In this way, e.g. `git annex info` and `git annex list` may show file content as not in a particular remote, but a `git annex get` or `git annex move` may actually still work.
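When neither the location log nor an uninformed `get` turns the content up, the fallback from my first comment is to search for the key by file name. A minimal sketch of that, assuming the lost file is a locked annex symlink whose target ends in the key; `key_of` and `find_key` are hypothetical helper names, and the directories to search are whatever you pass in.

```shell
#!/bin/sh
set -eu

# Print the annex key a (possibly broken) symlink points at.
# The key is the last path component of the link target.
key_of() {
    basename "$(readlink "$1")"
}

# Search one or more directory trees for a file named after the key
# of the given symlink. $1: symlink; remaining args: directories.
find_key() {
    key=$(key_of "$1")
    shift
    find "$@" -type f -iname "$key"
}
```

For example, `find_key lost-file.bin /path/to/other/repo/.git/annex/objects` lists any object files in that store whose name matches the key.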