<p>forum: Strategy for dealing with an old archive drive — git-annex</p>
<p>comment 1 by joey, 2021-04-21</p>
<p>2TB of data is no problem. git does start to slow down as the number of
files in a tree increases; at around 200,000 files it can start to become
noticeable. With that many files, updating .git/index needs to write
something like 50 MB of data to disk.</p>
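<p>One way that 50 MB figure could break down (the ~250 bytes per index entry used here is only a rough assumption — fixed stat metadata plus an average path — not an official git figure):</p>

```shell
# Rough estimate, assuming ~250 bytes per .git/index entry
# (stat metadata plus path length; an assumption, not a git spec value).
echo "$(( 200000 * 250 / 1000000 )) MB"
```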
<p>(git has some "split index" stuff that is supposed to help with this, but
I have not had the best experience with it.)</p>
<p>Committing the files to a branch other than master might be a reasonable
compromise. Then you can just copy the git-annex symlinks over to master as
needed, or check out the branch from time to time.</p>
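<p>A minimal sketch of that workflow using plain git (branch and file names are made up; in a real annexed repo <code>git annex add</code> would replace the plain <code>git add</code>, and the entries copied to master would be git-annex symlinks):</p>

```shell
set -e
# Throwaway repo standing in for the repo on the archive drive.
# (git init -b requires git >= 2.28.)
git init -b master demo && cd demo
git config user.email you@example.com
git config user.name "You"
git commit --allow-empty -m "init"

# Commit the archive's contents on their own branch, keeping
# master's index small.
git checkout -b archive
mkdir photos && echo data > photos/img001.raw
git add photos
git commit -m "archive drive contents"

# Back on master, copy over just the entries you need.
git checkout master
git checkout archive -- photos/img001.raw
git commit -m "bring img001.raw into master"
```

<p>The key step is <code>git checkout archive -- path</code>, which copies the named entries out of the archive branch and stages them on master without touching anything else.</p>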
<p>The main bottleneck with that approach is that the git-annex branch will
also contain one location log file per annexed file, and writing to
.git/annex/index will also slow down somewhat with so many files. But
git-annex has a lot of optimisations around batching writes to its index
that should keep the impact minimal.</p>
<p>comment 2 by pat, 2021-04-21</p>
<blockquote><p>Committing the files to a branch other than master might be a reasonable compromise. Then you can just copy the git-annex symlinks over to master as needed, or check out the branch from time to time.</p></blockquote>
<p>I think that could work nicely. I do like the idea of having my files annexed, and distributing them across machines that way, so this strikes me as a good compromise.</p>
<p>Thank you for the idea!</p>