forum/Large Windows annexes with *nix remotes git-annex http://git-annex.branchable.com/forum/Large_Windows_annexes_with___42__nix_remotes/ git-annex ikiwiki 2021-03-02T22:09:45Z
comment 1 http://git-annex.branchable.com/forum/Large_Windows_annexes_with___42__nix_remotes/comment_1_c59077a9d6500df0cd88ac5a18085085/ joey 2021-03-02T22:09:45Z 2021-01-19T15:58:37Z
<p>Direct mode was only supported by old versions of git-annex. It did work on
Windows. The replacement (adjusted unlocked branches + annex.thin) is better
in every way except one: on Windows (and on some filesystems like FAT), it
cannot avoid storing 2 copies of files, because git-annex isn't able
to hard link files there. If there were a reasonable way to do that on
Windows, that could be a big improvement, but I have not dug into whether
Windows has anything similar enough to a hard link for git-annex to use it.</p>
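<p>Since the disk space saving depends on hard links, it can help to check whether a given drive accepts them at all. The following is only a rough sketch in Python: the function name <code>supports_hard_links</code> is made up for illustration, and it reports what the operating system allows from Python, which says nothing about what git-annex itself will do on that platform.</p>

```python
import os
import tempfile


def supports_hard_links(directory):
    """Return True if a hard link can be created inside `directory`.

    Hypothetical helper for illustration; not part of git-annex.
    """
    # Work in a throwaway subdirectory so nothing is left behind.
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        src = os.path.join(tmp, "probe")
        dst = os.path.join(tmp, "probe.link")
        with open(src, "w") as f:
            f.write("x")
        try:
            os.link(src, dst)  # fails on filesystems without hard links, e.g. FAT
        except (OSError, NotImplementedError, AttributeError):
            return False
        # Both names should now refer to the same underlying file.
        return os.path.samefile(src, dst)


if __name__ == "__main__":
    print(supports_hard_links(os.getcwd()))
```

<p>Run it from the drive that would hold the repository. A result of <code>False</code> suggests that two copies of each unlocked file will be kept there.</p>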
<p>git-annex scales to hundreds of thousands of files.</p>
<p>If I had two directories like your V# and V#-related, I might make them
each into their own repository, and set them each as a git remote of the
other. That would let git-annex know that identical files have two copies.
(Or, the parent directory could be made into a git-annex repository, which
would let git-annex deduplicate identical files, but since that needs
symlink support, it won't happen on Windows.)</p>
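<p>To get a feel for how many files are actually identical between two such directories before deciding on a layout, something like the following could be used. It is a sketch only: it hashes file contents with SHA-256 and compares the two trees, which illustrates content-based matching in general and is not how git-annex is implemented. The function names are invented for the example.</p>

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_tree(root):
    """Map content hash -> list of paths (relative to root) with that content."""
    index = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames:
            dirnames.remove(".git")  # do not descend into repository internals
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                index[sha256_of(path)].append(os.path.relpath(path, root))
    return index


def shared_content(root_a, root_b):
    """Return {hash: (paths in root_a, paths in root_b)} for identical files."""
    a, b = index_tree(root_a), index_tree(root_b)
    return {h: (sorted(a[h]), sorted(b[h])) for h in a.keys() & b.keys()}
```

<p>Calling <code>shared_content</code> on the two directories lists the files whose content appears in both. With around a million files this will take a while, since every file is read once.</p>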
<p>You can certainly use GIT_DIR with git-annex.</p>
<p>Generally the best thing to do with storage on the other side of a network
connection is to run git-annex on it locally, or use it as some kind of
special remote.</p>
comment 2 http://git-annex.branchable.com/forum/Large_Windows_annexes_with___42__nix_remotes/comment_2_bfc92cb2117b26eec95687950827c78b/ joey 2021-03-02T22:09:45Z 2021-01-19T16:11:42Z
As far as support goes, see <a href="http://git-annex.branchable.com/thanks/">thanks</a>.
comment 1 http://git-annex.branchable.com/forum/Large_Windows_annexes_with___42__nix_remotes/comment_1_a38e0ecef49a81a1d2b3d2eaabab65ab/ Lukey 2021-03-02T22:09:45Z 2021-01-19T16:45:30Z
<p>Direct mode doesn't exist anymore; it has been replaced by the <a href="https://git-annex.branchable.com/tips/unlocked_files">annex.thin config</a> (see "using less disk space"). And yes, it works on NTFS.</p>
<p>It definitely isn't going to be fast: the numbers you gave suggest that there will be ~1000000 files per repository (for the V*-related dirs). Still, you should try it and see if it's fast enough for you. Some tips to improve performance: don't use <code>include=</code>/<code>exclude=</code> in preferred content expressions, and see <a href="https://git-annex.branchable.com/tips/Repositories_with_large_number_of_files/">Repositories with large number of files</a>. My experimental script <a href="https://git-annex.branchable.com/todo/Incremental_git_annex_sync_--content_--all/">here</a> might also be worth a try.</p>
<p>Having the root of the repo at the root of the drive and then excluding everything that shouldn't be in the repo via <code>.gitignore</code> can be a viable approach. But with that many files I'd create one repo per directory. It could also be done with git worktrees.</p>
<p>Don't use git-annex on top of a network share; in that case, run it directly on the server. git-annex is designed to run on local drives/storage. Also, git-annex on Windows is much slower than on Linux.</p>
<p>You can donate via <a href="https://www.patreon.com/joeyh">Patreon</a> and <a href="https://liberapay.com/joeyh/">Liberapay</a>.</p>