Hi,
I'm thinking of using git-annex to synchronize my virtual machine directory (VirtualBox) between 3 PCs. It's quite big: more than 200 GB, and some of the images are 40 GB in size.
The synchronization will be over a LAN (obviously). It is already in place between 2 PCs with unison, but the configuration of the 3rd PC is cumbersome. Does anybody have experience with git-annex and this amount of data?
Thanks in advance
Gabriele
This volume of data should be no problem for git-annex.
The only catch would be if you're running those VM images and want to sync them as they're changed. With git-annex, you'd need to
git annex unlock
a file to allow it to be modified, and then
git annex add
it back and commit the changes made to it.

Hi, due to my requirements I need to revert the VM image every time before running it, via "git reset --hard", which is really fast. On the other hand, "git annex unlock" takes really long. I run git-annex on CentOS 6, version git-annex-3.20120522-2.1.el6.x86_64. If I update git-annex, can that speed up "unlock"?
Thanks a lot
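The revert-then-run cycle described above can be sketched like this; "vm.vdi" is a hypothetical annexed image name, not one from the thread:

```shell
# Sketch of the revert-then-run cycle. "vm.vdi" stands in for
# one of the annexed VM images.

git reset --hard           # discard changes, restore the pristine image
git annex unlock vm.vdi    # make the image writable before starting the VM
# ... run the VM, which modifies vm.vdi ...
```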
The reason
git annex unlock
is slow is that it makes a copy of the entire file. The file is left as-is in the annex so the old version is available later, and the unlocked copy is made available for modification.

More recent versions of git-annex support v6 mode, which has an annex.thin configuration that makes
git annex unlock
not do this copy, so it's very fast. But then no copy of the old version of the file is kept, so you won't be able to revert to it. Which seems to be an important part of your workflow.

Another way to make
git annex unlock
fast is to use a file system that supports Copy On Write (CoW). git-annex will use CoW automatically when available, and then unlocking doesn't need to actually copy the file, but the old version will still be preserved. Btrfs is the only filesystem I know of that supports CoW, although there may be others.
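A minimal sketch of both approaches, assuming a git-annex version with v6 repository support; the repository path and file name are placeholders:

```shell
# Option 1: v6 mode with annex.thin (fast unlock, old version NOT kept).
cd /path/to/repo               # hypothetical repository path
git annex upgrade              # upgrade the repository to v6
git config annex.thin true     # unlock by hard-linking instead of copying
git annex unlock vm.vdi        # now fast, but loses the ability to revert

# Option 2: a CoW filesystem such as Btrfs (fast unlock, old version kept).
# The same effect can be seen with a reflink copy, which is instant and
# shares blocks with the original until one side is modified:
cp --reflink=always vm.vdi vm-copy.vdi
```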