I have a bunch of directory trees with large data files scattered over various computers and disk drives - they contain photos, videos, music, and so on. In many cases I initially copied one of these trees from one machine to another just as a cheap and dirty backup, and then made small modifications to both trees in ways I no longer remember. For example, I returned from a trip with a bunch of new photos, and then might have rotated some of them 90 degrees on one machine, and edited or renamed them on another.
What I want to do now is use git-annex as a way of initially synchronising the trees, and then fully managing them on an ongoing basis. Note that the trees are not yet git repositories. In order to be able to detect straightforward file renames, I believe that the SHA1 backend probably makes the most sense.
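To illustrate why I think a content-hash backend such as SHA1 helps with renames, here is a minimal sketch. It uses plain `sha1sum` as a stand-in for the backend's hashing rather than git-annex itself, and the file names and the `content_hash` helper are made up for the example.

```shell
#!/bin/sh
# Sketch only: sha1sum stands in for a content-hash backend.
# content_hash and all file names below are hypothetical.
content_hash() {
    # Print only the hex digest of the file given as $1.
    sha1sum "$1" | cut -d' ' -f1
}

demo=$(mktemp -d)
printf 'same bytes' > "$demo/IMG_0001.avi"
# Simulate a rename made on the other machine: same content, new name.
cp "$demo/IMG_0001.avi" "$demo/holiday-beach.avi"
# Simulate an edit: the content itself changes.
printf 'different bytes' > "$demo/edited.avi"

h_orig=$(content_hash "$demo/IMG_0001.avi")
h_renamed=$(content_hash "$demo/holiday-beach.avi")
h_edited=$(content_hash "$demo/edited.avi")

[ "$h_orig" = "$h_renamed" ] && echo "renamed copy: same hash"
[ "$h_orig" != "$h_edited" ] && echo "edited copy: different hash"
```

If that reasoning holds, a file that was only renamed in one tree should map to the same key in both trees, whereas a rotated or edited photo would get a new key.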
I've been playing around and arrived at the following setup procedure. For the sake of discussion, I assume that we have two trees, 'a' and 'b', which live in the same directory referred to by $td, and that all large files end with the .avi suffix:
    # Setup git in 'a'.
    cd $td/a
    git init

    # Setup git-annex in 'a'.
    echo '* annex.backend=SHA1' > .gitattributes
    git add .gitattributes
    git commit -m'use SHA1 backend'
    git annex init

    # Annex all large files.
    find -name \*.avi | xargs git annex add
    git add .
    git commit -m'Initial import'

    # Setup git in 'b'.
    cd $td/b
    git clone -n $td/a new
    mv new/.git .
    rmdir new
    git reset # reset git index to b's wd - hangover from cloning from 'a'

    # Setup git-annex in 'b'.
    # This merges a's (origin's) git-annex branch into the local git-annex branch.
    git annex init

    # Annex all large files - because we're using SHA1 backend, some
    # should hash to the same keys as in 'a'.
    find -name \*.avi | xargs git annex add
    git add .
    git commit -m'Changes in b tree'
    git remote add a $td/a

    # Now pull changes in 'b' back to 'a'.
    cd $td/a
    git remote add b $td/b
    git pull b master
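One thing I'm unsure about in the `find | xargs` lines: plain `xargs` splits its input on whitespace, so file names containing spaces would probably be passed on in pieces. A possible variant, assuming GNU find and xargs (for `-print0`, `-0` and `-r`), is sketched below. The helper name `annex_large_files` is my own invention, not anything provided by git-annex.

```shell
#!/bin/sh
# Hypothetical helper: run a command on every *.avi file under a directory.
# Names are passed NUL-delimited so spaces and newlines in them survive.
# Assumes GNU find/xargs: -print0, -0, and -r (skip the command if no files).
annex_large_files() {
    dir=$1
    shift
    find "$dir" -type f -name '*.avi' -print0 | xargs -0 -r "$@"
}

# Intended use inside a tree (not run here):
#   annex_large_files . git annex add
```

Taking the command as arguments means the same helper can be dry-run with something harmless, for instance `annex_large_files . printf '%s\n'`, to see which files would be annexed.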
This seems to work, but have I missed anything?