forum/migrate existing git repository to git-annex git-annex http://git-annex.branchable.com/forum/migrate_existing_git_repository_to_git-annex/ git-annex ikiwiki 2014-01-16T17:47:45Z
comment 1 http://git-annex.branchable.com/forum/migrate_existing_git_repository_to_git-annex/comment_1_4181bf34c71e2e8845e6e5fb55d53381/ joey 2013-11-27T22:47:37Z 2011-02-25T05:16:48Z
<p>I don't know how to approach this yet, but I support the idea -- it would be great if there was a tool that could punch files out of git history and put them in the annex. (Of course with typical git history rewriting caveats.)</p>
<p>Sounds like it might be enough to add a switch to git-annex that overrides where it considers the top of the git repository to be?</p>
comment 2 http://git-annex.branchable.com/forum/migrate_existing_git_repository_to_git-annex/comment_2_5f08da5e21c0b3b5a8d1e4408c0d6405/ tyger 2013-11-27T22:47:37Z 2011-03-01T14:07:50Z
<p>My current workflow looks like this (I'm still experimenting):</p>
<h3>Create backup clone for migration</h3>
<pre><code>git clone original migrate
cd migrate
for branch in $(git branch -a | grep remotes/origin | grep -v HEAD); do git checkout --track $branch; done
</code></pre>
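<p>A hedged variant of the loop above, sketched as a function: it reads branch names from git for-each-ref instead of parsing the output of git branch -a, and it skips branches that already exist locally. The function name track_all_remote_branches and the default remote name origin are assumptions for illustration. Unlike the original loop, it only creates the tracking branches and does not check them out.</p>
<pre><code># Hypothetical helper: create a local tracking branch for every branch on a remote.
track_all_remote_branches() {
    remote=${1:-origin}
    git for-each-ref --format='%(refname)' "refs/remotes/$remote" |
    while read -r ref; do
        # Strip the refs/remotes/REMOTE/ prefix to get the plain branch name.
        branch=${ref#refs/remotes/"$remote"/}
        # Skip the symbolic REMOTE/HEAD entry.
        if [ "$branch" = HEAD ]; then
            continue
        fi
        # Skip branches that already exist locally (e.g. the one created by clone).
        if git show-ref --verify --quiet "refs/heads/$branch"; then
            continue
        fi
        git branch --track "$branch" "$remote/$branch"
    done
}
</code></pre>
<p>Inside the migrate clone it would be called as track_all_remote_branches origin.</p>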
<h3>Inject git annex initialization at repository base</h3>
<pre><code>git symbolic-ref HEAD refs/heads/newroot
git rm --cached *.rpm
git clean -f -d
git annex init master
git cherry-pick $(git rev-list --reverse master | head -1)
git rebase --onto newroot newroot master
git rebase master mybranch # how to automate this for all branches?
git branch -d newroot
</code></pre>
<h3>Start migration with tree filter</h3>
<pre><code>echo \*.rpm annex.backend=SHA1 > .git/info/attributes
MYWORKDIR=$(pwd) git filter-branch --tree-filter ' \
if [ ! -d .git-annex ]; then \
mkdir .git-annex; \
cp ${MYWORKDIR}/.git-annex/uuid.log .git-annex/; \
cp ${MYWORKDIR}/.gitattributes .; \
fi
for rpm in $(git ls-files | grep "\.rpm$"); do \
echo; \
git annex add $rpm; \
annexdest=$(readlink $rpm); \
if [ -e .git-annex/$(basename $annexdest).log ]; then \
echo "FOUND $(basename $annexdest).log"; \
else \
echo "COPY $(basename $annexdest).log"; \
cp ${MYWORKDIR}/.git-annex/$(basename $annexdest).log .git-annex/; \
fi; \
ln -sf ${annexdest#../../} $rpm; \
done; \
git reset HEAD .git-rewrite; \
: \
' -- $(git branch | cut -c 3-)
rm -rf .temp
git reset --hard
</code></pre>
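<p>The ln -sf ${annexdest#../../} $rpm step relies on shell prefix stripping. The filter runs inside .git-rewrite/t, two directories below the repository top, so the link target presumably carries two extra ../ levels that have to be dropped. A minimal sketch of just that expansion, using made-up target paths (the real layout depends on the git-annex version):</p>
<pre><code># Hypothetical link targets, as readlink might report them inside .git-rewrite/t.
annexdest='../../.git/annex/objects/SHA1:0123abcd/example.rpm'
nested='../../../.git/annex/objects/SHA1:0123abcd/example.rpm'

# ${var#pattern} removes the shortest matching prefix: here, the two extra levels.
fixed=${annexdest#../../}
fixed_nested=${nested#../../}

echo "$fixed"         # prints .git/annex/objects/SHA1:0123abcd/example.rpm
echo "$fixed_nested"  # prints ../.git/annex/objects/SHA1:0123abcd/example.rpm
</code></pre>
<p>The second case shows a file one directory deep: one ../ survives, which is what a link from that subdirectory needs once the tree sits at the repository top.</p>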
<p>There are still some drawbacks:</p>
<ul>
<li>git history shows that git annex log files are modified with each checkin</li>
<li>branches have to be rebased manually before starting migration</li>
</ul>
comment 3 http://git-annex.branchable.com/forum/migrate_existing_git_repository_to_git-annex/comment_3_f483038c006cf7dcccf1014fa771744f/ tyger 2013-11-27T22:47:37Z 2011-03-02T08:15:37Z
<blockquote><p>Sounds like it might be enough to add a switch to git-annex that overrides where it considers the top of the git repository to be?</p></blockquote>
<p>It should be sufficient to honor the GIT_DIR/GIT_WORK_TREE/GIT_INDEX_FILE environment variables. git filter-branch sets GIT_WORK_TREE to ., but this can be mitigated by starting the filter script with 'GIT_WORK_TREE=$(pwd $GIT_WORK_TREE)'. For example, with GIT_DIR=/home/tyger/repo/.git and GIT_WORK_TREE=/home/tyger/repo/.git-rewrite/t, git annex should be able to compute the correct relative path, or perhaps use absolute paths in symlinks.</p>
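<p>A minimal sketch of that mitigation, assuming the work tree directory exists: a filter script could start by turning the relative GIT_WORK_TREE into an absolute path. The function name abs_work_tree is made up for illustration.</p>
<pre><code># Hypothetical helper: print the absolute path of the work tree.
# Falls back to the current directory when GIT_WORK_TREE is unset or empty.
abs_work_tree() {
    # The subshell keeps the cd from affecting the caller.
    ( CDPATH= cd -- "${GIT_WORK_TREE:-.}" >/dev/null || exit 1; pwd )
}

# Possible use at the top of a filter script:
#   GIT_WORK_TREE=$(abs_work_tree); export GIT_WORK_TREE
</code></pre>
<p>This only makes the variable absolute; whether git annex then computes the right relative symlink still depends on git annex honoring the variable.</p>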
<p>Another problem I observed is that git annex add automatically commits the symlink; this behaviour doesn't work well with filter-branch's tree filter. git annex commits the wrong path (.git-rewrite/t/LINK instead of LINK). Also, the tree filter doesn't expect the filter script to commit anything; new files in the temporary work tree are committed by filter-branch on each iteration of the filter script (and missing files are removed).</p>
Rebase all branches http://git-annex.branchable.com/forum/migrate_existing_git_repository_to_git-annex/comment_4_057f0079fbee3451ccda08026bab21d4/ Laura 2014-01-16T17:47:45Z 2014-01-16T17:47:45Z
<div class="highlight-sh"><pre class="hl">For the portion<span class="hl opt">:</span> git rebase master mybranch <span class="hl slc"># how to automate this for all branches?</span>
Try this<span class="hl opt">:</span>
branch_to_ignore<span class="hl opt">=</span><span class="hl str">'git-annex|master|newroot'</span>
<span class="hl kwa">for</span> branch <span class="hl kwa">in</span> $<span class="hl opt">(</span>git for-each-ref <span class="hl kwb">--sort</span><span class="hl opt">=</span><span class="hl kwb">-committerdate</span> refs<span class="hl opt">/</span>heads <span class="hl kwb">--format</span><span class="hl opt">=</span><span class="hl str">'%(refname:short)'</span> | <span class="hl kwc">egrep</span> <span class="hl kwb">-v</span> <span class="hl kwd">$branch_to_ignore</span> <span class="hl opt">)</span>
<span class="hl kwa">do</span> <span class="hl kwb">echo</span> <span class="hl str">"Rebasing branch</span> <span class="hl ipl">$branch</span> <span class="hl str">onto master...."</span>
git rebase <span class="hl kwb">--onto</span> master <span class="hl str">"</span><span class="hl ipl">$branch</span><span class="hl str">~"</span> <span class="hl str">"</span><span class="hl ipl">$branch</span><span class="hl str">"</span>
<span class="hl kwa">done</span>
Feel free to add<span class="hl opt">/</span>correct as necessary
</pre></div>
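<p>Note that rebasing with --onto master "$branch~" "$branch" replays only the tip commit of each branch. If the whole branch should be replayed, as in the original git rebase master mybranch, a hedged alternative could look like the sketch below. The function name rebase_all_onto is made up for illustration, and the skipped branch names are taken from the loop above. It matches names exactly, so a branch such as master-fixes would not be skipped by accident.</p>
<pre><code># Hypothetical helper: rebase every local branch onto a base branch,
# replaying the whole branch as "git rebase master mybranch" does.
rebase_all_onto() {
    base=${1:-master}
    for branch in $(git for-each-ref --format='%(refname:short)' refs/heads); do
        # Exact-name matching for the branches that must be left alone.
        case "$branch" in
            "$base"|git-annex|newroot) continue ;;
        esac
        echo "Rebasing branch $branch onto $base..."
        # Stop at the first failure so conflicts can be resolved by hand.
        git rebase "$base" "$branch" || return 1
    done
}
</code></pre>
<p>After rebase_all_onto master finishes, the last rebased branch is left checked out, so a final git checkout master may be wanted.</p>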