devblog/day 642 cost model | git-annex | http://git-annex.branchable.com/devblog/day_642__cost_model/ | git-annex | ikiwiki | 2021-11-09T20:05:08Z
comment 1 | http://git-annex.branchable.com/devblog/day_642__cost_model/comment_1_36ecaafc1d7f6460758d6824a357432c/ | jkniiv | 2021-11-09T08:48:33Z | 2021-11-09T08:48:33Z
<p>My first impression of commit <a href="http://source.git-annex.branchable.com/?p=source.git;a=commitdiff;h=a0758bdd1002e798f62353efa725ac2972589b96">a0758bdd1002e798f62353efa725ac2972589b96</a> with the cost model
is quite positive. I'm the one with multigigabyte annexed files in an otherwise rather small (by number of files)
repo, so I'm affected by the limitations of the filter-process method, which pipes all the content of annexed
files from git to git-annex. Compared to commit <a href="http://source.git-annex.branchable.com/?p=source.git;a=commitdiff;h=837025b14f523f9180f82d0cced1e53a8a9b94de">837025b14f523f9180f82d0cced1e53a8a9b94de</a>, which frankly
was unusable for me in this particular repo with <code>filter.annex.process</code> set, the new version behaves rather nicely:
a simple test of <code>time git checkout git-annex</code> followed by
<code>time git checkout 'adjusted/master(hidemissing-unlocked)'</code> turns out to be faster than with the unoptimised version
(8.20211028) without the long-running <code>filter-process</code> functionality. Obviously, it's only the first stage,
i.e. checking out the git-annex branch, that becomes faster by over 50 percent, but I'll take any improvement
in my daily git-annex operations. <img src="http://git-annex.branchable.com/smileys/smile.png" alt=":)" /></p>
<p>The timings I got are as follows.</p>
<ul>
<li><code>git checkout git-annex</code>
<ul>
<li>unoptimised 8.20211028 / w/o <code>filter-process</code>: 103s</li>
<li>commit 837025b14 / w/ <code>filter-process</code> enabled: 36s</li>
<li>commit 9d3ce224e / w/ <code>filter-process</code> enabled: 37s</li>
</ul>
</li>
<li><code>git checkout 'adjusted/master(hidemissing-unlocked)'</code>
<ul>
<li>unoptimised 8.20211028 / w/o <code>filter-process</code>: 49s</li>
<li>commit 837025b14 / w/ <code>filter-process</code> enabled: 57 minutes (I had dropped a few files; in reality this would have taken even longer)</li>
<li>commit 9d3ce224e / w/ <code>filter-process</code> enabled: 43s</li>
</ul>
</li>
</ul>
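<p>As a rough check of the figures above, the relative time saved can be worked out from the listed timings. This is only a minimal sketch: the dictionary keys and the <code>percent_reduction</code> helper are names made up for illustration, and the numbers are the ones from the list (unoptimised 8.20211028 versus commit 9d3ce224e).</p>

```python
# Timings in seconds, copied from the list above.
# "baseline" = unoptimised 8.20211028 without filter-process,
# "cost_model" = commit 9d3ce224e with filter-process enabled.
timings = {
    "git checkout git-annex": {"baseline": 103, "cost_model": 37},
    "git checkout adjusted branch": {"baseline": 49, "cost_model": 43},
}


def percent_reduction(before, after):
    """Return how much shorter `after` is than `before`, in percent."""
    return 100.0 * (before - after) / before


# Relative time saved per checkout, rounded to one decimal place.
results = {
    name: round(percent_reduction(t["baseline"], t["cost_model"]), 1)
    for name, t in timings.items()
}

for name, pct in results.items():
    print(f"{name}: {pct}% less wall-clock time")
```

<p>With these inputs the first checkout takes roughly 64% less time and the second roughly 12% less, which matches the observation that most of the gain is in checking out the git-annex branch.</p>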
<p>This repo is on Windows (with annex.thin set) and locally has only 13 annexed files on this very drive but the files
cover some 870 gigabytes worth of system backup images so individual files are definitely on the larger side for
git-annex.</p>
comment 2 | http://git-annex.branchable.com/devblog/day_642__cost_model/comment_2_5bb7a20dd3141423ce4109d8fd4098c4/ | joey | 2021-11-09T20:05:08Z | 2021-11-09T19:55:51Z
<p>Thanks @jkniiv, that's good to hear. Those are exactly the results I would
have hoped for in a repository like yours. Speeding up the checkout
of the annexed files in your case will probably need improvements to git.
probably. But the crucial thing is it's not gotten worse, and the other
checkout improved a lot.</p>
<p>I would be curious how you find <code>git add</code>'s performance now, if you ever
use that to add large annexed files.</p>