forum/"du" equivalent on an annex?git-annexhttp://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/git-annexikiwiki2024-01-07T04:11:37Zcomment 1http://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_1_a41bd02361aa961e5285aeaf1ea062be/sunny2562013-11-27T22:47:37Z2012-11-28T23:21:11Z
<p>du(1) also accepts the -L option, so if you want to find, for example, which directories occupy the most storage:</p>
<pre><code>$ du -L | sort -n
</code></pre>
comment 2http://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_2_28ba62a546f5cc8f416491423d743d8a/sunny2562013-11-27T22:47:37Z2012-11-28T23:24:11Z
<p>And if you want to find the biggest files in a directory tree:</p>
<pre><code>$ find -type l -print0 | xargs -0 du -L | sort -n | tail -500
</code></pre>
comment 3http://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_3_8d97f40c1d14b7230f3656a00a99cf80/edheil [wordpress.com]2013-11-27T22:47:37Z2012-11-29T02:03:24Z
Sweet! I should have RTFM a bit more. Thanks. <img src="http://git-annex.branchable.com/smileys/smile.png" alt=":)" />
comment 4http://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_4_baa8fbbdd5c449a0dc2bb622cb4a47ce/Steve2013-11-27T22:47:37Z2012-11-29T23:51:21Z
<p>I've been thinking about writing a sort of git-annex du. I'm surprised to find someone else looking for such a thing. While "du -L" will tell you how much space is used by files you actually have, I was interested in knowing (approximately) how much space would be used if you were to git-annex get everything you don't yet have.</p>
<p>There are many options and variations to think about, such as:</p>
<ul>
<li>do you want to count duplicate files once or as many times as they appear (as if you 'git-annex lock'd them all)</li>
<li>maybe you want to know how much space is used by files that reside only on a certain remote or set of remotes</li>
<li>you might want to know how much space would be used by all the files you don't yet have, but not count the files you already have</li>
</ul>
<p>All of the backends so far seem to store the size of the files in the filename, so my plan was to read it out of the links. If anybody has a better idea about how to get the sizes of annexed files or options that would be handy for a git-annex du, let me know. I'll see if I can get the start of something useful this weekend. I'll post here when I have something to share.</p>
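<p>A minimal sketch of the "read it out of the links" idea, not the eventual gadu implementation. It assumes the symlink target ends in a key shaped like <code>BACKEND-sSIZE--NAME</code> (for example <code>SHA256E-s1048576--HASH.ext</code>); the function names <code>annex_key_size</code> and <code>annex_total</code> are made up for illustration. Keys without a size field are skipped, and duplicate files are counted once per link.</p>

```shell
# Print the size in bytes encoded in an annex key, given a symlink target.
# Assumes the key looks like BACKEND-sSIZE--NAME; prints nothing otherwise.
annex_key_size() {
    key=$(basename "$1")
    printf '%s\n' "$key" | sed -n 's/^[A-Z0-9_]*-s\([0-9][0-9]*\)-.*/\1/p'
}

# Sum the encoded sizes of every symlink under a directory (default: .),
# skipping the .git directory. Content does not need to be present locally.
annex_total() {
    find "${1:-.}" -path '*/.git' -prune -o -type l -exec readlink {} \; |
    while IFS= read -r target; do
        annex_key_size "$target"
    done |
    awk '{ total += $1 } END { print total + 0 }'
}
```

<p>Running <code>annex_total some/dir</code> prints a byte count that includes files whose content is not present locally, which is the case <code>du -L</code> cannot cover.</p>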
<p>I'm also open to suggestions for the executable name. Right now I'm thinking "gadu" for git-annex disk usage.</p>
comment 5http://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_5_2ee6cbbfe54a2e7b6e8eb539c18e663d/sunny2562013-11-27T22:47:37Z2012-11-30T00:29:44Z
<p>Steve, that would be a very useful utility. I've been thinking of such a tool, but haven't gotten around to writing it yet. It would be practical to have before copying big/many files from another drive. When I've been short of free space, I've executed <code>du -L</code> in the source directory, but that's a bit cumbersome.</p>
<p>And "gadu" is a fine name, yes. Goes well along with my "ga" shortcut for "git annex", which I created two hours after I started using git-annex. I've probably saved thousands of keystrokes because of that. ☺</p>
gadu 0.01 is uphttp://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_6_48f6a2761a34b7f991325f1d24e2c5ff/Steve2013-11-27T22:47:37Z2012-12-08T06:22:50Z
I've got an initial try at gadu up over at <a href="http://git-annex.mysteryvortex.com/git-annex-utils.html">http://git-annex.mysteryvortex.com/git-annex-utils.html</a>. I created a separate thread for it: <a href="http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/">http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/</a>
Also try sizeshttp://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_7_d632baff41b8582f1a79bc5018c68545/John2013-11-27T22:47:37Z2013-01-02T22:05:39Z
The "sizes" tool on Hackage is a git-annex aware du-like utility. Just give it the "-A" flag to have it interpret annex symlinks as if they were normal files, and to also ignore files inside .git/annex.
comment 8http://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_8_6a51c22d9893fa5b1503f5a14b1eb8ce/anarcat [id.koumbit.net]2015-03-31T03:11:06Z2015-03-31T03:11:06Z
i have used the <a href="http://dev.yorhel.nl/ncdu/bug/16">symlink patch of ncdu</a> with good results here...
comment 9http://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_9_a2bda4c74ef09a58709c0c5f6ee1b726/anarcat2015-06-24T21:40:07Z2015-06-24T21:40:07Z
... but nowadays, i use <code>git annex info --fast *</code>.
comment 10http://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_10_a80eccce0bdb33748ab1ccba9b88a65f/anarcat2020-06-26T20:29:26Z2020-06-26T20:29:26Z
<p>... or even, more fancy:</p>
<pre><code>git annex info --fast * --json | jq -j '."local annex size", "\t", .directory, "\t", "\n"' | sort -h
</code></pre>
<p>Downside: the json output doesn't give us something <code>sort</code> can really work with (it expects <code>M</code>, <code>G</code>, not <code>mebibytes</code>, <code>gibibytes</code>, which is arguably a bug...). But precision fanatics can also work around that with:</p>
<pre><code>git annex info --fast * --json --bytes | jq -j '."local annex size", "\t", .directory, "\t", "\n"' | sort -h
</code></pre>
<p>Then you can go crazy trying to convert those numbers <a href="https://unix.stackexchange.com/questions/346902/need-to-convert-bytes-to-gb-mb-kb-in-normal-decimal-format">back to something readable</a> in your own spare time... <img src="http://git-annex.branchable.com/smileys/smile4.png" alt=";)" /></p>
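<p>One possible way to do that conversion, as a sketch only: sort the raw byte counts numerically first, then rewrite the first tab-separated field into binary units. The helper name <code>humanize</code> is made up for illustration, and it assumes each input line starts with a plain byte count followed by a tab.</p>

```shell
# Rewrite a leading byte count on each tab-separated line as B/KiB/MiB/...
# Assumption: field 1 is a plain integer number of bytes.
humanize() {
    awk -F '\t' -v OFS='\t' '{
        split("B KiB MiB GiB TiB PiB", unit, " ")
        size = $1; i = 1
        # Divide down until the value fits the current unit.
        while (size >= 1024 && i < 6) { size /= 1024; i++ }
        $1 = sprintf("%.1f %s", size, unit[i])
        print
    }'
}

# Intended use with the --bytes pipeline from this comment:
#   git annex info --fast * --json --bytes \
#     | jq -j '."local annex size", "\t", .directory, "\n"' \
#     | sort -n | humanize
```

<p>Sorting before converting means plain <code>sort -n</code> is enough, since the numbers are still raw bytes at that point.</p>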
Using fusehttp://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_11_e3f82edf70d28f00d30cfd98f189cd86/wzhd2022-05-14T03:10:18Z2022-05-14T03:10:18Z
<p>Wrote a bare minimum <a href="https://codeberg.org/wzhd/annexize/">fuse fs</a> so that du-like utilities like ncdu, gt5, gdu can be used.</p>
<p>It reads each symlink target, tries to get a number after <code>SHA256E-s</code>, and pretends it's a regular file with that size. <code>git-annex add</code>ed files don't need to be locally available.</p>
<p>Files can be deleted but no other operations are implemented.</p>
Using fuse annexize - works like a charmhttp://git-annex.branchable.com/forum/__34__du__34___equivalent_on_an_annex__63__/comment_12_041ae920e2e6fd83967dc98646452fa7/psxvoid2024-01-07T04:11:37Z2024-01-07T04:11:37Z
<p>Hey @wzhd,</p>
<p>Thanks a lot for this tool - annexize works like a charm even in tags views!</p>
<p>It basically solves the most important problem for me:</p>
<p>I can use ncdu for organizing files in my local git annex repository that
does not contain any actual files (only file links), and then just sync
with linked repos that do store those files.</p>