forum/gadu - git-annex disk usage git-annex http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/ git-annex ikiwiki 2015-06-23T04:15:24Z
0.02 is up http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_1_067d0ffe8900751bd2d2743254ac4d77/ Steve 2013-11-27T22:47:37Z 2012-12-08T14:20:16Z
<p>I fixed some bugs that gave the wrong answer occasionally, and made gadu much smarter now.</p>
<p>It now searches for the .git dir and makes sure the git-annex links are well formed before counting them.
I also added a few more du-like options.</p>
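<p>As a rough illustration of the two checks described above (locating the .git dir and validating annex links), here is a minimal Python sketch. It is not gadu's actual code: the names <code>find_git_dir</code> and <code>is_annex_link</code>, the key regex, and the assumed link layout (<code>.git/annex/objects</code>, two hash directories, then the key twice) are assumptions made for illustration.</p>

```python
import os
import re

# Assumed key shape: BACKEND, optional -sSIZE / -mMTIME style fields, then --NAME.
KEY_RE = re.compile(r"^[A-Z0-9_]+(?:-[smSC]\d+)*--.+$")


def find_git_dir(start):
    """Walk upward from `start` until a directory containing `.git` is found."""
    path = os.path.abspath(start)
    while True:
        candidate = os.path.join(path, ".git")
        if os.path.exists(candidate):
            return candidate
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root without finding .git
            return None
        path = parent


def is_annex_link(path):
    """True if `path` is a symlink whose target looks like an annex object."""
    if not os.path.islink(path):
        return False
    parts = os.readlink(path).split("/")
    # Expect: [..]/.git/annex/objects/<hash1>/<hash2>/<KEY>/<KEY>
    if len(parts) < 7:
        return False
    key_dir, key = parts[-2], parts[-1]
    return (
        parts[-7:-4] == [".git", "annex", "objects"]
        and key_dir == key
        and KEY_RE.match(key) is not None
    )
```

<p>The link check only inspects the symlink target, so it also accepts links whose content is not present locally.</p>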
comment 2 http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_2_ec8b57426e4d82c3392eb7dd683f2ddc/ sunny256 2013-11-27T22:47:37Z 2012-12-08T15:35:16Z
<p>Have downloaded v0.02 and experimented a bit, and it seems to work nicely. A couple of things, though:</p>
<ul>
<li>It displays the sizes in 512-byte blocks by default. I find that very confusing, and the standard <code>du</code>(1) from GNU coreutils uses 1024-byte (1 KiB) blocks as default. AFAIK 512-byte blocks are an old way of measuring sizes from the really ancient UNIX days. Traditionally correct, maybe, but not very useful these days.</li>
<li>When not specifying a path, can it use "<code>.</code>" as default?</li>
<li>A human-readable format would've been nice, like 234M or 13G. The <code>du</code>(1) from GNU coreutils uses <code>-h</code> for this, but that option is already used for <code>--help</code>. And that's OK, I think <code>-h</code> should be reserved for that purpose. IMHO using <code>-h</code> as a synonym for <code>--human-readable</code> was a bad choice by coreutils, but it's too late to change that now.</li>
</ul>
<p>Is there a Git repository available for git-annex-utils somewhere? That's my preferred way of getting updates and following the development.</p>
<p>Anyway, thanks. <img src="http://git-annex.branchable.com/smileys/smile.png" alt=":)" /></p>
gadu 0.03 is up http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_3_38296fef5a2dc5794c2dc09df676b8c1/ Steve 2013-11-27T22:47:37Z 2012-12-09T13:05:10Z
<ul>
<li>1K blocksize is now the default</li>
<li>"." is now the default path</li>
<li>implemented -B/--block-size option</li>
<li>--help is no longer -h, it only has a long option like du</li>
<li>implemented -h/--human-readable option</li>
</ul>
<p>du will take up to yottabytes for the --block-size option. I had been fudging the sizes with a size_t thinking 16 exabytes was plenty big enough for now, but since I was implementing --block-size I went ahead and converted everything to use the GNU MP. So libgmp is now a dependency.</p>
<p>--human-readable probably doesn't have exactly the same output, but I think it is good enough. I tried to make the options work mostly the same as du from coreutils. Let me know if you find other discrepancies.</p>
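<p>To make the block-size and human-readable behavior discussed above concrete, here is a small Python sketch. It is not gadu's implementation: the function names are hypothetical, the suffixes are assumed to be powers of 1024 from K up to Y, a bare suffix is assumed to mean one unit, and the rounding is a simplified "round up to a whole number" rule that will not match du's output exactly. Python integers are arbitrary precision, which plays the role that GNU MP plays in the C code.</p>

```python
UNITS = "KMGTPEZY"  # kibi .. yobi, assumed powers of 1024


def parse_block_size(text):
    """Parse sizes such as '512', '1K' or '4M' into a byte count."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        exponent = UNITS.index(text[-1].upper()) + 1
        number = text[:-1] or "1"  # assumption: a bare 'K' means 1K
        return int(number) * 1024 ** exponent
    return int(text)


def blocks(size_bytes, block_size=1024):
    """Number of blocks needed to hold size_bytes, rounded up."""
    return -((-size_bytes) // block_size)


def human_readable(size_bytes):
    """Format a byte count with a power-of-1024 suffix, e.g. 234M or 13G."""
    exponent = 0
    while exponent < len(UNITS) and size_bytes >= 1024 ** (exponent + 1):
        exponent += 1
    if exponent == 0:
        return str(size_bytes)
    value = -((-size_bytes) // (1024 ** exponent))  # ceiling division
    if value == 1024 and exponent < len(UNITS):
        # Rounding up reached the next unit, e.g. 1024K becomes 1M.
        value, exponent = 1, exponent + 1
    return f"{value}{UNITS[exponent - 1]}"
```

<p>For example, <code>blocks(1025)</code> gives 2 with the default 1K block size, and <code>human_readable(13 * 1024**3)</code> gives <code>13G</code>.</p>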
<p>I'll see about making the git tree available soon, but it may have to wait until next weekend. I may also look into a forum for the website, or a mailing list.</p>
comment 4 http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_4_1bcc94f9982c6cfd0888f3dba0f9221e/ sunny256 2013-11-27T22:47:37Z 2012-12-09T20:13:47Z
Thanks a lot, Steve. Awesome, got everything on my wishlist. <img src="http://git-annex.branchable.com/smileys/smile.png" alt=":)" /> A very useful utility, and works perfectly. Will be using this a lot. git-annex-utils is a good name for this, I'm sure if you place it on GitHub or somewhere else you'll get lots of contributions and this could grow to be a project containing many useful utilities for git-annex.
comment 5 http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_5_4365cd3031456fac1b563ee72984638e/ Steve 2013-11-27T22:47:37Z 2012-12-10T04:07:53Z
<p>I pay attention to feedback <img src="http://git-annex.branchable.com/smileys/smile4.png" alt=";)" /></p>
<p>I'm not done with it yet, I want to add in some options to limit what gets counted.</p>
<p>For example: If you have two annexed files that contain the same content using the same backend, they will be stored only once in the .git/annex/objects directory but be counted twice by gadu.</p>
<p>I want to fix that, but I'll leave an option to keep that behavior if you want. I also want to add options to count or not count files that exist in a certain repo. It will be very easy to add options to only count files that you have or don't have locally as well.</p>
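<p>The difference between counting every link and counting each stored object once can be sketched in Python as follows. This is only an illustration, not how gadu does it: <code>annex_usage</code> is a hypothetical name, and the sketch assumes that the size is recorded in the key as an <code>-s</code> field right after the backend name and that link targets contain <code>annex/objects/</code>.</p>

```python
import os
import re

# Assumed key prefix: BACKEND-s<size>, optional further numeric fields, then "--".
SIZE_RE = re.compile(r"^[A-Z0-9_]+-s(\d+)(?:-[mSC]\d+)*--")


def annex_usage(root, unique=True):
    """Sum the sizes recorded in the annex keys linked under `root`.

    With unique=True each key is counted once, matching what is stored in
    .git/annex/objects; with unique=False every link is counted.
    """
    seen = set()
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames:
            dirnames.remove(".git")  # do not descend into the repository itself
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)
            if "annex/objects/" not in target:
                continue
            key = os.path.basename(target)
            match = SIZE_RE.match(key)
            if match is None:
                continue  # key carries no size field
            if unique and key in seen:
                continue  # same content already counted
            seen.add(key)
            total += int(match.group(1))
    return total
```

<p>Because the size is read from the key name, the sketch counts files whether or not their content is present locally.</p>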
<p>Making it pay attention to environment variables that git and git-annex do would also be a good idea. (like GIT_DIR, etc...)</p>
<p>I'm open to good ideas that anybody has, unfortunately I can only work on it on the weekends for now.</p>
Great util! http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_6_2b03d7b857497cb811e992f85700cdcc/ markus 2013-11-27T22:47:37Z 2012-12-28T17:45:27Z
<p>Hi</p>
<p>gadu is a great util! The speed increase compared to "du -smL" will make it my fav. util for size calc!</p>
<p>ciao markus</p>
sizes has git-annex support http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_7_03a4dfaf3bd73d41c6f3c3fab0a6a922/ John 2013-11-27T22:47:37Z 2012-12-30T22:47:42Z
Just to note that I already added git-annex support to my "sizes" utility on Hackage several months back. With -A, it shows you storage totals with annex symlinks computed fully resolved.
git repo is now up http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_8_fc6ddb4dc075ee42368863c1b026dbf7/ Steve 2013-11-27T22:47:37Z 2013-01-01T23:39:33Z
<p>sunny256, the git repo is now accessible at <a href="http://git.mysteryvortex.com">http://git.mysteryvortex.com</a></p>
<p>Markus, I never used the -m option myself. I added it in git, and it'll be in the next tarball. (I plan to go through the du man page and add all appropriate options soon.)</p>
<p>John, I wasn't aware of your sizes utility. I'll look into it.</p>
The git-annex-utils repo http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_9_f03254e518cbdda73e4b88e72476275d/ sunny256 2013-11-27T22:47:37Z 2013-01-09T09:39:16Z
Thanks for setting up the git-annex-utils repo, Steve. Will install the newest gadu(1) on my computers now. I'll probably contribute some patches for some utility scripts or programs, if that's OK. What's your preferred way to receive patches? GPG-encrypted mail to your address in the commit log?
comment 10 http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_10_f632a62c4dbbf01b29f146893d7725f9/ Steve 2013-11-27T22:47:37Z 2013-01-14T01:54:12Z
<p>No problem, glad to see it is useful. I'm not exactly a web guy, but I want to get some sort of comment/discussion system up there soon so we aren't filling up Joey's web site with semi-offtopic discussion. (also a little beautification is in order)</p>
<p>Yes, contributions are welcome. GPG/PGP encrypted email is the preferred mode of communication.</p>
<p>Currently I ask for copyright assignment in case I want to change licenses in the future. I pledge not to go to a non-free license, but the GPL3 license choice was fairly arbitrary. I might want to add the "or any later version" clause, for example. There is also potential for a library to be split off which might benefit from something like LGPL licensing or similar. I haven't really studied the licensing situation since GPL3 came around, so I need to take some time to look into it.</p>
<p>I don't want to have a licensing discussion here though as it would be offtopic. Feel free to email me and we can discuss.</p>
comment 11 http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_11_73461da2d55d040cb43e0db286975821/ joey 2013-11-27T22:47:37Z 2013-03-11T05:31:04Z
<p>I don't want to steal gadu's thunder, and I really quite like having an ecosystem of tools develop around git-annex.</p>
<p>With that said, "git annex status ." now shows the disk used for all files in the current directory and below. It also shows the number of keys, and the total amount of disk those keys would use.</p>
<p>Additionally, you can use all the standard git-annex file limiting options. For example, here I'm finding out how much disk space is used by files located on a <em>remote</em> system:</p>
<pre>
git annex status . --in turtle
directory: .
local annex keys: 0
local annex size: 0 bytes
known annex keys: 10
known annex size: 3 gigabytes
</pre>
comment 12 http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_12_6c4fb123091bde435c18ac3dfd5a9b77/ joey 2013-11-27T22:47:37Z 2013-03-11T05:33:09Z
BTW, I think gadu still has its own uses, due to having a du-like output that can list space used by subdirectories. You can do that with git annex status *, but it's much more verbose, and doesn't show a breakdown by deeper subdirectories.
Update on determining disk usage http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_13_8e0e86ae716ff018025808f417e1f7f6/ Andrew 2015-01-22T06:29:36Z 2015-01-22T06:29:36Z
<p>I just had a look at this question today as I learn git-annex. I think the commands have changed since the last comment. However, there remain several ways to determine disk usage. For example, for the folder <code>Music</code>:</p>
<pre><code>git annex info Music
</code></pre>
<p>but you could also use <code>du</code> with</p>
<pre><code>du --human-readable --dereference Music
</code></pre>
it's git annex info, not status http://git-annex.branchable.com/forum/gadu_-_git-annex_disk_usage/comment_14_d8f69914b88feb3f3ed4f72c26dd74e5/ anarcat 2015-06-23T04:15:24Z 2015-06-23T04:15:24Z
<p>so the previous comments by joeyh were correct 2 years ago, but now git annex status behaves more like git-status than anything else, and will <em>not</em> give you disk usage.</p>
<p>however, <code>git annex info</code> will, and if you use <code>--fast</code>, it works pretty fast as well. example, on my pictures collection:</p>
<pre>
[1044]anarcat@marcos:Photos$ time git annex info --fast *
directory: 1969
local annex keys: 5
local annex size: 10.95 megabytes
annexed files in working tree: 5
size of annexed files in working tree: 10.95 megabytes
directory: 1970
local annex keys: 26
local annex size: 827.5 megabytes
annexed files in working tree: 26
size of annexed files in working tree: 827.5 megabytes
directory: 1998
local annex keys: 10
local annex size: 3.31 megabytes
annexed files in working tree: 10
size of annexed files in working tree: 3.31 megabytes
directory: 2004
local annex keys: 49
local annex size: 42.01 megabytes
annexed files in working tree: 49
size of annexed files in working tree: 42.01 megabytes
directory: 2005
local annex keys: 561
local annex size: 379.23 megabytes
annexed files in working tree: 561
size of annexed files in working tree: 379.23 megabytes
directory: 2006
local annex keys: 932
local annex size: 995.95 megabytes
annexed files in working tree: 932
size of annexed files in working tree: 995.95 megabytes
directory: 2007
local annex keys: 1162
local annex size: 2.33 gigabytes
annexed files in working tree: 1162
size of annexed files in working tree: 2.33 gigabytes
directory: 2008
local annex keys: 658
local annex size: 934.88 megabytes
annexed files in working tree: 658
size of annexed files in working tree: 934.88 megabytes
directory: 2009
local annex keys: 500
local annex size: 836.65 megabytes
annexed files in working tree: 500
size of annexed files in working tree: 836.65 megabytes
directory: 2010
local annex keys: 1049
local annex size: 1.85 gigabytes
annexed files in working tree: 1049
size of annexed files in working tree: 1.85 gigabytes
directory: 2011
local annex keys: 1206
local annex size: 1.54 gigabytes
annexed files in working tree: 1206
size of annexed files in working tree: 1.54 gigabytes
directory: 2012
local annex keys: 2767
local annex size: 10.52 gigabytes
annexed files in working tree: 2767
size of annexed files in working tree: 10.52 gigabytes
directory: 2013
local annex keys: 4071
local annex size: 32.49 gigabytes
annexed files in working tree: 4071
size of annexed files in working tree: 32.49 gigabytes
directory: 2014
local annex keys: 6930
local annex size: 27.34 gigabytes
annexed files in working tree: 6930
size of annexed files in working tree: 27.34 gigabytes
directory: 2015
local annex keys: 2134
local annex size: 8.07 gigabytes
annexed files in working tree: 2134
size of annexed files in working tree: 8.07 gigabytes
directory: rando-velo
local annex keys: 184
local annex size: 537.58 megabytes
annexed files in working tree: 184
size of annexed files in working tree: 537.58 megabytes
directory: RMLL2008-Koumbit
local annex keys: 11
local annex size: 25.58 megabytes
annexed files in working tree: 11
size of annexed files in working tree: 25.58 megabytes
5.47user 1.75system 0:14.70elapsed 49%CPU (0avgtext+0avgdata 30524maxresident)k
121136inputs+0outputs (2major+18418minor)pagefaults 0swaps
</pre>
<p>whereas running it without <code>--fast</code> is much slower, presumably because it's fetching the tracking information:</p>
<pre>
[1045]anarcat@marcos:Photos$ time git annex info *
directory: 1969
local annex keys: 5
local annex size: 10.95 megabytes
annexed files in working tree: 5
size of annexed files in working tree: 10.95 megabytes
numcopies stats:
numcopies +0: 5
directory: 1970
local annex keys: 26
local annex size: 827.5 megabytes
annexed files in working tree: 26
size of annexed files in working tree: 827.5 megabytes
numcopies stats:
numcopies +0: 26
directory: 1998
local annex keys: 10
local annex size: 3.31 megabytes
annexed files in working tree: 10
size of annexed files in working tree: 3.31 megabytes
numcopies stats:
numcopies +0: 10
directory: 2004
local annex keys: 49
local annex size: 42.01 megabytes
annexed files in working tree: 49
size of annexed files in working tree: 42.01 megabytes
numcopies stats:
numcopies +0: 49
directory: 2005
local annex keys: 561
local annex size: 379.23 megabytes
annexed files in working tree: 561
size of annexed files in working tree: 379.23 megabytes
numcopies stats:
numcopies +0: 561
directory: 2006
local annex keys: 932
local annex size: 995.95 megabytes
annexed files in working tree: 932
size of annexed files in working tree: 995.95 megabytes
numcopies stats:
numcopies +0: 932
directory: 2007
local annex keys: 1162
local annex size: 2.33 gigabytes
annexed files in working tree: 1162
size of annexed files in working tree: 2.33 gigabytes
numcopies stats:
numcopies +0: 1162
directory: 2008
local annex keys: 658
local annex size: 934.88 megabytes
annexed files in working tree: 658
size of annexed files in working tree: 934.88 megabytes
numcopies stats:
numcopies +0: 658
directory: 2009
local annex keys: 500
local annex size: 836.65 megabytes
annexed files in working tree: 500
size of annexed files in working tree: 836.65 megabytes
numcopies stats:
numcopies +0: 500
directory: 2010
local annex keys: 1049
local annex size: 1.85 gigabytes
annexed files in working tree: 1049
size of annexed files in working tree: 1.85 gigabytes
numcopies stats:
numcopies +0: 1049
directory: 2011
local annex keys: 1206
local annex size: 1.54 gigabytes
annexed files in working tree: 1206
size of annexed files in working tree: 1.54 gigabytes
numcopies stats:
numcopies +0: 1206
directory: 2012
local annex keys: 2767
local annex size: 10.52 gigabytes
annexed files in working tree: 2767
size of annexed files in working tree: 10.52 gigabytes
numcopies stats:
numcopies +0: 2767
directory: 2013
local annex keys: 4071
local annex size: 32.49 gigabytes
annexed files in working tree: 4071
size of annexed files in working tree: 32.49 gigabytes
numcopies stats:
numcopies +0: 4071
directory: 2014
local annex keys: 6930
local annex size: 27.34 gigabytes
annexed files in working tree: 6930
size of annexed files in working tree: 27.34 gigabytes
numcopies stats:
numcopies +0: 6930
directory: 2015
local annex keys: 2134
local annex size: 8.07 gigabytes
annexed files in working tree: 2134
size of annexed files in working tree: 8.07 gigabytes
numcopies stats:
numcopies +0: 2134
directory: rando-velo
local annex keys: 184
local annex size: 537.58 megabytes
annexed files in working tree: 184
size of annexed files in working tree: 537.58 megabytes
numcopies stats:
numcopies +0: 184
directory: RMLL2008-Koumbit
local annex keys: 11
local annex size: 25.58 megabytes
annexed files in working tree: 11
size of annexed files in working tree: 25.58 megabytes
numcopies stats:
numcopies +0: 11
37.46user 5.70system 1:54.20elapsed 37%CPU (0avgtext+0avgdata 30704maxresident)k
107912inputs+0outputs (2major+19426minor)pagefaults 0swaps
</pre>
<p>14.7 seconds vs 114.2 seconds! that's roughly 8 times slower without <code>--fast</code>, almost an order of magnitude of difference...</p>
<p>still, it seems to me <code>git annex info --fast $path</code> should be more clearly put forward as an alternative du solution for now. maybe this should be made into a tips page?</p>
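<p>to get something closer to du's one-line-per-directory output from the text shown above, a small Python filter could look like this. it is only a sketch: the function names are made up, it parses the plain text format as it appears in this thread (which may differ between git-annex versions), and it assumes the units are decimal (1 megabyte = 10^6 bytes).</p>

```python
# Assumed decimal units, as printed in the output above.
UNIT_BYTES = {
    "byte": 1,
    "bytes": 1,
    "kilobytes": 10 ** 3,
    "megabytes": 10 ** 6,
    "gigabytes": 10 ** 9,
    "terabytes": 10 ** 12,
}


def parse_info(text):
    """Return (directory, local annex size) pairs from `git annex info` text."""
    results = []
    current = None
    for line in text.splitlines():
        label, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # timing lines, blank lines, section headers
        if label == "directory":
            current = value
        elif label == "local annex size" and current is not None:
            results.append((current, value))
    return results


def to_bytes(size):
    """Convert a string such as '10.95 megabytes' to an approximate byte count."""
    number, unit = size.split()
    return round(float(number) * UNIT_BYTES[unit])


def du_style(text):
    """Format the parsed output as 'size<TAB>directory' lines, like du."""
    return "\n".join(f"{size}\t{name}" for name, size in parse_info(text))
```

<p>the input text could come from <code>subprocess.run(["git", "annex", "info", "--fast", path], capture_output=True, text=True).stdout</code>. the byte counts are approximate, since the printed sizes are already rounded to two decimals.</p>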