forum/Move historygit-annexhttp://git-annex.branchable.com/forum/Move_history/git-annexikiwiki2019-01-21T15:42:51Zcomment 1http://git-annex.branchable.com/forum/Move_history/comment_1_b720837e8e4ab8f4b0648f8ac336a605/andrew2019-01-21T15:42:51Z2018-12-30T21:58:55Z
If you run <code>git annex info</code>, the <code>local annex size</code> line shows how much of this space comes from actual annexed objects. If the objects aren't associated with any files in your working tree and you don't want them, you can list them with the <code>unused</code> command and remove them with <code>dropunused</code>; there are some pointers here: <a href="http://git-annex.branchable.com/walkthrough/unused_data/">http://git-annex.branchable.com/walkthrough/unused_data/</a>.
comment 2http://git-annex.branchable.com/forum/Move_history/comment_2_17b8c2fd3ac706cd666ad5f5c0a21715/Chymera2019-01-21T15:42:51Z2019-01-02T12:08:19Z
<blockquote><p> If you run git annex info you can see how much of this space is from actual annexed objects local annex size</p></blockquote>
<p>I don't understand what you mean, <code>git annex info</code> tells me nothing about the repository size, and I have no idea where to enter the “local annex size” string.
At any rate, no, I have already run <code>git annex dropunused</code> to get rid of all unused files. All the rest seems to be needed for some reason. My question was whether I can move all of this history to one of my volumes which have more space on them.</p>
comment 3http://git-annex.branchable.com/forum/Move_history/comment_3_9ff881c26c3bd7f3a61595ffc1df743a/andrew2019-01-21T15:42:51Z2019-01-03T01:32:28Z
<p>There are various ways to forget history, both in git and git-annex. I don't have enough clarity into what history is taking up space in your repository to give you a good answer. Answering the following questions will give me more insight into where the space is being used up, then I can give you some ideas on how to reclaim it:</p>
<p>Is the repo in question direct or indirect (I am not sure what you meant by "direct mode network")? Output of <code>git annex info | grep "repository mode"</code> command will tell you this.</p>
<p>What git-annex repo version is the repo in question? Output of <code>git annex version | grep "local repository version"</code> command will tell you this.</p>
<p>If you cd into the repo in question and run <code>git-annex info</code> it gives you various information about what git-annex thinks about the repository. One of the outputs of this command is "local annex size" which tells you how much space this repo is taking up. In a direct mode repo this should be the same size as you get from sizing all the files in your working directory excluding the .git directory (<code>du -sh --exclude=.git</code> on Linux). Otherwise in an indirect mode repo, the "local annex size" given by <code>git-annex info</code> should match the size of the <code>.git/annex/objects</code> directory.</p>
<p>If you cd into the repo in question what are the outputs of the following commands.</p>
<p>Size of git annex objects (In a direct mode repo this should be very small):</p>
<pre><code>du -sh .git/annex/objects/
</code></pre>
<p>Size of git objects (this just tells you how much history is stored in git; it should also be small unless you store a lot of large files in git, which you probably don't since you are using git-annex):</p>
<pre><code>du -sh .git/objects/
</code></pre>
<p>Size of working tree (this will tell you file content present in this repo):</p>
<pre><code>git annex info | grep "size of annexed files in working tree"
</code></pre>
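<p>To compare these numbers in one step, here is a minimal sketch of a helper that reports how much of a git directory is not explained by either objects folder. The function name <code>git_space_summary</code> and its output format are my own invention for illustration, not part of git or git-annex; it only wraps <code>du</code>.</p>

```shell
#!/bin/sh
# Hypothetical helper (not a git-annex command): summarize, in KiB, how much
# of a git directory is taken by annex objects, git objects, and anything else.
# The body runs in a subshell so its variables do not leak into the caller.
git_space_summary() (
    gitdir=${1:-.git}
    total=$(du -sk "$gitdir" | cut -f1)
    annex_objects=0
    git_objects=0
    # Either directory may be missing, so only measure what exists.
    if [ -d "$gitdir/annex/objects" ]; then
        annex_objects=$(du -sk "$gitdir/annex/objects" | cut -f1)
    fi
    if [ -d "$gitdir/objects" ]; then
        git_objects=$(du -sk "$gitdir/objects" | cut -f1)
    fi
    # Whatever is left over lives somewhere else under the git directory.
    other=$((total - annex_objects - git_objects))
    printf 'total_kb=%s\n' "$total"
    printf 'annex_objects_kb=%s\n' "$annex_objects"
    printf 'git_objects_kb=%s\n' "$git_objects"
    printf 'other_kb=%s\n' "$other"
)
```

<p>Running <code>git_space_summary .git</code> from the top of the repo prints the four values; a large <code>other_kb</code> suggests the space is somewhere other than the two objects folders.</p>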
comment 4http://git-annex.branchable.com/forum/Move_history/comment_4_d0c97744e9ccca21e4023f657e9f5f30/Chymera2019-01-21T15:42:51Z2019-01-03T19:33:01Z
<pre><code>chymera@clusterhost ~/ni_data/ofM.dr $ git annex info | grep "repository mode"
repository mode: direct
chymera@clusterhost ~/ni_data/ofM.dr $ git annex version | grep "local repository version"
local repository version: 5
</code></pre>
<blockquote><p>One of the outputs of this command is "local annex size" which tells you how much space this repo is taking up.</p></blockquote>
<p>This does not happen: <code>git-annex info | grep "local annex size"</code> returns nothing.</p>
<pre><code>chymera@clusterhost ~/ni_data $ du -sh .git/annex/objects/
1.6G .git/annex/objects/
chymera@clusterhost ~/ni_data $ du -sh .git/objects/
218M .git/objects/
chymera@clusterhost ~/ni_data $ du -sh
779G
chymera@clusterhost ~/ni_data $ du -sh .git/
501G .git/
</code></pre>
comment 6http://git-annex.branchable.com/forum/Move_history/comment_6_f2e1d20e269648358b1eced616727e63/Chymera2019-01-21T15:42:51Z2019-01-03T22:56:25Z
<pre><code>git annex info | grep "size of annexed files in working tree"
</code></pre>
<p>This does nothing but hang and I am not sure whether it's git annex or grep that hangs:</p>
<pre><code>chymera@clusterhost /mnt/overflow $ ps aux | ag annex
chymera 5884 0.0 0.0 139920 3388 pts/7 S+ 23:53 0:00 git annex info
chymera 5885 0.0 0.0 133216 900 pts/7 S+ 23:53 0:00 grep --colour=auto size of annexed files in working tree
chymera 5886 6.4 0.0 1074610112 102528 pts/7 Dl+ 23:53 0:05 /usr/bin/git-annex info
chymera 5905 0.0 0.0 11304 1084 pts/8 S+ 23:55 0:00 ag annex
chymera@clusterhost /mnt/overflow $ ps aux | ag git
chymera 5884 0.0 0.0 139920 3388 pts/7 S+ 23:53 0:00 git annex info
chymera 5886 6.3 0.0 1074610112 102528 pts/7 Dl+ 23:53 0:05 /usr/bin/git-annex info
chymera 5893 0.0 0.0 258580 4492 pts/7 S+ 23:54 0:00 git --git-dir=.git --work-tree=. --literal-pathspecs -c core.bare=false cat-file --batch
chymera 5894 0.0 0.0 139920 3740 pts/7 S+ 23:54 0:00 git --git-dir=.git --work-tree=. --literal-pathspecs -c core.bare=false cat-file --batch-check=%(objectname) %(objecttype) %(objectsize)
chymera 5909 0.0 0.0 11304 1032 pts/8 S+ 23:55 0:00 ag git
chymera@clusterhost /mnt/overflow $ ps aux | ag grep
chymera 5885 0.0 0.0 133216 900 pts/7 S+ 23:53 0:00 grep --colour=auto size of annexed files in working tree
chymera 5913 0.0 0.0 11304 1072 pts/8 S+ 23:55 0:00 ag grep
</code></pre>
comment 7http://git-annex.branchable.com/forum/Move_history/comment_7_acd690861b31e55352497b32b7c0fd6e/andrew2019-01-21T15:42:51Z2019-01-03T23:48:03Z
<p>Aaah, sorry, yeah, <code>git-annex info</code> is very slow; it checks many things locally and remotely… (I've seen it run for 30+ minutes on some of my repos). No worries, I don't think we'll learn much more from that command than we learned from the <code>du</code> commands.</p>
<p>You do indeed have some unaccounted-for space in <code>.git</code>. I usually expect most of the space to be in the git-annex or git objects folders, but those only account for about 1.8 GB (1.6 GB plus 218 MB) of the 501 GB in your <code>.git</code> folder.</p>
<p>What are the outputs of <code>du -h -d 1 .git/</code> (a level-1 listing of the sizes under <code>.git</code>) and <code>du -h -d 1 .git/annex/</code> (the same for the annex-specific folder)? That will help narrow down where the space is being used. Perhaps <code>.git/annex/misctmp</code> or <code>.git/annex/tmp</code> is the culprit.</p>
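<p>If the listing is long, sorting it makes the largest folder obvious. A small sketch, assuming your <code>du</code> supports <code>-d</code> as in the commands above; the function name <code>largest_subdirs</code> is made up for this example:</p>

```shell
#!/bin/sh
# Hypothetical helper: print the largest entries one level below a directory,
# sizes in KiB, biggest first. The directory's own total is part of the
# listing, just as it is in plain `du -d 1` output.
largest_subdirs() (
    dir=${1:-.git/annex}
    count=${2:-5}
    # -k gives plain KiB numbers so that sort -rn can order them numerically.
    du -k -d 1 "$dir" | sort -rn | head -n "$count"
)
```

<p>For example, <code>largest_subdirs .git/annex 3</code> shows the three biggest rows.</p>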
comment 8http://git-annex.branchable.com/forum/Move_history/comment_8_0e4b1629d2bb9d360a1ba65494dd8ccd/Chymera2019-01-21T15:42:51Z2019-01-04T03:52:22Z
<p>Additionally, I notice that my repo's git-annex repository version (5) is two versions old. Given the git-annex availability on my distributions, I think I could bump this to 6. Do you suggest I do this now or after this issue is handled?</p>
<pre><code>chymera@clusterhost ~/ni_data $ du -h -d 1 .git/
12K .git/info
52K .git/hooks
218M .git/objects
501G .git/annex
124K .git/refs
172K .git/logs
501G .git/
chymera@clusterhost ~/ni_data $ du -h -d 1 .git/annex/
499G .git/annex/misctmp
4.0K .git/annex/ssh
8.3M .git/annex/journal
60K .git/annex/keys
30M .git/annex/transfer
1.6G .git/annex/objects
4.0K .git/annex/tmp
501G .git/annex/
</code></pre>
comment 9http://git-annex.branchable.com/forum/Move_history/comment_9_1153476304000f417292fdce1f9727bf/andrew2019-01-21T15:42:51Z2019-01-04T12:49:51Z
<p>Aaah. All the space is in <code>.git/annex/misctmp</code>. This is essentially a directory for git annex to stage things temporarily, but I don't know too much about what gets put in this directory and when it is safe to delete it (the only official documentation is in <a href="http://git-annex.branchable.com/internals/">internals</a>).</p>
<p>One person had their <code>.git/annex/misctmp</code> dir fill up after <a href="https://git-annex.branchable.com/forum/misctmp_filling_up/">interrupting the assistant during transfers</a>, another person had their misctmp fill up <a href="https://git-annex.branchable.com/bugs/direct_command_leaves_repository_inconsistent_if_interrupted/">after interrupting git annex while it was switching to direct mode</a>.</p>
<p>Maybe one of those situations applies to you? Perhaps take a look at some of the files in misctmp and try to evaluate whether you feel they are safe to delete. They should have somewhat recognizable names. I don't know if running <code>git annex fsck</code> will clean up any of these files (Joey?).</p>
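<p>One way to look through the directory is to list only files that have not been modified for a while, largest first. This is just a sketch: the function name <code>list_old_tmp_files</code> and the 60-minute default are arbitrary choices of mine, <code>-mmin</code> is a GNU/BSD <code>find</code> extension rather than POSIX, and the function only lists files, it deletes nothing. An old modification time does not by itself prove a file is safe to remove.</p>

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a directory whose last
# modification is older than a given number of minutes, with sizes in KiB,
# largest first. Nothing is deleted.
list_old_tmp_files() (
    dir=${1:-.git/annex/misctmp}
    min_age_minutes=${2:-60}
    # -mmin +N matches files modified more than N minutes ago (GNU/BSD find).
    find "$dir" -type f -mmin +"$min_age_minutes" -exec du -k {} + | sort -rn
)
```

<p>For example, <code>list_old_tmp_files .git/annex/misctmp 1440</code> lists files untouched for more than a day.</p>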
<p>I would personally not rush into upgrading from v5. v6 has been deprecated, so the latest git-annex will auto-upgrade v6 to v7 (you can't have a v6 repo anymore). Your only options are therefore staying on v5 or upgrading to v7. There are some significant differences (currently) that you need to be aware of: v7 no longer supports direct mode (it has features that are similar but not equivalent in all situations); v7 (and v6) take control of <code>git add</code>, so files are added to the annex (not git) when you use this command unless you have configured largefiles, which makes it a bit more difficult to maintain repos that have a mix of git and git-annex files; and unlocked/locked files are treated differently.</p>
comment 10http://git-annex.branchable.com/forum/Move_history/comment_10_da64814a7dfc2322200130a17bb79923/Chymera2019-01-21T15:42:51Z2019-01-06T01:25:10Z
It did indeed look like data from an unintended and interrupted "git annex add *". I deleted it, and the space issue is now resolved. Thank you.
comment 11http://git-annex.branchable.com/forum/Move_history/comment_11_46ce6990e5921238f98749af830ac5ec/joey2019-01-21T15:42:51Z2019-01-17T20:01:51Z
<p>There were actually a few ways git-annex could be interrupted and leave
droppings in misctmp.</p>
<p>This is now dealt with: a subsequent run of git-annex will clean up
leftover files from a previous interrupted run.</p>