<p>forum/Not sure how to get my s3 remote back (git-annex, http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/, ikiwiki, 2013-11-27T22:47:37Z)</p>
<p>comment 1 by joey, 2013-04-24T23:56:54Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_1_fffb59ad5a197d2980dd0ec35cf4aafa/)</p>
<p>What you describe, <code>git annex get $file --from remote</code> silently not doing anything, is the expected behavior if the remote doesn't have the file. This allows you to, e.g., run <code>git annex get . --from remote</code> and get all files that the remote does have while skipping the rest.</p>
<p>Did you ever try looking at <code>git annex whereis $file</code> ?</p>
<p>comment 2 by Jason, 2013-04-25T00:20:59Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_2_0cfcc2075bff556b9fde5acc3dc1d599/)</p>
<p>When I run <code>git annex whereis $file</code> it tells me it's available at box, s3, my rsync remote, the original repo (from the machine I created the annex on), and my temporary usb stick repo (nothing wrong with redundancy...). So it seems that git annex thinks the file is available in the s3 remote, even though it's refusing to download it.</p>
<p>comment 3 by joey, 2013-04-25T00:50:01Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_3_6fe2ff1282fb14a4ce26ef8dc775d07e/)</p>
<p>Does the uuid that whereis prints next to the name of the s3 remote match the <code>annex-uuid</code> setting for that remote in .git/config?</p>
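<p>A minimal sketch of that comparison, assuming the text of .git/config is available as a string. The helper name <code>annex_uuids</code> and the sample config are hypothetical and only illustrate the check; the two uuids are the ones that appear later in this thread.</p>

```python
import re

def annex_uuids(config_text):
    """Map remote name -> annex-uuid, read from the text of a .git/config file."""
    uuids = {}
    current = None  # name of the [remote "..."] section being read, if any
    for raw in config_text.splitlines():
        line = raw.strip()
        section = re.fullmatch(r'\[remote "(.+)"\]', line)
        if section:
            current = section.group(1)
        elif line.startswith("["):
            current = None  # some other section, e.g. [core]
        elif current and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "annex-uuid":
                uuids[current] = value
    return uuids

# Hypothetical config contents, for illustration only.
sample_config = '''
[core]
	bare = false
[remote "s3"]
	annex-uuid = 4d86972d-9b0a-4095-bc50-f9bec8144c30
'''

# The uuid that whereis printed next to the s3 remote (assumed here).
whereis_uuid = "1d0ab67c-6a43-11e2-9feb-df22c6d1e308"

configured = annex_uuids(sample_config).get("s3")
if configured != whereis_uuid:
    print("mismatch: .git/config points the s3 remote at", configured)
```

<p>For a quick manual check, <code>git config --get remote.s3.annex-uuid</code> prints the same setting for a remote named s3.</p>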
<p>comment 4 by Jason, 2013-04-25T01:18:00Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_4_64338d2d77dcbabd16b55eb145f40dc6/)</p>
<p>Oh! That might be it! During the whole "I have two remotes with the name s3" situation, it seems that both of them in my .git/config ended up with the same uuid, even though the original one had a different uuid. If I change it back, I end up getting an access denied when I try to <code>git annex get ...</code>. Progress!</p>
<p>I thought that you were supposed to do a <code>git annex initremote s3</code> from a clone to enable a remote with the credentials stored in the repo. It seems that internally something still thinks that the "s3" remote has the new uuid. When I run that command it changes the uuid back to the new (invalid) one.</p>
<p>Is there a way I can totally remove the bad s3 (which I've partially renamed to s3thefirst) remote from my history/repo (I'm pretty sure it's been synced back up to origin at this point) or properly rename it so it doesn't keep getting confused? Hopefully that will address my problem.</p>
<p>comment 5 by joey, 2013-04-25T02:14:18Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_5_dd66c9ea0c83388f6826751944330d10/)</p>
<p>Yes, when you run <code>git annex initremote $remotename</code> with no other parameters, it enables a remote from the stored configuration. That stored configuration does not include <code>AWS_SECRET_ACCESS_KEY</code> and <code>AWS_ACCESS_KEY_ID</code>; you need to set those, and then you should not get access denied.</p>
<p>You seem to say your .git/config contains two remotes with the same name, but I don't think that's possible.</p>
<p>I don't know how you could end up with two remotes with the same name in <code>git show git-annex:remote.log</code>, unless the two were added in separate repositories which were then synced together. Since this is not a usual situation there's not any UI to deal with it. I've just committed a change that will make <code>initremote</code> prefer remotes that have not been marked dead when there's a naming conflict.</p>
<p>However, I'm more curious how this situation came about. I have not been able to reproduce the problem when enabling an S3 remote using the webapp.</p>
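<p>The naming-conflict preference mentioned above can be sketched roughly as follows. This is an illustration only, not git-annex's actual implementation: the function names, the shortened log lines, the placeholder uuids, and the idea of passing the dead uuids in as a set are all assumptions made for the example.</p>

```python
def parse_remote_log(lines):
    """Turn remote.log-style lines into (uuid, settings) pairs."""
    remotes = []
    for line in lines:
        uuid, *fields = line.split()
        settings = dict(field.split("=", 1) for field in fields)
        remotes.append((uuid, settings))
    return remotes

def find_remote(lines, name, dead_uuids=frozenset()):
    """Return the uuid of the remote called `name`, preferring ones not marked dead."""
    matches = [uuid for uuid, settings in parse_remote_log(lines)
               if settings.get("name") == name]
    live = [uuid for uuid in matches if uuid not in dead_uuids]
    candidates = live or matches  # fall back to dead remotes if nothing else matches
    return candidates[0] if candidates else None

# Hypothetical, shortened log lines with placeholder uuids.
log = [
    "uuid-first name=s3 type=S3 timestamp=1359484726.520727s",
    "uuid-second name=s3 type=S3 timestamp=1366828017.8792s",
]

# With the first remote marked dead, the second one is chosen.
print(find_remote(log, "s3", dead_uuids={"uuid-first"}))
```

<p>When no remote with the requested name exists, the sketch returns <code>None</code> instead of guessing.</p>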
<p>comment 6 by joey, 2013-04-25T02:18:56Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_6_dc0c5e395e4c443b7227afdb157194e5/)</p>
<p>What you could do to help track down how this occurred is to check out the <code>git-annex</code> branch, and use <code>git blame</code> to find out when the second remote with the same name was first added to the <code>remote.log</code> file.</p>
<p>Then you should be able to tell, either from the email address used for that commit, or at least the date of the commit, whether this occurred recently when you enabled the S3 remote in the webapp, or perhaps at some time in the past.</p>
<p>comment 7 by Jason, 2013-04-26T19:10:15Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_7_3c0ea4c76cdd889707f7308576e3efa0/)</p>
<p>http://pastebin.com/CM2EfQ21</p>
<p>This is what the commit log looks like for the remote.log file. There is some interesting stuff in here. I'll try to highlight the changes without giving too much of the important bits away.</p>
<p>The commit at 2013-04-22 11:57 is when I added the box remote:</p>
<pre>
0490d177-78e2-421b-a004-47d88ee7a2e3 chunksize=10mb cipher=... davcreds=... embedcreds=yes name=box.com type=webdav url=https://www.box.com/dav/annex timestamp=1366657062.972357s
1d0ab67c-6a43-11e2-9feb-df22c6d1e308 bucket=annex-1d0ab67c-6a43-11e2-9feb-df22c6d1e308 cipher=... datacenter=US host=s3.amazonaws.com name=annex port=80 storageclass=REDUCED_REDUNDANCY type=S3 timestamp=1359484726.520727s
</pre>
<p>The contents also include my nas remote, but I will omit that for brevity's sake. I did notice that initially the s3 remote was named "annex". That was probably the web interface's doing, way back when I added it.</p>
<p>The next commit at 2013-04-24 10:55 seems to have added encryption=shared and highRandomQuality=false to the nas remote (I think this was when I re-enabled the nas remote through the webapp).</p>
<p>The commit at 2013-04-24 11:05 looks like it added similar stuff to the box remote (added highRandomQuality=false). Probably this was from enabling it then as well.</p>
<p>At 2013-04-24 11:12 the s3 remote had highRandomQuality=false added also.</p>
<p>At 2013-04-24 11:26, a new remote was added:</p>
<pre>
4d86972d-9b0a-4095-bc50-f9bec8144c30 bucket=s3-4d86972d-9b0a-4095-bc50-f9bec8144c30 cipher=... datacenter=US host=s3.amazonaws.com name=s3 port=80 storageclass=STANDARD type=S3 timestamp=1366828017.8792s
</pre>
<p>Very possibly this was me doing a <code>git annex initremote ...</code> thinking that the s3 remote was actually named s3 (somehow, I feel like I would have checked that, but I'm going to chalk that up to my own stupidity).</p>
<p>Then at 2013-04-24 11:35, the new s3 remote was changed... but it seems like only the timestamp was altered. I suspect this was from another command line change, but I don't remember exactly what I did at that point. Probably a reference in a different file was also modified, but I'm not looking at those.</p>
<p>At 2013-04-24 11:37, again the new s3 remote was changed, but again it was just the timestamp.</p>
<p>In the merge at 2013-04-24 15:15, a bunch of things happened. This may be where stuff went wrong. I do find it weird because it should have just been a fast forward, given what the history looks like. I suspect that this was caused by a <code>git annex sync</code>, but I'm not 100% sure.</p>
<p>In this commit the following happened:</p>
<ul>
<li>The box remote was duplicated (with different davcreds and one having highRandomQuality=false)</li>
<li>The annex remote was duplicated (with highRandomQuality=false in one)</li>
<li>The nas remote was duplicated (one with encryption=shared and highRandomQuality=false and the other without)</li>
</ul>
<p>In addition, within that commit, my uuid.log file also had duplication that seems to be where part of the confusion comes from:</p>
<ul>
<li>The 1d0ab67c-6a43-11e2-9feb-df22c6d1e308 remote shows up twice, once named "annex" and the other time named "s3".</li>
<li>The 4d86972d-9b0a-4095-bc50-f9bec8144c30 remote is only included once in there, but its name is also "s3".</li>
<li>Other remotes are duplicated, with different timestamps, but no overlapping uuids.</li>
</ul>
<p>Then at 2013-04-24 18:13, I think things try to fix themselves:</p>
<ul>
<li>The older box remote (I guess based on timestamp) is removed. Now there's only one.</li>
<li>The older 1d0ab67c-6a43-11e2-9feb-df22c6d1e308 remote (still named annex) is removed. Now there's only one there too.</li>
<li>The single 4d86972d-9b0a-4095-bc50-f9bec8144c30 remote is updated with a new timestamp.</li>
<li>The older nas remote is also removed.</li>
</ul>
<p>No duplicates exist in this file and no cross-references exist either.</p>
<p>The uuid.log file seems to be the place where the annex remote is renamed to s3. I have no idea what caused that, but it was probably me.</p>
<ul>
<li>In 2013-04-24 11:12, everything is fine in the uuid.log file. The annex timestamp is updated, but no problems.</li>
<li>In 2013-04-24 11:13 (which doesn't show up when I look at the remote.log changes, because it didn't change that), a file's location log is updated and the 1d0ab67c-6a43-11e2-9feb-df22c6d1e308 remote is renamed from annex to s3 in uuid.log, but not in remote.log.</li>
<li>In 2013-04-24 11:26, 4d86972d-9b0a-4095-bc50-f9bec8144c30 is added to remote.log with the name s3 and to uuid.log with the name s3 (which is now a duplicate of the renamed 1d0ab67c-6a43-11e2-9feb-df22c6d1e308, but only in uuid.log).</li>
</ul>
<p>All of this seems horribly confusing and I don't envy your trying to unwind it.</p>
<p>comment 8 by joey, 2013-04-26T20:06:34Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_8_36519ee4499a19f0864e4fcd264e9933/)</p>
<p>Most of this is perfectly normal. The duplication of lines is normal; when two git-annex branches are union merged, it's as if it runs <code>cat branch1:file branch2:file | uniq > file</code>. When there are conflicting lines for the same uuid, the one with the newest timestamp is used.</p>
<p>The description of the remote in uuid.log is also not relevant to this bug.</p>
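<p>The merge rule described above (keep the lines from both branches, then let the newest timestamp win per uuid) can be sketched like this. It is a simplified illustration, not git-annex's own code: the function names are made up for the example, and the log lines are shortened versions of the box lines discussed in this thread.</p>

```python
def parse_line(line):
    """Split a remote.log-style line into (uuid, settings, timestamp)."""
    uuid, *fields = line.split()
    settings = dict(field.split("=", 1) for field in fields)
    # Timestamps look like "1366826729.945023s"; drop the trailing "s".
    timestamp = float(settings.pop("timestamp").rstrip("s"))
    return uuid, settings, timestamp

def union_merge(lines_a, lines_b):
    """Keep every distinct line from both branches (a simplification of cat | uniq)."""
    merged = []
    for line in lines_a + lines_b:
        if line not in merged:
            merged.append(line)
    return merged

def effective_config(lines):
    """For each uuid, keep only the settings carrying the newest timestamp."""
    newest = {}
    for line in lines:
        uuid, settings, timestamp = parse_line(line)
        if uuid not in newest or timestamp > newest[uuid][1]:
            newest[uuid] = (settings, timestamp)
    return {uuid: settings for uuid, (settings, _) in newest.items()}

box = "0490d177-78e2-421b-a004-47d88ee7a2e3"
branch1 = [box + " embedcreds=yes highRandomQuality=false name=box.com type=webdav timestamp=1366826729.945023s"]
branch2 = [box + " embedcreds=yes name=box.com type=webdav timestamp=1366657062.972357s"]

merged = union_merge(branch1, branch2)  # both lines survive the merge
config = effective_config(merged)       # the newer line wins for this uuid
print(len(merged), config[box].get("highRandomQuality"))
```

<p>Seeing two lines for the same uuid after a merge is therefore expected; only the line with the newest timestamp takes effect.</p>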
<p>This is the key part:</p>
<blockquote><p>The box remote was duplicated (with different davcreds and one having highRandomQuality=false)</p></blockquote>
<p>As you note, 2013-04-24 15:15 was a merge. So there must have been two branches before, which had different box remotes with different davcreds.</p>
<p>It would probably help if you can paste those lines as they looked after that merge (omitting most of the davcreds).</p>
<p>Also, I'd like to see the box line from the 11:05 commit.</p>
<p>comment 9 by Jason, 2013-04-26T20:21:05Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_9_85b23f375e53469fb09b24b945b3aba9/)</p>
<p>Two box lines after 15:15 merge:</p>
<pre>
0490d177-78e2-421b-a004-47d88ee7a2e3 chunksize=10mb cipher=... davcreds=... embedcreds=yes highRandomQuality=false name=box.com type=webdav url=https://www.box.com/dav/annex timestamp=1366826729.945023s
0490d177-78e2-421b-a004-47d88ee7a2e3 chunksize=10mb cipher=... davcreds=... embedcreds=yes name=box.com type=webdav url=https://www.box.com/dav/annex timestamp=1366657062.972357s
</pre>
<p>After the 11:05 commit, the box line looked like this:</p>
<pre>
0490d177-78e2-421b-a004-47d88ee7a2e3 chunksize=10mb cipher=... davcreds=... embedcreds=yes highRandomQuality=false name=box.com type=webdav url=https://www.box.com/dav/annex timestamp=1366826729.945023s
</pre>
<p>I am curious why you want to know about box, when s3 is the one that I'm having trouble with...</p>
<p>comment 10 by joey, 2013-04-26T20:33:03Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_10_ed35a6ec605e8f79ec107856af6d1a46/)</p>
<p>Oh.. I got confused by you talking about the box remote. Lines you pasted look ok anyway.</p>
<p>Ok, looking at the S3 remote then...</p>
<blockquote><p>I did notice that initially the s3 remote was named "annex". That was probably the web interface's doing, way back when I added it.</p></blockquote>
<p>So, you can never change the names used to refer to remotes in remote.log. These names can be different from the names used to refer to the same remotes in .git/config (which can vary from repository to repository anyway). So, if you originally added an S3 remote and called it "annex", you still need to use that name when running initremote elsewhere to add that remote to your repository.</p>
<p>The remote with name "s3" added in the 11:26 commit is a separate S3 remote, and I think one you don't want. (And have marked dead?)</p>
<p>I think all you need to do is <code>git annex initremote annex</code> to add the s3 remote you want to your new repository.</p>
<p>comment 11 by Jason, 2013-04-26T20:37:51Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_11_e48b6efa42159dc83e1be11bfb54abcd/)</p>
<p>Ah, I see. It looks like that did solve my problem.</p>
<p>Yes, I did mark the old s3 remote as dead.</p>
<p>At least now I know how to fix it if it ever happens again. I wonder if I'll ever be able to recreate it...</p>
<p>Thanks!</p>
<p>comment 12 by joey, 2013-04-26T20:52:36Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_12_b58232d0e3fa4649565c0c7d4ce2e82e/)</p>
<p>It's easy to recreate. As I understand it, the entire process went something like this:</p>
<pre>
git annex initremote annex type=S3 encryption=blahblah # possibly this was done in the webapp?
git remote rename annex s3 # also possibly done in the webapp
# clone to different computer, and on the new clone:
git annex initremote s3
git-annex: Specify the type of remote with type=
git annex initremote s3 type=S3 encryption=blahblah
</pre>
<p>The last line creates a <em>new</em> remote.</p>
<p>I'm inclined to think the main confusing thing here is that initremote is used both to create a new special remote and to configure the repository to use an already existing special remote that was created elsewhere. If you had to use <code>enableremote</code> for the latter, things could be less confusing:</p>
<pre>
# clone to different computer, and on the new clone:
git annex enableremote s3
git-annex: No existing special remote named s3. Choose from one of these existing special remotes: annex
</pre>
<p>comment 13 by Jason, 2013-04-26T21:16:30Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_13_85368b60091dc3ce2efb58013ffe9f83/)</p>
<p>I tend to agree with you. At first I liked the idea that initremote could be used to re-initialize a remote, but then I got confused about what the name of that remote was. I suppose git annex status could have told me. I kept wanting to have something like "git annex remote" (which would list them) and then "git annex remote init" to initialize them. That way the remote actions would follow the same sort of interface as "git remote", where you could list, init, create, edit, rename, enable, disable, kill (dead?), etc. The main drawback I see with that is having too many levels to type.</p>
<p>I really like the idea of having the ability to "git annex remote show s3" and it will tell me what the type, uuid, options, etc are for that remote.</p>
<p>comment 14 by joey, 2013-04-26T22:25:04Z (http://git-annex.branchable.com/forum/Not_sure_how_to_get_my_s3_remote_back/comment_14_e65281bef23e0076936c508728a87897/)</p>
<p>Have now split out an enableremote command.</p>
<pre>
joey@gnu:~/tmp/annex>git annex initremote foo
git-annex: There is already a special remote named "foo". (Use enableremote to enable an existing special remote.)
joey@gnu:~/tmp/annex>git annex enableremote
git-annex: Specify the name of the special remote to enable. Known special remotes: foo
</pre>
<p>Also, I wrote something wrong before. It <em>is</em> possible to change the name used by initremote (now enableremote).</p>
<p>With the current release of git-annex:</p>
<p><code>git annex initremote annex name=mys3</code></p>
<p>With the next release:</p>
<p><code>git annex enableremote annex name=mys3</code></p>