syncgit-annexhttp://git-annex.branchable.com/sync/git-annexikiwiki2023-03-21T18:08:04Zvery nicehttp://git-annex.branchable.com/sync/comment_1_59681be5568f568f5c54eb0445163dd2/Adam2013-11-27T22:47:37Z2012-02-25T15:02:18Z
Here's a way to get from a starting point of two or more peer directory trees <em>not</em> tracked by git or git-annex, to the point where they can be synced in the manner described above: <a href="http://git-annex.branchable.com/forum/syncing_non-git_trees_with_git-annex/">syncing non-git trees with git-annex</a>
Its just grandhttp://git-annex.branchable.com/sync/comment_2_9301ff5e81d37475f594e74fbe32f24e/Daniel2013-11-27T22:47:37Z2013-01-04T14:45:35Z
<p>I came upon git-annex a few months ago. I saw immediately how it could help with some frustrations I've been having. One in particular is keeping my vimrc in sync across multiple locations and platforms. I finally took the time to give it a try after I hit my boiling point this morning. I went through the <a href="http://git-annex.branchable.com/walkthrough/">walkthrough</a> and now I have an annex everywhere I need it. <code>git annex sync</code> and my vimrc is up-to-date, simply grand!</p>
<p>Thanks so much for making git-annex,
<a href="http://woz.io">Daniel Wozniak</a></p>
synchronising stored files with a bare repositoryhttp://git-annex.branchable.com/sync/comment_3_49560003da47490e4fabd4ab0089f2d7/Diggory2013-11-27T22:47:37Z2013-01-11T16:52:38Z
Good for syncing indexes, but if I want to synchronise all data files too (specifically pushing to a remote bare repository), how do I do that?
comment 4http://git-annex.branchable.com/sync/comment_4_cf29326408e62575085d1f980087c923/joeyh.name2013-11-27T22:47:37Z2013-01-11T18:18:07Z
Yes, sync only syncs the git branches, not git-annex data. To sync the data, you can run a command such as <code>git annex copy --to bareremote</code>. You could run that in cron. Or, the <a href="http://git-annex.branchable.com/assistant/">assistant</a> can be run as a daemon, and automatically syncs git-annex data.
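<p>A minimal sketch of the cron approach mentioned above, assuming a repository path and a remote named <code>bareremote</code>. The function name <code>copy_annex_content_to</code> and the crontab line are hypothetical; only <code>git annex sync</code> and <code>git annex copy --to</code> come from the comment.</p>

```shell
# Sketch: sync git branches, then copy annexed file content to one remote.
# "copy_annex_content_to" and its arguments are hypothetical names.
copy_annex_content_to() {
    repo=$1      # path to the local git-annex repository
    remote=$2    # name of the remote that should receive content
    (
        cd "$repo" || exit 1
        # sync handles the git branches only
        git annex sync "$remote" &&
        # copy sends the annexed file content itself
        git annex copy --to "$remote"
    )
}

# Example call, e.g. from a script run hourly by cron (assumed path):
# copy_annex_content_to "$HOME/annex" bareremote
```

<p>Running both steps from one function keeps the cron entry to a single line; whether hourly is the right interval depends on how often files change.</p>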
How to sync content with git-annex, not assistanthttp://git-annex.branchable.com/sync/comment_5_18c396c59907147bb2bf713e55392b6b/chocolate.camera2013-11-27T22:47:37Z2013-10-11T09:58:12Z
Sure, the assistant can sync git-annex data across remotes. But how do I tell a repo to sync git-annex data without having to work out manually exactly what needs to be copied from or to where?
Syncing only a specific branchhttp://git-annex.branchable.com/sync/comment_6_012e9d4468d0b88ee3c5dad3911c3606/Dav2013-11-27T22:47:37Z2013-11-24T17:48:22Z
<p>By default, <code>git annex sync</code> will sync to all remotes, unless you specify a remote. So, I have to specify, e.g., <code>git annex sync origin</code>. I can simplify this with aliases, I suppose, but I do a lot of teaching non-programmer scientists... so it'd be nice to be able to configure this (so beginning users don't have to keep track of as many things).</p>
<p>Is there (or will there be) a way to do this?</p>
comment 7http://git-annex.branchable.com/sync/comment_7_6276e100d1341f1a0be368f54de0ae7b/joeyh.name2013-11-27T22:47:37Z2013-11-26T20:08:33Z
I feel that syncing with all remotes by default is the right thing for git annex sync to do.
Use case for not syncing to all remoteshttp://git-annex.branchable.com/sync/comment_8_b89161c82c05634d35f6b65bf8360a96/Dav2013-12-08T19:42:27Z2013-12-08T19:20:26Z
<p>Just in case you haven't considered such a scenario - maybe you have suggestions for how to collaborate more effectively with git annex (and avoid warning messages):</p>
<p>I'm trying to teach beginning scientist programmers (mostly graduate students), and a common scenario is to fork some scientific code. I'd like forking on github to be mundane, and not trigger warnings, and generally have as little for folks to explicitly keep track of as possible (this seems to be a common concern we share, which leads you to prefer syncing to all remotes without the option to configure the default behavior!).</p>
<p>However, I am currently working with students on forking and fixing up scientific code where the upstream maintainer doesn't want to allow pushes upstream, except via pull request. So, part of our approach is to set up some common shared datasets in git annex (and these just end up in our fork). If we have an "upstream" remote, git annex will try to sync with it, and report an error.</p>
<p>So - that's why I'd like to be able to configure the deactivation of syncing to a defined remote (e.g., "upstream"). However, if you have other suggestions to smooth the workflow, I would also like to hear those!</p>
comment 9http://git-annex.branchable.com/sync/comment_9_849883b7cc05bfcb01914d8737098010/joeyh.name2013-12-12T17:54:55Z2013-12-12T17:54:55Z
<p>@Dav what kind of url does the upstream remote have? Perhaps it would be sufficient to make sync skip trying to push to git:// and http[s]:// remotes. Both are unlikely to accept pushes and in the cases where they do accept pushes it would be fine to need a manual <code>git push</code>.</p>
<p>Anyway, you can already configure which remotes get synced with. From the man page:</p>
<pre>
remote.&lt;name&gt;.annex-sync
If set to false, prevents git-annex sync (and the git-annex
assistant) from syncing with this remote.
</pre>
<p>So <code>git config remote.upstream.annex-sync false</code></p>
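<p>As a small sketch of that setting: <code>git config</code> takes the key and the value as separate arguments. The helper name <code>disable_annex_sync</code> is hypothetical; the config key is the one quoted from the man page.</p>

```shell
# Sketch: stop "git annex sync" from contacting one remote.
# "disable_annex_sync" is a hypothetical helper name.
disable_annex_sync() {
    remote=$1
    # Key and value are separate arguments, not key=value.
    git config "remote.$remote.annex-sync" false
}

# Usage: disable_annex_sync upstream
```

<p>The setting lives in the local <code>.git/config</code>, so each clone would need to set it for itself.</p>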
Sorry to just be getting back...http://git-annex.branchable.com/sync/comment_10_2cd8ab86f498d6f676f859b552f831eb/Dav2014-01-26T22:51:30Z2014-01-26T22:51:28Z
The URLs in question in this case were read-only github https URLs. In any case, my problems are solved by what you've already suggested. I think a less error-sounding response to read-only https repos sounds nice!
sync slow with content switchhttp://git-annex.branchable.com/sync/comment_11_7683879f6982c0eb0aa39b66ff5a5ea9/Matthias2014-04-22T20:37:05Z2014-04-22T20:37:05Z
<p>I noticed that in a test with 2 local repositories and around 2'000 files "git annex sync" is still very fast, but "git annex sync --content" takes multiple seconds. Is this avoidable?</p>
<p>I have a central repo and client repos. I want to copy all content to the central repo after a commit. Right now, I use "git annex group central backup", "git annex wanted central standard", and a hook that triggers "git annex sync --content" after each commit. Maybe there is a more efficient way to do this? Thanks for sharing thoughts.</p>
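<p>The hook setup described above might look roughly like the following. This is only a sketch: <code>install_sync_hook</code> is a hypothetical helper, it assumes a non-bare repository with a <code>.git</code> directory, and it does nothing to make <code>git annex sync --content</code> itself faster.</p>

```shell
# Sketch: install a post-commit hook that runs "git annex sync --content".
# "install_sync_hook" is a hypothetical helper; it refuses to overwrite
# an existing hook.
install_sync_hook() {
    hook="$1/.git/hooks/post-commit"
    if [ -e "$hook" ]; then
        echo "hook already exists: $hook" >&2
        return 1
    fi
    mkdir -p "$1/.git/hooks" || return 1
    # Write a two-line hook script and make it executable.
    printf '%s\n' '#!/bin/sh' 'git annex sync --content' > "$hook" &&
        chmod +x "$hook"
}

# Usage: install_sync_hook /path/to/client-repo
```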
Sync specific branch or ignore a branch during synchttp://git-annex.branchable.com/sync/comment_12_2fea14fa314ddb7ab645a5cca5a95fd9/mshri [livejournal.com]2014-04-25T15:37:53Z2014-04-25T15:37:53Z
<p>I too feel that syncing all remotes by default is the right thing to do, but I think it should be limited to the 'master' and 'git-annex' branch. I often create branches that I want to keep local and do not want them to be synced. But I want 'master' and 'git-annex' branches to be synced with all remotes.</p>
<p>So it would be nice to be able to set an option to sync all branches or just the 'master' and 'git-annex' branches, or to be able to ignore some branches during git annex sync.</p>
<p>Shri</p>
comment 13http://git-annex.branchable.com/sync/comment_13_690f66be9cefe28844d8df653b7a0331/zardoz2014-05-15T08:28:09Z2014-05-15T08:28:09Z
<p>I agree with mshri. It’s confusing to have every local branch wind up on every remote (and it hinders «git annex unused»).</p>
<p>I tried working around this by just including relevant branches in the «fetch» refspec, but this will only work until another remote pushes the branches again.</p>
comment 14http://git-annex.branchable.com/sync/comment_14_db342785a4dade30b5b75cb95031bed1/zardoz2014-05-15T08:58:26Z2014-05-15T08:58:26Z
Added a wishlist item <a href="http://git-annex.branchable.com/todo/Allow_syncing_only_selected_branches/">http://git-annex.branchable.com/todo/Allow_syncing_only_selected_branches/</a>
comment 15http://git-annex.branchable.com/sync/comment_15_168e0ab10b4084e13df1a3058fa7e8a9/joeyh.name2014-05-15T19:53:16Z2014-05-15T19:53:16Z
We seem to have some rumor going around that <code>git annex sync</code> pushes all branches. It does not. It pushes only the git-annex branch and the currently checked out branch.
comment 16http://git-annex.branchable.com/sync/comment_16_96096f994fc55f921f2b24b274f998f7/joeyh.name2014-05-15T19:54:54Z2014-05-15T19:54:54Z
@Matthias, <code>git annex sync --content</code> has to check each file to see if any other repository wants it. This is necessarily going to get slow when there are a lot of files. The assistant does a similar syncing but uses some tricks to avoid scanning all the files too often, while still managing to keep them all in sync -- it can do this since it's a long-running daemon and is aware when files have changed.
comment 17http://git-annex.branchable.com/sync/comment_17_44a4ae4685c4bf2b4e7c61897eb3ff80/Matthias2015-01-22T22:04:09Z2015-01-22T22:04:09Z
<p>git sync … >> fetches from each remote</p>
<p>Well, I have two git annex-ed repositories where "git remote -v" properly lists the other repo, and "git annex sync foo" manages to pull from foo, but "git annex sync" without a remote name simply does a local sync. Also, neither command pushes anything anywhere.</p>
<p>So, where does "git annex" get its list of remotes from? What could prevent it from accessing them?</p>
comment 18http://git-annex.branchable.com/sync/comment_18_838fb249cd5be83962770d5cc394389a/joey2015-02-04T19:36:50Z2015-02-04T19:12:23Z
<p>If a remote has "remote.&lt;name&gt;.annex-sync" set to false in the git
config, <code>git-annex sync</code> will skip that remote unless you specify the name.
That's probably what's going on in your case.</p>
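<p>One way to check for that, sketched with plain <code>git config</code>: list every remote whose <code>annex-sync</code> value is literally <code>false</code>. The helper name <code>remotes_with_sync_disabled</code> is hypothetical.</p>

```shell
# Sketch: list remotes whose annex-sync setting is false.
# "remotes_with_sync_disabled" is a hypothetical helper name.
remotes_with_sync_disabled() {
    # --get-regexp prints "key value" for every matching config key;
    # sed keeps only the remote name of entries set to false.
    git config --get-regexp '^remote\..*\.annex-sync$' |
        sed -n 's/^remote\.\(.*\)\.annex-sync false$/\1/p'
}

# Usage: remotes_with_sync_disabled
```

<p>An empty result would suggest the cause lies elsewhere.</p>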
git-config for manual sync-like operationshttp://git-annex.branchable.com/sync/comment_19_d409ad20c5a6671f0d0b834232368030/clacke2016-04-14T08:21:03Z2016-04-14T08:21:03Z
<p>My way of working with git-annex doesn't seem to mesh well with the Assistant or even with <code>git annex sync</code>. I seem to have a bit of a control need when it comes to what gets committed when. But here's my workflow approximating what it does, with a twist. I have this in git config on <code>mylaptop</code>:</p>
<pre><code>remote.myserver.fetch=+refs/heads/*:refs/remotes/myserver/*
remote.myserver.push=refs/heads/*:refs/remotes/mylaptop/*
remote.myserver.push=refs/heads/master:refs/heads/master
remote.myserver.push=refs/heads/git-annex:refs/heads/git-annex
</code></pre>
<p>I don't need a <code>synced/git-annex</code>. If upstream is not up-to-date I fetch and merge. In this case upstream happens to be a bare git repo, so I don't need <code>synced/master</code> either. If upstream is non-bare, I use <code>synced/master</code> -- or sometimes I keep upstream usually checked out on an orphan branch and just switch into master to check things and then switch away to avoid conflict. If I can avoid it, I prefer not to have several branches where I don't know which one is the latest one.</p>
<p>But here's the twist, look at this row:</p>
<pre><code>remote.myserver.push=refs/heads/*:refs/remotes/mylaptop/*
</code></pre>
<p>If I just do <code>git push</code>, close the lid and run into the forest, it may or may not have a non-fastforward event on master and git-annex ... but it always succeeds in pushing to the <code>mylaptop</code> remote on my server.</p>
<p>If I have added a batch of files, I usually push first to all my remotes, to get that precious metadata up there. At that point I don't care if there's a conflict upstream. Then I <code>git annex copy</code> to wherever, fetch all remotes, <code>git annex merge</code>, maybe merge <code>master</code> if I have to (usually not), then push to all remotes again. It's less of a bother than it sounds like. I don't even have any handy aliases for this, I prefer to just get the for loop from my command-line history.</p>
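<p>The cycle described above can be sketched as a shell function. This is an illustration rather than the author's actual loop: <code>manual_sync</code> and its argument are hypothetical names, and it assumes push refspecs like the ones shown earlier are already configured, so a bare <code>git push</code> per remote does the right thing.</p>

```shell
# Sketch of the manual push / copy / fetch / merge / push cycle.
# "manual_sync" and its argument are hypothetical names.
manual_sync() {
    content_remote=$1   # remote that should receive file content

    # Push to every remote first; a rejected push is tolerated here.
    for r in $(git remote); do
        git push "$r" || true
    done

    git annex copy --to "$content_remote"
    git fetch --all
    git annex merge

    # Push again now that the fetched changes have been merged.
    for r in $(git remote); do
        git push "$r" || true
    done
}

# Usage: manual_sync myserver
```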
Branch names containing slasheshttp://git-annex.branchable.com/sync/comment_20_537b27219871a565ae7bb7f357cd3793/kartynnik2016-08-29T17:30:44Z2016-08-29T17:30:44Z
<p>1) When I have a branch "some/branch/name" containing slashes in its name, git-annex sync strips everything up to the last slash and creates "synced/name", which may clash with "some/other/name". Is there a workaround?</p>
<p>2) Could the "don't use synced branches" behavior referred to in the comments above somehow be configured on the repository side so that everyone cloning it doesn't need to configure it for himself?</p>
Re: Branch names containing slasheshttp://git-annex.branchable.com/sync/comment_21_9c0368eb796f1191c22c186cbb06c642/joey2016-09-21T20:15:38Z2016-09-21T18:56:22Z
<p>@kartynnik, that's a bug: <span class="createlink">sync uses conflicting names for deep branches</span></p>
<p>Please file bugs there and not as comments here, it's too easy to lose
track of a comment deep in a thread.</p>
Compressed file transfershttp://git-annex.branchable.com/sync/comment_22_e83ed5c0034c48baed7943c596f708ae/mario2017-05-03T20:52:43Z2017-05-03T20:52:43Z
<p>Hi,</p>
<p>How does "git-annex sync --content" transfer its files to a (regular) ssh remote? I think it uses rsync. Is that correct?</p>
<p>I want to use compression for the file transfers. Therefore, I tried in .git/config to set:</p>
<pre><code>[remote "origin"]
annex-rsync-upload-options = "--compress"
</code></pre>
<p>However, it seems that this crashes the upload. The sync just seems to hang. Is it possible to use compression for the transfer? How?</p>
comment 23http://git-annex.branchable.com/sync/comment_23_e5e7ec9fbafe5e0429161b977e483752/joey2017-05-09T18:03:26Z2017-05-09T17:52:11Z
<p>@mario, great question! (Not the best place for such a question; start a
thread on the forum next time.)</p>
<p>git-annex does use rsync when transferring files between ssh remotes.
Rsync normally goes over ssh, and it might be better to enable compression
at the ssh level. For example, I have "Compression yes" in <code>~/.ssh/config</code></p>
<p>I think that the reason your annex-rsync-upload-options setting broke
it is that rsync needs --compress to be passed on to the other
rsync process (in the remote repository), and that is run via
git-annex-shell, which has a whitelist of options it will pass to rsync.
Passing arbitrary options to rsync could allow unwanted behavior
when git-annex-shell is being used as a security barrier. And --compress is
one of the options that both the rsync sender and receiver have to agree
on for the rsync protocol to work.</p>
<p>I have added a note to the man page about this limitation of what
the rsync-options settings can be used to do.</p>
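<p>As a rough sketch of the ssh-level approach, the following appends a <code>Compression yes</code> stanza for one host to an ssh client config. Scoping it to a single <code>Host</code> block is an assumption made here, and <code>enable_ssh_compression</code> and its arguments are hypothetical names.</p>

```shell
# Sketch: append a host-specific "Compression yes" stanza to an ssh config.
# "enable_ssh_compression" and both arguments are hypothetical.
enable_ssh_compression() {
    host=$1                         # host name the remote's URL uses
    config=${2:-$HOME/.ssh/config}  # ssh client config file
    printf 'Host %s\n    Compression yes\n' "$host" >> "$config"
}

# Usage: enable_ssh_compression myserver.example.com
```

<p>ssh uses the first value it finds for each option, so an earlier matching block that already sets <code>Compression</code> would take precedence over a stanza appended at the end.</p>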
sync only git-annex branchhttp://git-annex.branchable.com/sync/comment_24_1d1eb1bddc835644c7f9d6e83e09b752/Dan2019-07-18T19:52:35Z2019-07-18T19:52:35Z
<p>I've finally taken the time to learn git-annex and am extraordinarily impressed by its usefulness and documentation.</p>
<p>I'm currently using git-annex as part of a scientific workflow, wherein I use git to track my analysis source code and LaTeX reports, and git-annex to handle large binary files (typically input data).
<code>git annex sync</code> is really handy for making sure my <code>git-annex</code> branch propagates between my remotes, and it's hard to beat the usefulness of <code>git annex sync --content</code> now that I've wrapped my head around <a href="http://git-annex.branchable.com/preferred_content/standard_groups/">standard groups</a>.
However, I'd prefer if there was a flag (or configurable option) to suppress <code>git annex sync</code> from pushing/pulling whatever branch currently happens to be checked out.
I'm a pretty thoughtful committer and want more control over where my code branches (e.g., <code>master</code>) get pushed around.
I saw the <code>--no-pull</code> and <code>--no-push</code> options for <code>git annex sync</code>, but it seems that these suppress <em>all</em> push/pull behavior, and thus <code>git annex sync --no-push --no-pull</code> will not sync up my special <code>git-annex</code> branch.
Is there an option or workflow that accomplishes what I'm looking for?</p>
<p>TLDR
I want a way to tell <code>git annex sync</code> to leave my <code>master</code> (or whatever branch is currently checked out) alone (no pushing/pulling), but otherwise behave normally (e.g., <code>git annex sync</code> will just push/pull my special <code>git-annex</code> branch around, or <code>git annex sync --content</code> will push/pull the special <code>git-annex</code> branch, and also move content around as it makes sense).
Apologies if this is already possible, but I haven't been able to figure it out.</p>
Re: sync only git-annex branchhttp://git-annex.branchable.com/sync/comment_25_7e69a963102ceaa5b691ad9ed15c5a42/joey2019-07-19T18:49:48Z2019-07-19T16:55:33Z
<p>@Dan, there's an open todo about that,
<a href="http://git-annex.branchable.com/todo/sync_--branches__to_sync_only_specified_branches___40__e.g._git-annex__41__/">http://git-annex.branchable.com/todo/sync_--branches__to_sync_only_specified_branches___40__e.g._git-annex__41__/</a></p>
<p>Please followup there if the suggested new option would work for you.</p>
+1 for a command to sync only the git-annex branchhttp://git-annex.branchable.com/sync/comment_25_ea6f4e1f5ab31cd74d67fe95e83084cb/Ilya_Shlyakhter2019-07-19T18:16:08Z2019-07-19T18:16:08Z
I've also missed this functionality. One use is to sync the <a href="http://git-annex.branchable.com/git-annex-metadata/">metadata</a>.
Duplicate content creates frustrating cycleshttp://git-annex.branchable.com/sync/comment_27_8925805ff8902d7b2d1f47c1395aadc7/dscheffy2020-12-16T17:10:52Z2020-12-16T17:10:52Z
<p>I'm currently cleaning up 3 machines (with the goal of eventually upgrading my OS's) and 2 large external drives filled with 10 plus years of backups, so my current situation is somewhat temporary and may not apply to others.</p>
<p>I've started using preferred content to manage which repos hang onto which content. My main cleanup workflow involves moving files into a staging repository and then adding them to the annex -- then letting the preferred content settings figure out where to send the content. If I know exactly where I want the content to go, I'll move it directly into the appropriate folder, but if I haven't figured that out yet, sometimes I'll just put it in a <code>stage</code> folder. I've simplified my preferred content settings to assume that I only have one <code>big</code> external drive where everything except the contents of the <code>stage</code> directory should go, but in reality it's split up a bit across the two drives I already mentioned...</p>
<pre><code>$ git annex wanted big
include=* and exclude=stage/*
$ git annex wanted stage
include=stage/*
</code></pre>
<p>I noticed the other day that I had some missing content in <code>big/photo/raw</code>, so I went into that folder and ran <code>git annex get .</code> to rehydrate the missing files.</p>
<p>Today I staged some new files and ran the following from my staging annex:</p>
<pre><code>git annex add stage
git commit -m 'stage some new photos'
git annex sync --content
</code></pre>
<p>This is when I noticed some weirdness:</p>
<pre><code>pull big
...
ok
(merging big into stage...)
(recording state in git...)
copy photos/raw/pict0001.jpg (to big...)
SHA256E-abc--xyz.jpg
(checksum...) ok
drop photos/raw/pict0001.jpg ok
...
get stage/cats.jpg (from big...)
SHA256E-abc--xyz.jpg
(checksum...) ok
drop big stage/cats.jpg ok
pull big
</code></pre>
<p>Basically, if two copies of the same content live in two different files that have an affinity to two or more mutually exclusive annexes, it seems like the rule that applies to the last file in the directory tree is arbitrarily going to be the one that wins out in the end. It also means if you have such a situation, you're going to see a strange dance like this every time you run <code>git annex sync --content</code> as the content moves across annexes only to make its way back to where it started.</p>
<p>I'm currently running v6.2, so maybe this has been fixed in the interim. Has anybody else seen this? Do standard groups address this problem? I started out trying to use standard groups, but fell back on my own custom folder definitions when I couldn't figure out how to keep my standard groups from grabbing more content than I wanted them to.</p>
<p>Thanks!</p>
comment 28http://git-annex.branchable.com/sync/comment_28_e5c5da7fc0d0d034bfae33481f1ae067/joey2020-12-17T20:35:35Z2020-12-17T16:45:20Z
<p>@dscheffy,
<a href="https://git-annex.branchable.com/bugs/indeterminite_preferred_content_state_for_duplicated_file/">https://git-annex.branchable.com/bugs/indeterminite_preferred_content_state_for_duplicated_file/</a></p>
`git annex sync` not automatically syncing gcrypt remotes using relative pathshttp://git-annex.branchable.com/sync/comment_29_161c5d3f693de45070e037d27ee7e8aa/talmukoydu2023-03-19T19:20:44Z2023-03-19T19:20:44Z
<p>@joey Is this a bug or am I missing something?</p>
<p>Notes:</p>
<ul>
<li>I am using the latest git-remote-gcrypt, version 1.5</li>
</ul>
<p>Flow 1</p>
<ul>
<li><code>git remote add test gcrypt::rsync://user@user.rsync.net:relative/path/to/repo</code></li>
<li><code>git annex sync</code> -> DOES NOT SYNC to test remote</li>
<li>Nothing has been synced so I CANNOT successfully clone from the test remote with <code>git clone gcrypt::rsync://user@user.rsync.net:relative/path/to/repo</code></li>
<li><code>git push test git-annex master</code></li>
<li>I can successfully clone from the test remote with <code>git clone gcrypt::rsync://user@user.rsync.net:relative/path/to/repo</code></li>
</ul>
<p>Flow 2</p>
<ul>
<li><code>git remote add test gcrypt::rsync://user@user.rsync.net/full/path/to/repo</code></li>
<li><code>git annex sync</code> -> DOES SYNC to test remote</li>
<li>I can successfully clone from the test remote with <code>git clone gcrypt::rsync://user@user.rsync.net:relative/path/to/repo</code></li>
</ul>
RE: `git annex sync` not automatically syncing gcrypt remotes using relative pathshttp://git-annex.branchable.com/sync/comment_30_f75f5957dbd0f6fd7b2d7291f06e7489/talmukoydu2023-03-19T19:27:46Z2023-03-19T19:27:46Z
@joey definitely seems like a bug. I am able to easily verify by changing the remote url back and forth in the .git/config and then running git annex sync. If the relative url is used git annex sync does not sync to that remote.
comment 31http://git-annex.branchable.com/sync/comment_31_c85edac65571caff70e87dff2317a4e5/joey2023-03-21T18:08:04Z2023-03-21T17:47:45Z
<p>@talmukoydu you need to file a bug report and include things like the
version of git-annex you are using.
<a href="https://git-annex.branchable.com/bugs/">https://git-annex.branchable.com/bugs/</a></p>