I have two regular git clones and 4 special remotes (cloud storage mostly).
My numcopies is set to 2 which means I always need to have at
least two copies. One usually lives in one of my git clones
which is a backup and so has everything. The other is somewhere
in the cloud.
When I add a new file to my client clone and run git annex sync
--content, does it respect the preferred content of each remote,
or does it only try to satisfy numcopies?
These are my settings for archive and smallarchive:
groupwanted smallarchive = ((include=*/archive/* or include=archive/*) and not (copies=archive:2 or copies=smallarchive:2 or (copies=archive:1 and copies=smallarchive:1))) or approxlackingcopies=1
groupwanted archive = (not (copies=archive:2 or copies=smallarchive:2 or (copies=archive:1 and copies=smallarchive:1))) or approxlackingcopies=1
One of the remotes is set as archive and two as smallarchive yet sync
--content only ever copies to one so as to satisfy numcopies. Is
this expected? Shouldn't it always try to make two copies in archive
or smallarchive?
Interestingly, it also copies data to my backup (which is an
external drive, so always present). So it seems the preferred content
is only respected for clones and not special remotes. Is that true?
Do I have to run git annex copy --to <archive-remote> --auto to satisfy the content preferences?

||| When I add a new file to my client clone and do git annex sync --content is this respecting the … content of each remote or is it only trying to satisfy the numcopy?
Both.
Running git annex sync --content will copy content to any remote where that content is wanted (i.e. it looks at the preferred content settings), and will drop files that are not wanted, provided the drop doesn't violate the numcopies total (at the moment of the drop?).
||| One of the remotes is set as archive and two as smallarchive yet sync --content only ever copies to one so as to satisfy numcopies.
I don't believe that git annex sync --content is ever trying to “satisfy numcopies,” I would think of numcopies as more of a limit or restriction on when git-annex is allowed to drop content that is not wanted by a remote.
||| Shouldn't it always try to make two copies in archive or smallarchive?
Hmmmm. I would guess there is some issue with your archive or smallarchive expressions, or they aren't actually set (being used), or you have discovered an issue… You have overridden the standard groups, and can see your overrides with git annex groupwanted archive and git annex groupwanted smallarchive? And your remotes have git annex group archive and git annex wanted groupwanted set?
So (include=*/archive/* or include=archive/*) and means you are only copying files in the archive directory, was that your intention?
If so, the rest of your content expression seems like it should want 1 copy in an archive and 1 copy in a smallarchive, or 2 copies in 2 remotes marked archive or 2 copies in two remotes marked smallarchive.
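To make the reading above concrete, here is a rough Python model of the two posted expressions. It is only a sketch of the boolean logic, not how git-annex evaluates preferred content: the function names are made up for illustration, copies=group:N is assumed to mean "at least N copies in that group", approxlackingcopies=1 is reduced to a plain count of missing copies, and Python's fnmatch stands in for git-annex's glob matching.

```python
from fnmatch import fnmatchcase


def enough_archive_copies(archive_copies, smallarchive_copies):
    # Models: copies=archive:2 or copies=smallarchive:2
    #         or (copies=archive:1 and copies=smallarchive:1)
    return (
        archive_copies >= 2
        or smallarchive_copies >= 2
        or (archive_copies >= 1 and smallarchive_copies >= 1)
    )


def in_archive_dir(path):
    # Models: include=*/archive/* or include=archive/*
    return fnmatchcase(path, "*/archive/*") or fnmatchcase(path, "archive/*")


def smallarchive_wants(path, archive_copies, smallarchive_copies, lacking_copies=0):
    # The directory test is and-ed with the "not enough copies" test,
    # so a file outside archive/ is only wanted while copies are lacking.
    return (
        in_archive_dir(path)
        and not enough_archive_copies(archive_copies, smallarchive_copies)
    ) or lacking_copies >= 1


def archive_wants(archive_copies, smallarchive_copies, lacking_copies=0):
    # Same copy-count test, but with no directory restriction.
    return (
        not enough_archive_copies(archive_copies, smallarchive_copies)
        or lacking_copies >= 1
    )


if __name__ == "__main__":
    for path in ("docs/report.txt", "docs/archive/report.txt"):
        print(path, smallarchive_wants(path, 0, 0))
```

In this model a smallarchive remote never wants docs/report.txt once numcopies is satisfied, no matter how few archive copies exist, while an archive remote keeps wanting it until the copy-count condition is met.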
||| Interestingly, it copies data to my backup
Right. If you are using the standard groups, backup means “All content is wanted. Even content of old/deleted files.” This expression will want all content and never drop content.
Hi Andrew!
Actually, you are entirely correct, I just misinterpreted the smallarchive formula. What I was thinking it would do was it would upload the file in case there are not two copies in any combination of archive or smallarchive and as soon as there are two archive copies it would ideally drop the file if not in */archive/* directory.
But now I see that the formula does exactly what you say it would do, the whole long and not condition is additional to the first, which means a file will not get uploaded if not in those directories.
I will play around with the expression, I'm sure it can be modified to do what I want.
Thanks! It was mostly me being silly, sometimes having other people re-state the obvious helps!
I just wanted to say thanks to andrew for a really excellent response here. I've answered plenty of such questions less well. I hope you answer many more questions about git-annex with such care and attention to detail.
And, thanks to MatusGoljer for following up to confirm that you were misunderstanding -- and I agree that you should be able to adjust the preferred content expression to do what you want it to do.
Thanks Joey! I am glad to hopefully take some work off your plate.
—Andrew