Recent comments posted to this site:
Hmm, if the default always had "or present" added to it, at least the surprise drop would not be a concern.
That is a very funny idea, I like it!
Hmm, if the default always had "or present" added to it, at least the surprise drop would not be a concern.
I am going to change the names to "initwanted" etc as you suggested, to avoid closing off the possibility of adding a global default later.
It's probably somewhat common to want to get files from origin, but not let origin make config changes that drop all the files they have previously shared.
Fair enough.
So I guess one can encourage users to include git config --global annex.jobs 4 and git config annex.defaultwanted present in their setup. Thanks for implementing that.
Anyone with write access to a repo can already freely change any group, groupwanted or wanted for any involved clone - if it's present in the git-annex branch
A good point, certainly.
So your concerns only apply to private repos that don't record their activity in the git-annex branch by using annex.private=true.
Well also repos that lack permission to push or are simply not pushed to origin.
It's probably somewhat common to want to get files from origin, but not let origin make config changes that drop all the files they have previously shared.
you can set annex.defaultwanted to "standard", and annex.defaultgroups to some group, and then changing git-annex groupwanted will affect all repositories that copied that defaultwanted into their config
If annex.defaultwanted were able to be changed for all repositories with git-annex config, then here's a really ugly security problem [...]
Yes, but the same is already possible for anyone with write access to a repo. I can git annex wanted JOEYS-UUID nothing, wait for your assistant or manual sync to auto-drop all files (would also need to set {num,min}copies to 1 for that, and even then it might not auto-drop it depending on the remotes). Anyone with write access to a repo can already freely change any group, groupwanted or wanted for any involved clone - if it's present in the git-annex branch (i.e. not made with git config annex.private=true). So your concerns only apply to private repos that don't record their activity in the git-annex branch by using annex.private=true. Making a git-annex repo private is a conscious, active choice. One does not need to do it if one only consumes files and does not have push access anyway. So that'll be people who actively change repo content, probably consume it, but don't want their repo to show up in git annex info. Maybe for a publicly-pushable git-annex repo where everyone can add new files (who would host that anyway...). In this case, yes, users of that repo can't trust each other and there setting something like git annex config --set annex.defaultwanted nothing at some point can lead to people's git annex sync|assist|assistant to suddenly drop their files - and probably also on the central remote. But I'd argue that this kind of publicly writable setup has so many other obvious problems that annex.defaultwanted is one of the minor ones.
Other situations I can imagine involve groups of people (or just single users) who trust each other when using a git-annex repo. git-annex is not designed to solve such permission problems - neither is git itself.
In your publicly readable (not writable) git-annex-builds repo on the other hand, if you were to set git annex config --set annex.defaultwanted nothing, then people who just run git annex sync|assist|assistant in their clones would have their downloaded builds dropped, okay.
git-annex usage scenarios
- publicly writable git-annex repo
  - (bad idea anyway for several reasons without any form of permission control on the remote side)
  - malicious people could set git annex config --set annex.defaultwanted nothing at some point and others' clones would have files dropped on sync.
- publicly readable git-annex repo to provide assets (e.g. your git-annex-builds repo)
  - only the owner could do such shenanigans. Users can avoid it by using git annex pull and git annex get instead of sync|assist|assistant (which arguably makes more sense in this case anyway) or explicitly stating their git annex wanted here ....
- groups or individuals working on a repo in several clones - everyone has write access, in a team for example
  - anyone can already happily destroy repo contents and control others' wanted expressions
  - git annex config annex.defaultwanted can be set as an established "repo policy" for everyone's convenience, that anyone can overwrite locally with git annex wanted here ....
  - if you run git annex assist|sync|assistant|satisfy, you accept the repo's policy, as with your securehashesonly example. If you're paranoid, don't use these sync commands, but do only exactly what you want such as git annex pull -g, git annex get <thatfile>, git annex wanted ..., etc.
If annex.defaultwanted were able to be changed for all repositories with
git-annex config, then here's a really ugly security problem:
- First, I make sure to get a copy of every annexed file.
- Then I run git-annex config annex.defaultwanted nothing
- Then I wait for git-annex to drop every file from your repository.
- Finally, I demand $ to get your files back.
Now, the same can be done by convincing people to add their repository to some group and set preferred content to "standard", and later changing the groupwanted. But that only works on people you were able to social engineer into doing that, not everyone who cloned a repository with the default settings.
And beyond the ransom problem, there's the problem that once this is set, any change to it is going to affect most every other user of the repository. With groupwanted there's a communicated intent in the name of the group, and there can be different groups with different versions of the preferred content expression. This lacks that, it encourages flag day events.
Note that you can set annex.defaultwanted to "standard", and
annex.defaultgroups to some group, and then changing
git-annex groupwanted will affect all repositories that copied that
defaultwanted into their config.
So that's a way to be able to make changes that will affect other people's clones. But only ones that they have opted into.
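The opt-in indirection described above can be illustrated with a toy model. This is only a sketch in Python, not git-annex's actual resolution logic: the Repo class, the effective_wanted function and the "mirrors" group name are hypothetical, and the model simply assumes, as the paragraph states, that a repository whose preferred content is "standard" follows its group's groupwanted expression.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Repo:
    """Toy stand-in for a clone; not a git-annex data structure."""
    name: str
    wanted: str                  # this clone's preferred content expression
    group: Optional[str] = None  # the single group it belongs to, if any


def effective_wanted(repo: Repo, groupwanted: Dict[str, str]) -> str:
    """Resolve the expression a clone ends up following in this toy model."""
    if repo.wanted == "standard" and repo.group in groupwanted:
        # Opted in: the shared groupwanted expression decides.
        return groupwanted[repo.group]
    # Not opted in: only the clone's own setting matters.
    return repo.wanted


groupwanted = {"mirrors": "present"}
opted_in = Repo("alice", wanted="standard", group="mirrors")
explicit = Repo("bob", wanted="present")

# Someone with write access later changes the shared expression.
groupwanted["mirrors"] = "nothing"

print(effective_wanted(opted_in, groupwanted))  # follows the change
print(effective_wanted(explicit, groupwanted))  # unaffected
```

In this model only the clone that chose "standard" and joined the group is affected by the later change, which is the opt-in property the comment is pointing at.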
I'm on the fence about whether the kind of security impact I discussed earlier is really something that should prevent a global setting, or not.
git-annex config of annex.securehashesonly is another example of
something where my hypothetical "auditing repos" would be vulnerable to a
behavior change that might be security significant. Since that gets copied
from the git-annex config to git config at init time, behavior in a
new clone might be different than behavior in an existing clone.
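The difference between a setting that is copied once at init time and one that is looked up live can be sketched in a few lines. This is a toy Python model, not git-annex code: the Clone class and the shared dictionary are hypothetical stand-ins for a clone's local git config and the repository-wide git-annex config.

```python
class Clone:
    """Toy clone that snapshots a shared setting when it is initialized."""

    def __init__(self, shared: dict, key: str):
        # Copy the value once, mirroring the init-time copy described above.
        self.local = {key: shared.get(key)}

    def get(self, key: str):
        # Later lookups read only the local copy, never the shared config.
        return self.local.get(key)


KEY = "annex.securehashesonly"
shared = {KEY: "true"}
old_clone = Clone(shared, KEY)

# The shared setting changes after the first clone was initialized.
shared[KEY] = "false"
new_clone = Clone(shared, KEY)

print(old_clone.get(KEY))  # value from its own init time
print(new_clone.get(KEY))  # value from the later init
```

The two clones now disagree even though they point at the same repository, which is the behavior difference between existing and new clones mentioned above.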
Does that mean it's ok for there to be more cases where there can be such a potential security impact? I don't know.
The annex-ignore config can be manually set by the user to prevent using an otherwise usable remote. The man page gives the example of a network connection that is too slow to use normally.
It may be that no users are actually using annex-ignore like this. Using annex-sync seems more likely. But, it's hard to rule out.
That presents a problem, since this would need to unset annex-ignore once the repository was created.
Checking before push if the repository exists, and only unsetting annex-ignore if it did not exist before sync, but does afterwards, would be one way around this problem. It does mean that, if 2 people are making a repository at the same location at the same time, the loser may be left with annex-ignore set due to the other person having created the repository.
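The check described here boils down to a small decision rule. The following is a minimal Python sketch of that rule only; the function name and its two boolean inputs are hypothetical, and how git-annex would actually probe for the remote repository is not modeled.

```python
def should_unset_ignore(existed_before: bool, exists_after: bool) -> bool:
    """Unset annex-ignore only when this sync is what created the repository."""
    return (not existed_before) and exists_after


# The person whose push created the repository gets annex-ignore unset.
winner = should_unset_ignore(existed_before=False, exists_after=True)

# The other person checks after the repository already appeared, so their
# annex-ignore cannot be told apart from one set by hand and is left alone.
loser = should_unset_ignore(existed_before=True, exists_after=True)

print(winner, loser)
```

The second call is the race mentioned above: the rule is deliberately conservative, so the loser keeps annex-ignore set.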
Or, a new config could be added, that is like annex-ignore, but is only set by git-annex, and not by the user. Keeping annex-ignore's behavior, but making git-annex set and unset the new config as needed.
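A rough Python sketch of that second option follows. Everything in it is an assumption for illustration: the "auto_ignore" key stands for the proposed new config, which has no name in the comment, and "user_ignore" stands for the existing annex-ignore setting.

```python
def remote_usable(config: dict) -> bool:
    """A remote is skipped if either the user or git-annex has ignored it."""
    return not (config.get("user_ignore", False) or config.get("auto_ignore", False))


def after_repo_created(config: dict) -> dict:
    """Clear only the flag git-annex owns; the user's choice is untouched."""
    updated = dict(config)
    updated["auto_ignore"] = False
    return updated


# Remote ignored automatically because the repository did not exist yet.
auto_only = {"user_ignore": False, "auto_ignore": True}
# Remote the user deliberately ignores, e.g. because the link is too slow.
user_set = {"user_ignore": True, "auto_ignore": True}

print(remote_usable(after_repo_created(auto_only)))  # usable again
print(remote_usable(after_repo_created(user_set)))   # still ignored
```

Keeping the two flags separate avoids the race in the first approach, since git-annex never has to guess who set annex-ignore.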