If an external special remote is implemented as a Docker container, it can be safely autoenabled and run in a sandboxed way. So the distributor of a repo whose annexed files are fetchable with a given special remote could have the docker tag for the special remote configured on the git-annex branch, and users could then clone and use the repo without needing to install anything.
Couldn't the docker image come with its own copy of git-annex? Not super space efficient, but it ensures that the special remote has access to a version of git-annex with the features it needs.
I think this could be a good idea, although I would not want to be forced to use docker as the only way to install an external special remote either.
It seems that the minimum needed is a way to add a shell script to PATH with the name of the external special remote program, so git-annex can run it as usual. Or git-annex could invoke docker run itself, but I like having a shell script because it means git-annex doesn't need to know about docker and other containerization technologies.
OTOH, I can see it would be nice if git annex enableremote could somehow get everything set up to use docker, and git annex init could fully set up autoenable=true special remotes.
A balance could be for git annex enableremote to set up the shell script, perhaps in .git/annex/externals/. Store a few values like which docker image to use in the remote config, and generate the shell script from that. Then when a user needs to pass extra parameters to docker, or if they want to use rkt etc, they can just edit the shell script.
chroot or bash -r would suffice?
"docker is too insecure to auto-install, enable and execute random special-remote programs" -- interesting, didn't realize that. Maybe prompt the user for permission, and/or tell them to set a git config setting to enable auto-install?
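To make the wrapper-script idea from earlier in the thread concrete, here is a minimal Python sketch of generating such a script from a couple of stored values. Everything in it is an assumption for illustration: the function names, the example image name, and the choice of .git/annex/externals/ as the target directory are hypothetical, and git-annex does not do this today. The only docker behaviour it relies on is docker run --rm -i, which keeps stdin open so git-annex can talk to the program over stdin/stdout as it does with any external special remote.

```python
import os
import shlex


def render_wrapper(image, extra_docker_args=()):
    """Return the text of a shell wrapper that runs `image` via docker.

    `image` and `extra_docker_args` stand in for values that would be
    read from the remote config (hypothetical; not an existing setting).
    """
    argv = ["docker", "run", "--rm", "-i", *extra_docker_args, image]
    command = " ".join(shlex.quote(arg) for arg in argv)
    return (
        "#!/bin/sh\n"
        "# Generated wrapper. Edit this file to pass extra options\n"
        "# or to swap in another container runtime.\n"
        f'exec {command} "$@"\n'
    )


def install_wrapper(externals_dir, program, image, extra_docker_args=()):
    """Write the wrapper as `externals_dir/program` and make it executable."""
    os.makedirs(externals_dir, exist_ok=True)
    path = os.path.join(externals_dir, program)
    with open(path, "w") as handle:
        handle.write(render_wrapper(image, extra_docker_args))
    os.chmod(path, 0o755)
    return path


if __name__ == "__main__":
    # Hypothetical image name, printed rather than installed.
    print(render_wrapper("example/annex-remote:1.0", ["--network", "none"]))
```

A real wrapper would probably also need to bind-mount the repository so the containerized program can read and write the files git-annex hands it. That is left out here, and is the kind of thing a user could add by editing the generated script.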
In practice people often end up running less-than-vetted code, e.g. when trying out python packages written by people they don't know. Running sandboxed code seems relatively safe next to that.
Related: dockerized external backends.
There is an expectation that checking out and looking at a git repository will not cause arbitrary code to be run, sandboxed or not.
I think this can easily be dealt with in layers above git-annex, which can have different expectations about what code is safe to run.
How about this as a compromise that avoids any unwanted code execution while making it easy to enable if you do want it:
When git-annex enables an external special remote (including autoenable), and the special remote program is not available in PATH, and a git config (call it annex.special-remote-installer) is set to a command, git-annex runs that command with the name of the special remote program it wanted to install. The command should install the special remote program into a particular subdirectory in .git/annex/, and git-annex will then use it.
It would then be up to users to decide if they want to set that git config, or if something is being built on top of git-annex and sets up the git repo for them, it could set the config to point to whatever command it provides to install special remote programs.
This also has the benefit of not tying git-annex to any particular technology like docker.
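The lookup order proposed above could be sketched roughly as follows. This is a Python illustration of the idea, not how git-annex works: annex.special-remote-installer is only a proposed config name, the function names are made up, and the install directory (.git/annex/externals/ here) is an assumption, since the proposal only says "a particular subdirectory in .git/annex/".

```python
import os
import shlex
import shutil
import subprocess


def read_installer(repo_dir):
    """Return the proposed annex.special-remote-installer value, or None."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "config", "--get",
         "annex.special-remote-installer"],
        capture_output=True, text=True,
    )
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


def resolve_remote_program(program, git_dir, installer=None):
    """Find `program`, running the installer only if the user configured one.

    Returns the path to use, or None if the program is unavailable.
    """
    # 1. A program already in PATH wins, as it does today.
    on_path = shutil.which(program)
    if on_path:
        return on_path

    # 2. Otherwise look in the per-repository directory (assumed name).
    candidate = os.path.join(git_dir, "annex", "externals", program)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate

    # 3. No installer configured: nothing is executed.
    if not installer:
        return None

    # 4. Run the configured command with the program name as its argument.
    subprocess.run([*shlex.split(installer), program], check=True)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None
```

The point of step 3 is that a fresh clone with no config set never runs anything from the repository; code execution only happens after the user, or a tool built on top of git-annex, has opted in.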
My main reason for wanting dockerized special remotes and external backends is to be able to use custom remotes/backends without adding a burden on repo users (beyond the standard git-annex-init after checkout), similar to what autoenabling of remotes does. So needing users to know about and set some special git config kind of removes the point. Maybe, instead, git-annex can just prompt the user for permission to install an external remote/backend, like what emacs does for calling untrusted code?
In practice you'd typically trust code from a specific repo or author, so I'm not sure annex.special-remote-installer could automatically determine the trust.
I understand "the benefit of not tying git-annex to any particular technology like docker"; OTOH it's already tied to some particular technologies, like with the built-in S3 special remote.
Prompting users about things rarely improves security. (Not saying it doesn't in the case of emacs org mode, which may be a special case.) A good way to make clear to a user that they are running code that comes from a git repository is to make them take the effort to run ./setup or something like that.
It would also be weird for git-annex to prompt for this, since it never prompts about anything else, and auto-enabling these special remotes could happen when running any git-annex command.