dockerized external special remotes

If an external special remote is implemented as a Docker container, it can be safely autoenabled and run in a sandboxed way. The distributor of a repo whose annexed files are fetchable with a given special remote could then configure the Docker tag for that special remote on the git-annex branch, and users could clone and use the repo without needing to install anything.

comment 1

Some external special remotes need to run git-annex, for things not directly supported by the protocol; a dockerized remote implementation couldn't do that. But maybe, the protocol could be extended with a command by which the remote asks git-annex to run a given git-annex command, and return paths to files containing the output and the exit code?

Comment by Ilya_Shlyakhter — Tue Dec 4 19:56:36 2018

comment 2

Couldn't the docker image come with its own copy of git-annex? Not super space efficient, but it ensures that the special remote has access to a version of git-annex with the features it needs.

Comment by joey — Wed Dec 5 16:24:17 2018

comment 3

I think this could be a good idea, although I would not want to be forced to use docker as the only way to install an external special remote either.

It seems that the minimum needed is a way to add a shell script to PATH with the name of the external special remote program, so git-annex can run it as usual. Or git-annex could invoke docker run itself, but I like having a shell script because it means git-annex doesn't need to know about docker and other containerization technologies.

OTOH, I can see it would be nice if git annex enableremote could somehow get everything set up to use docker, and git annex init could fully set up autoenable=true special remotes.

A balance could be for git annex enableremote to set up the shell script, perhaps in .git/annex/externals/. Store a few values like which docker image to use in the remote config, and generate the shell script from that. Then when a user needs to pass extra parameters to docker, or if they want to use rkt etc, they can just edit the shell script.

Comment by joey — Wed Dec 5 16:27:47 2018
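A minimal sketch of the "generate the shell script from the remote config" idea in the comment above, not something git-annex does today. The image name, the `externals` directory, and the helper names `render_wrapper` and `write_wrapper` are hypothetical; the only assumptions taken from real tools are that external special remote programs talk to git-annex over stdin/stdout (hence `docker run -i`) and are named `git-annex-remote-*`.

```python
import os
import shlex
from pathlib import Path


def render_wrapper(image, program, extra_docker_args=()):
    """Return the text of a POSIX shell wrapper that runs `program` inside `image`.

    `image` and `extra_docker_args` stand in for values that would be stored
    in the remote config (hypothetical; no such config keys exist).
    """
    # -i keeps stdin open so the remote can speak the protocol with git-annex;
    # --rm discards the container once the program exits.
    command = ["docker", "run", "--rm", "-i", *extra_docker_args, image, program]
    quoted = " ".join(shlex.quote(part) for part in command)
    return (
        "#!/bin/sh\n"
        "# Generated wrapper; edit it to pass extra docker options\n"
        "# or to switch to another container runtime.\n"
        f'exec {quoted} "$@"\n'
    )


def write_wrapper(directory, image, program):
    """Write the wrapper as `directory/program`, mark it executable, return its path."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / program
    path.write_text(render_wrapper(image, program))
    os.chmod(path, 0o755)
    return path


if __name__ == "__main__":
    # Hypothetical image and remote names, printed rather than installed.
    print(render_wrapper("example/git-annex-remote-example:1.0",
                         "git-annex-remote-example"), end="")
```

Because the result is an ordinary script, a user who wants different docker parameters or a different runtime only has to edit the generated file, which is the point made in the comment.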

autoenabling external special remotes

"the minimum needed is a way to add a shell script to PATH with the name of the external special remote program" -- that would of course be simpler; could specify a relative path within the repo itself. But auto-running an untrusted script fetched along with the repo may be risky. A Docker/Singularity container could be sandboxed; or maybe chroot or bash -r would suffice?

Comment by Ilya_Shlyakhter — Tue Mar 30 15:17:02 2021

comment 5

IMHO, docker is too insecure to auto-install, enable and execute random special-remote programs. It has an extremely large attack surface (syscalls, ioctls, etc.) compared to, say, virtual machines. And people are regularly able to break out of the latter.

Comment by Lukey — Tue Mar 30 16:21:10 2021

dockerized special remotes: security

"docker is too insecure to auto-install, enable and execute random special-remote programs" -- interesting, didn't realize that. Maybe prompt the user for permission, and/or tell them to set a git config setting to enable auto-install?

In practice people often end up running less-than-vetted code, e.g. when trying out python packages written by people they don't know. Running sandboxed code seems relatively safe next to that.

Related: dockerized external backends.

Comment by Ilya_Shlyakhter — Thu Apr 1 15:20:01 2021

comment 7

There is an expectation that checking out and looking at a git repository will not cause arbitrary code to be run, sandboxed or not.

I think this can easily be dealt with in layers above git-annex, which can have different expectations about what code is safe to run.

Comment by joey — Thu Apr 1 16:22:32 2021

comment 8

How about this as a compromise that avoids any unwanted code execution while making it easy to enable if you do want it:

When git-annex enables an external special remote (including autoenable), and the special remote program is not available in PATH, and a git config (call it annex.special-remote-installer) is set to a command, git-annex runs that command with the name of the special remote program it wanted to install. The command should install the special remote program into a particular subdirectory in .git/annex/, and git-annex will then use it.

It would then be up to users to decide if they want to set that git config, or if something is being built on top of git-annex and sets up the git repo for them, it could set the config to point to whatever command it provides to install special remote programs.

This also has the benefit of not tying git-annex to any particular technology like docker.

Comment by joey — Thu Apr 1 19:41:10 2021
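The lookup order proposed in the comment above can be sketched as follows. This is only an illustration of the proposal, not existing git-annex behaviour: `annex.special-remote-installer` is the config name suggested in the comment, while the function names and the `ANNEX_EXTERNALS_DIR` environment variable (used here to tell the installer where to put the program) are assumptions made for the example.

```python
import os
import shlex
import shutil
import subprocess
from pathlib import Path


def read_installer_setting():
    """Return the proposed annex.special-remote-installer value, or None if unset."""
    result = subprocess.run(
        ["git", "config", "--get", "annex.special-remote-installer"],
        capture_output=True, text=True, check=False,
    )
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


def _is_executable(path):
    return path.is_file() and os.access(path, os.X_OK)


def find_or_install(program, installer, externals_dir):
    """Locate `program`, running the user-configured installer only if needed.

    Order: PATH first, then `externals_dir`; if the program is in neither place
    and an installer command is configured, run it with the program name as its
    argument and look in `externals_dir` again. With no installer configured,
    nothing is executed.
    """
    found = shutil.which(program)
    if found:
        return Path(found)
    externals_dir = Path(externals_dir)
    candidate = externals_dir / program
    if not _is_executable(candidate) and installer:
        externals_dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical convention: the installer learns the target directory
        # from this environment variable.
        env = dict(os.environ, ANNEX_EXTERNALS_DIR=str(externals_dir))
        subprocess.run(shlex.split(installer) + [program], env=env, check=False)
    return candidate if _is_executable(candidate) else None
```

The key property is that no code from the repository runs unless the user, or a tool built on top of git-annex, has set the installer command themselves.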

running untrusted code

My main reason for wanting dockerized special remotes and external backends is to be able to use custom remotes/backends without adding a burden on repo users (beyond the standard git-annex-init after checkout), similar to what autoenabling of remotes does. So needing users to know about and set some special git config kind of removes the point. Maybe, instead, git-annex could just prompt the user for permission to install an external remote/backend, like what emacs does for calling untrusted code?

In practice you'd typically trust code from a specific repo or author, so not sure annex.special-remote-installer could automatically determine the trust.

I understand "the benefit of not tying git-annex to any particular technology like docker"; OTOH it's already tied to some particular technologies, like with the built-in S3 special remote.

Comment by Ilya_Shlyakhter — Wed Apr 7 16:52:41 2021

comment 10

Prompting users about things rarely improves security. (Not saying it doesn't in the case of emacs org mode, which may be a special case.) A good way to make clear to a user that they are running code that comes from a git repository is to make them take the effort to run ./setup or something like that.

It would also be weird for git-annex to prompt for this, since it never prompts about anything else, and auto-enabling these special remotes could happen when running any git-annex command.