I thought I had an issue on this but failed to find it.
ATM git-annex does not even bother to suggest or do anything about a remote git/git-annex repository if there is no git-annex (git-annex-shell) available there:
yoh@typhon:/mnt/DATA/data/dbic/QA$ git annex list
Unable to parse git config from origin
Remote origin does not have git-annex installed; setting annex-ignore
This could be a problem with the git-annex installation on the remote. Please make sure that git-annex-shell is available in PATH when you ssh into the remote. Once you have fixed the git-annex installation, run: git annex enableremote origin
here
|datasets.datalad.org
||origin
|||web
||||bittorrent
|||||
_X___ .datalad/metadata/objects/06/cn-2c3eade47bd2d9052658c6a9d10a57.xz
...
A workaround, which seems to have been posted over a decade ago (and which even Google AI now suggests), is to set up an additional rsync remote and use it to fetch. A quick try didn't work for me, but that could have been operator error...
As files are available over regular ssh/scp and even rsync over ssh, I really do not see a technical problem for git-annex to establish interoperability with such a remote, at least for reading from, without having remote git-annex-shell. That should make it possible to access git-annex repositories on servers which might be running some odd setups where installation of git-annex in user-space would be tricky if not impossible.
It's actually possible to use an rsync special remote to fetch objects right out of .git/annex/objects/.
Since the default hash directory paths are different for rsync than for a git-annex repository, getting an object will first try the wrong hash path, which does lead to rsync complaining to stderr. But then it will fall back to a hash path that works.
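A minimal sketch of how such a remote might be set up, wrapped in a shell function. The function name, the remote name, the host, and the repository path are all made up for illustration; type=rsync, rsyncurl= and encryption=none are the usual initremote parameters for an rsync special remote.

```shell
# Hypothetical helper: register the object store of a remote (non-bare)
# repository as an rsync special remote. All argument values are placeholders.
add_objects_rsync_remote() {
    name=$1   # name for the new special remote, e.g. origin-rsync
    host=$2   # ssh host of the repository that lacks git-annex-shell
    repo=$3   # path to the repository on that host
    # Point rsync straight at the annex object directory of that repository.
    git annex initremote "$name" type=rsync encryption=none \
        rsyncurl="$host:$repo/.git/annex/objects/"
}
```

Something like `add_objects_rsync_remote origin-rsync example.com /path/to/repo` followed by `git annex get --from origin-rsync somefile` would then be the idea. Whether get finds anything depends on location tracking knowing about the new remote, which this sketch does not address.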
Sending an object to the rsync special remote will store it in a hash path different from the one that git-annex usually uses. So later switching to using git-annex in that repository will result in some unusual behavior, since it won't see some files that were put there.
git-annex fsck will actually recover from this too.
There are enough problems that I can't really recommend this, it just seemed worth pointing out that it can be done.
As for the idea that git-annex could access a remote without git-annex-shell, I think that any efforts in this area are bound to end up with some partial implementation of a quarter of git-annex-shell in shell script, which is bound to not work as well as the real thing.
Consider that this is a supported workflow:
In that example, the git-annex branch is not pushed to origin after annexed files are sent to it. So how does git-annex on otherhost know that origin has those files? Well, git-annex-shell, when receiving the files, updates the git-annex branch in origin.
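For illustration, a workflow of the kind described above might look roughly like this. The function names, the file argument, and the master branch are assumptions, and origin is assumed to have git-annex-shell installed.

```shell
# On the first host: add a file, push only the master branch, then send
# the file content to origin. The git-annex branch is never pushed here.
send_without_pushing_branch() {
    file=$1
    git annex add "$file"
    git commit -m "add $file"
    git push origin master
    git annex copy "$file" --to origin
}

# Later, in a clone on otherhost: a plain pull also fetches origin's
# git-annex branch, so git-annex can learn where the content lives.
fetch_on_otherhost() {
    file=$1
    git pull
    git annex get "$file"
}
```

The second function only works because origin's own git-annex branch already records the received files, and that update is made by git-annex-shell on the receiving side.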
So, to support this workflow, the git-annex-shell reimplementation in shell would need to update the git-annex branch. That's about 3000 lines of code in git-annex, with complications including concurrency, making it fast, etc.
Other complications include supporting different repository versions, populating unlocked files, supporting configs like annex.secure-erase-command, etc. And while any of these could be left out and be documented as limitations of not having git-annex installed, I think the real kicker is that this is behavior that would occur even if git-annex is only temporarily not installed. So there's the risk that any user who is having a bad PATH day suddenly gets a weird behavior.
Making it read-only would somewhat limit the exposure to all these problems, but if it's read-only, how would any annex objects get into the remote repository in the first place?
Using a separate special remote seems much cleaner. Then it's only used if you choose to use it. And it works like any other special remote. The rsync special remote is close enough to work, but a more special-purpose one could support things a bit better.
IIRC the user can just push the git-annex branch directly after git-annex merging the remote version locally, right?
My use case at hand: I manipulate a git-annex repo on a Linux box on an NFS mount, and the original one is on a FreeBSD box with a bare minimal installation. I have about 50 datasets in a hierarchy. I wanted to back up to another location, and it would be more performant to talk to the original FreeBSD server directly instead of going through the NFS mount. I can't install git-annex on that FreeBSD box ATM.
FWIW, on second thought, given that I do have a workaround with rsync (verified that it works), and unless another more prominent use case arrives, it might indeed not be worth the hassle.
Sure, but my point was that they would have to change their workflow due to a change on the server that might not be visible to them. Violating least surprise.