Please describe the problem.
The documentation for git config annex.web-options says that I should be able to use it to set up HTTP credentials in a ~/.netrc file, but it doesn't work.
I have been given some repos that are password-protected, and I want to be able to download them non-interactively in a CI system. I won't sit there typing in the password 500 times for 500 files, and ideally I don't want to even type it once.
git reads ~/.netrc if it exists, and does so consistently enough that http://droneci.com/ has built that in as the default way it passes CI credentials to workers. It would be really great if git-annex did the same, and did it instead of spawning curl. When using an ssh remote, git and git-annex already share the same ssh credentials; it would be awesome if the same could be transparently true for http remotes as well.
What steps will reproduce the problem?
Set up an HTTP server following https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/, but password-protect it.
I set up my server on Arch, but I tested the client from both Arch and Ubuntu. Here's the server setup; it should adapt to Debian or Fedora easily enough:
sudo pacman -S --noconfirm apache
echo 'Include conf/extra/git-annex.conf' | sudo tee -a /etc/httpd/conf/httpd.conf
sudo mkdir -p /srv/http/annex && sudo chown -R http:http /srv/http/annex
cat <<EOF | sudo tee /etc/httpd/conf/extra/git-annex.conf
DocumentRoot "/srv/http/annex"
<Directory "/srv/http/annex">
  AllowOverride All
  Options FollowSymlinks Indexes
  Require all granted
</Directory>
EOF
Set up a repo:
- Switch to http: sudo -u http bash; cd /srv/http/annex
git init
git config user.name httpd; git config user.email httpd@httpd
git annex init
git config core.sharedrepository world; git config receive.denyCurrentBranch updateInstead
mv .git/hooks/post-update.sample .git/hooks/post-update
echo Hello > README.md && git add README.md && git commit -m "README.md"
dd if=/dev/urandom of=large.bin bs=1M count=1 && git annex add large.bin && git add large.bin && git commit -m "large.bin"
- Switch to
(optional): verify the repo is functional:
git clone http://localhost/.git annex-test; cd annex-test
git config annex.security.allowed-ip-addresses all
sha256sum large.bin; should fail
git annex get
sha256sum large.bin; should succeed, and match the value shown in the symlink in ls -l large.bin
Password protect the repo
While still in /srv/http/annex:
cat <<EOF | tee .htaccess
AuthType Basic
AuthName gitannex
AuthUserFile /srv/http/annex/.htpasswd
Require valid-user
EOF
htpasswd -bc .htpasswd user4 password
Download the password-protected repo
If the test server is on the same machine:
git config --global annex.security.allowed-ip-addresses all
Download the repo without any password helper: 🫤🫤🫤
git clone http://localhost/.git annex-test; cd annex-test; this will prompt for the password set above, e.g.
$ git clone http://localhost/.git annex-test
Cloning into 'annex-test'...
Username for 'http://localhost': user4
Password for 'http://user4@localhost':
git annex get; this will prompt for the password twice: once for the implicit git annex init (that needs to read the remote .git/config) and once for downloading large.bin.
Running pstree while the prompts are waiting, or using git config annex.debug true, reveals that the prompts are coming from git credential fill.
Drop the annoying redundant password prompts using git-credential-store(1): ✔️✔️✔️
cd $(mktemp -d)
git config --global credential.helper store
git clone http://localhost/.git annex-test; cd annex-test; this will prompt for the password
git annex get; but this will not prompt for any passwords
This works. So that's awesome, I can use credential.helper store to make my passworded downloads non-interactive by filling in ~/.git-credentials, which, for the record, has one credential per line in this format:
$ cat ~/.git-credentials
http://user4:password@localhost
or if a non-standard port is involved:
$ cat ~/.git-credentials
http://user4:password@localhost%3a8080
- (Undo: git config --global --unset credential.helper to avoid contaminating the next test)
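For a CI job, the ~/.git-credentials line could be generated instead of typed. What follows is only a minimal sketch: the function name git_credential_line is made up for illustration, the %3a port encoding simply mirrors the example shown above, and it assumes the username and password contain no characters that would need percent-encoding.

```shell
# Hypothetical helper (not part of git or git-annex):
# print one line in the ~/.git-credentials format shown above.
# Usage: git_credential_line SCHEME USER PASSWORD HOST [PORT]
git_credential_line() {
    scheme=$1
    user=$2
    pass=$3
    host=$4
    port=${5:-}
    if [ -n "$port" ]; then
        # The ':' before a non-standard port appears as %3a in the file.
        host="${host}%3a${port}"
    fi
    # Assumption: user and password need no percent-encoding.
    printf '%s://%s:%s@%s\n' "$scheme" "$user" "$pass" "$host"
}
```

It could be used along the lines of git_credential_line http user4 password localhost 8080 >> ~/.git-credentials, ideally with a restrictive umask so the file is not world-readable.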
Attempt to drop the redundant password prompts using annex.web-options: ❌❌❌
cd $(mktemp -d)
git config --global annex.web-options --netrc
Set up a ~/.netrc:
cat <<EOF | tee -a ~/.netrc
machine localhost
login user4
password password
EOF
(optional) verify it works as expected directly through curl:
$ curl --no-progress-meter -f -o /dev/null http://localhost/.git; echo $? # fails
curl: (22) The requested URL returned error: 401
22
$ curl --netrc --no-progress-meter -f -o /dev/null http://localhost/.git; echo $? # works
0
git clone http://localhost/.git annex-test; cd annex-test; this will not prompt for a password because git picks up ~/.netrc automatically.
git annex get; this will prompt for passwords, n+1 times in fact, for n = the number of annexed files
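One way to check, without sitting through the prompts, whether anything can answer a credential request for the remote is to run git credential fill with terminal prompting disabled. This is a minimal sketch, not something git-annex itself provides: the function name can_fill_credentials is made up for illustration, and it assumes no askpass program is configured.

```shell
# Hypothetical helper: succeed only if git can produce credentials
# for the given URL without prompting on the terminal.
# Usage: can_fill_credentials URL
can_fill_credentials() {
    # "url=" is the git-credential input attribute that gets split into
    # protocol/host/path; the blank line ends the request.
    # GIT_TERMINAL_PROMPT=0 makes git fail instead of asking on the tty.
    printf 'url=%s\n\n' "$1" |
        GIT_TERMINAL_PROMPT=0 git credential fill >/dev/null 2>&1
}
```

With credential.helper store configured and a matching line in ~/.git-credentials, can_fill_credentials http://localhost/ should succeed. With only a ~/.netrc in place it is expected to fail, which would be consistent with the prompts seen above.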
I don't understand why this isn't working. The docs say
Setting this option makes git-annex use curl, but only when annex.security.allowed-ip-addresses is configured in a specific way.
and I set allowed-ip-addresses in the specific way, so why is this no bueno?
I've searched the wiki and all I've found is:
- https://git-annex.branchable.com/news/security_fix_release/
- https://git-annex.branchable.com/devblog/day_494__url_download_changes/
- https://git-annex.branchable.com/forum/Use_addurl_with_a_file_on_an_HPC_cluster/
From these, I understand I need to git config --global annex.security.allowed-ip-addresses all, which I did, but otherwise my best guess is that web-options only works when using the web as a special remote with addurl. But here I'm using the web as a regular remote, something which git-annex has support for. But seemingly this corner case isn't working.
I can work around it by rewriting the contents of ~/.netrc into ~/.git-credentials and setting git config --global credential.helper store, but I don't want to duplicate the credentials every time I'm in this situation.
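For what it's worth, that rewrite can be scripted. The following is a rough sketch under several assumptions: the function name netrc_to_git_credentials is hypothetical, it only understands the machine/login/password keywords with each keyword and its value on the same line (as in the ~/.netrc used in this report), it ignores default and macdef entries, and it does no percent-encoding.

```shell
# Hypothetical helper: convert simple netrc entries into
# ~/.git-credentials lines (SCHEME://LOGIN:PASSWORD@MACHINE).
# Usage: netrc_to_git_credentials NETRC_FILE [SCHEME]   (SCHEME defaults to http)
netrc_to_git_credentials() {
    awk -v scheme="${2:-http}" '
        # Print the entry collected so far if it is complete, then reset.
        function emit() {
            if (machine != "" && login != "" && password != "")
                printf "%s://%s:%s@%s\n", scheme, login, password, machine
            machine = login = password = ""
        }
        {
            # Walk the whitespace-separated tokens as keyword/value pairs.
            for (i = 1; i < NF; i++) {
                if ($i == "machine")       { emit(); machine = $(i + 1); i++ }
                else if ($i == "login")    { login = $(i + 1); i++ }
                else if ($i == "password") { password = $(i + 1); i++ }
            }
        }
        END { emit() }
    ' "$1"
}
```

Usage would be something like netrc_to_git_credentials ~/.netrc >> ~/.git-credentials, followed by git config --global credential.helper store. Since netrc entries carry no port, a server on a non-standard port (like the localhost%3a8080 case above) would still need its line adjusted by hand.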
What version of git-annex are you using? On what operating system?
git-annex 10.20220504-g4e4c44ed8 on ArchLinux, and git-annex 8.20210223 on Ubuntu 22.04.
Please provide any additional information below.
[kousu@nigiri tmp.ztnHTYA3ZC]$ cd $(mktemp -d)
[kousu@nigiri tmp.H5EkrNMUPc]$ git config --global annex.security.allowed-ip-addresses all
[kousu@nigiri tmp.H5EkrNMUPc]$ git config --global annex.web-options --netrc
[kousu@nigiri tmp.H5EkrNMUPc]$ cat <<EOF | tee ~/.netrc
machine localhost
login user4
password password
EOF
machine localhost
login user4
password password
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri tmp.H5EkrNMUPc]$ # demonstrate that curl respects --netrc and behaves as expected:
[kousu@nigiri tmp.H5EkrNMUPc]$ curl -v -o /dev/null -f --no-progress-meter http://localhost:80
* Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET / HTTP/1.1
> Host: localhost
> User-Agent: curl/7.84.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 401 Unauthorized
< Date: Thu, 08 Sep 2022 00:49:44 GMT
< Server: Apache/2.4.54 (Unix)
< WWW-Authenticate: Basic realm="gitannex"
< Vary: accept-language,accept-charset
< Accept-Ranges: bytes
< Transfer-Encoding: chunked
< Content-Type: text/html; charset=utf-8
< Content-Language: en
* The requested URL returned error: 401
* Closing connection 0
curl: (22) The requested URL returned error: 401
[kousu@nigiri tmp.H5EkrNMUPc]$ curl --netrc -v -o /dev/null -f --no-progress-meter http://localhost:80
* Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
* Server auth using Basic with user 'user4'
> GET / HTTP/1.1
> Host: localhost
> Authorization: Basic dXNlcjQ6cGFzc3dvcmQ=
> User-Agent: curl/7.84.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Thu, 08 Sep 2022 00:49:41 GMT
< Server: Apache/2.4.54 (Unix)
< Content-Length: 693
< Content-Type: text/html;charset=ISO-8859-1
<
{ [693 bytes data]
* Connection #0 to host localhost left intact
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri tmp.H5EkrNMUPc]$ # demonstrate git respects .netrc:
[kousu@nigiri tmp.H5EkrNMUPc]$ git clone http://localhost/.git annex-test
Cloning into 'annex-test'...
[kousu@nigiri tmp.H5EkrNMUPc]$ cd annex-test/
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri tmp.H5EkrNMUPc]$ # demonstrate that git-annex *does not* respect .netrc
[kousu@nigiri annex-test]$ git annex get
Username for 'http://localhost': ^C
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri annex-test]$ git annex version
git-annex version: 10.20220504-g4e4c44ed8
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.30 DAV-1.3.4 feed-1.3.2.1 ghc-9.0.2 http-client-0.7.11 persistent-sqlite-2.13.0.3 torrent-10000.1.1 uuid-1.3.15 yesod-1.6.2
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: linux x86_64
supported repository versions: 8 9 10
upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10
local repository version: 8
[kousu@nigiri annex-test]$ cat /etc/os-release
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"
LOGO=archlinux-logo
With the older git-annex, I set up a proxy so I could reuse the same server, which changed the port, but otherwise everything else is the same:
$ ssh -R 8080:localhost:80 joplin
p115628@joplin:~$ cd $(mktemp -d)
p115628@joplin:/tmp/tmp.glF9EdYhnR$ git config --global annex.security.allowed-ip-addresses all
p115628@joplin:/tmp/tmp.glF9EdYhnR$ git config --global annex.web-options "--netrc"
p115628@joplin:/tmp/tmp.glF9EdYhnR$ git clone http://localhost:8080/.git annex-test # verify it's password protected
Cloning into 'annex-test'...
Username for 'http://localhost:8080': ^C
p115628@joplin:/tmp/tmp.glF9EdYhnR$ cat <<EOF | tee ~/.netrc
machine localhost
login user4
password password
EOF
machine localhost
login user4
password password
p115628@joplin:/tmp/tmp.glF9EdYhnR$ git clone http://localhost:8080/.git annex-test # verify the .netrc file works with git
Cloning into 'annex-test'...
p115628@joplin:/tmp/tmp.glF9EdYhnR$ cd annex-test/
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$ git annex get # but does not work with git-annex
(merging origin/git-annex into git-annex...)
(recording state in git...)
(scanning for unlocked files...)
Username for 'http://localhost:8080': ^C
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$ git annex version
git-annex version: 8.20210223
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.26 DAV-1.3.4 feed-1.3.0.1 ghc-8.8.4 http-client-0.6.4.1 persistent-sqlite-2.10.6.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.1.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: linux x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
local repository version: 8
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Sure! Lots! We use it to share a large open access dataset at https://github.com/spine-generic, and I'm working on helping other researchers share their datasets on their own infrastructure using git-annex + gitea.
I've found a different workaround. git ships with an unmaintained(?) alternate netrc parser written in perl. You just need to install it:
This seems to sidestep the problem! It's weird that git clone will read netrc directly, but git credential fill won't? Oh well.
There's a SO post about this script.
I still think it would be good for the docs and the code to be made consistent: if web-options is supposed to invoke curl then make it invoke curl, or else note the corner cases where it won't invoke curl.
Hope this helps someone out down the road.
Confirmed this behavior.
It is due to withUrlOptionsPromptingCreds, which forces use of conduit rather than curl. The idea there was to use git credentials when basic auth is needed, since those can be provided to conduit but not (securely) to curl.
But I do think that, if the user has forced use of curl, it ought to use curl. Even if the user only set options to -4, and so curl is not going to use the netrc and will fail the download. I have changed it to do so.
This bug report also suggests making git-annex read the netrc file itself. Note that git does not read the netrc file itself. What it does do is use libcurl. git-annex has good reasons to not use libcurl though.
I am not thrilled by the prospect of implementing a parser for netrc in git-annex. The file is not even documented on my debian system; curl's man page links to a netrc(5) but that does not exist.
Aside from git-credential-netrc, there is not a single mention of the netrc file in git's documentation. This is arguably surprising behavior on the part of git.
I feel that git's support for netrc is vestigial and mostly superseded by git credentials.
Thanks for your attention and for confirming there's an issue!
I did notice this. It is a surprising behaviour! I only discovered it because DroneCI discovered it.
If that's your opinion, I would rather make git config credential.helper store the canonical solution for noninteractive passworded HTTP, since it works consistently with both git and git-annex. I think everyone would find it easier to wrap our heads around if the docs dropped the mention of netrc and instead explained that git-annex hooks into git-credential(1) and that everyone should use that.
Sorry, this was a typo:
I did rework the annex.web-options documentation earlier, it now has:
That's awesome! Thanks very much joey.
I'll mark this done now