Please describe the problem.

The documentation for git config annex.web-options says that I should be able to use it to set up HTTP credentials in a ~/.netrc file, but it doesn't work.

I have been given some repos that are password-protected, and I want to be able to download them non-interactively in a CI system. I won't sit there typing in the password 500 times for 500 files, and ideally I don't want to type it even once.

git reads ~/.netrc if it exists, and does so consistently enough that http://droneci.com/ has built that in as the default way it passes CI credentials to workers. It would be really great if git-annex did the same, and did it instead of spawning curl. When using an ssh remote, git and git-annex already share the same ssh credentials; it would be awesome if the same could be transparently true for http remotes as well :)

What steps will reproduce the problem?

  1. Set up an HTTP server following https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/, but password-protect it.

    I set up my server on Arch, but I tested the client from both Arch and Ubuntu. Here's the server setup; it should adapt to Debian or Fedora easily enough:

    1. sudo pacman -S --noconfirm apache
    2. echo 'Include conf/extra/git-annex.conf' | sudo tee -a /etc/httpd/conf/httpd.conf
    3. sudo mkdir -p /srv/http/annex && sudo chown -R http:http /srv/http/annex
    4.  cat <<EOF | sudo tee /etc/httpd/conf/extra/git-annex.conf
       DocumentRoot "/srv/http/annex"
       <Directory "/srv/http/annex">
               AllowOverride All
               Options FollowSymlinks Indexes
               Require all granted
       </Directory>
       EOF
    5. Set up a repo:

      1. Switch to http: sudo -u http bash; cd /srv/http/annex
      2. git config user.name httpd; git config user.email httpd@httpd
      3. git init; git annex init
      4. git config core.sharedrepository world; git config receive.denyCurrentBranch updateInstead
      5. mv .git/hooks/post-update.sample .git/hooks/post-update
      6. echo Hello > README.md && git add README.md && git commit -m "README.md"
      7. dd if=/dev/urandom of=large.bin bs=1M count=1 && git annex add large.bin && git add large.bin && git commit -m "large.bin"
    6. (optional): verify the repo is functional:

      1. git clone http://localhost/.git annex-test; cd annex-test
      2. git config annex.security.allowed-ip-addresses all
      3. sha256sum large.bin should fail
      4. git annex get
      5. sha256sum large.bin should succeed, and match the value shown in the symlink in ls -l large.bin
    7. Password protect the repo

      While still in /srv/http/annex:

      1.  cat <<EOF | tee .htaccess
         AuthType Basic
         AuthName gitannex
         AuthUserFile /srv/http/annex/.htpasswd
         Require valid-user
         EOF
      2. htpasswd -bc .htpasswd user4 password
  2. Download the password-protected repo

    1. If the test server is on the same machine: git config --global annex.security.allowed-ip-addresses all

    2. Download the repo without any password helper: 🫤🫤🫤

      1. git clone http://localhost/.git annex-test; cd annex-test; this will prompt for the password set above, e.g.

         $ git clone http://localhost/.git annex-test
         Cloning into 'annex-test'...
         Username for 'http://localhost': user4
         Password for 'http://user4@localhost':
        
      2. git annex get; this will prompt for the password twice: once for the implicit git annex init (which needs to read the remote .git/config) and once for downloading large.bin.

      Running pstree while the prompts are waiting, or using git config annex.debug true, reveals that the prompts are coming from git credential fill.

    3. Drop the annoying redundant password prompts using git-credential-store(1): ✔️✔️✔️

      1. cd $(mktemp -d)
      2. git config --global credential.helper store
      3. git clone http://localhost/.git annex-test; cd annex-test; this will prompt for the password
      4. git annex get; but this will not prompt for any passwords

      This works. So that's awesome, I can use credential.helper store to make my passworded downloads non-interactive by filling in ~/.git-credentials, which, for the record, has one credential per line in this format:

       $ cat ~/.git-credentials
       http://user4:password@localhost
      

      or if a non-standard port is involved:

       $ cat ~/.git-credentials
       http://user4:password@localhost%3a8080
      
      5. (Undo: git config --global --unset credential.helper to avoid contaminating the next test)
    4. Attempt to drop the redundant password prompts using annex.web-options: ❌❌❌

      1. cd $(mktemp -d)
      2. git config --global annex.web-options --netrc
      3. Set up a ~/.netrc:

         cat <<EOF | tee -a ~/.netrc
         machine localhost
         login user4
         password password
         EOF
      4. (optional) verify it works as expected directly through curl:

         $ curl --no-progress-meter -f -o /dev/null http://localhost/.git; echo $?         # fails
         curl: (22) The requested URL returned error: 401
         22
         $ curl --netrc --no-progress-meter -f -o /dev/null http://localhost/.git; echo $? # works
         0
        
      5. git clone http://localhost/.git annex-test; cd annex-test; this will not prompt for a password because git picks up ~/.netrc automatically.
      6. git annex get; this will prompt for passwords, n+1 times in fact, where n is the number of annexed files

I don't understand why this isn't working. The docs say

Setting this option makes git-annex use curl, but only when annex.security.allowed-ip-addresses is configured in a specific way.

and I set allowed-ip-addresses in the specific way, so why is this no bueno?

I've searched the wiki and all I've found is:

  • https://git-annex.branchable.com/news/security_fix_release/
  • https://git-annex.branchable.com/devblog/day_494__url_download_changes/
  • https://git-annex.branchable.com/forum/Use_addurl_with_a_file_on_an_HPC_cluster/

From these, I understand I need to git config --global annex.security.allowed-ip-addresses all, which I did. Beyond that, my best guess is that web-options only works when using the web as a special remote with addurl. Here I'm using the web as a regular remote, which git-annex supports, but this corner case doesn't seem to work.

I can work around it by rewriting the contents of ~/.netrc into ~/.git-credentials and setting git config --global credential.helper store, but I don't want to duplicate the credentials every time I'm in this situation.
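
For reference, that workaround could be scripted roughly as follows. This is only a sketch: netrc_to_credentials is a hypothetical helper name, and it assumes each netrc keyword sits on the same line as its value, that default entries can be skipped, and that the login and password contain nothing that needs percent-encoding.

```shell
#!/bin/sh
# Sketch: turn netrc entries into git-credential-store lines.
# Assumptions: keyword and value share a line, "default" entries are
# skipped, and logins/passwords need no percent-encoding.

# netrc_to_credentials SCHEME   (reads netrc text on stdin)
netrc_to_credentials() {
    awk -v scheme="${1:-http}" '
        # Print the entry collected so far, then reset for the next machine.
        function emit() {
            if (host != "" && user != "" && pass != "")
                printf "%s://%s:%s@%s\n", scheme, user, pass, host
            host = ""; user = ""; pass = ""
        }
        {
            for (i = 1; i <= NF; i++) {
                if ($i == "machine")       { emit(); host = $(++i) }
                else if ($i == "login")    { user = $(++i) }
                else if ($i == "password") { pass = $(++i) }
            }
        }
        END { emit() }
    '
}

# Demo on the same entry used in the steps above.
netrc_to_credentials http <<'EOF'
machine localhost
login user4
password password
EOF
# prints: http://user4:password@localhost
```

Running netrc_to_credentials http < ~/.netrc >> ~/.git-credentials and then git config --global credential.helper store would reproduce the workaround, but it still leaves two copies of every credential, which is exactly what I'd like to avoid.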

What version of git-annex are you using? On what operating system?

git-annex 10.20220504-g4e4c44ed8 on ArchLinux, and git-annex 8.20210223 on Ubuntu 22.04.

Please provide any additional information below.

[kousu@nigiri tmp.ztnHTYA3ZC]$ cd $(mktemp -d)
[kousu@nigiri tmp.H5EkrNMUPc]$ git config --global annex.security.allowed-ip-addresses all
[kousu@nigiri tmp.H5EkrNMUPc]$ git config --global annex.web-options --netrc
[kousu@nigiri tmp.H5EkrNMUPc]$ cat <<EOF | tee ~/.netrc
machine localhost
login user4
password password
EOF
machine localhost
login user4
password password
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri tmp.H5EkrNMUPc]$ # demonstrate that curl respects --netrc and behaves as expected:
[kousu@nigiri tmp.H5EkrNMUPc]$ curl -v -o /dev/null -f --no-progress-meter http://localhost:80
*   Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET / HTTP/1.1
> Host: localhost
> User-Agent: curl/7.84.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 401 Unauthorized
< Date: Thu, 08 Sep 2022 00:49:44 GMT
< Server: Apache/2.4.54 (Unix)
< WWW-Authenticate: Basic realm="gitannex"
< Vary: accept-language,accept-charset
< Accept-Ranges: bytes
< Transfer-Encoding: chunked
< Content-Type: text/html; charset=utf-8
< Content-Language: en
* The requested URL returned error: 401
* Closing connection 0
curl: (22) The requested URL returned error: 401
[kousu@nigiri tmp.H5EkrNMUPc]$ curl --netrc -v -o /dev/null -f --no-progress-meter http://localhost:80
*   Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
* Server auth using Basic with user 'user4'
> GET / HTTP/1.1
> Host: localhost
> Authorization: Basic dXNlcjQ6cGFzc3dvcmQ=
> User-Agent: curl/7.84.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Thu, 08 Sep 2022 00:49:41 GMT
< Server: Apache/2.4.54 (Unix)
< Content-Length: 693
< Content-Type: text/html;charset=ISO-8859-1
<
{ [693 bytes data]
* Connection #0 to host localhost left intact
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri tmp.H5EkrNMUPc]$ # demonstrate git respects .netrc:
[kousu@nigiri tmp.H5EkrNMUPc]$ git clone http://localhost/.git annex-test
Cloning into 'annex-test'...
[kousu@nigiri tmp.H5EkrNMUPc]$ cd annex-test/
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri tmp.H5EkrNMUPc]$ # demonstrate that git-annex *does not* respect .netrc
[kousu@nigiri annex-test]$ git annex get
Username for 'http://localhost': ^C
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri tmp.H5EkrNMUPc]$
[kousu@nigiri annex-test]$ git annex version
git-annex version: 10.20220504-g4e4c44ed8
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.30 DAV-1.3.4 feed-1.3.2.1 ghc-9.0.2 http-client-0.7.11 persistent-sqlite-2.13.0.3 torrent-10000.1.1 uuid-1.3.15 yesod-1.6.2
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: linux x86_64
supported repository versions: 8 9 10
upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10
local repository version: 8
[kousu@nigiri annex-test]$ cat /etc/os-release
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"
LOGO=archlinux-logo
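
The Authorization header in the successful curl --netrc run above is plain HTTP Basic auth: the base64 encoding of login:password taken from ~/.netrc. A small sketch to confirm that curl really did read the file (basic_auth_header is a hypothetical helper name, and it assumes a base64 command such as the coreutils one is installed):

```shell
#!/bin/sh
# basic_auth_header LOGIN PASSWORD
# Builds the header a client sends for HTTP Basic auth.
# Assumption: the credentials are short enough that base64 does not wrap
# its output across lines.
basic_auth_header() {
    printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

basic_auth_header user4 password
# prints: Authorization: Basic dXNlcjQ6cGFzc3dvcmQ=
```

The output matches the header in the curl -v trace, so the credentials in ~/.netrc are the ones being sent.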

With the older git-annex, I set up a proxy so I could reuse the same server, which changed the port, but otherwise everything else is the same:

$ ssh -R 8080:localhost:80 joplin
p115628@joplin:~$ cd $(mktemp -d)
p115628@joplin:/tmp/tmp.glF9EdYhnR$ git config --global annex.security.allowed-ip-addresses all
p115628@joplin:/tmp/tmp.glF9EdYhnR$ git config --global annex.web-options "--netrc"
p115628@joplin:/tmp/tmp.glF9EdYhnR$ git clone http://localhost:8080/.git annex-test   # verify it's password protected
Cloning into 'annex-test'...
Username for 'http://localhost:8080': ^C
p115628@joplin:/tmp/tmp.glF9EdYhnR$ cat <<EOF | tee ~/.netrc
machine localhost
login user4
password password
EOF
machine localhost
login user4
password password
p115628@joplin:/tmp/tmp.glF9EdYhnR$ git clone http://localhost:8080/.git annex-test  # verify the .netrc file works with git
Cloning into 'annex-test'...
p115628@joplin:/tmp/tmp.glF9EdYhnR$ cd annex-test/
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$ git annex get                         # but does not work with git-annex
(merging origin/git-annex into git-annex...)
(recording state in git...)
(scanning for unlocked files...)
Username for 'http://localhost:8080': ^C
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$ git annex version
git-annex version: 8.20210223
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.26 DAV-1.3.4 feed-1.3.0.1 ghc-8.8.4 http-client-0.6.4.1 persistent-sqlite-2.10.6.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.1.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: linux x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
local repository version: 8
p115628@joplin:/tmp/tmp.glF9EdYhnR/annex-test$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
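
In this second setup the server is reached on port 8080. The ~/.netrc entry carries no port, but a ~/.git-credentials line does, so the credential.helper store workaround needs the %3a form shown earlier. A minimal sketch for generating such a line in CI (credential_line is a hypothetical helper name, and it assumes the user name and password need no percent-encoding):

```shell
#!/bin/sh
# credential_line SCHEME USER PASSWORD HOST [PORT]
# Prints one ~/.git-credentials line; the ":" before the port is written
# as %3a, matching the example earlier in this report.
# Assumption: USER and PASSWORD contain no characters needing percent-encoding.
credential_line() {
    if [ -n "${5:-}" ]; then
        printf '%s://%s:%s@%s%%3a%s\n' "$1" "$2" "$3" "$4" "$5"
    else
        printf '%s://%s:%s@%s\n' "$1" "$2" "$3" "$4"
    fi
}

credential_line http user4 password localhost 8080
# prints: http://user4:password@localhost%3a8080
```

Appending that output to ~/.git-credentials and enabling credential.helper store should make the clone above non-interactive, in the same way as on the Arch machine.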

Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)

Sure! Lots! We use it to share a large open access dataset at https://github.com/spine-generic, and I'm working on helping other researchers share their datasets on their own infrastructure using git-annex + gitea.

done