Recent comments posted to this site:

How do I get GETGITREMOTENAME to work in INITREMOTE?

I am writing an external special remote using this protocol. It is a little like the directory remote: there is a path on the local system where content is stored.

I don't want this location to be saved in the git-annex branch, and I thought I would be able to use GETGITREMOTENAME to persist it myself. However, I'm running into an issue where GETGITREMOTENAME fails during INITREMOTE (presumably because the remote has not been created yet). It does work during PREPARE, but that feels a bit late to ask for a required piece of configuration.

What are my options? Ideally it would behave much like the directory= field of the directory remote, but I can hand-manage it too if that's the recommendation, as long as I get some identifier for this remote (there can be several of these in the same repo).
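For reference, here is a minimal Python sketch of the workaround described above, where the lookup is deferred to PREPARE. It is not a complete special remote: it only answers INITREMOTE and PREPARE, and the helper names `ask` and `serve` are made up for illustration. It assumes git-annex answers GETGITREMOTENAME with a VALUE line and treats any other reply as "name not available".

```python
import io
import sys


def ask(request, inp, out):
    """Send a request to git-annex and return the payload of its VALUE reply.

    Returns an empty string if the reply is anything other than VALUE
    (assumed here to mean the value is not available).
    """
    out.write(request + "\n")
    out.flush()
    reply = inp.readline().rstrip("\n")
    if reply.startswith("VALUE"):
        return reply[len("VALUE"):].strip()
    return ""


def serve(inp, out):
    """Answer INITREMOTE and PREPARE; look up the remote name only in PREPARE."""
    out.write("VERSION 1\n")
    out.flush()
    while True:
        line = inp.readline()
        if not line:
            break
        cmd = line.rstrip("\n")
        if cmd == "INITREMOTE":
            # The remote name is not asked for here, since the lookup
            # was observed to fail at this stage.
            out.write("INITREMOTE-SUCCESS\n")
        elif cmd == "PREPARE":
            name = ask("GETGITREMOTENAME", inp, out)
            if name:
                # A real remote would now load its locally stored path
                # using `name` as the lookup key.
                out.write("PREPARE-SUCCESS\n")
            else:
                out.write("PREPARE-FAILURE could not determine git remote name\n")
        else:
            out.write("UNSUPPORTED-REQUEST\n")
        out.flush()


if __name__ == "__main__":
    serve(sys.stdin, sys.stdout)
```

Keeping the streams as parameters makes it easy to exercise the exchange with canned input instead of a running git-annex.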

Comment by Katie
comment 7

The www-authenticate header is also sent when the request for /config is a 401. So git-annex can use that to set the wwwauth field.

The capability fields are indicating capabilities of git. I checked and git-credential-oauth does not rely on those capabilities.

(Wildly, git-credential-oauth is looking for "GitLab", "GitHub", and "Gitea" in order to sniff what backend it's authenticating to, and that's all it uses the wwwauth for.)

Comment by joey
comment 6

Forgejo-aneksajo also creates the repository for requests to /config, and will git-annex-init it if the request comes from a git-annex user agent and the user has write permissions.

Hmm, then git-annex pull will create a repository. Which is going further than "push to create".

I do think my idea in comment #2 would be better than how you implemented that. But it's also not directly relevant to this bug report.

I did open "support push to create".

Comment by joey
comment 5

git push seems to first make a GET request for something like /m.risse/test-push-oauth2.git/info/refs?service=git-receive-pack, which responds with a 401 and www-authenticate: Basic realm="Gitea" among the headers. Git then seems to pass this information on to the git-credential-helper.

git annex push likewise receives a 401 response from the /config endpoint with the same www-authenticate header, so it could pass it on to the credential helper too.
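As a rough illustration of what "passing it on" could look like, here is a small Python sketch that builds credential helper input from a URL and the www-authenticate values of a 401 response. The function name `credential_request` is hypothetical, and this is not how git or git-annex actually construct the request; it only mirrors the fields seen in the helper input.

```python
from urllib.parse import urlsplit


def credential_request(url, www_authenticate=()):
    """Build credential helper input for `url`.

    `www_authenticate` is a sequence of www-authenticate header values
    taken from the 401 response; each becomes one wwwauth[] line.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port:
        host += ":%d" % parts.port
    lines = ["protocol=" + parts.scheme, "host=" + host]
    for challenge in www_authenticate:
        lines.append("wwwauth[]=" + challenge)
    # A blank line terminates the list of attributes.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(credential_request(
        "https://atris.fz-juelich.de/m.risse/test-push-oauth2.git/config",
        ['Basic realm="Gitea"'],
    ), end="")
```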

I am not sure where the capabilities are coming from...

Comment by matrss
comment 4

The chicken-and-egg problem you are describing is actually something msz has already encountered and reported, but that issue is fixed: Forgejo-aneksajo also creates the repository for requests to /config, and will git-annex-init it if the request comes from a git-annex user agent and the user has write permissions. More about that here:

So that's not it... I've investigated a bit and I think I led you astray with the comment about a "non-existing repository". I am also seeing the issue with a pre-created repository, and even with a pre-created and git-annex-init'ialized repository.

The issue is actually that for ATRIS I rely on git-credential-oauth's "Gitea-like-Server" discovery here: https://github.com/hickford/git-credential-oauth/blob/f01271d94c70b9280c19f489f90c05e9aba0d757/main.go#L206

When doing a git push origin main, the git-credential-oauth helper actually receives this request:

$ git push origin main
capability[]=authtype
capability[]=state
protocol=https
host=atris.fz-juelich.de
wwwauth[]=Basic realm="Gitea"

while with git annex push it is just this:

$ git annex push
protocol=https
host=atris.fz-juelich.de

Git-credential-oauth recognizes that it is talking to a Gitea/Forgejo server based on this wwwauth[]=Basic realm="Gitea" data. Without it, and in the absence of a more specific configuration for the server, it doesn't try to handle the request and falls back to git's standard http credential handling. I am not sure where these capability and wwwauth fields are coming from, but I think git-annex should somehow do the same as git here...
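To make the difference between the two requests concrete, here is a small Python sketch that parses helper input like the above and looks for a known server name in the wwwauth values. It is only an approximation of the behaviour described in this thread, not git-credential-oauth's actual code (which is written in Go); the function names are made up, and the sketch ignores finer points of the credential protocol such as resetting a list with an empty value.

```python
def parse_credential_request(text):
    """Parse key=value lines; keys ending in [] are collected into lists."""
    fields = {}
    for line in text.splitlines():
        if not line:
            break  # a blank line ends the request
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        if key.endswith("[]"):
            fields.setdefault(key[:-2], []).append(value)
        else:
            fields[key] = value
    return fields


def sniff_backend(fields):
    """Return the first known server name found in a wwwauth value, else None."""
    for header in fields.get("wwwauth", []):
        for name in ("GitLab", "GitHub", "Gitea"):
            if name in header:
                return name
    return None


if __name__ == "__main__":
    from_git = (
        "capability[]=authtype\ncapability[]=state\n"
        "protocol=https\nhost=atris.fz-juelich.de\n"
        'wwwauth[]=Basic realm="Gitea"\n'
    )
    from_git_annex = "protocol=https\nhost=atris.fz-juelich.de\n"
    print(sniff_backend(parse_credential_request(from_git)))
    print(sniff_backend(parse_credential_request(from_git_annex)))
```

With the request from git push the server type can be guessed, while the request from git annex push gives the helper nothing to go on.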


I've gotten at the data git sends to the credential helper with this trivial script:

$ cat ~/bin/git-credential-echo 
#!/usr/bin/env bash

exec cat >&2

and configuring it as my credential helper.

I have to say, I like this pattern of processes communicating over simple line-based protocols :)

Comment by matrss
comment 3

Looks like the 401 Unauthorized happens for all non-existent repos when accessing /config.

E.g.:

joey@darkstar:~>curl https://atris.fz-juelich.de/m.risse/joeytestmadeup.git
Not found.
joey@darkstar:~>curl https://atris.fz-juelich.de/m.risse/joeytestmadeup.git/config
Unauthorized

A bug in Forgejo?

Comment by joey
comment 2

If the server sent back 404 for the /config hit, then the early UUID discovery would not prompt with git credential.

Then, to make "push to create" work smoothly, git-annex push, after pushing the git branches, could regenerate the remote list. So if the branch push created the git repo, any annex uuid that the new repo has would be discovered at that point.

The remote list regeneration would only need to be done when there are git remotes that don't have a UUID yet.
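The check for "git remotes that don't have a UUID yet" could look roughly like the following Python sketch. It is only an illustration of the idea, not git-annex code: the function name is made up, and it assumes the input is the output of `git config --get-regexp '^remote\.'`, with the UUID cached in `remote.<name>.annex-uuid`.

```python
def remotes_missing_uuid(config_lines):
    """Return names of remotes that have a url but no annex-uuid set.

    Each line is "key value", as printed by `git config --get-regexp`.
    """
    with_url, with_uuid = set(), set()
    for line in config_lines:
        key, _, value = line.partition(" ")
        if not key.startswith("remote."):
            continue
        # Remote names may contain dots, so split off the last component.
        name, _, setting = key[len("remote."):].rpartition(".")
        if setting == "url":
            with_url.add(name)
        elif setting == "annex-uuid" and value:
            with_uuid.add(name)
    return sorted(with_url - with_uuid)


if __name__ == "__main__":
    example = [
        "remote.origin.url https://example.com/repo.git",
        "remote.origin.fetch +refs/heads/*:refs/remotes/origin/*",
        "remote.backup.url /mnt/backup/repo.git",
        "remote.backup.annex-uuid 00000000-0000-0000-0000-000000000001",
    ]
    # Regeneration would only be needed if this list is non-empty.
    print(remotes_missing_uuid(example))
```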

The assistant would also need to be made to do that.

Comment by joey
comment 11

Replicated this problem as follows:

  1. modified importKeys to fail at the end
  2. set up a directory special remote with importtree=yes
  3. git config annex.largefiles nothing
  4. run git-annex import, which fails
  5. that left git-annex branch changes in the journal, for GIT keys
  6. git-annex sync back to origin
  7. return importKeys to usual behavior
  8. make new clone from origin
  9. run git-annex import in the new clone
  10. merge the imported branch into master

result:

error: unable to read sha1 file of 1 (d00491fd7e5bb6fa28c517a0bb32b8b506539d4d)
error: unable to read sha1 file of 2 (5716ca5987cbf97d6bb54920bea6adde242d87e6)
error: unable to read sha1 file of 3 (aab959616afa9408f5efc385eb98f63fdb990ba5)

Verified that 69e6c4d024dcff7c2f8ea1a2ed3b483a86b2cc7d does in fact avoid this problem. Running steps 9 and 10 with that commit results in a non-broken repository.

Yay, solved!

Comment by joey
comment 10

I think that a previous, failed import from the remote, run in a different clone of the repository than the import that later fails, could have caused the problem.

My thinking is, while import is downloading files, the content identifiers get recorded in the git-annex branch. Only once the import is complete does the imported tree get grafted into the git-annex branch. So, if the import fails (or is interrupted), this can leave content identifiers in the log. The git blobs for small files have already been stored in git, but no tree references them. If that git-annex branch gets pushed, then in a separate clone of the repository, running the import again would see those content identifiers. But the git blobs referenced by them would not have been pushed, and so would not be available.

We already know that the import was failing due to the S3 permissions, so the only other thing that would have been needed is for the git-annex branch to be pushed to origin, and then this same import tried later in a different clone.

@yarikoptic does this seem like a plausible explanation of what happened?

Comment by joey