Recent comments posted to this site:

Reigniting interest in this topic and linking to related efforts (BABS etc.)

I just saw support for git sparse-checkout merged in BABS, and frankly I never knew about or used it before! Inspired by an enthusiastic Meng, who (having made a strategic mistake for her PhD progress) pioneered the use of git worktrees in DataLad after attending Distribits 2025, I thought to check whether git-annex supports sparse-checkout.

In conjunction with sparse-checkout, the (already existing) support for worktrees in git-annex could make a perfect "couple" for efficient ephemeral compute, where we check out only what is really needed, e.g. following the datalad run input/output specifications.

This is just a summary of the potential research/implementation, since maybe it all somehow magically works already, given that BABS merged the sparse-checkout support and they already use git-annex extensively...?
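On the plain git side, the combination described above might look like the sketch below. It uses only stock git commands on a throwaway repository. The directory names (`inputs`, `outputs`, `code`) and the branch name `job1` are made up for illustration. Whether git-annex behaves well inside such a sparse worktree is exactly the open question here, so no git-annex commands are shown.

```shell
set -eu

# Throwaway repository with a layout invented for this sketch.
work=$(mktemp -d)
git init -q "$work/origin"
mkdir -p "$work/origin/inputs" "$work/origin/outputs" "$work/origin/code"
echo "raw data" > "$work/origin/inputs/a.txt"
echo "old result" > "$work/origin/outputs/b.txt"
echo "echo run" > "$work/origin/code/run.sh"
git -C "$work/origin" add .
git -C "$work/origin" -c user.name=Example -c user.email=example@example.com \
    commit -q -m "initial layout"

# Create a worktree for one compute job without populating any files yet.
git -C "$work/origin" worktree add -q --no-checkout -b job1 "$work/job1"

# Restrict the worktree to the paths the job needs, then populate it.
git -C "$work/job1" sparse-checkout set inputs code
git -C "$work/job1" checkout -q job1

# Only the selected directories should now exist in the job worktree.
ls "$work/job1"
```

With datalad run, the list passed to `git sparse-checkout set` would presumably be derived from the recorded input/output specifications; that mapping is not sketched here.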

Comment by yarikoptic
comment 8

Ok, I've tagged the todos about import support from rsync, and hopefully that will be able to get implemented.

As for this bug, it seems that at least documentation improvements are needed in order to close it. I have also fixed the adb special remote to avoid the behavior, which leaves webdav and any external special remotes that might have the behavior.

Comment by joey
comment 3

git-annex allows you to have any number of remotes pointing at the same git repository. It is able to tell it's the same git repository, so you don't need anything like sameas in this case.

All you need is a git remote with a url pointing at the host on the local network, with remote.<name>.annex-cost for that remote set lower than the cost of the other remote. Then git-annex will try it first.
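A minimal sketch of that setup, run in a scratch repository so it can be tried safely. The remote names, hostnames, paths and cost values are all assumptions picked for illustration; only the `remote.<name>.annex-cost` setting itself comes from git-annex.

```shell
set -eu

# Scratch repository so the commands can be tried without touching real data.
repo=$(mktemp -d)
git init -q "$repo"

# Two git remotes for the same repository (both URLs are hypothetical).
git -C "$repo" remote add nas ssh://nas.example.com/srv/annex/repo
git -C "$repo" remote add nas-lan ssh://192.168.1.10/srv/annex/repo

# Give the LAN route the lower cost so it is tried first.
git -C "$repo" config remote.nas.annex-cost 200
git -C "$repo" config remote.nas-lan.annex-cost 100

# Show the resulting cost settings.
git -C "$repo" config --get-regexp 'remote\..*\.annex-cost'
```

Setting both costs explicitly keeps the sketch from depending on whatever defaults apply to each remote.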

Of course a dynamic ssh config is a fine way to do it too..

Comment by joey
comment 1

I'm not sure how it could be a bug in git-annex that it uses any and all rsync options it might choose to use.

In any case, "rsync error: unexpected end of file" kind of looks like openrsync is having difficulty communicating with the remote rsync server, and not like an un-implemented option.

Searching the web for that error message finds plenty of other openrsync users who are not using git-annex and have similar problems with it.

I think my suggestion has to be to install the real rsync somewhere in PATH before openrsync, so git-annex will use it.
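One way to do that, as a rough sketch: prepend the directory holding the real rsync to PATH in the shell that launches git-annex. The prefix `/opt/homebrew/bin` is only an assumption (a common Homebrew location); substitute wherever rsync was actually installed.

```shell
# Assumption: the real rsync lives under this prefix; adjust as needed.
RSYNC_PREFIX="${RSYNC_PREFIX:-/opt/homebrew/bin}"

# Put that directory first so it is found before the system's openrsync.
export PATH="$RSYNC_PREFIX:$PATH"

# Check which rsync now wins the PATH lookup, and what it reports itself as.
command -v rsync || echo "no rsync found on PATH"
rsync --version 2>/dev/null | head -n 1
```

To make this stick, the `export` line would go in the shell startup file used by whatever runs git-annex.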

Comment by joey
comment 1

The git-annex branch is automatically merged by git-annex; it doesn't matter whether it has unrelated histories or not. The merge will always succeed, without conflicts.

All you need to do is pull the git-annex branch from a remote, and run git-annex merge.

If for some reason you need to manually merge the git-annex branches, yes all it takes is a simple union merge where on conflict you concatenate both versions of the conflicted file together.
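For the manual case, a union merge of a single conflicted file can be illustrated with stock `git merge-file --union`. This is only a sketch of the "concatenate both versions" idea: the file names and log lines below are invented, and it is not a description of how git-annex performs its own merge.

```shell
set -eu

tmp=$(mktemp -d)
# Made-up contents standing in for one conflicted log file on each side.
printf '1700000000s 1 11111111-aaaa\n' > "$tmp/ours"
printf '1700000500s 1 22222222-bbbb\n' > "$tmp/theirs"
: > "$tmp/base"   # empty common ancestor, as with unrelated histories

# --union resolves conflicts by keeping the lines from both sides;
# -p prints the result instead of overwriting "$tmp/ours".
git merge-file --union -p "$tmp/ours" "$tmp/base" "$tmp/theirs" > "$tmp/merged"
cat "$tmp/merged"
```

The merged file ends up containing the lines from both versions, with no conflict markers.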

Comment by joey
comment 1

The LLM-generated text is incorrect. It is conflating two different commits which both modified the same code.

87e0b77a0435522dce7be8ebec77a1326f2ede20 was the push-to-create commit, and it only involves cases after the repoCheap case is handled.

In the meantime, we have f79f7d322bc0278e4edba13c3b57093753fded6c, which explicitly involves annex-ignore and local git remotes and intentionally makes the config be read.

I have to say that, once again, I find this kind of LLM-generated text counterproductive. My work on git-annex is necessarily detail-oriented, and needing to deal with something that subtly gets details wrong by construction is not helpful.

Comment by joey
comment 2

I may have actually come up with a solution. Instead of creating a second remote, I was able to make my ~/.ssh/config dynamic based on the results of a dig command: https://fmartingr.com/blog/2022/08/12/using-ssh-config-match-to-connect-to-a-host-using-multiple-ip-or-hostnames/
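For readers who want the gist without following the link, a configuration along these lines might be what is meant. This is a guess at the general shape, not a copy of the linked post or of the actual config: the host alias `nas` and the names `nas.lan` and `nas.example.com` are hypothetical, and the fragment is written to a scratch file rather than to ~/.ssh/config.

```shell
set -eu

# Written to a scratch file here; in real use this would go in ~/.ssh/config.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# Applies only when the exec command succeeds, i.e. dig resolves the LAN name.
Match originalhost nas exec "dig +short nas.lan | grep -q ."
    HostName nas.lan

# Fallback used everywhere else (ssh keeps the first value it obtains).
Host nas
    HostName nas.example.com
EOF

cat "$cfg"
```

Running `ssh -G -F "$cfg" nas` prints the options ssh would use, which is a way to check which HostName gets picked on a given network.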

Thanks for going with me on this journey!

Comment by xentac
comment 1

Trying to set up and test this, I just realized these aren't special remotes. They're proper git (git-annex) remotes on my nas. I'm trying to figure out what setting I need to change to mark them as sameas and if cost will work in that case as well.

Comment by xentac
comment 3

After a bug fix, it's now possible to make a sameas remote that is private to the local repository.

git-annex initremote bar --sameas=foo --private type=...

While not ephemeral as such, if you git remote remove bar, the only trace left of it will probably be in .git/annex/journal-private/remote.log, and possibly any creds that got cached for it. It would be possible to have a command that removes the remote, and also clears that.

If that is close enough to ephemeral, then we could think about the second part, extending the external special remote protocol with REDIRECT-REMOTE.

That is similar to Special remote redirect to URL. And a few comments over there go in a similar direction. In particular, the discussion of CLAIMURL. If TRANSFER-RETRIEVE-URL and TRANSFER-CHECKPRESENT-URL supported CLAIMURL, then if the ephemeral special remote had some type of url that it claimed, those could be used rather than REDIRECT-REMOTE.

That would not cover TRANSFER STORE and REMOVE though. And it probably doesn't make sense to extend those to urls generally. (There are too many ways to store to an url or remove an url, everything isn't WebDAV..)

I don't know if it is really elegant to drag urls into this anyway. The user may be left making up an url scheme for something that does not involve urls at all.

Comment by joey