Recent comments posted to this site:

comment 6

Actually I have gone ahead and implemented some git-annex-matching-options that will be useful in finding content to drop from the trashbin: --presentsince, --lackingsince, and --changedsince

You might use, for example:

git-annex drop --force --from trashbin \
    --presentsince=trashbin:7d --and --not --changedsince=here:7d

That will match files that were moved to the trashbin 7 days ago, and that have not re-entered the current repository in the time since then.
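Before forcing a drop, it may be worth previewing what the expression matches. The following is only a sketch: it reuses the matching expression from the example as-is, and assumes that git-annex find accepts these new options in the same way it accepts other matching options. The function name preview_trash_candidates is made up for illustration, and "trashbin" is the remote name from the example.

```shell
# Sketch: list the files the drop example would act on, without dropping.
# Assumes a remote named "trashbin" and that the new matching options
# (--presentsince, --changedsince) behave as described in the comment above.
# preview_trash_candidates is a hypothetical helper name.
preview_trash_candidates() {
    # Any extra arguments (for example a path) are passed through to find.
    git-annex find \
        --presentsince=trashbin:7d --and --not --changedsince=here:7d "$@"
}
```

Running preview_trash_candidates inside the repository should print one matching file per line, and that list can be reviewed before running the drop.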

Comment by joey
comment 5

FWIW, the dynamically linked binary is no good either:

[yoh@dbic-mrinbox ~]$ wget https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-amd64.tar.gz
[yoh@dbic-mrinbox ~]$ tar -xzvf git-annex-standalone-amd64.tar.gz 
[yoh@dbic-mrinbox ~]$ cd git-annex.linux/
[yoh@dbic-mrinbox ~/git-annex.linux]$ ls
LICENSE         exe         git-annex       git-core        git-remote-tor-annex    lib         logo_16x16.png      templates
README          extra           git-annex-shell     git-receive-pack    git-shell       lib64           magic           trustedkeys.gpg
bin         gconvdir        git-annex-webapp    git-remote-annex    git-upload-pack     libdirs         runshell        usr
buildid         git         git-annex.MANIFEST  git-remote-p2p-annex    i18n            logo.svg        shimmed
[yoh@dbic-mrinbox ~/git-annex.linux]$ ./git-annex
ELF binary type "3" not known.
exec: /usr/home/yoh/git-annex.linux/exe/git-annex: Exec format error

I will try to assemble build commands later...

Comment by yarikoptic
comment 5

annex.trashbin is implemented.

I am going to close this todo; if it turns out there is some preferred content improvement that would help with cleaning out the trash, let's talk about that on a new todo. But I'm guessing you'll make do with find.

I think I would deliberately want this to be invisible to the user, since I wouldn't want anyone to actively start relying on it.

With a private remote it's reasonably invisible. The very observant user might notice a drop time that scales with the size of the file being dropped and be able to guess this feature is being used. And, if there is some error when it tries to move the object to the remote, the drop will fail. The error message in that case cannot really obscure the fact that annex.trashbin is configured.

Comment by joey
comment 4

I don't know much about the static-annex builds, but you may have better luck with the Linux standalone builds due to their using a more conventional libc.

Building git-annex from source is not hard if you can get the stack tool installed. It looks like the only currently supported way to do that as a freebsd user is to install https://www.haskell.org/ghcup/ which includes stack. Then follow the fromsource section on "building from source with stack".

Comment by joey
comment 4

IIRC user can just push git-annex branch directly after git-annex merging remote version locally, right?

Sure, but my point was that they would have to change their workflow due to a change on the server that might not be visible to them. Violating least surprise.

Comment by joey
comment 3

In that example, the git-annex branch is not pushed to origin after annexed files are sent to it. So how does git-annex on otherhost know that origin has those files? Well, git-annex-shell, when receiving the files, updates the git-annex branch in origin.

IIRC user can just push git-annex branch directly after git-annex merging remote version locally, right?

Making it read-only would somewhat limit the exposure to all these problems, but if it's read-only, how would any annex objects get into the remote repository in the first place?

My use case at hand: I manipulate a git-annex repo on a Linux box over an NFS mount, and the original one is on a FreeBSD box with a bare minimal installation. I have about 50 datasets in a hierarchy. I wanted to back up to another location, and it would be more performant to talk to the original FreeBSD server directly instead of going through the NFS mount. I can't install git-annex on that FreeBSD box ATM.

FWIW, on second thought, given that I do have a workaround with rsync (verified that it works), and unless another more prominent use case arrives, it might indeed not be worth the hassle.

Comment by yarikoptic
comment 3

I don't know much about FreeBSD, but static builds from https://git.kyleam.com/static-annex do not work:

[yoh@dbic-mrinbox ~/git-annex-10.20250828]$ bin/git-annex
ELF binary type "0" not known.
bash: bin/git-annex: cannot execute binary file: Exec format error
[yoh@dbic-mrinbox ~/git-annex-10.20250828]$ file bin/git-annex
bin/git-annex: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=a6f7f36778ade374ef6572c787cacf6ffa2ec78d, with debug_info, not stripped

Comment by yarikoptic
comment 2

Doesn't FreeBSD support emulating Linux syscalls? I suspect that the Linux standalone tarball could be used to install git-annex in user space on FreeBSD and work that way. I have not tried it. Or maybe there is a better way: installing a FreeBSD port as a regular user.

Comment by joey
comment 2

As for the idea that git-annex could access a remote without git-annex-shell, I think that any efforts in this area are bound to end up with some partial implementation of a quarter of git-annex-shell in shell script, which is bound to not work as well as the real thing.

Consider that this is a supported workflow:

git push origin master
git-annex copy --to origin

ssh otherhost
cd repo
git pull origin
git-annex get

In that example, the git-annex branch is not pushed to origin after annexed files are sent to it. So how does git-annex on otherhost know that origin has those files? Well, git-annex-shell, when receiving the files, updates the git-annex branch in origin.

So, to support this workflow, the git-annex-shell reimplementation in shell would need to update the git-annex branch. That's about 3000 lines of code in git-annex, with complications including concurrency, making it fast, etc.

Other complications include supporting different repository versions, populating unlocked files, supporting configs like annex.secure-erase-command, etc. And while any of these could be left out and be documented as limitations of not having git-annex installed, I think the real kicker is that this is behavior that would occur even if git-annex is only temporarily not installed. So there's the risk that any user who is having a bad PATH day suddenly gets weird behavior.

Making it read-only would somewhat limit the exposure to all these problems, but if it's read-only, how would any annex objects get into the remote repository in the first place?

Using a separate special remote seems much cleaner. Then it's only used if you choose to use it. And it works like any other special remote. The rsync special remote is close enough to work, but a more special-purpose one could support things a bit better.

Comment by joey
comment 1

It's actually possible to use an rsync special remote to fetch objects right out of .git/annex/objects/. For example:

git-annex initremote foo-rsync type=rsync encryption=none rsyncurl=example.com:/path/to/repo/.git/annex/objects/ --sameas=foo

Since the default hash directory paths are different for rsync than for a git-annex repository, getting an object will first try the wrong hash path, which does lead to rsync complaining to stderr. But then it will fall back to a hash path that works.
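To make the hash path mismatch a little more concrete, here is a rough sketch of how the lower-case hash directory can be computed by hand. It assumes that this layout is the first six hex digits of the MD5 of the key, split into two three-character directories, and that it is the layout the rsync special remote tries first, while a non-bare repository stores objects under the other, mixed-case layout. Both points are my understanding rather than something stated above. The function name hashdirlower is chosen for illustration, and md5sum is assumed to be available (on FreeBSD the equivalent tool is md5).

```shell
# Sketch: print the lower-case hash directory for a git-annex key.
# Assumption: the directory is the first 6 hex digits of md5(key), split 3/3.
# hashdirlower is a hypothetical helper name; md5sum must be installed.
hashdirlower() {
    # Hash the key string itself (not the file content), no trailing newline.
    digest=$(printf '%s' "$1" | md5sum | cut -c1-6)
    first=$(printf '%s' "$digest" | cut -c1-3)
    second=$(printf '%s' "$digest" | cut -c4-6)
    printf '%s/%s/\n' "$first" "$second"
}
```

Calling hashdirlower with a key prints a directory of the form xxx/yyy/. If git-annex is available somewhere, git-annex examinekey with --format='${hashdirlower}' or --format='${hashdirmixed}' should be the more reliable way to see both layouts for a key.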

Sending an object to the rsync special remote will store it in a hash path different from the one that git-annex usually uses. So later switching to using git-annex in that repository will result in some unusual behavior, since it won't see some files that were put there. git-annex fsck will actually recover from this too, eg:

fsck newfile (normalizing object location) (checksum...) ok

There are enough problems that I can't really recommend this, it just seemed worth pointing out that it can be done.

Comment by joey