Recent comments posted to this site:

comment 4

yes -- that one is embargoed (can be seen by going to https://dandiarchive.org/dandiset/000675)

And when you replicated the problem from the backup, were you using it in the configuration where it cannot access those?

If I understood the question correctly: I do not recall for certain, but judging from my use of ( source .git/secrets.env; git-annex import master... I think I had credentials that allowed access to them (hence no errors while importing).

Do you have annex.largefiles configured in this repository, and are all of the affected files non-annexed files?

yes

(venv-annex) dandi@drogon:/mnt/backup/dandi/dandiset-manifests$ grep largefiles .gitattributes
**/.git* annex.largefiles=nothing
* annex.largefiles=((mimeencoding=binary)and(largerthan=0))

and it seems all files go into git, since the output of

(venv-annex) dandi@drogon:/mnt/backup/dandi/dandiset-manifests$ git annex list
here          
|s3-dandiarchive (untrusted)
||web         
|||bittorrent 
||||          

is empty (no annexed files are listed).

Comment by yarikoptic
comment 3

yes -- small files, go to git

No, only a small number of files is created/renamed. In this case it is a set of 4 files that are pre-created empty and closed. Then 3 of the 4 are opened for writing by duct and closed at the end of the process, while the remaining one (_info.json) is reopened for writing to dump the record and then closed. The outside tool that ran the process then renames all of them to filenames that include the end timestamp. git-annex manages to detect that the original 0-sized _info.json is removed, but does not pick up the new one, which is rapidly renamed to the longer name.
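The sequence described above can be sketched roughly as follows. This is only an illustration, not the actual duct or ReproStim code: the scratch directory, the placeholder file contents, and the assumption that the four files are the ones with the suffixes seen in the log are all hypothetical.

```shell
#!/bin/sh
# Sketch of the create/write/rename sequence; all names are illustrative.
set -eu

workdir=$(mktemp -d)             # hypothetical scratch dir standing in for Videos/2025/12
start="2025.12.16-09.30.29.570"  # start timestamp, borrowed from the log
end="2025.12.16-09.44.28.225"    # end timestamp, borrowed from the log
prefix="$workdir/${start}--"

# 1. Pre-create the four files empty and close them.
for suffix in .mkv .mkv.log .mkv.duct_usage.json .mkv.duct_info.json; do
    : > "${prefix}${suffix}"
done

# 2. Three of the four are written to while the process runs.
for suffix in .mkv .mkv.log .mkv.duct_usage.json; do
    echo "data" >> "${prefix}${suffix}"
done

# 3. At the end, the info file is reopened, the record is dumped, and it is closed.
echo '{"record": "placeholder"}' > "${prefix}.mkv.duct_info.json"

# 4. The outer tool immediately renames everything to include the end timestamp.
for suffix in .mkv .mkv.log .mkv.duct_usage.json .mkv.duct_info.json; do
    mv "${prefix}${suffix}" "${prefix}${end}${suffix}"
done

ls "$workdir"
```

Running something like this inside a directory watched by the git-annex daemon might help reproduce the missed rename, though the exact timing likely matters.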

In git log it looks like:

commit 65e9f13a882ef78d743fbe634c8e05f9dcb32c45
Author: ReproStim User <changeme@example.com>
Date:   Tue Dec 16 09:44:30 2025 -0500

    git-annex in reprostim@reproiner:/data/reprostim

 Videos/2025/12/2025.12.16-09.30.29.570--.mkv.duct_info.json                         | 0
 Videos/2025/12/2025.12.16-09.30.29.570--2025.12.16-09.44.28.225.mkv                 | 1 +
 Videos/2025/12/2025.12.16-09.30.29.570--2025.12.16-09.44.28.225.mkv.duct_usage.json | 1 +
 Videos/2025/12/2025.12.16-09.30.29.570--2025.12.16-09.44.28.225.mkv.log             | 1 +
 4 files changed, 3 insertions(+)

commit 3fe4710fc058e7d1433637c9af538b3bb9e5ebed
Author: ReproStim User <changeme@example.com>
Date:   Tue Dec 16 09:30:31 2025 -0500

    git-annex in reprostim@reproiner:/data/reprostim

 Videos/2025/12/2025.12.16-09.30.29.570--.mkv.duct_info.json | 0
 1 file changed, 0 insertions(+), 0 deletions(-)

commit f6bb6137c81ef36387ded229a4d8592964530bc8
Author: ReproStim User <changeme@example.com>
Date:   Tue Dec 16 09:30:23 2025 -0500

    git-annex in reprostim@reproiner:/data/reprostim

 Videos/2025/12/2025.12.16-09.29.32.681--.mkv.duct_info.json                         | 0
 Videos/2025/12/2025.12.16-09.29.32.681--2025.12.16-09.30.21.889.mkv                 | 1 +
 Videos/2025/12/2025.12.16-09.29.32.681--2025.12.16-09.30.21.889.mkv.duct_usage.json | 1 +
 Videos/2025/12/2025.12.16-09.29.32.681--2025.12.16-09.30.21.889.mkv.log             | 1 +
 4 files changed, 3 insertions(+)

commit 00444920167e17b429d10fa29df8f1947930152c
Author: ReproStim User <changeme@example.com>
Date:   Tue Dec 16 09:29:34 2025 -0500

    git-annex in reprostim@reproiner:/data/reprostim

 Videos/2025/12/2025.12.16-09.29.32.681--.mkv.duct_info.json | 0
 1 file changed, 0 insertions(+), 0 deletions(-)

Here is a copy of the log from the current process: https://www.oneukrainian.com/tmp/daemon-20251216.log

Comment by yarikoptic
passing additional flags to rclone

I'm trying to pass additional flags to rclone, like --bwlimit for example. Not sure how to do that, though. The --whatelse flag tells me they should just be passed by default:

> git annex initremote hetzner type=rclone rcloneremotename=hetzner rcloneprefix=someprefix  encryption=shared chunk=500MiB  --whatelse
embedcreds
    embed credentials into git repository
    (yes or no)
onlyencryptcreds
    only encrypt embedded credentials, not annexed files
    (yes or no)
mac
    how to encrypt filenames used on the remote
    (HMACSHA1 or HMACSHA224 or HMACSHA256 or HMACSHA384 or HMACSHA512)
keyid
    gpg key id
keyid+
    add additional gpg key
keyid-
    remove gpg key
*
    all other parameters are passed to rclone

I tried --bwlimit 3000 and bwlimit=3000, but these give me "invalid option" plus the help text, or "git-annex: Unexpected parameters: bwlimit", respectively.

Comment by nadir
comment 1

This is not a bug. While it could be moved to todo, anyone can write an external special remote to use this or any other storage system.

So I am closing this bug report.

Comment by joey
comment 6

Actually I have gone ahead and implemented some git-annex-matching-options that will be useful in finding content to drop from the trashbin: --presentsince --lackingsince --changedsince

You might use, for example:

git-annex drop --force --from trashbin \
    --presentsince=trashbin:7d --and --not --changedsince=here:7d

That will match files that were moved to the trashbin 7 days ago, and that have not re-entered the current repository in the time since then.
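For running this periodically, the command above could be wrapped in a small script with the retention period as a parameter. This is only a sketch: the trash_cleanup function name, the DRY_RUN variable, and the default of 7 days are assumptions made for illustration. The remote name trashbin and the matching options are taken from the example above.

```shell
#!/bin/sh
# Sketch: parameterize the retention period of the drop command above.
# trash_cleanup and DRY_RUN are hypothetical names, not part of git-annex.
set -eu

trash_cleanup() {
    days="${1:-7}"   # retention period in days (assumed default)
    # Rebuild the argument list as the full command line.
    set -- git-annex drop --force --from trashbin \
        "--presentsince=trashbin:${days}d" --and --not "--changedsince=here:${days}d"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"    # dry run (the default here): only print the command
    else
        "$@"         # actually run git-annex
    fi
}

trash_cleanup 7
```

With DRY_RUN left at its default, the script only prints the command it would run, so the period can be checked before anything is dropped.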

Comment by joey
comment 5

FWIW, the dynamically linked binary is no good either:

[yoh@dbic-mrinbox ~]$ wget https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-amd64.tar.gz
[yoh@dbic-mrinbox ~]$ tar -xzvf git-annex-standalone-amd64.tar.gz 
[yoh@dbic-mrinbox ~]$ cd git-annex.linux/
[yoh@dbic-mrinbox ~/git-annex.linux]$ ls
LICENSE         exe         git-annex       git-core        git-remote-tor-annex    lib         logo_16x16.png      templates
README          extra           git-annex-shell     git-receive-pack    git-shell       lib64           magic           trustedkeys.gpg
bin         gconvdir        git-annex-webapp    git-remote-annex    git-upload-pack     libdirs         runshell        usr
buildid         git         git-annex.MANIFEST  git-remote-p2p-annex    i18n            logo.svg        shimmed
[yoh@dbic-mrinbox ~/git-annex.linux]$ ./git-annex
ELF binary type "3" not known.
exec: /usr/home/yoh/git-annex.linux/exe/git-annex: Exec format error

I will try to assemble build commands later...

Comment by yarikoptic
comment 5

annex.trashbin is implemented.

I am going to close this todo; if it turns out there is some preferred content improvement that would help with cleaning out the trash, let's talk about that on a new todo. But I'm guessing you'll make do with find.

I think I would deliberately want this to be invisible to the user, since I wouldn't want anyone to actively start relying on it.

With a private remote it's reasonably invisible. The very observant user might notice a drop time that scales with the size of the file being dropped and be able to guess this feature is being used. And, if there is some error when it tries to move the object to the remote, the drop will fail. The error message in that case cannot really obscure the fact that annex.trashbin is configured.

Comment by joey
comment 4

I don't know much about the static-annex builds, but you may have better luck with the Linux standalone builds due to their using a more conventional libc.

Building git-annex from source is not hard if you can get the stack tool installed. It looks like the only currently supported way to do that as a freebsd user is to install https://www.haskell.org/ghcup/ which includes stack. Then follow the fromsource section on "building from source with stack".

Comment by joey
comment 4

IIRC user can just push git-annex branch directly after git-annex merging remote version locally, right?

Sure, but my point was that they would have to change their workflow due to a change on the server that might not be visible to them. Violating least surprise.

Comment by joey
comment 3

In that example, the git-annex branch is not pushed to origin after annexed files are sent to it. So how does git-annex on otherhost know that origin has those files? Well, git-annex-shell, when receiving the files, updates the git-annex branch in origin.

IIRC user can just push git-annex branch directly after git-annex merging remote version locally, right?

Making it read-only would somewhat limit the exposure to all these problems, but if it's read-only, how would any annex objects get into the remote repository in the first place?

My use case at hand: I manipulate a git-annex repo on a Linux box over an NFS mount, and the original one is on a FreeBSD box with a bare minimal installation. I have about 50 datasets in a hierarchy. I wanted to back up to another location, and it would be more performant to talk to the original FreeBSD server directly instead of going through the NFS mount. I can't install git-annex on that FreeBSD box ATM.

FWIW, on second thought, given that I do have a workaround with rsync (verified that it works), and unless another more prominent use case arrives, it might indeed not be worth the hassle.

Comment by yarikoptic