Recent comments posted to this site:
yes -- that one is embargoed (can be seen by going to https://dandiarchive.org/dandiset/000675)
And when you replicated the problem from the backup, were you using it in the configuration where it cannot access those?
If I understood the question correctly (I do not recall for certain now): judging from my use of ( source .git/secrets.env; git-annex import master... I think I was running with credentials that allow access to them (hence no errors while importing)
Do you have annex.largefiles configured in this repository, and are all of the affected files non-annexed files?
yes
(venv-annex) dandi@drogon:/mnt/backup/dandi/dandiset-manifests$ grep largefiles .gitattributes
**/.git* annex.largefiles=nothing
* annex.largefiles=((mimeencoding=binary)and(largerthan=0))
and it seems they all go into git:
(venv-annex) dandi@drogon:/mnt/backup/dandi/dandiset-manifests$ git annex list
here
|s3-dandiarchive (untrusted)
||web
|||bittorrent
||||
(the list of files is empty)
yes -- small files, go to git
No, it is a small number of files being created and renamed. In this case it is a set of 4 files that are pre-created empty and closed; 3 of the 4 are then opened for writing by duct and closed at the end of the process, and the remaining one (_info.json) is reopened for writing to dump the record and then closed. The outside tool that ran it then renames all of them to filenames that include the end timestamp. git-annex manages to detect that the original 0-sized _info.json gets removed, but does not pick up the new one, which gets rapidly renamed to a longer name.
In git log it looks like:
commit 65e9f13a882ef78d743fbe634c8e05f9dcb32c45
Author: ReproStim User <changeme@example.com>
Date: Tue Dec 16 09:44:30 2025 -0500
git-annex in reprostim@reproiner:/data/reprostim
Videos/2025/12/2025.12.16-09.30.29.570--.mkv.duct_info.json | 0
Videos/2025/12/2025.12.16-09.30.29.570--2025.12.16-09.44.28.225.mkv | 1 +
Videos/2025/12/2025.12.16-09.30.29.570--2025.12.16-09.44.28.225.mkv.duct_usage.json | 1 +
Videos/2025/12/2025.12.16-09.30.29.570--2025.12.16-09.44.28.225.mkv.log | 1 +
4 files changed, 3 insertions(+)
commit 3fe4710fc058e7d1433637c9af538b3bb9e5ebed
Author: ReproStim User <changeme@example.com>
Date: Tue Dec 16 09:30:31 2025 -0500
git-annex in reprostim@reproiner:/data/reprostim
Videos/2025/12/2025.12.16-09.30.29.570--.mkv.duct_info.json | 0
1 file changed, 0 insertions(+), 0 deletions(-)
commit f6bb6137c81ef36387ded229a4d8592964530bc8
Author: ReproStim User <changeme@example.com>
Date: Tue Dec 16 09:30:23 2025 -0500
git-annex in reprostim@reproiner:/data/reprostim
Videos/2025/12/2025.12.16-09.29.32.681--.mkv.duct_info.json | 0
Videos/2025/12/2025.12.16-09.29.32.681--2025.12.16-09.30.21.889.mkv | 1 +
Videos/2025/12/2025.12.16-09.29.32.681--2025.12.16-09.30.21.889.mkv.duct_usage.json | 1 +
Videos/2025/12/2025.12.16-09.29.32.681--2025.12.16-09.30.21.889.mkv.log | 1 +
4 files changed, 3 insertions(+)
commit 00444920167e17b429d10fa29df8f1947930152c
Author: ReproStim User <changeme@example.com>
Date: Tue Dec 16 09:29:34 2025 -0500
git-annex in reprostim@reproiner:/data/reprostim
Videos/2025/12/2025.12.16-09.29.32.681--.mkv.duct_info.json | 0
1 file changed, 0 insertions(+), 0 deletions(-)
Here is a copy of the log from the current process: https://www.oneukrainian.com/tmp/daemon-20251216.log
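For anyone trying to reproduce this outside the real setup, the create/write/rename sequence described in this comment can be replayed with a short script. This is only a sketch under assumptions: the function names `stamp` and `simulate_recording` and the placeholder file contents are made up for illustration, and the timestamp layout is copied from the filenames in the git log above. It does not involve duct or git-annex itself; it only generates the same pattern of filesystem events in a directory that could be watched.

```python
import json
import os
import tempfile
from datetime import datetime


def stamp(t: datetime) -> str:
    # Timestamp layout copied from the filenames in the log,
    # e.g. 2025.12.16-09.30.29.570 (millisecond precision).
    return t.strftime("%Y.%m.%d-%H.%M.%S.") + f"{t.microsecond // 1000:03d}"


def simulate_recording(directory: str, start: datetime, end: datetime) -> list[str]:
    """Replay the create/write/rename sequence; return the final file names."""
    prefix = os.path.join(directory, f"{stamp(start)}--.mkv")
    suffixes = ["", ".duct_info.json", ".duct_usage.json", ".log"]

    # 1. All four files are pre-created empty and closed.
    for s in suffixes:
        open(prefix + s, "w").close()

    # 2. Three of the four are opened for writing and closed at the end.
    #    (The contents here are placeholders, not real recorder output.)
    for s in ["", ".duct_usage.json", ".log"]:
        with open(prefix + s, "w") as f:
            f.write("placeholder\n")

    # 3. The original _info.json is reopened to dump the record, then closed.
    with open(prefix + ".duct_info.json", "w") as f:
        json.dump({"placeholder": True}, f)

    # 4. The outside tool immediately renames every file so that the name
    #    includes the end timestamp.
    final_prefix = os.path.join(directory, f"{stamp(start)}--{stamp(end)}.mkv")
    for s in suffixes:
        os.rename(prefix + s, final_prefix + s)

    return sorted(os.listdir(directory))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        names = simulate_recording(
            d,
            datetime(2025, 12, 16, 9, 30, 29, 570000),
            datetime(2025, 12, 16, 9, 44, 28, 225000),
        )
        for name in names:
            print(name)
```

Pointing the directory argument at a scratch repository being watched would show whether the renamed _info.json is picked up; steps 3 and 4 run back to back, which matches the "rapidly renamed" condition described above.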
I'm trying to pass additional flags to rclone, like --bwlimit for example. Not sure how to do that, though. The --whatelse flag tells me they should just be passed by default:
> git annex initremote hetzner type=rclone rcloneremotename=hetzner rcloneprefix=someprefix encryption=shared chunk=500MiB --whatelse
embedcreds
embed credentials into git repository
(yes or no)
onlyencryptcreds
only encrypt embedded credentials, not annexed files
(yes or no)
mac
how to encrypt filenames used on the remote
(HMACSHA1 or HMACSHA224 or HMACSHA256 or HMACSHA384 or HMACSHA512)
keyid
gpg key id
keyid+
add additional gpg key
keyid-
remove gpg key
*
all other parameters are passed to rclone
I tried --bwlimit 3000 and bwlimit=3000, but those give me an invalid option error plus the help text, and git-annex: Unexpected parameters: bwlimit, respectively.
This is not a bug. While it could be moved to todo, anyone can write an external special remote to use this or any other storage system.
So I am closing this bug report.
Actually I have gone ahead and implemented some
git-annex-matching-options that will be useful
in finding content to drop from the trashbin:
--presentsince --lackingsince --changedsince
You might use, for example:
git-annex drop --force --from trashbin \
--presentsince=trashbin:7d --and --not --changedsince=here:7d
That will match files that were moved to the trashbin 7 days ago, and that have not re-entered the current repository in the time since then.
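If the retention period needs to vary, for instance in a cron job, the invocation above could be generated rather than typed out. The following is a minimal sketch, assuming the option names and the `remote:Nd` value syntax exactly as quoted in this comment (they have not been checked against the man page here); the helper name `trashbin_drop_command` is hypothetical.

```python
import shlex


def trashbin_drop_command(days: int, trash_remote: str = "trashbin") -> list[str]:
    """Build the argv for the drop invocation quoted above, for a given age."""
    if days <= 0:
        raise ValueError("days must be positive")
    age = f"{days}d"
    return [
        "git-annex", "drop", "--force", "--from", trash_remote,
        # Content has been present in the trash remote for this long ...
        f"--presentsince={trash_remote}:{age}",
        # ... and has not changed in the current repository over that time.
        "--and", "--not",
        f"--changedsince=here:{age}",
    ]


# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(trashbin_drop_command(7)))
```

Returning an argv list rather than a string means the result can be passed directly to `subprocess.run` without shell quoting concerns.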
FWIW, the dynamically linked binary is no good either:
[yoh@dbic-mrinbox ~]$ wget https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-amd64.tar.gz
[yoh@dbic-mrinbox ~]$ tar -xzvf git-annex-standalone-amd64.tar.gz
[yoh@dbic-mrinbox ~]$ cd git-annex.linux/
[yoh@dbic-mrinbox ~/git-annex.linux]$ ls
LICENSE exe git-annex git-core git-remote-tor-annex lib logo_16x16.png templates
README extra git-annex-shell git-receive-pack git-shell lib64 magic trustedkeys.gpg
bin gconvdir git-annex-webapp git-remote-annex git-upload-pack libdirs runshell usr
buildid git git-annex.MANIFEST git-remote-p2p-annex i18n logo.svg shimmed
[yoh@dbic-mrinbox ~/git-annex.linux]$ ./git-annex
ELF binary type "3" not known.
exec: /usr/home/yoh/git-annex.linux/exe/git-annex: Exec format error
I will try to assemble build commands later...
annex.trashbin is implemented.
I am going to close this todo; if it turns out there is some preferred
content improvement that would help with cleaning out the trash, let's talk
about that on a new todo. But I'm guessing you'll make do with find.
I think I would deliberately want this to be invisible to the user, since I wouldn't want anyone to actively start relying on it.
With a private remote it's reasonably invisible. The very observant user might notice a drop time that scales with the size of the file being dropped and be able to guess this feature is being used. And, if there is some error when it tries to move the object to the remote, the drop will fail. The error message in that case cannot really obscure the fact that annex.trashbin is configured.
I don't know much about the static-annex builds, but you may have better luck with the Linux standalone builds due to their using a more conventional libc.
Building git-annex from source is not hard if you can get the stack tool installed. It looks like the only currently supported way to do that as a freebsd user is to install https://www.haskell.org/ghcup/ which includes stack. Then follow the fromsource section on "building from source with stack".
IIRC user can just push git-annex branch directly after git-annex merging remote version locally, right?
Sure, but my point was that they would have to change their workflow due to a change on the server that might not be visible to them. Violating least surprise.
In that example, the git-annex branch is not pushed to origin after annexed files are sent to it. So how does git-annex on otherhost know that origin has those files? Well, git-annex-shell, when receiving the files, updates the git-annex branch in origin.
IIRC user can just push git-annex branch directly after git-annex merging remote version locally, right?
Making it read-only would somewhat limit the exposure to all these problems, but if it's read-only, how would any annex objects get into the remote repository in the first place?
My use case at hand: I manipulate a git-annex repo on a Linux box over an NFS mount, and the original is on a FreeBSD box with a bare minimal installation. I have about 50 datasets in a hierarchy. I wanted to back up to another location, and it would be more performant to talk to the original FreeBSD server directly instead of going through the NFS mount. I can't install git-annex on that FreeBSD box ATM.
FWIW, on second thought, given that I do have a workaround with rsync (verified that it works), and unless another more prominent use case arrives, it might indeed not be worth the hassle.