Please describe the problem.
In a large import, three files (all with non-ascii names) gave the following error: git-annex: .git/annex/othertmp/.0: createSymbolicLink: already exists (File exists)
I've tried to extract the relevant part of a strace -f
:
mkdir(".git/annex/othertmp", 0777) = -1 EEXIST (File exists)
newfstatat(AT_FDCWD, ".git/annex/othertmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
mkdir(".git/annex/othertmp", 0777) = -1 EEXIST (File exists)
newfstatat(AT_FDCWD, ".git/annex/othertmp", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
mkdir(".git/annex/othertmp/.0", 0777) = 0
unlink(".git/annex/othertmp/.0") = -1 EISDIR (Is a directory)
symlink("../../../../../.git/annex/objects/9w/wJ/SHA256E-s5426861--cdc0664822c9df3ffbf255d160870fc39a6fdd1168b02fc2c9b59cc65bc81c26.pdf/SHA256E-s5426861
--cdc0664822c9df3ffbf255d160870fc39a6fdd1168b02fc2c9b59cc65bc81c26.pdf", ".git/annex/othertmp/.0") = -1 EEXIST (File exists)
newfstatat(AT_FDCWD, ".git/annex/othertmp/.0", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, ".git/annex/othertmp/.0", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
openat(AT_FDCWD, ".git/annex/othertmp/.0", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 31
fstat(31, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getdents64(31, 0x7f2694001180 /* 2 entries */, 32768) = 48
getdents64(31, 0x7f2694001180 /* 0 entries */, 32768) = 0
close(31) = 0
rmdir(".git/annex/othertmp/.0") = 0
close(26) = 0
What steps will reproduce the problem?
touch "墨漬 Ink Stains.pdf"
git annex add "墨漬 Ink Stains.pdf"
(the file name base64 encoded is 5aKo5rysIEluayBTdGFpbnMucGRm
)
What version of git-annex are you using? On what operating system?
git-annex version: 10.20250605-gd2dc318a867f571cbc848b5d45e82e153e364e4e
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Benchmark Feeds Testsuite S3 WebDAV
dependency versions: aws-0.24 bloomfilter-2.0.1.2 crypton-1.0.0 DAV-1.3.4 feed-1.3.2.1 ghc-9.4.8 http-client-0.7.18 persistent-sqlite-2.13.1.0 torrent-10000.1.3 uuid-1.3.16 yesod-1.6.2.1
I'm running Arch Linux (kernel 6.15.1-arch1-2). The repo I'm running the commands in is on an ext4 filesystem.
Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
git-annex has been brilliant for managing my large media collection across several removable drives, and I'm confident it will continue to scale. This is the first issue I've run into with it.
I can confirm this behaviour. But it's not precisely "non-ascii characters" that cause this, emojis and greek letters for example are no problem.
Hm, Just the non-ascii characters can't be the problem. Only with the suff
.pdf
or.pdff
it fails. Without the suffix or just.pd
or longer extensions.pdfff
it works 🤪I'm experiencing this as well; mostly with filenames that have CJK characters in them, but also a couple using other non-ASCII symbols. I think nobodyinperson already confirmed this, but it seems like the contents of the file don't matter, just the filename. It also doesn't seem to matter whether or not it's in a subdirectory.
There's something to do with filename length, too? Changing the extension but keeping the character count the same doesn't fix the issue, but using a shorter extension does:
Here are all the problematic filenames I've found so far. They all seem to start with a non-ASCII character, not sure if that's relevant.
There was another bug filed about the same problem, ?strong>unlock fails for some names.
Cause is a filename that is 21 bytes long and begins with a utf-8 character. Which AFAICS all the filenames mentioned here are.
67f00027d1b326c979db8b81c973a61234c406d7 fixes this.