Am I right in concluding that using an array of limited storage devices that require FAT32 is not a valid use case for git-annex?
Dumb devices that require FAT32, that need all present files to be unlocked so as not to choke on them (i.e. unlike git-annex's text-files-as-symlinks on crippled filesystems), that therefore also need missing files to be hidden, and that are constrained in storage space are, I would say, extremely common. Yet, if I understood correctly, v7 repos are required in order to hide missing files, but at the same time, on v7 and FAT32 filesystems, files that are present take up double their file size.
Am I right, or am I missing a way to use those?
I would say that FAT32 devices are perfectly valid, but only if used as a directory special remote with chunking enabled. This way you can transfer files of any size (given there is enough space, of course) and keep all previous versions. The downside is that you can't just plug the drive into a random computer and read the files; git-annex is required. It is also important to remember that special remotes do not store metadata about files, so you need your regular repository synced to any computer where you want to read the special remote.
The next option is to use direct mode. This way files will be readable on any computer even without git-annex, but file size will be limited to 4 GB and only the latest version of your files will be available. Note that you can still drop files from a direct mode repository, and these files will be replaced with placeholders, so you are not forced to copy big files to a repository with limited file size.
Actually, it is possible (and I think it is a valid use case) to use both a "directory special remote" and a "direct mode repository" on the same drive.
If I were forced to use FAT32, I would use it this way: create a directory special remote with chunking to store files, and create a bare repository just for metadata. This way the drive will store all versions of files, of any size, and even if other repositories disappear it will be a complete copy.
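A minimal sketch of that layout, assuming it is run from inside an existing git-annex repository and that the drive's mount point is passed in. The remote names `fatdrive` and `fatdrive-meta`, the directory names `annex` and `repo.git`, and the 100MiB chunk size are illustrative choices only; any chunk size below FAT32's per-file limit should do.

```shell
# Sketch only: remote names, directory names and the chunk size are
# illustrative assumptions. Run from inside a git-annex repository.
setup_fat_drive() {
    mnt=$1    # mount point of the FAT32 drive, e.g. /media/usb

    # Directory special remote; chunking keeps every stored piece
    # below FAT32's per-file size limit.
    mkdir -p "$mnt/annex" || return 1
    git annex initremote fatdrive type=directory \
        directory="$mnt/annex" chunk=100MiB encryption=none || return 1

    # Bare repository that carries only the git metadata.
    git clone --bare . "$mnt/repo.git" || return 1
    git remote add fatdrive-meta "$mnt/repo.git"
}
```

After `setup_fat_drive /media/usb`, content would go to the drive with `git annex copy --to fatdrive`, and the metadata with `git push fatdrive-meta --all`.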
By device, I meant the computer-like device that needs to be able to read the files. Say, an MP3 player for example.
Well, that rules it out since, the device being dumb, it does need to be able to find the real files at their standard path and be able to read them as is.
Chunking is not a requirement, though, since the device's software, only supporting FAT32, would not support or expect files that cannot exist in FAT32.
Direct mode is deprecated.
Lack of support for "dumb FAT32" and "dumb NTFS" is indeed the reason why most of the people in my circle of acquaintances do not use git-annex.
V7 and import/export tree should solve most of the problems related to synchronization.
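For a player that has to see plain files at their normal paths, a tree export could look roughly like this. It is a sketch, assuming a git-annex version with export tree support; the remote name `player`, the branch name `master`, and the function names are placeholders.

```shell
# Sketch only: "player" and "master" are placeholder names.

# One-time setup of the device as an export-tree directory remote.
init_player() {
    mnt=$1    # mount point of the device's FAT32 filesystem

    # exporttree=yes makes the remote hold a plain copy of a tree,
    # with real files at their usual paths instead of annex keys.
    git annex initremote player type=directory directory="$mnt" \
        exporttree=yes encryption=none
}

# Run whenever the device should be brought up to date.
sync_player() {
    git annex export master --to player
}
```

Here `init_player /media/player` would be run once, and `sync_player` each time the device is plugged in.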
One last issue remains, however: FAT/NTFS and POSIX have different sets of allowed characters in file names. To address this, git-annex should "smudge" file names when dealing with a FAT/NTFS remote. For example,
Moby-Dick: or, The Whale.epub
should be exported as
Moby-Dick_ or_ The Whale.epub
I see no reason to add such filename munging to git-annex, controlling the names of files in the working tree is well out of its scope. It could be in scope for git, but more likely, if this is a problem for you, you need to set up your own policy for what filenames are allowed in your repository, and use existing facilities to enforce it (git hooks, etc).
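One way such a policy might be enforced is a check called from a pre-commit hook. The sketch below is only an illustration: the function names are made up, and the policy it assumes (reject the characters Windows reserves in file names) is one possible choice, not something git or git-annex defines.

```shell
# Sketch of a filename policy check for a pre-commit hook.
# Assumption: the policy rejects the characters Windows reserves
# in file names ( \ : * ? " < > | ) anywhere in a path.
is_fat_safe() {
    case $1 in
        *[\\:\*\?\"\<\>\|]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Reads newline-separated paths on stdin, reports each offending
# path on stderr, and returns non-zero if any were found.
check_names() {
    rc=0
    while IFS= read -r path; do
        if ! is_fat_safe "$path"; then
            printf 'not FAT-safe: %s\n' "$path" >&2
            rc=1
        fi
    done
    return $rc
}
```

In a hook this could be fed with `git -c core.quotePath=false diff --cached --name-only --diff-filter=ACR | check_names || exit 1`, so that a commit adding a name like `Moby-Dick: or, The Whale.epub` is refused until the file is renamed.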
lsblk -no uuid,mountpoint | grep -Eqx '<uuid>[[:space:]]+<mountpoint>'
to make sure git-annex only tries to touch the remote when it's mounted. I just rename any files in the repo that have invalid names.
It's not a repo version, it's a kind of special remote that just accesses a tree of files on the device as if it were a git repo but without all the baggage.