Please describe the problem.
The new --preserve-filename option does not have its described effect on torrent files.
What steps will reproduce the problem?
$ git annex addurl --preserve-filename \
http://downloads.endor.at/chaos-math_multi-language_1080p_mkv.ea15601881aa1be1.torrent
(cancelling once at least one file has arrived)
$ tree
└── downloads.endor.at_chaos-math_multi-language_1080p_mkv.ea15601881aa1be1.torrent/
    └── 01._Motion_and_determinism_-_Panta_Rhei__1080p_.mkv
$ btcheck -l <(curl 'https://downloads.endor.at/chaos-math_multi-language_1080p_mkv.ea15601881aa1be1.torrent')
Base directory : chaos-math_multi-language_1080p_mkv
01. Motion and determinism - Panta Rhei [1080p].mkv (409315188)
Based on the description of --preserve-filename, given that nothing in the names is particularly malicious, I'd have expected the tree output to look like this:
$ tree
└── chaos-math_multi-language_1080p_mkv/ (as per base directory)
    └── 01. Motion and determinism - Panta Rhei [1080p].mkv
What version of git-annex are you using? On what operating system?
8.20200522 (built using gbp buildpackage from current git master, 87dc9cd0)
Please provide any additional information below.
This option, when working with torrents, would be a building block for "bittorrent: support offline operation and verification", and would resolve most of the suggestions from there.
Have you had any luck using git-annex before?
Yes: It saved me from potential data loss when my backup cron jobs stopped working and the mails got lost -- git annex drop failing on the laptops was both the right thing to do given numcopies=2, and alerted me to the problem in due time.
It certainly should not be mangling the filenames inside the torrent directory, e.g. replacing space with underscore. I have fixed that now.
As to the name of the directory used by the torrent file, the interface with remotes does not currently let the remote provide it, and would need to be changed, which would also involve changing the external special remote protocol.
Hmm, I suppose if the remote returns a set of files all within a single subdirectory, it could use that subdirectory instead of the mangled url as the containing subdirectory. Then the bittorrent remote could just add the name as a prefix to the list of files in the torrent. (Or rather, stop removing the name prefix, which is what it actually does currently.)
Then any external remotes that support CHECKURL and return multiple files all inside the same single subdirectory would change behavior with and without --preserve-filename. Without it, the single directory would be removed, and the files in it put inside the mangled url subdirectory. With it, the single directory would be used without the containing subdirectory. The behavior change without --preserve-filename is the possibly concerning one. Unfortunately I don't think this approach is a safe one because something might rely on the current behavior.
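To make the trade-off above concrete, here is a minimal Python sketch of the idea being weighed (and rejected as unsafe). It is not git-annex code: the function names, the `mangled_url_dir` argument, and the handling of files that do not share a single directory are assumptions made purely for illustration.

```python
from pathlib import PurePosixPath

def single_top_dir(paths):
    """Return the one top-level directory shared by all paths, or None."""
    tops = set()
    for p in paths:
        parts = PurePosixPath(p).parts
        if len(parts) < 2:
            return None  # a file sits at the top level, so no containing directory
        tops.add(parts[0])
    return tops.pop() if len(tops) == 1 else None

def layout(paths, mangled_url_dir, preserve_filename):
    """Map the relative paths a remote returns to paths in the work tree.

    Assumption: paths are relative and use '/' as the separator.
    """
    top = single_top_dir(paths)
    result = []
    for p in paths:
        parts = PurePosixPath(p).parts
        if top is not None and preserve_filename:
            # Use the single directory as-is, with no containing subdirectory.
            result.append(str(PurePosixPath(*parts)))
        elif top is not None:
            # Drop the single directory; files go inside the mangled url directory.
            result.append(str(PurePosixPath(mangled_url_dir, *parts[1:])))
        else:
            # Assumption: with no single shared directory, keep everything
            # under the mangled url directory.
            result.append(str(PurePosixPath(mangled_url_dir, *parts)))
    return result
```

The second branch is the behavior change without --preserve-filename that makes this approach risky: any remote that already returns files inside one shared directory would see that directory disappear.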
I am not keen on complicating the remote interface, and especially the external special remote protocol with something that only supports this special case.
Thanks for the fast fix. I've run it with my test torrent and the files now all have their original names.
As for the directory name, there's the component that git-annex picks, and the one in the torrent.
As git-annex has historically ignored the latter, so be it -- we could think up a configurable option, but I wouldn't bother. (And it wouldn't necessarily be a change in the remote protocol; the torrent remote could just prefix its paths with the name field if that gets configured.)
The component that git-annex picks I would kind of have expected to be absent: if I'm in a directory and tell git-annex to --preserve-filename get a copy of whatever, I'd expect the server-provided name (or names) of that thing to pop up in the directory where the command was issued, not in a subdirectory.
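The parenthetical suggestion about prefixing paths with the torrent's name field could look roughly like the following Python sketch. The function and the `prefix_with_name` switch are hypothetical; no such git-annex option is known to exist, and this only illustrates that the change could live entirely inside the torrent remote.

```python
from pathlib import PurePosixPath

def torrent_file_paths(name, files, prefix_with_name=False):
    """Return the paths a torrent remote would report for a torrent.

    name:  the torrent's `name` field (its base directory)
    files: relative file paths listed in the torrent
    prefix_with_name: hypothetical configuration switch
    """
    if not prefix_with_name:
        # Behavior described in the thread: the name prefix is not reported.
        return list(files)
    return [str(PurePosixPath(name, f)) for f in files]
```

With the switch on, the example torrent would report `chaos-math_multi-language_1080p_mkv/01. Motion and determinism - Panta Rhei [1080p].mkv`, matching the base directory shown by btcheck.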