git-annex-import by default deletes the original files. Keeping them by default would be better. "import" in many other tools (e.g. the bioinformatics tool Geneious) means a non-destructive import. The short description of git-annex-import on its man page says it "adds" files to the repo, which does not suggest erasure. When I first used git-annex-import, I was surprised by the default behavior, and others may be too. Also, the command has now been "overloaded" for importing from a special remote, and in that mode the originals are not erased; giving the import-from-dir mode the same default would be more consistent. In general, erasing data by default seems dangerous: what if it was being imported into a temporary or untrusted repo?
Changing the default would also let one repeatedly re-import a directory while keeping original files in place.
I realize this would be a breaking change for some workflows; announcing it ahead of time with a warning, as git does before changing its own defaults, would mitigate the breakage.
My general feeling about git-annex import is that everything not involving importing from a special remote should be deprecated and eventually removed.
The --duplicate option probably does what you want, but if the interface is going to be changed, such as making that the default, I'd rather the interface change move toward the goal of deprecating the old mode.
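For reference, a minimal sketch of the legacy directory import with --duplicate, which leaves the source files where they are. The function name and the path in the usage line are placeholders for illustration, not anything from this thread, and the function has to be run from inside a git-annex repository.

```shell
# Sketch only: wraps the legacy directory import so the originals are kept.
# import_keeping_originals is a hypothetical helper name.
import_keeping_originals() {
    src_dir=$1
    # --duplicate: do not delete files from the import location
    git annex import --duplicate "$src_dir"
}
```

Usage would be something like `import_keeping_originals /path/to/dir`.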
The fundamental mistake that the legacy interface made is it conflated copying content into the repository, dropping content from the directory, and updating the working tree. The new interface decouples all 3, only doing the first, and updating a tracking branch, which the user is then free to merge as-is, or otherwise modify before merging. Dropping requires an export of a new tree, which is the main pain point in emulating the old interface, but you happen to not want to drop the content from the directory, so that pain point shouldn't affect you.
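The decoupled workflow described above could look roughly like the following when the source is a plain directory. This is a sketch under assumptions: the remote name, directory path, and branch are placeholders, the helper function is hypothetical, and whether exporttree=yes is needed alongside importtree=yes may depend on the git-annex version, so check the directory special remote documentation before relying on it.

```shell
# Sketch only: import from a directory via a directory special remote.
# import_via_directory_remote is a hypothetical helper name.
import_via_directory_remote() {
    remote=$1            # name for the new special remote (placeholder)
    src_dir=$2           # directory to import from; its files are left in place
    branch=${3:-master}  # branch whose tracking branch gets updated

    # One-time setup: expose the directory as an importable special remote.
    git annex initremote "$remote" type=directory directory="$src_dir" \
        encryption=none importtree=yes exporttree=yes &&
    # Copy content into the repository and update the tracking branch
    # $remote/$branch; the working tree is not touched by this step.
    git annex import "$branch" --from "$remote" &&
    # Merging is a separate, optional step; the first import has no common
    # history with the branch, hence --allow-unrelated-histories.
    git merge --allow-unrelated-histories "$remote/$branch"
}
```

Usage would be something like `import_via_directory_remote srcdir /path/to/dir`. Nothing here drops content from the directory, which matches the non-destructive behavior asked for; the merge can be skipped or replaced by inspecting the tracking branch first.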
Makes sense. It's certainly better if import/export did complementary things. Maybe move the old git-annex-import functionality to a new command called git-annex-ingest?
"everything not involving importing from a special remote should be deprecated" -- i.e. to ingest a directory you'd first create a directory special remote for it, and then git-annex-import from that?