It would be nice to have a way to drop files without leaving broken symlinks around, at least while in direct mode.
Here is my use case. I have a collection of music CDs ripped in FLAC format. I also convert all these files to MP3 so that I can use them on various other devices and to save space.
The problem is that if I git annex drop the FLAC files once they are converted and copied, they leave broken symlinks around; this results in various annoying error messages in almost all the music playback software I tried. At the same time, if I rm or git rm these symlinks, the content of these files will also be removed from the remotes, as they are signalled as no longer wanted.
Couldn't git-annex keep a separate index of files that have been removed but are meant to be kept?
A suggestion from #git-annex:
The original poster seems to have a misunderstanding of what git-annex does with the content of files. Deleting a file does not remove the content of the file. You can always use git to check out a previous version of a file, and the content from the annex will still be there (or you can git annex get it to get it from wherever git-annex stored the content).
The only exception to this rule is if you manually choose to run git annex unused and then git annex dropunused. That can delete the contents of files when no branch or tag refers to them. As long as some branch refers to the content of the files, even if it's not the currently checked out branch, the file content is retained.
So a branch is the "index of files that have been removed but are wanted to be kept".
For example, you could do:
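What follows is only a minimal sketch of such a branch workflow, not necessarily the exact commands that were suggested. It assumes an indirect mode repository; the file name track01.flac, the throwaway repository location, and the fallback to plain git add when git-annex is not installed are all illustrative assumptions.

```shell
#!/bin/sh
# Sketch only: the repository location, file name and commit messages are
# illustrative assumptions, not taken from the thread.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the main branch "master"
git config user.name "Example"
git config user.email "example@example.com"

# Use git-annex when it is installed; otherwise fall back to plain git so
# the branch mechanics can still be tried out.
if command -v git-annex >/dev/null 2>&1; then
    git annex init "flac demo" >/dev/null
    add_file() { git annex add "$@" >/dev/null; }
else
    add_file() { git add "$@"; }
fi

git commit -q --allow-empty -m "initial commit"
git branch keepflacs

# New rips are always added on keepflacs ...
git checkout -q keepflacs
printf 'pretend this is FLAC data\n' > track01.flac
add_file track01.flac
git commit -q -m "add track01.flac"

# ... and keepflacs is merged into master, never the other way round.
git checkout -q master
git merge -q keepflacs

# Removing the file from master leaves no broken symlink in the working
# tree, while keepflacs still refers to the content.
git rm -q track01.flac
git commit -q -m "remove track01.flac from master"

echo "on keepflacs: $(git ls-tree -r --name-only keepflacs)"
echo "on master:    $(git ls-tree -r --name-only master)"
```

To bring a file back later, something like git checkout keepflacs -- track01.flac followed by git annex get track01.flac should restore it, provided some repository still holds the content.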
As long as you always switch to the keepflacs branch to add flac files, and never merge the master branch into keepflacs, but only merge keepflacs into master, keepflacs will have every flac file you have ripped. And so git-annex will retain those files even when you git annex unused; git-annex dropunused.
Or, just make a promise to yourself that you'll never run git-annex unused, similarly to how you'd probably never run rm -rf .git/objects/$rand, and you don't need the branches; like git, git-annex will retain all data that has ever been checked into it.
(The branches are still a good idea. For one thing, they let you run git annex fsck and other repository maintenance commands with the keepflacs branch checked out.)
I am going to move this thread to the forum, because it's not really a TODO item, but is something others may want to read.
Erm, I missed that you want to use direct mode. All this fun with git won't really work in direct mode, and indeed direct mode is not able to guarantee that old versions of modified files are retained.
Direct mode is nice for some applications involving syncing and less than ideal devices, but not for this.