Recent comments posted to this site:
We are considering introducing git-annex with gcrypt in hybrid mode as secure storage for common data in our company, and I'd rather not delete and reinit the repo every time somebody new is granted access.
A little testing with current git-annex showed that GCRYPT_FULL_REPACK with a forced git push of all branches makes the git repo accessible to the newcomer (I get the files), but not the annexed data (gpg error "No secret key" in git annex get; git annex info secretRepo just lists my first key).
Has anybody successfully tested adding keyids in hybrid encryption later on? Which further steps were needed to make it work?
Thanks for any input!
@joey - thanks, that's prompt feature request fulfilment
Looking more closely at the duplicates, not everything got duplicated, just the "older" episodes. It turns out the newer episodes do have guid values saved (as itemid in the metadata) and the older episodes do not. I think this is most likely because I was running a fairly old git-annex until about October 2016, on a fairly old OS install, and then upgraded to a more recent one (now about 6 months old) which does track them. My assumption (without checking every file) is that the episodes downloaded before October 2016 are the ones that got duplicated.
I've edited the main page and added a note that GUIDs are tracked in versions since 2015, since I didn't obviously find that listed anywhere before.
@rok it's a consequence of using smudge/clean filters; git add passes
the file through the filters.
You can't accomplish this with remote.<name>.annex-ssh-options,
since it is not exposed to the shell, and the parser just breaks it up into words.
A smarter parser would be needed. Or you could configure it in
~/.ssh/config, or perhaps make a ssh config file elsewhere and use
annex-ssh-options to pass -F to ssh to make it use this other config file.
Now that git-annex supports GIT_SSH_COMMAND, which is exposed to the
shell, you should be able to accomplish it that way. I don't know if that
would work in your use case, since the environment variable affects all ssh
remotes, not just one.
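A minimal sketch of the alternate-config approach described above, run in a scratch directory. The host alias example-remote, its HostName and Port, and the remote name origin are placeholders for illustration only; they are not taken from the original question.

```shell
# Scratch demo; nothing outside a temporary directory is touched.
scratch=$(mktemp -d)
git init -q "$scratch/repo"

# A separate ssh config file holding the per-host settings.
# Host alias, HostName and Port are hypothetical placeholders.
cfg="$scratch/annex_ssh_config"
cat > "$cfg" <<'EOF'
Host example-remote
    HostName example.com
    Port 2222
EOF

# Point one remote at that file. The value is split into words without a
# shell, so the path must not contain spaces or need quoting.
git -C "$scratch/repo" config remote.origin.annex-ssh-options "-F $cfg"
git -C "$scratch/repo" config --get remote.origin.annex-ssh-options

# Alternative: GIT_SSH_COMMAND is passed through the shell, but it applies
# to every ssh remote for that invocation, for example:
#   GIT_SSH_COMMAND="ssh -F $cfg" git annex sync
```

Anything that needs shell quoting can then live in the ssh config file, while annex-ssh-options itself stays a plain list of words.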
@ewen importfeed already tracks guids, since 2015. Relevant commit is
You may well have an
older version of git-annex that didn't do that. But there are probably also
feeds that lack a useful guid, or that even make a change that changes the
guid of an existing item.
With git annex metadata, you can see the itemid, which is where the guid is stored.
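As a rough sketch, the itemid of a single imported file could be printed like this. The helper name show_itemid and the example path are made up for illustration; the command has to be run inside a git-annex repository.

```shell
# Print the itemid (feed guid) recorded for one annexed file, if any.
# show_itemid is a hypothetical helper name, not part of git-annex.
show_itemid() {
    git annex metadata --get itemid "$1"
}

# Hypothetical usage, with a made-up path:
#   show_itemid podcasts/some-episode.mp3
```

A file that prints nothing here was presumably imported before guids were tracked, or came from a feed without a usable guid.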
PS, please post in todo when you have a request..
While tracking podcast media URLs usually works to avoid duplicate downloads, when it fails it usually fails spectacularly. In particular if a podcast feed decides to update all the URLs (for old and new podcasts) to use a different URL scheme, then suddenly that looks like a huge volume of new URLs, and all of them get downloaded again -- even if the content has actually already been retrieved from a different URL (ie, older URL scheme). For instance the acast.com service has changed their URL scheme a couple of times in the last 1-2 years, rewriting all the historical URLs, so I have three copies of many of the episodes on podcasts on their service (Many downloaded; some skipped once I caught the bulk download and stopped it/reran with --fast or --relaxed to make placeholders instead. acast.com seem to have managed to cause even more confusion by rewriting many of the older mp3 files with new id3 tags, thus changing the file size/hashes -- it definitely made cleaning up more complicated.)
Some (all?) podcast feeds also have a guid field, which specifies what should be a unique, unchanging per-episode identifier that other podcatchers use to track "seen this" content. In theory that guid value should be stable even across media URL changes -- at least if it isn't, then a podcaster changing the guid and media URL will almost certainly induce re-downloads in most podcatchers, and thus hopefully realise early on (eg, during testing) rather than in production.
Can git-annex be extended to track the guid values as well as the filenames, so git annex importfeed can avoid downloading episodes where it has already processed that guid, and instead just add the newly listed url as an alternate web URL for that specific episode (which has been my manual workaround)? Perhaps the episode guid could be stored as additional metadata, along with some sort of feed unique ID (link?), and then an index built/consulted when importfeed runs (although that "feed unique ID" would probably also have to be updatable by the user, to cope with "the feed URL has now changed from http:// to https://", which also seems to be happening a bunch at present).
PS: Apologies for duplicate partial comment; I think my browser decided some key combination meant "do default form action", which is post -- and I wasn't finished writing. I couldn't see a way to edit the comment, hence deleting/readding.
Your last comment put me on the right track. The problem was not in the repository, but in an old stale global .gitconfig in my homedir. I just checked $XDG_CONFIG_HOME/git/config, where my global git config currently resides, and totally forgot about this old config. Stupid me!
git config --show-origin --get annex.largefiles
was my savior here, as it clearly indicated that there is indeed an (unintended) config setting and where to find the file. So I can strongly recommend that anybody experiencing strange behavior try this one-liner. It might have saved me hours of time.
Thanks for your help!
Note that if annex.largefiles is set in git config (including global git
config), it overrides the .gitattributes setting. So a reasonable guess
would be that you set it in the git config.
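To check for such a setting, something like the following sketch may help. It uses a throwaway HOME so the real global config is left alone; the value largerthan=100kb is only an example of a stale entry, and the scratch paths are placeholders.

```shell
# Scratch demo with a throwaway HOME. Best run in a fresh shell, since
# HOME is overridden for the rest of the session.
scratch=$(mktemp -d)
export HOME="$scratch/home"
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
git init -q "$scratch/repo"

# Simulate a forgotten global setting (the value is only an example).
git config --global annex.largefiles 'largerthan=100kb'

# List every value of annex.largefiles that git can see from inside the
# repository, each prefixed with the file it was read from.
origins=$(git -C "$scratch/repo" config --show-origin --get-all annex.largefiles)
printf '%s\n' "$origins"

# Once the stale entry is located, it can be removed from that scope.
git config --global --unset annex.largefiles
```

Using --get-all rather than --get shows every file that sets the value, which helps when both a per-repository and a global entry exist.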
@joern.mankiewicz, you need to file a bug report with enough information to
reproduce your problem.
annex.largefiles in .gitattributes works fine:
joey@darkstar:~/tmp> git init ttt
Initialized empty Git repository in /home/joey/tmp/ttt/.git/
joey@darkstar:~/tmp> cd ttt
joey@darkstar:~/tmp/ttt> git annex init
(recording state in git...)
joey@darkstar:~/tmp/ttt> echo '* annex.largefiles=nothing' > .gitattributes
joey@darkstar:~/tmp/ttt> touch foo
joey@darkstar:~/tmp/ttt> git annex add foo
add foo (non-large file; adding content to git repository) ok
(recording state in git...)