Make annex.addunlocked configurable in .gitattributes, the same way annex.largefiles can be.
This would be useful if certain filename extensions need to be unlocked to be used, and others are desired to be kept locked.
The annex.addunlocked git config is a boolean, but this gitattributes one would effectively build up a file match expression. So it might then follow that the git config should also be a file match expression, with "true" being the same as "anything" and "false" the same as "nothing" for back-compat. --Joey
`git annex config` so could be set persistently (across clones) this way.

Yes, annex.addunlocked with a matching expression was implemented in 2019 and accomplishes the same thing as this would have, without the problems of using git attributes.
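As a minimal sketch of what that looks like in practice (the file extensions and the temporary repository are made up for illustration, not taken from this thread), the expression can be stored in the local git config like any other value. Only plain `git` is run here; the `git annex config` form that persists across clones is shown as a comment because it needs git-annex and an initialized annex.

```shell
# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
git init -q "$repo"

# Hypothetical expression: add spreadsheets unlocked, keep everything else locked.
match_expr='include=*.xlsx or include=*.ods'

# Local setting, stored in .git/config of this clone only.
git -C "$repo" config annex.addunlocked "$match_expr"
git -C "$repo" config --get annex.addunlocked

# Persistent across clones (needs git-annex and an initialized annex, so it
# is left commented out in this sketch):
#   git annex config --set annex.addunlocked "$match_expr"
```

For back-compat, "true" and "false" remain valid values, standing for "anything" and "nothing".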
just it is getting a bit "confusing" to use `.gitattributes` for some types of annotations (e.g., `largefiles`) for file patterns and then `git annex config` for others. Ideally there should be either a single way or multiple equivalently expressive (not complementary) ways.

Could you please remind (or point to a prior composed list) of those? I personally just keep forgetting which one (first or last match) applies.
git-annex config does work for both largefiles and addunlocked.
gitattributes does not allow setting an attribute containing a space and so complex expressions, which either of these can have, become very annoying to shoehorn in. gitattributes also adds a round-trip overhead to query it for every file.
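The whitespace problem can be seen with plain `git`, without git-annex. The sketch below uses hypothetical file names and a throwaway repository. It first writes an attribute value containing spaces and shows that git keeps only the first word, then uses the parenthesized form that avoids spaces. It also shows that, within `.gitattributes`, a later matching line overrides an earlier one.

```shell
# Throwaway repository; only git is needed for this demonstration.
repo=$(mktemp -d)
git init -q "$repo"

# A value with spaces is split on whitespace: only "largerthan=100kb" lands
# in annex.largefiles; "and", "not" and "include" become separate attributes.
printf '%s\n' '* annex.largefiles=largerthan=100kb and not include=*.c' \
  > "$repo/.gitattributes"
split_value=$(git -C "$repo" check-attr annex.largefiles -- data.bin)
echo "$split_value"

# Parenthesized value without spaces, plus a later line that overrides the
# first one for *.c files (the last matching line wins in .gitattributes).
printf '%s\n' '* annex.largefiles=(largerthan=100kb)' \
              '*.c annex.largefiles=nothing' > "$repo/.gitattributes"
bin_value=$(git -C "$repo" check-attr annex.largefiles -- data.bin)
c_value=$(git -C "$repo" check-attr annex.largefiles -- main.c)
echo "$bin_value"
echo "$c_value"
```

The same policy stored with `git annex config` or `git config` can be written with ordinary spaces, which is why complex expressions are easier to keep there.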
ATM we rely on `.gitattributes` to set the default backend (for DataLad datasets) to MD5E, so every dataset is then guaranteed to have a `.gitattributes`. We also rely, in many configurations, on setting `largefiles` for different extensions within `.gitattributes`. I think these are the two primary target use cases. You say that having `.gitattributes` slows down `git` (and `git-annex`) operations -- so would you recommend switching to specifying those (backend, largefiles) via `git annex config` instead?

In datalad/issues/5383 (Stop using .gitattributes for annex.largefiles config) the note was that the default backend cannot be specified via `git-annex config`
-- is that still the case?

Well, moving your annex.largefiles settings from gitattributes to `git-annex config` won't speed up queries for it, because the gitattribute overrides the `git-annex config` setting. And so git-annex still has to do the work of querying for the gitattribute anyway, even when it's not set.

In 4acbb40112aa73dcde63841d8d8c04c433f6a806 I benchmarked that as making `git-annex add` 2% slower than it would be otherwise (excluding hashing). We will just have to live with that, unless the gitattribute can eventually somehow be deprecated. That is a good lesson about the risks of adding more gitattributes.

annex.backend is not currently configurable by `git-annex config`. It would be listed in its man page if it were.

I'd support adding that, but annex.backend is currently the name of a single backend, so this would not allow setting the backend differently for different filenames, which is something that gitattributes can do. So it would need annex.backend to be expanded, so it can specify different backends for different filenames or other properties. I don't know how that syntax would look; the syntax git-annex currently uses for annex.largefiles etc is not suitable here. It would certainly be an added complication.
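For reference, the per-filename backend selection that gitattributes allows can be sketched as follows. The `*.iso` pattern, the file names, and the choice of SHA256E are illustrative assumptions; only the `MD5E` default reflects the DataLad setup described above. The sketch checks what git itself reports for the attribute and does not run git-annex.

```shell
# Throwaway repository; only git is needed to inspect the attribute.
repo=$(mktemp -d)
git init -q "$repo"

# Default backend for everything, with a different one for a hypothetical
# class of files. The later, more specific line overrides the first.
printf '%s\n' '* annex.backend=MD5E' \
              '*.iso annex.backend=SHA256E' > "$repo/.gitattributes"

txt_backend=$(git -C "$repo" check-attr annex.backend -- notes.txt)
iso_backend=$(git -C "$repo" check-attr annex.backend -- image.iso)
echo "$txt_backend"
echo "$iso_backend"
```

A single-valued annex.backend setting could express only the first of those two lines.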
Also, it seems that the reasoning that made the annex.largefiles gitattributes override `git-annex config` would also make sense for annex.backend, and if so there would be no performance benefit to moving it. I'm not sure what that reasoning was. Possibly that there might be cases where the desired value depends on the branch that's checked out.

I would support adding annex.backend to `git-annex config` using the current simple value though. There is benefit to doing it for consistency, surely. If a more complicated syntax is needed someone can still use gitattributes or git-annex could be changed later.

Please open a new todo if this would be useful to you.
I also think that the thinking in this comment is worth considering: https://github.com/datalad/datalad/issues/5383#issuecomment-770108778
Those are valid reasons to prefer to be able to set things like this in gitattributes rather than the global config. Those kinds of considerations are why the global config always has a local way to override it. Sometimes that is necessarily .git/config, not .gitattributes though.