If I try to specify custom scripts for freeze/thaw in .git/config
of a repository with relative paths (since absolute paths are not robust to renames etc and thus IMHO should be avoided):
(datalad) [f006rq8@discovery-01 subdir]$ pwd
/dartfs/rc/lab/D/DBIC/DBIC/CON/asmacdo/tmp/test-local-thaw/subdir
(datalad) [f006rq8@discovery-01 subdir]$ git config get annex.thawcontent-command
.git/annex/thaw-content %path
(datalad) [f006rq8@discovery-01 subdir]$ git config get annex.freezecontent-command
.git/annex/freeze-content %path
their invocation fails when run from a subdirectory:
[2024-10-16 14:47:08.941720897] (Annex.Perms) freezing content ../.git/annex/objects/6k/VJ/MD5E-s115--9a295e3f5f148380d74c3ff3ebdaa173/MD5E-s115--9a295e3f5f148380d74c3ff3ebdaa173
[2024-10-16 14:47:08.948171243] (Utility.Process) process [2572997] call: sh ["-c",".git/annex/freeze-content '../.git/annex/objects/6k/VJ/MD5E-s115--9a295e3f5f148380d74c3ff3ebdaa173/MD5E-s115--9a295e3f5f148380d74c3ff3ebdaa173'"]
sh: .git/annex/freeze-content: No such file or directory
I wonder if a way could be added to specify them relative to the top of the repository. Perhaps under .git/hooks, e.g. .git/hooks/annex-{freeze,thaw}-content?
One simple solution is to put the script somewhere in PATH and then set the config to the name of the script rather than its location.
In the case of the freeze and thaw hooks, these are run a lot and so git-annex shouldn't be checking if .git/hooks/annex-freeze-content exists every time.
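The difference between a relative path and a bare script name can be reproduced without git-annex. This is a minimal sketch that only mimics the `sh -c` invocation seen in the debug log; the throwaway directory layout and the script name `annex-freeze-content` are assumptions made for the illustration:

```shell
#!/bin/sh
# Sketch: how `sh -c` resolves a command containing a slash (relative to
# the current directory) versus a bare name (looked up in PATH).
# The layout and the script name annex-freeze-content are illustrative.

demo_lookup() {
    top=$(mktemp -d) || return 1
    mkdir -p "$top/.git/annex" "$top/subdir" "$top/bin"

    # A stand-in for a freeze script; it just reports what it was given.
    printf '#!/bin/sh\necho "freezing $1"\n' > "$top/.git/annex/freeze-content"
    chmod +x "$top/.git/annex/freeze-content"
    cp "$top/.git/annex/freeze-content" "$top/bin/annex-freeze-content"

    # Relative path: works only when the cwd is the top of the work tree.
    (cd "$top" && sh -c ".git/annex/freeze-content 'some/file'")
    echo "from top: exit $?"
    (cd "$top/subdir" && sh -c ".git/annex/freeze-content 'some/file'" 2>/dev/null)
    echo "from subdir: exit $?"

    # Bare name: found via PATH regardless of the cwd.
    (cd "$top/subdir" && PATH="$top/bin:$PATH" sh -c "annex-freeze-content 'some/file'")
    echo "by name from subdir: exit $?"

    rm -rf "$top"
}
```

Calling `demo_lookup` shows the relative form failing only from the subdirectory. With the script installed somewhere in PATH, the setting would become something like `git config annex.freezecontent-command 'annex-freeze-content %path'`.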
Original line of my thought was expressed in this issue on github.
One recent case made git-annex "flip out" into adjusted branch mode (I have yet to try to reproduce it and follow up on add_config_var_preventing_adjusted_branch_mode); it happened when a user executed datalad with git-annex inside a singularity container. To facilitate reproducibility etc, we aim to minimize the effects of outside elements on execution within the container, so we bind mount only the current dataset and transfer only some git / git-annex settings. We could also check the paths of those scripts and bind mount them too. But if relying on PATH, we would also need to somehow ensure that PATH inside the container points to them (it might be overridden by the container's startup script, since after all the outside PATH might have little to do with the inside one -- think about running a docker container on OSX).
I think it would have been clean(er) if some initial invocation of the current global git-annex freeze/thaw script, which would determine whether freezing/thawing is needed at all (some partitions might not need it, some need one kind, others another), would instantiate in a given repository a copy of the specific freeze/thaw script tandem. But the inability to specify a relative path hinders that. Maybe, similarly to %path, there could be some %dotgit or similar variable pointing to the location of the .git folder, with our "freeze/thaw" installation script populating values like thawcontent-command = %dotgit/bin-annex/thaw-content %path? I guess you could also simply treat a leading ./ as signaling a path relative to the .git/ folder. Such substitution would need to be done once per repo upon reading that config setting; there is no need to sense whether the script is there or not, since if it is not, it is better to error out than to proceed with the "default" behavior (which seems to be "switch to adjusted branch").
I can see how it would be simpler for you to just be able to have those hooks in .git/hooks/ along with the rest of the repository.
I don't think that a special casing of "./" is a good idea, that would be pretty confusing and for all I know some user might really want git-annex to run a hook in the current directory of their git repository.
I am meh on "%dotgit"; for one thing, in a bare repository it's not .git.
What if git-annex just added the git hooks directory to the end of PATH when running configured annex.*-command configs? Then you could:
git config annex.thawcontent-command annex-thawcontent
and put your script in .git/hooks/annex-thawcontent
This nicely avoids git-annex doing any extra work in general to check if .git/hooks/ exists.
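What appending the hooks directory to the end of PATH would mean in practice can be sketched with plain shell. This only illustrates PATH lookup order and is not git-annex code; the temporary directories and the script contents are made up for the example:

```shell
#!/bin/sh
# Sketch: effect of appending a hooks directory to the END of PATH.
# Directory names and script contents are illustrative only.

demo_path_order() {
    tmp=$(mktemp -d) || return 1
    mkdir -p "$tmp/userbin" "$tmp/hooks"

    # The user's own program and a git hook that happen to share a name.
    printf '#!/bin/sh\necho "user update"\n' > "$tmp/userbin/update"
    printf '#!/bin/sh\necho "git hook update"\n' > "$tmp/hooks/update"
    # A script that exists only in the hooks directory.
    printf '#!/bin/sh\necho "thawing $1"\n' > "$tmp/hooks/annex-thawcontent"
    chmod +x "$tmp/userbin/update" "$tmp/hooks/update" "$tmp/hooks/annex-thawcontent"

    # Hooks directory appended last: earlier PATH entries win on a clash,
    # while names unique to the hooks directory are still found.
    (
        PATH="$tmp/userbin:$PATH:$tmp/hooks"
        update
        annex-thawcontent some/file
    )

    rm -rf "$tmp"
}
```

Running `demo_path_order` invokes the user's `update` rather than the hook of the same name, while `annex-thawcontent` is still resolved from the hooks directory.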
The reason I think it would need to be at the end of PATH rather than the front is that there are some git hooks with names like "update" and "applypatch" that I can imagine might have the same names as a user's own programs in their PATH. For example, if annex.commitmessage-command=foo and the script foo runs "update", the user would be surprised if that ran the git hook rather than their ~/bin/update.
On the other hand, when configuring an annex.*-command, it does not seem likely that the user would set it to "update" or any of the other names of git hooks. Especially if they didn't have an "update" in PATH. So using the git hook directory for this, rather than some other special directory under .git, seems ok.
It could be %gitdir or any other name you like. It could also be that any relative path is taken relative to that folder, since it makes no sense relative to the current working dir etc.
I really feel odd about changing PATH for this purpose, especially to point to .git/hooks, which wasn't intended to be added to PATH. I think if it was .git/annex/hooks, with documentation about such an act happening, it would be a bit more kosher, but I would still feel better about some more explicit path specification.
After sleeping on it, I concur that PATH changes feel unwise.
Also, it turns out that git-annex already actually caches existing hooks, so adding new hook scripts to .git/hooks/ even for things that run frequently is not a performance problem.
So, my plan is to add .git/hooks that are run in preference to some annex.*-command git configs. Eg, .git/hooks/freezecontent-annex corresponds to annex.freezecontent-command.
The "-annex" suffix matches the current pre-commit-annex and post-update-annex hooks. I am also considering adding git configs corresponding to the existing hooks. I doubt that there would be much of a use case for configuring annex.pre-commit-command rather than the pre-commit-annex hook, since the hook is there only to let users who would usually install a pre-commit hook install their hook script without getting in the way of the pre-commit hook that git-annex writes. But it seems worth having the git config just for consistency.
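A minimal sketch of what a freezecontent-annex / thawcontent-annex hook pair might look like. It assumes each hook is handed the path to operate on as its first argument (the role %path plays in the git config), and it uses plain `chmod` as a placeholder for whatever a site actually needs; `install_annex_hooks` is a hypothetical helper, not part of git-annex:

```shell
#!/bin/sh
# Sketch: write placeholder freezecontent-annex / thawcontent-annex hooks
# into a hooks directory. Assumption: each hook gets the file or directory
# to operate on as "$1". install_annex_hooks is a hypothetical helper.

install_annex_hooks() {
    hooks_dir=$1
    mkdir -p "$hooks_dir" || return 1

    cat > "$hooks_dir/freezecontent-annex" <<'EOF'
#!/bin/sh
# Placeholder freeze: drop write permission for everyone.
exec chmod -R a-w "$1"
EOF

    cat > "$hooks_dir/thawcontent-annex" <<'EOF'
#!/bin/sh
# Placeholder thaw: give the owner write permission back.
exec chmod -R u+w "$1"
EOF

    chmod +x "$hooks_dir/freezecontent-annex" "$hooks_dir/thawcontent-annex"
}
```

Run from the top of a non-bare repository, `install_annex_hooks .git/hooks` would place both scripts next to the repository's other hooks, so nothing depends on an absolute path or on PATH.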
There are some things like annex.youtube-dl-command and annex.shared-sop-command that are configuring commands for git-annex to run, and are not really hooks per se.
And it does not make sense to have hook scripts that are specific to a given remote, corresponding to configs like remote.name.annex-cost-command. Instead there could be a single .git/hooks/remote-cost-annex that is passed the name of the remote.
Implemented hooks: freezecontent-annex, thawcontent-annex, secure-erase-annex, commitmessage-annex, http-headers-annex
That leaves only remote.name.annex-cost-command and similar git configs that don't have hooks. And a few like annex.youtube-dl-command that are not really equivalent to hooks.
I think I will wait on adding hooks for remote git configs; I'd rather talk with someone who has a use case for that than make up something for completeness. I am not currently liking the idea of including a remote name in the hook for those, but perhaps someone would have a use case that argues otherwise.
Coolio! So my ACLs use-case should need/entail setting annex.pre-init-hook globally to point to a script which would check if ACL is used on the specific folder where git annex init is run. If it is run, it would then either place in .git/hooks or remove (if not required) freezecontent-annex and thawcontent-annex, each of which would take a single path (file or directory) that needs to be frozen or thawed.
Is that a correct/complete picture, or am I missing anything I would need to do for my use case?
That seems like it would work. I suppose you could instead always install the freeze/thaw hooks, and just make them do nothing when ACL is not used. If probing for that is expensive or better to only do once for some other reason, having the pre-init set up the hooks would make sense.
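The pre-init arrangement discussed here could look roughly like the following. It is a sketch under several assumptions: `dir_uses_acl` is a hypothetical probe (stubbed with a shell variable, since how to detect ACL use is site-specific), `setup_freeze_thaw_hooks` is a made-up helper name, and the hook bodies are placeholders that assume the path arrives as the first argument:

```shell
#!/bin/sh
# Sketch of a pre-init style setup: place the freeze/thaw hooks when the
# directory needs them, remove them otherwise.

# Hypothetical probe; a real one would inspect the filesystem instead of
# reading the ASSUME_ACL variable.
dir_uses_acl() {
    [ "${ASSUME_ACL:-no}" = yes ]
}

setup_freeze_thaw_hooks() {
    hooks_dir=$1
    if dir_uses_acl "$hooks_dir"; then
        mkdir -p "$hooks_dir" || return 1
        for name in freezecontent-annex thawcontent-annex; do
            # Placeholder body; real ACL handling for "$1" would go here.
            printf '#!/bin/sh\n# %s placeholder: handle "$1" here\nexit 0\n' \
                "$name" > "$hooks_dir/$name"
            chmod +x "$hooks_dir/$name"
        done
    else
        rm -f "$hooks_dir/freezecontent-annex" "$hooks_dir/thawcontent-annex"
    fi
}
```

Setting `ASSUME_ACL=yes` before calling `setup_freeze_thaw_hooks .git/hooks` exercises the install branch; any other value removes the two hooks again.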