specify freeze/thaw scripts relative to topdir

If I try to specify custom scripts for freeze/thaw in .git/config of a repository with relative paths (since absolute paths are not robust to renames etc and thus IMHO should be avoided):

(datalad) [f006rq8@discovery-01 subdir]$ pwd
/dartfs/rc/lab/D/DBIC/DBIC/CON/asmacdo/tmp/test-local-thaw/subdir

(datalad) [f006rq8@discovery-01 subdir]$ git config get annex.thawcontent-command
.git/annex/thaw-content %path

(datalad) [f006rq8@discovery-01 subdir]$ git config get annex.freezecontent-command
.git/annex/freeze-content %path

their invocation fails when run from a subdirectory:

[2024-10-16 14:47:08.941720897] (Annex.Perms) freezing content ../.git/annex/objects/6k/VJ/MD5E-s115--9a295e3f5f148380d74c3ff3ebdaa173/MD5E-s115--9a295e3f5f148380d74c3ff3ebdaa173
[2024-10-16 14:47:08.948171243] (Utility.Process) process [2572997] call: sh ["-c",".git/annex/freeze-content '../.git/annex/objects/6k/VJ/MD5E-s115--9a295e3f5f148380d74c3ff3ebdaa173/MD5E-s115--9a295e3f5f148380d74c3ff3ebdaa173'"]
sh: .git/annex/freeze-content: No such file or directory
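
The failure can be reproduced without git-annex. The sketch below is only an illustration: it uses a scratch directory in place of a real repository and a one-line stand-in for the freeze script, and it runs the same `sh -c` command string from the top directory and from a subdirectory.

```shell
# Scratch layout standing in for a repository (hypothetical, no git-annex involved).
top=$(mktemp -d)
mkdir -p "$top/.git/annex" "$top/subdir"
printf '#!/bin/sh\necho "freeze $1"\n' > "$top/.git/annex/freeze-content"
chmod +x "$top/.git/annex/freeze-content"

# From the top directory the relative command path resolves.
from_top=$(cd "$top" && sh -c ".git/annex/freeze-content 'some/object'")

# From a subdirectory the same string refers to subdir/.git/annex/freeze-content,
# which does not exist, so the shell reports an error and a non-zero status.
from_subdir_status=0
(cd "$top/subdir" && sh -c ".git/annex/freeze-content '../some/object'") 2>/dev/null \
    || from_subdir_status=$?

echo "top: $from_top"
echo "subdir exit status: $from_subdir_status"
```

The command string is resolved against the current working directory of the process that runs it, which is why only the content path (rewritten to `../.git/...`) survives the change of directory.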

I wonder if there could be a way added to be able to specify them relative to the top of the repository.

done --Joey

comment 1

As an alternative/complementary idea: could git-annex support simply having those scripts under .git/hooks, e.g. .git/hooks/annex-{freeze,thaw}-content?

Comment by yarikoptic — Wed Oct 16 18:58:04 2024

comment 2

One simple solution is to put the script somewhere in PATH and then set the config to the name of the script rather than its location.

In the case of the freeze and thaw hooks, these are run a lot and so git-annex shouldn't be checking if .git/hooks/annex-freeze-content exists every time.
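
A minimal sketch of that PATH-based setup, under stated assumptions: the script name `annex-freeze-content` and the scratch directories are hypothetical, and the script body is a stand-in. The `sh -c` call mirrors the invocation visible in the debug log above.

```shell
# Scratch directories standing in for a bin directory and a repository (hypothetical).
bindir=$(mktemp -d)
repo=$(mktemp -d)
mkdir -p "$repo/subdir"
printf '#!/bin/sh\necho "freeze $1"\n' > "$bindir/annex-freeze-content"
chmod +x "$bindir/annex-freeze-content"

# The matching setting would name the script rather than locate it:
#   git config annex.freezecontent-command 'annex-freeze-content %path'
# A bare command name is looked up in PATH, so the working directory no longer matters.
out=$(cd "$repo/subdir" && PATH="$bindir:$PATH" sh -c "annex-freeze-content '../some/object'")
echo "$out"
```

This only works where PATH actually contains the script's directory, which is the limitation raised later in the thread for containers.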

Comment by joey — Mon Oct 21 15:25:22 2024

maybe %dotgit?

My original line of thought was expressed in this issue on GitHub.
One recent case that made git-annex "flip out" into adjusted branch mode (I have yet to try to reproduce it and follow up on add_config_var_preventing_adjusted_branch_mode) happened when a user executed datalad with git-annex inside a Singularity container. To facilitate reproducibility, we aim to minimize the effect of outside elements on execution within the container, so we bind mount only the current dataset and transfer only some git / git-annex settings. We could also check the paths of those scripts and bind mount them too. If relying on PATH, we would also need to ensure that PATH inside the container points to them (it might be overridden by the container's startup script, since the outside PATH might have little to do with the inside one; think of running a Docker container on OSX).

I think it would be cleaner if an initial invocation of the current global git-annex freeze/thaw script determined whether freezing/thawing is needed at all (some partitions might not need it, some need one kind, others another), and then instantiated a copy of the specific freeze/thaw script pair in the given repository. But the inability to specify a relative path hinders that. Maybe, similarly to %path, there could be a %dotgit or similar variable pointing to the location of the .git folder, with our freeze/thaw installation script populating values like thawcontent-command = %dotgit/bin-annex/thaw-content %path? I guess git-annex could also simply treat a leading ./ as meaning relative to the .git/ folder. Such a substitution would need to be done only once per repository, upon reading that config setting; there is no need to check whether the script is there. If it is not, it is better to error out than to proceed with the "default" behavior (which seems to be "switch to adjusted branch").

Comment by yarikoptic — Mon Jan 6 23:38:39 2025

comment 4

I can see how it would be simpler for you to just be able to have those hooks in .git/hooks/ along with the rest of the repository.

I don't think that a special casing of "./" is a good idea, that would be pretty confusing and for all I know some user might really want git-annex to run a hook in the current directory of their git repository.

I am meh on "%dotgit", for one thing in a bare repository it's not .git.

What if git-annex just added the git hooks directory to the end of PATH when running configured annex.*-commands? Then you could: git config annex.thawcontent-command annex-thawcontent and put your script in .git/hooks/annex-thawcontent

This nicely avoids git-annex doing any extra work in general to check if .git/hooks/ exist.

The reason I think it would need to be at the end of PATH rather than the front is that there are some git hooks with names like "update" and "applypatch" that I can imagine might have the same names as a user's own programs in their PATH. For example, if annex.commitmessage-command=foo and the script foo runs "update", the user would be surprised if that ran the git hook rather than their ~/bin/update.

On the other hand, when configuring an annex.*-command, it does not seem likely that the user would set it to "update" or any of the other names of git hooks. Especially if they didn't have an "update" in PATH. So using the git hook directory for this, rather than some other special directory under .git seems ok.
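
The ordering argument can be illustrated with plain PATH lookup. This is only a sketch of the proposal as described in this comment, not of anything git-annex does: the two scratch directories stand in for ~/bin and .git/hooks, and all three scripts are hypothetical stand-ins.

```shell
userbin=$(mktemp -d)   # stands in for ~/bin
hooks=$(mktemp -d)     # stands in for .git/hooks
printf '#!/bin/sh\necho "user update"\n' > "$userbin/update"
printf '#!/bin/sh\necho "git hook update"\n' > "$hooks/update"
printf '#!/bin/sh\necho "thaw $1"\n' > "$hooks/annex-thawcontent"
chmod +x "$userbin/update" "$hooks/update" "$hooks/annex-thawcontent"

# Hooks directory appended: the user's own "update" still wins.
appended=$(PATH="$userbin:$PATH:$hooks" sh -c 'update')

# Hooks directory prepended: the git hook shadows the user's program.
prepended=$(PATH="$hooks:$userbin:$PATH" sh -c 'update')

# A name that exists only in the hooks directory is found either way.
thaw=$(PATH="$userbin:$PATH:$hooks" sh -c "annex-thawcontent 'some/object'")

echo "appended: $appended"
echo "prepended: $prepended"
echo "thaw: $thaw"
```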

Comment by joey — Thu Jan 9 17:57:05 2025

comment 5

It could be %gitdir or any other name you like. Or any relative path could be interpreted relative to that folder, since a path relative to the current working directory makes no sense here.

I really feel odd about changing PATH for this purpose, especially to point to .git/hooks, which wasn't intended to be added to PATH. I think if it was .git/annex/hooks, with documentation about such an act happening, it would be a bit more kosher, but I still feel better about some more explicit path specification.

Comment by yarikoptic — Fri Jan 10 03:20:31 2025

comment 6

After sleeping on it, I concur that PATH changes feel unwise.

Also, it turns out that git-annex already actually caches existing hooks, so adding new hook scripts to .git/hooks/ even for things that run frequently is not a performance problem.

So, my plan is to add .git/hooks that are run in preference to some annex.*-command git configs. Eg, .git/hooks/freezecontent-annex corresponds to annex.freezecontent-command.

The "-annex" suffix matches the current pre-commit-annex and post-update-annex hooks. Also considering adding git configs corresponding to the existing hooks. I doubt that there would be much use case for configuring annex.pre-commit-command rather than the pre-commit-annex hook, since the hook is there only to let users who would usually install a pre-commit hook to install their hook script without getting in the way of the pre-commit hook that git-annex writes. But, it seems worth having the git config just for consistency.

There are some things like annex.youtube-dl-command and annex.shared-sop-command that are configuring commands for git-annex to run, and are not really hooks per se.

And it does not make sense to have hook scripts that are specific to a given remote corresponding to configs like remote.name.annex-cost-command. Instead there could be a single .git/hooks/remote-cost-annex that is passed the name of the remote.

Comment by joey — Fri Jan 10 16:13:44 2025

comment 7

Implemented hooks: freezecontent-annex, thawcontent-annex, secure-erase-annex, commitmessage-annex, http-headers-annex
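
As an illustration, a freeze/thaw hook pair might look like the sketch below. It rests on assumptions that this page does not confirm in detail: that each hook receives the path to operate on as its first argument (as comment 8 describes it), and that `chmod` is an acceptable placeholder for whatever site-specific commands are really needed. A scratch directory stands in for .git/hooks.

```shell
# Scratch directory standing in for the repository's .git/hooks (hypothetical).
hooks=$(mktemp -d)

# Assumption: the hook is called with the path to freeze as "$1".
cat > "$hooks/freezecontent-annex" <<'EOF'
#!/bin/sh
# Placeholder policy: drop write permission. An ACL-based site would run its own commands here.
chmod -R a-w "$1"
EOF

cat > "$hooks/thawcontent-annex" <<'EOF'
#!/bin/sh
# Placeholder policy: give the owner write permission back.
chmod -R u+w "$1"
EOF
chmod +x "$hooks/freezecontent-annex" "$hooks/thawcontent-annex"

# Exercise the pair on a scratch file and record the permission strings.
obj=$(mktemp)
chmod 644 "$obj"
"$hooks/freezecontent-annex" "$obj"
frozen=$(ls -l "$obj" | cut -c1-10)
"$hooks/thawcontent-annex" "$obj"
thawed=$(ls -l "$obj" | cut -c1-10)
echo "frozen: $frozen, thawed: $thawed"
```

Because the hooks live inside the repository's own .git directory, they travel with it across renames and bind mounts, which was the motivation for the original request.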

That leaves only remote.name.annex-cost-command and similar git configs that don't have hooks. And a few like annex.youtube-dl-command that are not really equivalent to hooks.

I think I will wait on adding hooks for remote git configs, I'd rather talk with someone who has a use case for that than make up something for completeness. Am not currently liking the idea of including a remote name in the hook for those, but perhaps someone would have a use case that argues otherwise.

Comment by joey — Fri Jan 10 18:50:07 2025

comment 8

Coolio! So my ACLs use-case should need/entail

Set up annex.pre-init-hook globally to point to a script that checks whether ACLs are used on the folder where git annex init is run. When it runs, that script would then:
- Under .git/hooks, either place or remove (if not required) freezecontent-annex and thawcontent-annex, each of which takes a single path (a file or a directory) to be frozen or thawed.

Is that a correct/complete picture, or am I missing anything I would need to do for my use case?
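
The place-or-remove step described above could be sketched as follows. Everything here is hypothetical: the helper name `setup_freeze_thaw_hooks`, its two arguments, and the placeholder hook body are illustrations only, and the ACL probe itself is not shown (the caller passes its result as "yes" or "no"). A scratch directory stands in for .git/hooks.

```shell
# Hypothetical helper: $1 is the hooks directory, $2 is "yes" when the
# (not shown) ACL probe decided that freeze/thaw hooks are required.
setup_freeze_thaw_hooks() {
    target=$1
    want=$2
    for name in freezecontent-annex thawcontent-annex; do
        if [ "$want" = yes ]; then
            # Placeholder body; the site-specific ACL commands would go here.
            printf '#!/bin/sh\n# operate on "$1"\n' > "$target/$name"
            chmod +x "$target/$name"
        else
            # Not required on this filesystem: make sure no stale hook remains.
            rm -f "$target/$name"
        fi
    done
}

hooksdir=$(mktemp -d)   # stands in for .git/hooks
setup_freeze_thaw_hooks "$hooksdir" yes
installed_count=$(ls "$hooksdir" | wc -l | tr -d ' ')
setup_freeze_thaw_hooks "$hooksdir" no
remaining_count=$(ls "$hooksdir" | wc -l | tr -d ' ')
echo "installed: $installed_count, remaining: $remaining_count"
```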

Comment by yarikoptic — Fri Jan 10 20:27:48 2025

Re: comment 8

That seems like it would work. I suppose you could instead always install the freeze/thaw hooks, and just make them do nothing when ACL is not used. If probing for that is expensive or better to only do once for some other reason, having the pre-init set up the hooks would make sense.