Please describe the problem.
We used a script calling out to reregisterurl to move URLs from the datalad remote to the regular web remote: https://github.com/dandi/dandisets/pull/387/files.
Even after removing all URLs, the key is still associated with the remote, and thus annex find-able:
dandi@drogon:/mnt/backup/dandi/dandisets/000897$ git annex whereis sub-amadeus/sub-amadeus_ses-08152019_behavior+ecephys.nwb
whereis sub-amadeus/sub-amadeus_ses-08152019_behavior+ecephys.nwb (2 copies)
00000000-0000-0000-0000-000000000001 -- web
cf13d535-b47c-5df6-8590-0793cb08a90a -- datalad
web: https://api.dandiarchive.org/api/assets/d3a96834-ee80-4afa-b985-82066817272c/download/
web: https://dandiarchive.s3.amazonaws.com/blobs/a6e/c32/a6ec3274-ceeb-4d21-b091-1e991a512c7b?versionId=Vt7RKy0cgO1L82S7tqIQRQgNHBBZVtVh
ok
I think that git-annex should have completely dissociated that remote from the key whenever the very last url was reregistered.
What version of git-annex are you using? On what operating system?
10.20240430-1~ndall+1
Removing the last url from the web special remote makes it treat content as no longer present in that remote. That is a documented special case.
For all other special remotes, git-annex does not have any reason to expect that removing a record of an url will mean that the special remote will not be able to still retrieve content that was stored on it.
reregisterurl is no different than unregisterurl in this respect. Both document this special case. You've filed multiple bugs about rmurl having the same behavior in the past, IIRC. And IIRC those all got closed, so I guess I'll close this one too.
Well, it then seems to me that the "web" special remote is not that special, and it would be useful to extend its special case to other special remotes exhibiting similar behavior of being useless if no URLs are registered for a key.
BTW, isn't it not just web but also bittorrent (do not have any URL handy to check ATM), since I expect it also to be a special remote needing to know a URL?
Anyways, sorry for the noise, but as it happens with special cases and aging brains, I keep forgetting some of them. Hopefully I will make a better mental note about this one.
It would need some way for a special remote to indicate to git-annex that it is in this unusual class of remotes where not having an url is the same as content no longer being present in it.
Implementing that would just make some more remotes have a special case, which seems even harder to remember. I'd rather remove the special case, but of course that will break existing workflows.
And it is unusual. Consider an S3 remote. It can have an url recorded for an object stored in it, but forgetting the url doesn't mean that the S3 bucket no longer contains the file. If git-annex behaved this way for S3, it would be broken in a way that could be expensive to the user.
(The special case is not currently implemented for bittorrent special remote. But it also doesn't record urls in a user-visible way actually.)
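To make the distinction above concrete, here is a toy model of the rule being discussed. It is only a sketch, not git-annex's implementation: the `Remote` and `KeyState` classes and the `url_only` flag are hypothetical names invented for illustration, and the S3 UUID and URLs are made up. The point is just that dropping the last url clears presence only for a remote in the "url-only" class (today, only web), while for anything else the location record is left alone.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class Remote:
    uuid: str
    name: str
    # Hypothetical flag: True when "no url recorded" means "content not present".
    url_only: bool = False


@dataclass
class KeyState:
    """Toy per-key record: urls per remote, plus remotes believed to hold the key."""
    urls: Dict[str, Set[str]] = field(default_factory=dict)
    present: Set[str] = field(default_factory=set)

    def register_url(self, remote: Remote, url: str) -> None:
        # Assumption of this model: recording an url also marks the key present.
        self.urls.setdefault(remote.uuid, set()).add(url)
        self.present.add(remote.uuid)

    def unregister_url(self, remote: Remote, url: str) -> None:
        remaining = self.urls.get(remote.uuid, set())
        remaining.discard(url)
        # Special case: only a url-only remote loses presence with its last url.
        if not remaining and remote.url_only:
            self.present.discard(remote.uuid)


web = Remote("00000000-0000-0000-0000-000000000001", "web", url_only=True)
s3 = Remote("11111111-1111-1111-1111-111111111111", "s3")  # made-up UUID

state = KeyState()
state.register_url(web, "https://example.com/file")
state.register_url(s3, "https://bucket.example.com/file")
state.unregister_url(web, "https://example.com/file")
state.unregister_url(s3, "https://bucket.example.com/file")

print(sorted(state.present))  # only the S3 remote is still listed
```

In this model the datalad remote from the report behaves like `s3`: removing its last url leaves it listed, which matches the whereis output above.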
yes: remote would need to indicate to git-annex that "feature". ATM git-annex already does that via EXTENSIONS to announce what it can do, and it seems that so it could have announced or alike.
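As a rough illustration of the idea of announcing such a "feature" via EXTENSIONS, here is a sketch of how an external special remote might answer git-annex's EXTENSIONS message. The keyword `URLONLY` is purely hypothetical and does not exist in git-annex; `reply_to_extensions` and `KNOWN_EXTENSIONS` are likewise names made up for this sketch. It assumes the remote only echoes back keywords that git-annex itself offered, so with a git-annex that never offers `URLONLY` nothing changes.

```python
# Keywords this (imaginary) remote understands. INFO is an existing protocol
# extension; URLONLY is a HYPOTHETICAL keyword used only for illustration.
KNOWN_EXTENSIONS = {"INFO", "URLONLY"}


def reply_to_extensions(line: str) -> str:
    """Build the remote's reply to one protocol line sent by git-annex.

    Only the EXTENSIONS message is handled here; anything else gets the
    generic UNSUPPORTED-REQUEST reply.
    """
    parts = line.split()
    if not parts or parts[0] != "EXTENSIONS":
        return "UNSUPPORTED-REQUEST"
    # Announce only what both sides know, in a stable order.
    common = sorted(KNOWN_EXTENSIONS & set(parts[1:]))
    if not common:
        return "UNSUPPORTED-REQUEST"
    return "EXTENSIONS " + " ".join(common)


# What the exchange might look like if git-annex ever offered the keyword.
print(reply_to_extensions("EXTENSIONS INFO ASYNC"))
print(reply_to_extensions("EXTENSIONS INFO ASYNC URLONLY"))
```

Whether git-annex would then treat "last url removed" as "content gone" for such a remote is exactly the behavior change being debated here; the sketch only covers the announcement side.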
yes -- it is "unusual" as not every special remote would be a "URL-only" remote. But there is AFAIK a growing number of custom remotes which are like that, at least in datalad land: datalad, datalad-archives, datalad-uncurl and likely others. Typically they are "read-only" remotes, and the URL is used as the identifier for custom "downloader" support.
It is interesting that the bittorrent remote doesn't make them visible... never used it, but wondered how my life would have been if I wanted to manage a collection of torrent urls per each key...