It is possible to deposit files in a remote's annex, for example via a p2phttp request. In this case, the deposited file was never known to the git-annex branch metadata. It is my understanding that in this case all the "unused" tooling is not applicable.
Does git-annex provide means to scan an annex for unexpected annex keys, and maybe for ingesting them such that they appear as unused?
Thx!
git-annex p2phttp does update the git-annex branch itself when receiving files. And generally, any time git-annex stores an object in a repository, it updates the git-annex branch accordingly.
So, you can fetch from the remote and learn about those objects, and then git-annex unused --from=$remote will show you unused objects in the remote.
When running git-annex unused on the local repository, it does list all objects in the local repository. So if an object somehow does get into the repository without a branch update, it will still show as unused.
There is no way to list all objects present in a remote. Special remotes are not required to support enumeration at all. So, if an object got sent to a special remote, and the git-annex branch record of that was lost, there would be no way to find that unused object.
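For the local case, where the objects sit on disk, the enumeration can be approximated from outside git-annex. What follows is only a minimal sketch, assuming the usual layout in which each object is stored as a file named after its key, inside a directory of the same name, somewhere below the objects directory. The helper name list_object_keys is made up for illustration, and this is not a claim about how git-annex unused works internally.

```python
import os


def list_object_keys(objects_dir):
    """Return the sorted keys found below an annex objects directory.

    Assumption: every object lives at <hash dirs>/<KEY>/<KEY>, i.e. it is a
    file whose name equals the name of its parent directory.
    """
    keys = set()
    for dirpath, _dirnames, filenames in os.walk(objects_dir):
        parent = os.path.basename(dirpath)
        # Only count files that match the name of the directory holding them;
        # anything else (temporary files, stray data) is ignored.
        if parent in filenames:
            keys.add(parent)
    return sorted(keys)
```

In a non-bare repository the directory to pass would be .git/annex/objects, in a bare one annex/objects. The resulting list could then be compared by hand against what the git-annex branch records for that repository.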
Thanks for detailing the behavior. I am observing something different, though. The context is a git-annex repo at a forgejo-aneksajo site.
I used a JS client to upload annex keys to an annex with uuid f1a8ef1c-.... This worked. I see them in annex/objects at the remote.
I also see this:
This made me (incorrectly) wonder whether this could mean that the repo thinks the upload came FROM f1a8ef1c-...?
The p2phttp request is made to an endpoint that is composed like this:
where
Notice that clientUuid is not a UUID (redacted original value that also was not a valid UUID).
I have adjusted that to be an actual UUID, and did another upload. This achieved two things:
However, the new upload is now sitting in the journal, and has not been taken into account, and additional uploads do not trigger a git-annex branch update immediately.
This issue may be in the realm of forgejo-aneksajo, and how it runs the p2phttp server. The previous uploads were made mid-December (as seen from the timestamps in the journal). Nothing has triggered a journal commit, also not the fetch of the git-annex branch.
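Stale journal entries like these can be spotted by listing the journal directory together with file ages. This is a rough sketch only, assuming that the journal is the annex/journal directory inside the git directory (.git/annex/journal in a non-bare repository) and that each pending change is a regular file in it. The helper name pending_journal_entries is hypothetical and not part of git-annex.

```python
import os
import time


def pending_journal_entries(journal_dir, now=None):
    """Return (filename, age_in_seconds) pairs for journal files, oldest first.

    Assumption: every regular file in journal_dir is a change that has not
    yet been committed to the git-annex branch.
    """
    now = time.time() if now is None else now
    entries = []
    if not os.path.isdir(journal_dir):
        return entries
    with os.scandir(journal_dir) as it:
        for entry in it:
            if entry.is_file():
                # Age is measured from the file's last modification time.
                entries.append((entry.name, now - entry.stat().st_mtime))
    # Largest age first, so long-pending entries stand out.
    entries.sort(key=lambda item: item[1], reverse=True)
    return entries
```

An empty result would suggest the journal has been committed; entries that are weeks old match the situation described above.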
This seems like a bug in the p2phttp server: it should not be leaving the git-annex branch uncommitted for long periods of time. It's easy enough to show that it leaves changes in the journal for a long time.
Probably we don't usually notice the bug because, if the p2phttp server doesn't commit the journal, the client will usually record the same information in the git-annex branch on its side, and push it out in the normal course of events, e.g. during a sync. I assume your JS client doesn't do that.
I've filed a bug: p2phttp timely journal commit
(As to the p2phttp clientuuid parameter, it is actually only used in transfer logs, which don't get into the git-annex branch. Using a made-up non-UUID there, or for that matter, using a UUID that "belongs" to someone else won't cause any real problem; git-annex info will use the non-UUID in the "transfers in progress" display. This does not seem related to your problem.)