Hello,
Has anyone encountered this issue:
I have a repository in version 7. It has various remotes and clones from my frantic attempts to recover my data without really knowing what I'm doing. Anyways, the files in the repository are text files with hashes in them:
cat demo_beheer.gpkg
/annex/objects/SHA256E-s204800--a518c074bc22f673f0c73191a01426fef0a7d8b262a17d2729a4a3ac51da40ce.gpkg
But in .git/annex/objects there are two-letter directories. I can find this file in there, but its name is different from the one above. All the .gpkg files under .git/annex/objects are intact (I can open them), and also appear to be versions of the same file (the demo_beheer.gpkg one I'm looking for).
find -name '*.gpkg' -type f
./.git/annex/objects/Gz/v4/SHA256E-s204800--01120000361af90c29ee27a51ef7a6157bc413dc768d8ba495b7df8360c6dbfe.gpkg/SHA256E-s204800--01120000361af90c29ee27a51ef7a6157bc413dc768d8ba495b7df8360c6dbfe.gpkg
./.git/annex/objects/mV/6j/SHA256E-s204800--222d8fe6975a07a6305b27a453c7db62df0518458d53252bee2f8bac16d1329c.gpkg/SHA256E-s204800--222d8fe6975a07a6305b27a453c7db62df0518458d53252bee2f8bac16d1329c.gpkg
./.git/annex/objects/Q3/30/SHA256E-s204800--71d2d90cb98ea98806b6f9ae479ffae7d2d7f6b1fb6ea970c108ef0b7b0a52ec.gpkg/SHA256E-s204800--71d2d90cb98ea98806b6f9ae479ffae7d2d7f6b1fb6ea970c108ef0b7b0a52ec.gpkg
./.git/annex/objects/X7/QJ/SHA256E-s204800--d3ce397eb2f1d5080641e15a8d28a5ebabf56ce03b756ed6ceb93fec0d390c72.gpkg/SHA256E-s204800--d3ce397eb2f1d5080641e15a8d28a5ebabf56ce03b756ed6ceb93fec0d390c72.gpkg
./.git/annex/transfer/failed/download/a3f8a46a-60fe-58e2-901b-2c093bcc22d3/SHA256E-s3989504--78a7d01d5b7331cb867464c5787264292b39905533be58822a63c9f6d9ea8b3d.gpkg
./.git/annex/transfer/failed/download/a3f8a46a-60fe-58e2-901b-2c093bcc22d3/SHA256E-s1929216--03c015bdd9ac6efadf5d855bed734fb20531938d93d70c075cfc47b5f3f3a64b.gpkg
./.git/annex/transfer/failed/download/a3f8a46a-60fe-58e2-901b-2c093bcc22d3/SHA256E-s946176--49fe39ceb46d2518bfff4dfe2e1e83c043ad76f13871c4143a5fff68540d943c.gpkg
./.git/annex/transfer/failed/download/a3f8a46a-60fe-58e2-901b-2c093bcc22d3/SHA256E-s204800--a518c074bc22f673f0c73191a01426fef0a7d8b262a17d2729a4a3ac51da40ce.gpkg
./.git/annex/transfer/failed/download/a3f8a46a-60fe-58e2-901b-2c093bcc22d3/SHA256E-s51445760--a44651f781e5fce11bc498ba7fc30a0b79e5f7e282226852ef949f45888fb6eb.gpkg
./.git/annex/transfer/failed/download/a3f8a46a-60fe-58e2-901b-2c093bcc22d3/SHA256E-s4091904--3eb3df304d9fab549dcc657198c88bfc300f8c11836ecc62ae68538a67e3d430.gpkg
./.git/annex/transfer/failed/download/a3f8a46a-60fe-58e2-901b-2c093bcc22d3/SHA256E-s63684608--e5ff8eb805b96c7e231b7450514d101397a98a11f9320416d78084b9cad58e93.gpkg
./demo_beheer.gpkg
Is this a recognized result of some slip-up I've made? I can't remember what exactly I did to reach here other than try to clone the repo various times, and I might have messed up some paths because of struggling with the gcrypt url formats. I can of course provide more details if that would help. Alternatively, does someone have an idea of how I could recover these files simply, to start over?
Help would be much appreciated.
I forgot to mention, all I've been able to find that seemed related was this post: https://git-annex.branchable.com/bugs/fix_git-annex_paths47objects_40repository_not_available41/
but I couldn't figure out if this was a similar problem to mine.
"/annex/objects/SHA256E-s204800--a518c074bc22f673f0c73191a01426fef0a7d8b262a17d2729a4a3ac51da40ce.gpkg" is used when an annexed object in a v7 repository is unlocked.
If you run git annex lock on it, it will be turned back into a symlink to the .git/annex/objects file.
Normally unlocked files have that pointer replaced with the file content when it's available, and only when the file content is not available would you see that pointer. I guess you've done something to get your repository into this state where the content is present but the unlocked file is not populated with it. It's likely that running git annex fsck on the file would fix that problem.
Looking more closely at your list of files, your repository does not contain a copy of the current version of demo_beheer.gpkg, which is "SHA256E-s204800--a518c074bc22f673f0c73191a01426fef0a7d8b262a17d2729a4a3ac51da40ce.gpkg".
There's evidence you tried to download that key from somewhere, but the download failed.
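The check described above can be repeated by hand. The following is a minimal sketch, assuming the pointer file has the "/annex/objects/KEY" form shown in this thread; annex_pointer_key and annex_find_key are hypothetical helper names, not git-annex commands, and they only look at the file system.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# Print the key if FILE's first line has the "/annex/objects/KEY" form;
# return non-zero if FILE does not look like a pointer file.
annex_pointer_key() {
    first_line=$(head -c 512 "$1" | tr -d '\000' | head -n 1)
    case "$first_line" in
        /annex/objects/?*) printf '%s\n' "${first_line#/annex/objects/}" ;;
        *) return 1 ;;
    esac
}

# List every file named KEY below REPO/.git/annex (object store,
# failed-transfer records, and so on). REPO defaults to the current directory.
annex_find_key() {
    find "${2:-.}/.git/annex" -type f -name "$1" 2>/dev/null
}
```

Run from the top of the repository as `key=$(annex_pointer_key demo_beheer.gpkg) && annex_find_key "$key"`. A match under .git/annex/objects would mean the content is present locally; going by the find listing earlier in this thread, the only match would be the record under transfer/failed/download.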
So, it seems that your repository is not in any unusual state; you're just confused about how an unlocked file whose content is not present looks. Probably commands like these will be useful:
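The commands that originally followed are not preserved here. Judging from the next reply, which mentions running get and whereis, they were presumably along these lines. This is a sketch, not a quotation, and annex_locate_and_get is a hypothetical wrapper name.

```shell
#!/bin/sh
# Sketch only: annex_locate_and_get is a hypothetical wrapper, not a git-annex command.
annex_locate_and_get() {
    git annex whereis "$1"   # list the repositories recorded as holding the content
    git annex get "$1"       # try to retrieve the content from one of them
}

# Example: annex_locate_and_get demo_beheer.gpkg
```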
Thanks for your attention, Joey.
Hmm.
git annex fsck reports 'ok' for all files. After that, running get/whereis I get:
This is the remote in question, which is accessible (it's on the same drive as the current repo).
However, I can run git annex sync smdata_remote_wd_elements_small, which does complete successfully, so why is it not accessible with get or copy?
If I lock demo_beheer.gpkg, it turns into this symlink:
of which the target indeed does not exist (the directory exists, not the file). However, git annex get still fails after locking (and a subsequent sync) and I am at a loss to know why.
This wouldn't be due to the nature of the remote, or my URL for it, or something? Decryption works fine when syncing.
git-annex sync does not, by default, download the content of annexed files. Use git annex get.
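To make the distinction concrete, here is a small sketch that separates the two steps. The remote name is whatever `git remote` lists, annex_fetch_from is a hypothetical helper, and the behaviour of sync without options is assumed to be as described above.

```shell
#!/bin/sh
# Sketch only: annex_fetch_from is a hypothetical helper name.
# Usage: annex_fetch_from REMOTE FILE...
annex_fetch_from() {
    remote=$1; shift
    # Exchanges git branches (including the git-annex branch) with the
    # remote; by default this transfers no annexed file content.
    git annex sync "$remote"
    # Asks that specific remote for the content of the given files.
    git annex get --from "$remote" "$@"
}

# Alternatively, git annex sync --content also transfers file content.
```

For example: `annex_fetch_from smdata_remote_wd_elements_small demo_beheer.gpkg`. A successful sync followed by a failing get would suggest that the git side of the remote is reachable but the content is not where the location log says it is.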
I have been trying get and copy and whereis. They all report failure and suggest making the remote a3f8a46a-60fe-58e2-901b-2c093bcc22d -- smdata_encrypted_remote_wd_elements_small available.
The remote with this name is available -- at least, I can sync with it. However, in .git/config the annex-uuid is 97d51497-158f-54ef-baef-77a720c9d758.
As far as I can tell -- I'm shooting in the dark here -- that isn't the issue, because if I change the uuid in .git/config it still gives the same error, now with the changed uuid.
So, let me review my mental model of this situation, which will hopefully reveal the gaping holes:
Normally, content tracking means each repo knows which other repos have copies of the file. In locked mode, as you said, the file is a symlink to the annexed object, and if that target is missing the symlink is simply broken. In unlocked mode, the file is present at its correct location, but if its content is missing it is replaced with a text file whose content is an annex object path.
So, my content has somehow gone missing.
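That mental model can be written down as a small classifier. This is only a sketch under simplifying assumptions: it treats every symlink as a locked annexed file, it recognises pointer files by the "/annex/objects/" prefix seen in this thread, and annex_file_state is a hypothetical helper that inspects the file system without calling git-annex.

```shell
#!/bin/sh
# Sketch only: annex_file_state is a hypothetical helper name.
annex_file_state() {
    if [ -L "$1" ]; then
        # Locked file: a symlink into .git/annex/objects.
        if [ -e "$1" ]; then
            echo "locked, content present"
        else
            echo "locked, content missing (broken symlink)"
        fi
    elif [ -f "$1" ]; then
        # Unlocked file: either the real content or a small pointer file.
        if head -c 512 "$1" | tr -d '\000' | head -n 1 | grep -q '^/annex/objects/'; then
            echo "unlocked, content missing (pointer file)"
        else
            echo "regular file"
        fi
    else
        echo "no such file"
    fi
}
```

Going by the cat output at the top of this thread, `annex_file_state demo_beheer.gpkg` would report the pointer-file case. Note that "regular file" covers both an unlocked file with its content in place and a file that was never annexed; telling those apart needs git-annex itself.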
Not knowing very much about git-annex's internals, my next question would be: how can I look for this content? I can't explore the files in the remote manually, since they are encrypted. This is what happens if I clone that bare repo again:
So: the git-annex branch is indicating that the content is available in this bare remote, but is that not true? Is there a way for me to determine (with or without git-annex) if the content is actually there or not?
From the speed at which the git annex get command returns its error message I get the impression that it's not actually checking the remote, but determining from the local git-annex branch that the content is not in the remote. Is that correct? Why does it then suggest making that remote available? In that case, is there a way to figure out if and when the git-annex branch logs have diverged from reality, or alternatively how to find out if there is content in that repository?
What is also really bugging me is that I have the content of some files available in one of my clones, but I can't access them via git-annex. My files are of such a number and format that manually fishing them out of .git/annex/objects is not feasible (e.g. shapefiles, which consist of about six different files).
Finally, the most relevant git-annex command I've been able to find is git annex unused, which gives me some interesting information in the bare repository (1900+ unused objects), but addunused doesn't seem to bring things back.