Dear all,
For some time now I have been wondering about a way to index a set of files, let's say PDF documents. The idea is to have a unique identifier for each file and to cross-reference the files using this identifier. For instance, I use web-based project management (PM) software on a public server at my university. There I have a set of tasks such as "review document X" or "review document Y", and those documents are stored on an internal server in my lab. I see several options:
- Upload the required documents to the PM site and directly link
- Share my internal server online and use the URLs of the documents in the PM
- Just use the unique identifier in the PM, and then look in git annex for that ID
- Use some sort of document management system (DMS)
Options 1 and 2 are impractical for several reasons. Option 4 usually requires that your files are inside the DMS. So my questions are:
- Do you think this is doable with git-annex?
- Is there an easy way to ask it: give me the document with this index?
- I think the best answer to this question is: git annex find --include '*' --format='${key} ${file}\n' | grep
- And conversely, how do I find the key of a certain document?
Thanks in advance. Best, Juan
To make git-annex output the key of a file, run:
git annex lookupkey "$file"
I don't know if the git-annex key is appropriate for your use-case. If the files never get changed, then it's a nice stable identifier. If ongoing changes are made to a file, and you want to link to the most recent version, the key would not be useful.
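As a rough sketch of both directions (file to key, and key back to file), the two commands mentioned in this thread could be wrapped in small shell functions. The function names key_of and file_of are hypothetical, and the sketch assumes keys contain no spaces, so the first space on each output line separates the key from the file name:

```shell
# Hypothetical helpers; the names key_of and file_of are illustrative only.

# Print the git-annex key of an annexed file.
key_of() {
    git annex lookupkey "$1"
}

# Print the path(s) of annexed files whose key is exactly "$1".
# Lists "key file" for every annexed file (present locally or not),
# then keeps only lines that start with the key followed by a space.
file_of() {
    git annex find --include '*' --format='${key} ${file}\n' |
        awk -v key="$1" 'index($0, key " ") == 1 { print substr($0, length(key) + 2) }'
}
```

Something like `file_of "$(key_of report.pdf)"` should then print the path again, plus any other files with identical content, since those share the same key.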
You might also look at git-annex's metadata; you could make up some metadata field and value and attach it to a file, and it would persist as the file was modified.
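A minimal sketch of the metadata approach, assuming a made-up field called docid holds the identifier used in the PM tasks. The field name and the function names are hypothetical; the -s, -g and --metadata options are the ones documented for git annex metadata and git annex find:

```shell
# Hypothetical helpers; the field name "docid" and the function names are
# made up for illustration.

# Attach an identifier to a file: tag_doc FILE ID
tag_doc() {
    git annex metadata "$1" -s docid="$2"
}

# Print the identifier stored on a file.
docid_of() {
    git annex metadata "$1" -g docid
}

# Print the file(s) carrying a given identifier.
find_doc() {
    git annex find --metadata docid="$1"
}
```

For example, after `tag_doc review/paper.pdf DOC-0042`, a task in the PM could refer to DOC-0042, and `find_doc DOC-0042` would be the lookup.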
If I understand correctly, the key will change every time the file changes. If so, it seems the key won't be useful in my case. Thanks for your time.