Many websites return an ETag in the HTTP response headers, indicating the version of the resource. Could the ETag (or a checksum of it) be recorded in the URL- key, the way size is now? Then e.g. fsck --from web could do a stronger check that the same file is still downloadable from the web, and the situation where different remotes hold different versions of a file under the same URL- key could be better prevented.
Closing as this does not seem like a useful idea. done --Joey
ETags are intended to help HTTP clients with caching. It would not be considered much of a problem if a web server returned the same ETag only for a little while and then generated a new one later, since clients only cache for so long anyway. But then git-annex would treat the file as no longer present on the website.
Apache uses the inode, size, and mtime for its ETag generation, so just moving a website to a different drive would change the ETag even though the file contents are unchanged.
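
To make the point about inode-based ETags concrete, here is a rough Python sketch. It is not Apache's code: `apache_style_etag` is a hypothetical helper that only imitates the inode-size-mtime idea, and the exact string format is an assumption. It copies a file while preserving its mtime, which stands in for moving the site to another drive, and shows that the derived tag changes although the bytes do not.

```python
import os
import shutil
import tempfile


def apache_style_etag(path):
    """Imitate an inode-size-mtime ETag (hypothetical format, not Apache's code)."""
    st = os.stat(path)
    mtime_us = st.st_mtime_ns // 1000  # mtime in microseconds
    return '"%x-%x-%x"' % (st.st_ino, st.st_size, mtime_us)


def etags_before_and_after_copy(content=b"same bytes\n"):
    """Return (etag of original, etag of copy, whether contents match)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original")
        moved = os.path.join(d, "moved")
        with open(original, "wb") as f:
            f.write(content)
        # copy2 preserves the mtime, so only the inode is expected to differ.
        shutil.copy2(original, moved)
        with open(moved, "rb") as f:
            same_content = f.read() == content
        return apache_style_etag(original), apache_style_etag(moved), same_content


if __name__ == "__main__":
    before, after, same = etags_before_and_after_copy()
    print("contents identical:", same)
    print("etag before:", before)
    print("etag after: ", after)
```

Since the copy is a different file on disk, its inode differs and so does the tag, which is why a key that embedded such an ETag would stop matching after a purely administrative change on the server.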