Loading, for instance, all the entries in a D64 into a database and calculating a hash code of the entries works *very* well. For each entry in the D64 I get the bytes, concatenate them into one stream, sort it and hash it. The hash is then stored in the database to make matching *fast*.
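A rough sketch of the idea in Python, not the actual code used. It assumes the file entries have already been extracted from the D64 as byte strings (no D64 parsing here), reads "sort" as ordering the entries' contents so that directory order does not change the result, and uses SHA-1 and SQLite purely as stand-ins. The names `image_fingerprint`, `open_db`, `store` and `find_matches` are made up for illustration.

```python
import hashlib
import sqlite3
from typing import List


def image_fingerprint(entries: List[bytes]) -> str:
    """Hash the contents of all entries of one disk image.

    Assumption: the entries are sorted by content before concatenation,
    so two images holding the same files in a different order match.
    """
    stream = b"".join(sorted(entries))
    return hashlib.sha1(stream).hexdigest()


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create a (hypothetical) table of image names and fingerprints."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images (name TEXT, fingerprint TEXT)"
    )
    # The index is what makes looking up a fingerprint fast.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_fingerprint ON images (fingerprint)"
    )
    return conn


def store(conn: sqlite3.Connection, name: str, entries: List[bytes]) -> None:
    """Store one image's fingerprint under its file name."""
    conn.execute(
        "INSERT INTO images (name, fingerprint) VALUES (?, ?)",
        (name, image_fingerprint(entries)),
    )


def find_matches(conn: sqlite3.Connection, entries: List[bytes]) -> List[str]:
    """Return the names of stored images with the same fingerprint."""
    rows = conn.execute(
        "SELECT name FROM images WHERE fingerprint = ? ORDER BY name",
        (image_fingerprint(entries),),
    )
    return [row[0] for row in rows]
```

A matching fingerprint is only a candidate: plain concatenation loses the boundaries between entries, so a match still deserves a look by eye.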
Using this method (and visually verifying that the matches really are the same) I found *many* mistakes in the collection, as well as a lot of additions from other sources.
Will post results soon.