Author Topic: Nintendo Super Famicom & Super Entertainment System - Games (TOSEC-v2020-10-26)  (Read 1228 times)

Offline Cassiel

  • Administrator
  • Hero Member
  • *****
  • Posts: 1574
    • Email
Duplicate

Code: [Select]
<game name="Bubsy II (1994)(Accolade)(US)">
<description>Bubsy II (1994)(Accolade)(US)</description>
<rom name="Bubsy II (1994)(Accolade)(US).bin" size="2097152" crc="d0d172fa" md5="cf5c2ca9af502a0f12cf9504fb0e748a" sha1="c501d8779891a501ca4674cc39fc9d82cc05eb8c"/>
</game>
<game name="Bubsy II (1994)(Accolade)(US)[a2]">
<description>Bubsy II (1994)(Accolade)(US)[a2]</description>
<rom name="Bubsy II (1994)(Accolade)(US)[a2].bin" size="2097152" crc="d0d172fa" md5="cf5c2ca9af502a0f12cf9504fb0e748a" sha1="c501d8779891a501ca4674cc39fc9d82cc05eb8c"/>
</game>



Offline Cassiel

Duplicate

Code: [Select]
<game name="Chrono Trigger (1995)(Square)(US)">
<description>Chrono Trigger (1995)(Square)(US)</description>
<rom name="Chrono Trigger (1995)(Square)(US).bin" size="4194304" crc="2d206bf7" md5="a2bc447961e52fd2227baed164f729dc" sha1="de5822f4f2f7a55acb8926d4c0eaa63d5d989312"/>
</game>
<game name="Chrono Trigger (1995)(Square)(US)[a]">
<description>Chrono Trigger (1995)(Square)(US)[a]</description>
<rom name="Chrono Trigger (1995)(Square)(US)[a].bin" size="4194304" crc="2d206bf7" md5="a2bc447961e52fd2227baed164f729dc" sha1="de5822f4f2f7a55acb8926d4c0eaa63d5d989312"/>
</game>

Offline Cassiel

Duplicate

Code: [Select]
<game name="Dragon Quest VI (1995)(Enix)(JP)">
<description>Dragon Quest VI (1995)(Enix)(JP)</description>
<rom name="Dragon Quest VI (1995)(Enix)(JP).bin" size="4194304" crc="33304519" md5="ac9955fa4c1aa8ebcc1d09511808c58b" sha1="3e699dc7e064d6ac84b1981aa150fdf1672b5456"/>
</game>
<game name="Dragon Quest VI (1995)(Enix)(JP)[a]">
<description>Dragon Quest VI (1995)(Enix)(JP)[a]</description>
<rom name="Dragon Quest VI (1995)(Enix)(JP)[a].bin" size="4194304" crc="33304519" md5="ac9955fa4c1aa8ebcc1d09511808c58b" sha1="3e699dc7e064d6ac84b1981aa150fdf1672b5456"/>
</game>

Offline Cassiel

Duplicate

Code: [Select]
<game name="Final Fantasy III Rev 0 (1994)(Square)(US)">
<description>Final Fantasy III Rev 0 (1994)(Square)(US)</description>
<rom name="Final Fantasy III Rev 0 (1994)(Square)(US).bin" size="3145728" crc="a27f1c7a" md5="e986575b98300f721ce27c180264d890" sha1="4f37e4274ac3b2ea1bedb08aa149d8fc5bb676e7"/>
</game>
<game name="Final Fantasy III Rev 0 (1994)(Square)(US)[a2]">
<description>Final Fantasy III Rev 0 (1994)(Square)(US)[a2]</description>
<rom name="Final Fantasy III Rev 0 (1994)(Square)(US)[a2].bin" size="3145728" crc="a27f1c7a" md5="e986575b98300f721ce27c180264d890" sha1="4f37e4274ac3b2ea1bedb08aa149d8fc5bb676e7"/>
</game>

Offline Cassiel

Duplicate

Code: [Select]
<game name="Frankenstein (1994)(Sony)(US)">
<description>Frankenstein (1994)(Sony)(US)</description>
<rom name="Frankenstein (1994)(Sony)(US).bin" size="2097152" crc="91415d1e" md5="623e2961454d44a07d7d5270ca9d6400" sha1="521249ed57c74dc85349aa5e2afca587443a7cef"/>
</game>
<game name="Frankenstein (1994)(Sony)(US)[a]">
<description>Frankenstein (1994)(Sony)(US)[a]</description>
<rom name="Frankenstein (1994)(Sony)(US)[a].bin" size="2097152" crc="91415d1e" md5="623e2961454d44a07d7d5270ca9d6400" sha1="521249ed57c74dc85349aa5e2afca587443a7cef"/>
</game>

Offline Cassiel

Duplicate

Code: [Select]
<game name="Jetsons, The - Invasion of the Planet Pirates (1994)(Taito)(US)">
<description>Jetsons, The - Invasion of the Planet Pirates (1994)(Taito)(US)</description>
<rom name="Jetsons, The - Invasion of the Planet Pirates (1994)(Taito)(US).bin" size="1048576" crc="3e3073ce" md5="76d93ffebb00ddaf4cf23222ab28b962" sha1="b3a08e927fe4eac5f4e47e1361cb80cc392fb06f"/>
</game>
<game name="Jetsons, The - Invasion of the Planet Pirates (1994)(Taito)(US)[a]">
<description>Jetsons, The - Invasion of the Planet Pirates (1994)(Taito)(US)[a]</description>
<rom name="Jetsons, The - Invasion of the Planet Pirates (1994)(Taito)(US)[a].bin" size="1048576" crc="3e3073ce" md5="76d93ffebb00ddaf4cf23222ab28b962" sha1="b3a08e927fe4eac5f4e47e1361cb80cc392fb06f"/>
</game>

Offline Cassiel

Duplicate

Code: [Select]
<game name="Mega Man's Soccer (1994)(Capcom)(US)">
<description>Mega Man's Soccer (1994)(Capcom)(US)</description>
<rom name="Mega Man's Soccer (1994)(Capcom)(US).bin" size="1310720" crc="fa9ee2ce" md5="9e886f686da276b00c4671967d396a79" sha1="7e59c9457829ed3f0fbd9cd0ceeb3432a4739e98"/>
</game>
<game name="Mega Man's Soccer (1994)(Capcom)(US)[a]">
<description>Mega Man's Soccer (1994)(Capcom)(US)[a]</description>
<rom name="Mega Man's Soccer (1994)(Capcom)(US)[a].bin" size="1310720" crc="fa9ee2ce" md5="9e886f686da276b00c4671967d396a79" sha1="7e59c9457829ed3f0fbd9cd0ceeb3432a4739e98"/>
</game>

Offline Cassiel

Duplicate

Code: [Select]
<game name="Secret of Mana (1993)(Square)(US)">
<description>Secret of Mana (1993)(Square)(US)</description>
<rom name="Secret of Mana (1993)(Square)(US).bin" size="2097152" crc="d0176b24" md5="10a894199a9adc50ff88815fd9853e19" sha1="8133041a363e3cc68cedef40b49b6d20d03c505d"/>
</game>
<game name="Secret of Mana (1993)(Square)(US)[a]">
<description>Secret of Mana (1993)(Square)(US)[a]</description>
<rom name="Secret of Mana (1993)(Square)(US)[a].bin" size="2097152" crc="d0176b24" md5="10a894199a9adc50ff88815fd9853e19" sha1="8133041a363e3cc68cedef40b49b6d20d03c505d"/>
</game>

Offline Cassiel

Duplicate

Code: [Select]
<game name="X-Men - Mutant Apocalypse (1994)(Capcom)(US)">
<description>X-Men - Mutant Apocalypse (1994)(Capcom)(US)</description>
<rom name="X-Men - Mutant Apocalypse (1994)(Capcom)(US).bin" size="2097152" crc="5e34822a" md5="0c8d8fd7e695499b86d39cc75e166671" sha1="81f66a620352f6ecad3bdfe0d47f42c55971080c"/>
</game>
<game name="X-Men - Mutant Apocalypse (1994)(Capcom)(US)[a]">
<description>X-Men - Mutant Apocalypse (1994)(Capcom)(US)[a]</description>
<rom name="X-Men - Mutant Apocalypse (1994)(Capcom)(US)[a].bin" size="2097152" crc="5e34822a" md5="0c8d8fd7e695499b86d39cc75e166671" sha1="81f66a620352f6ecad3bdfe0d47f42c55971080c"/>
</game>

Offline Cassiel

CloneSpy is very good for avoiding these problems...   :)

http://www.clonespy.com/


Offline artifex

  • Newbie
  • *
  • Posts: 16
I created a little script to check for duplicates based on dat files. These are the duplicates it found, matched on SHA1+CRC (dat filename:game entry/rom entry):

Code: [Select]
dats/Nintendo Super Famicom & Super Entertainment System - Games (TOSEC-v2020-10-26_CM).dat:Super Shadow of the Beast (1992)(IGS)(US)(beta)[a]/Super Shadow of the Beast (1992)(IGS)(US)(beta)[a].bin
dats/Nintendo Super Famicom & Super Entertainment System - Games (TOSEC-v2020-10-26_CM).dat:Super Shadow of the Beast (1992)(IGS)(US)(beta)/Super Shadow of the Beast (1992)(IGS)(US)(beta).bin

dats/Nintendo Super Famicom & Super Entertainment System - Games (TOSEC-v2020-10-26_CM).dat:Final Fantasy III Rev 0 (1994)(Square)(US)[a2]/Final Fantasy III Rev 0 (1994)(Square)(US)[a2].bin
dats/Nintendo Super Famicom & Super Entertainment System - Games (TOSEC-v2020-10-26_CM).dat:Final Fantasy III Rev 0 (1994)(Square)(US)/Final Fantasy III Rev 0 (1994)(Square)(US).bin

Offline Cassiel

Ooooh... very cool.

Is it per DAT based, or can you run across multiple DATs? i.e. to check duplicates ACROSS DATs, if that makes sense...

Offline artifex

Is it per DAT based, or can you run across multiple DATs? i.e. to check duplicates ACROSS DATs, if that makes sense...

You give it the filenames to read and the program does the rest. It reads all the given files, dumps the rom entries and checks for duplicates based on SHA1+CRC, and then you can inspect the output. It is not error-proof, just a simple script with quite cryptic output, but it works. :-) And yes, it works across multiple dats. It can parse MAME dats as well, so you can cross-check between MAME and TOSEC. I do not think you would want to; this is just an extra.

The shell script part, using the Nintendo Famicom dats:
Code: [Select]
#!/bin/bash -eu

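# Dump every rom entry as one tab-separated line, sort the lines, then keep
# only the lines whose first 50 characters (SHA1 + tab + CRC + tab) repeat.
./DatFile_Converter.py dats/Nintendo*Famicom*.dat \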
 | tee z01 \
 | sort -i \
 | uniq -w 50 --all-repeated=separate \
 | tee z01a \
 | cut -f6 \
 > z01z

z01 - raw output of the python script; includes every rom entry from all the dats you specified, just for debugging
z01a - duplicates output, six columns: SHA1, CRC, rom size in bytes, MD5, number of rom entries within the game entry, dat filename:game name/rom name
z01z - duplicates output, only the last column of z01a
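
For anyone without GNU sort and uniq at hand, the duplicate step of the pipeline could be approximated in Python. This is only a minimal sketch, assuming the six tab-separated columns described above; `find_duplicate_groups` is a hypothetical helper name and not part of the posted script.

```python
from collections import defaultdict


def find_duplicate_groups(lines):
    """Group tab-separated rom lines by (SHA1, CRC) and keep groups of 2+."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        columns = line.split("\t")
        key = (columns[0], columns[1])   # SHA1 and CRC are the first two columns
        groups[key].append(columns[-1])  # last column: dat filename:game name/rom name
    # Only keys seen more than once are duplicates
    return {key: names for key, names in groups.items() if len(names) > 1}
```

Feeding it the raw dump, for example `find_duplicate_groups(open('z01'))`, should give roughly what z01z contains, grouped per SHA1+CRC pair. No sorting is needed because a dictionary does the grouping.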

The python part, used to parse the XML files and output the result; save it under the same filename as used in the shell script (DatFile_Converter.py):
Code: [Select]
#!/usr/bin/env python3

import sys
from lxml import etree


def main():
    allroms = []
    for f in sys.argv[1:]:
        print(f'--- reading file {f}', file=sys.stderr)
        tree = etree.parse(source=f)
        print(f'--   parsing file {f}', file=sys.stderr)
        # matches MAME <machine> entries as well as <game> entries
        sets = tree.xpath('//machine|game')
        print(f'--   processing file {f} with {len(sets)} sets', file=sys.stderr)
        for game in sets:  # renamed from "set" so the builtin is not shadowed
            setofrom = game.get('name')
            roms = [c for c in game if c.tag == 'rom']
            rominsetcount = len(roms)
            if rominsetcount > 1:  ## filter by rom/set: skip sets with more than one rom
                continue
            for rom in roms:
                romname = rom.get('name')
                romsha1 = rom.get('sha1')
                rommd5 = rom.get('md5')
                romcrc = rom.get('crc')
                romsize = int(rom.get('size'))
                if romsize < 16384:  ## filter by size: skip roms smaller than 16 KiB
                    continue
                # six tab-separated columns; the last one is "dat filename:game name/rom name"
                print(f"{romsha1}\t{romcrc}\t{romsize}\t{rommd5}\t{rominsetcount}\t{f}:{setofrom}/{romname}")
                allroms.append(rom)
    ## allroms.sort(key=lambda rom: rom.get('sha1') + rom.get('crc'), reverse=False)
    print(f"=== allroms size is {len(allroms)}", file=sys.stderr)
    if allroms:  # avoid an IndexError when no rom passed the filters
        print(f"=== {allroms[0].get('sha1')}|{allroms[0].getparent().get('name')}|{allroms[0].get('name')}", file=sys.stderr)


if __name__ == '__main__':
    main()

If there is only one rom within a game entry and it shows up as a duplicate, you can treat it as a real duplicate. If there is more than one rom entry within a game entry, you need to check the rest yourself, as this util cannot compare game entry with game entry, only rom entry with rom entry. So there is still room for improvement, I know. Again, this is just a quick hack to learn Python and XML processing, but with useful output. :-)
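
One possible way to close that gap would be to fingerprint a whole game entry by the set of (SHA1, CRC) pairs of its roms and then compare the fingerprints. The following is only a sketch under that assumption: it uses the standard library `xml.etree.ElementTree` instead of lxml, and `game_fingerprint` and `duplicate_games` are hypothetical names that do not exist in the script above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def game_fingerprint(game):
    """Return a hashable fingerprint: the set of (sha1, crc) pairs of all roms."""
    return frozenset((rom.get("sha1"), rom.get("crc")) for rom in game.iter("rom"))


def duplicate_games(root):
    """Map each fingerprint shared by 2+ game/machine entries to their names."""
    by_fingerprint = defaultdict(list)
    for entry in root.iter():
        if entry.tag not in ("game", "machine"):
            continue
        fingerprint = game_fingerprint(entry)
        if fingerprint:  # ignore entries that contain no rom at all
            by_fingerprint[fingerprint].append(entry.get("name"))
    return {fp: names for fp, names in by_fingerprint.items() if len(names) > 1}
```

Called as `duplicate_games(ET.parse('some.dat').getroot())`, this treats two entries as duplicates when they hold the same rom contents, regardless of rom file names or order. Whether that is the right notion of a duplicate set is a judgment call.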

Update: modified the python part a little. I moved the variables out of the print statement, so you can now filter the included lines not only by size but also by the number of rom entries within a game entry (now every rom entry is included that is bigger than 0 bytes and is the only rom in its game entry; this is quite a good starting point for finding real duplicates). If you would like to change the filter parameters, you have to change the source.

Update: simplified the xpath. Now it can parse TOSEC, MAME and yori XML files with one xpath. As yori files are huge, the parsing is bloody slow.

Update: now using a different approach to process the XML file, which gives a huge improvement when processing yori dats. I have started to move the sorting and duplicate-finding logic into the python script so you do not need any external utility. It is very embryonic and not working yet. It might never work. :-)
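
For huge files, one common technique is to stream the XML instead of loading the whole tree. Whether this is the approach meant in the update is not stated, so treat the following as an assumption-laden sketch: it uses the standard library `xml.etree.ElementTree.iterparse`, the function name `iter_roms` and the `min_size` parameter are hypothetical, and a missing size attribute is treated as 0 here.

```python
import xml.etree.ElementTree as ET


def iter_roms(path, min_size=16384):
    """Stream a dat file and yield (sha1, crc, size, set name, rom name)
    for every set that holds exactly one rom of at least min_size bytes."""
    for _event, elem in ET.iterparse(path, events=("end",)):
        if elem.tag not in ("game", "machine"):
            continue
        roms = elem.findall("rom")
        if len(roms) == 1:
            rom = roms[0]
            size = int(rom.get("size", "0"))  # assumption: missing size counts as 0
            if size >= min_size:
                yield (rom.get("sha1"), rom.get("crc"), size,
                       elem.get("name"), rom.get("name"))
        elem.clear()  # release the finished set's children to limit memory use
```

The tuples it yields could then be grouped by their first two fields in a dictionary, which would replace the external sort and uniq calls.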
« Last Edit: November 15, 2020, 10:09:10 PM by artifex »

Offline Cassiel

Excellent... will have to have a play with this.