Author Topic: Tosec2Json  (Read 3375 times)

Offline kyndig

  • Newbie
  • *
  • Posts: 6
Tosec2Json
« on: December 25, 2018, 07:54:47 AM »
Not sure if anyone will find this useful, but I prefer working in JSON, so for a project I am working on I converted the DATs to JSON. Parsing the XML threw a lot of errors that I had to work around; I'm happy to post a dump of these errors if that's useful. Anyway, here is the code I used. It's throwaway conversion code, so it's not pretty, but if anyone wants to use it, here it is.

The tool I am making is a DAT manager for creating and dumping DATs. It keeps a revision history for each DAT and for each game in the DATs, and has a pull-request-style change system so people can suggest changes that can then be merged into the live DAT. The idea is to make projects like TOSEC more accessible, but I'm not sure if there is a need for such a tool?

https://github.com/kyndigs/tosec2json

« Last Edit: December 25, 2018, 08:02:22 AM by kyndig »



Offline Maddog

  • Global Moderator
  • Full Member
  • *****
  • Posts: 199
Re: Tosec2Json
« Reply #1 on: December 26, 2018, 03:42:04 PM »
Thanks for the heads up!
I am not familiar with JSON, but hopefully some people will find your work useful.  :)

Offline kyndig

  • Newbie
  • *
  • Posts: 6
Re: Tosec2Json
« Reply #2 on: December 27, 2018, 05:05:17 AM »
It's just a lot easier to work with for web-based stuff and for parsing. A few of the errors in the DATs that I worked around were just because of incorrect XML structure. For example, each game can have 1..m ROMs, but because of how the DAT is structured (the rom elements sit directly under the game element), de-serializing it will not pick up both instances when a game has more than one ROM, so you need to parse it all manually and check for those scenarios.

Code:

<game name="Alien Invasion (1994)(Archimedes World)">
<description>Alien Invasion (1994)(Archimedes World)</description>
<rom name="Alien Invasion (1994)(Archimedes World).adf" .../>
</game>

<game name="Alien Invasion (1994)(Archimedes World)">
<description>Alien Invasion (1994)(Archimedes World)</description>
<rom name="Alien Invasion (1994)(Archimedes World).adf" .../>
<rom name="Alien Invasion (1994)(Archimedes World).adf" .../>
</game>

Ideally we would want to do it like this.

Code:

<game name="Alien Invasion (1994)(Archimedes World)">
  <description>Alien Invasion (1994)(Archimedes World)</description>
  <roms>
    <rom name="Alien Invasion (1994)(Archimedes World).adf" .../>
    <rom name="Alien Invasion (1994)(Archimedes World).adf" .../>
  </roms>
</game>

In JSON it looks like this.

Code:
{
  "name": "Acorn BBC - Applications - [BIN]",
  "description": "Acorn BBC - Applications - [BIN] (TOSEC-v2013-10-16)",
  "category": "TOSEC",
  "version": "2013-10-16",
  "author": "CRSV - Cassiel",
  "email": null,
  "homepage": "TOSEC",
  "url": "http://www.tosecdev.org/",
  "games": [
    {
      "name": "6502 2nd Processor BASIC Selector (1986)(Horsington, Gordon)",
      "description": "6502 2nd Processor BASIC Selector (1986)(Horsington, Gordon)",
      "roms": [
        {
          "name": "6502 2nd Processor BASIC Selector (1986)(Horsington, Gordon).bin",
          "size": "999",
          "crc": "aa16cca4",
          "md5": "8c600bedc91d878c72ab881945d54265",
          "sha1": "48cdc6079e685aadceca8347b7edc478a79eff2c"
        },
        {
          "name": "6502 2nd Processor BASIC Selector (1986)(Horsington, Gordon).bin",
          "size": "999",
          "crc": "aa16cca4",
          "md5": "8c600bedc91d878c72ab881945d54265",
          "sha1": "48cdc6079e685aadceca8347b7edc478a79eff2c"
        }
      ]
    }
  ]
}
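
For anyone who wants to try the same idea without the repo, here is a minimal Python sketch of the manual parsing described above. It is not the code from tosec2json, just an illustration using the standard library. The sample DAT, the game names, and the function name dat_to_dict are all made up for this example; the only thing assumed about real DATs is the datafile / header / game / rom layout shown in this thread.

```python
import json
import xml.etree.ElementTree as ET

# Made-up sample in the same layout as the DAT snippets above.
SAMPLE_DAT = """<datafile>
  <header>
    <name>Example DAT</name>
    <description>Example DAT (made-up sample)</description>
  </header>
  <game name="Game A">
    <description>Game A</description>
    <rom name="Game A (Disk 1).adf" size="10" crc="00000001"/>
    <rom name="Game A (Disk 2).adf" size="20" crc="00000002"/>
  </game>
  <game name="Game B">
    <description>Game B</description>
    <rom name="Game B.adf" size="30" crc="00000003"/>
  </game>
</datafile>"""


def dat_to_dict(xml_text):
    """Convert DAT XML into a dict shaped like the JSON example above."""
    root = ET.fromstring(xml_text)
    result = {}

    # Copy the header fields (name, description, ...) to the top level.
    header = root.find("header")
    if header is not None:
        for child in header:
            result[child.tag] = child.text

    # findall() returns every <rom> under a game, so a game with
    # several ROMs keeps all of them in its "roms" list.
    result["games"] = []
    for game in root.findall("game"):
        result["games"].append({
            "name": game.get("name"),
            "description": game.findtext("description"),
            "roms": [dict(rom.attrib) for rom in game.findall("rom")],
        })
    return result


if __name__ == "__main__":
    print(json.dumps(dat_to_dict(SAMPLE_DAT), indent=2))
```

Every game always gets a "roms" array, even when it has only one ROM, so the JSON consumer never has to check whether it got a single object or a list.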

Offline kyndig

  • Newbie
  • *
  • Posts: 6
Re: Tosec2Json
« Reply #3 on: December 27, 2018, 05:10:39 AM »
Here is a complete copy of the main branch dats in json if you want to take a look.

https://www26.zippyshare.com/v/mxV0234H/file.html

Offline PandMonium

  • Administrator
  • Hero Member
  • *****
  • Posts: 1332
Re: Tosec2Json
« Reply #4 on: December 27, 2018, 01:12:08 PM »
Hey @kyndig
Thanks! That might be useful to other users and we always welcome interesting tools or projects related to our project.
Regarding the XML format, I wouldn't call it an error but rather the way it was designed (which can be criticized, I'm sure :P). Our dats are mostly generated using the clrmame pro tool (I guess, it kinda depends on the renamer) and follow / are checked against the definition in http://www.logiqx.com/Dats/datafile.dtd

Keep us posted!

Offline kyndig

  • Newbie
  • *
  • Posts: 6
Re: Tosec2Json
« Reply #5 on: December 27, 2018, 01:42:21 PM »
Thanks! What in-house tools are you guys using at the moment? I would be happy to write some tools to help out in any way I can and make the work you guys do easier!

Offline PandMonium

  • Administrator
  • Hero Member
  • *****
  • Posts: 1332
Re: Tosec2Json
« Reply #6 on: December 27, 2018, 02:46:45 PM »
Well it kinda depends on the task and the person.

Renamers have a couple of different ways of working, but since they are mostly dealing with files and inspecting them, they mostly prefer (AFAIK) to analyze them manually, using emulators, hex editors, diff or other tools I guess. Afterwards they typically just rename the set/rom itself manually and regenerate the datfiles using the clrmame pro dir2dat tool, or edit the dat directly (e.g., @Maddog with large dats that take tons of time to regenerate). Over time some tools have been created to help them rename (at least to generate the name according to the standards, I recall the TIM tool having a module for that) but they mostly end up not being used since renamers prefer to avoid an extra layer in the process. At the opposite end of the spectrum you have renamers such as @Duncan Twain and @Lady Eklipse who have personal and very system-specific tools that provide tons of help, from exploring the image dump structure and analyzing the internal files for similarities, to fetching and merging online data from various sources.

Regarding the release process, releases have mostly been made available as sets of zipped datfiles. There have been other approaches such as tugid and TIM db-based online updates, but IMHO they kinda sucked (and nowadays we have gaps in our release history due to those less transparent methods; datpacks should also have been available at the time). Nowadays the release process is just a compilation of the most recent dats, scripts and cues, managed with a bunch of scripts I create as required to automate the process :P

Hope this makes sense, just wrote it without rechecking.

Offline kyndig

  • Newbie
  • *
  • Posts: 6
Re: Tosec2Json
« Reply #7 on: December 27, 2018, 03:17:57 PM »
Ah okay, I had an idea about how the data is analyzed. I would be interested to see the common patterns where information is typically found, for example when analyzing the hex data. With that you could start to automate certain tasks and checks, and the results could then be manually verified if required.

For the release process, have you ever thought about managing it as we would a development project, using Git and GitHub? For example, scene releases for the project linked below get dumped daily. A similar process could be used for TOSEC.

https://github.com/nZEDb/nZEDbPre_Dumps

Anyway, the project is still great! These are just suggestions from me to help in any way I can! :)
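
To give a rough idea of what tracking DAT changes between revisions could look like, here is a small Python sketch that compares two revisions of a DAT in the JSON shape shown earlier and lists which games were added, removed, or changed. It is only an illustration, not part of any existing tool: the function name diff_games and the sample data are made up, and it assumes game names are unique within a DAT.

```python
def diff_games(old_dat, new_dat):
    """Compare two DAT dicts (JSON shape from earlier in the thread) by game name."""
    # Index games by name; assumes names are unique within one DAT.
    old = {game["name"]: game for game in old_dat["games"]}
    new = {game["name"]: game for game in new_dat["games"]}

    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        # A game counts as changed if any field or ROM entry differs.
        "changed": sorted(
            name for name in old.keys() & new.keys() if old[name] != new[name]
        ),
    }


# Made-up revisions for illustration only.
OLD_REVISION = {
    "games": [
        {"name": "Game A", "roms": [{"name": "Game A.adf", "crc": "00000001"}]},
        {"name": "Game B", "roms": [{"name": "Game B.adf", "crc": "00000002"}]},
    ]
}
NEW_REVISION = {
    "games": [
        {"name": "Game A", "roms": [{"name": "Game A.adf", "crc": "0000000f"}]},
        {"name": "Game C", "roms": [{"name": "Game C.adf", "crc": "00000003"}]},
    ]
}

if __name__ == "__main__":
    print(diff_games(OLD_REVISION, NEW_REVISION))
```

A summary like this is roughly what a reviewer would want to see before merging a suggested change, whether the history lives in Git or in a database.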

Offline PandMonium

  • Administrator
  • Hero Member
  • *****
  • Posts: 1332
Re: Tosec2Json
« Reply #8 on: December 27, 2018, 11:46:02 PM »
Well, I can't say much about the diffs and system specific analysis. That is a complex subject which can be better discussed by @Crashdisk (did some Amiga specific tools), @Duncan Twain (C64) or @Lady Eklipse (Spectrum). I'm sure similar tools for other systems would be very useful but they usually require good knowledge about the system and their associated imaging formats.

As for the release, I don't know much about nZEDb or their predb so I can't compare the situation. Some members have briefly discussed the idea of using versioning software such as git to manage dat changes, but atm that would add an extra step/complexity, especially for non-techie members (what is git, committing, branching, merging, conflicts and pull requests), and IMHO we already have limited time.

I'm definitely not saying no to it since I kinda like the idea. Let's see what 2019 brings us!