Author Topic: Parsing DAT file Contents? What's the Magic Algorithm?  (Read 1400 times)

Offline A2S2000

Parsing DAT file Contents? What's the Magic Algorithm?
« on: April 29, 2020, 08:29:25 PM »
I'm working on a personal project and would like to use TOSEC DATs as a reference source while I'm making images of various things, cataloging magazines, and the like.

The TNC (TOSEC Naming Convention) has all the goodness. Since I'm talking about working -with- the published data, I'm posting in the Tools sub-board.

The title comes first. The first set of parentheses is likely the date, unless it's preceded by a set of parentheses that contains alpha characters, in which case the date is the next set; the set after the date is then the publisher. That covers the minimum for a TOSEC record. There shouldn't be any entries with fewer fields than those (save for ZZZ-UNK-xxxxxxxx, which can be handled as a special case).
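
To make the rule above concrete, here is a rough Python sketch of how I imagine it could work. It is only my reading of the format, not a confirmed TOSEC algorithm: the function name `split_head`, the date pattern (yyyy, yyyy-mm or yyyy-mm-dd, with `x` for unknown digits), and the sample name in the usage note are all assumptions for illustration.

```python
import re

# Assumption: dates look like yyyy, yyyy-mm or yyyy-mm-dd, with 'x' for unknown digits.
DATE = re.compile(r"[0-9x]{4}(?:-[0-9x]{2}){0,2}")
GROUP = re.compile(r"\(([^)]*)\)")

def split_head(name):
    """Split a name into title, optional pre-date set, date, publisher and the rest.

    Returns None when the minimum fields are missing (e.g. ZZZ-UNK-xxxxxxxx).
    """
    start = name.find(" (")
    if start == -1:
        return None
    title, pos = name[:start], start + 1

    pre_date = None
    m = GROUP.match(name, pos)
    if m and not DATE.fullmatch(m.group(1)):
        # A set before the date that is not itself a date.
        pre_date = m.group(1)
        pos = m.end()
        if name.startswith(" ", pos):  # skip the space before the date
            pos += 1
        m = GROUP.match(name, pos)

    if not m or not DATE.fullmatch(m.group(1)):
        return None
    date = m.group(1)

    pub = GROUP.match(name, m.end())
    if not pub:
        return None
    return {
        "title": title,
        "pre_date": pre_date,
        "date": date,
        "publisher": pub.group(1),
        "rest": name[pub.end():],  # everything still to be classified
    }
```

For a made-up name such as `Some Game (demo) (1986)(Devstudio)(US)[cr]`, this would give the title `Some Game`, the pre-date set `demo`, the date `1986`, the publisher `Devstudio`, and leave `(US)[cr]` unparsed. Titles or publishers that themselves contain parentheses are not handled.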


I can see how the two sets of flags work, as they're expected to be in alphabetical order, so the first occurrence of square [brackets] is the start of the flags. These can be parsed going down the line from [cr], [f], [h], [m], [p], and so on, and any further square-bracket sets are likely to be platform-specific. In the case of the Apple II, for example, that would be the required RAM size, [48K], or the required ROM, [Req. Integer ROMs].
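
The bracket handling described above could be sketched roughly like this in Python. The flag codes in the pattern are my assumption of the standard set, `split_flags` is a hypothetical name, and the sketch does not check ordering. It treats the first bracket set that is not a recognised flag, and everything after it, as platform-specific info.

```python
import re

BRACKET = re.compile(r"\[([^\]]*)\]")
# Assumed standard flag codes, optionally followed by a number ([a2])
# and/or free text ([cr Some Group]).
FLAG = re.compile(r"(cr|tr|[fhmptouvba!])(\d*)(?: (.*))?")

def split_flags(tail):
    """Return (flags, more_info) for the bracketed part of a name.

    flags is a list of (code, number, detail) tuples; more_info is a list
    of the raw bracket contents that did not look like a standard flag.
    """
    flags, more_info = [], []
    for body in BRACKET.findall(tail):
        m = FLAG.fullmatch(body)
        if m and not more_info:
            flags.append(m.groups())
        else:
            # Once platform-specific info starts, keep everything as-is.
            more_info.append(body)
    return flags, more_info
```

For example, `split_flags("[cr][a2][48K][Req. Integer ROMs]")` would return the flags `cr` and `a` (with number `2`), plus `48K` and `Req. Integer ROMs` as more info. A platform-specific entry that happens to start with a flag letter and a space would be misread as a flag, so this is a starting point only.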

It's the entries after the publisher that I'm after, since there's no delimiter in the format.

(system)(video)(country)(language) ... (media label) are not always present in each record. So for the sets of parentheses that follow the publisher, how do we know which field each one holds?

...and what would their 'value' be in the case of [more info], as these are likely to be platform- or listing-specific?
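
One approach I can imagine for the optional parenthesised fields is to classify each set by what its value looks like rather than by position. The Python sketch below does that. Every pattern in it is an assumption on my part (the video and development-status lists are illustrative subsets, and the country and language patterns only check for two-letter codes), and `classify` and `classify_all` are hypothetical names.

```python
import re

# Assumed value patterns; specific vocabularies are checked before the
# generic two-letter patterns so that e.g. "PD" is not read as a country.
PATTERNS = [
    ("video", re.compile(r"NTSC|PAL|CGA|EGA|VGA")),
    ("copyright", re.compile(r"(?:CW|FW|GW|LW|PD|SW)(?:-R)?")),
    ("devstatus", re.compile(r"alpha|beta|preview|pre-release|proto")),
    ("media_type", re.compile(r"(?:Disc|Disk|File|Part|Side|Tape) .+")),
    ("country", re.compile(r"[A-Z]{2}(?:-[A-Z]{2})*")),
    ("language", re.compile(r"[a-z]{2}(?:-[a-z]{2})*")),
]

def classify(value):
    """Guess which field a single parenthesised value belongs to."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return name
    # Could be a system or a media label; that would need a per-platform lookup.
    return "other"

def classify_all(rest):
    """Classify every parenthesised set found in the remainder of a name."""
    return [(classify(v), v) for v in re.findall(r"\(([^)]*)\)", rest)]
```

With these assumptions, `classify_all("(A500)(PAL)(GB)(en)")` would label the sets as other, video, country, and language. A two-letter copyright code that is also a valid country code stays ambiguous, which is really the heart of my question.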