As far as the point about the naming convention being enforced, you'll also notice that every other project out there, whether that be MAME, Redump or No-Intro also has a naming standard which is enforced by their DAT files. ...
That part alone now confirms to me something I have suspected but was not certain of. First, not one of the tools enforces any format in which I have to name my own files. They enforce file formats so that they can read and parse the data within the data file, but that is a format, not a file name. They _may_ enforce the file names of the data files (as in, it must be named "tosec.dat" and placed in your home directory) they need to find and read for operation, and the format of such a file, but they do not force me to name my other files in any way.
When I extracted the TOSEC dat file archive, I was able to extract it where I wanted it, specifically, under a "tosec" folder in my "emu" folder. Would you feel you can tell me I must extract it to a folder named "TOSEC DAT FILES", or anything else? That was the impression the TNC gave me--that I had to use a properly approved name.
Using ClrMAMEPro's example from their documentation, a data file entry looks like:
set (
name pacman
cloneof pacman
description "PuckMan (Japan set 1)"
rom ( name namcopac.6e size 4096 crc fee263b3 md5 3f84d78d59147b9b3c816da72110e55f)
sample shot.wav
sampleof galaxian
)
I am guessing that ClrMAMEPro is fully capable of renaming files to follow TNC, since the data above has most of the information needed. (I am on Linux, so I cannot use CMP under my system wine or mono configuration.) But nowhere in that example does it even show what the TNC name _would_ be. It clearly separates the name, description, and other data instead of pushing it all together into a single name, without dictating to me how the final composed name must be.
The way the TOSEC dat files treat the _text_ inside the dat file as if it were an actual existing file on my hard drive is what has frustrated me and made me believe that the TNC is being forced and enforced. I may not even have that ROM image on my hard drive, so there is no file with that name and no file that can/should be renamed to that.
So I am not talking about an actual existing (or not) file on my drive. I am talking about a line of _text_ that I am trying to parse to identify the data about that ROM, if it were to exist, possibly under a different name. In TOSEC dat files, I have to write code to figure out which part of the name is which, by reversing the TNC rules. It would be simpler if the TOSEC dat files had name="title" date="date" publisher="publisher" extension="rom", with which I could then do something far simpler (JavaScript-like pseudo code):
var name = game.getAttribute("name");
var date = game.getAttribute("date");
var publisher = game.getAttribute("publisher");
var extension = game.getAttribute("extension");
var newName = name + " (" + date + ")(" + publisher + ")." + extension;
Five simple lines of code that even a non-technical person can probably figure out.
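As a runnable variant of the same idea, here is a minimal sketch that composes the name from a plain object instead of a DOM element. The function name `composeName`, the field names, and the "xex" extension are my own assumptions for illustration; no TOSEC dat file actually provides separate fields like these.

```javascript
// Hypothetical record shape: these field names are assumptions for
// illustration and do not exist in any real TOSEC dat file.
function composeName(entry) {
  // Plain string composition: "title (date)(publisher).extension"
  return `${entry.name} (${entry.date})(${entry.publisher}).${entry.extension}`;
}

const example = {
  name: "Winter Wally",
  date: "1987",
  publisher: "Alternative Software",
  extension: "xex", // assumed extension, chosen only for the example
};

console.log(composeName(example));
// prints "Winter Wally (1987)(Alternative Software).xex"
```

The point is only that going from separate fields to a composed name is a single line of string formatting, and anyone who wants a different layout changes that one line.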
But the TOSEC dat files encode the name in the dat file, even though it's just a line of text: <rom name="game (date)(publisher)extension" ...>
For that, instead of five lines of simple string concatenation, I first have to write several lines of code to figure out which part is the date. Is "(date)" the date? Or is "(publisher)"? As humans, we can immediately see it and know. A computer program, however, has to do many small steps to figure it out. TNC rules state that the "(date)" must be the first parenthesized part--except if there's a "(demo)" tag which precedes it. So I have to write several more lines of code to check if there is a "(demo)" tag, and if there is, then the code has to assume the second ()'d item is the date. I have to check if the date matches the date format: numerical text instead of alphabetical text--except in the case of a single "-" or an "x" to indicate an unknown number at that position. Some dates may use "JAN" instead of "01", so I have to write more code to handle that. Some may use YYYYMMDD, some use YYYY-MM-DD, some use YYYY.MM.DD (despite the TNC giving specific rules about how to format dates), adding even more code to check for those possibilities.
And some entries are missing the date tag/component entirely. More code to check for that.
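To show how quickly this grows, here is a rough sketch of just the date step. It covers only the cases I described above (an optional leading "(demo)" tag, "x" placeholders, "-" or "." separators, three-letter month names, and a missing date), not the full TNC. The names `DATE_PATTERN` and `extractDate` are made up for the example, and the pattern is my own guess at a tolerant match, not something taken from the TNC document.

```javascript
// Sketch only: handles the cases described in this post, not the whole TNC.
// Accepts YYYY, YYYYMMDD, YYYY-MM-DD, YYYY.MM.DD, "x" for unknown digits,
// and a three-letter month such as "JAN".
const DATE_PATTERN =
  /^[0-9x]{4}(?:[-.]?(?:[0-9x]{2}|[A-Za-z]{3})(?:[-.]?[0-9x]{2})?)?$/;

function extractDate(tncName) {
  // Collect the contents of every "(...)" group, in order of appearance.
  const groups = [...tncName.matchAll(/\(([^)]*)\)/g)].map((m) => m[1]);

  // If the first group is a demo tag, the date is expected in the next one.
  const candidates = /^demo/i.test(groups[0] ?? "") ? groups.slice(1) : groups;

  const first = candidates[0];
  if (first === undefined) return null; // no parenthesized parts at all

  // If the first candidate does not look like a date, treat the date as missing.
  return DATE_PATTERN.test(first) ? first : null;
}

console.log(extractDate("Winter Wally (1987)(Alternative Software)(GB)[h Paul Foster]"));
// prints "1987"
console.log(extractDate("Some Game (demo)(19xx)(Some Publisher)"));
// prints "19xx"
console.log(extractDate("Some Game (Some Publisher)"));
// prints null
```

That is already more code than the whole composition step, and it only finds the date. It does not normalize the format, and it does nothing about the publisher, region, or any of the bracketed flags.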
Start adding in all the other possible tags/components, and how to handle them if they are present, missing, or malformed... And the code has grown to several thousand lines. And all that code takes time to execute. The most recent dat file set has nearly 875000 <game ...> entries. If the code takes one second to fully parse and process each name from the dat file, it would take a little over 1 week and 3 days to process the 875K entries in the dat files.
If the code takes 1/10th of a second per name, it would still take a little more than 24 hours to process the entire set. At 1/100th of a second per name, about 2 and a half hours. Every bit of additional code slows it down even more.
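For anyone who wants to check those figures, the arithmetic can be sketched as below. The entry count is the approximate 875000 from above; the constant and function names are made up for the example.

```javascript
const ENTRIES = 875000; // approximate number of <game ...> entries quoted above

// Convert a number of seconds into whole days, hours, and minutes.
function formatDuration(totalSeconds) {
  const days = Math.floor(totalSeconds / 86400);
  const hours = Math.floor((totalSeconds % 86400) / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  return `${days}d ${hours}h ${minutes}m`;
}

// 1, 10, and 100 names per second correspond to 1s, 1/10th s,
// and 1/100th s per name.
for (const namesPerSecond of [1, 10, 100]) {
  console.log(`${namesPerSecond} names/s: ${formatDuration(ENTRIES / namesPerSecond)}`);
}
// prints:
// 1 names/s: 10d 3h 3m
// 10 names/s: 1d 0h 18m
// 100 names/s: 0d 2h 25m
```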
But now I see there is a misunderstanding. The names in the dat files are _not_ actual file names, they are text data stored in an XML format. They are only what the names _would_ be _if_ I actually had the file and named it according to the TNC. I have no issue with that, as I pointed out--only five lines of code (a few more, actually, depending on how many tags/flags/components there are, but still far less than the hundreds of lines needed to parse and decode the name), which would execute very quickly, and only on the files I actually have.
I am not even certain I am expressing this very well.
Here is a line from "./TOSEC/Atari 8bit - Games - [XEX] (TOSEC-v2014-10-30_CM).dat":
"Winter Wally (1987)(Alternative Software)(GB)[h Paul Foster]"
Is either of those lines the actual file? Did I have to upload them in order to copy/paste them into this post? No, they are just text. Likewise, the data in the dat files is not the files, it's just text data about the files--metadata.
But the fact that the text data is encoded in the TNC format is what has continued to make me believe that the TNC _must_ be used, without exception, and that it exists primarily to enforce that. That is what I am opposed to, why it makes me feel as if the TNC dictates to me how I must name my files, _and_ _must_ format any metadata about the file. I am _not_ allowed to:
<file name="Winter Wally" date="1987" publisher="Alternative Software" region="GB" flag-h="Paul Foster">
Nor am I allowed to name the file something else on my hard drive. "The text in the dat file follows the TNC, so must everything else."
If our positions were reversed, and you were under the same understanding I was, would you not also oppose it? Would you not come in here asking why?
( And I am still not certain I am being clear... )