Hi guys,
Nice discussion that started while I was away.
I will try to cover most of the stuff discussed here, but I will surely miss some since I have other things I need to do soon.
PART of the idea behind TIM was good and useful. The problem is that it would take years to do, and we would also need to fix most of the issues around TNC and the TOSEC sets that make these ideas difficult.
Like you, I have the same idea/dream and have been pursuing it for the last 2 years or so (since I dropped TIM): creating a set of useful tools around the TOSEC collection. In my view a single, monolithic tool is not the solution and would end up just like TIM; several small tools would be much better.
Renaming all sets with a single db should be done by some small tool and just that, not a powerful tool, since cmp is already great and doing something big takes much of our free time. My current idea, if I had to say something, would be a small, simple tool that picks some dats and rebuilds from source to destination (with or without creating an intermediate db of hashes and filenames, and with dats already filtered down to only the wanted sets, for example).
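To make the rebuild idea a bit more concrete, here is a rough Python sketch, not an existing tool. It assumes the dats were already parsed and filtered into a plain mapping of SHA-1 hash to set filename; the names `sha1_of`, `rebuild` and `wanted` are made up for the example, and real dats may carry CRC32 or MD5 instead.

```python
import hashlib
import shutil
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Return the SHA-1 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rebuild(wanted: dict[str, str], source: Path, destination: Path) -> list[str]:
    """Copy each source file whose hash is in `wanted` to `destination`.

    `wanted` maps SHA-1 hex digest -> filename taken from the (already
    filtered) dats. Files with unknown hashes are skipped. Returns the
    sorted list of filenames that were rebuilt.
    """
    destination.mkdir(parents=True, exist_ok=True)
    rebuilt = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        name = wanted.get(sha1_of(path))
        if name is None:
            continue  # hash not present in any wanted set
        target = destination / name
        if not target.exists():
            shutil.copy2(path, target)
            rebuilt.append(name)
    return sorted(rebuilt)
```

The `wanted` mapping plays the role of the optional intermediate db; it could just as well be loaded from a small database file instead of being built in memory.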
About the parser: TNC is really nasty and huge. I have a parser to extract all that information, identify some errors and so on. It is NOT perfect but it is something. It is currently WIP and, unfortunately for you and me I guess, done in C# (I was trying to use something new and fast to develop in).
That parser is used in a couple of tools I have to check dats and so on.
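Just to illustrate the kind of work involved, here is a very simplified sketch in Python (this is not the C# parser, and it covers only a tiny part of TNC). It assumes names shaped like `Title (date)(publisher)[flag detail]` and only knows the seven dump-flag codes for cracked, fixed, hacked, modified, pirated, trained and translated; `parse_setname` and the regexes are invented for the example.

```python
import re

# Simplified subset of dump-flag codes; real TNC has more flags and variants.
FLAG_NAMES = {
    "cr": "cracked", "f": "fixed", "h": "hacked", "m": "modified",
    "p": "pirated", "t": "trained", "tr": "translated",
}

# Title, then "(date)(publisher)" with no space between the two groups.
NAME_RE = re.compile(r"^(?P<title>.+?) \((?P<date>[^)]*)\)\((?P<publisher>[^)]*)\)")
# A bracketed flag code with an optional free-text detail, e.g. "[cr Some Group]".
FLAG_RE = re.compile(r"\[(?P<code>[a-z]+)(?: (?P<detail>[^\]]+))?\]")


def parse_setname(name: str) -> dict:
    """Split a set name into title, date, publisher and known dump flags."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"name does not look like a TNC set name: {name!r}")
    flags = []
    for flag in FLAG_RE.finditer(name, match.end()):
        code = flag.group("code")
        if code in FLAG_NAMES:  # unknown codes are ignored in this sketch
            flags.append((FLAG_NAMES[code], flag.group("detail")))
    return {
        "title": match.group("title"),
        "date": match.group("date"),
        "publisher": match.group("publisher"),
        "flags": flags,
    }
```

A real parser also has to deal with system, video, country, language and media fields, numbered flags, and all the malformed names found in existing dats, which is where most of the nastiness comes from.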
Next: the first months after getting tired of the TIM DB and thinking about an alternative led me to plan a new DB. It got huge, with a lot of tables (100+), and I only normalized part of them (the most used). After that I decided to test it and start something that could be fun, and went for something more web-based instead of a local app. I created some PHP scripts and imported the data into MySQL. That's how we can now see what TKaos said; for example, for "Bencor Brothers" we have
Bencor Brothers Stats
Cracked: 95 Pirated: 0
Fixed: 0 Trained: 25
Hacked: 5 Translated: 0
Modified: 0 Total: 125
Published Sets: 4
found in:
Dats: 8
Systems: 2
...some of the sets use "BB" as the group abbreviation, and there are also 84 sets using "Bencor Bros", which are wrong and need to be fixed. This is just an example, but once all the information is parsed and well organized, it's just a matter of code logic or SQL to do whatever we like.
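As a rough idea of what "just a matter of SQL" can look like, here is a small Python sketch using an in-memory SQLite database as a stand-in for the MySQL one. The `set_flags` table, the `ALIASES` map and the sample rows are all made up for illustration; they are not the real schema or real data.

```python
import sqlite3

# Hypothetical alias map: variants that should collapse into one canonical name.
ALIASES = {"BB": "Bencor Brothers", "Bencor Bros": "Bencor Brothers"}

# Invented sample rows: (set name, flag, group as written in the dat).
SAMPLE_ROWS = [
    ("Demo One", "cracked", "Bencor Brothers"),
    ("Demo Two", "trained", "Bencor Brothers"),
    ("Demo Three", "cracked", "Bencor Bros"),
    ("Demo Four", "cracked", "BB"),
]


def load(rows):
    """Create the in-memory table and fill it with parsed rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE set_flags (setname TEXT, flag TEXT, grp TEXT)")
    conn.executemany("INSERT INTO set_flags VALUES (?, ?, ?)", rows)
    return conn


def flag_counts(conn, group):
    """Count sets per flag for one exact group name."""
    query = (
        "SELECT flag, COUNT(*) FROM set_flags "
        "WHERE grp = ? GROUP BY flag ORDER BY flag"
    )
    return dict(conn.execute(query, (group,)).fetchall())


def sets_needing_fix(conn):
    """List (set name, canonical group) for sets using a known wrong variant."""
    marks = ",".join("?" * len(ALIASES))
    query = (
        f"SELECT setname, grp FROM set_flags "
        f"WHERE grp IN ({marks}) ORDER BY setname"
    )
    return [(name, ALIASES[grp]) for name, grp in conn.execute(query, list(ALIASES))]
```

Calling `flag_counts(load(SAMPLE_ROWS), "Bencor Brothers")` gives the per-flag numbers for the canonical name only, while `sets_needing_fix` returns the sets whose group name still has to be corrected.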
Anyway, there are a lot of issues regarding this. First, about my simple prototype: I started it just as a test, so it is not WELL designed; it should be completely redone and secured before thinking about going live.
Second, a lot of problems arise from not-so-clear TNC issues that keep changing and create the need to change the DB, the DATs or the parser.
Third, the updates. This is a really nasty problem: with dats being renamed and sets fixed/added/dropped, it gets really complicated to discover the changes just by looking at dats, without giving a LOT of extra work to renamers.
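The easy part of the update problem can be sketched by comparing two versions of a dat by hash, as in the Python example below. It assumes each version was already reduced to a mapping of hash to set name; `diff_dats` is a made-up helper, not part of any existing tool.

```python
def diff_dats(old: dict[str, str], new: dict[str, str]) -> dict[str, list]:
    """Compare two dat versions, each given as a hash -> set name mapping.

    Same hash with a different name counts as a rename; a hash present only
    in `new` is an added set; a hash present only in `old` is a dropped set.
    """
    shared = old.keys() & new.keys()
    renamed = sorted((old[h], new[h]) for h in shared if old[h] != new[h])
    added = sorted(new[h] for h in new.keys() - old.keys())
    dropped = sorted(old[h] for h in old.keys() - new.keys())
    return {"renamed": renamed, "added": added, "dropped": dropped}
```

Note the limit of this approach: a fixed set has a new hash, so it shows up as one dropped set plus one added set, and a renamed dat file has to be matched to its old version first. Those are exactly the cases that are hard to detect from the dats alone.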
The original setnames should be maintained, but allowing renames to some user-defined name shouldn't be a problem.
Making renamers start using that system to add new sets is impossible. For ISO and new things it is possible, since things are now being dumped and added with nice rules, but unfortunately renamers prefer to handle files and generate cmp dats. Parsing their new dats is possible, but it means a lot of extra code and possible points of failure.
I guess the server requirements to bring it to life would be higher than what I currently have.
Current datfiles (non-ISO) are full of errors that need to be fixed (values inserted as titles, scene groups, publishers and so on); that is our current goal.
Anyway, those are my ideas. I hope you can understand something out of this pile of unorganized ideas, and feel free to ask. That is my current view of it: the main thing would be a web something to serve as a central point of information, and a few tools could be built on top of it, such as a rebuilder using dats/xml/db/something, a local app to browse information with a local db (e.g. sqlite) or dats (cmp, xml, something), and so on.
The main problems with it are definitely the time we have (at least I don't have much currently), existing rules and definitions in TOSEC that you or I can't change just because they are clearly not a good option, and, in a few aspects, knowledge.