Hey Vaxalon, to be a member you need to fill in some papers, pay the subscription fee and sign the NDA!
More seriously, I've already told you most of it via IRC: you can just start looking at the current dats, renaming new sets and fixing existing ones. Take it slowly so you get used to it again and avoid making mistakes while you get back into practice. If you have any questions, just post them around here and anyone is free to try to help.
In the meantime, once you get used to it, the tag will come naturally. As for what needs to be done, we don't know many specifics; you just have to dig in and find it. The first thing you have to do is assume that there are a LOT of mistakes everywhere, so you might need to inspect each set to confirm it. For example, TKaos and mai did/do exactly this: starting with some other dats and inspecting one set at a time to rename it. The results are amazing. Even for sets generally considered good (Amiga), mai found a lot of mistakes and incomplete information. On others the number of mistakes is huge; TKaos knows more about this than I do.
There can be errors in the convention (TNC), in other words naming format errors. The other kind, which is hard or impossible for others to check, is content errors, and these are important too: wrong information such as title, publisher, date and so on. Checking all of this may sound insane and like a lot of work, but it is really needed.
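To give an idea of what a naming format check could look like, here is a rough Python sketch. It assumes a heavily simplified version of the convention (a title followed by "(date)(publisher)", with "x" allowed for unknown date digits), and the helper name and sample names are made up for illustration. It is not the full TNC and it cannot catch content errors at all.

```python
import re

# Simplified assumption: "Title (date)(publisher)" followed by anything else.
# Date is a year, optionally -MM and -DD, where unknown digits may be "x".
NAME_RE = re.compile(
    r"^(?P<title>.+?) "
    r"\((?P<date>(?:19|20)[0-9x]{2}(?:-[0-9x]{2}){0,2})\)"
    r"\((?P<publisher>[^()]*)\)"
)


def format_problems(name):
    """Return a list of format problems for one set name (hypothetical helper)."""
    problems = []
    if NAME_RE.match(name) is None:
        problems.append("missing or malformed (date)(publisher)")
    if name.count("(") != name.count(")"):
        problems.append("unbalanced parentheses")
    if name.count("[") != name.count("]"):
        problems.append("unbalanced brackets")
    if "  " in name:
        problems.append("double space")
    return problems


if __name__ == "__main__":
    # Made-up sample names, not taken from any real dat.
    for sample in ["Example Game (1984)(Example Soft)[a]", "Example Game [a]"]:
        print(sample, "->", format_problems(sample) or "looks ok")
```

A script like this only flags candidates; each flagged set still needs to be inspected by hand.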
The time when new sets appeared frequently is gone; nowadays, for most of the oldest systems, you can only add new dumps of rare material, hacks and other dumps that are sometimes considered garbage. It is time for us to get it done properly: instead of growing dats fast by adding partly unrenamed sets, take a lot more time and rename each one properly so they are actually useful to the users. This IS a huge task, especially on bigger systems. Duncan Twain is doing something similar with C64 (which is really messed up) and may give you some tips too from what he has already learned.
About Spectrum:
- there were a couple of duplicates between Spectrum and other dats; I don't recall which, but maybe someone can help us here.
- ZX Spectrum has 65 dats so far; half or more have 2009 in the date, but that was due to minor changes to fix earlier TNC errors or country / language conversions.
- there are 4 dats with compilations - various that could be separated.
- the biggest dat is games - tzx; here you have:
11,922 sets
3 are unrenamed (ZZZ-UNK)
232 have an unknown date (1.9%)
a total of 5122 different titles.
1164 language flags used and only 15 country flags used; in many cases these language flags should be country flags.
cracked, hacked, modified and similar flags are used very rarely, but this may depend on the system, as you know.
alt flags: 3378 sets use [a] (28.3%), yet none uses a descriptor. I strongly suspect these alts are just incomplete renaming and should either have a description (to explain why they are alts) or be properly renamed.
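If you want to reproduce this kind of statistics yourself, a minimal Python sketch could look like the one below. It works on a plain list of set names and makes several assumptions that are mine, not part of any tool: unrenamed sets start with "ZZZ-UNK", an unknown date is a year containing "x", the title is everything before the first " (", and an alt descriptor is written as "[a2 some text]". The function names are hypothetical.

```python
import re
from collections import Counter

# Assumed date shape: (YYYY), (YYYY-MM) or (YYYY-MM-DD), "x" for unknown digits.
DATE_RE = re.compile(r"\(((?:19|20)[0-9x]{2})(?:-[0-9x]{2}){0,2}\)")
# Assumed alt shapes: [a], [a2], [a2 some descriptor].
ALT_RE = re.compile(r"\[a(\d*)( [^\]]+)?\]")


def dat_stats(names):
    """Count a few naming statistics over a list of set names."""
    stats = Counter()
    titles = set()
    for name in names:
        stats["sets"] += 1
        if name.startswith("ZZZ-UNK"):
            stats["unrenamed"] += 1
            continue  # nothing else can be parsed from an unrenamed set
        titles.add(name.split(" (", 1)[0])
        date = DATE_RE.search(name)
        if date and "x" in date.group(1):
            stats["unknown_date"] += 1
        alt = ALT_RE.search(name)
        if alt:
            stats["alt"] += 1
            if alt.group(2):
                stats["alt_with_descriptor"] += 1
    stats["titles"] = len(titles)
    return stats


def percent(part, total):
    """Share of `part` in `total`, in percent, rounded to one decimal."""
    return round(100.0 * part / total, 1)
```

With the totals quoted above, percent(232, 11922) gives 1.9 and percent(3378, 11922) gives 28.3, which matches the figures in the list.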
Hope you liked this bit of statistics and good luck with the work.