Datafile-Format discussion

Discuss the future of Warzone 2100 with us.
kage
Regular
Posts: 751
Joined: 05 Dec 2006, 21:45

Post by kage »

Continued from Realistic/WWII total conversion? -- DevUrandom

well, in the last couple of months we've seen several suggestions for changes/additions to the wz data file formats -- if that continues, every format change means adding extra columns, which makes the database engine reformat/defragment the storage for that table. that's inefficient, but it's probably still a whole lot more efficient overall than creating a 3-4 column type-agnostic table which has one record per field.

still, one column per field seems the best implementation.

before it's started, though, i think it might be beneficial to have a set of version-dependent metadata files that warzone actually uses to parse the data files, as opposed to having the formats hardcoded -- this would help mitigate backwards-compatibility problems and would also be something the import/export script could use directly, so that adding new fields to warzone's csv data files doesn't require direct maintenance of the online stat database.
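
A minimal sketch of the metadata idea above, assuming a made-up field list for one data file. The field names, the two "versions" and the `parse_datafile` helper are all hypothetical and do not reflect Warzone's real column layout; in practice the metadata would live in a versioned file next to the data rather than in the source.

```python
import csv
import io

# Hypothetical metadata: (field name, converter) per column, one list per version.
WEAPONS_META_V1 = [
    ("name", str),
    ("damage", int),
    ("range", int),
]
# A later format version only appends a field; the parser code does not change.
WEAPONS_META_V2 = WEAPONS_META_V1 + [("reload_time", float)]


def parse_datafile(text, meta):
    """Parse comma-separated rows into dicts using the field list in meta."""
    rows = []
    for lineno, record in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not record:
            continue  # skip blank lines
        if len(record) != len(meta):
            raise ValueError(
                f"line {lineno}: expected {len(meta)} fields, got {len(record)}"
            )
        rows.append({name: conv(value.strip())
                     for (name, conv), value in zip(meta, record)})
    return rows
```

Both the game and an import script could call the same function with the metadata that matches the data file's version, for example `parse_datafile("MG1,10,512,1.5\n", WEAPONS_META_V2)`.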
Last edited by DevUrandom on 18 Dec 2006, 22:45, edited 1 time in total.
lav_coyote25
Professional
Posts: 3434
Joined: 08 Aug 2006, 23:18

Post by lav_coyote25 »

ok fine - is there a set format for what you're doing? hand it over and i will do it, or tell me how to proceed - i already have a mysql database set up in conjunction with wamp (for testing out web stuff). i am just not sure how to proceed.

???
‎"to prepare for disaster is to invite it, to not prepare for disaster is a fools choice" -me (kim-lav_coyote25-metcalfe) - it used to be attributed to unknown - but adding the last bit , it now makes sense.
kage
Regular
Posts: 751
Joined: 05 Dec 2006, 21:45

Post by kage »

well, my point is, going for a set format is something of a waste of time -- instead we should make the warzone engine and this stats database system parse the data files the same way -- the easiest way to do this (that i can think of) would be to use metadata files that describe the information stored in the normal .wz data files.

if this were done in such a unified way, you'd have literally nothing to do except upload the .wz file.
lav_coyote25
Professional
Posts: 3434
Joined: 08 Aug 2006, 23:18

Post by lav_coyote25 »

kage wrote: well, my point is, going for a set format is something of a waste of time -- instead we should make the warzone engine and this stats database system parse the data files the same way -- the easiest way to do this (that i can think of) would be to use metadata files that describe the information stored in the normal .wz data files.

if this were done in such a unified way, you'd have literally nothing to do except upload the .wz file.
ok - now enlighten this one - what?? i think i see, so explain how it should be... and that will be the way it will be.
kage
Regular
Posts: 751
Joined: 05 Dec 2006, 21:45

Post by kage »

well, we're best off running this past a database specialist, but given the downsides of making it more flexible, and since the downsides of the simple layout (adding columns when a format changes) won't come up that often (presumably), it'd pretty much be the dead-simple layout ratarf mentioned: one table per datafile, and one column per field in that file. then just do a direct dump of the files into their appropriate columns -- if set up right, you'd only have to quote a few things as they're parsed, and then each line in the datafiles might actually work (more or less) as-is in a valid sql insert statement. metadata files, if implemented, could easily be used to find out what needs to be quoted (strings).

really, though, either metadata files or a shared parser library would be needed, otherwise you're spending effort by hand that you could've spent doing nothing: with either of those two implementations you'd sit back while the data gets imported automatically. metadata files are easier to implement and more beneficial anyway.
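
A rough sketch of the "one table per datafile, one column per field" import described above. It uses Python's bundled `sqlite3` as a stand-in for the MySQL setup mentioned earlier in the thread, and the `REPAIR_FIELDS` list and `import_datafile` helper are hypothetical. Bound parameters replace hand-quoting of strings, so the metadata only has to supply column names and types.

```python
import sqlite3

# Hypothetical metadata for one data file: (column name, SQL type, converter).
REPAIR_FIELDS = [
    ("name", "TEXT", str),
    ("build_power", "INTEGER", int),
    ("repair_points", "INTEGER", int),
]


def import_datafile(conn, table, fields, lines):
    """Create one table for the data file and load each line as one row.

    table and the column names come from trusted metadata, never from
    user input, because they are interpolated into the SQL text.
    """
    columns = ", ".join(f"{name} {sqltype}" for name, sqltype, _ in fields)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
    placeholders = ", ".join("?" for _ in fields)
    rows = [
        tuple(conv(value.strip())
              for (_, _, conv), value in zip(fields, line.split(",")))
        for line in lines
        if line.strip()  # skip blank lines
    ]
    # Values are bound as parameters, so no manual quoting is needed.
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    return len(rows)
```

With an in-memory database this could be tried as `import_datafile(sqlite3.connect(":memory:"), "repair", REPAIR_FIELDS, ["LightRepair1,50,10"])`. Adding a field then means one more metadata entry plus one added column in the table.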
karmazilla
Trained
Posts: 84
Joined: 26 Aug 2006, 21:05

Post by karmazilla »

Warzone already has .y and .l files for generating parsers for a couple of formats, methinks, but the parsers they generate are most likely very tightly coupled with the rest of Warzone.

Looks like the discussion about using XML for data files has won an argument and a use case in its favor.
kage
Regular
Posts: 751
Joined: 05 Dec 2006, 21:45

Post by kage »

i was at one point spearheading the use of xml to store warzone data, so i'll not comment other than to say that there are huge benefits and equally huge downsides when working with xml.
Watermelon
Code contributor
Posts: 551
Joined: 08 Oct 2006, 09:37

Post by Watermelon »

I am not sure what the .y and .l files do exactly; it seems their compiled form is used to compile/interpret the slo and vlo scripts.

I will try to experiment with xml a bit if no one else is doing that (actually gerard_ already did an xml version of 'repair.txt'). At least it's vastly better than that scanf...

edit:

I looked at this:
http://expat.sourceforge.net
It seems too 'powerful' and complex for parsing wz stats, though it's the most compact lib I can find to parse xml.

Also found this interesting example in .l and .y on w3.org, but I don't know how to use bison and flex... Using bison and flex is probably better, since wz uses them to parse data too:
http://www.w3.org/XML/9707/XML-in-C
Last edited by Watermelon on 15 Dec 2006, 21:33, edited 1 time in total.
tasks postponed until the trunk is relatively stable again.
kage
Regular
Posts: 751
Joined: 05 Dec 2006, 21:45

Post by kage »

Watermelon wrote: I am not sure what the .y and .l files do exactly; it seems their compiled form is used to compile/interpret the slo and vlo scripts.
i believe they are used by a parser generator to create a data file parser for warzone.
Watermelon wrote: I will try to experiment with xml a bit if no one else is doing that (actually gerard_ already did an xml version of 'repair.txt'). At least it's vastly better than that scanf...
indeed, it's much more human readable, but likely takes over 50x longer to parse, and if you are parsing data files in xml format, i strongly suggest you use sax, as building a full xml tree (dom) in memory will be entirely worthless in this case, and will eat up anywhere from 100 - 1000 times more memory than the scanf method.
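
To illustrate the SAX suggestion above, here is a small sketch using Python's standard `xml.sax` module. The `<repair>` element and its attributes are invented for the example and are not the actual repair.xml layout. The handler sees one element at a time and keeps only what it needs, so no full document tree is built in memory.

```python
import xml.sax


class StatsHandler(xml.sax.ContentHandler):
    """Collect the attributes of each <repair> element as it streams past."""

    def __init__(self):
        super().__init__()
        self.records = []

    def startElement(self, name, attrs):
        if name == "repair":  # hypothetical element name
            self.records.append(dict(attrs.items()))


def load_stats(xml_text):
    """Parse XML text event by event and return the collected records."""
    handler = StatsHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.records
```

The speed and memory ratios quoted in the post are kage's estimates; this sketch only shows the event-driven style, it does not measure anything.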

as said, very strong advantages, and very strong disadvantages as well -- even the original designers of xml take the position that xml is being used wrongly in many cases, and that it was never designed to be a "best option every time" format.

what i might suggest is to use xml for authoring, and provide free, multi-platform tools for converting either way between the xml that modders use and some plaintext/binary format that warzone uses: that way modders are happy and can work with greater efficiency, and warzone can still load a map in 5 seconds (as opposed to 20). also, if modders forget to convert their xml files to the native format, warzone would convert them using this same external tool and cache the results for future use, doing a simple timestamp check on the mod as a whole to determine cache freshness.
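
The convert-and-cache step could look roughly like the sketch below. It checks freshness per file rather than on the mod as a whole, and `convert` stands in for the external conversion tool; the function names and file paths are hypothetical.

```python
import os


def cache_is_fresh(source_path, cache_path):
    """True if the cached file exists and is at least as new as its source."""
    try:
        return os.path.getmtime(cache_path) >= os.path.getmtime(source_path)
    except OSError:
        return False  # a missing file means the cache cannot be trusted


def load_with_cache(source_path, cache_path, convert):
    """Return native-format data, regenerating the cache when it is stale."""
    if not cache_is_fresh(source_path, cache_path):
        with open(source_path, "r", encoding="utf-8") as src:
            native = convert(src.read())
        with open(cache_path, "w", encoding="utf-8") as dst:
            dst.write(native)
    with open(cache_path, "r", encoding="utf-8") as cached:
        return cached.read()
```

Only the first load after an edit pays for the conversion; later loads read the cached native file directly.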
karmazilla
Trained
Posts: 84
Joined: 26 Aug 2006, 21:05

Post by karmazilla »

Watermelon wrote: I will try to experiment with xml a bit if no one else is doing that (actually gerard_ already did an xml version of 'repair.txt'). At least it's vastly better than that scanf...
Beware that I revised his repair.xml file: https://mail.gna.org/public/warzone-dev ... 00208.html
Also worth reading is his reply: https://mail.gna.org/public/warzone-dev ... 00210.html

I'm not 100% sure of the requirements for this particular xml format, but I can do an XSD schema that will validate my version of repair.xml, once the "techlevel-ALL" issue is fixed.
Watermelon
Code contributor
Posts: 551
Joined: 08 Oct 2006, 09:37

Post by Watermelon »

kage wrote: i believe they are used by a parser generator to create a data file parser for warzone.

indeed, it's much more human readable, but likely takes over 50x longer to parse, and if you are parsing data files in xml format, i strongly suggest you use sax, as building a full xml tree (dom) in memory will be entirely worthless in this case, and will eat up anywhere from 100 - 1000 times more memory than the scanf method.

as said, very strong advantages, and very strong disadvantages as well -- even the original designers of xml take the position that xml is being used wrongly in many cases, and that it was never designed to be a "best option every time" format.

what i might suggest, is use xml for authoring, and provide free, multi-platform tools for converting either way between the xml that modders use, and some plaintext/binary format that warzone uses: that way modders are happy and can work with greater efficiency, and warzone can still load a map in 5 seconds (as opposed to 20). also, if modders forget to convert their xml files to the native format, warzone would convert, using this same external tool, and cache those files for future use, doing a simple timestamp check on the mod as a whole to determine cache freshness.
I think any format that is easy to read/edit should be fine. We can keep bison/flex, then just replace the scanf calls in the LoadSomeStats functions (which, I think, load the parsed data from bison/flex into memory) with xml read/parse functions. Maybe we should use a custom plain-text format like other games use if xml is too slow, though parsing performance is not a major concern as long as the files are only loaded into memory once during initialization.
Troman
Trained
Posts: 424
Joined: 12 Aug 2006, 15:40

Post by Troman »

I have to admit I have never tried parsing XML files with bison before, but from my experience with bison I think it should be a piece of cake to write a bison/flex parser to parse xml files. I doubt it would take 50x longer to parse; bison/flex are real fast and the grammar rules are simple for XML.
I think we shouldn't introduce any new libraries for XML parsing. bison/flex or lua, which Devurandom is going to use at some point anyway, should be more than enough (I'm not familiar with lua though, and not sure if it can be used to parse xml files; maybe Devurandom can clarify this).

That said, I must add that I probably won't have time to do that. I don't know if anyone else is up to writing a Bison/Flex parser, if we are to use Bison/Flex for parsing, of course.

EDIT: the way we are doing it now, we'd probably have to introduce a new parser for each txt file format there is (templates.txt, weapons.txt etc.), and we would end up with 50+ more c/h files (plus y/l files). I think we could combine all parsers into 1-3 files, since the parser for each txt file should be real tiny.
Last edited by Troman on 16 Dec 2006, 06:09, edited 1 time in total.
lav_coyote25
Professional
Posts: 3434
Joined: 08 Aug 2006, 23:18

Post by lav_coyote25 »

hmmmm... k - question!

these txt files - weapons etc - would/could they be incorporated into what is being attempted (tech tree as part of the gui)??

as i said, it's just a question... needed to be asked. ;D
karmazilla
Trained
Posts: 84
Joined: 26 Aug 2006, 21:05

Post by karmazilla »

Troman wrote: I think we shouldn't introduce any new libraries for XML parsing. bison/flex or lua, which Devurandom is going to use at some point anyway, should be more than enough (I'm not familiar with lua though, and not sure if it can be used to parse xml files; maybe Devurandom can clarify this).

That said, I must add that I probably won't have time to do that. I don't know if anyone else is up to writing a Bison/Flex parser, if we are to use Bison/Flex for parsing, of course.
I cringe at the idea of a homegrown XML parser :( libxml is a very small dependency - I'm looking at the ubuntu package right now, and libxml1 only depends on libc6 and zlib1g. Besides, because of the SGML legacy in XML, there are some very hairy syntax rules in XML, like DTDs, entity resolving, xml:id and CDATA sections. Plus, if we're going to create XSD schemas and validate against them, then libxml might (I'm not entirely sure) have some functionality to solve that.
And if performance in XML parsing is an issue, wouldn't you expect the lads and lassies behind libxml to know a thing or two about it?

Point of the matter is: we don't need to reinvent the wheel and roll our own XML parser.
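
As a small illustration of the "hairy syntax" point, the sketch below feeds a document with entity references and a CDATA section to an existing parser. It uses Python's bundled `xml.etree.ElementTree` purely as a stand-in, not libxml itself, and the sample document and `read_repair_notes` helper are made up. A hand-written parser would have to get both cases right on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: escaped entities in one element, raw CDATA in another.
SAMPLE = """<?xml version="1.0"?>
<stats>
  <repair id="LightRepair1">
    <note>Cost &lt; 100 &amp; fast</note>
    <script><![CDATA[if (a < b && c) { repair(); }]]></script>
  </repair>
</stats>"""


def read_repair_notes(xml_text):
    """Map each repair id to its (note, script) text, fully decoded."""
    root = ET.fromstring(xml_text)
    return {
        node.get("id"): (node.findtext("note"), node.findtext("script"))
        for node in root.iter("repair")
    }
```

The parser resolves `&lt;` and `&amp;` and passes the CDATA body through unchanged, without any extra code on the caller's side.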
Kamaze
Regular
Posts: 1017
Joined: 30 Jul 2006, 15:23

Post by Kamaze »

My magic word: Caching!

XML == Raw Data
We all have the same heaven, but not the same horizon.