Datafile-Format discussion
Continued from Realistic/WWII total conversion? -- DevUrandom
well, in the last couple months, we've seen several suggestions for changes/additions to the wz data file formats -- if that continues, every time that changes, we'd need to add extra columns which makes the database engine reformat/defragment the database storage for a given table, which is inefficient, but is probably, overall, a whole lot more efficient than creating a 3-4 column type-agnostic table which has one record per field.
still, that seems the best implementation.
before it's started, though, i think it might be beneficial to have a set of version-dependent metadata files that warzone actually uses to parse the data files, as opposed to having it hardcoded -- this would help mitigate backwards-compatibility problems and would also be something the import/export script could use directly, so that adding new fields to warzone's csv data-files doesn't require direct maintenance of the online stat database.
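A rough sketch of the metadata idea, in Python for brevity: the field list for each data-file version lives in data rather than in the parser. The field names (`name`, `damage`, `range`, `reload_time`) and the in-code metadata layout are made up for illustration; real metadata would presumably live in its own files and use warzone's actual columns.

```python
import csv
import io

# Hypothetical metadata for two versions of one data file: (field name, converter).
# These field names are invented for illustration, not real warzone columns.
WEAPONS_META_V1 = [("name", str), ("damage", int), ("range", int)]
WEAPONS_META_V2 = WEAPONS_META_V1 + [("reload_time", float)]

def parse_datafile(text, meta):
    """Parse CSV text into a list of dicts, driven entirely by the field list in meta."""
    rows = []
    for lineno, record in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not record:
            continue  # skip blank lines
        if len(record) != len(meta):
            raise ValueError(
                f"line {lineno}: expected {len(meta)} fields, got {len(record)}"
            )
        rows.append(
            {name: conv(value.strip()) for (name, conv), value in zip(meta, record)}
        )
    return rows
```

Adding a field then means adding one metadata entry; the same `parse_datafile` could serve both the game and an import script.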
Last edited by DevUrandom on 18 Dec 2006, 22:45, edited 1 time in total.
- lav_coyote25
- Professional

- Posts: 3434
- Joined: 08 Aug 2006, 23:18
Datafile-Format discussion
ok fine - is there a set format for what you're doing... hand it over and i will do it. or tell me how to proceed - i already have a mysql database set up in conjunction with wamp ( for testing out web stuff.) i am just not sure how to proceed.
???
"to prepare for disaster is to invite it, to not prepare for disaster is a fools choice" -me (kim-lav_coyote25-metcalfe) - it used to be attributed to unknown - but adding the last bit , it now makes sense.
Datafile-Format discussion
well, my point is, going for a set format is something of a waste of time -- instead we should create some way to make it so that the warzone engine and this stats database system would use the same way to parse the data files -- the easiest way to do this (that i can think of) would be to use meta-data files that describe the information stored in the normal .wz data files.
if this were done in such a unified way, you'd have literally nothing to do except upload the .wz file.
- lav_coyote25
- Professional

- Posts: 3434
- Joined: 08 Aug 2006, 23:18
Datafile-Format discussion
kage wrote: well, my point is, going for a set format is something of a waste of time -- instead we should create some way to make it so that the warzone engine and this stats database system would use the same way to parse the data files -- the easiest way to do this (that i can think of) would be to use meta-data files that describe the information stored in the normal .wz data files.
if this were done in such a unified way, you'd have literally nothing to do except upload the .wz file.

ok - now - enlighten this one - what?? ??? i think i see, so explain it how it should be ... and that will be the way it will be.
"to prepare for disaster is to invite it, to not prepare for disaster is a fools choice" -me (kim-lav_coyote25-metcalfe) - it used to be attributed to unknown - but adding the last bit , it now makes sense.
Datafile-Format discussion
well, we're best off running this before a database specialist, but, given the downsides of making it more flexible, and since those downsides won't occur that often (presumably), it'd pretty much be the dead-simple layout as ratarf mentioned: one table per datafile, and one column per field in that file -- then just do a direct dump of the files into their appropriate columns -- if set up right, you'd only have to quote a few things as they're parsed, and then each line in the datafiles might actually work (more or less) as is as a valid sql insert string. metadata files, if implemented, could easily be used to find out what needs to be quoted (strings).
really, though, either metadata files or a shared parser library would be needed or you're blowing effort you could've spent doing nothing (with either of those two implementations -- metadata files are easier to implement and more beneficial anyways -- you would be doing nothing while the data gets imported automatically).
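A minimal sketch of the "one table per datafile, one column per field" import, using Python's sqlite3 instead of mysql so it runs standalone. It swaps hand-quoted insert strings for `?` placeholders, so the metadata only has to supply column names and types. The table name, column names and `WEAPONS_META` layout are assumptions, not real warzone fields.

```python
import sqlite3

# Hypothetical metadata for one datafile: (column name, SQL type, converter).
# The column names are invented for illustration, not real warzone fields.
WEAPONS_META = [
    ("name", "TEXT", str),
    ("damage", "INTEGER", int),
    ("range", "INTEGER", int),
]

def import_datafile(conn, table, meta, lines):
    """Create one table for the datafile and insert one row per data line."""
    # Table and column names come from trusted metadata, not from user input.
    columns = ", ".join(f"{name} {sqltype}" for name, sqltype, _ in meta)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
    placeholders = ", ".join("?" for _ in meta)
    rows = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split(",")]
        rows.append([conv(value) for (_, _, conv), value in zip(meta, fields)])
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    conn.commit()
    return len(rows)
```

With something like this, uploading a file and calling `import_datafile` once per datafile would be the whole import step.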
-
karmazilla
- Trained

- Posts: 84
- Joined: 26 Aug 2006, 21:05
Datafile-Format discussion
Warzone already has .y and .l files for generating parsers for a couple of formats, methinks, but the parsers they generate are most likely very tightly coupled with the rest of warzone.
Looks like the discussion about using XML for data files has won an argument and a use case in its favor.
Datafile-Format discussion
i was at one point a person spearheading the use of xml to store warzone data, so i'll not comment other than to say that there are huge benefits and equally huge downsides when working with xml.
- Watermelon
- Code contributor

- Posts: 551
- Joined: 08 Oct 2006, 09:37
Datafile-Format discussion
I am not sure what the .y and .l do exactly, seems their compiled form is used to compile/interpret the slo and vlo scripts.
I will try to experiment with xml a bit if no one else is doing that (actually gerard_ already did an xml version of 'repair.txt'), at least it's vastly better than that scanf...
edit:
I looked at this:
http://expat.sourceforge.net
seems to be too 'powerful' and complex for parsing wz stats, though it's the most compact lib I can find to parse xml.
Also found this interesting example in .l and .y on w3.org, but I don't know how to use bison and flex... probably using bison and flex is better, since wz uses them to parse data too:
http://www.w3.org/XML/9707/XML-in-C
Last edited by Watermelon on 15 Dec 2006, 21:33, edited 1 time in total.
tasks postponed until the trunk is relatively stable again.
Datafile-Format discussion
Watermelon wrote: I am not sure what the .y and .l do exactly,seems their compiled form is used to compile/interpret the slo and vlo scripts.
i believe they are used by a parser generator to create a data file parser for warzone.
Watermelon wrote: I will try to experiment with xml a bit if noone else is doing that(actually gerard_ already did xml version of 'repair.txt'),at least it's vastly better than that scanf...
indeed, it's much more human readable, but likely takes over 50x longer to parse, and if you are parsing data files in xml format, i strongly suggest you use sax, as building a full xml tree (dom) in memory will be entirely worthless in this case, and will eat up anywhere from 100 - 1000 times more memory than the scanf method.
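to illustrate the sax suggestion, here is a small sketch using Python's standard `xml.sax` module: the handler sees each element as it streams past and keeps only the values it needs, so no full tree is ever built. The `<weapon>` element and its `name`/`damage` attributes are a made-up layout, not the actual repair.xml format.

```python
import xml.sax

class StatsHandler(xml.sax.ContentHandler):
    """Collect the attributes of each <weapon> element as it streams past."""

    def __init__(self):
        super().__init__()
        self.weapons = []

    def startElement(self, name, attrs):
        # Hypothetical element and attribute names, for illustration only.
        if name == "weapon":
            self.weapons.append(
                {"name": attrs["name"], "damage": int(attrs["damage"])}
            )

def load_weapons(xml_text):
    """Parse an XML string with SAX and return the collected weapon records."""
    handler = StatsHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.weapons
```

memory use here grows with the number of records kept, not with the size of the document tree.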
as said, very strong advantages, and very strong disadvantages as well -- even the original designers of xml take the position that xml is being used wrongly in many cases, and that it was never designed to be a "best option every time" format.
what i might suggest, is use xml for authoring, and provide free, multi-platform tools for converting either way between the xml that modders use, and some plaintext/binary format that warzone uses: that way modders are happy and can work with greater efficiency, and warzone can still load a map in 5 seconds (as opposed to 20). also, if modders forget to convert their xml files to the native format, warzone would convert, using this same external tool, and cache those files for future use, doing a simple timestamp check on the mod as a whole to determine cache freshness.
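the convert-and-cache step could look roughly like the sketch below. it checks timestamps per file rather than on the mod as a whole, and the converter uses ElementTree for brevity (it builds a tree, which matters less in an offline tool than in the game's load path). the `<weapon>` layout, the comma-separated output and both function names are assumptions made for the example.

```python
import os
import xml.etree.ElementTree as ET

def convert_xml_to_native(xml_path, native_path):
    """Write one comma-separated line per <weapon> element (hypothetical layout)."""
    root = ET.parse(xml_path).getroot()
    with open(native_path, "w", encoding="utf-8") as out:
        for weapon in root.iter("weapon"):
            out.write(f"{weapon.get('name')},{weapon.get('damage')}\n")

def ensure_fresh_cache(xml_path, native_path):
    """Reconvert only when the cache is missing or older than the XML source.

    Returns True if a conversion ran, False if the cached file was reused.
    """
    if (not os.path.exists(native_path)
            or os.path.getmtime(native_path) < os.path.getmtime(xml_path)):
        convert_xml_to_native(xml_path, native_path)
        return True
    return False
```

on a normal load the only cost is two `stat` calls per file; the xml is touched again only after a modder edits it.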
-
karmazilla
- Trained

- Posts: 84
- Joined: 26 Aug 2006, 21:05
Datafile-Format discussion
Watermelon wrote: I will try to experiment with xml a bit if noone else is doing that(actually gerard_ already did xml version of 'repair.txt'),at least it's vastly better than that scanf...
Beware that I revised his repair.xml file: https://mail.gna.org/public/warzone-dev ... 00208.html
Also worth reading is his reply: https://mail.gna.org/public/warzone-dev ... 00210.html
I'm not 100% on the requirements for this particular xml-format, but I can do an XSD Schema that will validate on my version of repair.xml, once the "techlevel-ALL" issue is fixed.
- Watermelon
- Code contributor

- Posts: 551
- Joined: 08 Oct 2006, 09:37
Datafile-Format discussion
kage wrote: i believe they are used by a parser generator to create a data file parser for warzone.
indeed, it's much more human readable, but likely takes over 50x longer to parse, and if you are parsing data files in xml format, i strongly suggest you use sax, as building a full xml tree (dom) in memory will be entirely worthless in this case, and will eat up anywhere from 100 - 1000 times more memory than the scanf method.
as said, very strong advantages, and very strong disadvantages as well -- even the original designers of xml take the position that xml is being used wrongly in many cases, and that it was never designed to be a "best option every time" format.
what i might suggest, is use xml for authoring, and provide free, multi-platform tools for converting either way between the xml that modders use, and some plaintext/binary format that warzone uses: that way modders are happy and can work with greater efficiency, and warzone can still load a map in 5 seconds (as opposed to 20). also, if modders forget to convert their xml files to the native format, warzone would convert, using this same external tool, and cache those files for future use, doing a simple timestamp check on the mod as a whole to determine cache freshness.

I think any format that is easy to read/edit should be fine, we can keep the bison/flex, then just replace the scanf's in the LoadSomeStats function (which loads/breaks the parsed data from bison/flex into memory, I think) with an xml read/parse function. Maybe we should use a custom plain text format like other games use if xml is too slow, though parsing performance is not a major concern as long as they are only loaded into memory once during initialization.
tasks postponed until the trunk is relatively stable again.
Datafile-Format discussion
I have to admit I have never tried parsing XML files with bison before, but from my experience with bison I think it should be a piece of cake to write a bison/flex parser to parse xml files. I doubt it would take 50x longer to parse, bison/flex are real fast and the grammar rules are simple for XML.
I think we shouldn't introduce any new libraries for XML parsing, bison/flex or lua, which Devurandom is going to use at some point anyway (i'm not familiar with it though, not sure if it can be used to parse xml files, maybe Devurandom can clarify this), should be more than enough.
That said I must add that I probably won't have time to do that, I don't know if anyone else is up to writing a Bison/Flex parser, if we are to use Bison/Flex for parsing of course.
EDIT: the way we are doing it now we'd probably have to introduce a new parser for each txt file format there is (templates.txt, weapons.txt etc), we would end up with 50+ more c/h files (plus y/l files). I think we could combine all parsers into 1-3 files, since the parsers should be real tiny for each txt file.
Last edited by Troman on 16 Dec 2006, 06:09, edited 1 time in total.
Sign Up for Beta-Testing:
?topic=1617.0
- lav_coyote25
- Professional

- Posts: 3434
- Joined: 08 Aug 2006, 23:18
Datafile-Format discussion
hmmmm... k - question!
these txt files - weapons etc - would/could they be incorporated into what is being attempted ( tech tree as part of the gui ) ??
as i said its just a question... needed to be asked. ;D
"to prepare for disaster is to invite it, to not prepare for disaster is a fools choice" -me (kim-lav_coyote25-metcalfe) - it used to be attributed to unknown - but adding the last bit , it now makes sense.
-
karmazilla
- Trained

- Posts: 84
- Joined: 26 Aug 2006, 21:05
Datafile-Format discussion
Troman wrote: I think we shouldn't introduce any new libraries for XML parsing, bison/flex or lua, which Devurandom is going to use at some point anyway (i'm not familar with it though, not sure if it can be used to parse xml files, maybe Devurandom can clarify this), should be more than enough.
That said I must add that I probably won't have time to do that, I don't know if anyone else is up to writing a Bison/Flex parser, if we are to use Bison/Flex for parsing of course.

I cringe at the idea of a homegrown XML parser.
And, if performance in XML parsing is an issue, then wouldn't you expect the lads and lassies behind libxml to know a thing or two about it?
Point of the matter is, we don't need to reinvent the wheel and roll our own XML parser.
Datafile-Format discussion
My magic word: Caching!
XML == Raw Data
We all have the same heaven, but not the same horizon.
