Datafile-Format discussion
Posted: 15 Dec 2006, 00:43
by kage
Continued from Realistic/WWII total conversion? -- DevUrandom
well, in the last couple of months we've seen several suggestions for changes/additions to the wz data file formats -- if that continues, then every time a format changes we'd need to add extra columns, which makes the database engine reformat/defragment the storage for that table. that's inefficient, but probably still a whole lot more efficient overall than creating a 3-4 column type-agnostic table with one record per field.
still, adding columns seems the best implementation.
before it's started, though, i think it might be beneficial to have a set of version-dependent metadata files that warzone actually uses to parse the data files, as opposed to having the formats hardcoded -- this would help mitigate backwards-compatibility problems and would also be something the import/export script could use directly, so that adding new fields to warzone's csv data files doesn't require direct maintenance of the online stat database.
Posted: 15 Dec 2006, 01:40
by lav_coyote25
ok fine - is there a set format for what you're doing? hand it over and i will do it. or tell me how to proceed - i already have a mysql database set up in conjunction with wamp (for testing out web stuff). i am just not sure how to proceed.
???
Posted: 15 Dec 2006, 01:59
by kage
well, my point is, going for a set format is something of a waste of time -- instead we should create some way to make it so that the warzone engine and this stats database system would use the same way to parse the data files -- the easiest way to do this (that i can think of) would be to use meta-data files that describe the information stored in the normal .wz data files.
if this were done in such a unified way, you'd have literally nothing to do except upload the .wz file.
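a rough sketch of what "metadata describes the data file" could look like, in python. the metadata layout and the field names here are made up for illustration (they are not warzone's real columns); the point is only that one generic parser plus a per-file description replaces a hardcoded parser per file.

```python
import csv
import io

# Hypothetical metadata for one data file: column name and converter, in file order.
# These field names are illustrative only, not Warzone's actual columns.
WEAPONS_META = [
    ("name", str),
    ("range", int),
    ("damage", int),
    ("reload", float),
]

def parse_datafile(text, meta):
    """Parse comma-separated lines into dicts, converting each field per the metadata."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(meta):
            raise ValueError(f"expected {len(meta)} fields, got {len(row)}: {row}")
        records.append(
            {name: convert(value.strip()) for (name, convert), value in zip(meta, row)}
        )
    return records

SAMPLE = "Cannon1,900,30,2.5\nRocket1,1200,45,4.0\n"
records = parse_datafile(SAMPLE, WEAPONS_META)
```

adding a field would then mean adding one entry to the metadata, and both the game and the import script could read the same description.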
Posted: 15 Dec 2006, 02:05
by lav_coyote25
kage wrote:
well, my point is, going for a set format is something of a waste of time -- instead we should create some way to make it so that the warzone engine and this stats database system would use the same way to parse the data files -- the easiest way to do this (that i can think of) would be to use meta-data files that describe the information stored in the normal .wz data files.
if this were done in such a unified way, you'd have literally nothing to do except upload the .wz file.
ok - now - enlighten this one - what?? ??? i think i see, so explain it how it should be ... and that will be the way it will be.
Posted: 15 Dec 2006, 10:00
by kage
well, we're best off running this past a database specialist, but, given the downsides of making it more flexible, and since the format changes won't occur that often (presumably), it'd pretty much be the dead-simple layout ratarf mentioned: one table per datafile, and one column per field in that file -- then just do a direct dump of the files into their appropriate columns -- if set up right, you'd only have to quote a few things as they're parsed, and then each line in the datafiles might actually work (more or less) as-is as a valid sql insert string. metadata files, if implemented, could easily be used to find out what needs to be quoted (strings).
really, though, you'd need either metadata files or a shared parser library; otherwise you're spending effort by hand on something that could be imported automatically while you do nothing. of the two, metadata files are easier to implement and more beneficial anyways.
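a minimal sketch of the "one table per datafile, one column per field" import, assuming a hypothetical metadata list of (column, sql type) pairs. it uses python's built-in sqlite3 in place of mysql purely so the example is self-contained, and it uses parameter binding instead of hand-quoting the strings, which sidesteps the "what needs to be quoted" question.

```python
import sqlite3

# Hypothetical metadata: (column name, SQL type) in file order. Illustrative only.
META = [("name", "TEXT"), ("range", "INTEGER"), ("damage", "INTEGER")]

def import_table(conn, table, meta, lines):
    """Create one table for a data file and insert one row per non-empty line."""
    # Table and column names come from trusted metadata, not from user input.
    columns = ", ".join(f"{name} {sqltype}" for name, sqltype in meta)
    conn.execute(f"CREATE TABLE {table} ({columns})")
    placeholders = ", ".join("?" for _ in meta)
    rows = [
        tuple(field.strip() for field in line.split(","))
        for line in lines
        if line.strip()
    ]
    # Parameter binding handles quoting, so strings need no manual escaping.
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)

conn = sqlite3.connect(":memory:")
import_table(conn, "weapons", META, ["Cannon1,900,30", "Rocket1,1200,45"])
result = conn.execute("SELECT name, range FROM weapons ORDER BY range").fetchall()
```

with this shape, a new field in a data file only needs a new metadata entry; the table definition follows from it.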
Posted: 15 Dec 2006, 10:47
by karmazilla
Warzone already has .y and .l files for generating parsers for a couple of formats, methinks, but the parsers they generate are most likely very tightly coupled with the rest of warzone.
Looks like the discussion about using XML for data files has won an argument and a use case in its favor.
Posted: 15 Dec 2006, 11:39
by kage
i was at one point a person spearheading the use of xml to store warzone data, so i'll not comment other than to say that there are huge benefits and equally huge downsides when working with xml.
Posted: 15 Dec 2006, 14:51
by Watermelon
I am not sure what the .y and .l files do exactly; it seems their compiled form is used to compile/interpret the slo and vlo scripts.
I will try to experiment with xml a bit if no one else is doing that (actually gerard_ already did an xml version of 'repair.txt'). At least it's vastly better than that scanf...
edit:
I looked at this:
http://expat.sourceforge.net
It seems too 'powerful' and complex for parsing wz stats, though it's the most compact lib I can find to parse xml.
I also found this interesting example in .l and .y on w3.org, but I don't know how to use bison and flex... Using bison and flex is probably better, since wz uses them to parse data too:
http://www.w3.org/XML/9707/XML-in-C
Posted: 15 Dec 2006, 23:59
by kage
Watermelon wrote:
I am not sure what the .y and .l files do exactly; it seems their compiled form is used to compile/interpret the slo and vlo scripts.
i believe they are used by a parser generator to create a data file parser for warzone.
Watermelon wrote:
I will try to experiment with xml a bit if no one else is doing that (actually gerard_ already did an xml version of 'repair.txt'). At least it's vastly better than that scanf...
indeed, it's much more human readable, but likely takes over 50x longer to parse, and if you are parsing data files in xml format, i strongly suggest you use sax, as building a full xml tree (dom) in memory will be entirely worthless in this case, and will eat up anywhere from 100 - 1000 times more memory than the scanf method.
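to illustrate the sax style, here is a small sketch using python's standard xml.sax module. the element and attribute names are invented for the example and are not taken from the actual repair.xml. the handler sees each element as it streams past and keeps only what it needs, so no full tree is ever held in memory.

```python
import xml.sax

class StatsHandler(xml.sax.ContentHandler):
    """Collect the attributes of each <repair> element as it streams past."""

    def __init__(self):
        super().__init__()
        self.records = []

    def startElement(self, name, attrs):
        # Called once per opening tag; nothing else about the document is kept.
        if name == "repair":
            self.records.append(dict(attrs.items()))

# Made-up sample document; names are assumptions for illustration only.
SAMPLE = (
    b'<stats>'
    b'<repair id="LightRepair1" points="15"/>'
    b'<repair id="HeavyRepair1" points="30"/>'
    b'</stats>'
)

handler = StatsHandler()
xml.sax.parseString(SAMPLE, handler)
```

attribute values arrive as strings, so converting them to numbers would still be the loader's job.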
as said, very strong advantages, and very strong disadvantages as well -- even the original designers of xml take the position that xml is being used wrongly in many cases, and that it was never designed to be a "best option every time" format.
what i might suggest, is use xml for authoring, and provide free, multi-platform tools for converting either way between the xml that modders use, and some plaintext/binary format that warzone uses: that way modders are happy and can work with greater efficiency, and warzone can still load a map in 5 seconds (as opposed to 20). also, if modders forget to convert their xml files to the native format, warzone would convert, using this same external tool, and cache those files for future use, doing a simple timestamp check on the mod as a whole to determine cache freshness.
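the cache-freshness part could be sketched roughly like this in python. it is a per-file version under assumed names (`load_with_cache`, `convert` and the file paths are all hypothetical); a check on the mod as a whole would compare against the newest timestamp in the mod instead.

```python
import os

def cache_is_fresh(source_path, cache_path):
    """Return True if the cache exists and is at least as new as the source."""
    try:
        return os.path.getmtime(cache_path) >= os.path.getmtime(source_path)
    except OSError:
        return False  # a missing cache (or source) counts as stale

def load_with_cache(source_path, cache_path, convert):
    """Return the converted data, re-running convert() only when the cache is stale."""
    if not cache_is_fresh(source_path, cache_path):
        with open(source_path, "r", encoding="utf-8") as src:
            converted = convert(src.read())
        with open(cache_path, "w", encoding="utf-8") as dst:
            dst.write(converted)
    with open(cache_path, "r", encoding="utf-8") as cached:
        return cached.read()
```

here `convert` stands in for the external xml-to-native conversion tool; the expensive step only runs when the source is newer than the cached copy.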
Posted: 16 Dec 2006, 02:31
by karmazilla
Watermelon wrote:
I will try to experiment with xml a bit if no one else is doing that (actually gerard_ already did an xml version of 'repair.txt'). At least it's vastly better than that scanf...
Beware that I revised his repair.xml file:
https://mail.gna.org/public/warzone-dev ... 00208.html
Also worth reading is his reply:
https://mail.gna.org/public/warzone-dev ... 00210.html
I'm not 100% sure of the requirements for this particular xml format, but I can do an XSD Schema that validates my version of repair.xml, once the "techlevel-ALL" issue is fixed.
Posted: 16 Dec 2006, 05:05
by Watermelon
kage wrote:
i believe they are used by a parser generator to create a data file parser for warzone.
indeed, it's much more human readable, but likely takes over 50x longer to parse, and if you are parsing data files in xml format, i strongly suggest you use sax, as building a full xml tree (dom) in memory will be entirely worthless in this case, and will eat up anywhere from 100 - 1000 times more memory than the scanf method.
as said, very strong advantages, and very strong disadvantages as well -- even the original designers of xml take the position that xml is being used wrongly in many cases, and that it was never designed to be a "best option every time" format.
what i might suggest, is use xml for authoring, and provide free, multi-platform tools for converting either way between the xml that modders use, and some plaintext/binary format that warzone uses: that way modders are happy and can work with greater efficiency, and warzone can still load a map in 5 seconds (as opposed to 20). also, if modders forget to convert their xml files to the native format, warzone would convert, using this same external tool, and cache those files for future use, doing a simple timestamp check on the mod as a whole to determine cache freshness.
I think any format that is easy to read/edit should be fine. We can keep bison/flex, then just replace the scanf's in the LoadSomeStats functions (which, I think, load/break the parsed data from bison/flex into memory) with an xml read/parse function. Maybe we should use a custom plain text format like other games use if xml is too slow, though parsing performance is not a major concern as long as the files are only loaded into memory once during initialization.
Posted: 16 Dec 2006, 05:53
by Troman
I have to admit I have never tried parsing XML files with bison before, but from my experience with bison I think it should be a piece of cake to write a bison/flex parser to parse xml files. I doubt it would take 50x longer to parse; bison/flex are real fast and the grammar rules are simple for XML.
I think we shouldn't introduce any new libraries for XML parsing. Bison/flex or lua, which Devurandom is going to use at some point anyway, should be more than enough (I'm not familiar with lua though, and not sure if it can be used to parse xml files; maybe Devurandom can clarify this).
That said, I must add that I probably won't have time to do that. I don't know if anyone else is up to writing a Bison/Flex parser, if we are to use Bison/Flex for parsing of course.
EDIT: the way we are doing it now we'd probably have to introduce a new parser for each txt file format there is (templates.txt, weapons.txt etc), we would end up with 50+ more c/h files (plus y/l files). I think we could combine all parsers into 1-3 files, since the parsers should be real tiny for each txt file.
Posted: 16 Dec 2006, 06:38
by lav_coyote25
hmmmm... k - question!
these txt files - weapons etc - would/could they be incorporated into what is being attempted ( tech tree as part of the gui ) ??
as i said it's just a question... needed to be asked. ;D
Posted: 16 Dec 2006, 13:08
by karmazilla
Troman wrote:
I think we shouldn't introduce any new libraries for XML parsing, bison/flex or lua, which Devurandom is going to use at some point anyway (i'm not familiar with it though, not sure if it can be used to parse xml files, maybe Devurandom can clarify this), should be more than enough.
That said I must add that I probably won't have time to do that, I don't know if anyone else is up to writing a Bison/Flex parser, if we are to use Bison/Flex for parsing of course.
I cringe at the idea of a homegrown XML parser.

libxml is a very small dependency - I'm looking at the ubuntu package right now, and libxml1 only depends on libc6 and zlib1g. Besides, because of the SGML legacy in XML, there are some very hairy syntax rules in XML, like DTDs, entity resolving, xml:id and CDATA sections. Plus, if we're going to create XSD Schemas and validate against them, then libxml might (I'm not entirely sure) have some functionality to solve that.
And, if performance in XML parsing is an issue, then wouldn't you expect the lads and lassies behind libxml to know a thing or two about it?
Point of the matter is: we don't need to reinvent the wheel and roll our own XML parser.
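As a small illustration of the "hairy syntax" point, here is a sketch using Python's standard xml.etree.ElementTree (which sits on top of expat) rather than libxml; the element and attribute names are made up. An existing parser already resolves entity references and unwraps CDATA sections, two cases a quick homegrown grammar would have to handle by hand.

```python
import xml.etree.ElementTree as ET

# Made-up sample: an entity reference in an attribute and a CDATA section as content.
SAMPLE = '<stat name="A &amp; B"><![CDATA[x < y]]></stat>'

elem = ET.fromstring(SAMPLE)
name = elem.get("name")  # the parser resolves &amp; to a plain ampersand
text = elem.text         # the CDATA contents come back as ordinary text
```

This builds a small tree, so it is only meant to show what the parser handles, not how the game should load large stat files.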
Posted: 16 Dec 2006, 14:31
by Kamaze
My magic word: Caching!
XML == Raw Data