Stats data format I wish to have.

aubergine
Professional
Posts: 3459
Joined: 10 Oct 2010, 00:58

Re: Stats data format I wish to have.

Post by aubergine »

In terms of RAM consumption whilst C++ parses XML, SAX parsers are very efficient.
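
A minimal sketch of why SAX-style parsing is light on memory: the handler below sees one element at a time and keeps only running totals, so no document tree is ever built. It uses Python's standard `xml.sax` for brevity rather than the game's C++; the `<weapon>` element and its `damage` attribute are invented for the example and are not the real stats layout.

```python
import xml.sax


class WeaponCounter(xml.sax.ContentHandler):
    """Streams through the document, keeping only running totals in memory."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.total_damage = 0

    def startElement(self, name, attrs):
        # Called once per opening tag; nothing is retained after it returns.
        if name == "weapon":
            self.count += 1
            self.total_damage += int(attrs.get("damage", "0"))


# Hypothetical stats layout, invented for this example.
SAMPLE = b"""<stats>
  <weapon id="Cannon1" damage="30"/>
  <weapon id="Cannon2" damage="45"/>
  <weapon id="MG1" damage="10"/>
</stats>"""


def summarise(xml_bytes):
    handler = WeaponCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.count, handler.total_damage


print(summarise(SAMPLE))
```

Memory use here depends on what the handler chooses to keep, not on the size of the file.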

And as for forum attachment size limits, XML files compress very well with zip compression. The same applies when they are inside .wz files: the resulting file will be tiny.

And, just as we have an editor for maps, an editor for XML files would make editing stats, etc., very clean and easy... and *reliable*.
"Dedicated to discovering Warzone artefacts, and sharing them freely for the benefit of the community."
-- https://warzone.atlassian.net/wiki/display/GO
iap
Trained
Posts: 244
Joined: 26 Sep 2009, 16:08

Re: Stats data format I wish to have.

Post by iap »

XML Marker takes the data in the XML and presents it as a table. Very easy; I work with it all the time.
With good planning of the data structure, it can be used as a very friendly stat editor.

http://symbolclick.com/
(The old version is free: http://symbolclick.com/xmlmarker_1_1_setup.exe )
aubergine
Professional
Posts: 3459
Joined: 10 Oct 2010, 00:58

Re: Stats data format I wish to have.

Post by aubergine »

Yup, there's a plethora of tools available for working with XML, and it's also very easy to create HTML interfaces, so I imagine we'd quickly gain a lot of innovative editors.
Iluvalar
Regular
Posts: 1828
Joined: 02 Oct 2010, 18:44

Re: Stats data format I wish to have.

Post by Iluvalar »

Some players already take 20 seconds, or even so long that I need to restart the game, when I host a mod (a 300-400k map pack).

The human readability is already good enough. In the last year I hosted more than 100 mods, please... Do not increase the size of the stats files for the sake of human readability, it's useless.
Heretic 2.3 improver and proud of it.
Duha
Trained
Posts: 287
Joined: 25 Mar 2012, 20:05
Location: SPb, Russia

Re: Stats data format I wish to have.

Post by Duha »

Iluvalar wrote:Some players already take 20 seconds, or even so long that I need to restart the game, when I host a mod (a 300-400k map pack).
That looks like a game/server problem. Maybe storing maps on an external server would help (GitHub, for example :)).
Iluvalar wrote:The human readability is already good enough. In the last year I hosted more than 100 mods, please... Do not increase the size of the stats files for the sake of human readability, it's useless.
If a person wants to do something, nothing stops them. I have seen a lot of mods created from far less readable data and without any documentation. It just takes more time.

IMHO:
The data is not well structured; there is room to improve it.
The data is not documented.
The parser is not DRY (it repeats itself).
Error messages are not informative in certain cases.
http://addons.wz2100.net/ developer
milo christiansen
Regular
Posts: 749
Joined: 02 Jun 2009, 21:23
Location: Perrinton Michigan

Re: Stats data format I wish to have.

Post by milo christiansen »

What part of "text compresses really well" do you not get? Just asking.
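
To put a rough number on that, here is a small sketch that compresses some repetitive, stats-like XML with `zlib` (DEFLATE, the method zip archives normally use). The generated data is entirely made up, so the exact ratio says nothing about real stats files; it only shows how the claim can be checked.

```python
import zlib

# Hypothetical, repetitive stats-like XML; real files will differ.
rows = "".join(
    f'<weapon id="W{i}" damage="{10 + i % 7}" range="{500 + i % 3 * 100}"/>\n'
    for i in range(2000)
)
raw = f"<stats>\n{rows}</stats>\n".encode("utf-8")

# Level 9 is zlib's strongest (and slowest) compression setting.
packed = zlib.compress(raw, 9)

ratio = len(packed) / len(raw)
print(f"{len(raw)} bytes -> {len(packed)} bytes ({ratio:.1%} of original)")
```

Running the same measurement on an actual mod's files would be the honest way to settle the size argument.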
In general, if you see glowing, pulsating things in the game, you should click on them.
- Demigod Game Manual
Reg312
Regular
Posts: 681
Joined: 25 Mar 2011, 18:36

Re: Stats data format I wish to have.

Post by Reg312 »

I think some of the problems with the CSV format can be solved by "strong" column naming:
check the names of the columns rather than their order.
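
A minimal sketch of that idea, assuming the first CSV row carries the column names: fields are looked up by header name, so reordering or adding columns does not break the loader. The column names (`id`, `damage`, `range`, `armour`) are hypothetical, not the game's real ones.

```python
import csv
import io

# Hypothetical column names; the same data with columns in different orders.
OLD_LAYOUT = "id,damage,range\nCannon1,30,700\n"
NEW_LAYOUT = "range,id,armour,damage\n700,Cannon1,5,30\n"

REQUIRED = {"id", "damage", "range"}


def load_weapons(text):
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # Look fields up by header name, never by position.
    return {
        row["id"]: {"damage": int(row["damage"]), "range": int(row["range"])}
        for row in reader
    }


print(load_weapons(OLD_LAYOUT) == load_weapons(NEW_LAYOUT))
```

Unknown extra columns are simply ignored here; a stricter loader could warn about them instead.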
aubergine wrote:Yup, there's a plethora of tools available for working with XML and it's also very easy to create HTML interfaces so I imagine we'd quickly gain a lot of innovative editors.
What is the problem with working with CSV? For me it's simpler than XML :)
Per wrote: In any case, my reason for wanting to change format was merely because CSV is almost impossible to read changes in (since diffs touch so many values at once)
Maybe the problem is a kind of "unwise" diff? Stats are not lines of code that need to be checked with diffs.
Are you afraid that a modder can say "I changed the cannon's price" while making many more changes and going uncaught? :P
Last edited by Reg312 on 17 Apr 2012, 19:01, edited 1 time in total.
aubergine
Professional
Posts: 3459
Joined: 10 Oct 2010, 00:58

Re: Stats data format I wish to have.

Post by aubergine »

XML brings benefits like increased portability, XSLT (data translations), DTDs (data validation), etc.

In terms of editing, a CSV file is just a bunch of bits in a data file. You open it in a text editor and you get a bunch of text with lots of commas; you open it in a spreadsheet and you get a tabular view. The same sort of thing applies to XML: the editor defines what view you get of the data. So you might want to open it in a basic text editor and just see the raw XML, or in an XML editor and see a more table/forms-based view, or in an HTML page with some JavaScript that gives a totally custom interactive view of the data. All these attempts to say that XML editing is somehow harder are invalid IMHO; it depends on the editor. And as I've mentioned above, once the data exists in XML format I think we'll see lots of community-contributed editors.
vexed
Inactive
Posts: 2538
Joined: 27 Jul 2010, 02:07

Re: Stats data format I wish to have.

Post by vexed »

Per wrote:I hope we are talking about readability, not mere size. In which case, XML is a horrible choice. JSON is somewhat better. ini, I think, is the most easily readable format. We also have a parser already. Hammer, meet nail.

In any case, my reason for wanting to change format was merely because CSV is almost impossible to read changes in (since diffs touch so many values at once), and almost impossible to change (adding / removing columns breaks compatibility, which is bad). And I really want to make some changes.
Sorry, you bent the nail.

We had people in the forums make a few tools to edit the stats (two Windows-only, and one cross-platform tool made recently), and going to ini without tool support was a hasty, premature decision.
It is now much more of a PITA to edit said stats than it was using CSV + tools (or, if you were not on Windows, using a spreadsheet or the cross-platform stats editor).
As for the diffs, I don't know about Linux, but there are visual diff programs that show you exactly what was changed on a given line, so I don't think this is a valid argument at all. Stop using the command line :wink:
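
For anyone without such a tool, a field-level comparison is easy to sketch. The snippet below is not any particular diff program; it is a hypothetical helper that pairs rows by an assumed `id` column and reports which named fields changed. Rows that were added or removed are deliberately ignored to keep it short.

```python
import csv
import io

# Hypothetical stats rows before and after a mod edit.
BEFORE = "id,damage,range,price\nCannon1,30,700,125\nMG1,10,500,50\n"
AFTER = "id,damage,range,price\nCannon1,30,700,150\nMG1,12,500,50\n"


def field_changes(old_text, new_text, key="id"):
    """Return (row key, column, old value, new value) for every changed field."""
    old = {r[key]: r for r in csv.DictReader(io.StringIO(old_text))}
    new = {r[key]: r for r in csv.DictReader(io.StringIO(new_text))}
    changes = []
    for row_id in sorted(old.keys() & new.keys()):
        for column, old_value in old[row_id].items():
            new_value = new[row_id].get(column)
            if new_value != old_value:
                changes.append((row_id, column, old_value, new_value))
    return changes


for change in field_changes(BEFORE, AFTER):
    print(change)
```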

That is only part of the problem.
The way Qt works with ini blows chunks: it is needlessly hard to find out exactly where you are in the ini file without decoding all the information beforehand or inserting more code to track your position. Before, all you needed to do was set a breakpoint in the debugger, and you got the whole buffer and the line you were on. (Note, I am not just talking about stats here; I mean other things that got the ini treatment as well.)
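
To illustrate the kind of position information that is missing, here is a sketch of a hand-rolled INI reader that carries the line number along with every setting. This is not how Qt's reader works and not code from the game; the section and key names are made up.

```python
def iter_ini(text):
    """Yield (line number, section, key, value) for each setting in INI text."""
    section = None
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith((";", "#")):
            continue  # blank line or comment
        if stripped.startswith("[") and stripped.endswith("]"):
            section = stripped[1:-1]
            continue
        key, sep, value = stripped.partition("=")
        if not sep:
            raise ValueError(f"line {lineno}: expected key=value, got {stripped!r}")
        yield lineno, section, key.strip(), value.strip()


# Hypothetical file contents; the key names are made up.
SAMPLE = """[Body1]
armour = 10

[Body2]
armour = twelve
"""

for lineno, section, key, value in iter_ini(SAMPLE):
    if not value.isdigit():
        print(f"line {lineno}: [{section}] {key} = {value!r} is not a number")
```

With the line number available at the point of failure, an error message or a breakpoint can point straight at the offending entry.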

Switching to XML or BSON/JSON or whatever else *must* come with tool support, and we need to check how the libraries are made, how they will integrate into the codebase, and what the debugging options are.


Here is an example of trying to debug ini ... try to figure out where we are in the ini file...

Code:

QStringList list = ini.childGroups();
...
-		list	{...}	QStringList
-		QList<QString>	{p={...} d=0x02ac2388 }	QList<QString>
-		p	{d=0x02ac2388 }	QListData
-		d	0x02ac2388 {ref={...} alloc=166 begin=0 ...}	QListData::Data *
-		ref	{_q_value=1 }	QBasicAtomicInt
		_q_value	1	volatile long
		alloc	166	int
		begin	0	int
		end	166	int
		sharable	1	unsigned int
-		array	0x02ac239c	void * [1]
		[0]	0x02abb670	void *
-		d	0x02ac2388 {ref={...} alloc=166 begin=0 ...}	QListData::Data *
-		ref	{_q_value=1 }	QBasicAtomicInt
		_q_value	1	volatile long
		alloc	166	int
		begin	0	int
		end	166	int
		sharable	1	unsigned int
-		array	0x02ac239c	void * [1]
		[0]	0x02abb670	void *
/facepalm ...Grinch stole Warzone🙈🙉🙊 contra principia negantem non est disputandum
Super busy, don't expect a timely reply back.
stiv
Warzone 2100 Team Member
Posts: 876
Joined: 18 Jul 2008, 04:41
Location: 45N 86W

Re: Stats data format I wish to have.

Post by stiv »

aubergine wrote:a CSV file is just a bunch of bits in a data file.
That may be the silliest thing said here so far.

As I understand it, the stats were originally in an Access(tm) database (low-rent Microsoft relational database). What we see as the "stats files" are simply an export of the database tables in a format that was easy for the application to use. This is why you see all those duplicated fields in the files that make editing and maintenance such a pain.

Rather than repeatedly changing the file format in search of a silver bullet, it would make more sense to simply suck them all back into a database where they could be maintained properly and exported for the game as needed. This does not require any code changes in the game.
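
A rough sketch of that workflow, assuming an embedded SQLite database as the master copy: shared values are stored once, and a join flattens them into the CSV the game would load. The schema, table names and columns are all invented for the example.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: shared values live in one place instead of being
# duplicated on every row of an exported file.
conn.executescript("""
    CREATE TABLE weapon_class (name TEXT PRIMARY KEY, reload INTEGER);
    CREATE TABLE weapon (
        id TEXT PRIMARY KEY,
        kind TEXT REFERENCES weapon_class(name),
        damage INTEGER
    );
    INSERT INTO weapon_class VALUES ('cannon', 40);
    INSERT INTO weapon VALUES ('Cannon1', 'cannon', 30);
    INSERT INTO weapon VALUES ('Cannon2', 'cannon', 45);
""")


def export_csv(connection):
    """Flatten the tables into the kind of CSV the game could load."""
    cursor = connection.execute("""
        SELECT w.id AS id, w.kind AS kind, w.damage AS damage, c.reload AS reload
        FROM weapon AS w JOIN weapon_class AS c ON c.name = w.kind
        ORDER BY w.id
    """)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor)
    return out.getvalue()


print(export_csv(conn))
```

Changing the reload value for every cannon is then a single update in `weapon_class`, and the duplicated column only exists in the exported file.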

(historical note: an attempt was made to directly access the stats from a database, but it proved a bridge too far, given that only one person was working on the task)
aubergine
Professional
Posts: 3459
Joined: 10 Oct 2010, 00:58

Re: Stats data format I wish to have.

Post by aubergine »

@stiv - whatever format we choose, it's just bits in a file. It's an obvious assertion, but one I feel needs to be made in light of some of the crazy reasoning in this topic so far. There needs to be some human-friendly editor regardless of file/data format, otherwise it will be a PITA to work with. Even so-called "easy to use" ini files can be a PITA to edit: there are still virtually no docs on the param names or what values they can take (I wrote up some notes for body.ini with help from others), and there's still some confusion (especially for new modders) about which files to edit and where to find certain settings.

XML format is easy to debug, has widely adopted parsers and editors, is extremely reliable and portable to other formats (e.g. a DTD clearly defines what data types each setting can be and allows automated testing/validation of data, while things like XSLT allow quick translation of data to other formats), is fast (e.g. with a SAX parser), and would be significantly easier to write new editors for because it's such a widely adopted format.

There's a reason formats like XML and JSON have become popular on the internet - it's because they are much better formats for working with than old-school approaches like CSV and INI.

XML can be thought of as a text-based database format: with a DTD you get to define records, fields and value specifications. Unlike a traditional database, you don't need to set up a database server/engine to store the data though. For modders, having to set up a db of a specific type is going to be more hassle, IMHO, than loading an XML file into a custom editor.
cybersphinx
Inactive
Posts: 1695
Joined: 01 Sep 2006, 19:17

Re: Stats data format I wish to have.

Post by cybersphinx »

aubergine wrote:Unlike a traditional database, you don't need to set up a database server/engine to store the data though. For modders, having to set up a db of a specific type is going to be more hassle, IMHO, than loading an XML file in to a custom editor.
A database could be done with sqlite, which would just be embedded into any editors. No need for a separate database server, though the data could be pushed into one as well e.g. to generate dynamic web pages.
We want information... information... information.
aubergine
Professional
Posts: 3459
Joined: 10 Oct 2010, 00:58

Re: Stats data format I wish to have.

Post by aubergine »

But by doing that you instantly limit the scope for editors to using tools that can support sqlite. Anything that doesn't support sqlite gets ruled out of the equation.
Emdek
Regular
Posts: 1329
Joined: 24 Jan 2010, 13:14
Location: Poland

Re: Stats data format I wish to have.

Post by Emdek »

For me it's mostly a choice between SQLite and INI files (in this case two levels deep; JSON files would be very similar), as those are probably the easiest for most people to edit by hand.
The main issue with database solutions is that you can't quickly edit them without additional tools...
But in the case of SQLite it would be extremely easy to create a basic editor in a few lines of code using Qt. ;-)
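
The "two levels deep" point can be seen by loading an INI file into a plain section -> key -> value mapping, which is exactly the shape a simple JSON file would have. A minimal sketch using Python's standard `configparser`; the section and key names are invented.

```python
import configparser
import json

# Hypothetical stats entries; section and key names are invented.
INI_TEXT = """
[Cannon1]
damage = 30
range = 700

[MG1]
damage = 10
range = 500
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Two levels: section -> key -> value. configparser keeps values as strings.
as_dict = {section: dict(parser[section]) for section in parser.sections()}

print(json.dumps(as_dict, indent=2))
```

The one visible difference is typing: the INI values arrive as strings, whereas JSON could store the numbers as numbers.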
Nadszedł już czas, najwyższy czas, nienawiść zniszczyć w sobie.
The time has come, the high time, to destroy hatred in oneself.


Beware! Mad Qt Evangelist.
Duha
Trained
Posts: 287
Joined: 25 Mar 2012, 20:05
Location: SPb, Russia

Re: Stats data format I wish to have.

Post by Duha »

Emdek wrote:For me it's mostly a choice between SQLite and INI files (in this case two levels deep; JSON files would be very similar), as those are probably the easiest for most people to edit by hand.
The main issue with database solutions is that you can't quickly edit them without additional tools...
But in the case of SQLite it would be extremely easy to create a basic editor in a few lines of code using Qt. ;-)
SQLite is bad for git. It can't be diffed (a single change replaces the whole file).
But you can keep dumped data in your git.

SQLite does not have data validation. You can't tell it that this integer should be from -90 to 90, or that a weapon type must be one of a set of defined words. A DTD can.
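
On the validation point, SQLite does accept `CHECK` constraints in a table definition, so both rules can be enforced if the schema declares them; nothing is checked unless you write the constraint yourself. For comparison, a plain DTD can enumerate allowed attribute values but cannot express a numeric range. A small sketch, with a hypothetical table and made-up allowed words:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; column names and the allowed words are invented.
conn.execute("""
    CREATE TABLE weapon (
        id TEXT PRIMARY KEY,
        pitch INTEGER NOT NULL
            CHECK (typeof(pitch) = 'integer' AND pitch BETWEEN -90 AND 90),
        kind TEXT NOT NULL CHECK (kind IN ('cannon', 'rocket', 'mg'))
    )
""")


def try_insert(row):
    """Return True if SQLite accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO weapon VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(("Cannon1", 45, "cannon")))   # within both constraints
print(try_insert(("Cannon2", 120, "cannon")))  # pitch outside -90..90
print(try_insert(("Laser1", 0, "laser")))      # kind not in the allowed list
```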