Re: Future plans for GUI
Posted: 12 Jan 2015, 04:43
Will it be easy or just not that hard to implement ready to use libs as Ogre3d engine does?
MaNGusT wrote: Will it be easy or just not that hard to implement ready to use libs as Ogre3d engine does?
I think we first need to restructure the graphics code so that it works like modern graphics code should. Then it will be possible to use a third-party library. But I do not have much experience using these graphics engines, so I might be wrong.
Per wrote: Also I have little idea which of them would be a good fit for us. It would need to be easily multiplatform, and work well with SDL and Qt.
CEGUI is in Debian and MXE (though both packages are somewhat outdated); I think it's the only option if we don't want to make our 3rdparty directory larger. It has also been around for quite a while (so it is unlikely to be abandoned soon), and its features and tool support look good (on paper/screen).
wuz21m wrote: You see these rotating models "Angry Python with Guns" on the Manufacture Menu? The matrix model is transformed, the model is actually redrawn every frame. The font rendering code? It transforms the code into UCS4, picks the glyphs (and measures the length), transforms the view matrix and renders the damn thing every frame (thus I get ~16% CPU usage just staring at the title screen). A correct approach should just use a texture and then re-use it every frame.
I agree with your post for the most part. I just want to add that it is a bit worse than this. The original graphics code involved the CPU in drawing every vertex in every frame. For models, that is now fixed: meshes are stored on the GPU and drawn as a whole with a few calls to the GPU. In my gfxqueue branch, I am extending this approach to everything, including fonts, lines, etc., which means that whenever they are unchanged, they are drawn again from attributes already stored on the GPU. I don't think that writing things into textures gains a lot over this. The big jump in performance should come from getting the CPU out of every vertex.
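To make the "reuse whatever is unchanged" idea concrete, here is a minimal sketch in Python rather than the game's own code. Everything in it is hypothetical: `FakeGpu` is a stand-in that only counts uploads and draw calls, and `CachedDrawable` is an illustrative name, not anything from the gfxqueue branch. It assumes that a dirty flag is enough to decide when vertex data must be sent again.

```python
class FakeGpu:
    """Hypothetical stand-in for a graphics API; it only counts operations."""

    def __init__(self):
        self.uploads = 0      # how many times vertex data was sent over
        self.draw_calls = 0   # how many draw commands were issued
        self.buffers = {}     # key -> vertex data "stored on the GPU"

    def upload(self, key, vertices):
        self.uploads += 1
        self.buffers[key] = list(vertices)

    def draw(self, key):
        self.draw_calls += 1
        return len(self.buffers[key])


class CachedDrawable:
    """Hypothetical drawable that re-uploads its vertices only after a change."""

    def __init__(self, gpu, key, vertices):
        self.gpu = gpu
        self.key = key
        self.vertices = list(vertices)
        self.dirty = True  # nothing is on the GPU yet

    def set_vertices(self, vertices):
        vertices = list(vertices)
        if vertices != self.vertices:
            self.vertices = vertices
            self.dirty = True  # stored copy is stale

    def draw(self):
        if self.dirty:
            # The CPU touches the vertex data only on this path.
            self.gpu.upload(self.key, self.vertices)
            self.dirty = False
        return self.gpu.draw(self.key)


def render_frames(drawable, frames):
    """Draw the same object for a number of frames."""
    for _ in range(frames):
        drawable.draw()


if __name__ == "__main__":
    gpu = FakeGpu()
    title = CachedDrawable(gpu, "title-text", [(0, 0), (1, 0), (1, 1)])
    render_frames(title, 60)
    print(gpu.uploads, gpu.draw_calls)
```

In this sketch an unchanged object costs one draw call per frame and no per-vertex CPU work, which is the property described above; a real renderer would also have to handle buffer lifetime and state changes, which are left out here.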
Google profiler tools
I think the CPU usage stats we get from Valgrind are on track.
Code:
$ pprof --text src/warzone2100 /tmp/mybin.prof
Using local file src/warzone2100.
Using local file /tmp/mybin.prof.
Removing __funlockfile from all stack traces.
Total: 2231 samples
893 40.0% 40.0% 893 40.0% __memset_sse2
155 6.9% 47.0% 155 6.9% _init@750
131 5.9% 52.8% 131 5.9% _init@38f98
117 5.2% 58.1% 117 5.2% __ioctl
101 4.5% 62.6% 101 4.5% __driDriverGetExtensions_i965
33 1.5% 64.1% 33 1.5% drm_intel_bufmgr_fake_init
31 1.4% 65.5% 31 1.4% inflateBackEnd
26 1.2% 66.7% 26 1.2% tx_compress_dxtn
21 0.9% 67.6% 21 0.9% png_set_read_user_transform_fn
18 0.8% 68.4% 19 0.9% atmosUpdateSystem
17 0.8% 69.2% 17 0.8% edgeLessThan
16 0.7% 69.9% 16 0.7% __driDriverGetExtensions_i915
16 0.7% 70.6% 16 0.7% __fsync_nocancel
16 0.7% 71.3% 27 1.2% drawTerrain
15 0.7% 72.0% 15 0.7% __GI___poll
15 0.7% 72.7% 15 0.7% __GI___pthread_mutex_lock
14 0.6% 73.3% 14 0.6% __memcpy_avx_unaligned
13 0.6% 73.9% 15 0.7% atmosDrawParticles
10 0.4% 74.3% 27 1.2% locateMouse
wuz21m wrote: I have read on the forums that plans were underway to port WZ2100 to QT 5. I know we are already using QT 5 to build Warzone2100.
Kinda. Yes, we have moved to Qt5, but moving everything to Qt isn't yet possible because of performance issues.
wuz21m wrote: I know that we are using QT 5 extensively and a map editor is already using QT 5.
There are no working editors that I know of that use Qt.
wuz21m wrote: But what about the game frontend and in-game widgets? My searches didn't turn up any conclusive results. The current solution performs many re-draws and is probably a major performance bottleneck. Are there any plans in place to do something about this?
Everything done in WZ is the brute force approach, and that does have some advantages.
NoQ wrote:
- For graphics engine optimizations, you'd better look for another thread; a lot of such discussions were happening in ArtRevolution threads, and in fact a lot of new optimizations were already introduced in -master. Also, you may want to synchronize your performance analysis with [2].
- Ah, valgrind. It introduces a huge overhead, but then works around it; the numbers it displays are not exactly time, but rather the number of emulated processor instructions, though it's still pretty accurate. Things to be aware of:
  - It doesn't take sleep times into account, which means that you really don't know the absolute CPU usage values for the functions you found. It may be "10% of almost-nothing", and after adding 10-20 units on the board it may reduce to a negligible 1%. In any case, it is much more relevant to profile performance-critical situations, like with 1500 units on board (the limit we say we support in a 10-player game), especially when it comes to tools that discard sleep time, like valgrind or perf (in its out-of-the-box variant).
  - Time-based application logic may be skewed by the large overhead. Performance statistics for the game running @60fps and @3fps are completely different, even if the same things are happening. It might be that console redraws (or anything else) happen at game frame rather than at render frame.
  - As far as I remember, there was a way to arbitrarily enable and disable collecting statistics (not instrumentation, of course) in valgrind to avoid mixing in statistics from the start menu, not sure if you used it.
The procsystime script captures and prints the system call time usage for a given process name:
Code:
# /usr/share/dtrace/toolkit/procsystime -n warzone2100
Tracing... Hit Ctrl-C to end...
dtrace: 12797 dynamic variable drops with non-empty dirty list
dtrace: 13867 dynamic variable drops with non-empty dirty list
^C
Elapsed Times for processes warzone2100,
SYSCALL TIME (ns)
writev 105663
write 1600391
select 3821487
getpid 4151995
sendto 4205746
read 4522322
recvmsg 5857182
sched_yield 15857375
ioctl 16124479
nanosleep 25845524999
_umtx_op 26613797190
poll 41294596003
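As a rough way to read the listing above in light of the point about sleep time, the following Python sketch sums the per-syscall times and reports how much of the total falls into calls that mostly wait. The function names are made up for illustration, and the choice of which syscalls count as "waiting" (`nanosleep`, `_umtx_op`, `poll`, `select`) is my assumption, not something procsystime reports.

```python
# Rows copied from the procsystime listing above (elapsed time in ns).
SAMPLE = """\
SYSCALL TIME (ns)
writev 105663
write 1600391
select 3821487
getpid 4151995
sendto 4205746
read 4522322
recvmsg 5857182
sched_yield 15857375
ioctl 16124479
nanosleep 25845524999
_umtx_op 26613797190
poll 41294596003
"""

# Assumption: these calls spend their time sleeping or waiting, not working.
WAITING = {"nanosleep", "_umtx_op", "poll", "select"}


def parse_procsystime(text):
    """Return {syscall: nanoseconds} from 'name value' rows, skipping the rest."""
    times = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            times[parts[0]] = int(parts[1])
    return times


def waiting_share(times, waiting=WAITING):
    """Fraction of the summed syscall time spent in the 'waiting' calls."""
    total = sum(times.values())
    if total == 0:
        return 0.0
    waited = sum(ns for name, ns in times.items() if name in waiting)
    return waited / total


if __name__ == "__main__":
    times = parse_procsystime(SAMPLE)
    print(f"waiting share of syscall time: {waiting_share(times):.4%}")
    for name, ns in sorted(times.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:12s} {ns / 1e9:10.3f} s")
```

Under that assumption nearly all of the measured syscall time is waiting, so the remaining calls such as `ioctl` and `sched_yield` are the ones worth comparing against the profiler output.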
Code:
The game parameter requires one of the following keywords:CAM_1A, CAM_2A, CAM_3A, TUTORIAL3, or FASTPLAY.
root@System-Product-Name:/home/test/war-test# src/warzone2100 --game TUTORIAL3
Unrecognized option: TUTORIAL3
Code:
perf record -F 99 -p 16323 -g -- sleep 60
perf script | ./stackcollapse-perf.pl > out.perf-folded
./flamegraph.pl out.perf-folded > perf.svg