Page 2 of 2

Re: My options for improving code speed.

Posted: 06 Apr 2019, 23:04
by moltengear
Cyp wrote: 06 Apr 2019, 20:02 Might be fairer to compare to sqrtf than to sqrt. I'd naïevely guess sqrtf to be faster and more accurate than Q_rsqrt.

I'm not sure square root calculations are the main bottleneck in the game.
Remade! There was a difference of 10 times. Now more than 13 times. So it works!

Re: My options for improving code speed.

Posted: 07 Apr 2019, 08:59
by Cyp
faster ≠ works

Changing slightly, so it could actually compile:

Code: Select all

// g++ -O3 -march=native -fno-strict-aliasing -DQ -s -o sqrtf sqrtf.cpp -Wall -Wextra -fno-math-errno

#include <cmath>
#include <iostream>


float Q_rsqrt(float number)
{
	int32_t i;
	float x2, y;
	const float threehalfs = 1.5F;

	x2 = number * 0.5F;
	y = number;
	i = *(int32_t *)&y;
	i = 0x5f3759df - (i >> 1);
	y = *(float *)&i;
	y = y * (threehalfs - (x2 * y * y));

	return y;
}

int main() {
	float a = 0, t = 0;
#ifdef Q
	for (int x = 0; x < 1000000000; x++)
	{
		a = Q_rsqrt(t);
		t = t + 0.01;
		float z = Q_rsqrt(a + t);
		a = z;
	}
#else
	for (int x = 0; x < 1000000000; x++)
	{
		a = 1.f/sqrtf(t);
		t = t + 0.01f;
		float z = 1.f/sqrtf(a + t);
		a = z;
	}
#endif
	std::cout << a << std::endl;
}
I do actually measure the Q_rsqrt version to be about 13% faster, but the final result doesn't exactly agree (gives 0.00194982 instead of 0.00195312). Even if it was infinitely faster, I don't think it would affect the game performance significantly.

——————————————

Note that without -fno-math-errno, the sqrtf version is about 133% slower. The game only uses errno in one place, so should probably be compiled with -fno-math-errno if that can be removed.

Re: My options for improving code speed.

Posted: 31 Aug 2019, 15:09
by moltengear
As far as I know, sound support is implemented in the same thread as the entire game code.
I wonder how much CPU time percentage spent on providing sound in the game?

Re: My options for improving code speed.

Posted: 31 Aug 2019, 15:43
by Per
moltengear wrote: 31 Aug 2019, 15:09 As far as I know, sound support is implemented in the same thread as the entire game code.
I wonder how much CPU time percentage spent on providing sound in the game?
Very little. Graphics, pathfinding and AI are the big consumers.

Re: My options for improving code speed.

Posted: 01 Sep 2019, 16:03
by moltengear
Yes, shadows are calculated on the processor.

Re: My options for improving code speed.

Posted: 31 Dec 2019, 00:53
by Jorzi
The fast square root approximation should only be used for graphics calculations where small shading errors don't make a difference. No one cares if the brightness of a pixel is 10% wrong, but calculating distance 10% wrong for pathfinding can mess up a lot of stuff.

The biggest graphical slowdown right now is shadows, since it loops through every visible transformed polygon (on the cpu) and uses it to generate a shadow volume mesh. This process is not easily multithreaded or transfered to the cpu and puts a tight limit on how many polygons can be on screen before the game brings even the most advanced CPU to its knees. The most obvious solution to this is to use something like a cascaded shadow map (or just a normal shadow buffer that somehow follows the camera). This is done on the gpu by simply adding another render pass, which is very cheap on modern gpus.

I think almost all of the slowdowns we have now have to do with unit count. Pathfinding with loads of units maneuvering around each other, AI control, targeting, decision making, projectiles. These all have to loop through all units in range and are tricky problems in any RTS. Some of these things could probably have a dedicated thread, splitting up calculations on multiple cores, but that is serious refactoring which requires serious skill. Another thing to optimise, which might already have been done, is making sure that you only loop through the most relevant units, by organising things in the correct data structures.
One possibility is, for example, to have the map tiles keep track of which units are currently on them, which allows you to quickly get a list of only the units that are in a specific relevant area.

Unfortunately I don't have the skills or time to try this myself, just getting some thoughts out there