
My options for improving code speed.

Posted: 28 Mar 2019, 15:42
by moltengear
It would be good to correct this code.
src/lighting.cpp, line 231

Code:

	dotProduct = glm::dot(normalise(finalVector), theSun_ForTileIllumination);

	val = abs(dotProduct) / 16;
	if (val == 0)
	{
		val = 1;
	}
	if (val > 254)  // unnecessary calculation   // 231 line
	{
		val = 254;
	}
	mapTile(tileX, tileY)->illumination = val;
}

static void colourTile(SDWORD xIndex, SDWORD yIndex, PIELIGHT light_colour, double fraction)
{
	PIELIGHT colour = getTileColour(xIndex, yIndex);

Code:

	if (val == 0)
	{
		val = 1;
	}
	else if (val > 254)  // 231 line
	{
		val = 254;
	}

Re: My options for improving code speed.

Posted: 28 Mar 2019, 15:46
by andrvaut
I think the compiler will turn both versions into identical instructions ...
(when using -O2 or a more aggressive optimization flag)
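To make the comparison concrete, here is a minimal sketch of the two spellings from the first post next to a third one built from std::min and std::max. The function names are hypothetical and are not part of src/lighting.cpp; the sketch only shows that all three produce the same value for every input, which is why an optimizing compiler is free to emit the same instructions for them.

```cpp
#include <algorithm>
#include <cassert>

// Original form: two independent checks.
unsigned clampTwoIfs(unsigned val)
{
	if (val == 0)
	{
		val = 1;
	}
	if (val > 254)
	{
		val = 254;
	}
	return val;
}

// Proposed form: the second check is skipped when the first one fires.
unsigned clampElseIf(unsigned val)
{
	if (val == 0)
	{
		val = 1;
	}
	else if (val > 254)
	{
		val = 254;
	}
	return val;
}

// Third spelling, shown only for comparison: clamp into [1, 254].
unsigned clampMinMax(unsigned val)
{
	return std::max(1u, std::min(val, 254u));
}

// Asserts that all three variants agree on every input in [0, limit].
void checkClampVariants(unsigned limit)
{
	for (unsigned v = 0; v <= limit; ++v)
	{
		assert(clampTwoIfs(v) == clampElseIf(v));
		assert(clampTwoIfs(v) == clampMinMax(v));
	}
}
```

Whether a given compiler really emits identical machine code still has to be checked in its assembly output; the sketch only establishes that the results are the same.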

Re: My options for improving code speed.

Posted: 28 Mar 2019, 17:30
by moltengear
src/lighting.cpp, line 182

Code:

static void calcTileIllum(UDWORD tileX, UDWORD tileY)
{
	/* The number or normals that we got is in numNormals*/
	Vector3f finalVector(0.0f, 0.0f, 0.0f);

	unsigned int numNormals = 0; // How many normals have we got?
	Vector3f normals[8]; // Maximum 8 possible normals

	/* Quadrants look like:-

				  *
				  *
			0	  *    1
				  *
				  *
		**********V**********
				  *
				  *
			3	  *	   2
				  *
				  *
	*/

	/* Do quadrant 0 - tile that's above and left*/
	normalsOnTile(tileX - 1, tileY - 1, 0, &numNormals, normals);

	/* Do quadrant 1 - tile that's above and right*/
	normalsOnTile(tileX, tileY - 1, 1, &numNormals, normals);

	/* Do quadrant 2 - tile that's down and right*/
	normalsOnTile(tileX, tileY, 2, &numNormals, normals);

	/* Do quadrant 3 - tile that's down and left*/
	normalsOnTile(tileX - 1, tileY, 3, &numNormals, normals);

	for (unsigned int i = 0; i < numNormals; i++)
	{
		finalVector = finalVector + normals[i];
	}
	/*   // same old code
	int dotProduct = 0;
	unsigned int val = 0;
	dotProduct = glm::dot(normalise(finalVector), theSun_ForTileIllumination);
	val = abs(dotProduct) / 16;
	*/
	unsigned int val = abs((int)glm::dot(normalise(finalVector), theSun_ForTileIllumination)) >> 4;

	if (val == 0)
	{
		val = 1;
	}
	else if (val > 254)
	{
		val = 254;
	}
	mapTile(tileX, tileY)->illumination = val;
}
I rewrote the function. I checked it and played with it, and there seem to be no errors. I use a shift instead of division and removed the local variables that are used only once.
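For reference, a small sketch of why the shift and the division are interchangeable here: once abs() has made the value non-negative, dividing by 16 and shifting right by 4 give the same result. The helper names below are hypothetical and only illustrate the arithmetic.

```cpp
#include <cassert>
#include <cstdlib>

// Division form, as in the old code.
unsigned int scaleByDivision(int dotProduct)
{
	return std::abs(dotProduct) / 16;
}

// Shift form, as in the rewritten code.
unsigned int scaleByShift(int dotProduct)
{
	return std::abs(dotProduct) >> 4;
}

// Asserts that both forms agree for every value in [-limit, limit].
void checkScaleVariants(int limit)
{
	for (int v = -limit; v <= limit; ++v)
	{
		assert(scaleByDivision(v) == scaleByShift(v));
	}
}
```

The equivalence depends on the abs() call. For negative signed values, division rounds toward zero while a right shift usually does not, so the two forms should not be swapped blindly elsewhere.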

Re: My options for improving code speed.

Posted: 28 Mar 2019, 17:41
by andrvaut
This needs a benchmark and an assembly listing.
GCC and other compilers do a great job of optimizing.
Your improvements are so straightforward that the compiler probably already applies them.

Re: My options for improving code speed.

Posted: 28 Mar 2019, 17:51
by moltengear
andrvaut wrote: 28 Mar 2019, 17:41 This needs a benchmark and an assembly listing.
GCC and other compilers do a great job of optimizing.
Your improvements are so straightforward that the compiler probably already applies them.
Unfortunately, I'm not proficient in assembly language.
So I am just testing the functions for speed.

Re: My options for improving code speed.

Posted: 28 Mar 2019, 17:57
by moltengear
In calculations, it is better to avoid intermediate local variables like the ones I removed, since copying a value twice is undesirable. It is also better to declare variables right where they are first used: the more local the variable, the better. And to help the compiler, mathematical expressions are best written on a single line. That is how I remember it from the books.

Re: My options for improving code speed.

Posted: 28 Mar 2019, 18:27
by andrvaut
Current compilers optimize executable code very aggressively, and they now do it better than people do.
So much better, in fact, that people have to make an effort to get the code to behave as intended (undefined behavior, strict aliasing and other non-trivial things).

Re: My options for improving code speed.

Posted: 28 Mar 2019, 18:33
by pastdue
@moltengear: How are you testing the functions for speed? I'd really encourage profiling an optimized/release build of the game to see where performance improvements are needed.

Re: My options for improving code speed.

Posted: 28 Mar 2019, 19:20
by Cyp
The books sound out of date. It's usually better to declare variables as local as possible to help humans reading the code, but it shouldn't make a difference for modern compilers.

Unless it's complex C++ variables which may implicitly allocate/free memory, in which case it can sometimes be slightly better to declare variables in a less local scope than needed, in order to avoid memory allocations.
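A minimal sketch of that exception, assuming a std::string that is rebuilt on every loop iteration. The function names and the "unit: " prefix are made up for illustration. Both versions return the same result; the second one reuses a single buffer, so its capacity can carry over between iterations instead of being allocated again.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Most local declaration: a fresh std::string is constructed and destroyed
// on every iteration, which may allocate each time for long contents.
std::size_t totalLengthLocal(const std::vector<std::string> &names)
{
	std::size_t total = 0;
	for (std::size_t i = 0; i < names.size(); ++i)
	{
		std::string label = "unit: " + names[i];
		total += label.size();
	}
	return total;
}

// Less local declaration: the string lives outside the loop and is
// overwritten in place, so an already allocated buffer can be reused.
std::size_t totalLengthHoisted(const std::vector<std::string> &names)
{
	std::size_t total = 0;
	std::string label;
	for (std::size_t i = 0; i < names.size(); ++i)
	{
		label.assign("unit: ");
		label += names[i];
		total += label.size();
	}
	return total;
}
```

Whether the hoisted version is measurably faster depends on the string lengths and the standard library in use, so it is something to confirm with a profiler rather than assume.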

Re: My options for improving code speed.

Posted: 06 Apr 2019, 17:23
by moltengear
Thanks!
I saw that a square root calculation is being used:
sqrt().
Could we use a faster method in this project? For example, the one from the Quake III code, where John Carmack applied it.

Code:

float Q_rsqrt( float number )
{
  long i;
  float x2, y;
  const float threehalfs = 1.5F;

  x2 = number * 0.5F;
  y  = number;
  i  = * ( long * ) &y;
  i  = 0x5f3759df - ( i >> 1 );
  y  = * ( float * ) &i;
  y  = y * ( threehalfs - ( x2 * y * y ) );

  #ifndef Q3_VM
  #ifdef __linux__
    assert( !isnan(y) );
  #endif
  #endif
  return y;
}
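One caveat about the listing above: it reads the float through a long pointer, which breaks strict aliasing rules and assumes that long is 32 bits wide (it is 64 bits on typical 64-bit Linux builds). Below is a hedged sketch of the same approximation using std::memcpy and std::uint32_t instead. The names rsqrtApprox and relativeError are hypothetical, and these changes are this sketch's choices, not part of the Quake III source.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Approximate 1/sqrt(number) for positive finite inputs.
float rsqrtApprox(float number)
{
	static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");

	const float x2 = number * 0.5f;
	float y = number;
	std::uint32_t i;

	std::memcpy(&i, &y, sizeof(i));     // copy the float's bits into an integer
	i = 0x5f3759dfu - (i >> 1);         // initial guess from the bit pattern
	std::memcpy(&y, &i, sizeof(y));     // copy the bits back into a float
	y = y * (1.5f - (x2 * y * y));      // one Newton-Raphson refinement step
	return y;
}

// Relative error of the approximation against 1.0f / std::sqrt(x).
float relativeError(float x)
{
	const float exact = 1.0f / std::sqrt(x);
	return std::fabs(rsqrtApprox(x) - exact) / exact;
}
```

The result is only an approximation, so any code that switches to it would need to tolerate a small relative error.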

Re: My options for improving code speed.

Posted: 06 Apr 2019, 17:29
by pastdue
moltengear wrote: 06 Apr 2019, 17:23 Thanks!
I saw that a square root calculation is being used:
sqrt().
Could we use a faster method in this project? For example, the one from the Quake III code, where John Carmack applied it.
Have you profiled the current performance? (By which I mean: Profiling the entire game running, to determine the current performance bottlenecks - not micro-benchmarks of individual functions.)

Re: My options for improving code speed.

Posted: 06 Apr 2019, 17:36
by moltengear
pastdue wrote: 06 Apr 2019, 17:29
moltengear wrote: 06 Apr 2019, 17:23 Thanks!
I saw that a square root calculation is being used:
sqrt().
Could we use a faster method in this project? For example, the one from the Quake III code, where John Carmack applied it.
Have you profiled the current performance? (By which I mean: Profiling the entire game running, to determine the current performance bottlenecks - not micro-benchmarks of individual functions.)
No. Is there a guide to profiling this game? Still, it would be good to increase the speed. It is unpleasant to put up with the slowdowns in multiplayer.

Re: My options for improving code speed.

Posted: 06 Apr 2019, 17:42
by pastdue
moltengear wrote: 06 Apr 2019, 17:36 No. Is there a guide to profiling this game?
If using Visual Studio, here's some information:
https://docs.microsoft.com/en-us/visual ... ew=vs-2017

Re: My options for improving code speed.

Posted: 06 Apr 2019, 18:25
by moltengear
In case anyone is interested in a comparison: Q_rsqrt vs sqrt.

Code:

#include <iostream>
#include <cmath>     // sqrt
#include <Windows.h>
using namespace std;

float Q_rsqrt(float number)
{
	long i;
	float x2, y;
	const float threehalfs = 1.5F;

	x2 = number * 0.5F;
	y = number;
	i = *(long *)&y;
	i = 0x5f3759df - (i >> 1);
	y = *(float *)&i;
	y = y * (threehalfs - (x2 * y * y));

#ifndef Q3_VM
#ifdef __linux__
	assert(!isnan(y));
#endif
#endif
	return y;
}

int main()
{
	float a;
	float t = 10.0f;

	LARGE_INTEGER tsc;
	__asm
	{
		cpuid
		rdtsc
		mov tsc.LowPart, eax
		mov tsc.HighPart, edx
	}

	for (int x = 0; x < 1000000000; x++)
	{
		a = Q_rsqrt(t);
		t = t + 0.01;
		float z = Q_rsqrt(a + t);
		a = z;
	}

	LARGE_INTEGER tsc2;
	__asm
	{
		cpuid
		rdtsc
		mov tsc2.LowPart, eax
		mov tsc2.HighPart, edx
	}

	cout << (tsc2.QuadPart - tsc.QuadPart) << "   " << a << endl;


	t = 10.0f;

	//LARGE_INTEGER tsc;
	__asm
	{
		cpuid
		rdtsc
		mov tsc.LowPart, eax
		mov tsc.HighPart, edx
	}

	for (int x = 0; x < 1000000000; x++)
	{
		a = float(1.0 / sqrt(t));
		t = t + 0.01;
		float z = float(1.0 / sqrt(a + t));
		a = z;
	}

	//LARGE_INTEGER tsc2;
	__asm
	{
		cpuid
		rdtsc
		mov tsc2.LowPart, eax
		mov tsc2.HighPart, edx
	}

	cout << (tsc2.QuadPart - tsc.QuadPart) << "  "<< a << endl;


	char pause;
	cin >> pause;
}

Re: My options for improving code speed.

Posted: 06 Apr 2019, 20:02
by Cyp
Might be fairer to compare to sqrtf than to sqrt. I'd naively guess sqrtf to be faster and more accurate than Q_rsqrt.
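A small sketch of what that difference looks like in C++, using the float overload of std::sqrt as a stand-in for sqrtf. The function names are hypothetical. The static_asserts only document the types involved: with a double literal such as 1.0, the division is carried out in double even when the square root itself was computed in float.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// std::sqrt keeps the argument type for float and double.
static_assert(std::is_same<decltype(std::sqrt(1.0f)), float>::value, "float in, float out");
static_assert(std::is_same<decltype(std::sqrt(1.0)), double>::value, "double in, double out");

// A double literal promotes the whole division to double.
static_assert(std::is_same<decltype(1.0 / std::sqrt(1.0f)), double>::value, "division in double");
static_assert(std::is_same<decltype(1.0f / std::sqrt(1.0f)), float>::value, "division in float");

// Single-precision reciprocal square root: the fairer baseline for Q_rsqrt.
float rsqrtFloat(float x)
{
	return 1.0f / std::sqrt(x);
}

// Double-precision computation narrowed back to float at the end.
float rsqrtViaDouble(float x)
{
	return float(1.0 / std::sqrt(double(x)));
}
```

Timing rsqrtFloat rather than rsqrtViaDouble against Q_rsqrt would compare like with like, although the numbers still say little about the game as a whole.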

I'm not sure square root calculations are the main bottleneck in the game.