Kernel bug or ...? (Random segfaults)

Posted: 01 Mar 2010, 05:52
by Safety0ff
I know this is likely an external bug, but I thought that I would get your opinion about it:

I keep getting random segfaults from the following loop ( in atmos.c, function "atmosInitSystem" line ~86)

	for(i=0; i<MAX_ATMOS_PARTICLES; i++)
	{
		/* None are being used initially */
		asAtmosParts[i].status = APS_INACTIVE;
	}
Which obviously shouldn't cause a segfault (the faults appear at random values of i, all of which are less than the size of the array).

After I changed asAtmosParts from an array to a pointer that I malloc, I haven't gotten any of those segfaults.
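A minimal sketch of that change, for anyone who wants to try it. The ATPART layout, the AP_STATUS enum values, the element count, and the atmosPartStatus/atmosShutdownSystem helpers are stand-ins made up for this post, not the real definitions from atmos.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ATMOS_PARTICLES 65536	/* assumed count, not copied from atmos.c */

/* Hypothetical stand-ins; the real definitions live in atmos.c. */
typedef enum { APS_ACTIVE, APS_INACTIVE } AP_STATUS;
typedef struct { AP_STATUS status; } ATPART;

/* Previously an array; now a pointer that is filled in by malloc. */
static ATPART *asAtmosParts = NULL;

/* Returns 1 on success, 0 if the allocation failed. */
int atmosInitSystem(void)
{
	size_t i;

	if (asAtmosParts == NULL)
	{
		asAtmosParts = malloc(sizeof(*asAtmosParts) * MAX_ATMOS_PARTICLES);
		if (asAtmosParts == NULL)
		{
			return 0;
		}
	}

	for (i = 0; i < MAX_ATMOS_PARTICLES; i++)
	{
		/* None are being used initially */
		asAtmosParts[i].status = APS_INACTIVE;
	}
	return 1;
}

/* Hypothetical accessor, only here so the result can be checked. */
AP_STATUS atmosPartStatus(size_t i)
{
	assert(asAtmosParts != NULL && i < MAX_ATMOS_PARTICLES);
	return asAtmosParts[i].status;
}

/* Hypothetical counterpart that releases the buffer again. */
void atmosShutdownSystem(void)
{
	free(asAtmosParts);
	asAtmosParts = NULL;
}
```

The loop itself is unchanged; only the storage moves from a fixed array to a heap allocation, so the init function now has a failure path to handle.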

I first thought that this was an ALSA/PulseAudio problem, but it always happens in that part of the code.

Thoughts?

I'm using Ubuntu 9.10 32-bit with the latest kernel (but it has been happening with previous versions as well).

Re: Kernel bug or ...? (Random segfaults)

Posted: 01 Mar 2010, 17:35
by cybersphinx
Hmmm... a short calculation shows that this array needs over 2 MB (34 bytes * 65536). Maybe Pulseaudio (or whatever it was) also needs a lot of stack space, and then asAtmosParts is pushed beyond the limit, and at some point in the loop it segfaults? So without Pulseaudio we have enough stack space, and it runs fine. Also, with a 2 MB array, it doesn't matter much where exactly it begins, so even if Warzone or other libraries need several hundred kilobytes more of the stack, it's still this array that'll be across the border.
  1. Check your stack limit with "ulimit -s" (gives the stack size in KB, 8 MB here), and increase that. If that helps...
  2. report a Pulseaudio bug regarding its excessive stack usage, and have Pulseaudio or the default stack limit changed. Maybe this is the general source of problems that get solved by uninstalling Pulseaudio...
And I guess we shouldn't use over 25% of the stack for one array, but allocate it on the heap instead.
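The arithmetic above can be checked with a rough sketch like the following. It assumes the 34-byte element size and 65536 count from my estimate (not measured with sizeof), and uses the POSIX getrlimit call to read the same soft limit that "ulimit -s" reports. The function names are made up for this post:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

#define MAX_ATMOS_PARTICLES  65536u	/* count from the estimate above */
#define ATPART_SIZE_ESTIMATE 34u	/* bytes per element, an estimate */

/* Total size of the array in bytes: 34 * 65536 = 2228224. */
size_t atmosArrayBytes(void)
{
	return (size_t)ATPART_SIZE_ESTIMATE * MAX_ATMOS_PARTICLES;
}

/* Soft stack limit in bytes; 0 if it is unlimited or the query failed. */
unsigned long long stackLimitBytes(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
	{
		return 0;
	}
	return (unsigned long long)rl.rlim_cur;
}

/* Fraction of a given stack limit that an object of `bytes` would take. */
double stackShare(size_t bytes, unsigned long long limit)
{
	assert(limit > 0);	/* caller must rule out "unlimited" first */
	return (double)bytes / (double)limit;
}
```

With an 8 MB limit, stackShare(atmosArrayBytes(), 8ULL * 1024 * 1024) comes out at roughly 0.27, which is where the "over 25%" figure comes from. This only matters if the array really is on the stack.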

Re: Kernel bug or ...? (Random segfaults)

Posted: 01 Mar 2010, 17:39
by Safety0ff
I'm pretty sure that it's off the stack whether it is declared as an array (BSS) or malloc'd (heap).

I'm not convinced it's pulseaudio any more either.

Stack is 8 MB here as well.

Re: Kernel bug or ...? (Random segfaults)

Posted: 01 Mar 2010, 18:18
by cybersphinx
OK, it seems my theory was wrong, since testing with a larger stack still crashes.

Re: Kernel bug or ...? (Random segfaults)

Posted: 01 Mar 2010, 23:12
by Zarel
Regardless, a 2 MB array should probably be on the heap... Someone should patch that.

Re: Kernel bug or ...? (Random segfaults)

Posted: 02 Mar 2010, 18:17
by stiv
Zarel wrote: Regardless, a 2 MB array should probably be on the heap... Someone should patch that.
You might want to look at the code.

static ATPART asAtmosParts[MAX_ATMOS_PARTICLES];

Re: Kernel bug or ...? (Random segfaults)

Posted: 02 Mar 2010, 18:47
by cybersphinx
Yeah, ignore the stack stuff, Valgrind says we only use a few KB of it.

So we're back to random memory corruption...

Re: Kernel bug or ...? (Random segfaults)

Posted: 04 Mar 2010, 00:17
by Safety0ff
stiv wrote: You might want to look at the code.

static ATPART asAtmosParts[MAX_ATMOS_PARTICLES];
Precisely, it is in the data segment either way, and since the same binary works properly most of the time, it makes me wonder what the cause is.
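To illustrate the point about that declaration, here is a small sketch. An array declared static has static storage duration, so it is zero-initialized before the program starts and does not live in any function's stack frame. The one-byte ATPART and the helper names are placeholders, not the real code:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ATMOS_PARTICLES 65536	/* assumed count */

/* Placeholder element type; the real ATPART has more fields. */
typedef struct { unsigned char status; } ATPART;

/* Static storage duration: reserved for the whole run, not on the stack. */
static ATPART asAtmosParts[MAX_ATMOS_PARTICLES];

/* The returned pointer stays valid after the call returns, because the
 * array is not part of this function's stack frame. */
ATPART *atmosPartAt(size_t i)
{
	assert(i < MAX_ATMOS_PARTICLES);
	return &asAtmosParts[i];
}

/* Counts non-zero bytes in the array. C zero-initializes objects with
 * static storage duration, so this is 0 until something writes to it. */
size_t atmosNonZeroBytes(void)
{
	const unsigned char *p = (const unsigned char *)asAtmosParts;
	size_t i, n = 0;

	for (i = 0; i < sizeof(asAtmosParts); i++)
	{
		if (p[i] != 0)
		{
			n++;
		}
	}
	return n;
}
```

So a stack limit should not come into play for this array, whichever way it is allocated.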

Re: Kernel bug or ...? (Random segfaults)

Posted: 04 Mar 2010, 12:34
by Per
Does this happen also in campaign and/or the tutorials?

Re: Kernel bug or ...? (Random segfaults)

Posted: 06 Mar 2010, 22:27
by Safety0ff
Per wrote: Does this happen also in campaign and/or the tutorials?
Alright, I've just confirmed that it happens with campaign as well.

It seems to crash in other places too, but malloc'ing the array stops those as well. :S

Re: Kernel bug or ...? (Random segfaults)

Posted: 07 Mar 2010, 02:06
by Zarel
Let's just malloc the array, then?