Kernel bug or ...? (Random segfaults)

Safety0ff
Trained
Posts: 397
Joined: 18 Jul 2009, 23:23

Post by Safety0ff »

I know this is likely an external bug, but I thought that I would get your opinion about it:

I keep getting random segfaults from the following loop (in atmos.c, function "atmosInitSystem", line ~86):

	for(i=0; i<MAX_ATMOS_PARTICLES; i++)
	{
		/* None are being used initially */
		asAtmosParts[i].status = APS_INACTIVE;
	}
That obviously shouldn't cause a segfault (the faults appear at random values of i, all of which are less than the size of the array).

Since I changed asAtmosParts from an array to a pointer that I malloc, I haven't gotten any of those segfaults.
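
A minimal sketch of that change, for readers who want to try it. Only asAtmosParts, ATPART, MAX_ATMOS_PARTICLES and APS_INACTIVE come from the thread; the struct layout, the APS_ACTIVE value, the count of 65536 (taken from the size estimate later in the thread) and the function names atmosInitSketch, atmosShutdownSketch and atmosPartAt are hypothetical stand-ins, not the actual Warzone code.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed value; the real constant is defined in the Warzone source. */
#define MAX_ATMOS_PARTICLES 65536

/* Hypothetical stand-in types; only APS_INACTIVE appears in the thread. */
typedef enum { APS_INACTIVE, APS_ACTIVE } AP_STATUS;

typedef struct {
    AP_STATUS status;
    /* position, velocity and so on omitted */
} ATPART;

/* Was: static ATPART asAtmosParts[MAX_ATMOS_PARTICLES]; */
static ATPART *asAtmosParts = NULL;

/* Allocate the particle pool on the heap and mark every slot inactive.
 * Returns 1 on success, 0 if the allocation failed. */
int atmosInitSketch(void)
{
    size_t i;

    free(asAtmosParts); /* free(NULL) is a no-op, so re-init is safe */
    asAtmosParts = malloc(MAX_ATMOS_PARTICLES * sizeof(*asAtmosParts));
    if (asAtmosParts == NULL)
        return 0;

    for (i = 0; i < MAX_ATMOS_PARTICLES; i++)
        asAtmosParts[i].status = APS_INACTIVE;
    return 1;
}

/* Bounds-checked accessor; the assert fires in debug builds on a bad index. */
ATPART *atmosPartAt(size_t i)
{
    assert(asAtmosParts != NULL && i < MAX_ATMOS_PARTICLES);
    return &asAtmosParts[i];
}

/* Release the pool; call once at shutdown. */
void atmosShutdownSketch(void)
{
    free(asAtmosParts);
    asAtmosParts = NULL;
}
```

Note that the allocation can now fail, so the caller has to check the return value of the init function, which the static array never required.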

I first thought that this was an ALSA/PulseAudio problem, but it always happens in that part of the code.

Thoughts?

I'm using Ubuntu 9.10 32-bit with the latest kernel (but it has been happening with previous versions as well).
cybersphinx
Inactive
Posts: 1695
Joined: 01 Sep 2006, 19:17

Post by cybersphinx »

Hmmm... a short calculation shows that this array needs over 2 MB (34 bytes * 65536). Maybe Pulseaudio (or whatever it was) also needs a lot of stack space, and then asAtmosParts is pushed beyond the limit, and at some point in the loop it segfaults? So without Pulseaudio we have enough stack space, and it runs fine. Also, with a 2 MB array, it doesn't matter much where exactly it begins, so even if Warzone or other libraries need several hundred kilobytes more of the stack, it's still this array that ends up straddling the limit.
  1. Check your stack limit with "ulimit -s" (gives the stack size in KB, 8 MB here), and increase that. If that helps...
  2. report a Pulseaudio bug regarding its excessive stack usage, and have Pulseaudio or the default stack limit changed. Maybe this is the general source of problems that get solved by uninstalling Pulseaudio...
And I guess we shouldn't use over 25% of the stack for one array, but allocate it on the heap instead.
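
The arithmetic in this post can be checked with a small C sketch. It assumes a POSIX system (getrlimit from <sys/resource.h>); the 34-byte element size and the 65536 count are the figures quoted above, not values read from the Warzone source, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Figures quoted in the post above; not read from the Warzone source. */
#define PARTICLE_BYTES 34u
#define PARTICLE_COUNT 65536u

/* Total bytes the array would need. */
size_t atmosArrayBytes(void)
{
    return (size_t)PARTICLE_BYTES * PARTICLE_COUNT;
}

/* Soft stack limit in bytes (the value "ulimit -s" reports in KB),
 * or 0 if the limit is unlimited or cannot be read. */
size_t stackLimitBytes(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (size_t)rl.rlim_cur;
}

/* Whole-number percentage of the given stack limit the array would occupy. */
unsigned stackSharePercent(size_t limitBytes)
{
    assert(limitBytes > 0);
    return (unsigned)(atmosArrayBytes() * 100 / limitBytes);
}
```

With an 8 MB limit, stackSharePercent(8u * 1024u * 1024u) gives 26, which matches the "over 25%" estimate. The comparison only matters if the array really is on the stack.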
Safety0ff
Trained
Posts: 397
Joined: 18 Jul 2009, 23:23

Post by Safety0ff »

I'm pretty sure that it's off the stack either way, whether it is declared as a static array (BSS) or malloc'd (heap).

I'm not convinced it's pulseaudio any more either.

Stack is 8 MB here as well.
cybersphinx
Inactive
Posts: 1695
Joined: 01 Sep 2006, 19:17

Post by cybersphinx »

OK, it seems my theory was wrong, since testing with a larger stack still crashes.
Zarel
Elite
Posts: 5770
Joined: 03 Jan 2008, 23:35
Location: Minnesota, USA

Post by Zarel »

Regardless, a 2 MB array should probably be on the heap... Someone should patch that.
stiv
Warzone 2100 Team Member
Posts: 876
Joined: 18 Jul 2008, 04:41
Location: 45N 86W

Post by stiv »

Zarel wrote: Regardless, a 2 MB array should probably be on the heap... Someone should patch that.
You might want to look at the code.

static ATPART asAtmosParts[MAX_ATMOS_PARTICLES];
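
The point of that declaration is the static keyword: an object with static storage duration is not on the stack. It is zero-initialised before main runs and exists for the whole program run, and on typical ELF systems an uninitialised array like this is placed in .bss. The sketch below illustrates this with a hypothetical 34-byte stand-in for ATPART and the 65536 count quoted earlier in the thread; none of the names are from the Warzone source.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PARTICLES 65536 /* count quoted earlier in the thread */

/* Hypothetical 34-byte stand-in for ATPART. */
typedef struct { unsigned char bytes[34]; } SKETCH_PART;

/* Static storage duration: reserved at load time, zero-initialised,
 * and alive for the whole run. It does not count against "ulimit -s". */
static SKETCH_PART sketchParts[SKETCH_PARTICLES];

/* Size of the whole array in bytes. */
size_t sketchPartsBytes(void)
{
    return sizeof(sketchParts);
}

/* Bounds-checked accessor into the static array. */
SKETCH_PART *sketchPartAt(size_t i)
{
    assert(i < SKETCH_PARTICLES);
    return &sketchParts[i];
}

/* Returns 1 if every byte of the array is still zero, 0 otherwise. */
int sketchPartsAllZero(void)
{
    const unsigned char *p = (const unsigned char *)sketchParts;
    size_t i;

    for (i = 0; i < sizeof(sketchParts); i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

Dropping the static keyword inside a function body would turn this into an automatic array of roughly 2 MB, which is the case the stack-limit theory assumed.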
cybersphinx
Inactive
Posts: 1695
Joined: 01 Sep 2006, 19:17

Post by cybersphinx »

Yeah, ignore the stack stuff; Valgrind says we only use a few KB of it.

So we're back to random memory corruption...
Safety0ff
Trained
Posts: 397
Joined: 18 Jul 2009, 23:23

Post by Safety0ff »

stiv wrote: You might want to look at the code.

static ATPART asAtmosParts[MAX_ATMOS_PARTICLES];
Precisely, it is in the data segment either way, and since the same binary works properly most of the time, it makes me wonder what the cause is.
Per
Warzone 2100 Team Member
Posts: 3780
Joined: 03 Aug 2006, 19:39

Post by Per »

Does this also happen in campaign and/or the tutorials?
Safety0ff
Trained
Posts: 397
Joined: 18 Jul 2009, 23:23

Post by Safety0ff »

Per wrote: Does this happen also in campaign and/or the tutorials?
Alright, I've just confirmed that it happens with campaign as well.

It seems to crash in other places too, but malloc'ing the array stops those crashes as well. :S
Zarel
Elite
Posts: 5770
Joined: 03 Jan 2008, 23:35
Location: Minnesota, USA

Post by Zarel »

Let's just malloc the array, then?