
Troubleshooting random crashes


PaulWeston


Hi again,


Continuing our work on the original Starship Enterprise NCC-1701, now on our third version of the game which is much improved from our previous offerings...


However, as the game grows and we are pushing the functionality, I keep running into various random-seeming crashes, and it's getting very frustrating.


Basically, because of the size of the ship, and the fact that zones and particularly portals seem to be flaky, we have had to implement a triggered add/remove system for objects in the ship... We load all deck floors, turbolifts, doors and lights on startup, but only when a player goes to a specific deck does that trigger the adding of the rest of the deck objects (room details, hallway arches, etc). Before we load in that deck's objects, we remove the previous deck's stuff to free up memory and keep the FPS up.
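The add/remove scheme described above could be sketched roughly like this. It is a minimal illustration, not the game's actual code: `DeckStreamer`, `defineDeck` and `enterDeck` are hypothetical names, and the string lists stand in for real scene objects.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: none of these names come from Torque3D or the game scripts.
class DeckStreamer {
public:
    // Register the detail objects (room details, hallway arches, ...) of a deck.
    void defineDeck(int deck, std::vector<std::string> objects) {
        mDecks[deck] = std::move(objects);
    }

    // Called when a player enters a deck. The previous deck is unloaded
    // first, so two decks' detail objects are never resident together.
    void enterDeck(int deck) {
        if (deck == mCurrent)
            return;                        // already loaded, nothing to do
        mLoaded.clear();                   // stand-in for deleting the old objects
        auto it = mDecks.find(deck);
        if (it != mDecks.end())
            mLoaded = it->second;          // stand-in for spawning the new objects
        mCurrent = deck;
    }

    int currentDeck() const { return mCurrent; }
    const std::vector<std::string>& loaded() const { return mLoaded; }

private:
    std::map<int, std::vector<std::string>> mDecks;
    std::vector<std::string> mLoaded;
    int mCurrent = -1;                     // -1 means no deck loaded yet
};
```

Unloading before loading keeps the peak object count to one deck's worth, which is the point of the scheme; it says nothing about whether the underlying assets are released.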


This actually works very well, and you can run through the whole ship from top to bottom and back and everything will load and unload as it should.


That is, until we started adding in more effects such as AI characters, Awesomium web-enabled objects and more SFXEmitters around the ship... This seemed to push us up over some limit or something, because since then I will crash randomly at odd times, not linked to any specific action. Sometimes it will crash shortly after loading up a new deck, other times it may just crash if you turn too quickly after standing still for a while.


In the cases where I can create the Awesomium objects client-side, there is an improvement in stability, perhaps the server has trouble ghosting them or something. But it's still not the smoking gun, I will still crash randomly once in a while. Likewise with the AI, the more groups I put in there, the more likely I will crash when spawning/despawning them. We use the UAISK which has always been solid before.


Our engine is pretty clean, I have spent a lot of time making sure everything was merged properly and there are no loose ends. Our current engine is basically a T3D 3.0 base, but with most of 3.6 merged in including Faust's latest open-source AFX branch, along with some old resources we use like the Advanced Character Kit, Pathshape resource, fxGuiSnooper, and Guis on Objects.


I have also succeeded in creating a new version using the latest 3.7 RC, for which I had to sort out converting old Con::addCommand calls to DefineEngineMethods, but that version is actually less reliable, sadly. I was hoping for an improvement after all the work it took to build it, but when that version crashes the whole window just suddenly disappears with the sound still going and I need to end the task in task manager :(


So... I am trying to find the best ways to troubleshoot this. Obviously a debug build is a start, and I have tried that, but it did not give me much insight. The break point when it crashes is not the same each time and is always in some generic function that is not easily identified as to what it's doing.


Are there any other/better ways to troubleshoot random crashes like this? What are the main causes for such crashes in your experience? I have eliminated some obvious causes such as mismatched pack/unpack lists, all that seems fine. I am also not out of masks or anything.


console.log is pretty useless in these cases, it tells me roughly where it crapped out but it is not in a consistent place. It basically just seems to choke randomly, sometimes on an AI spawn, sometimes when adding an Awesomium shape, sometimes when deleting old objects from the previous deck, and sometimes seemingly for no reason when clicking while turning too fast.


Anybody have any ideas for me as to what I should be looking for / tightening up? Also, ideas why 3.7 crashes so hard without even a windows popup crash box?


Thanks for listening to yet another long rant from me lol :)


Cheers

P

Link to comment
Share on other sites

No idea, but it's very worrying. I guess it could be platform-related changes to support Linux and OpenGL? Or maybe you're seeing things creeping in from the console function refactor, or even from the replacement of all those console methods with engine methods. I assume you're not just pushing past 3GB memory usage. If the crashes are random... heap corruption? What actual crashes are happening - invalid memory access?

Link to comment
Share on other sites

Yes, it usually seems to be some sort of memory issue, either from too many things going on at once or too much being rendered at once maybe? It does get up into the high 2 gigs of usage, but I have not seen a specific point at which a crash happens.


I guess I'm wondering if anyone has ever calculated any rough limits of the engine in terms of amount of objects/materials/effects/AI that a game could handle at once? I'm just trying to work within the limits of the engine, and pack as much as I can into the ship, the crashes I'm hoping will go away once I find the right balance.


And of course there is still the possibility that something in my build is wonky, even though it compiles right. Maybe a missing mask bit or something in an old resource that I have not ported correctly. Am going to be trying another rebuild soon using a different merging process, maybe the result will be more stable.

Link to comment
Share on other sites

Here is a typical error that comes up during a random crash:


Microsoft Visual Studio

Unhandled exception at 0x7561c42d in NCC1701_3D_DEBUG.exe: Microsoft C++ exception: std::__non_rtti_object at memory location 0x0018b2a0..


Does this mean anything to anyone?


I even did a complete fresh rebuild tonight, using the Afx-T3D-3.6.3 as a base. Just added in the stuff for advanced characters, awesomium, pathshape, and fxguisnooper / guis on objects. It all compiles great and loads fine, but still the random crashes even with this super clean build.


It seems to happen more often when Awesomium shapes are loaded that are playing Flash. But even if I remove all of them I can still crash so it's not all them.


Also made sure my core and shaders files were all up to date as well just in case. Can't think of anything else. Could it be something with some of the actual models in the ship maybe?


Odd, frustrating.

Link to comment
Share on other sites

32 bit build, can't go to 64 because we use awesomium and that is only 32.


I am actually leaning towards guitexture canvas being the problem, finally got a reasonable breakpoint where I got a meaningful error that points to that. I see in this very site that this resource has been updated, maybe that update will help.


More to come lol

Link to comment
Share on other sites

If you're getting a consistent crash, then yeah, trace it down.


If you're getting random crashes like the one you were talking about above, open up the task manager and look at how much memory the game is using. If it's up around 2 GB, you're near the limit and problems are going to start happening.


I used to start getting crashes from random spots when I was sitting around 1.85 GB of RAM usage.

Link to comment
Share on other sites

You mean 2 gigs of usage just for the T3D executable correct?


Because I am frequently over 2 gigs in total memory usage just in daily use when I'm working lol. When I run T3D, my total usage can go as high as 3 gigs or even a bit more if I have other big things open like Visual Studio, Torsion, Firefox, Photoshop, whatever. My machine idles at about 900 mb - 1.4 gigs with nothing running at all.


I wish Awesomium came in 64 bit, I have been dying to port the game up to 64 bit for a looong time, for many reasons.

Link to comment
Share on other sites

You can use as much total system memory as you have available (and then some with virtual memory), but the actual T3D application itself can't exceed 2 GB of memory usage when built in 32-bit. When it starts getting close to the limit, memory allocations will start to fail and you'll get a random crash from the next place that tries to use some memory that failed to allocate.


It should be simple to check: just launch the game, run it for a while, then open your task manager and check how much memory your game is using. If it's anywhere near 2 GB, that's your problem. If it's nowhere near 2 GB, ignore everything I just said :D

Link to comment
Share on other sites

Lol OK, will do :)


Let's assume for a moment that this is in fact an issue - that my game is just creeping up to 2 gigs and then crapping when it crosses that line...


Are there some best practices we should be following in terms of keeping the memory usage down? As far as I know all our models are very optimized, our art guy uses MilkShape and doesn't export anything that is close to the polygon limits for T3D. And I am trying to minimize the amount of objects on-screen and in the scene at any given time, to keep FPS up, etc.


Are there things I am not thinking of perhaps, in terms of managing memory?


Thanks

P

Link to comment
Share on other sites

OK, getting closer...


My debug build is working better now, it is taking me to more specific places which is good.


Again, seems related to the guitexturecanvas, or something in the gui (I have added Awesomium and the guitexturecanvas, both of which affect the gui system).


There is a routine on the bridge, where the main view screen is on a timer which switches security camera views. That uses fxguisnooper and guitexturecanvas. The randomness of the crashes could be due to this timer which is set to rotate the screen every 15 seconds. Sometimes everything is fine, other times it seems when I am moving around a lot, or if AI is spawning at that exact time interval, then I will crash.


The switching routine deletes a specific guicontrol client-side which displays the camera, then it re-creates the gui using the new chosen camera. For some reason fxguisnooper won't let you change the gui parameters on-the-fly, so I am just destroying and re-creating the gui each time the timer cycles. The security cameras themselves are also client-side only, so everything works properly.


So, timer goes off every 15 seconds, destroys the client security gui, re-creates it using the new camera for this 15-second cycle, then sticks it to the main viewer object using the guitexturecanvas.


Something in that process I'm betting is failing. As mentioned above, maybe something that is being deleted is throwing things off. Can I maybe just remove the assertfatal message so it stops crashing? Or would that make it worse lol


Latest error is:


engine\source\gui\core\guicontrol.cpp(1818) : Fatal - GuiControl::addObject() - cannot add non-GuiControl as child of GuiControl



Which is in this routine:

void GuiControl::addObject(SimObject *object)
{
   GuiControl *ctrl = dynamic_cast<GuiControl *>(object);
   if(object->getGroup() == this)
      return;

   AssertFatal( ctrl, "GuiControl::addObject() - cannot add non-GuiControl as child of GuiControl" );

   Parent::addObject(object);

   AssertFatal(!ctrl->isAwake(), "GuiControl::addObject: object is already awake before add");
   if( mAwake )
      ctrl->awaken();

   // Notify the control's new parent that a child has been added
   GuiControl *parent = ctrl->getParent();
   if( parent )
      parent->onChildAdded( ctrl );
}
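
On the question of simply removing the AssertFatal: in the routine above, `ctrl` is dereferenced right after the assert (`ctrl->isAwake()`), so without the check a failed cast would become a null-pointer crash instead. A gentler alternative is to reject the object and carry on. The sketch below uses simplified stand-in classes, not the engine's real ones, and `tryAddObject` is a hypothetical name.

```cpp
#include <cassert>

// Minimal stand-ins for the engine classes; hypothetical, not Torque3D's real types.
struct SimObject {
    virtual ~SimObject() = default;
    SimObject* group = nullptr;           // the container this object belongs to
};

struct GuiControl : SimObject {
    int childCount = 0;

    // Returns false instead of asserting when the object cannot be added.
    bool tryAddObject(SimObject* object) {
        if (object == nullptr)
            return false;                 // nothing to add
        GuiControl* ctrl = dynamic_cast<GuiControl*>(object);
        if (ctrl == nullptr)
            return false;                 // not a GuiControl: reject before any dereference
        if (object->group == this)
            return true;                  // already a child, nothing more to do
        object->group = this;
        ++childCount;
        return true;
    }
};
```

A guard like this only catches null or wrongly typed objects. It cannot detect a pointer to an object that has already been deleted, so the order in which the script deletes things still matters.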

Link to comment
Share on other sites

And another update...


I moved some things around in my script, where I was deleting the attached camera object before I deleted the guicontrol that referenced it, and that seems to have stopped one crash that was always firing within guicontrol::onadd.
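
The ordering problem described here (a control still referencing a camera that has already been deleted) is the classic dangling-reference case. One way to make it harmless in C++ is a non-owning handle that can tell when its target is gone. This is a sketch with hypothetical `Camera` and `ViewScreen` types, not how fxguisnooper or the guitexturecanvas are actually implemented.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins: a security camera and the view screen that displays it.
struct Camera {
    std::string name;
};

class ViewScreen {
public:
    void attach(const std::shared_ptr<Camera>& cam) { mCamera = cam; }

    // Returns the camera name, or an empty string if the camera has been deleted.
    std::string render() const {
        if (auto cam = mCamera.lock())    // lock() yields null once the camera is gone
            return cam->name;
        return {};
    }

private:
    std::weak_ptr<Camera> mCamera;        // non-owning: expires instead of dangling
};
```

With a handle like this the deletion order stops mattering: the screen just shows nothing until a new camera is attached. In script, the equivalent is what was done here, deleting the control before the camera it references.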


Now that I am past that, I decided to just let the game idle while I watched the task manager in debug mode. In a release build, it is getting pretty solid, but the debug build craps out on its own while idling when I have not even touched the mouse. The memory at the time of the crash is around 1.7 gigs for the game.


Short of going to 64-bit which is not an option right now (for the main reason that Awesomium is not available in 64-bit and won't compile, plus moving to 3.7 from 3.6.3 is a pretty big merge to do with all the resources and AFX that I have in my code), is there an easy way to keep the memory usage down?


I have a lot going on in the scene yes, but really it's not like I'm running World of Warcraft here or anything, it's just a starship with a bunch of stuff in it and some AI guys spawning in. I'd like to think the engine can handle what I'm throwing at it, I'm sure others must have built games more complex than what we have. Surely I can't be the only one pushing the limits like this?


Any ideas?


Thanks :)


Latest error when it crapped out at idle in debug mode:


First-chance exception at 0x7572c44d in NCC1701_3D_DEBUG.exe: Microsoft C++ exception: std::bad_alloc at memory location 0x0018a23c..

Unhandled exception at 0x7572c44d in NCC1701_3D_DEBUG.exe: Microsoft C++ exception: std::bad_alloc at memory location 0x0018a23c..

Link to comment
Share on other sites

Yeah, 1.7-1.8 GB of usage is about where things start to break down. Rising memory with no input is generally a memory leak. Memory is being allocated but not freed.
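
The "rising memory with no input" test can be made a little more systematic by logging the game's memory usage at intervals while it idles and comparing the first and last readings. A minimal sketch, assuming the samples are collected by some other means (task manager readings, for instance); `looksLikeLeak` and its tolerance parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: given memory readings (in bytes) taken while the game
// sits idle, report whether usage grew by more than toleranceBytes overall.
bool looksLikeLeak(const std::vector<std::size_t>& idleSamples,
                   std::size_t toleranceBytes) {
    if (idleSamples.size() < 2)
        return false;                     // not enough data to judge a trend
    const std::size_t first = idleSamples.front();
    const std::size_t last = idleSamples.back();
    return last > first && (last - first) > toleranceBytes;
}
```

Growth that only happens when a new deck is loaded would not show up in idle samples, which is how a leak can be told apart from assets simply accumulating.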


My approach was just to remove things from my level until I discovered where large amounts of usage were coming from. To be honest though, it's 2015, 32-bit is on its deathbed, I think you'd be better off spending the time trying to get Awesomium working in 64-bit.

Link to comment
Share on other sites

Yes, that would be the best thing for sure.


But that exercise is beyond me... From what I see, many people have been hammering them for the past two years or more on their site, asking for 64-bit Awesomium. Best reply they got was about a year ago saying it was "in our plans". Nothing on the site about any movement towards that sadly :(


The integration kit for T3D is written by Stefan Lundmark, I can ask him what his thoughts are on it, but I think the problem is more at the actual Awesomium SDK level, and that it needs 64 bit libraries or whatever in order to roll it into a 64 bit build of T3D.


And then of course there is the small matter of porting my engine codebase up from 3.6.3 to 3.7, which I tried once already and screwed up lol. Lots of room for mistakes there, so much has changed.


Maybe I will try your idea of ripping things out one at a time from the level, to see if the memory stops climbing. Although it is not constantly going up without input, so maybe there is not a leak. Memory goes up because I add stuff to the scene incrementally as I move about the ship. I also delete things, so I thought that would reduce memory as I go, but maybe it is not.


Anything I can do to help free up unused memory while the game is in progress, as I am adding/deleting things?


Also, anyone want to tackle porting up Awesomium? lol I can dream


Thanks again for the advice

Link to comment
Share on other sites

If that's the issue, sounds like we need to assert on memory allocation, instead of letting it silently fail. However, there's dynamic memory allocation everywhere, so this sounds like a significant task.
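
As a rough idea of what checking at the allocation site could look like: route allocations through one wrapper that fails loudly, with a tag naming the caller, rather than handing back a pointer that blows up somewhere else later. This is only a sketch. `checkedAlloc`, `systemAlloc` and `alwaysFail` are hypothetical names, and the engine's own memory manager is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

using AllocFn = void* (*)(std::size_t);

// Default allocator: a thin wrapper over malloc.
inline void* systemAlloc(std::size_t bytes) { return std::malloc(bytes); }

// Simulated allocator that always fails, for exercising the failure path.
inline void* alwaysFail(std::size_t) { return nullptr; }

// Hypothetical wrapper: report a failed allocation where it happens,
// tagged with the caller's description, instead of returning null.
void* checkedAlloc(std::size_t bytes, const char* tag, AllocFn alloc = systemAlloc) {
    void* p = alloc(bytes);
    if (p == nullptr)
        throw std::runtime_error(std::string("allocation failed: ") + tag);
    return p;
}
```

Having a single choke point is what keeps this from being a hunt through every allocation in the codebase, though retrofitting existing call sites is still the bulk of the work.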


EDIT: Also, what are the most complicated places to port from 3.6 to 3.7? Is it the engine method stuff?

Link to comment
Share on other sites

The move to 3.7 just poses a huge merge when I do a compare. It seems like a lot of core things have changed, probably to accommodate 64-bit and the opengl stuff.


I did manage to throw one together, but it kept crashing right out at random times, no windows warning or anything lol. I'm assuming I missed something in the afx merging or the player stuff for advanced characters. I will be giving it another go soon, it was just a whole lot of diffs so it takes forever to click through and check everything since so much seems to have changed.


However, I am excited at the prospect of 64-bit, which would hopefully solve all these memory issues. So that motivates me to try again.

Link to comment
Share on other sites

I solved a similar problem with my game by using the /LARGEADDRESSAWARE option in Visual Studio 13, which tells the linker that the application can handle addresses larger than 2 gigabytes: https://msdn.microsoft.com/en-us/library/wz223b1z.aspx So far, I have not had any bad side effects, running on a 64-bit Windows 8 machine, so maybe you should give that a shot. My memory usage occasionally spikes up to ~1.7 gigs, and it used to crash, but this fixed the problem. Also, check through the GitHub pull requests; there were some memory leak fixes which I also applied, though I can't remember exactly where they are or who posted them.

Link to comment
Share on other sites

Oh wow.


I got it.


Completely built in 64-bit, including AFX and all my other resources. Everything working except Awesomium, but I will work around that using Theora video on textures instead for now (was using Awesomium to play flash video clips of Star Trek movies in the deck 6 theater).


And, wait for it.... No. More. Crashes.


Can't believe it, I can run everywhere in the ship and it will keep loading areas without issue. The game's memory usage goes up past 2 gigs and keeps climbing. I hit almost 3 gigs in heavy testing last night when I loaded every area at once. And it doesn't complain :)


This is going to change everything. So so happy right now.


I bow to the Steering Committee gods and all those who worked hard to get 64 bit support in there. It's a long time coming and now things can really scale.


Cheers all!

P

Link to comment
Share on other sites

Does the memory use climb indefinitely? If so then we have a leak to look for :p. But if not then it may just be that it loads more assets into memory as you explore. Either way, it'd be lovely if we could unload assets when they're no longer needed, to keep that memory usage down. That'd be better than saying 'just use more RAM', especially after you've gone to such lengths to unload objects that aren't near.

Link to comment
Share on other sites

I don't think there's any memory leaks, because if I just stop in an area then the memory does not climb, it stays where it is. Only when I move to a new deck and trigger the next object load do things go up.


Now, one thing as you mention, is that I am in fact deleting out objects when you leave a deck. Not just hiding them, but actually deleting from the scene and then re-adding later if I go back to that deck. But yet the memory does not seem to go down when I do that, which seems odd to me, unless the assets are always loaded in memory once they've been called once, even if you delete the object itself from the scene. As you say, it would be nice if it actually REALLY unloaded the object completely and freed up some memory.


Because even with things working as well as they are now, theoretically if we add enough to the ship then a person could still hit a total memory ceiling eventually if they only have 4 gigs of ram.


But for now, I'll take this as a win lol :)

Link to comment
Share on other sites

I guess the test is - once you've explored the entire ship, and keep going back to areas you've already been, does the memory hold steady? IIRC, yes, assets remain loaded in memory once they have been loaded once. As you say this becomes a critical issue if you want to pack more than 4GB of assets into your game, effectively.
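
The behaviour described here (assets staying in memory after the objects using them are deleted) is what a resource cache without a purge step produces. A minimal sketch of a cache that can release assets nothing references any more; `AssetCache`, `load` and `purgeUnused` are hypothetical names and this is not Torque3D's resource manager.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical asset record; a real one would hold mesh or texture data.
struct Asset {
    std::string path;
};

class AssetCache {
public:
    // Return the cached asset for this path, loading it on first request.
    std::shared_ptr<Asset> load(const std::string& path) {
        auto it = mAssets.find(path);
        if (it != mAssets.end())
            return it->second;
        auto asset = std::make_shared<Asset>(Asset{path});
        mAssets[path] = asset;
        return asset;
    }

    // Drop every asset that only the cache itself still references.
    // Returns how many assets were released.
    std::size_t purgeUnused() {
        std::size_t freed = 0;
        for (auto it = mAssets.begin(); it != mAssets.end(); ) {
            if (it->second.use_count() == 1) {
                it = mAssets.erase(it);
                ++freed;
            } else {
                ++it;
            }
        }
        return freed;
    }

    std::size_t size() const { return mAssets.size(); }

private:
    std::map<std::string, std::shared_ptr<Asset>> mAssets;
};
```

Calling something like `purgeUnused()` after a deck's objects have been deleted is the step that would actually bring memory back down. The trade-off is that returning to that deck means loading the assets again.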

Link to comment
Share on other sites

OK, a bit of trouble in paradise, another mystery to solve lol...


So, I enjoyed a couple of days of bliss playing with my new 64-bit build, until all of a sudden it started crashing again randomly. And I had not changed the engine code at all, same .exe. Had been making some script changes at the time, so I reverted back to the previous ones that worked fine, but yet still I am crashing.


Then, it got worse, to the point where right after the splash screen it will crash, does not even get to the title screen.


If I run the 32-bit build of the same engine code, I get no crashes on startup, just the usual problems when the game hits close to 2 gb of memory usage.


Pulled my hair out all weekend, tried everything, but no matter what I do it is still crashing.


Just to test, put it on my son's PC which is even faster (quad core instead of my dual core), and it did the same thing.


Disheartened, I decided to try it on my day job PC this morning, and would you believe it works fantastic without issue? Seriously, I can go anywhere and watch the memory climb past 2 gb without any crashes, things are smooth, I can run in full high resolution graphics and it does not complain.


The only difference I see between my work machine and the two machines at home where it fails, is that here I have 8 gb of ram and at home we only have 4. However, at no time does my total usage climb close to 4 gb, so I am unsure why the game is now crashing on those 4 gb systems.


Really bummed about this, since it worked and I was using it, and then all of a sudden it wasn't happy. I know my scripts have been heavily altered and no longer conform to the latest templates, but should this really be such a problem? The main changes I see in Faust's new AFX are that he has gone back to more stock implementation, less of his own custom scripts, so to convert my setup would be complicated - we have changed so much that sliding it back into the new template would be a royal pain.


And like I said, this all worked fine for over a day until it decided to crap out.


Anybody have any ideas what I should look for? Why would a 64-bit program run on some systems but not others? Are there maybe some .DLL files I need to put in a certain place, so Windows is able to find what it needs for 64-bit? Am totally stumped, and a bit depressed lol. At least at work I can see that I am not crazy and that I did in fact succeed with the 64-bit AFX build with all my extra resources in there. It's just at home it no longer wants to run. Which sucks lol.


Is this something anybody has heard of before?


Thanks

P

Link to comment
Share on other sites

Is it still failing in the same way? With a crash related to memory allocation?


I feel I have to ask this question: are the PCs that crash with the 64-bit version running 64-bit Windows? If they're running 32-bit, it's not going to work.


And finally I figured I should comment on this:

"Because even with things working as well as they are now, theoretically if we add enough to the ship then a person could still hit a total memory ceiling eventually if they only have 4 gigs of ram."

The 32/64 bit limitation is in memory addresses. In 32-bit you can't address more than 2 GB worth of spots and then you run out of addresses. Once you move to 64-bit you've got plenty of addresses to work with. Running out of RAM is a different issue though. When you start to run out of RAM, your computer will swap lesser-used data in RAM onto the hard drive to make space. This is called virtual memory. It prevents applications from crashing due to physical memory restrictions. So, someone running with 4 GB of RAM will not see a crash when you load more than 4 GB of assets into your game. It will make things slower, etc., but the application should not crash. The same would be true about running a 32-bit copy of your game on a machine with only 1 GB of RAM. It will overflow into virtual memory, but you'd still get your crash if you hit ~1.8 GB of usage, since you run out of memory addresses.
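
The address-space arithmetic can be written out as a small sketch. The figures are the standard Windows defaults for a 32-bit process as I understand them (2 GiB of user address space, or 4 GiB for a large-address-aware executable on 64-bit Windows); the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kGiB = 1024ull * 1024ull * 1024ull;

// User-mode address space available to a 32-bit Windows process.
// Assumption: default boot settings, so a large-address-aware executable
// on a 32-bit host is treated as still limited to 2 GiB.
constexpr std::uint64_t userAddressSpace32(bool largeAddressAware, bool hostIs64Bit) {
    if (!largeAddressAware)
        return 2 * kGiB;
    return hostIs64Bit ? 4 * kGiB : 2 * kGiB;
}

// Address space left before the limit; zero once usage reaches the limit.
constexpr std::uint64_t remainingBytes(std::uint64_t limit, std::uint64_t used) {
    return used >= limit ? 0 : limit - used;
}
```

The limit applies to addresses rather than physical RAM, which is why the amount of memory installed in the machine does not change it.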

Link to comment
Share on other sites

Yes, all my machines are 64-bit Windows :)


I thought maybe it was related to low disk space, I was down to below 10 gigs on my dev PC, since that may affect how Windows swaps the memory to disk, however I freed up 30 gb on that machine and it made no difference. Likewise, my son's machine is sitting at over 100 gb free of disk space so I guess it can't be that.


I just ran it here at work again to convince myself I am not crazy. It runs fine. But when I get it home, exact same code, it will crash right after splash screen. Nothing telling in console, except that it seems to crap out after loading /tools.


This is really maddening, this build is proving to be a real c**k-tease lol, working for a day and then deciding not to any more, before I could really even begin to make use of the extra memory 64-bit was offering me :(


Surely, there has to be something simple I am missing here... Some switch I can flick during the build process, I'm not sure. I am using Windows SDK 7.0, maybe I need 7.1 for it to be more compatible with other systems? Maybe it's a different version of the C++ runtime or redistributable or something that it is looking for?
