
### Optimisation/Performance of T3D with many objects

Posted: Thu Feb 04, 2016 4:30 am
I am not sure if this is the right forum for this, but I am working on a project which will, eventually, allow users to create complex objects such as buildings, vehicles, or machinery, from many smaller objects. I am hesitant to compare my work to other projects for fear of making it look like I am making a knock-off, but if you imagine "Kerbal Space Program", then that is the general type of concept that I am going for.

However, I just conducted some tests in T3D regarding this and, as I suspected, there is a problem.

As I discussed on my blog, I noticed that even with a similar polycount, the FPS is vastly reduced as object count increases.

For my tests, I did this:

I created some spheres in a modelling tool, each with a known number of polys: one high-res sphere with 10,082 polys, one medium-res at 1,058, and one low-res at 105.

I then wrote three drawing functions, which I executed separately:

Test 1: Draw 512 of the high-res spheres (512 * 10,082 = 5,161,984 polys)

Test 2: Draw 4,913 of the medium-res spheres (4,913 * 1,058 = 5,197,954 polys)

Test 3: Draw 46,656 of the low-res spheres (46,656 * 105 = 4,898,880 polys)

Since the polycount is quite similar in all cases, the FPS should be similar too, but it isn't. With nothing at all rendering, the FPS is about 200. After Test 1 (512 objects), the FPS is about 50 with all polys in view. However, Test 2, in the same conditions, produces an FPS of 9.3, and Test 3 wasn't really conclusive due to issues drawing that many objects, but it gave me about 5-9 FPS depending on the angle I viewed the objects from.

To go from 50 FPS down to 9.3 with a similar polycount indicates a serious problem with my concept.
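The pattern in these numbers is what you would expect if a fixed per-object CPU cost (draw-call submission, culling, sorting) dominates rather than triangle count. A rough back-of-envelope model in Python; the two cost constants below are invented purely for illustration, not measured T3D figures:

```python
# Rough frame-cost model: each object adds a fixed CPU overhead
# (draw-call submission, culling, sorting) on top of per-triangle
# GPU work. Both constants are illustrative guesses, not measurements.

PER_OBJECT_MS = 0.03    # assumed CPU cost per object per frame, in ms
PER_TRI_NS = 2.0        # assumed GPU cost per triangle, in nanoseconds

def fps(objects, tris_per_object):
    cpu_ms = objects * PER_OBJECT_MS
    gpu_ms = objects * tris_per_object * PER_TRI_NS / 1e6
    frame_ms = max(cpu_ms, gpu_ms)  # CPU and GPU overlap; the slower side wins
    return 1000.0 / frame_ms

for objects, tris in [(512, 10082), (4913, 1058), (46656, 105)]:
    print(f"{objects:>6} objects x {tris:>5} tris -> ~{fps(objects, tris):.0f} FPS")
```

With these made-up constants the GPU work is nearly identical in all three cases (~10 ms of triangles), yet the modelled FPS collapses as object count grows, matching the shape of the measurements above.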

Can anyone shed any light on how I would conceptually solve this? I am not planning to work on it in any major way at the moment, but I will need to in the future.

I am just creating TSStatics in script at the moment. Is there a way to combine the distinct object meshes into one logical object, so that I might have 10 meshes drawing, but logically grouped as one object?
I think I got some advice on this before, and I was told to look into the vehicle code (since it allows mounting objects into a single, driveable object). Can anyone see any major problems with, or solutions for, connecting many objects together like this?
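Conceptually, merging many static meshes into one drawable comes down to concatenating their vertex buffers and rebasing each mesh's indices by the running vertex count. A minimal language-agnostic sketch in Python (the function name and data layout are illustrative, not T3D API):

```python
# Sketch of static mesh batching: concatenate vertex data and rebase
# index buffers so many meshes can be rendered as one draw call.
# Assumes vertices are already transformed into the combined object's space.

def batch_meshes(meshes):
    """meshes: list of (vertices, indices) pairs.
    Returns one (vertices, indices) pair covering all of them."""
    all_verts, all_indices = [], []
    for verts, indices in meshes:
        base = len(all_verts)               # offset for this mesh's indices
        all_verts.extend(verts)
        all_indices.extend(i + base for i in indices)
    return all_verts, all_indices

# Two unit triangles become a single 6-vertex, 6-index batch.
tri = ([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [0, 1, 2])
verts, indices = batch_meshes([tri, tri])
print(len(verts), indices)   # 6 [0, 1, 2, 3, 4, 5]
```

The trade-off is that a batched object can only be moved, culled, and materialed as a unit, which is why engines usually reserve this for genuinely static groupings.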

Something tells me this is going to be very hard to do, performance-wise.

Thanks for any advice!

### Re: Optimisation/Performance of T3D with many objects

Posted: Thu Feb 04, 2016 6:21 am

### Re: Optimisation/Performance of T3D with many objects

Posted: Sun Feb 07, 2016 5:02 pm
This is interesting... I ran into this problem with the last game I worked on, which was a really cool concept but failed when I scaled it up.

Basically I was making a Minecraft-like game, where the player can place blocks to build things like castles or houses, etc. They are all TSStatic objects to start, so they should be really low overhead, and they are all simple cubes with only one low-res material on them...

What happened was the FPS would drop significantly once I added more than a few hundred blocks. This is a problem, as any decent-sized castle will have 1,000+ blocks in it.

I have never heard of this instancing variable either - is there any more information on how to use it effectively? I really loved that game; I had all kinds of neat concepts in there that I was working on, but I had to put it on hold because it just didn't scale to the point I needed it to.

It would seem to me that the engine should be able to handle thousands of small cubes without issue; I was quite surprised and disappointed when it started to grind to a halt before I even had that much in the scene.

Any suggestions?

Thanks!
P

### Re: Optimisation/Performance of T3D with many objects

Posted: Mon Feb 08, 2016 12:48 am
$pref::TS::maxInstancingVerts affects all static geometry that uses the TSMesh class internally - so all TS-based static geometry (not skinned meshes). What it does is this: each mesh with $pref::TS::maxInstancingVerts vertices or fewer will be rendered using instancing. This in itself can be costly if it picks up meshes that are not rendered multiple times, so be aware of that part.
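The selection rule described here can be paraphrased in a few lines of Python (the threshold value and function name are made up for illustration; the real check lives inside the engine's TSMesh code):

```python
# Paraphrase of the instancing eligibility rule described above:
# any non-skinned TS mesh at or below the vertex threshold gets
# instanced automatically, whether or not it repeats in the scene.

MAX_INSTANCING_VERTS = 200   # stand-in for $pref::TS::maxInstancingVerts

def eligible_for_instancing(vertex_count, is_skinned):
    return (not is_skinned) and vertex_count <= MAX_INSTANCING_VERTS

print(eligible_for_instancing(105, False))    # small static mesh: True
print(eligible_for_instancing(105, True))     # skinned mesh: False
print(eligible_for_instancing(10082, False))  # large static mesh: False
```

Note the rule looks only at vertex count and skinning, not at how many copies of the mesh are actually in the scene, which is exactly why it can pick up one-off meshes and cost you.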

Where instancing is a big win: say you have 5,000 cubes to draw. Without instancing you have to make that draw call 5,000 times to the graphics card; with instancing you could make that draw call once. There is more going on than this (with instancing you have to update the instance buffer for each mesh to send to the GPU), so it's not as simple as I make out above, but that is the general idea. As you can see, all that geometry still has to be rendered and processed by the GPU; it's just that on the CPU side you are not making 5,000 draw calls to do it. So if the actual draw calls themselves are not your bottleneck, then instancing may not help you at all. With T3D there is still heaps of other stuff going on with each mesh, like all the collision container stuff, scene graph sorting, <insert heaps more>. It's possible the bottleneck could be elsewhere, but it's certainly worth experimenting with.
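The trade-off described above can be sketched as a toy cost model; both constants below are invented, purely to show the shape of the win:

```python
# Toy model of the instancing win described above: N draw calls are
# traded for one call plus an instance-buffer upload. GPU triangle
# work is unchanged either way. Both constants are made-up numbers.

CALL_OVERHEAD_US = 10.0       # assumed CPU cost per draw call, microseconds
UPLOAD_US_PER_INSTANCE = 0.1  # assumed cost to write one instance transform

def cpu_cost_plain(n):
    return n * CALL_OVERHEAD_US            # one submission per object

def cpu_cost_instanced(n):
    # one draw call, plus filling the per-instance transform buffer
    return CALL_OVERHEAD_US + n * UPLOAD_US_PER_INSTANCE

n = 5000
print(f"plain: {cpu_cost_plain(n):.0f} us, instanced: {cpu_cost_instanced(n):.0f} us")
```

For 5,000 cubes this model gives 50,000 us of plain submissions versus 510 us instanced; but if your frame is GPU-bound, shrinking this CPU number changes nothing, which is the caveat in the post above.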

Personally I think it would work better if you had to manually mark a mesh for instancing rather than just blindly picking up anything with $pref::TS::maxInstancingVerts or less. I'm sure art departments would most likely hate me for that one because it wouldn't be automatic, but sometimes fine-tuning by hand is simply the best method. Anyway, have a play around; there is no magic number to use, and it is very specific to the level you are rendering.

*Edit:

When I say static, it doesn't have to be static in the sense that it can never move, just static as in not a skinned mesh. For example, a physics object can be rendered using instancing.

### Re: Optimisation/Performance of T3D with many objects

Posted: Mon Feb 08, 2016 10:45 am
If I remember correctly, there is an NVIDIA guideline that for instancing to be effective the object needs to have more than a certain number of polys (I thought it was around 300, but I could be wrong); otherwise you'd be stuck with the overhead. Though I can't find the document anymore where they described it all in detail.