Over the past few weeks, I have been giving the TorqueScript virtual machine a makeover with the guidance and support of JeffR, GlennS, Mango, and a few others. The goal is to improve performance and make the VM more maintainable. Here are a few highlights of what's happening, in detail.
Note: I am not going to give out performance metrics until I am finished with optimizations. Just note that so far, some benchmarks have shown a 2.5X increase in performance, while others were on par with the current interpreter.
Code location: CLICK HERE
The first thing I did was refactor the VM's code dispatch implementation. Up until now, Torque3D used a switch statement over OP codes to dispatch the byte-code instructions at run-time. We now use a version of Direct Threading to handle OP code dispatch. Besides the potential performance increase from avoiding the branch prediction problems of a single shared switch, the interpreter is also more maintainable, as each OP code now lives in its own function. This also makes it easy to add new OP codes, since each one is isolated (see below for some new stuff)!
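To make the dispatch change concrete, here is a minimal C++ sketch of the idea, not the actual Torque3D code. It assumes a toy stack machine with three made-up instructions; every name in it (MiniVM, Cell, opPush, threadCode, and so on) is hypothetical. The raw OP codes are translated once into a stream of handler addresses, and the run loop simply calls whatever handler the stream points at, with no switch involved. Strictly speaking this function-pointer form is often called call threading; it is shown here because it matches the "each OP code in its own function" description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature VM; none of these names come from Torque3D.
struct MiniVM;
using OpHandler = void (*)(MiniVM&);

// Each slot of the threaded code stream is either a handler address
// or an inline operand belonging to the preceding handler.
union Cell {
    OpHandler handler;
    int operand;
};

struct MiniVM {
    std::vector<Cell> code;
    std::size_t ip = 0;
    std::vector<int> stack;
    bool running = true;
};

// One function per OP code, so each instruction is isolated.
void opPush(MiniVM& vm) { vm.stack.push_back(vm.code[vm.ip++].operand); }
void opAdd(MiniVM& vm) {
    int rhs = vm.stack.back();
    vm.stack.pop_back();
    vm.stack.back() += rhs;
}
void opHalt(MiniVM& vm) { vm.running = false; }

enum OpCode { OP_PUSH, OP_ADD, OP_HALT };

// Translate raw OP codes into handler addresses once, ahead of execution.
std::vector<Cell> threadCode(const std::vector<int>& raw) {
    static const OpHandler table[] = { opPush, opAdd, opHalt };
    std::vector<Cell> out;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        Cell op;
        op.handler = table[raw[i]];
        out.push_back(op);
        if (raw[i] == OP_PUSH) {   // OP_PUSH carries one inline operand
            Cell arg;
            arg.operand = raw[++i];
            out.push_back(arg);
        }
    }
    return out;
}

// The dispatch loop: fetch a handler address and call it.
int run(MiniVM& vm) {
    while (vm.running) {
        OpHandler handler = vm.code[vm.ip++].handler;
        handler(vm);
    }
    return vm.stack.back();
}

// Computes 2 + 3 on the toy machine.
int runAddExample() {
    MiniVM vm;
    vm.code = threadCode({OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT});
    return run(vm);
}
```

Adding an instruction to a design like this means writing one new handler and adding one table entry, which is the maintainability gain described above.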
New Language Feature - "Function Pointers"
You can now call functions indirectly by storing them in a variable, without specifying the function name at call time. Function pointers can only refer to global, non-namespaced functions. They also have slightly less overhead than a normal function call, because the interpreter already knows that the function sits in the global namespace. Because functions can be stored in variables, you can now pass functions as parameters and call them from other functions. This allows for some interesting callbacks!
// do some work
// Using it...
%fn = ACallbackFunction;
// Using it...
One of my goals has now become optimizing the generated byte-code. I am trying to make the byte-code translator smarter so that it emits more efficient byte-code. Here are a few of the optimizations I have done so far:
Optimizing Method Calls
I slightly optimized the way methods are called by generating a more efficient OP code for passing the object that the method is being called on. Passing the object parameter now takes 1 OP code instead of 3.
Optimizing Arrays with constant indices
Array accesses with constant numeric or constant string indices are now optimized, and perform the same as a normal variable.
%var[0] = %var[1];
%var["a"] = %var["b"];
Has the same performance at run-time as
%var0 = %var1;
%vara = %varb;
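The equivalence above suggests the index can be folded into the variable name at compile time rather than concatenated at run-time. The following C++ sketch shows that idea under stated assumptions: IndexExpr and foldArrayName are hypothetical names invented for illustration, and the real byte-code translator may do this quite differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical AST fragment describing an array index expression.
struct IndexExpr {
    bool isConstant;    // true for literals such as 0 or "a"
    std::string text;   // the literal's text when constant, otherwise unused
};

// Returns the folded variable name when the index is a compile-time
// constant, or an empty string when the name must be built at run-time.
std::string foldArrayName(const std::string& base, const IndexExpr& index) {
    if (!index.isConstant) {
        return std::string();
    }
    return base + index.text;   // e.g. "%var" + "0" gives "%var0"
}
```

With a fold like this, a constant-index access can be emitted as a plain variable access, which is why it would cost the same as one.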
Increment and Decrement
Simple increments and decrements of variables used to generate around 6 OP codes, and performed the addition or subtraction as a normal add/subtract. I now have special byte-code to handle simple increments and decrements. I know I said I wasn't going to give benchmarks, but in a micro-benchmark increments got about 20% faster.
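As a rough illustration of why a dedicated instruction helps, the C++ sketch below runs the same increment two ways on a toy interpreter: once as a generic load/push/add/store sequence and once as a single fused instruction. The instruction names and the 4-instruction generic sequence are assumptions made for this example; they are not the actual TorqueScript OP codes or the roughly 6-OP-code sequence mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction set operating on a single variable slot.
enum Op { LOAD_VAR, PUSH_CONST, ADD, STORE_VAR, INC_VAR, END };

struct Result {
    int value;        // final value of the variable
    int dispatched;   // number of instructions dispatched
};

// Executes a program and counts how many instructions were dispatched.
Result execute(const std::vector<int>& code, int initial) {
    int var = initial;
    std::vector<int> stack;
    int dispatched = 0;
    for (std::size_t ip = 0; code[ip] != END; ) {
        ++dispatched;
        switch (code[ip++]) {
            case LOAD_VAR:   stack.push_back(var); break;
            case PUSH_CONST: stack.push_back(code[ip++]); break;
            case ADD: {
                int rhs = stack.back();
                stack.pop_back();
                stack.back() += rhs;
                break;
            }
            case STORE_VAR:  var = stack.back(); stack.pop_back(); break;
            case INC_VAR:    ++var; break;   // fused: no stack traffic at all
        }
    }
    return {var, dispatched};
}

// Increment expressed as a generic add.
Result runGenericIncrement(int start) {
    return execute({LOAD_VAR, PUSH_CONST, 1, ADD, STORE_VAR, END}, start);
}

// Increment expressed as one dedicated instruction.
Result runFusedIncrement(int start) {
    return execute({INC_VAR, END}, start);
}
```

Both versions produce the same value, but the fused form dispatches one instruction instead of four and never touches the stack, which is where the saving would come from.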
That's all for now. I plan on doing a couple of additional optimizations to squeeze even more performance out of TorqueScript. Then, off to other adventures...
Until next time,