Optimizing ActionScript and your approach to optimizing

This article covers both simple and advanced techniques. Read or skim whatever suits you best.

The first thing I want to write about is how to decide when and where to optimize something. I've observed that some people care absolutely nothing about optimization as long as the program works, and some people waste far too much time optimizing things that have little to no impact.

If it's not slow, does it really need optimizing? Consider your use cases. If you have a solid understanding of the types of computers your application runs on and the way it is going to be used, and your tests run at an adequate speed, then don't bother optimizing. This may seem obvious, but be aware of your own programming habits. You can spend a practically unlimited amount of time optimizing an application. Computers keep getting faster; if it runs well at present, it will likely run better in the future.

However, for applications with a broad range of use cases, or code that will be used in many places at an unforeseen scale, you had better invest some time in optimization.

What to optimize?

  • Animation
  • Graphics
  • Server requests
  • Code

Animation / Graphics

The priority I gave these is based on my observations of what most commonly degrades the performance of an application. The most processor-intensive part of an application is the animation and graphics, hands down. Fortunately, the graphics are one of the easiest places to improve.

  • Use runtime filters sparingly. If you can turn it into a raster graphic with those filters, it will likely be a smoother animation. This will be a delicate balance between file size and performance.
  • Use motion tweens instead of shape tweens whenever possible.
  • Don't have transparent things on top of transparent things. When transparencies overlap, the processor usage multiplies.
  • Keep your frames per second at or below 30. I've seen many applications trying to run at 60+ fps. I usually set my applications at around 24.
  • Use easing only where you need it.
  • Learn about cacheAsBitmap. This can speed things up or make them slower, depending on the context. If you have an animation, keep cacheAsBitmap set to false: the cached bitmap has to be regenerated every time the animation changes, so it isn't worth it. However, on a vector graphic that doesn't change within itself, cacheAsBitmap might be a good choice.
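
As a rough sketch of the cacheAsBitmap advice above, assuming two hypothetical display objects (background and spinner are made-up names, not from any real project):

```actionscript
import flash.display.MovieClip;
import flash.display.Sprite;

// Hypothetical static artwork, for illustration only.
var background:Sprite = new Sprite();
background.graphics.beginFill(0x336699);
background.graphics.drawRoundRect(0, 0, 400, 300, 20, 20);
background.graphics.endFill();

// The vectors never change internally, so caching them as a bitmap
// may save the player from re-rendering them on every frame.
background.cacheAsBitmap = true;

// Hypothetical animated clip: its contents change from frame to frame,
// so a cached bitmap would have to be rebuilt constantly. Leave it off.
var spinner:MovieClip = new MovieClip();
spinner.cacheAsBitmap = false;
```

Whether this helps depends on the content, so it is worth measuring both settings before committing to one.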

Server requests

Whether loading external assets or querying a web service, server requests take a lot of time. You might have spent ten hours shaving off 40 milliseconds from your code, but that's like peeing in the ocean compared to your server requests.

  • Memory is cheap, but bandwidth is not. Store the data you load if it has a chance of needing to be loaded again.
  • If you are loading a dictionary, it's several times faster to load the dictionary as a whole book than it is to load each word individually. However, that would make it take ages before the user can see the first word. Find the right balance of the amount of data to be loaded per request. Some ways of consolidating your files can be merging css files together, or packing your graphics and sounds into swfs.
  • For smaller files in particular, it's faster to load several at once rather than to queue them up one at a time. But in a situation where there is a specific order to the files you're loading (like a slideshow), it might be better to use a queue.
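
One possible way to "store the data you load" is a small cache keyed by URL. The sketch below is only an assumption about how that could look: CachedTextLoader and its load method are hypothetical names, and error handling is left out for brevity.

```actionscript
package {
	import flash.events.Event;
	import flash.net.URLLoader;
	import flash.net.URLRequest;

	// Hypothetical helper, not part of the Flash API: remembers every
	// response by URL so repeat requests never touch the network.
	public class CachedTextLoader {
		private var cache:Object = {};

		// onLoaded is called with the loaded data as its only argument.
		public function load(url:String, onLoaded:Function):void {
			if (cache.hasOwnProperty(url)) {
				// Already loaded once: answer from memory.
				onLoaded(cache[url]);
				return;
			}
			var loader:URLLoader = new URLLoader();
			loader.addEventListener(Event.COMPLETE, function(e:Event):void {
				cache[url] = loader.data;
				onLoaded(cache[url]);
			});
			// A real application would also listen for IOErrorEvent.IO_ERROR.
			loader.load(new URLRequest(url));
		}
	}
}
```

The trade-off is the one described above: the cache spends memory to save bandwidth, so it suits data that is likely to be requested again.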

Code

This is what the nerds have been waiting for. The secrets to optimizing code in Flash.

In ActionScript 3, we were given two new primitive types: uint and int. There has been a lot of confusion about these primitive types, especially when it comes to speed. I have heard many people say to always use ints and uints whenever possible, and I've heard many people say to avoid them because Flash doesn't actually use them. Here's why it's so confusing: it depends on the version of Flash Player.

  • int and uint operations are always slower when combined with, compared against, or otherwise mixed with another type. Mixing forces a cast, and performance suffers. For example, the code var x:Number = 10.0; var y:int = 9; if (y > x) {} will cause y to be cast to a Number before the comparison, making it much faster to have declared y as a Number instead.
  • Incrementing, decrementing, adding, and subtracting ints and uints is faster than doing the same with Numbers, regardless of player version.
  • Dividing ints and uints is always slower than dividing Numbers.
  • Multiplying ints and uints is faster than multiplying Numbers in 9.0.115 and higher.
  • uint division is always slower than int division.
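
To make the first bullet concrete, here is a minimal sketch of mixed versus matching types. The variable names are made up for illustration, and no timings are claimed; the numbers will vary by player version.

```actionscript
var limit:Number = 10.0;

// Mixed types: the int has to be converted for every comparison.
var mixed:int = 9;
if (mixed > limit) { /* ... */ }

// Same type on both sides: no conversion is needed.
var same:Number = 9;
if (same > limit) { /* ... */ }

// A loop counter only gets incremented and compared against another
// int, so this is a place where int fits well.
var count:int = 1000;
for (var i:int = 0; i < count; i++) { /* ... */ }
```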

Primitive type speed test chart

Here's a graph of Number, int, uint increment, addition, multiplication, and division. One thing I'm very pleased to see from these tests is that Adobe is going in the right direction. It's pretty appalling that their int and uint arithmetic is so terrible in earlier versions, but at least they're fixing it.

And now for something much more important. Class, function, and variable look-up times.

Everywhere I look, in 3d engines, in physics engines, anywhere optimization is important, I see a critical mistake. The optimized math code is pulled out into a utility class. One of the slowest things Flash does is package and class look-ups. Consider that you have a class com.example.FastMath. (I've seen this a lot.) And you put a static constant PI2 on the class so that instead of the expensive operation Math.PI * 2, you have the "faster" use of FastMath.PI2.

Now consider what is actually going on with that static constant look-up. You can essentially think of it like this: hashmap(com).hashmap(example).hashmap(FastMath).PI2 If those packages have a lot of classes, that can be a ridiculously slow operation. What makes it worse is that even after Flash has found that package, that class, and that constant, it has to look them up again on every single iteration unless you fix the bytecode with a disassembler.

package {
	import flash.display.MovieClip;
	import flash.utils.getTimer;

	// The utility class described above, assumed to define a static PI2 constant.
	import com.example.FastMath;

	public class OptimizationTest extends MovieClip {
		private const PI2:Number = Math.PI * 2;

		public function OptimizationTest() {
			var i:uint;
			var start:int;
			var result:Number;

			// Case 1: Normal case
			start = getTimer();
			for (i = 0; i < 10000000; i++) {
				result = Math.PI * 2;
			}
			trace("Time elapsed: " + (getTimer() - start)); // 628

			// Case 2: Frequently optimized case
			start = getTimer();
			for (i = 0; i < 10000000; i++) {
				result = FastMath.PI2;
			}
			trace("Time elapsed: " + (getTimer() - start)); // 890

			// Case 3: Correctly optimized case (using a local constant)
			start = getTimer();
			for (i = 0; i < 10000000; i++) {
				result = PI2;
			}
			trace("Time elapsed: " + (getTimer() - start)); // 121
		}
	}
}

Case 1 is what people would do without considering any optimization. Congratulations, you're 30% faster than most people who try to optimize their code. In case 2, even though you're getting rid of an expensive math calculation by precomputing it, you more than lose the benefit to the expensive class look-up. Note: the Math class itself is an intrinsic class and has a faster look-up time than any class you might make, but it is still very slow.

With case 3, the constant is stored locally, which removes the need for the look-up. Oh, and another thing I should mention: the actual math operation is so trivial compared to the variable look-ups that there's absolutely no reason to ever store a 2 * Math.PI variable. In a lot of cases, the compiler will consolidate this math anyway.

Now that I've made some people cry, how can we fix this? We don't want to have to copy our code everywhere to avoid a look-up; that would be ugly. One way is to create your utility methods and use include to pull them in as local methods. I don't like that way one bit, but it gets the job done if speed is your priority. The other way is to make your utility methods not static. So instead of FastMath.fastSin(x), do var fm:FastMath = new FastMath(); fm.fastSin(x);. This will be just as slow the first time you do it, but if you have a loop that uses fastSin repeatedly, it will only have to do the look-up once.
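
The instance approach could look roughly like the sketch below. Everything in it is an assumption made for illustration: InstanceLookupExample is a made-up class, and this FastMath is a stand-in whose fastSin simply wraps Math.sin rather than a real approximation.

```actionscript
package {
	import flash.display.Sprite;

	public class InstanceLookupExample extends Sprite {
		public function InstanceLookupExample() {
			// The class look-up happens here, once, outside the loop.
			var fm:FastMath = new FastMath();
			var total:Number = 0;

			for (var i:int = 0; i < 1000000; i++) {
				// Inside the loop only the instance is used.
				total += fm.fastSin(i * 0.001);
			}
			trace(total);
		}
	}
}

// Hypothetical stand-in for the utility class, declared as a helper
// class in the same file. The method is an instance method, not static.
class FastMath {
	public function fastSin(x:Number):Number {
		return Math.sin(x);
	}
}
```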

In my machine architecture class, I wrote an optimized log2 method that will find the log base 2 of x much faster than using the alternative: (Math.log(x) / Math.LN2). We're going to see how it compares when used as a static utility method, an inline method, and as an instance method. But before we do that, let's take a look at the optimized log2 itself.

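// Integer log base 2 for x > 0: finds the index of the highest set bit
// with a five-step binary search over the bit positions.
public function log2(x:int):int {
	// Step 1: is anything set in the upper 16 bits?
	var num:int = x >> 16;
	var sign:int = int(!num);        // 1 if nothing is set there, else 0
	var ans:int = (sign << 4) ^ 24;  // next shift to try: 8 or 24

	// Steps 2 to 4: narrow the candidate position by 8, 4, then 2 bits.
	num = x >> (ans);
	sign = int(!num);
	ans = (sign << 3) ^ (ans + 4);

	num = x >> (ans);
	sign = int(!num);
	ans = (sign << 2) ^ (ans + 2);

	num = x >> (ans);
	sign = int(!num);
	ans = (sign << 1) ^ (ans + 1);

	// Step 5: settle the last bit.
	num = x >> (ans);
	sign = int(!num);
	ans = sign ^ ans;

	return ans;
}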

Unlike Math.log, this method will only work with integers. It uses binary manipulation to calculate the result almost ten times faster than its floating point counterpart. If someone wants an in-depth explanation of how this algorithm works, feel free to contact me, but for now, it's beyond the scope of this article.
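
For readers who want to see what the method computes without the bit tricks, here is a plain reference version. It is not the author's code: log2Simple is a hypothetical name, and the loop is a straightforward substitute that should return the same value for positive integers, only more slowly.

```actionscript
// Reference version: counts how many times x can be shifted right
// while the result stays above zero. For positive ints this is the
// index of the highest set bit, i.e. the floor of log base 2.
function log2Simple(x:int):int {
	var ans:int = 0;
	while ((x >>= 1) > 0) {
		ans++;
	}
	return ans;
}

trace(log2Simple(1));     // 0
trace(log2Simple(5));     // 2
trace(log2Simple(65536)); // 16
```

The optimized version replaces this loop of up to 30 iterations with five fixed steps and no branching.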

Now running this method through a 10 million iteration loop, we get some interesting results.

The first result is using Number, the second result is using int.

  • As a static method (e.g. MathUtils.log2(x)): 2754 / 2055
  • As an instance (e.g. var math:MathUtils = new MathUtils(); math.log2(x)): 1971 / 1339
  • Inline code, no method reference: 811 / 296

Wow! So avoiding a method reference altogether and keeping the math inside your loop is approximately 7 times faster than a static look-up when using int (2055 vs. 296), and more than 3 times faster when using Number (2754 vs. 811). Same exact code; all that matters is where you put it. Instantiating your utility class first will be much faster over a huge loop, but makes no difference for smaller methods.

If anybody knows of a way to make method look-ups happen only once within a loop, let me know! If you make a reference to the method, it makes things even slower. Package-level methods aren't any faster either. (Like the flash.utils.* methods.)

Here's another fantastic optimization article I learned a lot from: http://osflash.org/as3_speed_optimizations. A lot of the tips are good, though a couple of things aren't very important or are outdated in the latest version of Flash. To comment on a few of the bullet points:

  • Do not use Objects, if you know which properties will be finally involved. Right on! If you don't strongly type things, Flash will treat all your objects as HashMaps. This is very slow, so when you iterate over an array, make sure you cast the array element before using it. This will improve performance and readability greatly.**
  • Bitoperators are lightning fast. Yes, but don't obfuscate your code in places it's not vitally important. If something is hard to read, it will soon cease to be improved. It's like if you wrote perfect code in ASM. Sure it'll be awesome for a while, but inevitably changes will need to be made, and your unreadable code will be obsolete. Meanwhile, the inefficient, but easy-to-read code your competitor wrote is now just as fast as yours because computers have become so much faster.
  • Wishlist: Typed, length-fixed Arrays. Woohoo for Vectors in Flash 10!

** the common way to iterate over an array:

function calculateSum1(arr:Array):Number {
	var sum:Number = 0;
	for (var i:Number = 0; i < arr.length; i++) {
		sum += arr[i];
	}
	return sum;
}

There are several things wrong with that code:

1. Arrays are mutable, so when you use i < arr.length as your loop condition, the length of the array has to be looked up again on every iteration. Yuck! Instead, store the length in a local variable before the loop. (This is even more important when it comes to Strings.)
2. The array element isn't being properly cast, so it's being treated like an Object, and what's actually happening is this: sum += Number(Object(arr[i]).toString()); That's no good, jack.

Here's the best way to do it:

function calculateSum2(arr:Array):Number {
	var sum:Number = 0;
	for each(var val:Number in arr) {
		sum += val;
	}
	return sum;
}

Testing the performance, even I was amazed by the difference. Given a 10 million iteration loop, calculateSum1 completed in 4213ms and calculateSum2 completed in 397ms! Over ten times faster.
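
When the index itself is needed, the two fixes described above (a cached length and a typed element) can be applied to an ordinary for loop. The sketch below is one way to do that. The function names are hypothetical and no timings are claimed for them; the second function shows the same loop over a Flash 10 Vector, where the element type is declared up front.

```actionscript
// Indexed loop with the length read once and the element stored
// in a typed local before it is used.
function calculateSumIndexed(arr:Array):Number {
	var sum:Number = 0;
	var n:int = arr.length;
	for (var i:int = 0; i < n; i++) {
		var val:Number = arr[i];
		sum += val;
	}
	return sum;
}

// Same idea with a typed Vector (Flash Player 10 and later).
function calculateSumVector(values:Vector.<Number>):Number {
	var sum:Number = 0;
	for each (var val:Number in values) {
		sum += val;
	}
	return sum;
}
```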

Thanks for reading. If this was helpful to you, link to it! I'll try to keep it coming.

Look at this post for additional and even faster methods to compute log2 in AS3:
http://guihaire.com/code/?p=414

Some parts of this article are a little bit out of date. The Flash Player now (finally) caches method look-ups, so a loop that calls the same method repeatedly will not have the same overhead as it did at the time of this article.

Cool and useful article! I did not know that checking array values too much was the thing slowing down my project!! Now the slowdown is gone and my knowledge has grown. Thank you a lot!

Does this apply to FP 10.1?

I've seen optimizers in the wild that are supposed to give you the same advantages, such as the memory read and write opcodes, that C libraries receive when compiled with Alchemy.

Yes, there are good optimizers out there. I haven't done any real testing on any of them though. Die Apparat is a popular one.

Really useful article. But I also wonder: what about garbage collection in Flash?

Re:

Wow, this is one of the best posts for AS3 optimizations that I have found, such great attention to version. I got a couple more tips and tests to add of my own http://www.stephencalenderblog.com/?p=7

Best of Luck,

Stephen

:)

Thanks.
Hehe, on your site:
"I had a professor that claimed ++x was faster than x++, and he appears to be correct, but not enough to significantly matter."
Ha! You should have said to him, "Oh noes! My nanoseconds!"

Yeah, with optimization for web applications, with the exception of intense things like 3d or physics engines, math optimization doesn't matter in the slightest. Focus on network, garbage collection, and then display optimization.

In a 'Config' class, I have only "public const" constants.
I instantiate this class once at the beginning of my main class and pass the instance as a parameter to the other classes that use these constants.
Is this ok for speed optimization? Or should I create an "include" file that contains only the declarations of the constants without any class definition and then use include "myfile.as" in each class where I need those constants?

I suppose the use of static constants is excluded.

Another question: is it ok to pass more than 4-5 parameters by reference (not by value) to a class constructor?

Please reply.
Thanks,
Marius.

re:

Hi Marius,

What you said is good for speed optimization. You mentioned 3 options: 1. A config class with static constants. 2. A config class with non-static constants that you would instantiate in every class where you need it. 3. A config file to be included in each class that needs those constants.

1. Slowest, but cleanest. Do this in most situations, but not for things like 3d engines, physics engines, and the like.
2. You have one major lookup when instantiating the class, but any loops you have will be very fast because the class lookup has already happened.
3. This is the fastest, but it might also be the most difficult to manage. I suppose it depends on whether your IDE understands the include or not.

For your other question, whether it's ok to pass 4-5 references to your constructor.
The rule of thumb I use for constructors is I will have every mandatory variable for that class in the constructor. If it's an optional config parameter, I keep it as a getter/setter, but if the class wouldn't work without it, I have it in the constructor, no matter how many it ends up being. Following that rule, I've never made a class with more than five constructor arguments.

-Nick

I told you this in email, but this article rocks and I'm glad it exists. :)

I wrote an article (actually, created a blog and then wrote an article) advocating the use of more bitwise operators, just to spite you. ;)

http://petosky.net/2008/08/21/making-friends-with-binary-representation

I'd be glad to have your input.

Hey.

nbilyk, nice to meet you, soldier. :-|

Thanks for the article.

Now, I will go back through some of my code that has the most looping in it, and start replacing some of my "public static const" (from external classes) with local static consts.

BTW, I did not cry. :-)

__________
-Zir0 out.

Thank you so much, it's a really very helpful article.

Great article for optimization! I did not know that Numbers are faster when it comes to division.