Here’s some cool stuff I learned in school while finishing up my computer science degree. I took a class called CS-201, at Portland Community College.

The only bad part was that the class was only offered at the “Rock Creek Campus” which is far from where I lived. Even better was that the class was only offered at only one time on one day (6pm on Mondays). 6pm in Portland (and just about any other populated metropolitan area) happens to be rush hour. And did I mention that last term my car stereo was stolen? Having only my thoughts and the high pitched squeal from the broken drivers side window, I had plenty of time to think of this blog entry and the examples I wanted to show, so check it out.

I’m posting these examples because hopefully, I can save some poor soul the hours it took me to learn this valuable lesson.

The first rule of code optimization is kinda like fight club.

There is no code optimization

Not so much that code optimizing is a secret club, but rather that it should be something you think of while you’re coding. Not something you go back and revisit.

Optimization can consume a huge amount of time. In most cases you only increase your efficiency by one clock cycle (or even one second). Trust me, your time could be better used on other projects like writing a new program, creating a blog post, or feeding the hungry. Either way, whichever you choose to do, your time is much more valuable than a clock cycle or two.

Let me show you an assignment we had to do.

The assignment seemed really easy and straight forward, get the code supplied in class optimized to run in less than 6.7 seconds and you get 100% for the assignment.

Get the code to run at 4.6 or less, and you get 200% for the assignment.

Here’s the code:

 

#include #include

// the code you submit must have these two values #define N_TIMES 200000 #define ARRAY_SIZE 10000

int main (void) {

double *array = calloc(ARRAY_SIZE, sizeof(double)); double sum = 0; int i;

printf(“CS201 – A02 – Ben Hermann”);

for (i = 0; i < N_TIMES; i++) {

// do not change anything above this comment, // except for your name

register unsigned int j;

for (j=0; j<array_SIZE; j++) {

sum = sum + array[j];

}

// do not change anything below this comment

}

return 0; }

After about four hours I couldn’t get my code less running than 5.5 seconds. I had tried just about everything.

Here’s my exact time: real 0m5.542s user 0m5.334s sys 0m0.006s

Here’s the code I that got me from 7 seconds to 5.5 seconds:

#include #include

// the code you submit must have these two values #define N_TIMES 200000 #define ARRAY_SIZE 10000

int main (void) {

double *array = calloc(ARRAY_SIZE, sizeof(double)); double sum = 0; int i;

printf(“CS201 – A02 – Ben Hermann”);

for (i = 0; i < N_TIMES; i++) {

// do not change anything above this comment, // except for your name

register unsigned int j;

for (j=0; j<array_SIZE; j+=10) {

sum += array[j] + array[j+1] + array[j+2]; sum += array[j+3] + array[j+4] + array[j+5]; sum += array[j+6] + array[j+7] + array[j+8] + array[j+9]; }

// do not change anything below this comment

}

return 0; }

But what was I missing??

Parentheses. You see, parentheses tell the processor to run various commands in a certain order. Putting parentheses around your mathematical code allows the processor to do the math in a more efficient fashion. This is called pipelining

Here’s a real life example: If you want to wash three loads of laundry: “Whites”, “Darks”, and “Mixed older clothes”, you usually don’t wait for the first load to be done drying before you put the second load in the washer (at least sane people don’t). Regardless if the dryer is done or now, as soon as the washer is free, you dump in the next load.

Same with a processor and parentheses. by putting parentheses around the equation, you tell the processor that even if it it’s not done finishing the last process, if there’s an open process, it does not have to wait until the last process is completely finished before starting the next one.

The time for this code is: real 0m4.469s user 0m4.289s sys 0m0.007s

A huge performance difference.

Here’s the code:

#include #include

// the code you submit must have these two values #define N_TIMES 200000 #define ARRAY_SIZE 10000

int main (void) {

double *array = calloc(ARRAY_SIZE, sizeof(double)); double sum = 0; int i;

printf(“CS201 – A02 – Ben Hermann”);

for (i = 0; i < N_TIMES; i++) {

// do not change anything above this comment, // except for your name

register unsigned int j;

for (j=0; j<array_SIZE; j+=10) {

sum += array[j] + (array[j+1] + array[j+2]); sum += array[j+3] + (array[j+4] + array[j+5]); sum += array[j+6] + (array[j+7] + (array[j+8] + array[j+9]));

}

// do not change anything below this comment

}

return 0;

 

To sum up the problem I was having… I found a code improvement that brought my code to around 5.5 seconds. To get there, I used a technique called loop unrolling. By unrolling the for loop to execute several additions in the one loop program speed saw a great increase. However no matter how much I unrolled that loop, I never saw the same improvement. The issue behind loop unrolling, is that it’s performance benefits are asymptotical. If you don’t know what that means, it simply refers to the idea that if you’re approaching point B from point A, and you move forward by 50% each time you move towards point B, no matter how much brute force you use, you’ll never actually get to point B, you’ll only get very close.

Most code optimizations are asymptotical. This means that if you have a goal of 4.6 seconds, and your current optimization has improved your code from 6 to 5 seconds, then even if you recreate the same optimization 100 times, you’ll never get to 4.6, you only get close.

Well I hope this was helpful and that and I helped you learn something. For more info, here’s a site which has other tips you can use to optimize your code with. They list almost everything except for the darn parentheses. http://www.eventhelix.com/RealtimeMantra/Basics/OptimizingCAndCPPCode.htm