
Computer Science 2: The Big O

CS1 was easy. CS2 was easy until the midterm, when I realized I was in trouble. I bombed it: I was unprepared, and Dr. Hasz pushed the attitude adjustment button for the whole class. Getting 67/110 was a real setback; I blew 2 of the 5 questions entirely. My primary failure was not keeping up with the book. So when the first question asked:

Consider the language that the following grammar defines:
<S> = <L> | <D><S><S>
<L> = A | B
<D> = 1 | 2

Write all three-character strings that are in this language
Write one string in this language that contains more than 3 characters

I had absolutely no idea what to do with this. I should have read the section on grammars in the book, since Dr. Hasz didn’t cover it in lecture.
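Looking at it now, the answer falls out of the grammar. Here is a minimal sketch (my own after-the-fact Java, not anything from the class) that expands <S> recursively and prints every string in the language up to a given length:

import java.util.ArrayList;
import java.util.List;

// Enumerates the language of  <S> = <L> | <D><S><S>,  <L> = A|B,  <D> = 1|2
// by expanding <S> recursively, keeping only strings up to maxLen characters.
public class GrammarStrings {

    static List<String> expandS(int maxLen) {
        List<String> out = new ArrayList<>();
        if (maxLen >= 1) {                       // <S> -> <L> gives "A" and "B"
            out.add("A");
            out.add("B");
        }
        if (maxLen >= 3) {                       // <S> -> <D><S><S>
            for (String d : new String[] {"1", "2"}) {
                for (String s1 : expandS(maxLen - 2)) {
                    for (String s2 : expandS(maxLen - 1 - s1.length())) {
                        out.add(d + s1 + s2);
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : expandS(5)) {
            System.out.println(s);
        }
    }
}

Running it answers both parts of the question: the three-character strings are 1AA, 1AB, 1BA, 1BB, 2AA, 2AB, 2BA, and 2BB, and a string like 1A1AA covers the more-than-three-characters part.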

The other mistake I made was not paying attention as I rushed to complete the test.  The question:

Show the intermediate stack contents from infix to postfix for:
(a*b-c)/d+(e-f)

I failed to place the operators on the stack.
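Done properly, the conversion is the standard shunting-yard algorithm. Here is a minimal sketch (reconstructed now, not taken from the test) that prints the operator stack after each token, so the intermediate contents are visible:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Infix-to-postfix conversion for single-letter operands and + - * /.
// Operands go straight to the output; the stack holds operators and '('.
public class InfixToPostfix {
    static final Map<Character, Integer> PREC = Map.of('+', 1, '-', 1, '*', 2, '/', 2);

    public static void main(String[] args) {
        String infix = "(a*b-c)/d+(e-f)";
        Deque<Character> stack = new ArrayDeque<>();
        StringBuilder postfix = new StringBuilder();
        for (char t : infix.toCharArray()) {
            if (Character.isLetter(t)) {
                postfix.append(t);                        // operand: straight to output
            } else if (t == '(') {
                stack.push(t);
            } else if (t == ')') {
                while (stack.peek() != '(') postfix.append(stack.pop());
                stack.pop();                              // discard the '('
            } else {                                      // operator
                while (!stack.isEmpty() && stack.peek() != '('
                        && PREC.get(stack.peek()) >= PREC.get(t)) {
                    postfix.append(stack.pop());
                }
                stack.push(t);
            }
            System.out.printf("token %c  stack %s  output %s%n", t, stack, postfix);
        }
        while (!stack.isEmpty()) postfix.append(stack.pop());
        System.out.println("postfix: " + postfix);        // prints: ab*c-d/ef-+
    }
}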

Dr. Hasz loved to give really hard tests. No book, no Internet, just my brain, a pencil, and a pile of paper. The same midterm also had us pseudocode a Binary Search algorithm, implementations of Stack and Queue ADTs, and a pile of other things.

I had to really knuckle down, and by the second midterm I pulled out a 98/110 and finished the course with a B.

The BIG-O

Mathematical induction, recursion, and efficiency. Does that algorithm operate on the order of n, log(n), n log(n), n², or 2ⁿ? How efficient is it? What we learned is that really elegant recursive solutions to a problem are often really inefficient, and that an iterative solution can be better: faster and less costly in computational resources. CS2 was only an introduction to mathematical induction; Dr. Hasz talked about it just enough to get us started looking at how efficient an algorithm is.
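The classic example of that (mine, not one of the class assignments) is Fibonacci: the recursive version is elegant and O(2ⁿ), while the iterative version does the same job in O(n):

// Elegant recursion versus plain iteration on the same problem.
public class FibCompare {
    // Elegant but O(2^n): each call spawns two more calls.
    static long fibRecursive(int n) {
        return n < 2 ? n : fibRecursive(n - 1) + fibRecursive(n - 2);
    }

    // Iterative and O(n): one pass, two running values.
    static long fibIterative(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return a;
    }

    public static void main(String[] args) {
        int n = 40;
        long t0 = System.nanoTime();
        long r = fibRecursive(n);
        long t1 = System.nanoTime();
        long i = fibIterative(n);
        long t2 = System.nanoTime();
        System.out.printf("recursive fib(%d)=%d in %d ms%n", n, r, (t1 - t0) / 1_000_000);
        System.out.printf("iterative fib(%d)=%d in %d ms%n", n, i, (t2 - t1) / 1_000_000);
    }
}

On any machine the recursive version takes noticeable seconds at n = 40 while the iterative version is effectively instant.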

An algorithm requires time proportional to f(n) to run; f(n) is its growth-rate function. Big O notation gives the order of magnitude of f(n), written O(f(n)). If a problem of size n requires time directly proportional to n to solve, it is O(n). If it requires time proportional to n², it is O(n²).
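For the record, the standard formal definition (not Dr. Hasz’s exact wording, which I no longer remember): f(n) is O(g(n)) if there are constants c and n₀ such that f(n) ≤ c · g(n) for all n ≥ n₀. That constant c is why multipliers get thrown away later on.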

Using sorting algorithms, we compared the time each algorithm took to sort primitive data types and Comparable objects, with arrays of up to a million random values.

[Chart: Sort Algorithm Times]

At the time I was running this on a 233 MHz Pentium II computer using Java 1.4. I had to estimate the sorting time of a million objects, as the run would never complete on that slow machine. I decided to rerun the program on a 2.5 GHz Core i7 computer. It handled the million-object arrays nicely.

[Chart: Updated Sort Algorithm Times]

The big takeaway for me was that even though an algorithm is supposed to behave a certain way, it really does make a difference how the algorithm is written and what the data type is. Comparable objects always ran slower than primitive data types. Most of the sort algorithms are O(n²). Radix is the one exception: it doesn’t do comparisons, and because of that it is O(n). My graphs are on a logarithmic scale, so the n² growth isn’t obvious. Here is the O(f(n)) for the sorts graphed on the same scale for comparison.

[Chart: BIG-O Times]
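To make the radix point concrete, here is a minimal sketch (my Java reconstruction, not the class assignment) of an LSD radix sort on non-negative ints. Note that it never compares two elements; it just distributes them by digit, which is why its growth is O(d·n), effectively O(n) for fixed-width keys:

import java.util.Arrays;

// LSD radix sort, base 10, non-negative ints only.
public class RadixSort {
    static void radixSort(int[] a) {
        int max = Arrays.stream(a).max().orElse(0);
        for (int exp = 1; max / exp > 0; exp *= 10) {
            int[] out = new int[a.length];
            int[] count = new int[10];
            for (int v : a) count[(v / exp) % 10]++;               // histogram of this digit
            for (int d = 1; d < 10; d++) count[d] += count[d - 1]; // prefix sums = end positions
            for (int i = a.length - 1; i >= 0; i--)                // stable placement
                out[--count[(a[i] / exp) % 10]] = a[i];
            System.arraycopy(out, 0, a, 0, a.length);
        }
    }

    public static void main(String[] args) {
        int[] a = {170, 45, 75, 90, 802, 24, 2, 66};
        radixSort(a);
        System.out.println(Arrays.toString(a)); // [2, 24, 45, 66, 75, 90, 170, 802]
    }
}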

The analysis of an algorithm’s efficiency really has to do with understanding where the costs are, and that is language and machine specific. Are you sorting references to objects, or are you actually moving the objects around in memory? Java uses a reference to the object in memory. As you reorder an array, Java only moves the references, which is far less costly than moving the actual objects. Setting that difference aside, the first step in calculating big O is to count the move, exchange, and compare operations.
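A tiny illustration of the reference point (my example, not from the course): swapping two slots in a Java array only reassigns references; the objects themselves never move.

// Swapping array slots copies references, not the objects they point to.
public class ReferenceSwap {
    public static void main(String[] args) {
        String[] words = {"alpha", "beta"};
        String temp = words[0];   // copies a reference, not the string's characters
        words[0] = words[1];
        words[1] = temp;
        System.out.println(words[0] + " " + words[1]); // beta alpha
    }
}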

The selection sort loops through an array of Comparables n-1 times. Each pass compares its way through the unsorted items looking for the largest, puts it in the last position of the array, then loops again looking for the next largest and places it in the n-1 position. The loop continues until n = 1 and that item is in its proper position.

Mathematically this is expressed as (n-1) + (n-2) + … + 1 = n * (n-1) / 2 comparisons.

At the end of each pass an exchange occurs that requires three assignments. The last element in the array is moved to a temporary spot (move 1). The largest item is moved to the last position (move 2). The temporary item is moved to where the largest was (move 3).

3 * (n-1) data moves

Putting these together, the selection sort takes n * (n-1) / 2 + 3 * (n-1) = n²/2 + 5n/2 − 3 operations.

The properties of growth-rate functions tell us the low-order terms can be ignored, leaving n²/2. The multiplier (1/2) also has no real impact as n reaches really large numbers. Half of infinity is still infinity, so remove the multiplier. That leaves the order of magnitude for the selection sort at O(n²).
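To check the arithmetic, here is a minimal sketch (my own, not Assignment 7 itself) of a selection sort instrumented to count comparisons and moves. For n = 1000 it prints 499500 comparisons and 2997 moves, matching n*(n-1)/2 and 3*(n-1):

// Textbook selection sort with comparison and move counters.
public class SelectionSortCount {
    static long comparisons = 0, moves = 0;

    static void selectionSort(int[] a) {
        for (int last = a.length - 1; last > 0; last--) {
            int largest = 0;
            for (int i = 1; i <= last; i++) {
                comparisons++;                  // one comparison per scanned item
                if (a[i] > a[largest]) largest = i;
            }
            int temp = a[last];                 // move 1
            a[last] = a[largest];               // move 2
            a[largest] = temp;                  // move 3
            moves += 3;
        }
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] a = new int[n];
        java.util.Random rand = new java.util.Random(42);
        for (int i = 0; i < n; i++) a[i] = rand.nextInt();
        selectionSort(a);
        System.out.printf("n=%d  comparisons=%d (n*(n-1)/2=%d)  moves=%d (3*(n-1)=%d)%n",
                n, comparisons, (long) n * (n - 1) / 2, moves, 3L * (n - 1));
    }
}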

The sorts program was Assignment 7.

I repeated this over and over and over. Big O was one of Dr. Hasz’s favorite things to do to us; we evaluated every algorithm we looked at during the class. Looking back, we covered an amazing number of concepts in this short class: linked lists, stacks, queues, trees, graphs, and hash tables. All of it was done in Java. We learned more about data abstraction, class relationships, external methods, and recursion, and always came back to algorithm efficiency.