# Big O Notation, Part 2

## Learning Goals

• Review Big O and the differences between time and space complexity
• Review some common Big O Notation equations while adding a couple more
• Practice a few technical challenges at a high level and calculating the Big O Notation

## Vocabulary

• `Algorithm` A set of rules and processes for solving a problem
• `Big-O Notation` A description of the worst-case performance of an algorithm

## Introduction

We’re going to be reviewing some of the Big O Notation lesson from Mod 3, and examining some additional terminology and working through some additional interview challenges to practice high-level design and Big O Analysis.

Big O Notation is one of those topics where you’ll stand out from other candidates applying for a job, especially if you can VOLUNTEER Big O Notation terminology during an interview without being prompted.

### Reviewing The Above

What is Big O and what does it describe?

• Big O notation is used in Computer Science to describe the performance or complexity of an algorithm.
• Big O specifically describes the worst-case scenario, and can be used to describe the execution time required or the space used (e.g. in memory or on disk) by an algorithm.

What is Time Complexity?

• The time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the length of the input (ie n).

What is Space Complexity?

• The space complexity of an algorithm quantifies the amount of space or memory taken by an algorithm to run as a function of the length of the input.
• This is NOT impacted by variables or “transient” data structures like array slices, but rather when you intentionally build a new data structure like an object/hash or an array.
• Generally with use a “modifier” notation like `1x` or `2x` where `1x` represents that we did not use any more memory than the data structure provided to us. `2X` would indicate that we’ve added a second data structure of some kind. Which data structure, and what it holds, is less relevant.

Which optimization should be prioritized?

• There is rarely a perfect solution as there may often between trade offs between time and space complexity. Most of the time you will want to optimize for time complexity, but it is something to confirm especially during an interview.

### Reviewing Big O Notation Equations Key Differences:

Big O Notation Equation Name Explanation Use Cases
O(1) Constant Time No matter how much data there is, it only takes one operation. - Simple math operations
- Invoking a function
- Accessing a value in an array by index position
- Accessing a property in an object/hash by its attribute/key
O(log n) Logarithmic Time Every step/action you take, you cut out half of the work. - A common example of this is a binary tree
O(n) Linear Time Complexity grows in direct proportion to the size of the input data. - Iteration over an array or the keys of an object/hash with a `for` loop, `while` loop, or methods like `map` or `forEach`
O(n log n) Linearithmic Time Often is a nested iteration where within each operation done in linear time, there are actions being done in logarithmic time over the same size of data - Efficient sorting algorithms such as `merge sort`
- Some searching algorithms
O(n^2) Quadratic Time Nested loops over the same or similarly sized dataset - Some inefficient sorting algorithms such as `bubble sort`
- Anytime you have nested loops or iteration methods

### What are the factors in determining complexity?

• Number of comparisons or operations
• Amount of space used in memory

The number of elements in a data structure are counted, and generally use “n” as an algebraic replacement for the actual number. So if we have 100 elements in an array or 1,000,000, we substitute “n” for the count of those elements when determining how much work our algorithm is doing.

Why does this matter? Take a look at the differences in the number of operations taken when dealing with the same amount of data in the chart below:

``````      | O( log n )    | O( n ) |    O( n^2 )   | O( 2^n )
------------------------------------------------------------
100   |     2	      |    100 |        10,000 | 1.27E+30
------------------------------------------------------------
1000  |     3	      |  1,000 |     1,000,000 | 1.07E+301
------------------------------------------------------------
10000 |     4	      | 10,000 |      1.00E+08 | 1.07E+3010
------------------------------------------------------------
``````

### Note

There are a couple of solutions to the problem above that have various pros and cons. Don’t worry about trying to get the perfect solution. Instead, practice breaking down the problem and calculating what the pros and cons of that solution might be.

### Solution A

``````// iterate through the array of numbers   O(n)
// for each number, iterate through the remainder of the array  O(n)
// calculate the sum to see if it is equal to the target O(1)
// if true, return array with num[i] and num[j] O(1)
// return [] O (1)

// Space Complexity: 1x
``````

One potential solution is to use nested iterations on the array and calculate the sum of the numbers to see if they equal the target. Take note that we are not trying to write out code, we just want enough to hone in on the logic. Based on that, we can calculate that the time complexity is O(n^2) because of nested iterations. However, our space complexity is 1x due to use not having to create a new data structure.

### Solution B

``````// iterate through numbers and create object/hash with each num as a key O(n)
// iterate through keys and calculate the difference (target - num) O(n)
// check if the difference is a key on the payload and that the current number is different from the difference O(1)
// if true, return [currentNum, diff]
// return []

// Time Complexity: O(n) Linear
// Space Complexity: 2x
``````

Although there are multiple iterations happening here, none of them are nested keeping the time complexity as linear, O(n). However, because we want to create an object/hash this will increase our space complexity to 2x.

# New Terminology

## O(n * m) - This doesn’t have a good name (yet)

When we talk about quantities of numbers in a data set, we use “n” to describe the count. But what if we need to do an operation on one data set which is extremely large, against another data set which is extremely small? Should we use “n” in both cases?

Let’s use “n” for our largest set, and “m” for our smallest set, and compare using the following prompt:

``````A list of teachers is responsible for grading a number of students. Generate a list of students for each teacher.

set teacher_list = ['Alex', 'Eric', 'Megan', 'Casey']
set student_list = ['Able', 'Becca', 'Carlos', 'Demarcus', 'Edwin', 'Francesca', 'Gary', ...]

set teacher_index to 0
set last_teacher_index to -1

for each "student" in student_list:
if teacher_index is not equal to last_teacher_index:
print "---\n list of students for " + teacher_list[teacher_index]
set last_teacher_index = teacher_index
set teacher_index = teacher_index + 1
print student
``````

If the number of teachers is 10, and the number of students is 100, those sizes aren’t too different. In this case, we could say the algorithm is `O(n^2)`.

However, let’s say there are only 10 teachers for 10,000 students. Now the numbers are significantly different, by several “orders of magnitude”. In this case, we would use “m” to represent our smaller data set, the teachers, and classify this algorithm as O(n * m) – which is more than Linear Time (`O(n)`), but less than Quadradic Time (`O(n^2)`)

## O(2^n) - Exponential Time

For each new item, “n”, the growth of work doubles.

The most common algorithm for this is the Fibonacci Sequence, where the next number in the sequence is the sum of the preceding two numbers. This list always starts with [0, 1] and grows to look like [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …]

This algorithm is typically built using recursion where we want to find the “N-th” number in the sequence. For example, what is the 30th number in the Fibonacci sequence? (the answer is 832,040)

``````function fibonacci(nth_number):
if n is less than 0:
print "ruh-roh, invalid input"
else if n === 0:
return 0
else if n === 1:
return 1
else:
return fibonacci(nth_number - 1) + fibonacci(nth_number - 2)
``````

In this sequence, each new “N-th” number we want to find needs TWO function calls to the `fibonacci()` method to calculate the answer.

# More Exercises

### Note

Although this could by a 10x10 array or even a 1000x1000 array, always focus on the easiest/smallest test case scenario first and then expand to larger and more complex datasets.

### Potential Solution

``````// while there are values in the matrix O(n) linear
// steal the first row
// ^^^^^^^^
// spread the result, and shift out the top array (spread) O(n) linear

// steal the right items
// ^^^^^^^^
// iterate through length of matrix O(n)
// Pop the last item off each array & push into result

// steal the bottom row
// ^^^^^^^^
// Pop the bottom array

// steal the left items
// ^^^^^^^^
// Iterate through length of matrix - 1 O(n)
// Shift the first item off each array and push into result

// return result 