Big O Notation

Learning Goals

  • Discuss the challenge of comparing multiple solutions to a problem
  • Understand what Big O is and why it’s valuable for both interviews as well as apps dealing with large amounts of data
  • Discuss the differences between time and space complexity
  • Discuss some common Big O Notation equations
  • Practice calculating the Big O of a few high level solutions to a technical challenge

Vocab

  • Algorithm A set of rules and processes for solving a problem
  • Big-O Notation A description of the worst-case performance of an algorithm

WarmUp Technical Challenge

Let’s practice solving a problem at a high level. Just focus on writing out pseudo code, NOT actual code.

Problem:

  • You have an array which contains all numbers from 1 to 1 million.
  • The numbers are randomly ordered/shuffled in the array.
  • One number is in the array twice, also at some random location.
  • How might you find the duplicate value?

Note

There are a couple of solutions to the problem above that have various pros and cons. Don’t worry about trying to get the perfect solution. Instead, practice breaking down the problem and thinking about the approach you could take.

Potential Solution A

Sort/Compare

// sort numbers in ascending order
// iterate over numbers
  // look at neighbor.  If they are the same, return the value

One potential solution is to sort the array of numbers and then iterate over the numbers again, comparing each value to it’s neighbor to see if they are the same.

Potential Solution B

Nested Lookup

// iterate over array of numbers
  // iterate over the array again and compare it to the rest of the array
    // If a match is found, return the value

Another potential solution is to iterate over the array of numbers, and for each iteration, compare it to the rest of the array to see if a duplicate is found.

Potential Solution C

Hash/Obj Tracker

// iterate over the array and add each value to a hash
  // As soon as the value of a key is 2, you've found the duplicate

A third solution might include iterating over an array and adding it as a key to an object/hash, counting the number of instances in the value. As soon as a key already exists and is increased to 2, you’ve found your duplicate.

Ummm…Question?

How do we know which is the “best” answer? This is why we we can use something like Big O in order to help us understand the performance implications of each of these problems and compare each of these different solutions.

Although it is very possible that you may never get asked about Big O Notation in an interview, if you attempt to speak to it, even a little, it can make you stand out from other candidates. Also if you apply to bigger companies like Google, Apple, Twitter, etc. it is much more likely that you’ll run into more computer science topics due to the amount of data those companies need to track.

Okay, so what is Big O?

Big O notation is used in Computer Science to describe the performance or complexity of an algorithm. Big O specifically describes the worst-case scenario, and can be used to describe the execution time required or the space used (e.g. in memory or on disk) by an algorithm.

Another way of saying that is:

We’re describing the time complexity and space complexity of an algorithm.

  • The time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the length of the input (ie n).
  • The space complexity of an algorithm quantifies the amount of space or memory taken by an algorithm to run as a function of the length of the input.

Which optimization should be prioritized?

  • There is rarely a perfect solution as there may often between trade offs between time and space complexity. Most of the time you will want to optimize for time complexity, but it is something to confirm especially during an interview.

Nice to Know (NOT Need to Know)

There is also Big Ω (Omega) to describe the best performance and Big Θ (Theta) the describe the average performance. While these can be valuable pieces of information, in most real world scenarios, we are often focused on the worst-case scenarios because we often don’t have control over the data we’re receiving. (Be it data from the FE to the BE or vice versa)

An Analogy for Big-O

A good analogy is “horsepower”, a term coined by James Watt in Scotland in the 1800’s to teach a comparison of power between draft horses and steam engines (and later other types of engines and machines). In today’s society, while we may not understand the intricacies of how an automobile works, we generally know that a 200-horsepower engine is better than a 100-horsepower engine.

Likewise, with Big O Notation, we can get a sense of EFFICIENCY of an algorithm, instead of the POWER of an engine as in the previous example.

Notation for Time Complexity

The notation for time complexity is written as a math formula. "n" is typically used to signify the amount of data. If something happens more than once (such as multiple iterations), we typically count it only the first time. Some examples you might see include:

  • O(1)
  • O(log n)
  • O(n)
  • O(n log n)
  • O(n^2) (aka, n-squared)

We’ll discuss each of these in order from least complex (best performance) to most complex (worst performance).

Constant Time - O(1)

Description:

  • No matter how much data there is, it only takes one operation.

Examples:

  • Simple math operations and assignment operators
  • Invoking a function
  • Accessing an array by index position
  • Accessing a property in an object/hash by its attribute/key

Logarithmic Time - O(log n)

Description:

  • Every step/action you take, you cut out half of the work.

Examples:

  • “Binary search” using a binary tree. An example of doing this might include looking up a word in a dictionary.

Linear Time - O(n)

Description:

  • Complexity grows in direct proportion to the size of the input data.

Examples:

  • Iteration over an array or the keys of an object/hash with a for loop or while loop
  • Using array prototype methods/enumerables such as map or reduce
  • Anything that looks at, or processes, every element in a data structure.

Linearithmic Time - O(n log n)

Description:

  • Often is a nested iteration where within each operation done in linear time, there are actions being done in logarithmic time over the same size of data

Examples:

  • Efficient sorting algorithms such as merge sort
  • Some searching algorithms

Quadratic Time - O(n^2)

Description:

  • Nested loops over the same or similarly sized dataset

Examples:

  • Some inefficient sorting algorithms such as bubble sort
  • Anytime you have nested loops or iteration methods

Put em all together and what do you get?

Big-O Complexity

Key Differences:

Big O Notation Equation Name Explanation Use Cases
O(1) Constant Time No matter how much data there is, it only takes one operation. - Simple math operations
- Invoking a function
- Accessing a value in an array by index position
- Accessing a property in an object/hash by its attribute/key
O(log n) Logarithmic Time Every step/action you take, you cut out half of the work. - A common example of this is a binary tree
O(n) Linear Time Complexity grows in direct proportion to the size of the input data. - Iteration over an array or the keys of an object/hash with a for loop, while loop, or methods like map or forEach
O(n log n) Linearithmic Time Often is a nested iteration where within each operation done in linear time, there are actions being done in logarithmic time over the same size of data - Efficient sorting algorithms such as merge sort
- Some searching algorithms
O(n^2) Quadratic Time Nested loops over the same or similarly sized dataset - Some inefficient sorting algorithms such as bubble sort
- Anytime you have nested loops or iteration methods

Notation for Space Complexity

Generally you will use a “modifier” notation like 1x or 2x where 1x represents that we did not use any more memory than the data structure provided to us. 2X would indicate that we’ve added a second data structure of some kind. Which data structure, and what it holds, is less relevant.

Note: This is NOT impacted by variables or “transient” data structures like array slices, but rather when you intentionally build a new data structure like an object/hash or an array. An example of this might include duplicating an array to track a result. This would result in 2x memory usage.

Taking a look at our previous problem

Taking a look at our previous solutions, as a group let’s calculate the Big O (time and space complexity) for the first algorithm

Sort/Compare

// sort numbers in ascending order
// iterate over numbers
  // look at neighbor.  If they are the same, return the value
  • In breakout groups, try calculating the Big O for the Nested Lookup and Hash/Obj Tracker solutions.

No Peeking Until Afterwards!

Sort/Compare

// sort numbers in ascending order O(n log n)
// iterate over numbers O(n)
  // look at neighbor.  If they are the same, return the value O(1)

time: O(n log n)
space: 1x

Nested Lookup

// iterate over array of numbers O(n)
  // iterate over the array again and compare it to the rest of the array O(n)
    // If a match is found, return the value O(1)

time: O(n^2)
space: 1x

Hash/Obj Tracker

// iterate over the array and add each value to a hash
  // As soon as the value of a key is 2, you've found the duplicate

time: O(n)
space: 2x

Checks For Understanding

  • What is Big O? What is it describing?
  • Explain the differences between “time complexity” and “space complexity”.
  • Which optimization should be priotized?
  • What should you try to avoid in order to optimize for better “time complexity”? “Space complexity”?

Resources

Lesson Search Results

Showing top 10 results