Arrays, Stacks, Queues
Learning Goals
- Students will have practiced the basic principles of Arrays, Stacks, and Queues, as common data structures
- Students will have tried an example problem where knowledge of a specific data structure makes the work much easier
Topics
Arrays
Background
Arrays are one of the simpler data structures to introduce in programming, as they’re built into most of the most popular programming languages. Some languages call them arrays, some call them lists. Some lower-level programming languages dictate than an array can only hold data of the same type. Most high-level programming languages like JavaScript, Ruby, Python, etc, allow lists/arrays to hold any type of data.
Memory Usage
Within our computer’s RAM (random-access memory), arrays are built using consecutive blocks of memory, so a list of 100 items, in memory, are literally stored next to each other in RAM.
Implementation and Usage
Thankfully, in JavaScript and Ruby, there’s nothing we need to build, and our high-level languages give us a lot of methods to iterate or enumerate over our data to access everything one by one.
Most high level languages build us a “buffer” since we don’t typically tell our code to allocate “only 100 things”, so it might allocate memory for 50 items or 100 items even if we only have 5 things in our array. Our languages will reallocate more buffer memory as needed, and we don’t have to worry about it. Java, for example, when your array starts to fill up and run out of that buffer space, will double the full size of the array each time.
Likewise, if we have a large array and we pop
a lot of that data out of the array, our language processor may take some time to free up that memory for other uses.
Unfortunately, this buffer only happens at the end of the array, so if we unshift
something to the front of an array, the operation has to go through the following steps: (thankfully our computers are quick enough that we don’t generally notice this)
- allocate a new block of memory for the array
- add the item you’re adding with
unshift
- copy the rest of the array into the new array
Pros/Cons
Pros:
- iterating over an array is very fast since everything is stored sequentially in memory
- if you know the index position of your data, you can access it in a single operation
- Ruby and JavaScript have lots of built-in methods to iterate/enumerate
- adding or removing items to the front or back of an array is very easy to do
Cons:
- if we have to search for something, we have to iterate over the entire array; the more data we have, the longer this takes
- we cannot add things to the middle of an array
- adding/removing items to an array over and over could cause our language to reallocate more/less memory over and over, which slows things down a little
- adding an item to the front of the array will trigger a memory copy of the old array
- we can’t really control or set a hard limit on our array size
Stacks
Background
My best description for a stack is a Pringles can – the first chip they put in the can is the last one you can take out. The last chip added is the first thing you remove.
This Last-In-First-Out (LIFO) mechanism is sometimes also called First-In-Last-Out (FILO), but LIFO is more common.
Memory Usage
Stacks can be implemented as an array, but you could also utilize a Linked List where we prepend new data to the front of our Linked List, since we generally only track the “head” node of a Linked List.
A deeper implementation of a “stack” is a more complex structure of data, which is what we use in our “call stack” when our code is calling functions. Each time we call a function, a number of things are kept track of: where we left off in the code, parameters passed, whether we expect returned data, etc, and this structure of data is something we call a “stack frame”.
Implementation and Usage
A quick-and-dirty implementation of a stack can use an array where we constantly append (push
) new items to the end of the array and remove (pop
) items from the end of the array as needed.
You generally won’t ever implement a “stack frame” structure of a stack, though any object you push onto a stack could store any amount of relevant data that our application needs. This more complex structure (with the call stack) is generally given a fixed amount of reserved system memory, and if you try to put too many things into the stack, and it runs out of room, we get an error called a “stack overflow”.
While it’s valid to “peek” into a stack to see what’s stored within the stack, it’s against convention to remove data from anywhere else in the stack than the very last thing that was added.
When building a Stack, you’d typically build methods like isEmpty
/empty?
, count
, and peek
Pros/Cons
Pros:
- easy to implement with arrays, but also have flexibility of using a linked list
- only need to use
push
/pop
with arrays, or the ‘head’ node of a linked list - we’re allowed to “look” deeper into the stack, but this is uncommon
Cons:
- adding too many things to a stack at one time, or removing a large quantity of items at one time, could trigger a memory reallocation/copy
Queues
Background
I like to use a “tunnel” analogy to describe queues. The first thing in the tunnel is the first thing out: (FIFO)
Queued items could also have “priority” as part of their data payload, so I like to use an airport check-in analogy. There’s a REALLY long line of “economy class” passengers waiting to check in, but a much shorter line of “first class” or “business class” passengers. When someone shows up in the “priority” line, the airline attendant will take care of them before the next person waiting in the “economy” line.
Memory Usage
Queues can be implemented using arrays or linked lists, but arrays are more common.
Implementation and Usage
Queues are typically implemented in a “first thing in is the first thing out” mechanism. This could be done using arrays and something as simple as an unshift
to add something to the front, and using pop
to pull the first thing off the end of the array.
With a single array, it’s almost impossible to build “priority” into our queue data without breaking convention and using push
, but then you might still have competing priority levels.
A more common approach would be having multiple arrays (or linked lists), one per priority level, with a mechanism to look for higher priority items first.
Pros/Cons
Pros:
- easy to implement with arrays, but also have flexibility of using a linked list
Cons:
- constant use of
unshift
to put things at the front end of a queue causes a memory copy operation to allocate a new array - hard to implement priority levels of queue data with a single data structure