Rootish Array Stack
A Rootish Array Stack is an ordered array based structure that minimizes wasted space (based on Gauss’s summation technique). A Rootish Array Stack consists of an array holding many fixed size arrays in ascending size.
A resizable array holds references to blocks (arrays of fixed size). A block’s capacity is the same as it’s index in the resizable array. Blocks don’t grow/shrink like regular Swift arrays. Instead, when their capacity is reached, a new slightly larger block is created. When a block is emptied the last block is freed. This is a great improvement on what Swift arrays do in terms of wasted space.
Here you can see how insert/remove operations would behave (very similar to how a Swift array handles such operations).
Gauss’ Summation Trick
One of the most well known legends about famous mathematician Carl Friedrich Gauss goes back to when he was in primary school. One day, Gauss’ teacher asked his class to add up all the numbers from 1 to 100, hoping that the task would take long enough for him to step out for a smoke break. The teacher was shocked when young Gauss had his hand up with the answer 5050
. So soon? The teacher suspected a cheat, but no. Gauss had found a formula to sidestep the problem of manually adding up all the numbers 1 by 1. His formula:
sum from 1...n = n * (n + 1) / 2
To understand this imagine n
blocks where x
represents 1
unit. In this example let n
be 5
:
blocks: [x] [x x] [x x x] [x x x x] [x x x x x]
# of x's: 1 2 3 4 5
Block 1
has 1 x
, block 2
as 2 x
s, block 3
has 3 x
s, etc…
If you wanted to take the sum of all the blocks from 1
to n
, you could go through and count them one by one. This is okay, but for a large sequence of blocks that could take a long time! Instead, you could arrange the blocks to look like a half pyramid:
# | blocks
--|-------------
1 | x
2 | x x
3 | x x x
4 | x x x x
5 | x x x x x
Then we mirror the half pyramid and rearrange the image so that it fits with the original half pyramid in a rectangular shape:
x o x o o o o o
x x o o x x o o o o
x x x o o o => x x x o o o
x x x x o o o o x x x x o o
x x x x x o o o o o x x x x x o
Here we have n
rows and n + 1
columns. 5 rows and 6 columns.
We can calculate the sum just as we would an area! Let’s also express the width and height in terms of n
:
area of a rectangle = height * width = n * (n + 1)
We only want to calculate the amount of x
s, not the amount of o
s. Since there’s a 1:1 ratio between x
s and o
s we can just divide our area by 2!
area of only x = n * (n + 1) / 2
Voila! A super fast way to take a sum of all the blocks! This equation is useful for deriving fast block
and inner block index
equations.
Get/Set with Speed
Next, we want to find an efficient and accurate way to access an element at a random index. For example, which block does rootishArrayStack[12]
point to? To answer this we will need more math!
Determining the inner block index
turns out to be easy. If index
is in some block
then:
inner block index = index - block * (block + 1) / 2
Determining which block
an index points to is more difficult. The number of elements up to and including the element requested is: index + 1
elements. The number of elements in blocks 0...block
is (block + 1) * (block + 2) / 2
(equation derived above). The relationship between the block
and the index
is as follows:
(block + 1) * (block + 2) / 2 >= index + 1
This can be rewritten as:
(block)^2 + (3 * block) - (2 * index) >= 0
Using the quadratic formula we get:
block = (-3 ± √(9 + 8 * index)) / 2
A negative block doesn’t make sense, so we take the positive root instead. In general, this solution is not an integer. However, going back to our inequality, we want the smallest block such that block => (-3 + √(9 + 8 * index)) / 2
. Next, we take the ceiling of the result:
block = ⌈(-3 + √(9 + 8 * index)) / 2⌉
Now we can figure out what rootishArrayStack[12]
points to! First, let’s see which block the 12
points to:
block = ⌈(-3 + √(9 + 8 * (12))) / 2⌉
block = ⌈(-3 + √105) / 2⌉
block = ⌈(-3 + (10.246950766)) / 2⌉
block = ⌈(7.246950766) / 2⌉
block = ⌈3.623475383⌉
block = 4
Next lets see which innerBlockIndex
12
points to:
inner block index = (12) - (4) * ((4) + 1) / 2
inner block index = (12) - (4) * (5) / 2
inner block index = (12) - 10
inner block index = 2
Therefore, rootishArrayStack[12]
points to the block at index 4
and at inner block index 2
.
Interesting Discovery
Using the block
equation, we can see that the number of blocks
is proportional to the square root of the number of elements: O(blocks) = O(√n).
Implementation Details
Let’s start with instance variables and struct declaration:
import Darwin
public struct RootishArrayStack<T> {
fileprivate var blocks = [Array<T?>]()
fileprivate var internalCount = 0
public init() { }
var count: Int {
return internalCount
}
...
}
The elements are of generic type T
, so data of any kind can be stored in the list. blocks
will be a resizable array to hold fixed sized arrays that take type T?
.
The reason for the fixed size arrays taking type
T?
is so that references to elements aren’t retained after they’ve been removed. Eg: if you remove the last element, the last index must be set tonil
to prevent the last element being held in memory at an inaccessible index.
internalCount
is an internal mutable counter that keeps track of the number of elements. count
is a read only variable that returns the internalCount
value. Darwin
is imported here to provide simple math functions such as ceil()
and sqrt()
.
The capacity
of the structure is simply the Gaussian summation trick:
var capacity: Int {
return blocks.count * (blocks.count + 1) / 2
}
Next, let’s look at how we would get
and set
elements:
fileprivate func block(fromIndex: Int) -> Int {
let block = Int(ceil((-3.0 + sqrt(9.0 + 8.0 * Double(index))) / 2))
return block
}
fileprivate func innerBlockIndex(fromIndex index: Int, fromBlock block: Int) -> Int {
return index - block * (block + 1) / 2
}
public subscript(index: Int) -> T {
get {
let block = self.block(fromIndex: index)
let innerBlockIndex = self.innerBlockIndex(fromIndex: index, fromBlock: block)
return blocks[block][innerBlockIndex]!
}
set(newValue) {
let block = self.block(fromIndex: index)
let innerBlockIndex = self.innerBlockIndex(fromIndex: index, fromBlock: block)
blocks[block][innerBlockIndex] = newValue
}
}
block(fromIndex:)
and innerBlockIndex(fromIndex:, fromBlock:)
are wrapping the block
and inner block index
equations we derived earlier. superscript
lets us have get
and set
access to the structure with the familiar [index:]
syntax. For both get
and set
in superscript
we use the same logic:
- determine the block that the index points to
- determine the inner block index
get
/set
the value
Next, let’s look at how we would growIfNeeded()
and shrinkIfNeeded()
.
fileprivate mutating func growIfNeeded() {
if capacity - blocks.count < count + 1 {
let newArray = [T?](repeating: nil, count: blocks.count + 1)
blocks.append(newArray)
}
}
fileprivate mutating func shrinkIfNeeded() {
if capacity + blocks.count >= count {
while blocks.count > 0 && (blocks.count - 2) * (blocks.count - 1) / 2 > count {
blocks.remove(at: blocks.count - 1)
}
}
}
If our data set grows or shrinks in size, we want our data structure to accommodate the change.
Just like a Swift array, when a capacity threshold is met we will grow
or shrink
the size of our structure. For the Rootish Array Stack we want to grow
if the second last block is full on an insert
operation, and shrink
if the two last blocks are empty.
Now to the more familiar Swift array behaviour.
public mutating func insert(element: T, atIndex index: Int) {
growIfNeeded()
internalCount += 1
var i = count - 1
while i > index {
self[i] = self[i - 1]
i -= 1
}
self[index] = element
}
public mutating func append(element: T) {
insert(element: element, atIndex: count)
}
public mutating func remove(atIndex index: Int) -> T {
let element = self[index]
for i in index..<count - 1 {
self[i] = self[i + 1]
}
internalCount -= 1
makeNil(atIndex: count)
shrinkIfNeeded()
return element
}
fileprivate mutating func makeNil(atIndex index: Int) {
let block = self.block(fromIndex: index)
let innerBlockIndex = self.innerBlockIndex(fromIndex: index, fromBlock: block)
blocks[block][innerBlockIndex] = nil
}
To insert(element:, atIndex:)
we move all elements after the index
to the right by 1. After space has been made for the element, we set the value using the subscript
convenience method.
append(element:)
is just a convenience method to insert
to the end.
To remove(atIndex:)
we move all the elements after the index
to the left by 1. After the removed value is covered by it’s proceeding value, we set the last value in the structure to nil
.
makeNil(atIndex:)
uses the same logic as our subscript
method but is used to set the root optional at a particular index to nil
(because setting it’s wrapped value to nil
is something only the user of the data structure should do).
Setting a optionals value to
nil
is different than setting it’s wrapped value tonil
. An optionals wrapped value is an embedded type within the optional reference. This means that anil
wrapped value is actually.some(.none)
whereas setting the root reference tonil
is.none
. To better understand Swift optionals I recommend checking out @SebastianBoldt’s article Swift! Optionals?.
Performance
-
An internal counter keeps track of the number of elements in the structure.
count
is executed in O(1) time. -
capacity
can be calculated using Gauss’ summation trick in an equation which takes O(1) time to execute. -
Since
subcript[index:]
uses theblock
andinner block index
equations, which can be executed in O(1) time, all get and set operations take O(1). -
Ignoring the time cost to
grow
andshrink
,insert(atIndex:)
andremove(atIndex:)
operations shift all elements right of the specified index resulting in O(n) time.
Analysis of Growing and Shrinking
The performance analysis doesn’t account for the cost to grow
and shrink
. Unlike a regular Swift array, grow
and shrink
operations don’t copy all the elements into a backing array. They only allocate or free an array proportional to the number of blocks
. The number of blocks
is proportional to the square root of the number of elements. Growing and shrinking only costs O(√n).
Wasted Space
Wasted space is how much memory with respect to the number of elements n
is unused. The Rootish Array Stack never has more than 2 empty blocks and it never has less than 1 empty block. The last two blocks are proportional to the number of blocks, which is proportional to the square root of the number of elements. The number of references needed to point to each block is the same as the number of blocks. Therefore, the amount of wasted space with respect to the number of elements is O(√n).
Written for Swift Algorithm Club by @BenEmdon
With help from OpenDataStructures.org