Welcome to the Vitamin Page.
The page is dedicated to the Vitamin project, which is a C# project with a singular purpose, understanding data structures. This is an ongoing process, time permitting and perhaps when I learn new or interesting points regarding data structures and their usage and functionality, I will update the repo and this page. As of this moment, the following topics are covered. So get comfortable with C#, it's a marvelous language.
Sorting
Before diving into searching algorithms, you must first understand sorting algorithms. These algorithms are used to place or arrange elements in a container (array, tree, etc.) in a numerical or lexical order. The simply fact is that in order to have the most efficient search, the data must be sorted. Now this is not true of every case, as in the case of associative arrays, but well get to that later. The following list encompasses the current offerings and is broken into asymptotic complexity by algorithm, currently implemented, ranging from best, worst to average. Further reading on sorting can be found here.
-
O(n log n)
- Quick Sort (implemented)
- Merge Sort (future)
- Shell Sort (future)
- Heap Sort (future)
-
O(n2)
- Bubble Sort (implemented)
- Insertion Sort (implemented)
- Selection Sort (implemented)
-
O(n)
- Bucket Sort (future)
Searching
Being a master of sorting data with only get you so far, we need to understand where an how to get to relevant data within a collection. Luckily there are two well defined searching methods, Binary and Sequential search algorithms, for sorted and unsorted data respectively.
-
O(n log n)
- Binary search(implemented): A "Divide and conquer" approach used to recursively breakdown a problem into two or more like sub-problems. (great for parallelization)
-
O(n)
- Sequential search(implemented): Known as a "Linear" search, which is an algorithm used parses each element, until the data is found. Ideally used on short lists of unsorted data.
- Sequential search(implemented): Known as a "Linear" search, which is an algorithm used parses each element, until the data is found. Ideally used on short lists of unsorted data.
Data Structures
Data structures! Final we have arrived!
-
Linked Lists: A linked list is a linear structure where each element (called nodes) are separate objects containing two items, some data and a reference to the next object (which means increased space complexity over arrays). The upside is that unlike arrays, Linked Lists are dynamic structures and can shrink or grow based on demand. Now the downside, nodes in the Linked List cannot be accessed directly. In order to find a particular value, one must start at the head node of the list and traverse each node until the value is found. There are variations to the Linked Lists: A Singly linked list, where each node contains a single reference to the next node and the last node stores null for its next reference. A Circular Linked List, where the last node does not have a null reference, instead it points to the first (head) node. A Doubly linked list, where each node contains two references, one to the next and previous node. The head nodes previous reference and the last nodes next reference are both null. The following table shows the space time complexity for both the singly and doubly linked lists:
Function Performance Insert O(1) Delete O(1) Search O(n) Space O(n)
-
Associative Arrays: Commonly referred to as a Hash-table and Dictionary, these structures are based on the concept of associative arrays, which is a associative collection of pairs of keys and values or "key/value pair". Where the key is used to access the value, much like a word in a dictionary refers to a specific definition. The idea behind Associative arrays to speed up searching. Consider an unordered array, to search we would use a sequential search to traverse each elements until we find a given value, which yields O(n). If we were to search an ordered array, we might use a binary search, which yield better results but still O(log n). The search could be sped up if we know the index if the value beforehand, called hashing and is achieved via a hash function. Hashing is the transformation of some data to a short fixed-length value (key), which gives the "index" into the array where the value is stored, hence the term key/value pair. It's important to not that performance may be impacted by the quality of the hash function, which are called collisions.
Function Performance Insert O(1) Remove O(1) Search O(1) Space O(n)
-
Tree structures: This structure is an extension of the Linked List structure, where each node maintains two references to a left and right node, and a data element. If a node has a valid reference to a left, and/or right not, it is referred to as the parent node, if no, it's called a leaf node. The left and right nodes of a parent are refereed to children and are to each other siblings. There are many variations of the tree structure. Here I'll just cover the Binary Search tree, but in many cases, the average complexity of the functionality is described below. The properties of a Binary search trees are that for any node, the left sub-tree is always equal to or less than is parent, and conversely the right sub-tree is always greater than or equal to it's parent.
Function Performance Insert O(log n) Remove O(log n) Search O(log n) Space O(n)
-
Stack: A stack is a list consisting of like data types typically implemented as an array or linked list. The stack is defined by how the data is access, which is "LIFO" the last element in is the first element out.
Function Performance Push O(1) Pop O(1) Top O(1) Search O(n)
-
Queue: A Queue is a list containing like data types, and typically implemented as an array or list
Function Performance Insert O(1) Remove O(1) Search O(1) Space O(1)
Support or Contact
Questions, Comments or concerns or just want to chat? Please feel free to reach out to me, Bryan Mackenzie (@Bamacken)!