Binary Trees: Difference between revisions
From charlesreid1
| Line 2: | Line 2: | ||
==Goodrich Chapter 8 Trees== | ==Goodrich Chapter 8 Trees== | ||
===Definitions=== | |||
Trees - most important nonlinear data structures in computing | Trees - most important nonlinear data structures in computing | ||
| Line 12: | Line 14: | ||
* If T is non-empty, it has a special node, called the root of T, that has no parent. | * If T is non-empty, it has a special node, called the root of T, that has no parent. | ||
* Each node v of T different from the root has a unique parent node w; every node with parent w is a child of w. | * Each node v of T different from the root has a unique parent node w; every node with parent w is a child of w. | ||
===Relationships=== | |||
Node relationships: | Node relationships: | ||
| Line 19: | Line 23: | ||
* Node u is an ancestor of node v is u = v or if u is an ancestor of the parent of v. | * Node u is an ancestor of node v is u = v or if u is an ancestor of the parent of v. | ||
* The stubtree of T that is rooted at node v is the tree consisting of all descendants of v in T - including v itself. | * The stubtree of T that is rooted at node v is the tree consisting of all descendants of v in T - including v itself. | ||
===Edges Paths=== | |||
Edges and paths: | Edges and paths: | ||
| Line 27: | Line 33: | ||
Example: | Example: | ||
* Documents like books are hierarchically organized as a tree whose internal nodes are parts, chapters, sections, and leaves are paragraphs, tables, figures | * Documents like books are hierarchically organized as a tree whose internal nodes are parts, chapters, sections, and leaves are paragraphs, tables, figures | ||
===Ordered/Unordered=== | |||
Ordered vs unordered | Ordered vs unordered | ||
| Line 32: | Line 40: | ||
* Unordered tree - there is no precedence with respect to one sibling or the other | * Unordered tree - there is no precedence with respect to one sibling or the other | ||
* Examples of ordered trees occur when there are ordered operations (e.g., siblings ordered according to order of birth) | * Examples of ordered trees occur when there are ordered operations (e.g., siblings ordered according to order of birth) | ||
* Examples of | * Examples of unordered trees occur with enumerations or functionality (org charts, divisions in a company) | ||
===Implementation==== | |||
Tree abstract data types: | Tree abstract data types: | ||
| Line 43: | Line 53: | ||
* In Python, errors in arguments pointing to invalid positions should raise a ValueError | * In Python, errors in arguments pointing to invalid positions should raise a ValueError | ||
* In Java, we define the functions to be abstract and require that inheriting objects define them. [[Java/Abstract Classes]] | * In Java, we define the functions to be abstract and require that inheriting objects define them. [[Java/Abstract Classes]] | ||
* | |||
The actual implementation will differ depending on the type of data we are storing and the various requirements we might have for it (e.g., a faster algorithm may be rejected due to high memory usage and memory constraints). | |||
There are some concrete methods that can be defined, if they are independent of implementation. Some examples include: | |||
* is_root (parent is null) | |||
* is_leaf (number of children is 0) | |||
* is_empty (T if no nodes in tree) | |||
Computing depth and height: | |||
* Depth of p is the number of ancestors of p, excluding p itself | |||
* Depth is the distance (in other nodes) to the root node | |||
* The height of a node is the '''maximum''' height of each of the node's children | |||
* The height of the entire tree is the '''maximum''' depth of its leaf positions. | |||
===Algorithmic analysis=== | |||
Analyze the following two methods to obtain the height: | |||
<pre> | |||
def _height(self): | |||
return max(self.depth(p) for p in self.positions() if self.is_leaf(p)) | |||
</pre> | |||
Theis will call depth function on each tree, and the worst-case depth is n. | |||
That means we're doing | |||
<math> | |||
\sum_{i=1}^{n} j = \dfrac{n(n+1)}{2} \sim n^2 | |||
</math> | |||
Height runs in O(n^2) | |||
==Skiena Chapter 3 Data Structures== | ==Skiena Chapter 3 Data Structures== | ||
Revision as of 00:39, 7 June 2017
Notes
Goodrich Chapter 8 Trees
Definitions
Trees - most important nonlinear data structures in computing
Relationships in trees are hierarchical
Each element in a tree has a parent element and zero or more child elements
Formally, define tree T as a set of nodes storing elements such that the nodes have a parent-child relationship satisfying the following properties:
- If T is non-empty, it has a special node, called the root of T, that has no parent.
- Each node v of T different from the root has a unique parent node w; every node with parent w is a child of w.
Relationships
Node relationships:
- Sibling nodes are nodes that share a parent
- External nodes are nodes with no children - total number is n_E
- Internal nodes are nodes with 1 or more children - total number is n_I
- Node u is an ancestor of node v is u = v or if u is an ancestor of the parent of v.
- The stubtree of T that is rooted at node v is the tree consisting of all descendants of v in T - including v itself.
Edges Paths
Edges and paths:
- An edge of three T is a pair of nodes (u,v) such that u is the parent of v or vice-versa
- A path is a set of consecutive edges - or, a set of consecutive nodes connected by edges.
- A tree is ordered if there is a meaningful linear order among the children of each node.
Example:
- Documents like books are hierarchically organized as a tree whose internal nodes are parts, chapters, sections, and leaves are paragraphs, tables, figures
Ordered/Unordered
Ordered vs unordered
- Ordered tree - the order of the sibling nodes is important
- Unordered tree - there is no precedence with respect to one sibling or the other
- Examples of ordered trees occur when there are ordered operations (e.g., siblings ordered according to order of birth)
- Examples of unordered trees occur with enumerations or functionality (org charts, divisions in a company)
Implementation=
Tree abstract data types:
- Things get a bit confusing from here. Goodrich separates the concept of a "Position" from the concept of a "Node."
- The Position is a location in the tree, the Node is a four-pointer container. k
Abstract class:
- Tree is an example of an abstract class implementation
- In Python, each virtual function will raise a NotImplementedError("Must be implemented by subclass")
- In Python, errors in arguments pointing to invalid positions should raise a ValueError
- In Java, we define the functions to be abstract and require that inheriting objects define them. Java/Abstract Classes
The actual implementation will differ depending on the type of data we are storing and the various requirements we might have for it (e.g., a faster algorithm may be rejected due to high memory usage and memory constraints).
There are some concrete methods that can be defined, if they are independent of implementation. Some examples include:
- is_root (parent is null)
- is_leaf (number of children is 0)
- is_empty (T if no nodes in tree)
Computing depth and height:
- Depth of p is the number of ancestors of p, excluding p itself
- Depth is the distance (in other nodes) to the root node
- The height of a node is the maximum height of each of the node's children
- The height of the entire tree is the maximum depth of its leaf positions.
Algorithmic analysis
Analyze the following two methods to obtain the height:
def _height(self): return max(self.depth(p) for p in self.positions() if self.is_leaf(p))
Theis will call depth function on each tree, and the worst-case depth is n.
That means we're doing
$ \sum_{i=1}^{n} j = \dfrac{n(n+1)}{2} \sim n^2 $
Height runs in O(n^2)
Skiena Chapter 3 Data Structures
Question 3-5
Question 3-5) Find the overhead fraction (data space/total space) for each binary tree implementation on n nodes given the following conditions:
- All nodes store data, 2 child pointers, and 1 parent pointer. Data fields are 4 bytes, pointers are 4 bytes.
- Only leaf nodes store data; internal nodes store 2 child pointers. Data field requires 4 bytes, 2 bytes per pointer.
First case:
- Binary tree with n nodes -> n-1 edges
- Child/parent ppointers means 2x edges
- 2(n-1) edges, 2(n-1) pointers
- Alternatively, here's the analysis:
n nodes x (4 bytes of data/node) = 4n bytes data
n nodes x (12 bytes of pointers/node) = 12 n bytes
Total space is 16 n bytes, so overhead fraction is 1/4, i.e., the data space to total space ratio is 1/4
Second case:
- If we have n nodes, we have ~n/2 leaves
- n nodes total x (1 leaf node / 2 nodes) ~ n/2 lleaf nodes
n/2 empty nodes x (2 pointers/1 empty node) x (2 bytes/pointer) = 2n bytes for empty nodes with pointers
n/2 data nodes x (4 bytes/1 empty node) = 2n bytes data
The overhead fraction is thus 1/2: half the bytes are for data, half the bytes are for pointers.
Question 3-6
Question 3-6) Describe how to modify any balanced tree structure such that search, insert, delete, min, max each take the expected amount of time, O(log n), but sucessor and predecessor methods now take O(1) time each. What modifications are required, and to which methods?
The methods:
- Predecessor(D,k) - returns the predecessor key (in order) to the given key k
- Successor(D,k) - returns the successor key to the given key k
Question is, how to we implement neighbor lookup in O(1) time in a balanced tree?
Two options:
- Option one - add a previous and next node pointers. This increases the complexity of the bookkeeping, and of the add/insert/remove methods, which now have to traverse the tree to fix their links.
- Option two - make a data tree that is twice as big, and only holds data in the bottom-most leaf nodes. This obviates the need for a next and previous pointer, because the tree is easy to navigate - up and over... at most log(n) operations to traverse tree.
Option one:
We would need to modify the add() and insert() method, the delete method, no need to modify search or min or max methods.
We modify them to keep track of the previous node. Finding the previous node would be somewhat tricky logic. When we add a new node, find its next and previous nodes, and link them correctly.
Option two:
It would be algorithmically easier (but more expensive and more complicated implementation wise) to use a tree structure with only data in the leaf nodes. Finding/keeping track of next or previous node then doesn't require the extra two pointers and the extra logic of traversing a binary tree, it just requires twice the number of nodes (tree of size N has N/2 leaf nodes).
This also requires implementation of the entire data structure again, and it is not so obvious how you keep a tree structure like that balanced and dynamically resized.
Flags
| Data Structures Part of Computer Science Notes
This is the staging ground for computer science notes as part of the 2017 CS Study Plan.
Classes of data structures: Abstract Data Types Array-based and Link-based memory management: ArrayLists and Linked Lists Algorithmic Analysis of Data Structures: Algorithmic Analysis of Data Structures Advanced data structures: Advanced Data Structures
|
| Arrays Part of Computer Science Notes
Series on Data Structures Python: Arrays/Python · Arrays/Python/Sizeof · Arrays/Python/AppendCost · Arrays/Python/CaesarCipher · Arrays/Python/CompactArrays · Arrays/Python/DynamicArray Java: Arrays/Java · Arrays/Java/CaesarCipher · Arrays/Java/FisherYates · Arrays/Java/PythonList · Arrays/Java/Repeatedly_Remove Categories: Category:Python Arrays
|
| Stacks and Queues Part of Computer Science Notes
Series on Data Structures
Stacks and Queues: Python StacksQueues/Python · StacksQueues/Python/ArrayStack · StacksQueues/Python/ArrayQueue · StacksQueues/Python/ArrayDeque StacksQueues/Python/LinkedStack
Stacks and Queues: Java StacksQueues/Java · StacksQueues/Java/ArrayStack · StacksQueues/Java/ArrayQueue · StacksQueues/Java/ArrayQueueFS · StacksQueues/Java/ArrayDeque StacksQueues/Java/LinkedStack · StacksQueues/Java/LinkedQueue · StacksQueues/Java/LinkedDeque
Applications Postfix_Expressions#Stacks · StacksQueues/Subsets · StacksQueues/Subsets/Java
|
| Priority Queues and Heaps Part of Computer Science Notes
Series on Data Structures
Java: Priority Queues/Java · Priority Queues/ADT · Priority Queues/Sorted · Priority Queues/Unsorted Performance: Priority Queues/Timing and Performance Applications: Maximum Oriented Priority Queue · Priority Queues/Stack
Priority Queues/Heap · Priority Queues/Java · Priority Queues/Comparators
|
| Linked List Part of Computer Science Notes
Series on Data Structures Java: Linked Lists/Java · Linked Lists/Java/Single · Linked Lists/Java/Double · Linked Lists/Java/Circular Performance: Linked Lists/Java/Timing · Linked Lists/Java/Reverse Python: Linked Lists/Python · Linked Lists/Python/Single
|
| Trees Part of Computer Science Notes
Series on Data Structures Abstract data type: Trees/ADT Concrete implementations: Trees/LinkedTree · Trees/ArrayTree · SimpleTree
Tree Traversal Preorder traversal: Trees/Preorder Postorder traversal: Trees/Postorder In-Order traversal: Binary Trees/Inorder Breadth-First Search: BFS Breadth-First Traversal: BFT Depth-First Search: DFS Depth-First Traversal: DFT OOP Principles for Traversal: Tree Traversal/OOP · Tree Traversal/Traversal Method Template Tree operations: Trees/Operations Performance · Trees/Removal
Tree Applications Finding Minimum in Log N Time: Tree/LogN Min Search
Abstract data type: Binary Trees/ADT Concrete implementations: Binary Trees/LinkedBinTree · Binary Trees/ArrayBinTree Binary Trees/Cheat Sheet · Binary Trees/OOP · Binary Trees/Implementation Notes
|
| Search Trees Part of Computer Science Notes
Series on Data Structures
Binary Search Trees · Balanced Search Trees Trees/OOP · Search Trees/OOP · Tree Traversal/OOP · Binary Trees/Inorder
(Note that heaps are also value-sorting trees with minimums at the top. See Template:PriorityQueuesFlag and Priority Queues.)
|
| Maps and Dictionaries Part of Computer Science Notes
Series on Data Structures
Maps/Dictionaries Maps · Maps/ADT · Maps in Java · Maps/OOP · Maps/Operations and Performance Map implementations: Maps/AbstractMap · Maps/UnsortedArrayMap · Maps/SortedArrayMap Dictionary implementations: Dictionaries/LinkedDict · Dictionaries/ArrayDict
Hashes Hash Maps/OOP · Hash Maps/Operations and Performance Hash Maps/Dynamic Resizing · Hash Maps/Collision Handling with Chaining Hash functions: Hash Functions · Hash Functions/Cyclic Permutation Hash map implementations: Hash Maps/AbstractHashMap · Hash Maps/ChainedHashMap
Skip Lists · Java/ConcurrentSkipList · Java implementations: SkipList
Sets Sets · Sets/ADT · Sets in Java · Sets/OOP · Multisets
|