This file briefly describes each type of heap, and how it has been implemented in the supplied C source files. For detailed information refer to the source files, or the references given at the end of this file. ------------------------------------------------------------------------------- The Binary Heap Files: bheap.c, bheap.h The binary heap is a heap ordered binary tree. A binary tree allows each node in the tree to have two children. Each node has a value associated with it, called its key. The term `heap ordered' means that no child in the tree has a key greater than the key of its parent. By maintaining heap order in the tree, the root node has the smallest key. Because the heap has simple access to the minimum node, the find_min() operation to takes O(1) time. Heaps support the operations insert(), decrease_key()/increase_key(), delete(), and delete_min(). When these operations occur, the heap needs to be rearranged, shifting nodes to maintain heap order or to replace deleted nodes. When inserting a node, it is placed at the bottom of the tree. If the inserted node has a key greater than that of its parent, the node is swapped with its parent. This process, called sift-down continues until heap order is satisfied. When deleting a node, we replace the deleted node with the node at the bottom of the tree. If it is smaller than its parent, sift-down occurs. Otherwise, if it is greater than the minimum of its two children, the opposite process called sift-up occurs. The sift-up process swaps the node with its smallest child, and continues until heap order is satisfied. After a decrease_key() operation sift-down is performed if necessary. After an increase_key() operation sift-up is is performed if necessary. Because of the binary heaps simplicity, it is possible to maintain it using a one dimensional arrays. This method has been used in the source file bheap.c. The root node is located at position 1 in the array. The first child of the root is located at position 2 and the second child at position 3. In general, the children of the node at position i are located at 2*i and 2*i + 1. So the children of the node at position 3 in the array are located at positions 6 and 7. Similarly, the parent of the node at position i is located at i div 2. Note that in file bheap.c array entry 0 is unused. The alternative way to implement binary heaps is to define a structure type for a node, and use pointer fields to parent and child nodes in the heap. ------------------------------------------------------------------------------- The Fibonacci Heap Files: fheap.c, fheap.h In this description, when comparing nodes we are actually referring to its key. The Fibonacci heap invented by Fredman and Tarjan consists of a collection of several trees of different ranks. The rank of a node in a Fibonacci heap is defined as the number of children it has, and the rank of a tree is defined as the rank of the root node. The trees are created through a linking process. A rank 0 tree consists of a single node. A rank 1 tree is created by comparing the root nodes of two rank 0 trees and linking them so that the tree with the smaller root node becomes the parent of the tree with the larger root node. This is done by adding the smaller root node to the children of the other trees root node, thus increasing the rank of the other tree. In general a rank i tree is created through the linking of two rank i-1 trees. By comparing the keys of root nodes, trees created trough linking are heap ordered. Because linking joins whole trees, Fibonacci heaps are implemented by defining a structure type for nodes and using pointers between nodes. In the file fheap.h, the Fibonacci heap node type has the pointers parent, child, left, and right. A nodes children are maintained in a circular doubly linked list using the left and right sibling pointers. The child pointer of a node points to one child in its list of children. The parent pointer points to a nodes parent. For root nodes, the parent pointer is NULL. For the Fibonacci heap implementation in fheap.c, we allow at most one tree of each rank. This is done by maintaining an array of pointers to Fibonacci heap trees, where entry i points to the root node of a tree of rank i. When inserting a node, we attempt insertion at entry 0. If there is already a rank 0 tree there we link the new node with it and attempt to insert the resulting rank 1 tree at entry 1. Linking continues until a tree is inserted at an empty entry. For the delete_min() operation we must scan the root nodes of all trees in the heap to determine which one is the minimum node. After deleting the minimum node, its children, which make up a collection of trees, are merged back into the heap, similar to inserting a node into the heap, except we merge at the appropriate rank array entry for each tree. Each tree in the collection of child trees after a delete_min is easily accessed using the left and right sibling pointers. After a decrease_key() operation on node x, we break the link between x and its parent, p, then merge the resulting tree, with root node x, back into the array of trees. There is one further detail with decrease_key(): We allow a node to loose at most one of its children to decrease_key() operations. When a node looses a child, we mark it. Each node has a boolean field, `marked', which is TRUE if it has lost a child, and FALSE otherwise. Upon loosing a second child, we must also break the link between p and its parent and merge the tree rooted at p back into the heap. This process is called a cascading cut, since it may propagate further up the tree if p's parent has already lost a child, and so forth. For the Fibonacci heap implementation in file fheap.c, the marked field of a root node is true, and is set to false after the root node becomes non-root through linking. In fheap.c, breaking a link between node x and its parent node, p, also requires removing x from the linked list of p's children. The delete() operation is not implemented in fheap.c but is similar to a combination delete_min() and decrease_key() processes. ------------------------------------------------------------------------------- The 2-3 Heap Files: ttheap.c, ttheap.h The 2-3 heap is similar to the Fibonacci heap, but unlike the Fibonacci heap, the 2-3 heap has explicit constraints on the structure of trees in the heap. Trees in the 2-3 heap are made up of trunks, formed by linking nodes in a chain. Trunks can be either 2 or 3 nodes in length. In this description, length is defined as the number of nodes in the trunk. Consider the recursive definition: T(0) = a single node. T(i) = T_1(i-1) . ... . T_m(i-1) (m is either 2 or 3) We link trees T_1 to T_m in a chain by their root nodes. We call T(i) a tree of dimension i and the chain of nodes formed a trunk of dimension i. Main trunks are formed by linking 0, 1 or 2 trees of the same dimension. The 2-3 heap allows one main trunk for each tree dimension. This collection is maintained using an array of pointers to the head node of each main trunk. Entry i points to a main trunk of dimension i trees. If there is no main trunk for a particular dimension (i.e. no trees), then the pointer will be NULL. It is necessary to maintain the dimension of a node. The dimension of a node, as implemented in the file ttheap.c, is defined as the dimension of the highest dimension non-main trunk the node lies on. This means that array entry 0 points to a main trunk containing dimension 0 nodes. For implementing 2-3 trees we can use a similar node structure type to that of the Fibonacci heap. A nodes child pointer points to the highest dimension child. A circular doubly linked list of children is formed by the left and right sibling pointers of nodes. The left pointer points to the next lowest dimension sibling, and the right pointer points to the next highest dimension sibling. Because it is circular, the right pointer of the highest dimension child points to the dimension 0 child, and the left pointer of the dimension 0 child points to the highest dimension child. Every node has a parent pointer, except for root nodes, which have the parent pointer set to NULL. Fibonacci heap nodes had a rank field. In comparison 2-3 heap nodes have a dimension field. An alternative structure type pairs nodes by using an extra pointer field, partner. We can describe trunks in this representation as consisting of the nodes (parent, 1st child, 2nd child). 1st child and and child nodes are doubly linked to each other using their partner pointers. The left and right pointers of a 1st child node only point to other 1st child nodes. The parent pointer of a 1st child node points to its parent. 2nd child nodes do not use the parent, left, and right pointers. However, both 1st child and 2nd child nodes maintain their own child pointers. Since trunks can be either length 2 or 3, 2nd child nodes don't necessarily have to exist, in which case, the 1st child node's partner pointer would be NULL. Main trunks consisting of 2 nodes are treated as (1st child, 2nd child). For convenience we also have the boolean field, extra, which identifies whether a node is 1st child or 2nd child. However it may be possible to use the parent pointer to identify 2nd child nodes by always setting the parent pointer of 2nd child nodes to NULL, which would then require another check to distinguish 2nd child nodes from root nodes. An insert operation involves merging the new node into the dimension 0 main trunk, maintaining heap order. If the resulting trunk is length 3, then we have a dimension 1 tree which is merged into the dimension 1 main trunk. The insertion process can be viewed as adding 1 to the ternary number represented by the collection of main trunks, with the possibility of a chain of carries occurring. For example the ternary addition 222 + 1 = 1000 causes 3 carries. The delete min_operation for the 2-3 heap is similar to that for the Fibonacci heap. After removing the minimum node, we take child trunks of the minimum node and merge them back into the main trunk level of the heap. The child trunks can be either 1 or 2 nodes long. The list of child trunks is ordered, with the lowest dimension child trunk consisting of dimension 0 trees, the next consisting of dimension 1 tress, and so on. The main trunk that the minimum node lies on decreases in length by 1. This merging process corresponds to the addition of two ternary numbers, with each digit of the added number being either 1 or 2. The decrease_key operation involves removing the node whose key was decreased and merging it back into the heap at main trunk level. Removing a node, also removes the entire sub-tree it is the root of. When a node is removed from a non-main trunk we allow the trunk to shrink from 3 nodes to 2 nodes. If decrease_key occurs on a root node there is no need to remove it since heap order will still be maintained. Removing the second node from a 2-node main trunk shrinks the trunk to length 1, then the removed tree is merged back into the trunk. Recall the nodes on a 3 node trunk are (parent, 1st child, 2nd child). We treat the 2 nodes on a main trunk as (1st child, 2nd child) with no parent. This is why the second node on a main trunk identifies as a node that can be removed. When removing a node from a length 2 non-main trunk, we must replace it with some node from a nearby trunk since we do not allow length 1 non-main trunks. This is done by rearranging a nodes workspace. The workspace of a dimension i node consists of the dimension i trunk it is part of, the dimension i+1 trunk it lies off, and any other dimension i trunks that also lies off the dimension i+1 trunk. For a detailed definition, refer to Takaoka [1]. If the dimension i+1 trunk is a non-main trunk, the workspace has between 4 and 9 nodes. When the dimension i+1 trunk is a main trunk, the work space could have just 2 nodes. For details on how the workspace is rearranged, examine the remove_node() function in the 2-3 heap source code file ttheap.c. ------------------------------------------------------------------------------- The Trinomial Heap Files: triheap.c, triheap.h triheap_ext.c, triheap_ext.h The trinomial heaps structure is very similar to that of the 2-3 heap. However, with the trinomial heap, all non-main trunks are kept at length 3. Length here is defined as the number of nodes in the trunk. Only main trunks are allowed to vary in length, allowing lengths 0, 1, or 2. We refer to nodes on a trunk as (parent, 1st child, 2nd child). A 1st child or 2nd child node is called inconsistent if its key is less than that of the parent node. The trinomial heap always enforces key(1st child) <= key(2nd child), and can swap 1st and 2nd child nodes to maintain the correct order. An active node is a possibly inconsistent node. The trinomial heap allows a limited number of active nodes. We use the node-pair structure type for implementing the trinomial heap. There are two implementations of the trinomial heap. The simpler implementation, given in file triheap.c, maintains at most 1 active node for each dimension. An extended implementation, given in file triheap_ext.c, allows O(log n) active nodes of any dimension. The later supports decrease_key in O(1) worst case time, rather than O(1) amortised time. Insertion is the same as for the 2-3 heap, and nodes are inserted as inactive. We will describe the simpler version of the trinomial heap first. This is just a brief description, refer to the source code file, triheap.c for details. Consider what happens when a decrease_key operation occurs, decreasing the key of a node, x, of dimension d. Some pseudo code is given below. Here we assume that the node lies on a non-main trunk. If x is a 1st child, If there is not already an active node for dimension d, make node x active. Else, unless x is already the active node, do rearrangement with the current active node for dimension d. Else, If key(x) < key(1st child), swap x with the 1st child. If the old 1st child is active, If the old 1st child is inconsistent, do promotion. Else, x replaces it as the active node for dimension d. Else if there is some other active node for dimension d, do rearrangement with it. Else, make x the active node for dimension d. Else, If the 1st child is active, If x is inconsistent, do promotion. Rearrangement: This process involves two trunks. (pv, v, v') and (pw, w, w'). Here v is a node that tried to become active, but an active node, w, of the same dimension already existed. In the simple trinomial heap implementation 2nd child nodes are never left inconsistent, so we know that both v' and w' are consistent. Nodes v and w are possibly inconsistent. If key(v') < key(w'), make the trunk (pv, v', w') containing no inconsistent nodes. If key(v) < key(w), make the trunk (pw, v, w). If key(w) < key(pw), do promotion. Else v replaces w as the active node. Else make the trunk (pw, w, v). If key(v) < key(pw), do promotion. Else w remains the active node. Else make the trunk (pw, w', v') containing no inconsistent nodes. If key(v) < key(w), make the trunk (pv, v, w). If key(w) < key(pv), do promotion. Else v replaces w as the active node. Else make the trunk (pv, w, v). If key(v) < key(pv), do promotion. Else w remains the active node. Promotion: This process involves one trunk, where both nodes on the trunk are inconsistent. Consider the trunk (p, v, v') with both v and v' inconsistent. We promote v and v' up 1 position and place p at the end of the trunk to give the resulting trunk (v, v', p). Now both v' and p are consistent, but node v may now be inconsistent at a higher dimension. It appears as though a decrease_key has occurred at the position where p used to be an v now is, so the process continues at a higher dimension. Note that node p may go from active to inactive. The delete_min operation is more complicated than that of the 2-3 heap. We must also check active nodes as well as the root nodes when locating the minimum. The child trunks of the deleted node are still merged back into the heap. However if the node that was deleted is not a root node, we must perform the break-up operation which breaks up the higher dimension parts of the tree, producing trunks to be merged back into the heap at root level with the list of child trunks. For the trinomial heaps delete_min operation, all the trunks to be merged back into the heap will be length 2. A list of trunks to be merged back into the heap is maintained during break-up so that the merging can be done in one sweep. The node-pair structure is important for the break-up operation. Recall that the (1st child, 2nd child) nodes are paired by setting their partner pointers to point to each other. For the description of the break-up operation we refer to the broken nodes parent and partner, so we have one of the trunk cases (parent, break_node, partner) or (parent, partner, break_node). Initially, break_node is the node that was deleted. Let the dimension of break_node be d. Break-up pairs parent with partner to produce a trunk structured like a main trunk of 2 dimension d trees. This is added to the list of trunks to be merged back into the heap at main trunk level. If break_node's trunk has higher dimension sibling trunks, these are cut from parent to produce length 2 trunks and added to the list of trunks to be merged. Such a case will occur if dim(parent) > dim(break_node) + 1. Break-up proceeds to a higher dimensions by setting break_node to parent, giving new parent and partner pointers, with which to continue the process. The break-up ends once the main trunk has been reached. This still takes O(log n) time since there are only O(log n) active nodes, and the maximum dimension is O(log n). Note that during break-up, we must deactivate any active nodes on trunks to be merged back into the heap. ---------- Next, the extended trinomial heap is described, which supports decrease_key in O(1) worst case time. This differs in that it allows active nodes to be of any dimension, instead of allowing just one active node for each dimension. To achieve O(1) worst case time for decrease_key it is also necessary to allow 2nd child nodes to be active. However, note that a 2nd child nodes can not be active unless the 1st child is also active. This trinomial heap has a tolerance bound, t, which allows t = [log_3 (n+1)] nodes to be active. If the number of active nodes exceeds t then rearrangement or promotion must be performed on two active nodes of the same dimension. To keep track of active node of the same dimension the extended trinomial heap maintains an active node queue for each dimension. In the source files triheap_ext.c and triheap_ext.h each queue is implemented using a circularly doubly linked list of active_ptr_t items. The heap maintains an array, active_queues, which contains pointers to the head queue item for each dimension. Each active_ptr_t item points to the active node that it corresponds to. Nodes in the extended trinomial heap have an additional pointer, active_entry, which points to the corresponding active_ptr_t item when a node is active. If a node is not active its active_entry pointer will be NULL. A separate queue maintains candidates for rearrangement. Referring to triheap_ext.h, this is implemented using a doubly linked list of candidate_ptr_t items. Each candidate_ptr_t item has a dimension field that specifies which active node queue it corresponds to. If one of the active node queues has two or more items in it, then there will be a candidate_ptr_t item that corresponds to it in the candidate queue. For each dimension, there is at most one candidates_ptr_t item in the queue at any given time. Note that when there are more active nodes than the tolerance level, at least one of the active node queues has two or more items in it, since there are more items than queues. There is guaranteed to be at least one candidate queue item. Suppose that the tolerance level has been reached. The first item in the candidate queue specifies which active node queue has two items in it. From the two items, pointers to two active nodes of the same dimension can be obtained. Before continuing with rearrangement using the two active nodes, we must check that neither active node has another active node as its partner: * If we do get a (1st child, 2nd child) pair with both nodes active, * If both nodes in the (1st child, 2nd child) pair are inconsistent, do promotion. * Else, deactivate the 2nd child and finish. * Else, do rearrangement. Promotion and rearrangement have the effect of decreasing the number of active nodes by 1 or 2. Note that in the extended trinomial heap, promotion does not propagate to higher dimensions. To make checking active node keys easier during delete_min, an array, `active', is maintained which contains pointers to each active node. This uses entries 0 to n_active-1, where n_active is the number of active nodes in the heap. Each active_ptr_t queue item has a position field which specifies the of the corresponding active nodes pointer in this array. The code in triheap_ext.c is similar to that of triheap.c, except that now when a node is made active, an activate() procedure is called to update the queues. Similarly, when a node goes from active to inactive a deactivate() procedure is called to update the queues. Note that queue items from any queue are only added or removed when activate() or deactivate() is called. After a node is made active using activate(), a check is made to make sure that the number of active nodes has not exceeded the tolerance. There are also other minor differences in parts of the code since the extended trinomial heap allow 2nd child nodes to be active. Note that before a 2nd child node can become active, the first child has to already be active. ------------------------------------------------------------------------------- References [1] T. Takaoka. Theory of 2-3 heaps. In Proc. COCOON '99, volume 1627 of Lecture Notes in Computer Science, pages 41-50. July 1999.