The Modified 2-3 Tree (supports O(1) worst case time for delete-min()) This description extends upon the description of the standard 2-3 tree (see the file tree23.txt). The standard 2-3 tree has the following restrictions on its structure: - nodes can have either two or three children (except for an empty tree or a tree containing just 1 node). - the tree structure is maintained in such a way that the paths from the root to leaves are all equal in length. After the minimum node is deleted in the standard 2-3 tree, rearrangement may be necessary to ensure the tree structure satisfies the above requirements; refer to the description in the file tree23.txt. In the worst case, the rearrangement can propagate from the deleted leaf item, all the way up to the root node of the tree. Since the tree height is O(log n), the worst case time complexity for delete-min is O(log n). To support O(1) worst case delete-min, the rearrangements that can occur have to be restricted. In order to restrict the rearrangements that can occur, the restriction on the tree structure needs to be relaxed. This is done by allowing the lengths of paths down the left side of the tree to be shorter than other paths in the tree. To describe this idea, a description of the new delete_min() operation will be given. The insert() and delete() operations will still be the same as for the normal 2-3 tree. In order to keep the insert() and delete() operations the same as for the normal 2-3 tree, there are some restrictions on the tree structure: There are two types of nodes: o = leaf ( ) = internal node 1. A nodes children must not be a mixture of internal nodes and leaves. In other words, the descendants at the first level below a node must all be of the same type. For example, these structures are allowed: ( ) ( ) /|\ /|\ / | \ / | \ / | \ / | \ o o o ( ) ( ) ( ) but this structure is not allowed: ( ) /|\ / | \ / | \ o ( ) ( ) 2. The descendants at the second level below a node must all be of the same type. For example, these structures are allowed ( ) /|\ / | \ / | \ / | \ / | \ / | \ ( ) ( ) ( ) /|\ /|\ /|\ / | \ / | \ / | \ / | \ / | \ / | \ o o o o o o o o o ( ) /|\ / | \ / | \ / | \ / | \ / | \ ( ) ( ) ( ) /|\ /|\ /|\ / | \ / | \ / | \ / | \ / | \ / | \ ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) but this structure is not allowed: ( ) /|\ / | \ / | \ / | \ / | \ / | \ ( ) ( ) ( ) /|\ /|\ /|\ / | \ / | \ / | \ / | \ / | \ / | \ o o o ( ) ( ) ( ) ( ) ( ) ( ) In the normal 2-3 tree the children of a node are all of the same type, and the node's link_kind field is used to identify whether the nodes children are internal nodes or leaves. Without restriction #1 the link_kind filed would need to be extended since the children of a node could be of different types. The code for insert() and delete() would need to be modified to handle the additional cases, where an inserted or deleted nodes siblings were of a different type. Without restriction #2, the code for delete() would need to be modified to handle the cases where nodes of different types are merged after a delete operation. The delete_min() operation is designed so that rearrangements that occur obey restrictions 1 and 2. Consider deleting the minimum node in the tree. The modified delete_min() operation occurs the same as for delete_min() in the standard 2-3 tree, when no merge or just one merge operation is necessary after the minimum node has been removed. However, if a second merge_operation is required after the minimum node is deleted, the merging process is different form that in the normal 2-3 tree. We have three general cases when a second merge is required. The illustrations below show how merging is completed in each of these three cases. ( ) represents ( ) or ( ) / \ /|\ /| o...o o o o o o CASE 1: Node r has three children. ----------------------------------- / / m ( ) / | \ / | . / | ? / | . / | / | / | p ( ) q ( ) | /|\ | . . . | . . ? * x ( ) . . . * tree(x) resulted from the /|\ . first merge after the minimum o o o . data item was deleted. . / / r ( ) /|\ / | \ / | \ / | \ / | \ a( ) b( ) c( ) / \ / \ / \ o...o o...o o...o ............................... bottom of tree(q) and tree(m) / / ------> m ( ) / | \ / | . / | ? / | . / | / | / | p ( ) q ( ) /| /|\ / | . . . / | . . ? / | . . . / | . x( ) a( ) . /|\ / \ . o o o o...o . . / / r ( ) /| / | / | / | / | b( ) c( ) / \ / \ o...o o...o ............................... bottom of tree(q) and tree(m) CASE 2: Node r has two children. Node m has three children. ----------------------------------- / / m ( ) / | \ / | . / | . / | . / | / | / | p ( ) q ( ) | /|\ | . . . | . . ? * x ( ) . . . * tree(x) resulted from the /|\ . first merge after the minimum o o o . data item was deleted. . / / r ( ) /| / | / | / | / | a( ) b( ) / \ / \ o...o o...o ............................... bottom of tree(q) and tree(m) . . ......................... bottom of tree(s) / / m ( ) / | / . / . / . / delete p / / ------> q ( ) /|\ . . . . . ? . . . . . . / / r ( ) /|\ / | \ / | \ / | \ / | \ x( ) a( ) b( ) /|\ / \ / \ o o o o...o o...o ............................... bottom of tree(q) and tree(m) CASE 3: Node r has two children. Node m has two children. ----------------------------------- / / s ( ) / | \ / . . / . ? / . . / / / m ( ) / | / | / | / | / | / | / | p ( ) q ( ) | /|\ | . . . | . . ? * x ( ) . . . * tree(x) resulted from the /|\ . first merge after the minimum o o o . data item was deleted. . / / r ( ) /| / | / | / | / | a( ) b( ) / \ / \ o...o o...o ............................... bottom of tree(q) and tree(m) . . ......................... bottom of tree(s) / / s ( ) / | \ / . . / . ? / . . / delete nodes p and m / / ------> q ( ) Node q moved up to replace m /|\ . . . . . ? . . . . . . / / r ( ) /|\ / | \ / | \ / | \ / | \ x( ) a( ) b( ) /|\ / \ / \ o o o o...o o...o ............................... bottom of tree(q) . . . . . ......................... bottom of tree(s) NOTE: In all three cases above, it is possible that r is the same node as q. ---- For all three cases shown above, delete_min() can be completed in O(1) time. Each rearrangement above satisfies the restrictions on the tree structure. A description of the nodes in the above rearrangement is given below: x - The root of the tree from the first merge after a delete-min operation. Note that tree(x) has height 1. p - The parent of node x. Node x is the only child of node p, which is why the final rearrangement is required. q - The sibling immediately to the right of node p. r - A tree of height 2 whose children are to be merged with x. tree(r) is the left-most tree of height 2 inside tree(q). m - The parent of p and q. s - The parent of p. The modified 2-3 tree maintains a pointer, min_node, which points to the parent node of the minimum data item in the tree, so upon a delete_min() operation a pointer to the minimum data item is obtained in O(1) time as min_node->left.item. To obtain pointers to the other nodes in O(1) time it is necessary to maintain parent pointers for each node. Since data items (leaves) are different from nodes, they have no parent pointer, which is why the tree maintains a pointer to the parent of the minimum data item in the tree instead of pointing directly to the minimum data item. From this structure pointers p, q, m, and, s can be obtained in O(1) time. To obtain a pointer to r in O(1) time it is necessary to reach the left-most node in tree(q) in O(1) time and then use parent pointer to reach r. In the normal 2-3 tree, a node has two key pointers key_item1 and key_item2 which point to the appropriate data items in the tree which correspond to the keys value. In the modified 2-3 tree, a node has key pointers key_node1 and key_node2, which instead point to the parent of the data item. Now to reach the left-most node in tree(q) in O(1) time we use m->key_node1. The data item which used to be key_item1, is now accessed as key_node1->left.item, and the data item which used to be key_item2 is now key_node2->left.item. Node that if a nodes children are data items (leaves), its key_node1 and key_node2 pointers are NULL. For such nodes, the key items previously given by key_item1 and key_item2 are also aces-sable as the middle and right children respectively. Since key_node1 is NULL when a nodes children are data items, there is node need for a link_kind field. The maintenance of key_node pointer for the modified 2-3 tree is almost identical to maintaining the key_item pointers in the standard 2-3 tree. Find operations in the modified 2-3 tree will be slightly slower since key items must now be accessed using, for example, key_node1->left.item. This means that the insert() and delete() operations will be slightly slower in the modified 2-3 tree.