Michael A. Bender, Roozbeh Ebrahimi, Haodong Hu, Bradley C. Kuszmaul
Most B-tree articles assume that all N keys have the same size K, that f = B/K keys fit in a disk block, and therefore that the search cost is O(log_{f+1} N) block transfers. When keys have variable size, however, B-tree operations have no nontrivial performance guarantees.
This article provides B-tree-like performance guarantees on dictionaries that contain keys of different sizes in a model in which keys must be stored and compared as opaque objects. The resulting atomic-key dictionaries exhibit performance bounds in terms of the average key size and match the bounds when all keys are the same size. Atomic-key dictionaries can be built with minimal modification to the B-tree structure, simply by choosing the pivot keys properly.
This article describes both static and dynamic atomic-key dictionaries. In the static case, if there are N keys with average size K, the search cost is O(⌈K/B⌉ log_{1+⌈B/K⌉} N) expected transfers. It is not possible to transform these expected bounds into worst-case bounds. The cost to build the tree is O(NK) operations and O(NK/B) transfers if all keys are presented in sorted order. If not, the cost is the sorting cost.
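To make the static bound concrete, the following minimal sketch evaluates the expression inside the O(·) for a given list of key sizes, ignoring constant factors. The function names `uniform_search_bound` and `atomic_key_search_bound` are hypothetical and introduced only for illustration; the code evaluates the stated formulas and does not build a dictionary.

```python
import math


def uniform_search_bound(n_keys, key_size, block_size):
    """Classic bound log_{f+1} N, where f = B/K keys fit in one block."""
    f = block_size / key_size
    return math.log(n_keys, f + 1)


def atomic_key_search_bound(key_sizes, block_size):
    """Evaluate ceil(K/B) * log_{1 + ceil(B/K)} N for average key size K.

    Constant factors hidden by the O-notation are ignored (assumption).
    """
    n_keys = len(key_sizes)
    avg_size = sum(key_sizes) / n_keys
    blocks_per_key = math.ceil(avg_size / block_size)  # transfers to read one average key
    fanout = 1 + math.ceil(block_size / avg_size)      # base of the logarithm
    return blocks_per_key * math.log(n_keys, fanout)
```

When every key has the same size and K divides B, the two expressions coincide, which is the sense in which the atomic-key bound matches the uniform case. When the average key is larger than a block, the base of the logarithm drops to 2 and each step costs ⌈K/B⌉ transfers.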
For the dynamic dictionaries, the amortized cost to insert a key κ of arbitrary length at an arbitrary rank is dominated by the cost to search for κ. Specifically, the amortized cost to insert a key κ of arbitrary length and random rank is O(⌈K/B⌉ log_{1+⌈B/K⌉} N + |κ|/B) transfers. A dynamic-programming algorithm is shown for constructing a search tree with minimal expected cost.
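The abstract does not spell out the dynamic program, so the sketch below is not the article's algorithm. It is a simplified illustration in the style of the classic optimal binary search tree recurrence, under these assumptions: the tree is binary, every key is searched with equal probability, and visiting a node whose pivot has size s costs ⌈s/B⌉ transfers. The name `min_expected_search_cost` is hypothetical.

```python
import math
from functools import lru_cache


def min_expected_search_cost(key_sizes, block_size):
    """Minimum expected transfers per search over all binary search trees.

    key_sizes lists the sizes of the keys in sorted key order.
    Assumptions (not from the article): binary fanout, uniform search
    probabilities, and a cost of ceil(size / block_size) per visited pivot.
    """
    n = len(key_sizes)
    visit = [math.ceil(s / block_size) for s in key_sizes]

    @lru_cache(maxsize=None)
    def total(i, j):
        # Sum of search costs over keys i..j-1 in the best subtree on that range.
        if i >= j:
            return 0
        width = j - i  # every key in the range passes through the subtree root
        return min(
            total(i, r) + total(r + 1, j) + width * visit[r]
            for r in range(i, j)
        )

    return total(0, n) / n
```

For example, with key sizes `[1, 100, 1]` and a block size of 10, the cheapest tree keeps the large middle key away from the root, so that only searches for that key pay for reading it. The recurrence runs in O(N³) time and is meant for small inputs only.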
This article also gives a cache-oblivious static atomic-key B-tree, which achieves the same asymptotic performance as the static B-tree dictionary described above. A cache-oblivious data structure or algorithm is not parameterized by the block size B or memory size M of the memory hierarchy; rather, it is universal, working simultaneously for all possible values of B and M. On a machine with block size B, if there are N keys with average size K, a search operation costs O(⌈K/B⌉ log_{1+⌈B/K⌉} N) block transfers in expectation. This cache-oblivious layout can be built in O(N log(NK)) processor operations.