A modified van Emde Boas data structure that achieves time complexity for successor/predecessor/insert/search/delete in O(lg w)
(with high probability), where w
is the bits of the word used to store the integer. Space complexity is O(n)
, where n
is how many numbers are stored. Thus, the constrain is that lg n <= w
, since any number must fit in a word.
If w
is not too high, then this is optimal. In case w
is too high, fusion trees are the best. However, dynamic versions of fusion tree only achieve expected O(lg n/lg w)
.
When lg w >= sqrt(lg(n))
is better to use fusion trees
. However, for dinamyc fusion trees
only expected O(lg n/lg w)
has been achieved.
For example, in a static case, with a 32-bit word, w = 32
(as this current implementation that uses scala's Int which is 32-bit signed integer), static van Emde Boas should be used if n >~ 2^25 ~ 33 millon
. Otherwise, static fusion trees
should be preferred.
In the same scenario, static, for 64-bit word, w = 64
, this structure is better than fusion trees if n >~ 68 billon
. You're likely to have memory problems with such big numbers, and you may want to switch to a distributed system. Due to its recursive nature, I imagine this structure is easy to implement in a distributed way.
The current implementation, handles scala's Int which is a signed 32-bit number. However, th
- search: O(lg w)
- successor: O(lg w)
- predecessor: O(lg w)
- delete: O(lg w)
- insert: O(lg w)
where n = 2^w is the maximum number that can be stored.
- O(n)
- Create performance tests
- Measure memory consumption againt data size (n) and bit (w)
- Create a complete immutable version
- Improve usage of generators for performance tests
- Do performance tests for Immutable Version
- Compare performance between mutable and immutable
- Write predecessor
- Add delete
- Allow any ordering
- Augment the data structure
- Compare to fusion tree
- Compare to
O(lg n)
successor/predecessor of trees - Present performance results
- immutable or mutable
- dynamic
- fully tested
- with benchmarks code