RegisterSign In
By ingvar on Oct 14, 2012 12:08 PM.
mckoi b+tree?
Hello,

I want to implement hypergraphDB primitive storage layer with a distributed key-value store. Till now i tried redis and hazelcast, both having their disadvantages. MckoiDDB seems to fit well, since it provides two features not found yet in combination:
a) it allows MVCC transactions
b) i read it is a distributed b+tree, which in theory would allow efficient range queries and iteration.

My questions:
- I need 160 bit lenght keys, and if possible directly, ideally also arbitrary length keys as byte arrays. Is this possible in some way? AbstractKey already seems to prescribe a pretty static key structure with fixed size of primary and secondary key respectively.
- I need something like a sorted Multimap<byte[],byte[]>, so I understand I could use Key with OrderedSetData, but i dislike that it encodes data as strings, and it says to be unsuitable for large data and searches. Is there an alternative ?
- is mckoiddb a distributed b-tree, on some places in the web it says it's B+tree?

thanks

regards
Ingvar
By Tobias Downer (toby) on Oct 15, 2012 6:18 PM.
Hello,

Find my answers below;
- I need 160 bit lenght keys, and if possible directly, ideally also arbitrary length keys as byte arrays. Is this possible in some way? AbstractKey already seems to prescribe a pretty static key structure with fixed size of primary and secondary key respectively.
The underlying Key class used in MckoiDDB is indeed hard coded to 14 bytes (112 bits) as you found in AbstractKey. The binary format of MckoiDDB would need to be changed to increase the size of this key. However, there are several ways to get around this restriction;

1) Hash your 160 bit keys/variable length keys and address the data as a hash table and deal with any clashes that might occur.
2) Implement an index that maps your larger keys to AbstractKey. This will incur an additional index lookup but it's still quite efficient (the ObjectDatabase data model uses generated key indexes for lookup extensively).
3) The 112 bit AbstractKey address space limitation is only applicable per MckoiDDB path (partition). You could split your data over multiple paths. An AbstractKey in path A can contain different data to the same AbstractKey in path B. When your dataset is so large that you believe an 112 bit address space is too small, I would recommend using multiple partitions anyway.
4) Instead of using AbstractKey as your address space, use the content of DataFile objects as your address space. In MckoiDDB, the 'value' part of the key/value store (ie. the DataFile objects) are very versatile data structures and each one has a potential 64-bit address space.
- I need something like a sorted Multimap<byte[],byte[]>, so I understand I could use Key with OrderedSetData, but i dislike that it encodes data as strings, and it says to be unsuitable for large data and searches. Is there an alternative ?
The string part is a documentation error. OrderedSetData actually stores encode data in a binary format. The encoding is very simple and is usually 1 byte in to 1 byte out. The exception is the 0x0FF, 0x0F8 sequence which is used as an entry partition so has to be encoded differently.

It's probably not appropriate to use OrderedSetData if your data and keys are very large (say, over 200k) because the search algorithm has to iterate over an entire entry to find the start and end of the entry during its binary search. I have actually got a patch ready to help with this a bit, however, if you are storing large value entries it should be a key to another DataFile that stores the data. Concurrency issues aside, OrderedSetData is appropriate for handling very large numbers of small entries.
- is mckoiddb a distributed b-tree, on some places in the web it says it's B+tree?
Technically it's not a B+tree because the leaf nodes are not linked to each other, but in all other cases B+tree describes it pretty well (data is only stored in leaf nodes, etc). The reason we don't link leaf nodes is because it's a multi-versioned system and we wouldn't be able to share leaf nodes between different tree versions.

Regards,
Toby.
By ingvar on Oct 17, 2012 3:59 PM.
thank you for your detailed and clear answer.

regards
ingvar
Please sign in or register to post in this topic.
The text on this page is licensed under the Creative Commons Attribution 3.0 License. Java is a registered trademark of Oracle and/or its affiliates.
Mckoi is Copyright © 2000 - 2017 Diehl and Associates, Inc. All rights reserved.