MckoiDDB is a distributed database system that runs on a network of computers and provides a platform for building interactive applications utilizing the resources of the network. The focus of the design of MckoiDDB has been to create an API for developing and maintaining highly scalable data storage systems in an extensible framework.
MckoiDDB is developed by Tobias Downer. To contact the author, you can email firstname.lastname@example.org. If you are emailing for the first time, please be sure to include 'MckoiDDB' somewhere in the message to ensure it doesn't end up being marked as spam.
Most database systems are targeted at single-server environments, where the system is designed and optimized around the resources local to one machine. Many of the systems in popular use today have added distributed capabilities, but they still prove difficult to scale when the need arises. The difficulty is that these database systems manage the consistency of their data structures in a few highly contended sub-systems (the memory and disks of the database server), a bottleneck that is hard to overcome.
MckoiDDB has been designed from the ground up to be a distributed system that runs over multiple servers. MckoiDDB supports a shared-nothing data storage model that removes all contention when reading and writing data, except for a single function used when committing a transaction. Even this one contention can be overcome by MckoiDDB's ability to logically partition (shard) the database, allowing transaction commits to be distributed over multiple servers. In MckoiDDB, creating and populating partitions and splitting an existing database into a new partition is a simple process.
MckoiDDB shares some of the same ideas for managing blocks of data over a cluster of machines as GoogleFS and Hadoop, but the problems these systems solve are very different. GoogleFS and Hadoop are data storage systems primarily focused on batch processing (performing highly parallel jobs on large sequences of data). MckoiDDB is a data storage system focused on online transaction processing (providing low-latency random-access reads and writes on large datasets).
You need at least one machine with Java 1.6 or above installed. Preferably, you should have a network of at least three machines.
Follow this guide.
The Mckoi Machine Node is a Java application you run on each machine you wish to be part of your MckoiDDB network cluster. Once installed and running, the Machine Node can perform up to three roles in the network: Block Server, Manager Server, and Root Server.
A small setup will have at least three servers running the Block Server role. The number of Root Servers and Manager Servers depends on the size of the system; the smallest network must have at least one Root Server and one Manager Server.
Whenever data needs to be stored on a MckoiDDB network, the data is written to a block file on three different servers (or fewer if three servers aren't currently available). When a block file becomes full, a new block file is assigned for the next data item written. The servers where a block file is stored are randomly picked from the pool of available servers.
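The placement rule above can be sketched as follows. This is a hypothetical illustration of picking replica servers from the available pool, not MckoiDDB's actual manager code; the class, method, and server names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BlockPlacement {

    // Pick up to `copies` distinct servers at random from the available pool.
    // If fewer servers than `copies` are available, return all of them.
    static List<String> pickServers(List<String> available, int copies) {
        List<String> pool = new ArrayList<>(available);
        Collections.shuffle(pool);  // random placement spreads load
        return new ArrayList<>(pool.subList(0, Math.min(copies, pool.size())));
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("node1", "node2", "node3", "node4");
        // With four servers available, a block file lands on three of them.
        System.out.println(pickServers(servers, 3).size()); // prints 3
    }
}
```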
You can see all the block files stored on a node by looking in the /block/ sub-directory of the node_directory as defined in the node.conf file.
Data is stored on three servers for fault tolerance and load balancing. If a machine containing the data you want goes down, there are still two other copies available. Storing data on multiple machines spreads read workloads over multiple servers. Also, keeping three copies of all data makes it possible to determine whether one of the copies is corrupt.
If a block server node goes down, any clients reading data from that machine will retry the query on another machine that stores the same block. If a client is writing data to the failed machine, a new group of three machines is chosen to write the data to. In a MckoiDDB cluster with three or more machines, a client will not notice an interruption when a block server fails.
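The read-failover behaviour described above can be sketched as a simple retry loop over the replicas of a block. This is a stand-alone illustration of the pattern, not the real MckoiDDB client code; the class name and the injected fetch function are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ReplicaRetry {

    // Try each server that holds a copy of the block until one answers.
    // `fetch` stands in for the actual network read; a RuntimeException
    // models a failed or unreachable server.
    static byte[] readBlock(List<String> replicas, Function<String, byte[]> fetch) {
        for (String server : replicas) {
            try {
                return fetch.apply(server);   // attempt the read
            } catch (RuntimeException e) {
                // server down: fall through and retry on the next replica
            }
        }
        throw new RuntimeException("all replicas unavailable");
    }

    public static void main(String[] args) {
        // First replica is down; the read transparently succeeds on the second.
        byte[] data = readBlock(Arrays.asList("down1", "up1"), server -> {
            if (server.startsWith("down")) throw new RuntimeException("no route");
            return new byte[] {1, 2, 3};
        });
        System.out.println(data.length); // prints 3
    }
}
```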
If a root or manager server fails there will be an interruption. This can only be fixed by bringing those services back online.
The manager is a centralized point, but this is not a bottleneck: the knowledge it provides to clients about the network is small compared to the total amount of data stored, changes infrequently, and is never overwritten. These characteristics make it possible to replicate the manager server's knowledge to multiple machines, or to cache it entirely in memory on all the clients.
As of release 1.1, MckoiDDB supports an SQL data model. See the documentation for further details.
The easiest way to get started using MckoiDDB is the Simple Database API, which supports file and table data structures. See this guide to get started.
You don't need to explicitly roll back a transaction in MckoiDDB. To retrieve a fresh view of the database state, create a new transaction and let the garbage collector reclaim transactions you no longer need. A transaction is lightweight enough that you can create as many as needed with very little heap consumption or side effects.
You may cache a transaction object for as long as you like; however, it is not recommended to optimize your application this way, because of the mistakes that are possible when a transaction is exposed to multiple threads. MckoiDDB transactions are very lightweight, and there is little benefit to be gained by reusing them. Don't be afraid to create a new transaction whenever you need one. If you need to revisit old snapshot states of a database, use the MckoiDDBClient.createTransaction(DataAddress) method (or SDBSession.createTransaction(SDBRootAddress) in the Simple Database API).
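The advice above rests on the idea that a transaction is essentially a reference to an immutable snapshot of the database: creating one is cheap, and discarding one needs no rollback because nothing was modified in place. The following self-contained sketch models that idea in miniature; it is an illustration of the principle only, not the MckoiDDB implementation, and the SnapshotDemo class and its methods are invented for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SnapshotDemo {

    // The committed state: replaced wholesale on commit, never mutated,
    // so existing snapshots are unaffected by later commits.
    static Map<String, String> committed = new HashMap<>();

    // A "transaction" is just a read-only view of the current snapshot;
    // creating one is cheap, and abandoning one needs no rollback.
    static Map<String, String> createTransaction() {
        return Collections.unmodifiableMap(new HashMap<>(committed));
    }

    // Commit publishes a new version; old snapshots remain valid.
    static void commit(String key, String value) {
        Map<String, String> next = new HashMap<>(committed);
        next.put(key, value);
        committed = next;
    }

    public static void main(String[] args) {
        commit("k", "v1");
        Map<String, String> oldView = createTransaction(); // snapshot of v1
        commit("k", "v2");
        Map<String, String> newView = createTransaction(); // fresh snapshot sees v2
        System.out.println(oldView.get("k") + " " + newView.get("k")); // prints v1 v2
    }
}
```

In the real system, revisiting an old snapshot is done by address (DataAddress or SDBRootAddress, per the methods named above) rather than by holding an in-memory map.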
If used correctly, yes. The rules are fairly simple: creating new transactions from a com.mckoi.network.MckoiDDBClient is thread safe, but the transaction objects themselves, and any objects created by a transaction, are NOT thread safe. Therefore, to make a safe multi-threaded client application, ensure that each thread has its own transaction.
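One convenient way to enforce the "one transaction per thread" rule is a ThreadLocal. The sketch below is a self-contained illustration of that pattern; Transaction here is a placeholder class, not the real MckoiDDB type, and in a real application each thread would create its transaction from the shared MckoiDDBClient instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadTransactions {

    static class Transaction {}  // placeholder for a per-thread transaction object

    // One transaction per thread, created lazily on first use.
    static final ThreadLocal<Transaction> TXN = ThreadLocal.withInitial(Transaction::new);

    // Runs n concurrent tasks and returns how many distinct transaction
    // objects were observed: one per thread, never shared.
    static int distinctTransactions(int n) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        Set<Transaction> seen = ConcurrentHashMap.newKeySet();
        CyclicBarrier barrier = new CyclicBarrier(n); // force n threads to run at once
        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            futures.add(pool.submit(() -> {
                barrier.await();       // all n threads are live before proceeding
                seen.add(TXN.get());   // each thread gets its own instance
                return null;
            }));
        }
        for (Future<?> f : futures) f.get();
        pool.shutdown();
        return seen.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(distinctTransactions(4)); // prints 4
    }
}
```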
For more technical documents on developing your own data models, consult the documentation.