
MckoiDDB Frequently Asked Questions

General Questions

  1. What is MckoiDDB?
  2. Who developed MckoiDDB and how do I contact the author?
  3. Why would I use MckoiDDB instead of another database system?
  4. Is MckoiDDB like Hadoop?
  5. Can MckoiDDB be used in the cloud?
  6. What do I need to use MckoiDDB?

Questions about how MckoiDDB works

  1. How do I install MckoiDDB on a network?
  2. What is a Mckoi Machine Node and what does it do?
  3. How is data stored on the cluster of machines?
  4. Why are copies of the data stored on multiple machines?
  5. What happens when a machine fails?
  6. It sounds like the manager server is a bottleneck?

Questions about the client side API

  1. Does MckoiDDB support SQL?
  2. How do I store and query data?
  3. How do I rollback a transaction?
  4. Can I cache a transaction?
  5. Is the client API thread safe?
  6. I want to write my own data model, where do I start?

General Questions

What is MckoiDDB?

MckoiDDB is a distributed database system that runs on a network of computers and provides a platform for building interactive applications utilizing the resources of the network. The focus of the design of MckoiDDB has been to create an API for developing and maintaining highly scalable data storage systems in an extensible framework.

Who developed MckoiDDB and how do I contact the author?

MckoiDDB is developed by Tobias Downer. To contact the author, you can email toby@mckoi.com. If you are emailing for the first time, please be sure to include 'MckoiDDB' somewhere in the message to ensure it doesn't end up being marked as spam.

Why would I use MckoiDDB instead of another database system?

Most database systems are targeted at single-server environments, where the system is designed and optimized around the resources local to one machine. Many of the popular systems have since added distributed capabilities, but they still prove painful to scale when the need arises. The difficulty arises because these database systems manage the consistency of their data structures in a few highly contended sub-systems (the memory and disks of the database server), which is a bottleneck that is hard to overcome.

MckoiDDB has been designed from the ground up to be a distributed system that runs over multiple servers. MckoiDDB supports a shared-nothing data storage model that removes all contention when reading and writing data, except for a single function when committing a transaction. Further, this one point of contention can be overcome by MckoiDDB's ability to logically partition (shard) the database, allowing transaction commits to be distributed over multiple servers with little pain. In MckoiDDB, creating and populating partitions, and splitting an existing database into a new partition, are very simple processes.

Is MckoiDDB like Hadoop?

MckoiDDB shares some of the same ideas for managing blocks of data over a cluster of machines as GoogleFS and Hadoop, but the problems the two systems are solving are very different. GoogleFS/Hadoop are data storage systems primarily focused on Batch Processing (performing highly parallel jobs on large sequences of data). MckoiDDB is a data storage system focused on Online Transaction Processing (providing low latency random access reads and writes on large datasets).

Can MckoiDDB be used in the cloud?

Absolutely. MckoiDDB can be installed on services such as Amazon EC2, Linode, and Rackspace Cloud.

What do I need to use MckoiDDB?

You need at least one machine with Java 1.6 or above installed. Preferably, you should have a network of at least three machines.

Questions about how MckoiDDB works

How do I install MckoiDDB on a network?

Follow this guide.

What is a Mckoi Machine Node and what does it do?

The Mckoi Machine Node is a Java application you run on each machine you wish to be part of your MckoiDDB network cluster. Once installed and running, a Machine Node can be set to perform up to three roles in the network: Block Server, Root Server, and Manager Server.

A small setup will have at least three servers running the Block Server role. The number of Root Servers and Manager Servers in the system depends on the size of the system. The smallest network must have at least one Root and one Manager server.

How is data stored on the cluster of machines?

Whenever data needs to be stored on a MckoiDDB network, the data is written to a block file on three different servers (or fewer if three servers aren't currently available). When a block file becomes full, a new block file is assigned for the next data item written. The servers on which a block file is stored are picked at random from the pool of available servers.
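The placement rule above (three distinct servers chosen at random, or fewer when the pool is smaller) can be illustrated with a minimal sketch. This is not MckoiDDB code: the class BlockPlacementSketch, the method pickServers, and the node names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of the MckoiDDB API.
class BlockPlacementSketch {

    // The FAQ states that each block file is written to three servers.
    static final int REPLICA_COUNT = 3;

    // Picks up to REPLICA_COUNT distinct servers at random from the pool.
    // If fewer servers are available, all of them are returned.
    static List<String> pickServers(List<String> available, Random rnd) {
        List<String> shuffled = new ArrayList<>(available);
        Collections.shuffle(shuffled, rnd);
        int n = Math.min(REPLICA_COUNT, shuffled.size());
        return new ArrayList<>(shuffled.subList(0, n));
    }

    public static void main(String[] args) {
        List<String> pool = Arrays.asList("node-a", "node-b", "node-c", "node-d", "node-e");
        // Three randomly chosen, distinct servers.
        System.out.println(pickServers(pool, new Random()));
        // Only two servers are available, so both are used.
        System.out.println(pickServers(Arrays.asList("node-a", "node-b"), new Random()));
    }
}
```

Shuffling a copy of the pool and taking a prefix guarantees the chosen servers are distinct, which is the property that matters for replication.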

You can see all the block files stored on a node by looking in the /block/ sub-directory of the node_directory as defined in the node.conf file.

Why are copies of the data stored on multiple machines?

The reasons for storing data on three servers are fault tolerance and load balancing. If a machine containing the data you want goes down, there are still two other copies available. Storing data on multiple machines also spreads read workloads over multiple servers. Finally, storing three copies of all data allows you to determine whether one of the copies is corrupt.
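To see why three copies are enough to identify a single corrupt one, consider a simple majority comparison: if two copies agree and the third does not, the odd one out is the suspect. The sketch below assumes a CRC32 comparison purely for illustration; the FAQ does not say how MckoiDDB actually checks copies, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration only; not how MckoiDDB is known to verify blocks.
class ReplicaCheckSketch {

    // Computes a CRC32 checksum over the whole byte array.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Returns the index (0, 1 or 2) of the copy that disagrees with the
    // other two, -1 if all three agree, or -2 if all three differ and no
    // majority exists.
    static int findCorruptCopy(byte[] a, byte[] b, byte[] c) {
        long ca = checksum(a);
        long cb = checksum(b);
        long cc = checksum(c);
        if (ca == cb && cb == cc) return -1;
        if (ca == cb) return 2;
        if (ca == cc) return 1;
        if (cb == cc) return 0;
        return -2;
    }

    public static void main(String[] args) {
        byte[] good = "block".getBytes(StandardCharsets.UTF_8);
        byte[] bad = "blocK".getBytes(StandardCharsets.UTF_8); // one flipped bit
        System.out.println(findCorruptCopy(good, bad, good.clone())); // prints 1
    }
}
```

With only two copies a mismatch could be detected but not attributed to either copy; the third copy is what breaks the tie.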

What happens when a machine fails?

If a block server node goes down, any clients that are reading data from the machine will retry the query on another machine that stores the same block. If a client is writing data to the failed machine, a group of three new machines is chosen to write the data to. In a MckoiDDB cluster with three or more machines, a client will not notice an interruption when a block server fails.
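The read-retry behavior described above can be sketched as a loop over the servers that hold a block. The BlockServer interface and readWithFailover method below are hypothetical stand-ins, not the MckoiDDB client implementation; they only show the shape of the idea.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the MckoiDDB API.
class ReadFailoverSketch {

    // Stand-in for a block server that may be unreachable.
    interface BlockServer {
        byte[] read(long blockId) throws IOException;
    }

    // Tries each replica in turn and returns the first successful read.
    // Throws an unchecked exception only if every replica fails.
    static byte[] readWithFailover(List<BlockServer> replicas, long blockId) {
        IOException last = null;
        for (BlockServer server : replicas) {
            try {
                return server.read(blockId);
            } catch (IOException e) {
                last = e; // remember the failure and try the next replica
            }
        }
        throw new IllegalStateException("No replica available for block " + blockId, last);
    }

    public static void main(String[] args) {
        BlockServer down = id -> { throw new IOException("server down"); };
        BlockServer up = id -> new byte[] {42};
        byte[] data = readWithFailover(Arrays.asList(down, up), 7L);
        System.out.println(data[0]); // prints 42
    }
}
```

From the caller's point of view the failed server is invisible as long as one replica answers, which matches the claim that clients do not notice a single block server failure.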

If a root or manager server fails there will be an interruption. This can only be fixed by bringing those services back online.

It sounds like the manager server is a bottleneck?

The manager is a centralized point but it's not a problem because the knowledge it provides to clients about the network is relatively small compared to the total amount of all data stored, and this knowledge changes infrequently and is never overwritten. These characteristics make it possible to replicate the manager server knowledge to multiple machines, or to cache it entirely in memory over all the clients.
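Because this knowledge is small and never overwritten, a client can keep answers it has already received and avoid asking the manager again. The sketch below assumes, purely for illustration, that the knowledge is a mapping from a block id to the servers holding that block; ManagerCacheSketch and its managerLookup function are hypothetical names, and the sketch ignores refreshing entries when the network changes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration only; not part of the MckoiDDB API.
class ManagerCacheSketch {

    // Stand-in for a network call to the manager server.
    private final Function<Long, List<String>> managerLookup;

    // Client-side cache of answers already received from the manager.
    private final Map<Long, List<String>> cache = new ConcurrentHashMap<>();

    ManagerCacheSketch(Function<Long, List<String>> managerLookup) {
        this.managerLookup = managerLookup;
    }

    // Asks the manager only the first time a block id is seen.
    List<String> serversFor(long blockId) {
        return cache.computeIfAbsent(blockId, managerLookup);
    }

    public static void main(String[] args) {
        AtomicInteger managerCalls = new AtomicInteger();
        ManagerCacheSketch cache = new ManagerCacheSketch(blockId -> {
            managerCalls.incrementAndGet();
            return Arrays.asList("node-a", "node-b", "node-c");
        });
        cache.serversFor(1L);
        cache.serversFor(1L); // answered from the cache
        System.out.println(managerCalls.get()); // prints 1
    }
}
```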

Questions about the client side API

Does MckoiDDB support SQL?

With the release of 1.1 we now support an SQL data model for MckoiDDB. See the documentation for further details.

How do I store and query data?

The easiest way to get started using MckoiDDB is to use the Simple Database API which supports a file and table data structure. See this guide to get started.

How do I rollback a transaction?

You don't need to explicitly roll back a transaction in MckoiDDB. To retrieve a fresh view of the database state, create a new transaction and let the garbage collector reclaim the transactions you no longer need. A transaction is lightweight enough that you can create as many as needed with very little heap consumption or side effects.
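The idea that abandoning a transaction is the rollback can be shown with a toy snapshot store. Everything in the sketch below is hypothetical (SnapshotStoreSketch, its Transaction class, and the copy-on-create strategy); it is not the MckoiDDB API and says nothing about how MckoiDDB implements snapshots or resolves conflicting commits.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model for illustration only; not part of the MckoiDDB API.
class SnapshotStoreSketch {

    // The last committed state of the toy database.
    private Map<String, String> committed = new HashMap<>();

    // A transaction works on a private copy of the committed state.
    class Transaction {
        private final Map<String, String> view = new HashMap<>(committed);

        void put(String key, String value) { view.put(key, value); }

        String get(String key) { return view.get(key); }

        // Publishes this transaction's view as the new committed state.
        // The toy has no conflict detection: the last commit wins.
        void commit() { committed = new HashMap<>(view); }
    }

    Transaction createTransaction() { return new Transaction(); }

    public static void main(String[] args) {
        SnapshotStoreSketch store = new SnapshotStoreSketch();

        Transaction t1 = store.createTransaction();
        t1.put("k", "v1");
        t1.commit();

        Transaction t2 = store.createTransaction();
        t2.put("k", "oops");
        // No rollback call: t2 is simply dropped without committing.

        Transaction t3 = store.createTransaction();
        System.out.println(t3.get("k")); // prints v1
    }
}
```

The uncommitted change in t2 never reaches the committed state, so a new transaction sees the database as if t2 had never existed.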

Can I cache a transaction?

You may cache a transaction object for as long as you like. However, optimizing your application this way is not recommended, because it is easy to make mistakes that expose a transaction to multiple threads. MckoiDDB transactions are very lightweight and there is little benefit to be gained by reusing them. Don't be afraid to create a new transaction whenever you need one. If you need to revisit old snapshot states of a database, you may use the MckoiDDBClient.createTransaction(DataAddress) method (or SDBSession.createTransaction(SDBRootAddress) in the Simple Database API).

Is the client API thread safe?

If used correctly, yes. The rules are fairly simple: creating new transactions from a com.mckoi.network.MckoiDDBClient is thread safe. The transaction objects themselves, and any objects created by a transaction, are NOT thread safe. Therefore, to make a safe multi-threaded client application you should ensure that each thread has its own transaction.
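The one-transaction-per-thread rule can be sketched with a shared client and a thread pool. The Client and Transaction classes below are hypothetical stand-ins rather than the real com.mckoi.network.MckoiDDBClient; they mirror only the stated contract: creating a transaction is thread safe, using one from several threads is not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not part of the MckoiDDB API.
class PerThreadTransactionSketch {

    // Stand-in client: creating transactions is safe from any thread.
    static class Client {
        private final AtomicInteger created = new AtomicInteger();

        Transaction createTransaction() {
            return new Transaction(created.incrementAndGet());
        }
    }

    // Stand-in transaction: deliberately unsynchronized, so it must stay
    // confined to the thread that created it.
    static class Transaction {
        final int id;
        private int writes;

        Transaction(int id) { this.id = id; }

        void write() { writes++; }

        int writes() { return writes; }
    }

    // Each worker creates its own transaction from the shared client.
    // Returns the total number of writes performed by all workers.
    static int runWorkers(Client client, int threads, int writesPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                results.add(pool.submit(() -> {
                    Transaction tx = client.createTransaction(); // one per thread
                    for (int w = 0; w < writesPerThread; w++) {
                        tx.write();
                    }
                    return tx.writes();
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(new Client(), 4, 1000)); // prints 4000
    }
}
```

Because no Transaction instance is ever touched by more than one thread, the unsynchronized counter stays correct; sharing a single transaction across the workers would make the result unpredictable.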

I want to write my own data model, where do I start?

For more technical documents on developing your own data models, consult the documentation.

The text on this page is licensed under the Creative Commons Attribution 3.0 License. Java is a registered trademark of Oracle and/or its affiliates.
Mckoi is Copyright © 2000 - 2017 Diehl and Associates, Inc. All rights reserved.