
Mckoi 2.0 work in progress download



Hi all,

I have just uploaded a work-in-progress release of Mckoi 2.  You can 
download it here:

http://www.mckoi.com/database/ver2/mckoi20wip01.zip

I will be continuing to add releases to this directory as development 
continues.

Please be aware that this release is a work in progress and is missing 
some important features.  First I'll quickly describe what is complete 
and what I'm most interested in getting feedback on at the moment.

The Mckoi 2.0 data store:  I expect the new snapshot data store engine 
and API will be the main focus of the 2.0 release.  The data store is 
the part of a database engine that deals with modeling structures for 
storing and indexing data in a database and mapping this data to a 
storage medium (such as a persistent local file system or a transient 
heap).  Version 2.0 provides developers with a fairly straightforward 
API for modeling and manipulating primitive data structures freely in a 
strict snapshot transactional environment.  The problems of caching, 
memory allocation/deallocation, transaction management and structural 
representation are dealt with by the data store engine.

One way to think of the data store engine in 2.0 is as a file system 
that is designed for database software:  it supports strict isolation 
enforcement, discrete and controllable commit states, and files of 
various sizes and quantities with efficient creation and deletion. 
Most importantly, the data store supports very efficient copying of data 
and very efficient shifting of data inside a file.  These primitive 
operations are the base on which any type of more sophisticated data 
structure can be built.  For example, to implement an insert function on 
a sorted list of 64-bit values you could implement a simple binary 
search and sorted insert over a file.  You find the position in the file 
where the value is to be inserted, shift the data from that position 
onward by 64 bits and write the value into the gap.  The data store is 
designed 
to be able to handle these types of operations very efficiently on very 
large data sets (larger than the available heap memory).
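
The sorted insert described above can be sketched roughly as follows. 
This is only an illustration of the logic, not the Mckoi API: 
ShiftableFile and SortedLongList are hypothetical names, and the "file" 
here is a plain in-memory byte array whose shift is an ordinary array 
copy, whereas the data store is meant to perform the shift efficiently 
on data much larger than the heap.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical in-memory stand-in for a data store file (NOT the Mckoi API).
final class ShiftableFile {
    private byte[] data = new byte[64];
    private int size = 0;

    int size() { return size; }

    // Moves every byte from 'position' to the end of the file forward by
    // 'amount' bytes, growing the file and leaving a gap to write into.
    void shift(int position, int amount) {
        if (size + amount > data.length) {
            data = Arrays.copyOf(data, Math.max(size + amount, data.length * 2));
        }
        System.arraycopy(data, position, data, position + amount, size - position);
        size += amount;
    }

    long getLong(int position) { return ByteBuffer.wrap(data).getLong(position); }

    void putLong(int position, long value) { ByteBuffer.wrap(data).putLong(position, value); }
}

// A sorted list of 64-bit values stored back to back in one file.
final class SortedLongList {
    private static final int WIDTH = Long.BYTES; // 8 bytes = 64 bits
    private final ShiftableFile file = new ShiftableFile();

    int count() { return file.size() / WIDTH; }

    long get(int index) { return file.getLong(index * WIDTH); }

    void insert(long value) {
        // Binary search for the first entry greater than 'value'.
        int lo = 0, hi = count();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (get(mid) <= value) lo = mid + 1; else hi = mid;
        }
        // Open a 64-bit gap at the sorted position, then fill it.
        file.shift(lo * WIDTH, WIDTH);
        file.putLong(lo * WIDTH, value);
    }

    public static void main(String[] args) {
        SortedLongList list = new SortedLongList();
        for (long v : new long[] {42L, 7L, 19L, -3L, 7L}) list.insert(v);
        for (int i = 0; i < list.count(); i++) System.out.println(list.get(i));
    }
}
```

The only operations the list needs from the file are a read, a write 
and a shift, which is why an efficient shift primitive matters so much 
here.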

The API will provide reference implementations for common data 
structures using the data store model.

Another useful feature of the new data store engine is the ability to 
copy data without actually replicating the data - the only thing that is 
replicated in a copy operation is the meta structure that organizes the 
data, which is many times smaller than the data itself.  This means you 
can make a copy of a file that logically looks and works as if you have 
multiple copies but only one actual copy may be stored in the physical 
representation.  This leads to many elegant optimizations especially in 
a versioning RDBMS that needs to manage multiple versions of mostly 
identical data.
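
To make the idea concrete, here is a minimal copy-on-write sketch.  It 
is an assumption-laden simplification, not how the Mckoi data store is 
implemented: CowFile is a hypothetical class, and its "meta structure" 
is a flat list of block references rather than whatever structure the 
engine really uses.  A copy duplicates only the list of references; a 
block is duplicated only when one of the copies writes to it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical copy-on-write file (NOT the Mckoi API).
final class CowFile {
    static final int BLOCK_SIZE = 4096;

    // The "meta structure": an ordered list of references to data blocks.
    // A block is never modified in place once it has been stored here.
    private final List<byte[]> blocks;

    CowFile(int blockCount) {
        blocks = new ArrayList<>();
        for (int i = 0; i < blockCount; i++) blocks.add(new byte[BLOCK_SIZE]);
    }

    private CowFile(List<byte[]> shared) {
        // Copies the references only; the block contents stay shared.
        blocks = new ArrayList<>(shared);
    }

    CowFile copy() { return new CowFile(blocks); }

    byte read(int position) {
        return blocks.get(position / BLOCK_SIZE)[position % BLOCK_SIZE];
    }

    void write(int position, byte value) {
        int index = position / BLOCK_SIZE;
        // Always clone before writing so other copies never see the change.
        // (A real engine would clone only when the block is actually shared.)
        byte[] fresh = blocks.get(index).clone();
        fresh[position % BLOCK_SIZE] = value;
        blocks.set(index, fresh);
    }

    // True if both files still point at the same physical block.
    boolean sharesBlock(CowFile other, int blockIndex) {
        return blocks.get(blockIndex) == other.blocks.get(blockIndex);
    }

    public static void main(String[] args) {
        CowFile original = new CowFile(4);
        original.write(10, (byte) 1);
        CowFile copy = original.copy();
        copy.write(10, (byte) 2);
        System.out.println(original.read(10) + " " + copy.read(10));
        System.out.println(original.sharesBlock(copy, 1));
    }
}
```

After the write, the two files differ only in the one block that was 
touched; the untouched blocks are still stored once, which is the 
property that helps when managing many versions of mostly identical 
data.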

The new data store is feature complete.  I appreciate all feedback, 
comments, testing, questions on this part of the database.  Any feedback 
on this you give me will help me greatly with documentation and making 
sure I'm heading in the right direction with the API.  The relevant 
package is 'com.mckoi.treestore'.  Check out 
src/com/mckoi/tests/Main.java for examples of how to create a 
KeyObjectDatabase for either a heap or file system data store to get you 
started.  If you don't understand something about it then feel free to 
email me.

The SQL engine:  The Mckoi 2.0 SQL engine has been almost completely 
rewritten to make use of the features provided by the new data store 
engine.  We now have a proper cost-based query optimizer and planner in 
2.0 (an EXPLAIN command is now available for analysis of query plans). 
The JDBC driver supports updatable result sets and this feature is 
implemented as a first class operation by the driver (updating 
information in a result set will not cause any SQL operations to be 
parsed or interpreted).  JDBC BLOB/CLOB support is also complete.

SQL features that are currently finished:  you can CREATE and DROP 
tables, indexes and sequences.  You can also define integrity 
constraints, but they are not enforced correctly yet.  The standard SQL 
functions are mostly finished, but there is no hook yet for defining 
user-defined functions.  Most SELECT features are working, but there's 
currently no flattening of nested queries, so nested queries cannot make 
full use of all available indexes and are not costed correctly by the 
planner.

Basically, if you are going to try out the SQL stuff you'll probably 
find something that doesn't work properly yet.  I'm still interested in 
getting feedback on the SQL code, and you can expect lots of new SQL 
features to be added and the code to be tidied up this month.

Thanks, hope you like the direction we are taking,
Toby.