Seva Software

 

What is Aruna DB?

Last Updated: 9/25/2001

A_Table

Purpose

License

Description

Dependencies

Limitations

Performance Considerations

To Do

Usage

Class Methods

Instance Methods

Testing

A_Debug Usage

 

Purpose:

This Ruby class provides table support for ArunaDB. ArunaDB is a database server written in Ruby.  Tables allow you to define columns, column constraints, and triggers on data that inserted, updated, and deleted.

  

Description:

Each table uses two btrees, one btree holds the data (called the pkey btree) and the other btree is for transaction support (called the lock btree). When transactions are enabled (this is the default) all Inserts, updates, and deletes are written to the lock btree. When you call commit the inserts, updates, and deletes are taken out of the lock btree and are applied to the pkey btree. The pkey btree contains the actual data in the table. The A_Table_Shared class caches information about all tables in memory. This allows you to connect to the same table many times and have the information about the tables cached in memory only once. This was done for both performance reasons (to make connecting to a table faster) and to make is easier to change the table.

 

Row, record, key/value pair, tuple all represents pretty much the same thing, they are a collection of keys and data that make up one record or element in a table.

 

Dependencies:

 

Limitations:

 

Performance Considerations:

 

To Do:

 

 Usage:

require "a_table"

A_Catalog.use('test') # create a catalog

# create two filestores, one for Tables and another for Transactions/Locks

A_FileStore.create('Tables', 1024, 'test.fs')

A_FileStore.create('Locks', 1024, 'test_l.fs')

# create an array of A_Columns

cols = []

cols.push(A_Column.new('member_id', 'I', true, nil, '%d > 0', nil, '%3d'))

cols.push(A_Column.new('last_name', 'v32', true, nil, "'%s' != ''", "'%s'.capitalize", '%-12s'))

cols.push(A_Column.new('first_name', 'v32', nil, '', nil, "'%s'.capitalize", '%-12s'))

# create the table

m = A_Table.new('Membership', cols, 'member_id')

# general use after the table is created

A_Catalog.use('test') # use the catalog

t1 = A_Transaction.new # start a new transaction

m.insert(t1, %w(member_id last_name first_name), [102, 'Davis', 'josh'])

m.insert(t1, %w(member_id last_name first_name), [103, 'davis', 'michelle'])

t1.commit

# show data in the table

m.each(){|data| print " #{data)}\n"}

m.close

 

Class Methods:

A_Table.connect(table_name)

Connect to an existing table.

A_Table.create_trigger(table_name, trigger_name, trigger_type, trigger)

See create_trigger() in the A_Triggers module for an explanation of this method.

A_Table.drop(table_name)

Drop this table

A_Table.drop_trigger(table_name, trigger_name, trigger_type)

See drop_trigger() in the A_Triggers module for an explanation of this method.

A_Table.exists?(table_name)

Returns true if this table_name already exists. Otherwise returns false.

A_Table.new(table_name, columns=nil, pkey_names=nil, filestore_name='Tables', lock_filestore='Locks', node_page_size=1, leaf_page_size=2)

If the table already exists, it is opened and the parameters are ignored. If the table does not exist, it is created.

A_Table.open(table_name)

Alias for A_Table.connect().

 

Instance Methods:

begin_update(transaction)

This prepares the A_Table_Data object that has been yielded by an iterator for updating. You must call this before you change any values in the A_Table_Data object and before you call update_row().

check_constraints()

This method iterators over (reads) every row and performs column constraint checks on every column. It Raises a ColumnConstraintError error if any column fails a column constraint check.

clear

Alias for truncate.

close()

Close this table and free up any used resources. Much of the table's information is still cached in the A_Table_Shared class.

column_count()

Returns a count of the number of columns in this index.

column_info(col_name)

Returns the A_Column object associated with col_name.

commit(transaction)

Don’t call this method directly. See the commit method in the A_Transaction class. This is used internally by the A_Transaction class. When a table is updated in a transaction, the table is associated with the transaction. When the transaction is committed, this method is called to commit any pending changes in this table.

create_trigger(trigger_name, trigger_type, trigger_method_name)

See create_trigger() in the A_Triggers module for an explanation of this method.

delete(transaction, pkey)

Delete the row associated with pkey from the table.

delete_row(transaction)

Delete the current row from the table. This method is used in while you are iterating over rows in this tables. You must be iterating using any of the each() methods for this method to work. If you need to delete a lot of rows from the table, it is faster to iterate and call delete_row() than to call delete() for each primary key you want to delete.

disconnect()

Alias for close().

drop()

Drop this table. This will also delete all information that is cached about this table.

drop_trigger(trigger_name, trigger_type)

See drop_trigger() in the A_Triggers module for an explanation of this method.

each(transaction=nil, min_key=nil, max_key=nil)

Iterate over all rows in the table. This yields an A_Table_Data object for this table. The A_Table_Data object include both the keys and the data for the table. The min_key and max_key parameters are a bit unusual. The btree uses these values to narrow down how many pages have to be read to retrieve the values you are looking for. If both values are nil, then all rows in the table are iterated over. If you are looking to iterate over a range of primary keys, then set the min_key to smallest primary key value and the max_key the largest. Using min_key and max_key values could really improve your performance.

each_col()

Yields each A_Column object in the table.

each_duplicate(min_key=nil, max_key=nil)

Iterate over all duplicated rows in the table. A duplicated row is two rows that have the same primary key. This is extremely useful for eliminating duplicate rows from a table. This yields an A_Table_Data object and a counter for this table. The A_Table_Data object include both the keys and the data for the table. Currently, there is no way to include any rows updated in a transaction when iterating over duplicate rows. You can call udpate_row() as a part of a transaction while you iterating over duplicate rows. The btree uses these values to narrow down how many pages have to be read to retrieve the values you are looking for. If both values are nil, then all rows in the table are iterated over. If you are looking to iterate over a range of primary keys, then set the min_key to smallest primary key value and the max_key the largest. Using min_key and max_key values could really improve your performance.

Here is an example of how to delete all duplicate rows in a table:

m = A_Table.connect('Memberhip')

t = A_Transaction.new()

m.each_duplicate() {|data_array, counter|

if (counter > 0) # make sure we don't delete the first duplicate

m.delete_row(t)

end

t.commit

m.close

each_key

Alias for each().

each_pkey

Alias for each().

each_row

Alias for each().

each_sorted(transaction=nil, min_key=nil, max_key=nil, sort_block=nil)

Iterate over all rows in the table sorted by values other than the primary key. This uses the A_FileSort class to sort the rows. This yields an A_Table_Data object for this table. The A_Table_Data object include both the keys and the data for the table. The min_key and max_key parameters are a bit unusual. The btree uses these values to narrow down how many pages have to be read to retrieve the values you are looking for. If both values are nil, then all rows in the table are iterated over. If you are looking to iterate over a range of primary keys, then set the min_key to smallest primary key value and the max_key the largest. Using min_key and max_key values could really improve your performance.

each_value

Alias for each().

exists?(transaction, pkey)

Returns true if the pkey exists in the table. Returns false if this pkey does not exists.

export(fp)

Writes this table to the file. Also see import(). This can be used to back up your tables.

find(transaction, pkey)

Returns the A_Table_Data object of the first row associated with the pkey. Returns nil if this pkey does not exists.

find_last(transaction, pkey)

Returns the A_Table_Data object of the last row associated with the pkey. Returns nil if this pkey does not exists. Find() and find_last() show always return the same row.

get_columns(col_names)

Returns an array of A_Column objects for this table.

get_key_pack_string(key_names=nil)

Returns the pack string used by this table. This is the pack used by the A_BTree when storing data. This is not very useful. See apack for details. An '*' means that one or more columns does not have a valid type and Marshal.dump is used to store the data.

import(fp)

Reads this table from the file. The table is truncated before importing start. Also see export(). This can be used to back up your tables.

insert(transaction, column_names, values)

Insert one record into the table based on the provided column list and list of values.

insert_row(transaction)

If you have prepared am A_Table_Data object (see the prepare() method), you can use this method to insert the record into the table. This does the same thing as insert(), is more difficult to use than insert(), but could be faster than insert() if you set it up properly.

is_key?(column_name)

Returns true if this column is a part of the primary key, otherwise returns false.

key_count()

Returns the number of columns used in the primary key.

key_names()

Returns an array of column names that make up the primary key.

load_finish()

This method is a work in progress. Load_start(), load_row(), and load_finish() allow you to load unsorted directly into the table. These methods are slower than inserting directly into the table because the btree can insert faster that you can sort and load data. As a result, these methods are not very useful at this time. Load_start() creates an A_FileSort object, load_row() pipes unsorted data to the A_FileSort object, and load_finish() reads from the A_FileSort object and loads the data directly into the btree. When the A_FileSort method is re-written for better performance, these methods may become for useful.

load_row()

This method is a work in progress. Load_start(), load_row(), and load_finish() allow you to load unsorted directly into the table. These methods are slower than inserting directly into the table because the btree can insert faster that you can sort and load data. As a result, these methods are not very useful at this time. Load_start() creates an A_FileSort object, load_row() pipes unsorted data to the A_FileSort object, and load_finish() reads from the A_FileSort object and loads the data directly into the btree. When the A_FileSort method is re-written for better performance, these methods may become for useful.

load_start(tmp_path=nil)

This method is a work in progress. Load_start(), load_row(), and load_finish() allow you to load unsorted directly into the table. These methods are slower than inserting directly into the table because the btree can insert faster that you can sort and load data. As a result, these methods are not very useful at this time. Load_start() creates an A_FileSort object, load_row() pipes unsorted data to the A_FileSort object, and load_finish() reads from the A_FileSort object and loads the data directly into the btree. When the A_FileSort method is re-written for better performance, these methods may become for useful.

lock()

Grab the lock associated with this table. This lock is shared by all objects that connect to this table.

name()

Returns the name of this table.

nitems()

Returns the number of items (rows) stored in this table.

prepare(transaction)

Returns the A_Table_Data object prepared for inserting. This set the value in the A_Table_Data object to the default values as specified in the A_Column objects. You only need to use this method if you want to use the insert_row method to insert data. See the insert_row() method for more details. Here is an example:

m = A_Table.connect('membership')

t = A_Transaction.new()

d1 = m.prepare(t)

d1.member_id = 101

d1.last_name = 'davis'

d1.first_name = 'Michael'

m.insert_row(t)

t.commit

 

reset()

Alias for prepare.

rollback(transaction)

Don’t call this method directly. See the commit method in the A_Transaction class. This is used internally by the A_Transaction class. When a table is updated in a transaction, the table is associated with the transaction. When the transaction is committed, this method is called to commit any pending changes in this table.

rows_accessed()

Returns the number of rows accessed while iterating. Example:

m = A_Table.connect('membership')

m.each(){|data| print " #{data}\n"}

print " #{m.rows_accessed} rows\n"

rows_affected()

Returns the number of rows affected by update(), insert(), and delete(). You can also use this after iterating for a count of row affected by delete_row() and update_row().

rows_deleted()

Alias for rows_affected().

rows_inserted()

Alias for rows_affected().

rows_updated()

Alias for rows_affected().

set_defaults()

Alias for prepare().

show(format='A')

Prints useful information about this table and the btree used by this table.

show_triggers(prefix='')

See show_triggers() in the A_Triggers module for an explanation of this method.

synchronize()

Mutex.synchronize the lock associated with this table. This lock is shared by all objects that connect to this table.

table_name()

Returns that name of this table.

to_s

Returns a string containing useful information about this table.

triggers_exist

See triggers_exist() in the A_Triggers module for an explanation of this method.

truncate()

Quickly delete all rows in this table. This is not a part of a transaction and can't be undone or rolled back.

unlock()

Release the lock associated with this table. This lock is shared by all objects that connect to this table.

update(transaction, pkey, column_names, values)

This updates the row in the table associated with pkey.

update_row(transaction)

Update the current row in the table you are iterating over. This method can only be used in conjunction with one of the methods that iterator such as each, each_key, each_value, etc. You must call begin_update before you start changing value in the the A_Table_Data object that is yielded by the iterator.

Here is an example:

m1 = A_Table.connect('Membership')

t1 = A_Transaction.new

# this iterates only over rows where the pkey == 102

m1.each(t1, 102, 102) {|data|

data = m1.begin_update(t1)

data.first_name = 'atest'

m1.update_row(t1)

}

t1.commit

m1.close

use()

Alias for prepare().

 

Testing: 

tst_a_table.rb - this script performs basic testing for the A_Table class. To run this tests type:

ruby -I.. tst_a_table.rb

 

A_Debug Usage:

10 - prints information about creating, dropping, and closing A_Table_Data, A_Table_Shared, and A_Table classes. Also prints inforation about creating and dropping triggers.

14 - prints information about committing and rolling back transactions.

15 - prints information about set and extracting info out of the A_Table_Data cless such as when populating indexes. Also prints information about iterating over tables. Also prints information about updates, inserts, and deletes. This will create a lot of debug.

18 - prints information about creating, dropping, and updating indexes

19 - prints information checking column constraints, applying column actions, column assignments, etc. (column level stuff)