SolomonDB Weekly Update (#3): Beleth

SolomonDB Weekly Update (#3): Beleth

Date: November 2022 (8-13/11/2022)

·

6 min read

Support my project by visiting SolomonDB Github and give me a star ⭐️

Weekly Update #3: Beleth

Date: November 2022 (8-13/11/2022)

Beleth is a mighty and terrible king of Hell, who has eighty-five legions of demons under his command. He rides a warhorse, and all kind of music is heard before him, according to most authors on demonology and the most known grimoires. According to Pseudomonarchia Daemonum, Ham, son of Noah, was the first in invoking him after the flood, and wrote a book on mathematics with his help. - Wikipedia

Commits on Nov 8, 2022

Description

  • Adding macro to avoid duplicate of controller implementation
  • Update static column family names
  • Implement basic methods for LabelController and VertexController. These methods allow us to mutate and query data from vertex database.

Detail explanation

Add more column family name to the global variable

pub const CF_NAMES: [&str; 4] = ["test_suite:v1", "vertices:v1", "labels:v1", "edges:v1"];

How LabelController and VertexController implemented using the macro

impl_controller!(get RocksDBAdapter; from rocks_db for LabelController);
impl_controller!(get RocksDBAdapter; from rocks_db for VertexController);

Contributors


Commits on Nov 10, 2022

Description

  • Adding more methods to controllers
  • Refractor impl_controller macro to make it cleaner and more versatile
  • Add a property controller PropertyController to manage properties
  • Add feature for CassandraDB: kv-cassandradb
  • New method multi_get for database transaction
  • Implement a GlobalConfig

Detail explanation

A new approach to implement a controller. Now we can identify the controller identifier and its column family.

impl_controller!(LabelController("labels:v1"));
impl_controller!(VertexController("vertices:v1"));
impl_controller!(PropertyController("properties:v1"));

New method added for database transaction to retrieve multiple data at once

async fn multi_get<K>(&self, cf: CF, keys: Vec<K>) -> Result<Vec<Option<Val>>, Error>
    where
        K: Into<Key> + Send + AsRef<[u8]>,

The point of GlobalConfig is that we don't want controller use different databases for each of them. To make it in sync, there will be a struct GlobalConfig storing a datastore where other controllers can reference from.

macro_rules! impl_global_config {
    ($datastore: ident) => {
        pub struct GlobalConfig {
            pub ds: $datastore,
        }

        impl GlobalConfig {
            pub fn new() -> Result<Self, Error> {
                Ok(GlobalConfig {
                    ds: $datastore::default(),
                })
            }
        }
    };
}

Contributors


Commits on Nov 11, 2022

Description

Apply some techniques to serialize Rust struct data into bytes data. Derived from my experience with Solana development, I learnt the approach Solana codebase uses to for byte serialization and deserialization. This can be briefly described with a work Discriminator.

Detail explanation

The concept of Account Discriminator is common in Solana development using Anchor framework (Link to reference).

We will take these lines of code below as an example. This code is to serialize Label struct into Vec<u8> and store it in our data storage.

// Handle byte concatenate for label components
let _label_discriminator = serialize_discriminator(AccountDiscriminator::Label).unwrap();
let _labels = build_bytes(&label_components).unwrap();
let _label_offset = &build_offset(ll, uuid_len);
let (l_dis, l_offset, l) =
(_label_discriminator.as_slice(), _label_offset.as_slice(), _labels.as_slice());
let labels_concat = [l_dis, l_offset, l].concat();

Understanding the code, there will be three components of the bytes vector Label: (account discriminator / offset / object byte data)

  • Account Discriminator: Identifier of a struct object. Below code indicates the mapped index for each object. In Label case, the mapped index is 2
#[derive(PartialEq, Serialize, Deserialize, Eq, Debug, Clone)]
pub enum AccountDiscriminator {
    None = 0,
    Vertex = 1,
    Label = 2,
    Property = 3,
    Relationship = 4,
}
  • Offset: Or it can be called meta, it describes the number of byte indices object data will occupy. For example, a string hello = [68 65 6c 6c 6f] will have an offset as 5 as there are 5 byte elements in the vector.

(NOTE: The reason why we must store metadata of the object byte data is because there might be a chance the stored data is a vector of data)

  • Object byte data: This is obvious, the byte data of the object data. Taking the above string example, the byte data for String object hello is hello = [68 65 6c 6c 6f].

Combining all the byte component together, we get something like

/// The return data of this example might not be exact
/// but it is the idea
hello = [ 2 5 68 65 6c 6c 6f ]

Contributors


Commits on Nov 12, 2022

Description

Add a workflow script to automatically build and deploy a doc page to Github Pages.

Detail explanation

Rust ecosystem has a very interesting crate called mdbook that helps developer to construct a documentation page in one click. Command to install a cargo crate

cargo install mdbook

Documentation book of SolomonDB is stored in dpc/src. The page is hosted on SolomonDB Documentation Page

Contributors


Commits on Nov 13, 2022

Description

Add two badges to markdown. Just a visual stuff.

Detail explanation

Here is the two badges added to the README.md file

Contributors


Commits on Nov 15, 2022

Description

There are several keynote changes in today update.

  • Remove GlobalConfig. I found a better way to make Datastore be referenced in multiple places. Controllers need Datastore to mutate and retrieve data from a storage.
  • Add global Transaction struct. This will point to a right datastore method based on a given entry.
  • Add new transaction method to iterate a storage

Detail explanation

Database Reference (DBRef)

This is derived from IndraDB DBRef struct design

// Source: ./../managers.rs/*DBRef
#[derive(Copy, Clone)]
pub(crate) struct DBRef<'a> {
    pub db: &'a DB,
    pub indexed_properties: &'a HashSet<models::Identifier>,
}

// Referenced by SolomonDB
#[derive(Copy, Clone)]
pub struct DBRef<'a> {
    pub db: &'a Datastore,
}

The purpose of this new struct is like its name Database Reference (DBRef). Struct Datastore alone can't be referenced multiple times as it can not implement Copy or Clone. DBRef will take a Datastore and assign it with 'a lifetime.

In that way, this can be referenced in multiple places.

/// Assigning datastore reference to graph controllers
let path = generate_path(None);
let ds = Datastore::new(&path).unwrap();
let r = DBRef::new(&ds);
let vc = VertexController::new(r);
let ec = EdgeController::new(r);
let lc = LabelController::new(r);
let epc = EdgePropertyController::new(r);

Global Transaction struct

The idea is to provide a global Transaction struct a field inner. This inner identify a provided Datastore. I utilize the power of Rust enum to do generic datastore.

#[allow(clippy::large_enum_variant)]
pub(super) enum Inner {
    #[cfg(feature = "kv-rocksdb")]
    RocksDB(RocksDBTransaction),
}

pub struct Transaction {
    pub(super) inner: Inner,
}

#[async_trait]
impl SimpleTransaction for Transaction {
    // Example method
    fn method_name(&self) -> bool {
        match self {
            Transaction {
                inner: Inner::RocksDB(ds),
                ..
            } => ds.method_name(),
        }
    }
}

Iterate and refix iterate

RocksDB has a very powerful method called iterate. This let you to iterate each key-value pair in RocksDB filtering with specific condition. On the other hand, it also have prefix_iterate to iterate each pair and only get pair has key contains provided prefix.

async fn iterate(&self, cf: CF) -> Result<Vec<Result<KeyValuePair, Error>>, Error>;

async fn prefix_iterate<P>(
        &self,
        cf: CF,
        prefix: P,
    ) -> Result<Vec<KeyValuePair, Error>>, Error>;

Contributors