Interview :: Cassandra
In Cassandra, a collection of rows is referred to as a "column family".
Cassandra performs a write in two steps:
- The write is first appended to the commit log on disk and then applied to an in-memory structure known as the memtable.
- When both steps have completed successfully, the write is acknowledged.
- When a memtable is later flushed, its contents are written to disk as an SSTable (sorted string table).
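The write path described above can be sketched as a toy model in Python. This is an illustration only, not Cassandra's implementation: the `ToyNode` class, the JSON file formats, and the entry-count flush threshold `memtable_limit` are all assumptions made for the example.

```python
import json
import os
import tempfile


class ToyNode:
    """Toy write path: commit log first, then memtable, then flush to an SSTable file."""

    def __init__(self, data_dir, memtable_limit=3):
        self.data_dir = data_dir
        # Hypothetical threshold: flush after this many distinct keys.
        self.memtable_limit = memtable_limit
        self.commit_log_path = os.path.join(data_dir, "commitlog.txt")
        self.memtable = {}   # in-memory: key -> columns
        self.sstables = []   # paths of flushed files, never modified again

    def write(self, key, columns):
        # Step 1: append the write to the on-disk commit log for durability.
        with open(self.commit_log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "columns": columns}) + "\n")
        # Step 2: apply the same write to the in-memory memtable.
        self.memtable.setdefault(key, {}).update(columns)
        # Flush once the memtable is "full" by the toy's definition.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        path = os.path.join(self.data_dir, f"sstable-{len(self.sstables)}.json")
        rows = sorted(self.memtable.items())  # rows are written sorted by key
        with open(path, "w", encoding="utf-8") as out:
            json.dump(rows, out)
        self.sstables.append(path)
        self.memtable = {}
        # Toy simplification: empty the commit log, since its writes are now on disk.
        open(self.commit_log_path, "w", encoding="utf-8").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        node = ToyNode(data_dir, memtable_limit=2)
        node.write("b", {"name": "Bob"})
        node.write("a", {"name": "Ann"})  # second key triggers a flush
        print(node.sstables)
```

The point of the ordering is that a write already recorded in the commit log can be replayed if the process stops before the memtable is flushed.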
A memtable is an in-memory, write-back cache that holds content in key and column format. Data in a memtable is sorted by key, and each column family has its own memtable, from which column data is retrieved by key. The memtable stores writes until it is full and is then flushed to disk.
SSTable is short for 'Sorted String Table'. It is an important data file in Cassandra, to which memtables are regularly flushed. SSTables are stored on disk and exist for each Cassandra table.
SSTables are immutable: no data items can be added or removed once an SSTable has been written. For each SSTable, Cassandra also creates three separate files: a partition index, a partition summary and a bloom filter.
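To show what the bloom filter is for, here is a minimal sketch in Python. A bloom filter can answer "definitely not present" or "possibly present" for a key, which lets a read skip SSTables that cannot contain it. The `ToyBloomFilter` class, its sizes, and the use of SHA-256 are assumptions for the example and do not reflect Cassandra's own hashing or file format.

```python
import hashlib


class ToyBloomFilter:
    """Minimal bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False means the key was never added; True means it may have been.
        return all(self.bits[pos] for pos in self._positions(key))


if __name__ == "__main__":
    bloom = ToyBloomFilter()
    bloom.add("user:42")
    print(bloom.might_contain("user:42"))
```

Because an SSTable never changes after it is written, a filter built once at flush time stays valid for the lifetime of that file.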
DataStax OpsCenter: A web-based management and monitoring solution for Cassandra clusters and DataStax Enterprise. It is free to download, and an additional edition of OpsCenter is also available.
SPM: SPM primarily monitors Cassandra metrics along with various OS and JVM metrics. Besides Cassandra, it also monitors Hadoop, Spark, Solr, Storm, ZooKeeper and other Big Data platforms.
The main features of SPM are:
- Correlation of events and metrics
- Distributed transaction tracing
- Creating real-time graphs with zooming
- Detection and heartbeat alerting
In Cassandra, the cluster is the outermost container for keyspaces. It arranges the nodes in a ring and assigns data to them. Data on each node is replicated to other nodes, and a replica takes over if a node fails to handle a request.
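The ring arrangement can be illustrated with a small Python sketch. Each node gets a token on a ring, a key is hashed to a token, and the key's replicas are the next nodes found walking clockwise from that position. The function names, the 32-bit ring size, and the use of SHA-256 are assumptions for the example; Cassandra's partitioners and replication strategies are more involved.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # hypothetical token space for the toy ring


def token_for(value):
    """Map a string to a position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


def build_ring(node_names):
    """Return (token, node) pairs sorted by token, i.e. in ring order."""
    return sorted((token_for(name), name) for name in node_names)


def replicas_for(key, ring, replication_factor=2):
    """Pick the first node at or after the key's token, then walk clockwise."""
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


if __name__ == "__main__":
    ring = build_ring(["node-a", "node-b", "node-c"])
    print(replicas_for("user:42", ring, replication_factor=2))
```

If the first node in the returned list is unavailable, the remaining entries are the nodes that also hold the key in this toy model.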
ALTER KEYSPACE is used to change the properties of an existing keyspace, namely its replication settings and the value of DURABLE_WRITES.
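As a concrete illustration, the Python sketch below assembles an ALTER KEYSPACE statement as a string. The helper `alter_keyspace_cql` and the keyspace name `demo_ks` are hypothetical, and the helper does no escaping or validation, so it is suitable only for showing the shape of the statement.

```python
def _cql_literal(value):
    """Render a Python value as a simple CQL map value (toy: no escaping)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    return f"'{value}'"


def alter_keyspace_cql(keyspace, replication=None, durable_writes=None):
    """Build an ALTER KEYSPACE statement from the properties that were given."""
    clauses = []
    if replication is not None:
        pairs = ", ".join(f"'{k}': {_cql_literal(v)}" for k, v in replication.items())
        clauses.append("REPLICATION = {" + pairs + "}")
    if durable_writes is not None:
        clauses.append("DURABLE_WRITES = " + _cql_literal(bool(durable_writes)))
    if not clauses:
        raise ValueError("at least one property must be supplied")
    return f"ALTER KEYSPACE {keyspace} WITH " + " AND ".join(clauses) + ";"


if __name__ == "__main__":
    print(alter_keyspace_cql(
        "demo_ks",
        replication={"class": "SimpleStrategy", "replication_factor": 3},
        durable_writes=True,
    ))
```

Setting DURABLE_WRITES to false makes writes to that keyspace bypass the commit log, so it trades durability for speed and is usually left at its default of true.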