

  1. Benefits of Cassandra

    by , 02-23-2012 at 07:30 PM
    Cassandra provides a number of benefits. The key ones are:

    • Fault tolerant: Cassandra automatically replicates data to multiple nodes, so a failed node can be replaced with no downtime.
    • Decentralized: Every node in a Cassandra cluster is identical, and there are no network bottlenecks.
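    The replication idea in the list above can be pictured with a small sketch. It is a simplified illustration, not Cassandra's actual placement code: the class `ReplicaSketch`, its methods, and the node names are hypothetical, and the real replica placement strategy is configurable. It only shows that when each key lives on several identical nodes, any surviving replica can still serve it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not Cassandra's real placement code.
final class ReplicaSketch {

    // Pick up to `replicas` distinct nodes for a key by treating the node list as a ring.
    static List<String> replicasFor(String key, List<String> nodes, int replicas) {
        List<String> chosen = new ArrayList<>();
        if (nodes.isEmpty()) {
            return chosen;
        }
        int count = Math.min(replicas, nodes.size());
        // floorMod keeps the start index non-negative even for negative hash codes.
        int start = Math.floorMod(key.hashCode(), nodes.size());
        for (int i = 0; i < count; i++) {
            chosen.add(nodes.get((start + i) % nodes.size()));
        }
        return chosen;
    }

    // Return the first replica that is not marked as down, or null if all are down.
    static String firstLive(List<String> replicas, Set<String> down) {
        for (String node : replicas) {
            if (!down.contains(node)) {
                return node;
            }
        }
        return null;
    }
}
```

    With four nodes and three replicas per key, losing one replica still leaves two nodes that hold the data, which is the property the fault-tolerance bullet describes.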
  2. NoSQL

    by , 02-23-2012 at 07:28 PM
    NoSQL describes a database that does not expose an SQL interface. The term was first introduced in 1998. In 2009 Eric Evans, then a Rackspace employee, reintroduced it when an event on open source distributed databases was being organized.

    The term NoSQL is often used for databases built on a key-value architecture, although that is not the case for Cassandra. These databases do not provide an interface for SQL queries, and they normally avoid join operations. Memcached ...
    Tags: nosql
  3. Cassandra Build the index

    by , 02-23-2012 at 07:27 PM
    Once the data is ready, the next step is to store it in a column family. All the tags created by the tokenizer are processed in this step. The tokenizer gives us a list of tags with document IDs, and with this information we can do the following:
    • Check the tags for duplication.
    • Write data to column family in Cassandra.
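    The two steps above can be sketched without a running cluster. This is a minimal sketch under stated assumptions: the class `TagIndexSketch` is hypothetical, and an in-memory map stands in for the Cassandra column family (row key = tag, column names = document IDs). It is not the code from the original post.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in for a column family; no Cassandra client is involved.
final class TagIndexSketch {

    // Row key = tag, "columns" = the IDs of the documents that carry the tag.
    private final Map<String, Set<String>> rows = new TreeMap<>();

    // Step 1: drop duplicate tags. Step 2: write each tag with the document ID.
    void index(String docId, List<String> tags) {
        Set<String> unique = new LinkedHashSet<>(tags);
        for (String tag : unique) {
            rows.computeIfAbsent(tag, k -> new TreeSet<>()).add(docId);
        }
    }

    // Look up all documents stored under a tag; empty if the tag is unknown.
    Set<String> documentsFor(String tag) {
        return rows.getOrDefault(tag, Collections.<String>emptySet());
    }
}
```

    Keying the rows by tag means a search for a tag becomes a single row lookup rather than a scan over every document.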

    Java Code: This is the code to explain index building
    private void tokenize(String doc, String docID) {
            // remove all non-alphanumeric values
  4. Cassandra Index generator

    by , 02-23-2012 at 07:23 PM
    In this section, we will create a Cassandra index. The code below simulates a simple indexer made of a few components. We read text files as the input resources for the index generator. After reading their content into memory, we pass it to a tokenizer. The tokenizer removes all non-alphanumeric characters using regular expressions, then splits the text using spaces as the delimiter. Finally, it randomly chooses words to be used as tags. ...
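    The tokenizer steps described above can be sketched as follows. This is an assumption-laden sketch, not the post's own code: `TokenizerSketch`, `tokenize`, and `pickTags` are hypothetical names, and non-alphanumeric runs are replaced with a space (rather than deleted outright) so that the later split on spaces still has something to split on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the tokenizer steps; names are not from the original post.
final class TokenizerSketch {

    // Strip non-alphanumeric characters with a regular expression, then split on spaces.
    static List<String> tokenize(String text) {
        String cleaned = text.replaceAll("[^A-Za-z0-9]+", " ").trim();
        if (cleaned.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(cleaned.split(" ")));
    }

    // Randomly choose up to `count` words to act as tags (count must not be negative).
    static List<String> pickTags(List<String> words, int count, Random random) {
        List<String> shuffled = new ArrayList<>(words);
        Collections.shuffle(shuffled, random);
        return new ArrayList<>(shuffled.subList(0, Math.min(count, shuffled.size())));
    }
}
```

    Passing the `Random` in as a parameter keeps the random tag selection repeatable in tests, since a fixed seed always yields the same shuffle.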
  5. Cassandra Delete Operation

    by , 02-23-2012 at 07:21 PM
    In this section, we will discuss how to delete a value from a Cassandra database using Java code.
    Java Code: This is the code to explain Cassandra Delete Operation
    import org.apache.cassandra.thrift.Deletion;
    long timestamp = System.currentTimeMillis();
    List columns = new ArrayList();
    SlicePredicate slicePredicate = new SlicePredicate();
    Deletion deletion = new Deletion(timestamp);
  6. Cassandra Update Operation

    by , 02-23-2012 at 07:19 PM
    To update data in the Cassandra database, the batch_update method is used. In the code below, we update the email address stored in the database.
    Java Code: This is the code to explain Cassandra Update Operation
    long timestamp = System.currentTimeMillis();
    Column column = new Column("email".getBytes("utf-8"), "".getBytes("utf-8"), timestamp);
    ColumnOrSuperColumn columnOrSuperColumn = new ColumnOrSuperColumn();
  7. Cassandra Read Operation

    by , 02-23-2012 at 07:17 PM
    Suppose, for example, that we have a number of authors in the database and want to read them from Java. The code below shows this operation in detail.
    Java Code: This is the code to explain Cassandra Read Operation
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    SlicePredicate slicePredicate = new SlicePredicate();
  8. Cassandra Add Operation

    by , 02-23-2012 at 07:15 PM
    The following code shows the add operation in Java.
    Java Code: This is the code to explain Cassandra Add Operation
    Map<String, List<ColumnOrSuperColumn>> data = new HashMap<String, List<ColumnOrSuperColumn>>();        
    List<ColumnOrSuperColumn> columns = new ArrayList<ColumnOrSuperColumn>();
    // Create the email column.
    ColumnOrSuperColumn c = new ColumnOrSuperColumn();
    c.setColumn(new Column("email".getBytes("utf-8"), "ronald (at)".getBytes("utf-8"),
  9. Cassandra Search Indexes

    by , 02-23-2012 at 07:10 PM
    In some applications, full text search can be performed at acceptable speed by reading the whole content of each document and looking for the required data. That work is repeated every time a query is executed, so the approach is neither feasible nor recommended when a huge amount of data has to be searched. In practice, almost no one searches documents or databases this way.
    With the increasing amount of data on the internet, we need to search millions or sometimes billions ...
  10. Cassandra Required Libraries

    by , 02-23-2012 at 07:08 PM
    To communicate with a Cassandra database from Java, the client needs a number of libraries on its classpath. All of them are provided in the “CASSANDRA_HOME/lib” folder of the Cassandra installation directory. They include:

    • apache-cassandra.jar: Contains Cassandra's customization of the Thrift communication protocol.
    • slf4j-log4j.jar: Contains the SLF4J plug-in that adds support for Log4j.
    • slf4j-api.jar: A wrapper for logging frameworks.
  11. Making Cassandra Database Connection

    by , 02-23-2012 at 07:06 PM
    The first step is to create a connection to the database. Open a connection on port 9160, which is Cassandra's default port, and hand it to the Cassandra client, which then takes care of communication with the server:
    Java Code: This is the code to explain Cassandra Database Connection
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.protocol.TProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;
    import org.apache.cassandra.thrift.Cassandra;
  12. Cassandra Sorting

    by , 02-23-2012 at 07:04 PM
    Sorting is specified with the CompareWith attribute of a ColumnFamily. The available options are:
    1. BytesType
    2. UTF8Type
    3. LexicalUUIDType
    4. TimeUUIDType
    5. AsciiType
    6. LongType

    Each of these types treats the content of Column names as a different data type. For example, LongType treats Column names as 64-bit long values. As another example, suppose a ColumnFamily where CompareWith ...
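    The effect of the comparator choice can be shown with a small sketch. It is an illustration under assumptions, not Cassandra code: `CompareWithSketch` is a hypothetical class, column names are written as decimal strings for readability (a real LongType name is an 8-byte value), and the text ordering uses Java string comparison, which matches UTF-8 byte order for the ASCII names used here.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: the same column names ordered by two different comparators.
final class CompareWithSketch {

    // Text-style ordering, in the spirit of UTF8Type: names are compared as strings.
    static final Comparator<String> AS_TEXT = Comparator.naturalOrder();

    // Numeric ordering, in the spirit of LongType: names are compared as long values.
    static final Comparator<String> AS_LONG =
            Comparator.comparingLong((String s) -> Long.parseLong(s));

    // Return a new list with the names sorted by the given comparator.
    static List<String> sorted(List<String> names, Comparator<String> order) {
        return names.stream().sorted(order).collect(Collectors.toList());
    }
}
```

    For the names "10", "9" and "100", the text ordering gives 10, 100, 9, while the numeric ordering gives 9, 10, 100. This is why the comparator should match the kind of data stored in the column names.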
  13. Cassandra Keyspaces

    by , 02-23-2012 at 07:02 PM
    Keyspaces are very simple. From an RDBMS point of view, a keyspace is comparable to a schema. Normally you have just one per application. A keyspace contains ColumnFamilies. Note, however, that no relationship exists between ColumnFamilies; they are separate containers. Next comes the sorting mechanism of the different containers, which makes clear how the data model works in Cassandra.
  14. Cassandra SuperColumn Family

    by , 02-23-2012 at 07:00 PM
    Finally, we come to the largest container, the SuperColumnFamily. If you understand the ColumnFamily, this construction is not much harder. Instead of Columns, the innermost Map holds SuperColumns, which adds an extra dimension. The key of the Map that holds the SuperColumns must be the same as the name of the SuperColumn.
    Java Code: This is the code to explain Cassandra SuperColumn Family
    public class SuperColumnFamily {
      Byte[] name;
      // The key is a user generated key
  15. Cassandra Column Family

    by , 02-23-2012 at 06:58 PM
    A ColumnFamily is a structure that can hold an unbounded number of rows. For readers with an RDBMS background, it closely resembles a table. A ColumnFamily has a name, comparable to a table name, and a map whose key is comparable to a row identifier and whose value is itself a map of Columns. As with the map inside a SuperColumn, each key of that inner map has the same value as the name of its Column.
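    As a rough picture of that structure, here is a minimal sketch. It is not the post's own code: `ColumnFamilySketch` and its members are hypothetical, and `String` is used in place of byte arrays to keep the example readable. It models a ColumnFamily as a name plus a map from row key to a map of Columns keyed by column name.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a ColumnFamily as nested maps; not a Cassandra class.
final class ColumnFamilySketch {

    // A Column is a (name, value, timestamp) tuple.
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    final String name;

    // Row key -> (column name -> Column). The inner key always equals the Column's name.
    final Map<String, Map<String, Column>> rows = new TreeMap<>();

    ColumnFamilySketch(String name) {
        this.name = name;
    }

    // Store a column under a row key, creating the row on first use.
    void insert(String rowKey, Column column) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column.name, column);
    }

    // Fetch one column of one row, or null if the row or column does not exist.
    Column get(String rowKey, String columnName) {
        Map<String, Column> row = rows.get(rowKey);
        return row == null ? null : row.get(columnName);
    }
}
```

    Unlike a table, each row here can carry a different set of columns, because a row is just a map of whatever Columns were inserted under its key.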

    Java Code: This is the code to explain Cassandra Column Family

    Updated 02-23-2012 at 07:01 PM by Cassandra

  16. Cassandra SuperColumn

    by , 02-23-2012 at 06:56 PM
    A SuperColumn is a tuple with a name and a value. Unlike the Column tuple, it has no timestamp. Note that the value is not a binary value but a Map-style container. The map holds key/Column pairs, and, importantly, each key has the same value as the name of its Column. Put simply, a SuperColumn contains one or more Columns.
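    That description can be sketched in a few lines. This is a hypothetical illustration, not a class from the Cassandra code base: `SuperColumnSketch` and its members are invented names, and `String` replaces byte arrays for readability. The sketch shows a name plus a map of Columns, with no timestamp on the SuperColumn itself.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a SuperColumn: a name and a map of Columns, no timestamp.
final class SuperColumnSketch {

    // A Column is a (name, value, timestamp) tuple.
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    final String name;

    // Column name -> Column. The key is always taken from the Column itself.
    final Map<String, Column> columns = new TreeMap<>();

    SuperColumnSketch(String name) {
        this.name = name;
    }

    // Add a Column, keyed by its own name so key and name cannot drift apart.
    void put(Column column) {
        columns.put(column.name, column);
    }
}
```

    Deriving the map key from the Column inside `put` enforces the rule that the key and the Column name hold the same value.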
  17. Cassandra Column

    by , 02-23-2012 at 06:54 PM
    A Column is a tuple (triplet) made up of a name, a value, and a timestamp. It is the smallest data container.

    [Figure: Cassandra Column]

    The Java representation of a Column is given below. This representation also makes the more complex structures easier to explain:
    Java Code: This is the code to explain Cassandra Column
    public class Column {
      Byte[] name;
      Byte[] value;
      Long timestamp;
  18. Cassandra alternatives

    by , 02-21-2012 at 09:12 PM
    Before starting with Cassandra, it is important to know that many alternatives exist for complementing a relational database. For example, memcached is built for caching and is free of cost. Applying such a solution to a larger MySQL database alone may bring a performance boost of as much as 100 times.

    Another well-known NoSQL solution is MongoDB, which is written in the language ...
    Tags: cassandra
  19. Why Use Cassandra?

    by , 02-21-2012 at 09:09 PM
    Why should Cassandra be chosen over any other NoSQL solution?

    • A main selling point of Cassandra is that it is written in Java.
    • Cassandra has been used at Facebook, the largest website in the world.
    • Cassandra meets all the requirements of a decentralized system and has no single point of failure.
    • Read and write throughput both increase linearly as new machines are added, without any downtime or interruption.
    • It gets
    Tags: cassandra
  20. What is Cassandra?

    by , 02-21-2012 at 09:06 PM
    The Apache Cassandra Project develops a highly scalable, second-generation distributed database. It brings together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model.

    Apache Cassandra is an open source, distributed, highly available database. Its architecture incorporates the design of the Dynamo project, and its data model is based on Google's Bigtable.
    Tags: cassandra