Development

  1. Cassandra Build the index

    by , 02-23-2012 at 07:27 PM
    Once the data is ready, the next step is to store it in a column family. All the tags created by the tokenizer can be processed in this step. The tokenizer provides a list of tags with their document IDs. With this information, we can do the following:
    • Check the tags for duplication.
    • Write data to column family in Cassandra.
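
The two steps above can be sketched without a running cluster. The following is only an illustration under stated assumptions: the class `InMemoryTagIndex` and its methods are hypothetical names, and a sorted in-memory map stands in for the column family (row key = tag, column names = document IDs). It is not the Cassandra client API and not the code from this post.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for a column family:
// each row key is a tag, each "column name" is a document ID.
class InMemoryTagIndex {

    private final Map<String, Set<String>> rows = new TreeMap<String, Set<String>>();

    // Stores the (tag, docId) pair unless it is already present.
    // Returns true if the pair was new, false if it was a duplicate.
    boolean add(String tag, String docId) {
        Set<String> docIds = rows.get(tag);
        if (docIds == null) {
            docIds = new TreeSet<String>();
            rows.put(tag, docIds);
        }
        return docIds.add(docId);
    }

    // Returns the document IDs stored under a tag (empty if the tag is unknown).
    Set<String> lookup(String tag) {
        Set<String> docIds = rows.get(tag);
        if (docIds == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(docIds);
    }

    public static void main(String[] args) {
        InMemoryTagIndex index = new InMemoryTagIndex();
        System.out.println(index.add("cassandra", "doc1")); // new pair
        System.out.println(index.add("cassandra", "doc1")); // duplicate, not stored again
        System.out.println(index.add("cassandra", "doc2")); // same tag, new document
        System.out.println(index.lookup("cassandra"));
    }
}
```

In a real setup the `add` call would be replaced by a write to the column family; the duplicate check is kept separate here so that the same tag and document pair is written only once.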

    Java Code: This is the code to explain index building
    private void tokenize(String doc, String docID) {
    
            // remove all non-alphanumeric values
    ...
    Categories
    Development
  2. Cassandra Index generator

    by , 02-23-2012 at 07:23 PM
    In this section, we will create a Cassandra index. The code below simulates a simple indexer built from a few components. It reads text files as the input resources for the index generator. After reading their content into memory, it passes the content to a tokenizer. The tokenizer removes all non-alphanumeric characters using regular expressions, then splits the text into words using spaces as the delimiter. Finally, it randomly chooses words to be used as tags. ...
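
The tokenizer steps described above could look roughly like the sketch below. It is a minimal illustration, not the post's original code: the class and method names (`SimpleTokenizer`, `tokenize`, `pickTags`) are hypothetical. It also assumes that punctuation is replaced by a space rather than deleted (so "Cassandra-based" becomes two words) and that the random choice picks distinct words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;

// Hypothetical tokenizer sketch: clean the text, split it into words, pick random tags.
class SimpleTokenizer {

    // Replaces every character that is not a letter, digit, or whitespace
    // with a space, then splits the result on runs of whitespace.
    static List<String> tokenize(String text) {
        String cleaned = text.replaceAll("[^A-Za-z0-9\\s]", " ").trim();
        if (cleaned.isEmpty()) {
            return new ArrayList<String>();
        }
        return new ArrayList<String>(Arrays.asList(cleaned.split("\\s+")));
    }

    // Picks up to maxTags distinct words at random to serve as tags.
    static List<String> pickTags(List<String> words, int maxTags, Random random) {
        List<String> distinct = new ArrayList<String>(new LinkedHashSet<String>(words));
        Collections.shuffle(distinct, random);
        int count = Math.max(0, Math.min(maxTags, distinct.size()));
        return new ArrayList<String>(distinct.subList(0, count));
    }

    public static void main(String[] args) {
        List<String> words = tokenize("Hello, world! Cassandra-based index: hello again.");
        System.out.println(words);
        System.out.println(pickTags(words, 3, new Random()));
    }
}
```

Passing the `Random` instance in as a parameter keeps the tag selection repeatable in tests, where a seeded generator can be supplied.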
  3. Cassandra Delete Operation

    by , 02-23-2012 at 07:21 PM
    In this section, we will discuss how to delete a value from a Cassandra database using Java code.
    Java Code: This is the code to explain Cassandra Delete Operation
    import org.apache.cassandra.thrift.Deletion;
    ...
    long timestamp = System.currentTimeMillis();
    List<byte[]> columns = new ArrayList<byte[]>();
    columns.add("email".getBytes());
    SlicePredicate slicePredicate = new SlicePredicate();
    slicePredicate.setColumn_names(columns);
    Deletion deletion = new Deletion(timestamp);
    deletion.setPredicate(slicePredicate);
    ...
  4. Cassandra Update Operation

    by , 02-23-2012 at 07:19 PM
    To update data in a Cassandra database, the batch_update method is used. In the code below, we update the email address stored in the database.
    Java Code: This is the code to explain Cassandra Update Operation
    long timestamp = System.currentTimeMillis();
    Column column = new Column("email".getBytes("utf-8"), "ronald@mathies.nl".getBytes("utf-8"), timestamp);
    ColumnOrSuperColumn columnOrSuperColumn = new ColumnOrSuperColumn();
    columnOrSuperColumn.setColumn(column);
    ...
  5. Cassandra Read Operation

    by , 02-23-2012 at 07:17 PM
    Suppose we have a number of authors in the database and want to read them from Java. The code below shows this operation in detail.
    Java Code: This is the code to explain Cassandra Read Operation
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    ...
    SlicePredicate slicePredicate = new SlicePredicate();
    ...
  6. Cassandra Add Operation

    by , 02-23-2012 at 07:15 PM
    The following code shows the add operation in Java.
    Java Code: This is the code to explain Cassandra Add Operation
    Map<String, List<ColumnOrSuperColumn>> data = new HashMap<String, List<ColumnOrSuperColumn>>();        
    List<ColumnOrSuperColumn> columns = new ArrayList<ColumnOrSuperColumn>();
    // Create the email column.
    ColumnOrSuperColumn c = new ColumnOrSuperColumn();
    c.setColumn(new Column("email".getBytes("utf-8"), "ronald (at) sodeso.nl".getBytes("utf-8"),
    ...
  7. Cassandra Search Indexes

    by , 02-23-2012 at 07:10 PM
    Some applications can perform full-text search by reading the whole content of each document and scanning it for the required data at acceptable speed. This scan is repeated every time a query is executed. The approach is neither feasible nor recommended when a huge amount of data has to be searched, and almost no one uses it to search documents or databases.
    With the increasing amount of data on the internet, we need to search millions or sometimes billions ...
  8. Cassandra Required Libraries

    by , 02-23-2012 at 07:08 PM
    To communicate with a Cassandra database from Java, the client needs a number of libraries on its classpath. All of these libraries are provided in the “CASSANDRA_HOME/lib” folder of the Cassandra installation directory. They include:

    • apache-cassandra.jar: It contains Cassandra's customization of the Thrift communication protocol.
    • slf4j-log4j.jar: It contains the SLF4J plug-in that adds support for Log4j.
    • slf4j-api.jar: It is a wrapper for logging frameworks.
    ...
  9. Making Cassandra Database Connection

    by , 02-23-2012 at 07:06 PM
    The first thing to do is to create a connection to the database. To do this, open a connection on port 9160, which is Cassandra's default Thrift port. Hand the connection over to the Cassandra client, which takes care of communication with the server:
    Java Code: This is the code to explain Cassandra Database Connection
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.protocol.TProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;
    
    import org.apache.cassandra.thrift.Cassandra;
    ...