Cassandra tutorial ⋆ Networkking4u

Introduction

Cassandra was developed at Facebook to power inbox search, released as open source in 2008, and accepted into the Apache Incubator in 2009.

Cassandra is a highly scalable distributed database from Apache, designed to manage very large amounts of structured data across many servers on a network. It provides high availability with no single point of failure.

The tutorial starts with a basic introduction to Cassandra, followed by its architecture, installation, and important classes and interfaces. It then covers how to create, alter, update, and delete keyspaces, tables, and indexes using cqlsh as well as the Java API. Dedicated chapters explain the data types and collections available in CQL and how to use user-defined data types.

Audience
This tutorial is aimed at software professionals who want to learn Cassandra and apply it in practice.

Prerequisites
This is an elementary tutorial; a basic knowledge of Java programming is enough to follow the concepts explained here. Some prior exposure to database concepts and to any Linux flavor will also help.

Example
Facebook originally built Cassandra so that inbox search could be served from many servers located in multiple data centers.

Features of Cassandra

Cassandra's features were developed with the following goals in mind.

1. Scalability
Cassandra is a highly scalable system: more hardware can be added as an organization's requirements grow.

2. No single point of failure
Cassandra has no single point of failure, so it stays available to business applications even when individual nodes go down.

3. Performance
Total throughput can be increased by adding nodes to the cluster, which helps keep response times quick as load grows.

4. Data distribution
Cassandra distributes data flexibly across the cluster and can replicate it across multiple data centers as required.

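5. Transaction support
Cassandra does not offer full ACID (Atomicity, Consistency, Isolation, Durability) transactions in the relational sense. Writes are atomic and isolated at the level of a single partition and durable through the commit log, consistency is tunable, and lightweight transactions provide compare-and-set operations.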

6. Faster write operations
Cassandra provides fast write operations and can store large amounts of data while keeping reads efficient.
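
The fast write path can be pictured roughly as follows. This is a minimal Python sketch, not Cassandra's actual code: the class name `TinyWriteStore`, the `flush_threshold` parameter, and the in-memory lists standing in for the on-disk commit log and SSTables are all assumptions made for illustration. It shows the general idea that a write is an append plus an in-memory update, with sorted immutable tables written out later.

```python
class TinyWriteStore:
    """Toy model of a log-structured write path (hypothetical, for illustration)."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # stands in for the append-only commit log on disk
        self.memtable = {}     # in-memory table holding recent writes
        self.sstables = []     # flushed, immutable, sorted tables
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        # 1. Append to the commit log first so the write could survive a crash.
        self.commit_log.append((key, value))
        # 2. Update the memtable; no read-before-write is needed.
        self.memtable[key] = value
        # 3. Once the memtable is large enough, flush it as a sorted table.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log = []   # flushed data no longer needs the log

    def read(self, key):
        # Newest data wins: check the memtable, then tables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for stored_key, stored_value in table:
                if stored_key == key:
                    return stored_value
        return None


store = TinyWriteStore(flush_threshold=2)
store.write("a", 1)
store.write("b", 2)   # second distinct key triggers a flush
store.write("a", 3)   # newer value for "a" lives in the memtable
print(store.read("a"), store.read("b"), len(store.sstables))  # 3 2 1
```

Reads in this sketch may have to look in several places, which is why real systems add indexes and compaction on top of this basic shape.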

NoSQL Cassandra Database

NoSQL databases are also called “Not Only SQL” or “non-relational” databases. They store and retrieve data using models other than the tabular relations used in relational databases.

NoSQL databases include MongoDB, HBase, and Cassandra.
NoSQL databases have the following properties.

Design Simplicity
Horizontal Scaling
High Availability

The data structures used in Cassandra are more specialized than those used in relational databases, which can make some operations faster than their relational equivalents.

NoSQL databases are increasingly used in Big Data and real-time web applications. The name “Not Only SQL” reflects that they may support SQL-like query languages.

Apache Cassandra Features

Cassandra provides the following features.

Massively Scalable Architecture: Cassandra has a masterless design in which all nodes are peers, which provides operational simplicity and easy scale-out.

Masterless Architecture: Data can be written to and read from any node.

Linear Scale Performance: Throughput increases roughly in proportion to the number of nodes added.

No Single Point of Failure: Cassandra replicates data across different nodes, so no single node is a point of failure.
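
To make the replication idea concrete, here is a minimal Python sketch of placing each key on several nodes of a hash ring. It rests on stated assumptions: the `Ring` class and `token` function are hypothetical names, MD5 is used only as a convenient hash, each node owns a single token, and replicas are simply the next nodes around the ring. Cassandra's real partitioners and replication strategies are more elaborate.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a ring position (MD5 chosen purely for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring: one token per node, replicas on the following nodes."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = min(replication_factor, len(nodes))
        self.entries = sorted((token(node), node) for node in nodes)
        self.tokens = [node_token for node_token, _ in self.entries]

    def replicas(self, partition_key):
        # First replica: the first node whose token is >= the key's token,
        # wrapping around to the start of the ring when necessary.
        start = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.entries)
        return [
            self.entries[(start + offset) % len(self.entries)][1]
            for offset in range(self.replication_factor)
        ]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")
print(owners)

# If the first replica fails, the remaining replicas still hold the data.
survivors = [node for node in owners if node != owners[0]]
print(survivors)
```

Because every key lives on several distinct nodes, losing any one of them leaves copies available elsewhere.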

Fault Detection and Recovery: Failed nodes can easily be restored and recovered.

Flexible and Dynamic Data Model: Supports a variety of data types with fast writes and reads.

Data Protection: Data is protected by the commit log design and by built-in backup and restore mechanisms.

Tunable Data Consistency: The consistency level can be tuned per operation, from eventual consistency up to strong consistency across the distributed architecture.
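
The arithmetic behind tunable consistency can be sketched in a few lines of Python. The function names `quorum` and `is_strongly_consistent` are hypothetical; the rule they encode is the usual one that a read is guaranteed to overlap the latest write when the number of replicas written plus the number of replicas read exceeds the replication factor.

```python
def quorum(replication_factor):
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def is_strongly_consistent(write_replicas, read_replicas, replication_factor):
    """True when every read set must overlap every write set."""
    return write_replicas + read_replicas > replication_factor


rf = 3
q = quorum(rf)
print(q)                                   # 2
print(is_strongly_consistent(q, q, rf))    # True: quorum writes + quorum reads
print(is_strongly_consistent(1, 1, rf))    # False: one replica each way
print(is_strongly_consistent(rf, 1, rf))   # True: write to all, read from one
```

Lower levels trade this guarantee for lower latency and higher availability, which is the "tunable" part.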

Multi Data Center Replication: Cassandra can replicate data across multiple data centers.

Data Compression: Cassandra can compress stored data, reportedly by up to 80% in favorable cases, at the cost of some CPU time.

Cassandra Query Language: Cassandra provides a query language (CQL) that is similar to SQL, which makes the move from a relational database to Cassandra easier for developers.
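
To give a feel for how SQL-like CQL is, the Python sketch below sends a few CQL statements through a session object. The keyspace `demo`, the table `users`, and the helper `setup_and_insert` are hypothetical names chosen for illustration, and `RecordingSession` is a stand-in so the example runs without a cluster. With the DataStax Python driver installed and a node running, a real session from `Cluster(["127.0.0.1"]).connect()` could be passed instead; that setup is an assumption, not something this tutorial has covered.

```python
SCHEMA_STATEMENTS = [
    """
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
    """,
    """
    CREATE TABLE IF NOT EXISTS demo.users (
        user_id int PRIMARY KEY,
        name text
    )
    """,
]
INSERT_USER = "INSERT INTO demo.users (user_id, name) VALUES (%s, %s)"
SELECT_USER = "SELECT user_id, name FROM demo.users WHERE user_id = %s"


def setup_and_insert(session, user_id, name):
    """Create the schema, insert one row, and query it back."""
    for statement in SCHEMA_STATEMENTS:
        session.execute(statement)
    session.execute(INSERT_USER, (user_id, name))
    return session.execute(SELECT_USER, (user_id,))


class RecordingSession:
    """Stand-in for a driver session; it only records what would be sent."""

    def __init__(self):
        self.sent = []

    def execute(self, statement, parameters=None):
        # Collapse whitespace so the recorded CQL prints on one line.
        self.sent.append((" ".join(statement.split()), parameters))
        return []


session = RecordingSession()
setup_and_insert(session, 1, "alice")
for cql, params in session.sent:
    print(cql, params)
```

Apart from the keyspace and its replication settings, the statements read much like their SQL counterparts.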

Cassandra Use Cases/Application

Cassandra is a non-relational database that can be used for many types of applications. Here are some use cases where it is a good fit.

Messaging

Cassandra is a good fit for companies that provide mobile phone and messaging services. These companies handle huge amounts of data, which Cassandra is designed to store.

Internet of Things Applications

Cassandra is a good fit for applications where data arrives at very high speed from many devices or sensors.
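
One common way to keep high-rate sensor writes spread across a cluster is to bucket them by device and time window, so that no single partition grows without bound. The Python sketch below illustrates that grouping only; the function names, the choice of one bucket per UTC day, and the sample readings are all assumptions made for illustration, not a prescribed Cassandra schema.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(sensor_id, timestamp):
    """Bucket readings by sensor and UTC day (a hypothetical modelling choice)."""
    return (sensor_id, timestamp.astimezone(timezone.utc).strftime("%Y-%m-%d"))


def group_readings(readings):
    """Group (sensor_id, timestamp, value) tuples into time-ordered partitions."""
    partitions = defaultdict(list)
    for sensor_id, timestamp, value in readings:
        partitions[partition_key(sensor_id, timestamp)].append((timestamp, value))
    for rows in partitions.values():
        rows.sort()   # rows within a partition are kept in time order
    return dict(partitions)


readings = [
    ("sensor-1", datetime(2024, 1, 1, 23, 59, tzinfo=timezone.utc), 20.5),
    ("sensor-1", datetime(2024, 1, 2, 0, 1, tzinfo=timezone.utc), 20.7),
    ("sensor-2", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc), 18.2),
    ("sensor-1", datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc), 19.9),
]
partitions = group_readings(readings)
for key in sorted(partitions):
    print(key, [value for _, value in partitions[key]])
```

In a table, the pair returned by `partition_key` would play the role of the partition key and the timestamp that of a clustering column.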

Product Catalogs and retail apps

Many retailers use Cassandra for durable shopping cart protection and fast product catalog input and output.

Social Media Analytics and recommendation engine

Cassandra is a good fit for online companies and social media providers that analyze data and generate recommendations for their customers.
