
Introduction To NOSQL



NoSQL refers to a class of databases built to manage very large sets of unstructured data, where the data is not stored in tabular relations as it is in relational databases. Most existing relational databases struggle with some complex modern problems, such as:
  • The continuously changing nature of data: structured, semi-structured, unstructured and polymorphic.
  • Applications that serve millions of users across geo-locations and time zones and must be up and running all the time, with data integrity maintained.
  • Applications that are becoming more distributed, with many moving towards cloud computing.
NoSQL plays a vital role in enterprise applications that need to access and analyze massive data sets spread across multiple remote virtual servers in cloud infrastructure, especially when the data is not structured. NoSQL databases are therefore designed to overcome the performance, scalability, data modelling and distribution limitations seen in relational databases.


What is Structured Data?

Structured data is usually stored as text files with defined column titles and data in rows. Such data can easily be visualized as charts and processed with data mining tools.

What is Unstructured Data?

Unstructured data can be anything: a video file, an image file, a PDF, emails and so on. What do these files have in common? Nothing. Structured information can be extracted from unstructured data, but the process is time-consuming. As more and more modern data is unstructured, growing applications needed a way to store it, which paved the way for NoSQL.




NoSQL database types :


  • Document databases : Each key is paired with a complex data structure called a document. Example : MongoDB
  • Graph stores : Usually used to store networked data, where one piece of data can be related to other existing data.
  • Key-value stores : These are the simplest NoSQL databases. Each item is stored with a key that identifies it. In some key-value databases, such as Redis, the type of the value can be saved along with it.
  • Wide-column stores : Used to store large data sets, with columns of data stored together. Examples : Cassandra (used at Facebook), HBase etc.
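
To make the four types above more concrete, here is a minimal sketch that models the same small data set in each style using plain Python structures. The keys, field names and the `followers_of` helper are invented for illustration; real products such as MongoDB, Redis or Cassandra expose their own APIs and storage formats.

```python
# One small data set modeled four ways with plain Python structures.
# These illustrate the data models only; they are not database clients.

# Key-value store: an opaque value looked up by a single key.
key_value = {
    "user:1": "Alice",
    "user:2": "Bob",
}

# Document store: the key maps to a nested, self-describing document.
documents = {
    "user:1": {"name": "Alice", "emails": ["alice@example.com"], "address": {"city": "Pune"}},
    "user:2": {"name": "Bob", "emails": []},  # documents need not share a schema
}

# Wide-column store: each row key maps to groups of columns, and rows
# may carry different columns.
wide_column = {
    "user:1": {"profile": {"name": "Alice", "city": "Pune"}, "activity": {"last_login": "2020-01-01"}},
    "user:2": {"profile": {"name": "Bob"}},
}

# Graph store: nodes plus labeled edges that relate them.
nodes = {"user:1": {"name": "Alice"}, "user:2": {"name": "Bob"}}
edges = [("user:1", "FOLLOWS", "user:2")]


def followers_of(node_id):
    """Return the ids of nodes that have a FOLLOWS edge pointing at node_id."""
    return [src for src, label, dst in edges if label == "FOLLOWS" and dst == node_id]


print(key_value["user:1"])
print(documents["user:1"]["address"]["city"])
print(wide_column["user:2"]["profile"])
print(followers_of("user:2"))
```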







DynamoDB


DynamoDB is a fully managed NoSQL database service designed to deliver fast and predictable performance. Its design is based on the Dynamo model and improves on it.

  • Allows users to create databases capable of storing and retrieving any amount of data, and serving any amount of traffic.
  • Automatically distributes data and traffic over servers to dynamically manage each customer's requests, and also maintains fast performance. 
  • Uses a NoSQL model, which means it uses a non-relational system. 
  • Can be used from multiple languages: Ruby, Java, Python, C#, Erlang, PHP, and Perl.

Advantages:


Limitations :

  • Capacity Unit Sizes − A read capacity unit is a single strongly consistent read per second for items no larger than 4 KB. A write capacity unit is a single write per second for items no larger than 1 KB.
  • Provisioned Throughput Min/Max − All tables and global secondary indexes have a minimum of one read and one write capacity unit. Maximums depend on region. In the US, the cap is 40K read and write units per table (80K per account); other regions have a cap of 10K per table and 20K per account.
  • Provisioned Throughput Increase and Decrease − You can increase this as often as needed, but decreases remain limited to no more than four times daily per table.
  • Table Size and Quantity Per Account − Table sizes have no limits, but accounts have a 256 table limit unless you request a higher cap.
  • Secondary Indexes Per Table − Five local and five global are permitted.
  • Projected Secondary Index Attributes Per Table − DynamoDB allows 20 attributes.
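
The capacity unit sizes above translate into simple arithmetic. The sketch below estimates the units a workload would need; the function names are made up for this example, and the halving for eventually consistent reads is an assumption added here, not something stated in the list above.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read capacity unit covers an item up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one write capacity unit covers an item up to 1 KB


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate the read capacity units for a steady read workload."""
    units_per_read = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # Assumption: an eventually consistent read costs half as much.
        total = math.ceil(total / 2)
    return total


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate the write capacity units for a steady write workload."""
    units_per_write = max(1, math.ceil(item_size_bytes / WRITE_UNIT_BYTES))
    return units_per_write * writes_per_second


# A 6 KB item read and written 10 times per second.
print(read_capacity_units(6 * 1024, 10))                             # 20
print(read_capacity_units(6 * 1024, 10, strongly_consistent=False))  # 10
print(write_capacity_units(6 * 1024, 10))                            # 60
```

Item sizes are rounded up to the next unit boundary, so a 6 KB item costs two read units per strongly consistent read and six write units per write.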

Primary Key

Primary keys uniquely identify the items in a table, and secondary indexes provide query flexibility. DynamoDB Streams record the events that modify table data.
Creating a table requires not only a name but also a primary key, which identifies the table's items. No two items share a key. DynamoDB uses two types of primary keys −
  • Partition Key (simple primary key) − A single attribute. Internally, DynamoDB uses the key value as input to a hash function to determine where the item is stored.
  • Composite Primary Key − It consists of two attributes.
    • The partition key : DynamoDB applies a hash function to it and stores items with the same partition key together, ordered by the sort key.
    • The sort key : Orders the items that share a partition key.
     Items can share a partition key, but items with the same partition key must have different sort keys.
Primary key attributes allow only scalar (single) values of type string, number, or binary. Non-key attributes do not have these constraints.
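
The following toy model sketches how a composite primary key could behave: the partition key is hashed to choose a storage bucket, and items that share a partition key come back ordered by sort key. Everything here is an assumption made for illustration: `ToyTable`, `partition_for`, the fixed `NUM_PARTITIONS`, the use of MD5 and the attribute names `AlbumId` and `Track`. DynamoDB's real hash function and partition management are internal to the service.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages its partitions itself


def partition_for(partition_key):
    """Hash the partition key value to pick a storage bucket (toy stand-in)."""
    digest = hashlib.md5(str(partition_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class ToyTable:
    """In-memory table with a composite (partition key, sort key) primary key."""

    def __init__(self, partition_key_name, sort_key_name):
        self.pk = partition_key_name
        self.sk = sort_key_name
        self.partitions = defaultdict(dict)  # bucket -> {(pk, sk): item}

    def put_item(self, item):
        # The full primary key is unique: the same (pk, sk) pair overwrites.
        key = (item[self.pk], item[self.sk])
        self.partitions[partition_for(item[self.pk])][key] = item

    def query(self, pk_value):
        # All items with this partition key live in one bucket, sorted by sort key.
        bucket = self.partitions.get(partition_for(pk_value), {})
        matches = [item for (pk, _), item in bucket.items() if pk == pk_value]
        return sorted(matches, key=lambda item: item[self.sk])


table = ToyTable("AlbumId", "Track")
table.put_item({"AlbumId": 2, "Track": 3, "Title": "Third"})
table.put_item({"AlbumId": 2, "Track": 1, "Title": "First draft"})
table.put_item({"AlbumId": 2, "Track": 1, "Title": "First"})  # same key, replaces
table.put_item({"AlbumId": 7, "Track": 1, "Title": "Other album"})

for item in table.query(2):
    print(item["Track"], item["Title"])
```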

Secondary Indexes

These indexes let you query table data with an alternate key. DynamoDB does not require them, but they make querying more flexible and efficient.
DynamoDB uses two types of secondary indexes −
  • Global Secondary Index − Has a partition key and a sort key, both of which can differ from the table's keys.
  • Local Secondary Index − Has the same partition key as the table, but a different sort key.
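
As a rough sketch of the difference between the two index types, the example below rebuilds a small in-memory table under alternate keys. The `orders` data, the attribute names and the `build_index` helper are hypothetical; DynamoDB maintains real indexes itself, so this only shows which keys each index type regroups by.

```python
from collections import defaultdict

# Hypothetical table: partition key "CustomerId", sort key "OrderId".
orders = [
    {"CustomerId": "c1", "OrderId": 1, "Status": "SHIPPED", "OrderDate": "2020-03-01"},
    {"CustomerId": "c1", "OrderId": 2, "Status": "PENDING", "OrderDate": "2020-01-15"},
    {"CustomerId": "c2", "OrderId": 1, "Status": "SHIPPED", "OrderDate": "2020-02-10"},
]


def build_index(items, partition_attr, sort_attr):
    """Group items by partition_attr and order each group by sort_attr."""
    index = defaultdict(list)
    for item in items:
        # Items lacking an index key attribute are simply left out.
        if partition_attr in item and sort_attr in item:
            index[item[partition_attr]].append(item)
    for group in index.values():
        group.sort(key=lambda item: item[sort_attr])
    return dict(index)


# Local secondary index: same partition key as the table, different sort key.
lsi_by_date = build_index(orders, "CustomerId", "OrderDate")

# Global secondary index: partition and sort keys both differ from the table's.
gsi_by_status = build_index(orders, "Status", "OrderDate")

print([o["OrderId"] for o in lsi_by_date["c1"]])
print([o["CustomerId"] for o in gsi_by_status["SHIPPED"]])
```
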

Relational vs. nonrelational databases

  • Traditional SQL : scales up, with a primary and a secondary DB.
  • NoSQL : scales out across many DBs.

SQL (Relational) vs. NoSQL (Non-relational)

[Figure: example Product data (ID, Type) containing a book (Odyssey, Homer) and an album (6 Partitas) with tracks ...]

  • Writes : replicated continuously to 3 AZs and persisted to disk (custom SSD).
  • Reads : strongly or eventually consistent.
  • No latency ...
  • Fully managed service = automated operations, compared with a DB hosted on-premises or a DB hosted on Amazon EC2.
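
The slide notes above mention writes replicated to 3 AZs and reads that are either strongly or eventually consistent. The toy model below sketches what that difference can look like. It is a deliberately simplified assumption: the `ToyReplicatedStore` class, its leader-and-followers layout and the explicit `replicate` step are invented here and do not describe DynamoDB's actual replication protocol.

```python
class ToyReplicatedStore:
    """Three in-memory replicas standing in for three Availability Zones."""

    def __init__(self, replicas=3):
        self.replicas = [dict() for _ in range(replicas)]
        self.pending = []  # writes not yet copied to the other replicas

    def write(self, key, value):
        # Replica 0 applies the write first; the others catch up later.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def replicate(self):
        """Copy all pending writes to the remaining replicas."""
        for key, value in self.pending:
            for follower in self.replicas[1:]:
                follower[key] = value
        self.pending.clear()

    def read(self, key, consistent=False, replica=1):
        # A strongly consistent read always sees the latest write;
        # an eventually consistent read may hit a replica that lags.
        source = self.replicas[0] if consistent else self.replicas[replica]
        return source.get(key)


store = ToyReplicatedStore()
store.write("Product_Id:1", {"Type": "Book"})

print(store.read("Product_Id:1", consistent=True))   # latest value
print(store.read("Product_Id:1", consistent=False))  # may still be None
store.replicate()
print(store.read("Product_Id:1", consistent=False))  # now caught up
```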

Steps to Create a Table from the AWS Console


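1. Create a table named Products with Product_Id as its primary key.

2. DynamoDB table structure
  • A table contains items, and each item contains attributes.
  • Each item is identified by a partition key and, optionally, a sort key.
  • Partition key : mandatory; key-value access pattern; determines data ...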
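
The same table can also be described in code. The sketch below builds the request parameters that the boto3 `create_table` call accepts for a Products table keyed on Product_Id. The numeric attribute type (`"N"`) and the capacity values are assumptions made for this example, and the helper name `products_table_definition` is hypothetical. The actual call is left in a comment because it needs AWS credentials.

```python
def products_table_definition(read_units=1, write_units=1):
    """Build create_table parameters for a Products table (a sketch)."""
    return {
        "TableName": "Products",
        # Only key attributes are declared up front; the type "N" is assumed.
        "AttributeDefinitions": [
            {"AttributeName": "Product_Id", "AttributeType": "N"},
        ],
        # HASH marks the partition key; a sort key would use KeyType "RANGE".
        "KeySchema": [
            {"AttributeName": "Product_Id", "KeyType": "HASH"},
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


params = products_table_definition()
print(params)

# With boto3 installed and AWS credentials configured, the request could
# be sent like this:
#   import boto3
#   boto3.client("dynamodb").create_table(**params)
```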

Global secondary index (GSI)

[Figure: a table with key A1 and its GSIs. One index uses A5 (partition) and A4 (sort) and carries A1 (table key) and A3 (projected, INCLUDE A3); another uses A4 (partition) and A5 (sort) ...]

Local secondary index (LSI)
  • Alternate sort key attribute
  • Index is local to a partition key

[Figure: an LSI with A1 (partition), A3 (sort), A2 (ta... ]

Integration capabilities

DynamoDB Triggers
  • Implemented as AWS Lambda functions
  • Your code scales automatically
  • Java, ...
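
A DynamoDB trigger implemented as a Lambda function receives batches of stream records. The handler below is a minimal sketch that counts the record types in a batch. The `sample_event` is hand-written to resemble a stream event for a hypothetical Products table, and counting events is only an example of what a trigger might do.

```python
def lambda_handler(event, context):
    """Count the INSERT, MODIFY and REMOVE records in one stream batch."""
    summary = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        event_name = record.get("eventName")
        if event_name in summary:
            summary[event_name] += 1
        # Keys arrive in DynamoDB's typed format, e.g. {"Product_Id": {"N": "1"}}.
        keys = record.get("dynamodb", {}).get("Keys", {})
        print(event_name, keys)
    return summary


# Hand-written sample resembling a stream event (illustrative only).
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"Product_Id": {"N": "1"}},
                "NewImage": {"Product_Id": {"N": "1"}, "Type": {"S": "Book"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"Product_Id": {"N": "1"}}},
        },
    ]
}

print(lambda_handler(sample_event, None))
```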



Integration capabilities
  • Amazon Elasticsearch Service integration
  • Full-text queries
    • Add search to mobile apps
    • Moni...


Resources
Amazon DynamoDB: https://aws.amazon.com/dynamodb/
NoSQL on AWS: https://aws.amazon.com/nosql/document/
