Introduction to NoSQL
NoSQL refers to a family of databases built to manage huge sets of unstructured data, where the data is not stored in tabular relations as it is in relational databases. Most existing relational databases struggle with some complex modern problems, such as:
- The continuously changing nature of data: structured, semi-structured, unstructured and polymorphic data.
- Applications now serve millions of users in different geo-locations and time zones, and have to be up and running all the time with data integrity maintained.
- Applications are becoming more distributed, with many moving towards cloud computing.
What is Structured Data?
Structured data is usually stored as text files with defined column titles and data in rows. Such data can easily be visualized as charts and processed using data mining tools.
What is Unstructured Data?
Unstructured data can be anything: video files, image files, PDFs, emails and so on. What do these files have in common? Nothing. Structured information can be extracted from unstructured data, but the process is time consuming. As more and more modern data is unstructured, growing applications needed a way to store it, which paved the way for NoSQL.
NoSQL database types:
- Document databases: Each key is paired with a complex data structure called a document. Example: MongoDB.
- Graph stores: Usually used to store networked data, where records are related to one another through the connections between them.
- Key-value stores: These are the simplest NoSQL databases. Each item is stored with a key that identifies it. Some key-value databases, such as Redis, also store the type of the value alongside it.
- Wide-column stores: Used for large data sets; they store columns of data together. Examples: Cassandra (used at Facebook), HBase.
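To make the four models more concrete, here is a rough sketch of how one record might be shaped in each of them. Plain Python structures stand in for the databases; the `user:42` keys, the field names and the `followers_of` helper are all hypothetical and do not come from any real client library.

```python
# Hypothetical "user" record expressed in each of the four NoSQL data models.
# Plain Python structures stand in for the databases; no real client is used.

# Key-value store: an opaque value looked up by a single key.
key_value_store = {"user:42": "Alice"}

# Document store: the key maps to a nested, self-describing document.
document_store = {
    "user:42": {
        "name": "Alice",
        "emails": ["alice@example.com"],
        "address": {"city": "Pune", "zip": "411001"},
    }
}

# Wide-column store: each row key maps to a sparse set of columns,
# and different rows may carry different columns.
wide_column_store = {
    "user:42": {"name": "Alice", "city": "Pune"},
    "user:43": {"name": "Bob", "last_login": "2020-01-01"},
}

# Graph store: nodes plus labelled edges between them.
graph_nodes = {"user:42": {"name": "Alice"}, "user:43": {"name": "Bob"}}
graph_edges = [("user:42", "FOLLOWS", "user:43")]


def followers_of(node_id):
    """Return the ids of nodes with a FOLLOWS edge pointing at node_id."""
    return [src for src, label, dst in graph_edges
            if label == "FOLLOWS" and dst == node_id]


print(document_store["user:42"]["address"]["city"])
print(followers_of("user:43"))
```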
DynamoDB
DynamoDB is a fully managed NoSQL database service designed to deliver fast and predictable performance. Its design is based on the Dynamo model and improves on that model's features.
- Allows users to create databases capable of storing and retrieving any amount of data, and serving any amount of traffic.
- Automatically distributes data and traffic over servers to dynamically manage each customer's requests, and also maintains fast performance.
- Uses a NoSQL model, which means it uses a non-relational system.
- Can be used easily from multiple languages: Ruby, Java, Python, C#, Erlang, PHP, and Perl.
Advantages:
Limitations:
- Capacity Unit Sizes − A read capacity unit is a single strongly consistent read per second for items no larger than 4 KB. A write capacity unit is a single write per second for items no bigger than 1 KB.
- Provisioned Throughput Min/Max − All tables and global secondary indices have a minimum of one read and one write capacity unit. Maximums depend on region. In the US, 40K read and write remains the cap per table (80K per account), and other regions have a cap of 10K per table with a 20K account cap.
- Provisioned Throughput Increase and Decrease − You can increase this as often as needed, but decreases remain limited to no more than four times daily per table.
- Table Size and Quantity Per Account − Table sizes have no limits, but accounts have a 256 table limit unless you request a higher cap.
- Secondary Indexes Per Table − Five local and five global are permitted.
- Projected Secondary Index Attributes Per Table − DynamoDB allows 20 attributes.
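The capacity unit sizes listed above turn into simple arithmetic when estimating provisioned throughput. The sketch below assumes strongly consistent reads and that an item larger than one unit is rounded up to the next whole unit; the function names and the 6 KB example item are illustrative only.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read capacity unit covers up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one write capacity unit covers up to 1 KB


def read_capacity_units(item_bytes, reads_per_second):
    """Units needed for strongly consistent reads at the given rate."""
    # Assumption: every read costs at least one unit, rounded up per item.
    units_per_read = max(1, math.ceil(item_bytes / READ_UNIT_BYTES))
    return units_per_read * reads_per_second


def write_capacity_units(item_bytes, writes_per_second):
    """Units needed for writes at the given rate."""
    units_per_write = max(1, math.ceil(item_bytes / WRITE_UNIT_BYTES))
    return units_per_write * writes_per_second


# Hypothetical workload: 6 KB items, 10 reads and 10 writes per second.
item_size = 6 * 1024
print(read_capacity_units(item_size, 10))   # 2 units per read  -> 20
print(write_capacity_units(item_size, 10))  # 6 units per write -> 60
```

Under these assumptions, a 6 KB item costs two read units and six write units per operation, which is why large items are noticeably more expensive to write than to read.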
Primary Key
Primary keys uniquely identify the items in a table, and secondary indexes provide query flexibility. DynamoDB Streams record an event whenever the table data is modified.
Creating a table requires not only a name but also a primary key, which identifies the table's items. No two items share a key. DynamoDB uses two types of primary keys −
- Partition Key (simple primary key) − It consists of a single attribute. Internally, DynamoDB uses the key value as input for a hash function to determine storage.
- Composite Primary Key − It consists of two attributes.
  - The partition key: DynamoDB applies a hash function to it and stores items with the same partition key together, with their order determined by the sort key.
  - The sort key: It orders the items that share a partition key.
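The behaviour of a composite primary key can be imitated with a small in-memory toy. This is only a sketch: the real hash function and partition layout are internal to DynamoDB, so `NUM_PARTITIONS`, the MD5-based `partition_for` and the `ToyTable` class are assumptions made up for illustration.

```python
import bisect
import hashlib

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages partitions itself


def partition_for(partition_key):
    """Map a partition key to a partition number with a stable hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


class ToyTable:
    """In-memory stand-in for a table with a composite primary key."""

    def __init__(self):
        # partition number -> list of (partition_key, sort_key, item),
        # kept sorted by (partition_key, sort_key)
        self.partitions = {n: [] for n in range(NUM_PARTITIONS)}

    def put(self, partition_key, sort_key, item):
        rows = self.partitions[partition_for(partition_key)]
        keys = [(pk, sk) for pk, sk, _ in rows]
        pos = bisect.bisect_left(keys, (partition_key, sort_key))
        if pos < len(rows) and keys[pos] == (partition_key, sort_key):
            rows[pos] = (partition_key, sort_key, item)  # same key: overwrite
        else:
            rows.insert(pos, (partition_key, sort_key, item))

    def query(self, partition_key):
        """Return items sharing a partition key, ordered by sort key."""
        rows = self.partitions[partition_for(partition_key)]
        return [item for pk, _, item in rows if pk == partition_key]


table = ToyTable()
table.put("alice", "2024-03-01", {"order": "C"})
table.put("alice", "2024-01-15", {"order": "A"})
table.put("bob", "2024-02-01", {"order": "B"})
table.put("alice", "2024-01-15", {"order": "A2"})  # replaces the earlier item
print(table.query("alice"))
```

Items with the same partition key land in the same partition and come back in sort-key order, and writing the same partition key and sort key twice replaces the item, which mirrors the rule that no two items share a key.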
Secondary Indexes
These indexes allow you to query table data with an alternate key. Though DynamoDB does not force their use, they optimize querying.
DynamoDB uses two types of secondary indexes −
- Global Secondary Index − It has a partition key and an optional sort key, both of which can differ from the table's keys.
- Local Secondary Index − It has the same partition key as the table, but a different sort key.
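The difference between the two index types shows up clearly in a table definition. The sketch below builds a dictionary shaped like the keyword arguments accepted by boto3's `create_table` call, without contacting AWS. The table name `Orders`, the attribute names, the index names and the two helper functions are hypothetical.

```python
def table_definition():
    """Build keyword arguments shaped like those boto3's create_table accepts."""
    one_unit = {"ReadCapacityUnits": 1, "WriteCapacityUnits": 1}
    return {
        "TableName": "Orders",  # hypothetical table
        "AttributeDefinitions": [
            {"AttributeName": "CustomerId", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
            {"AttributeName": "Status", "AttributeType": "S"},
            {"AttributeName": "ProductId", "AttributeType": "S"},
        ],
        # Composite primary key: HASH is the partition key, RANGE the sort key.
        "KeySchema": [
            {"AttributeName": "CustomerId", "KeyType": "HASH"},
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},
        ],
        # Local index: same partition key as the table, different sort key.
        "LocalSecondaryIndexes": [{
            "IndexName": "ByStatus",
            "KeySchema": [
                {"AttributeName": "CustomerId", "KeyType": "HASH"},
                {"AttributeName": "Status", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "KEYS_ONLY"},
        }],
        # Global index: partition and sort keys may both differ from the table.
        "GlobalSecondaryIndexes": [{
            "IndexName": "ByProduct",
            "KeySchema": [
                {"AttributeName": "ProductId", "KeyType": "HASH"},
                {"AttributeName": "OrderDate", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
            "ProvisionedThroughput": one_unit,
        }],
        "ProvisionedThroughput": one_unit,
    }


def partition_key_of(key_schema):
    """Return the attribute name marked as the partition (HASH) key."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


def local_indexes_share_partition_key(definition):
    """Check that every local index reuses the table's partition key."""
    table_pk = partition_key_of(definition["KeySchema"])
    return all(partition_key_of(index["KeySchema"]) == table_pk
               for index in definition["LocalSecondaryIndexes"])


definition = table_definition()
print(local_indexes_share_partition_key(definition))
```

With AWS credentials configured, a dictionary like this could be passed as `boto3.client("dynamodb").create_table(**definition)`; that call is left out here so the example runs offline.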
Steps to Create a Table from the AWS Console