Wednesday 28 January 2015

MongoDB

Hello everyone,
This post is about the MongoDB database, its installation process, and how to use it from Python.

MongoDB is a scalable, high-performance, open source, document-oriented database. It is developed and supported by a company called 10gen (renamed MongoDB, Inc. in 2013). MongoDB is available for free under the GNU Affero General Public License (AGPL) and under a commercial license from the vendor.

MongoDB belongs to a class of databases called document-oriented databases, which in turn falls under the broader category of NoSQL databases. Overall, databases can be classified as shown in the following figure:

What is extra in NoSQL?
1. Query language
2. Fast performance
3. Horizontal scalability

What is missing in NoSQL?
1. No join support.
2. No support for complex transactions.
3. No constraint support (constraints are enforced at the application level, not at the database level).
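
Since joins and constraints move into the application, here is a minimal sketch of what that can look like in plain Python. It needs no database. The sample data, the dept_id key, the department titles, and both function names are made up for illustration.

```python
# Hypothetical sample data: two "collections" held as lists of dictionaries.
students = [
    {"_id": 1, "name": "Hasan", "dept_id": 20},
    {"_id": 2, "name": "Bill", "dept_id": 10},
]
departments = [
    {"_id": 10, "title": "Sales"},
    {"_id": 20, "title": "Research"},
]


def validate_student(doc):
    """Application-level constraint: required keys and an integer dept_id."""
    missing = {"name", "dept_id"} - set(doc)
    if missing:
        raise ValueError("missing fields: %s" % ", ".join(sorted(missing)))
    if not isinstance(doc["dept_id"], int):
        raise ValueError("dept_id must be an integer")
    return doc


def join_students_with_departments(students, departments):
    """Application-level join: look up each student's department by id."""
    by_id = {dept["_id"]: dept for dept in departments}
    return [
        # Copy the student and add the department title (None if not found).
        dict(student, department=by_id.get(student["dept_id"], {}).get("title"))
        for student in students
    ]


for student in students:
    validate_student(student)
print(join_students_with_departments(students, departments))
```

The database stores whatever it is given, so checks like validate_student have to run before every write.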

Comparison between RDBMS and MongoDB:


1. Storage format
   RDBMS: Data is stored in tables. Example:
       Fname   Lname     Dept.
       Hasan   Mir       20
       Bill    Ellison   10
   MongoDB: Data is stored in collections of documents in BSON format, which is similar to JSON in JavaScript. Example:
       {"_id" : ObjectId("2jk4fr5"), "Fname" : "Hasan", "Lname" : "Mir", "Dept" : "20" }
       {"_id" : ObjectId("2jk4fr7"), "Fname" : "Bill", "Lname" : "Ellison" }

2. Object and instance concept
   RDBMS: A table is an object and each row is an instance of that object.
   MongoDB: A collection is an object and each document stored in the collection is an instance of that object.

3. Flexibility
   RDBMS: Less flexible than NoSQL, because:
   - Each row in a table must have the same number of fields (as in the example above).
   - Storing multiple values in a single field is not easily supported.
   - An embedded data model (nested structure) is not supported.
   MongoDB: More flexible than an RDBMS, because:
   - Each document can have a different number of key-value pairs (as in the example above).
   - Multiple values can easily be stored under a single key. Example:
       {"_id" : ObjectId("2jk4fr5"), "Fname" : "Hasan", "Project" : ["p1", "p2"], "Dept" : "20" }
   - An embedded data model (a document inside a document) is supported. Example:
       {"_id" : "1234", "name" : "Hasan Mir", "Address" : [{"street" : "45", "City" : "Goa" }, {"street" : "78", "City" : "Dhule" }] }

4. Primary key
   RDBMS: Must be created explicitly by the programmer.
   MongoDB: Generated automatically if the programmer does not supply one, and always stored under the key "_id".

5. Schema
   RDBMS: Schema dependent.
   MongoDB: Schema independent.

6. Horizontal scalability
   RDBMS: Generally not supported out of the box.
   MongoDB: Supported.

7. Big data problem
   RDBMS: Not designed to handle the problems that arise from data explosion.
   MongoDB: Capable of handling them.

8. Information conversion
   RDBMS: Information is stored as relations, so data conversion is required when it is used from OOP languages.
   MongoDB: Information is stored as objects and their instances, so little or no conversion is required when it is used from OOP languages. This can make MongoDB faster to work with.
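
The flexibility and "no conversion" points can be sketched with plain Python dictionaries, which is also how PyMongo hands documents to your code. This is only an illustration and needs no database. The "_id" values are placeholder strings rather than real ObjectIds, and the two helper functions are made up for this post.

```python
import json

# Two documents from the same collection. Unlike rows in a table,
# they do not need to share the same set of keys.
employees = [
    # "Project" holds several values under one key.
    {"_id": "1", "Fname": "Hasan", "Lname": "Mir", "Dept": "20",
     "Project": ["p1", "p2"]},
    # This document simply has no "Dept" or "Project" key.
    {"_id": "2", "Fname": "Bill", "Lname": "Ellison"},
]

# A document can embed other documents (here, a list of addresses).
person = {
    "_id": "1234",
    "name": "Hasan Mir",
    "Address": [
        {"street": "45", "City": "Goa"},
        {"street": "78", "City": "Dhule"},
    ],
}


def keys_per_document(documents):
    """Return the sorted key names of each document."""
    return [sorted(doc) for doc in documents]


def cities(doc):
    """Collect the City of every embedded address in a document."""
    return [address["City"] for address in doc.get("Address", [])]


print(keys_per_document(employees))
print(cities(person))
# The nested structure maps directly onto JSON-like text.
print(json.dumps(person, indent=2))
```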


Features of MongoDB:
1. It supports ad hoc queries: searches by field, range queries, and regular expression searches.
2. Indexing is supported: any field in a document can be indexed.
3. Master-slave replication is supported. The master can perform read and write operations. A slave copies data from the master and can only be used for reads or backups.
4. MongoDB can run across multiple servers, with data duplicated to protect against hardware failure.
5. Automatic load balancing is a built-in feature of MongoDB.
6. Horizontal scalability is the most important feature of MongoDB: new machines can be added to an existing database. Because the load is spread over more machines, this also improves performance, ideally in proportion to the number of machines.
7. Capped collections are supported. A capped collection maintains insertion order and, once it reaches its specified size, behaves like a circular queue: the oldest documents are overwritten.
8. File storage is supported through a special feature called GridFS.
9. Aggregation is supported through the map-reduce feature.
10. JavaScript functions can be used in the query language, which is one of its strongest features.
11. It offers special support for locations, since it understands longitude and latitude natively.
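
Two of the features above, capped collections (7) and map-reduce (9), are easier to picture with a small plain-Python sketch. This is not how MongoDB implements them. CappedList and map_reduce are made-up names, and the sketch limits the collection by document count, whereas a real capped collection is limited by size in bytes.

```python
from collections import defaultdict, deque


class CappedList:
    """Keeps insertion order and drops the oldest item once it is full."""

    def __init__(self, max_documents):
        # A deque with maxlen discards items from the left when it overflows.
        self._items = deque(maxlen=max_documents)

    def insert(self, document):
        self._items.append(document)

    def find(self):
        return list(self._items)


def map_reduce(documents, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


# Capped behaviour: only the three newest documents survive.
log = CappedList(3)
for n in range(5):
    log.insert({"n": n})
print(log.find())

# Map-reduce: count students per grade.
students = [
    {"name": "a", "grade": "A"},
    {"name": "b", "grade": "B"},
    {"name": "c", "grade": "A"},
]
counts = map_reduce(
    students,
    lambda doc: [(doc["grade"], 1)],   # map: emit (grade, 1)
    lambda key, values: sum(values),   # reduce: add up the ones
)
print(counts)
```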

Installation Process (For Ubuntu):
1. Download the MongoDB setup from the following link:
2. To install MongoDB:
    Open the terminal (Ctrl+Alt+T) and run the following commands one by one.
    
a. Import the public key used by the package management system:
    sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10

b. Create a list file for MongoDB:
    echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list

c. Reload local package database:
    sudo apt-get update
  
d. Install the MongoDB packages:
(i) To install the latest stable version of MongoDB:
    sudo apt-get install mongodb-org

(ii) To install a specific version of MongoDB (e.g. version 2.6.1):
    sudo apt-get install mongodb-org=2.6.1 mongodb-org-server=2.6.1 mongodb-org-shell=2.6.1 mongodb-org-mongos=2.6.1 mongodb-org-tools=2.6.1

Run MongoDB:

To start MongoDB, run the following command in the terminal:

sudo service mongod start

(mongod is the primary daemon process for the MongoDB system. It handles data requests, manages data format, and performs background management operations.)

To stop MongoDB:

sudo service mongod stop
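
To check from Python whether mongod is up after starting it, a quick sketch like the one below can help. It only tests whether something accepts TCP connections on the port, assuming mongod listens on its default port 27017 on localhost. is_listening is a made-up helper name, and the code assumes Python 3.

```python
import socket


def is_listening(host="localhost", port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if the connection is refused
        # or times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("mongod reachable:", is_listening())
```

A True result only shows that the port is open. It does not prove that the process behind it is MongoDB.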

Using MongoDB from Python:

1. Install the pymongo library (for example with pip install pymongo). Then import MongoClient, which is used to connect to MongoDB:
    from pymongo import MongoClient
2. Create the connection. The format is MongoClient("<host address>", <port no.>), where the host is a string and the port is an integer:
    connection = MongoClient("localhost", 27017)
3. Select the database and the collection through the connection variable above and store the result in another variable for convenience. Here the variable db is used, students is the name of the database, and mca is the name of the collection. Note that db therefore refers to the collection, not the database.
    db = connection.students.mca
4. Create a document with the following code, where student_name and student_grade are variables holding the values to store:
    db.insert_one({'name': student_name, 'grade': student_grade})
5. Delete documents with the following code. It deletes every document whose name is equal to 'a'.
    db.delete_many({'name': 'a'})
6. Update a document with the following instruction. It updates the first document whose name is equal to 'a' by setting its grade to 'A++'.
    db.update_one({'name': 'a'}, {'$set': {'grade': 'A++'}})
7. View documents as follows:
(i) To view a single document from the collection, use the following instruction. It prints the first document in the collection. To view a specific document, pass a condition inside the parentheses, as in the delete instruction.
    print(db.find_one())
(ii) To view all documents in the collection, use the following instructions. find() returns a cursor, so loop over it to print each document.
    for document in db.find():
        print(document)
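
Putting the steps above together, the helpers below are a minimal sketch. They assume PyMongo 3.0 or later, where the collection methods are named insert_one, update_one, and delete_many, and a mongod running on localhost:27017. The function names (connect, add_student, and so on) are made up for this post and are not part of PyMongo.

```python
def connect(host="localhost", port=27017):
    """Return the students.mca collection (needs PyMongo and a running mongod)."""
    # Imported here so the helpers below can be read and tried without PyMongo.
    from pymongo import MongoClient
    return MongoClient(host, port).students.mca


def add_student(collection, name, grade):
    """Insert one document and return its generated _id."""
    return collection.insert_one({"name": name, "grade": grade}).inserted_id


def set_grade(collection, name, grade):
    """Update the first document with this name; return how many were modified."""
    result = collection.update_one({"name": name}, {"$set": {"grade": grade}})
    return result.modified_count


def remove_student(collection, name):
    """Delete every document with this name; return how many were deleted."""
    return collection.delete_many({"name": name}).deleted_count


def list_students(collection):
    """Return all documents in the collection as a list."""
    return list(collection.find())
```

With the server running, usage would look like mca = connect(), then add_student(mca, "a", "A"), set_grade(mca, "a", "A++"), print(list_students(mca)), and finally remove_student(mca, "a").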