-> Sharding is the process of storing data records across multiple machines and it is MongoDB’s approach to meeting the demands of data growth.
-> As the size of the data increases, a single machine may not be sufficient to store data nor provide an acceptable read and write throughput. Sharding solves the problem with horizontal scaling. With sharding you add more machines to support data growth and demands of read and write operations.
-> In replication all writes goes to master node.
-> Single replica set has limitation of 12 nodes.
-> Memory can’t be large enough when active dataset is big.
-> Vertical scaling is too expensive.
-> Shards are used to store data. They provide high availability and data consistency. In production environment each shard is a separate replica set.
-> Config servers store the cluster’s metadata. This data contains a mapping of the cluster’s data set to the shards. The query router uses this metadata to target operations to specific shards.
-> Query routers are basically mongo instances, interface with client applications and direct operations to the appropriate shard. The query router processes and targets the operations to shards and then returns results to the clients. A sharded cluster can contain more than one query router to divide the client request load.