MongoDB Structure - DB, Collections, and The Document Model

Overview

MongoDB uses a flexible document model to store data: a database contains collections, and each collection contains documents. Each document is a set of key-value pairs and can have a variable structure, including nested fields and arrays. The same document model is used for operations such as querying, updating, and indexing data.

MongoDB Databases

MongoDB organizes data into databases, each of which can contain one or more collections of documents. To choose a database to work with in the mongosh shell, use the use <db> command.

For example, running use mydb selects a database named "mydb".
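A scripted equivalent of use is db.getSiblingDB(). The sketch below assumes db is the database handle that mongosh provides; it is passed in as a parameter so the function does not depend on shell globals, and the function name selectDatabase is hypothetical.

```javascript
// Scripted equivalent of typing `use mydb` at the mongosh prompt.
// `db` is the current database handle (a global inside mongosh).
function selectDatabase(db, name) {
  // getSiblingDB returns a handle to another database on the same connection.
  return db.getSiblingDB(name);
}

// Inside mongosh:  db = selectDatabase(db, "mydb");
```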

How to Create a Database

To create a database, follow these steps:

  • Connect to your MongoDB server using the MongoDB shell or a MongoDB driver.
  • Use the use command to specify the name of the database you want to create.
  • If the database does not exist, MongoDB creates it the first time you store data in it, for example by inserting a document or creating a collection.

For example, to work with a database named my_database, run use my_database. The database does not appear in the list of databases until it holds data.

After switching to the database, you can start inserting documents into it.
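The steps above can be sketched as a small function. This is a minimal sketch, assuming db is the handle provided by mongosh; the collection name first_collection, the document contents, and the function name are hypothetical.

```javascript
// Sketch: a database comes into existence once it holds data.
// `db` is the mongosh database handle, passed in as a parameter.
function createDatabaseWithFirstDocument(db) {
  // Same effect as `use my_database` at the prompt.
  const myDatabase = db.getSiblingDB("my_database");
  // Inserting the first document creates both the collection and the database.
  return myDatabase
    .getCollection("first_collection")
    .insertOne({ createdBy: "setup script" });
}
```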

MongoDB Collections

A collection is a group of related documents stored in the same database. It can be thought of as a table in a relational database but with a more flexible schema design.

Collections are created automatically when you insert the first document into them, or you can create them explicitly using the db.createCollection() method.

How to Create a Collection in MongoDB

To create a collection explicitly, follow these steps:

  • Connect to your MongoDB server using the MongoDB shell or a MongoDB driver.
  • Use the db.createCollection() method to specify the name of the collection you want to create.
  • If the collection does not exist, it will be created.

For example, to create a collection named my_collection in the my_database database, switch to my_database and call db.createCollection("my_collection").

Once the collection has been created, you can start inserting documents into it.
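A minimal sketch of these steps follows, assuming db is the mongosh handle for my_database. The helper name ensureCollection is hypothetical; db.getCollectionNames(), db.createCollection(), and db.getCollection() are the shell methods it relies on.

```javascript
// Create the collection only when it is not there yet, then return a handle.
// `db` is the mongosh database handle, passed in as a parameter.
function ensureCollection(db, name) {
  // getCollectionNames lists the collections in the current database.
  if (!db.getCollectionNames().includes(name)) {
    db.createCollection(name);
  }
  return db.getCollection(name);
}

// Inside mongosh:  const coll = ensureCollection(db, "my_collection");
```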

What is an Explicit Collection?

An explicit collection is a collection that is created explicitly using the db.createCollection() method rather than implicitly by the first insert.

Use db.createCollection() when the collection needs specific options, such as a capped size, a maximum number of documents, or custom validation rules.

For example, a collection named "users" with a capped size of 10000 bytes and a maximum of 100 documents can be created by passing the capped, size, and max options to db.createCollection().

The capped option is set to true in this case, indicating that the collection is capped and has a fixed size. The size parameter defines the collection's maximum size in bytes, whereas the max option specifies the collection's maximum number of documents.
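The options described above can be written out as follows. This is a sketch, assuming db is the mongosh database handle; the names cappedOptions and createCappedUsers are hypothetical.

```javascript
// A capped collection limited to 10000 bytes and at most 100 documents.
const cappedOptions = { capped: true, size: 10000, max: 100 };

// `db` is the mongosh database handle, passed in as a parameter.
function createCappedUsers(db) {
  return db.createCollection("users", cappedOptions);
}

// Inside mongosh:  createCappedUsers(db);
```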

Document Validation in MongoDB

Document validation is a feature that lets you define rules for the format and content of documents in a collection. When document validation is enabled, MongoDB checks inserted and updated documents against these rules and rejects those that do not comply.

Example

For example, a validation schema for a collection named "users" can require that every document has a name field that is a non-empty string and an age field that is a positive integer.

You can then enable document validation on the "users" collection, for example by running the collMod command with the schema as its validator, or by passing a validator option to db.createCollection() when the collection is created.

After enabling validation, any attempt to insert or update a document that does not match the validation schema will result in an error.
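One way to express such rules is a $jsonSchema validator applied with collMod. The sketch below assumes db is the mongosh database handle; the description strings, the choice to accept both int and long for age, and the function name are assumptions made for illustration.

```javascript
// Validator: name must be a non-empty string, age must be a positive integer.
const userValidator = {
  $jsonSchema: {
    bsonType: "object",
    required: ["name", "age"],
    properties: {
      name: { bsonType: "string", minLength: 1, description: "must be a non-empty string" },
      age: { bsonType: ["int", "long"], minimum: 1, description: "must be a positive integer" }
    }
  }
};

// `db` is the mongosh database handle, passed in as a parameter.
function enableUserValidation(db) {
  return db.runCommand({
    collMod: "users",           // existing collection to modify
    validator: userValidator,
    validationLevel: "strict",  // check every insert and update
    validationAction: "error"   // reject documents that fail validation
  });
}
```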

How to Modify Document Structure in MongoDB?

You can modify the structure of documents using the updateOne() method or the updateMany() method.

For example, to add a new field to all documents in a collection, call updateMany() with an empty filter and the $set operator.

Calling it on the myCollection collection with $set: { newField: "default value" } adds a new field called newField to all documents and sets its value to "default value". The {} argument in the first parameter of the updateMany() method matches all documents in the collection.
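Written out, the call looks roughly like this. It is a sketch that assumes db is the mongosh database handle; the function name addDefaultField is hypothetical.

```javascript
// `db` is the mongosh database handle, passed in as a parameter.
function addDefaultField(db) {
  return db.getCollection("myCollection").updateMany(
    {},                                      // empty filter: match every document
    { $set: { newField: "default value" } }  // add (or overwrite) newField
  );
}

// Inside mongosh:  addDefaultField(db);
```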

What are Unique Identifiers?

Each MongoDB collection is assigned an immutable UUID, a unique identifier for that collection.

The UUID identifies the collection during replication and sharding operations, and it stays the same across the members of a replica set and the shards of a sharded cluster. You can obtain the UUID of a collection using the listCollections command or the db.getCollectionInfos() method.
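A lookup with db.getCollectionInfos() might be sketched as follows. It assumes db is the mongosh database handle and that the UUID is reported under info.uuid in each returned entry; the function name is hypothetical.

```javascript
// Return the UUID of a collection, or null when no such collection exists.
// `db` is the mongosh database handle, passed in as a parameter.
function getCollectionUuid(db, collectionName) {
  // The filter restricts the result to the collection with this name.
  const infos = db.getCollectionInfos({ name: collectionName });
  return infos.length > 0 ? infos[0].info.uuid : null;
}

// Inside mongosh:  getCollectionUuid(db, "users");
```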

Documents in MongoDB

MongoDB stores data as BSON (Binary JSON) documents. BSON is a binary-encoded format for representing JSON-like documents; it is designed to be more efficient than JSON and offers additional data types.

BSON includes all the data types available in JSON, such as strings, numbers, arrays, and objects. Additionally, BSON introduces new data types like Date, Binary Data, and ObjectId.

Document Structure

The structure of the document is similar to that of a JSON object, where each field is represented by a name and a value.

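A document has the following general form: { field1: value1, field2: value2, ..., fieldN: valueN }.

A field in MongoDB can have any data type that exists in BSON, including arrays, documents, and arrays of documents.

For example, consider a document that describes a person, with the fields _id, name, birth, death, contribs, and views: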
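A document of this kind might be written as sketched below. The field values (the ObjectId string, the names, dates, contribution titles, and view count) are illustrative assumptions, as are the helper names toObjectId and toLong. ObjectId and NumberLong are globals in mongosh; the fallbacks are stand-ins so that the snippet also runs in plain Node.js.

```javascript
// Use the mongosh constructors when available; otherwise fall back to
// Extended JSON style placeholders (an assumption for running outside the shell).
const toObjectId = typeof ObjectId === "function" ? ObjectId : (hex) => ({ $oid: hex });
const toLong = typeof NumberLong === "function" ? NumberLong : (text) => ({ $numberLong: text });

const exampleDocument = {
  _id: toObjectId("5099803df3f4948bd2f98391"),  // unique identifier
  name: { first: "Alan", last: "Turing" },      // embedded document
  birth: new Date("1912-06-23"),                // Date values
  death: new Date("1954-06-07"),
  contribs: ["Turing machine", "Turing test", "Turingery"], // array of strings
  views: toLong("1250000")                      // 64-bit integer
};
```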

The document includes an ObjectId in the _id field, which is a unique identifier created by MongoDB for each document.

The name field contains a nested document with two fields, first and last, which groups related fields together.

The birth and death fields hold values of the Date type, which stores a date and time.

The contribs field is an array of strings, allowing for storage of multiple values in a single field.

The views field uses the NumberLong data type to store large integer values exceeding the range of a regular int data type.

By allowing such a wide range of data types and structures, MongoDB offers a flexible schema design that can accommodate a variety of data models and use cases.

Field Names

Field names are represented as strings.

When creating documents in MongoDB, the following restrictions on field names apply:

  • The field name _id is reserved for use as the primary key. Its value must be unique within the collection and cannot be an array; if _id contains subfields, their names cannot start with $.
  • Field names cannot contain the null character. The server permits field names that contain dots (.) and dollar signs ($).
  • MongoDB 5.0 adds improved support for ($) and (.) in field names, but restrictions still apply.
  • While BSON itself allows multiple fields with the same name, most MongoDB interfaces represent documents with structures that do not support duplicate field names.
  • Documents created by internal MongoDB processes may contain duplicate fields, but no MongoDB process adds duplicate fields to an existing user document.

Field Value Limits

In older MongoDB versions (from 2.6 through versions running with featureCompatibilityVersion 4.0 or earlier), values stored in indexed fields are subject to the Maximum Index Key Length. Large values in indexed fields hurt search and sort performance, and in those versions an index cannot be created if an existing document holds a field value whose index entry exceeds the limit. Starting in MongoDB 4.2, the Maximum Index Key Length limit is removed, so indexes can be created on fields with larger values.

Dot Notation in MongoDB

Dot notation is a syntax used to access elements of an array or fields of an embedded document.

Arrays

Arrays are used to store multiple values in a single field. To access or specify an element of an array by its zero-based index position, concatenate the array name, a dot (.), and the index position, and enclose the result in quotes.

For example, in a document whose contribs field is an array of strings:

To access the second element in the "contribs" array, you can use the dot notation "contribs.1".
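The way a dotted path names a position in a document can be illustrated with a small helper. This is an illustration only: resolvePath is a hypothetical function operating on a plain JavaScript object, not the mechanism the MongoDB server uses, and the sample values are assumptions.

```javascript
// Walk a dotted path such as "contribs.1" through a plain object.
function resolvePath(doc, path) {
  return path.split(".").reduce(
    // Stop descending once an intermediate value is missing.
    (value, key) => (value === null || value === undefined ? undefined : value[key]),
    doc
  );
}

const person = { contribs: ["Turing machine", "Turing test", "Turingery"] };
console.log(resolvePath(person, "contribs.1")); // prints: Turing test
```

Index positions start at 0, so "contribs.1" refers to the second element.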

Embedded Documents

To access a field within an embedded document using dot notation, concatenate the name of the embedded document, a dot (.), and the name of the field you want to access, and enclose the result in quotes. This lets you navigate through the nested structure of the document and retrieve or modify the desired field.

To specify the last field in the name document, you would use the dot notation name.last.

To specify the number field in the phone document in the contact field, you would use the dot notation contact.phone.number.
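In a query, these dotted names appear as quoted keys of the filter document. The sketch below assumes db is the mongosh database handle; the collection name people, the sample values, and the function name are hypothetical.

```javascript
// Dotted field names must be quoted because they contain a dot.
const byLastName = { "name.last": "Turing" };
const byPhoneNumber = { "contact.phone.number": "111-222-3333" };

// `db` is the mongosh database handle, passed in as a parameter.
function findByLastName(db, lastName) {
  return db.getCollection("people").find({ "name.last": lastName });
}

// Inside mongosh:  findByLastName(db, "Turing");
```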

Limitations in MongoDB Documents

Documents have the following characteristics:

Document Size Limit

The maximum size of a BSON document is 16 megabytes. The limit prevents a single document from using an excessive amount of RAM or, during transmission, an excessive amount of bandwidth. To store data larger than this limit, MongoDB offers the GridFS API, which lets developers store and retrieve large files such as images, audio files, and videos in a MongoDB database.

Document Field Order

The fields in a BSON document are ordered, unlike JavaScript objects.

Field order in Query Operations

Field ordering is significant when comparing documents in a query. For example, {a: 1, b: 1} is equal to {a: 1, b: 1} but is not equal to {b: 1, a: 1}. For efficient execution, the query engine may reorder fields during query processing.

Intermediary and final results of a query that uses projection operators may have their field order changed, so it's not advisable to depend on specific field ordering in the results returned by the query.
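Order sensitivity matters most for equality matches on a whole embedded document, which require the same fields in the same order. The sketch below is only an illustration in plain JavaScript: sameFieldsInSameOrder is a hypothetical helper, and JSON.stringify is used as a simple stand-in for an order-sensitive comparison.

```javascript
// JSON.stringify keeps the insertion order of these keys, so two objects with
// the same fields in a different order produce different strings.
function sameFieldsInSameOrder(a, b) {
  return JSON.stringify(a) === JSON.stringify(b);
}

console.log(sameFieldsInSameOrder({ a: 1, b: 1 }, { a: 1, b: 1 })); // prints: true
console.log(sameFieldsInSameOrder({ a: 1, b: 1 }, { b: 1, a: 1 })); // prints: false
```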

Field order in Write Operations

MongoDB preserves the order of document fields in write operations, with two exceptions: the _id field is always the first field in the document, and updates that rename fields may reorder the fields in the document.

More Uses of MongoDB Documents

MongoDB's document structure is not only used for defining data records, but also for other purposes such as specifying query filters, update specification documents, and index specification documents.

Query Filter

Query filter documents are used to define the conditions based on which records are selected for read, update, or delete operations.

To specify these conditions, you can use <field>: <value> expressions to define an equality condition, as well as query operator expressions.
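Both kinds of condition are ordinary documents. In the sketch below, the field names, values, collection name users, and function name are hypothetical, and db is assumed to be the mongosh database handle.

```javascript
const equalityFilter = { status: "active" };            // <field>: <value>
const operatorFilter = { age: { $gte: 18, $lt: 65 } };  // query operator expression

// `db` is the mongosh database handle, passed in as a parameter.
function findActiveAdults(db) {
  // Conditions on different fields in one filter are combined with a logical AND.
  return db.getCollection("users").find({ ...equalityFilter, ...operatorFilter });
}
```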

Update Specifications

Update specification documents define the modifications to make to specific fields during an update operation. These documents use update operators to specify the changes to perform on the fields.
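An update specification might look like the following sketch. The field names, the users collection, and the function name are hypothetical; db is assumed to be the mongosh database handle.

```javascript
const updateSpec = {
  $set: { status: "inactive" },  // set a field to a value
  $inc: { loginCount: 1 }        // increment a numeric field by 1
};

// `db` is the mongosh database handle, passed in as a parameter.
function deactivateUser(db, userId) {
  // The first argument is a query filter, the second the update specification.
  return db.getCollection("users").updateOne({ _id: userId }, updateSpec);
}
```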

Index Specification

Index specification documents are used to define the field to index and the type of index to create in MongoDB.
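An index specification maps each field to index onto an index type. The sketch below assumes db is the mongosh database handle; the field names, the people collection, and the function name are hypothetical.

```javascript
// 1 requests ascending order and -1 descending order for that field.
const indexSpec = { "name.last": 1, views: -1 };

// `db` is the mongosh database handle, passed in as a parameter.
function createPeopleIndex(db) {
  return db.getCollection("people").createIndex(indexSpec);
}

// Inside mongosh:  createPeopleIndex(db);
```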

FAQs

Q. What is the maximum number of indexes a collection can have in MongoDB?

A. The maximum number of indexes a collection can have in MongoDB is 64.

Q. How can you access a field of an embedded document in MongoDB?

A. You can access a field of an embedded document using dot notation: concatenate the embedded document name and the field name, separated by a dot (.), as in "name.last".

Conclusion

  • MongoDB is a document-oriented database system. Databases in MongoDB comprise collections, and each collection contains a set of documents.
  • Every document in a collection has a unique "_id" field that serves as the document's primary key.
  • MongoDB documents are stored in BSON format, a binary-encoded representation of JSON-like documents. BSON supports arrays and embedded documents and adds data types such as Date, Binary Data, and ObjectId.
  • MongoDB uses dot notation to access fields of embedded documents and array items.
  • The conditions for selecting records for read, update, and delete actions are specified in query filter documents.
  • Update specification documents define changes to individual fields during an update operation using update operators.
  • Index specification documents define the fields to index and the type of index to create.