For the most part, users of MongoDB can treat it as a black box. When you are trying to understand performance characteristics or want a deeper understanding of the system, though, it helps to know a little about MongoDB's internals.
Documents in MongoDB are an abstract concept—the concrete representation of a document varies depending on the driver/language being used. Because documents are used extensively for communication in MongoDB, there also needs to be a representation of documents that is shared by all drivers, tools, and processes in the MongoDB ecosystem. That representation is called Binary JSON (BSON).
BSON is a lightweight binary format capable of representing any MongoDB document as a string of bytes. The database understands BSON, and BSON is the format in which documents are saved to disk.
When a driver is given a document to insert, use as a query, and so on, it encodes that document to BSON before sending it to the server. Likewise, documents returned from the server to the client are sent as strings of BSON bytes. The driver decodes this BSON data to its native document representation before handing it to the calling application.
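The encode and decode steps described above can be sketched in a few lines of Python. This is only an illustration, not how any real driver is implemented: the function names encode_document and decode_document are hypothetical, only UTF-8 strings and 32-bit integers are handled, and the byte layout follows the public BSON specification (a little-endian int32 total length, a list of typed elements, and a terminating zero byte).

```python
import struct


def encode_document(doc):
    """Encode a flat dict of str/int values as BSON (illustrative subset)."""
    body = b""
    for key, value in doc.items():
        name = key.encode("utf-8") + b"\x00"  # element name is a C string
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # Type 0x02: int32 byte length (including the trailing NUL), then bytes.
            body += b"\x02" + name + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            # Type 0x10: 32-bit signed integer (struct raises if out of range).
            body += b"\x10" + name + struct.pack("<i", value)
        else:
            raise TypeError("unsupported type in this sketch: %r" % type(value))
    # Total length counts the 4-byte length field and the terminating zero byte.
    total = 4 + len(body) + 1
    return struct.pack("<i", total) + body + b"\x00"


def decode_document(data):
    """Decode BSON bytes produced by encode_document back into a dict."""
    doc = {}
    pos = 4  # skip the int32 total length
    while data[pos] != 0:  # a zero byte terminates the element list
        type_byte = data[pos]
        pos += 1
        end = data.index(b"\x00", pos)  # end of the element name
        key = data[pos:end].decode("utf-8")
        pos = end + 1
        if type_byte == 0x02:
            length = struct.unpack_from("<i", data, pos)[0]
            pos += 4
            doc[key] = data[pos:pos + length - 1].decode("utf-8")
            pos += length
        elif type_byte == 0x10:
            doc[key] = struct.unpack_from("<i", data, pos)[0]
            pos += 4
        else:
            raise ValueError("unsupported BSON type 0x%02x" % type_byte)
    return doc


wire_bytes = encode_document({"hello": "world"})
print(wire_bytes)
print(decode_document(wire_bytes))
```

In practice a driver's bundled BSON library does this work and covers every BSON type, so application code never needs to build these bytes by hand.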
The BSON format has three primary goals:
BSON is designed to represent data efficiently, without using much extra space. In the worst case, BSON is slightly less space-efficient than JSON; in the best case (e.g., when storing binary data or large numbers), it is much more efficient.
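A rough way to see both ends of that range is to compare encoded sizes directly. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the BSON bytes are built by hand from the public specification rather than by a driver, and the JSON side uses base64 for the binary payload because JSON has no binary type (base64 is a common convention, not something the text prescribes).

```python
import base64
import json
import struct


def bson_int32_doc(key, value):
    """BSON bytes for a one-field document holding a 32-bit integer."""
    element = b"\x10" + key.encode("utf-8") + b"\x00" + struct.pack("<i", value)
    return struct.pack("<i", 4 + len(element) + 1) + element + b"\x00"


def bson_binary_doc(key, payload):
    """BSON bytes for a one-field document holding generic binary data."""
    element = (
        b"\x05" + key.encode("utf-8") + b"\x00"
        + struct.pack("<i", len(payload))  # payload length in bytes
        + b"\x00"                          # subtype 0x00: generic binary
        + payload
    )
    return struct.pack("<i", 4 + len(element) + 1) + element + b"\x00"


def json_size(doc):
    """Size in bytes of the most compact JSON encoding of doc."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


# Near the worst case for BSON: a tiny integer still takes a full 4 bytes,
# and the document carries a length prefix and a terminator.
small_json = json_size({"a": 1})
small_bson = len(bson_int32_doc("a", 1))

# Near the best case for BSON: raw bytes are stored as-is, while JSON
# has to carry them as base64 text.
payload = bytes(range(256)) * 4  # 1024 bytes of binary data
blob_json = json_size({"b": base64.b64encode(payload).decode("ascii")})
blob_bson = len(bson_binary_doc("b", payload))

print("small int:", small_json, "bytes as JSON,", small_bson, "bytes as BSON")
print("1 KB blob:", blob_json, "bytes as JSON,", blob_bson, "bytes as BSON")
```

With these inputs the tiny integer document is 7 bytes as JSON and 12 bytes as BSON, while the 1,024-byte blob is 1,376 bytes as JSON and 1,037 bytes as BSON.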