Skip to main content

How MongoDB Sorting Works for Scalar Values

· 7 min read
Chi Fujii
FerretDB Team

How MongoDB Sorting Works for Scalar Values

In this blog post, we delve into the process of sorting scalar values in MongoDB.

Sorting in MongoDB involves comparing BSON values to ascertain their relative order – whether one value is equal to, greater than, or less than another. The resultant sorted values can be in either ascending or descending order.

When comparing different BSON types, the BSON comparison order is used.

BSON comparison order

If two BSON values share the same type, their values are compared to determine which is greater or less.

However, if the BSON types are different, a predefined BSON comparison order is used.

The table below shows the predefined BSON comparison order for each BSON type.

Order of Comparison
(lowest to highest)
BSON Types
1Null
2Numbers (Integer, Long, Double, Decimal)
3String
4Object
5Array
6BinData
7ObjectId
8Boolean
9Date
10Timestamp
11Regular Expression

Comparison of values with different BSON types

To compare values of different BSON types, look at the predefined comparison order given to each type. For example, the Null BSON type has the lowest order, which is 1. This means Null is less than any other BSON values. The Boolean BSON type has an order of 8. So, if you're comparing it with an ObjectId type that has a lower order, the Boolean is greater. But if you're comparing it with a Timestamp type that has a higher order, the Boolean is less.

The key to comparing different BSON types is simply to check their predefined orders. Arrays are an exception, and we'll talk about them in another blog post.

Number comparison

Even though Numbers come in various BSON types – Integer, Long, Double, and Decimal – they're treated as the same type when comparing. This means the focus is on the actual numerical values, not on whether they're Integer, Long, Double, or Decimal. For example, an Integer value of 0 is seen as the same as a Double value of 0.0 when comparing them.

Null and non-existent field comparison

For comparison, a non-existent field is equivalent to Null. This means that a field v with Null value {v: null} is considered the same as a non-existent v field in {}.

Examples showcasing sorting for scalar values

Let's create an outfits collection using the following query to insert documents.

db.outfits.insertMany([
{ _id: 1, name: 'flip flops', size: 'M', color: 'blue' },
{ _id: 2, name: 'sandals', size: 9, color: null },
{ _id: 3, name: 'boots', size: 8, color: 'black' },
{ _id: 4, name: 'sneakers', size: 8.5, color: 'blue' },
{ _id: 5, name: 'slippers' }
])

The outfits collection includes a size field that represents various BSON types. For instance, the document for flip flops contains a String value in this field, while sandals and boots have Integer values. The sneakers document has the size field as a Double value, and the slippers document lacks the size field altogether. To sort these documents in ascending order based on the size field, you would use a sorting order of 1 and execute the following query.

db.outfits.find().sort({ size: 1 })
[
{ _id: 5, name: 'slippers' },
{ _id: 3, name: 'boots', size: 8, color: 'black' },
{ _id: 4, name: 'sneakers', size: 8.5, color: 'blue' },
{ _id: 2, name: 'sandals', size: 9, color: null },
{ _id: 1, name: 'flip flops', size: 'M', color: 'blue' }
]

The sorted output starts with the slippers document, which lacks a size field. According to our earlier discussion on how Null and non-existent fields are equivalent, it has the lowest BSON type and appears first.

Next in line are documents with Number values in the size field. Numbers hold a higher BSON comparison order than Null, so they appear after slippers document with the missing size field. Specifically, we see boots with an Integer BSON type, followed by sneakers with a Double BSON type, and then sandals also with an Integer BSON type. Why this particular order? Because all Numbers, regardless of their BSON type, are considered equivalent for comparison. Only the actual numerical values matter. In this case, boots with a size of 8 comes before sneakers with a size of 8.5, which in turn precedes sandals with a size of 9.

Lastly, we have flip flops with a String BSON type. Strings have a higher BSON comparison order than Numbers, so this document comes at the end of our sorted list.

To sort the documents in descending order by the size field, you would use a sorting order of -1 and execute the following query.

db.outfits.find().sort({ size: -1 })
[
{ _id: 1, name: 'flip flops', size: 'M', color: 'blue' },
{ _id: 2, name: 'sandals', size: 9, color: null },
{ _id: 4, name: 'sneakers', size: 8.5, color: 'blue' },
{ _id: 3, name: 'boots', size: 8, color: 'black' },
{ _id: 5, name: 'slippers' }
]

This time, the output is sorted first by flip flops with String size field, then by size field with Numbers sandals, sneakers and boots and finally slippers with a non-existent size field.

Using _id as the second sort field

Suppose you want to sort the documents by the color field. You encounter multiple documents with the color blue, and one document has a Null value for this field while another is missing it altogether.

For instance, flip flops has a Null value in the color field, and the slippers document lacks this field. Since Null and non-existent fields are considered equivalent in sorting, either could appear first. In situations like this, the default order in which the records were retrieved from the database is applied.

To maintain a consistent sort order, it's advised to use _id as a secondary sorting option. In this setup, if multiple documents have the same or equivalent color values, it will rely on the _id field for sorting. The _id field value is unique within the collection which ensures that the sorted output remains consistent.

db.outfits.find().sort({ color: 1, _id: 1 })
[
{ _id: 2, name: 'sandals', size: 9, color: null },
{ _id: 5, name: 'slippers' },
{ _id: 3, name: 'boots', size: 8, color: 'black' },
{ _id: 1, name: 'flip flops', size: 'M', color: 'blue' },
{ _id: 4, name: 'sneakers', size: 8.5, color: 'blue' }
]

The output shows that the sandals document is sorted before slippers, despite both having equivalent values in the color field. This is because the sort mechanism uses the secondary _id field for ordering, and sandals has a lower _id value than slippers.

Likewise, both flip flops and sneakers have the same color value, but flip flops comes first because its _id value is lower than that of sneakers.

Roundup

This blog post has shown how BSON comparison order functions in sorting and how scalar values are compared against each other. In an upcoming blog post, we will delve into how sorting of Arrays and Objects works.

Stay tuned!