Overview
In the evolving landscape of NoSQL databases, MongoDB stands out as a popular choice for its flexibility and ease of use. This flexibility, however, sometimes calls for checks and constraints to keep data valid and consistent. Starting with MongoDB 3.6, document validation became considerably more expressive with the introduction of the $jsonSchema validator. In this tutorial, we’ll explore how to use $jsonSchema to validate documents in MongoDB and maintain consistent data models.
Introduction to JSON Schema Validation in MongoDB
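JSON Schema is a standard vocabulary for describing and validating the structure of JSON documents. Before $jsonSchema, validation rules in MongoDB had to be written as query-operator expressions, which were somewhat limited. $jsonSchema allows for a more comprehensive, standards-based approach and provides a rich set of validation keywords. MongoDB implements draft 4 of JSON Schema, adds BSON-specific extensions such as bsonType, and omits a few keywords (for example $ref, definitions, default, and format).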
The Fundamentals
Before diving into examples, be sure your MongoDB server is running and accessible. For these examples, we’ll use the mongo shell; the same commands also work in its successor, mongosh. You can also use a MongoDB driver in your preferred programming language.
Basic Document Validation with $jsonSchema
Let’s start by creating a new collection with basic validation rules. The following example defines a schema requiring that each document in the collection have a string-type name and an integer-type age.
db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: [ "name", "age" ],
      properties: {
        name: {
          bsonType: "string",
          description: "must be a string and is required"
        },
        age: {
          bsonType: "int",
          minimum: 0,
          description: "must be an integer and is required"
        }
      }
    }
  }
});
If you try to insert a document that is missing a required field or contains a value of the wrong type, MongoDB rejects the write with an error:
db.users.insertOne({ name: "John Doe", age: "unknown" });
// Error: Document failed validation
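In a script you will usually want to catch that error rather than let it abort execution. The following is a minimal sketch, assuming mongosh or a Node.js driver collection object; tryInsert is a hypothetical helper name, and the errInfo.details field is only populated on MongoDB 5.0 and newer.

```javascript
// Hypothetical helper (not part of MongoDB): attempts an insert and
// reports whether the server rejected it for failing validation.
function tryInsert(collection, doc) {
  try {
    collection.insertOne(doc);
    return { ok: true };
  } catch (err) {
    // 121 is the server error code for DocumentValidationFailure.
    if (err.code === 121) {
      return {
        ok: false,
        reason: err.message,
        // Present on MongoDB 5.0+; explains which rule was not satisfied.
        details: err.errInfo ? err.errInfo.details : undefined
      };
    }
    throw err; // anything else (network, duplicate key, ...) is not ours to handle
  }
}

// Usage in the shell (assumes the "users" collection defined earlier):
// tryInsert(db.users, { name: "John Doe", age: "unknown" });
```

Returning a plain result object keeps the calling code free of try/catch blocks while still surfacing the server's explanation when it is available.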
Advanced Field Validation
Expanding on basic validation, $jsonSchema allows for more specific rules, such as minimum values and pattern matching on strings. Here’s a schema that validates documents with more stringent rules for a user’s email and membership status:
db.runCommand({
  collMod: "users",
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: [ "name", "age", "email", "isMember" ],
      properties: {
        name: {
          bsonType: "string"
        },
        age: {
          bsonType: "int",
          minimum: 18
        },
        email: {
          bsonType: "string",
          pattern: "^.+@.+$"
        },
        isMember: {
          bsonType: "bool"
        }
      }
    }
  }
});
This requires the email field to match a basic pattern for email addresses and the user’s age to be at least 18. Inserting a document that violates these constraints will fail:
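// NumberInt() guarantees a 32-bit integer; the legacy mongo shell stores bare numbers as doubles
db.users.insertOne({ name: "Jane Doe", age: NumberInt(17), email: "jane.doe@example.com", isMember: true });
// Error: Document failed validation (age is below the minimum of 18)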
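It helps to see how permissive the email pattern really is. The sketch below tests the same expression with JavaScript's RegExp; MongoDB evaluates pattern with its own PCRE-based engine, but for an expression this simple the two are assumed to behave alike. The sample addresses are made up for illustration.

```javascript
// Same expression as the "pattern" keyword in the schema.
const emailPattern = new RegExp("^.+@.+$");

// Returns true when the string would satisfy the schema's pattern rule.
function matchesEmailPattern(value) {
  return emailPattern.test(value);
}

console.log(matchesEmailPattern("jane.doe@example.com")); // true
console.log(matchesEmailPattern("a@b"));                  // true (no domain check at all)
console.log(matchesEmailPattern("jane.doe.example.com")); // false (no @)
console.log(matchesEmailPattern("@example.com"));         // false (nothing before @)
console.log(matchesEmailPattern("jane@"));                // false (nothing after @)
```

The pattern only guarantees at least one character on each side of an @ sign, so treat it as a sanity check rather than full address validation.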
Logical Conditions and Compound Validation
Beyond individual field validation, $jsonSchema can impose compound conditions that apply to multiple fields for more complex rule sets. Consider this extended example, adding logical conditions that interrelate the fields:
db.runCommand({
  collMod: "users",
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: [ "name", "age", "memberTier" ],
      properties: {
        name: {
          bsonType: "string"
        },
        age: {
          bsonType: "int",
          minimum: 18
        },
        memberTier: {
          enum: [ "gold", "silver", "bronze" ],
          description: "can only be one of the three allowed tiers"
        }
      },
      // Exactly one of the following branches must match.
      oneOf: [
        {
          // gold members must list their premium features
          required: [ "premiumFeatures" ],
          properties: {
            memberTier: { enum: [ "gold" ] },
            premiumFeatures: {
              bsonType: "array",
              items: {
                bsonType: "string"
              }
            }
          }
        },
        {
          // any other tier: premiumFeatures must be null or absent
          properties: {
            memberTier: { enum: [ "silver", "bronze" ] },
            premiumFeatures: {
              bsonType: "null"
            }
          }
        }
      ]
    }
  }
});
MongoDB’s $jsonSchema implements draft 4 of JSON Schema, so later keywords such as if, then, else, and const are not available and are rejected as unknown keywords. The schema above expresses the same condition with oneOf and single-value enum lists: if memberTier is “gold”, the premiumFeatures field (an array of strings) must be present. If any other tier is specified, premiumFeatures must be set to null or omitted.
// NumberInt() guarantees a 32-bit integer; the legacy mongo shell stores bare numbers as doubles
db.users.insertOne({
  name: "Emily Clark",
  age: NumberInt(25),
  memberTier: "gold",
  premiumFeatures: ["priority support", "extended warranty"]
});
// Successfully inserted
db.users.insertOne({
  name: "Michael Smith",
  age: NumberInt(30),
  memberTier: "silver"
});
// Successfully inserted
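To make the tier rule concrete, the following plain JavaScript predicate mirrors it outside the database. It is only an illustrative sketch: satisfiesTierRule is a hypothetical helper, not part of MongoDB, and the server-side validator remains the authoritative check.

```javascript
// Hypothetical client-side mirror of the compound rule:
// gold     -> premiumFeatures must be an array of strings
// otherwise -> premiumFeatures must be null or absent
function satisfiesTierRule(doc) {
  const features = doc.premiumFeatures;
  if (doc.memberTier === "gold") {
    return Array.isArray(features) &&
      features.every((feature) => typeof feature === "string");
  }
  return features === undefined || features === null;
}

console.log(satisfiesTierRule({ memberTier: "gold", premiumFeatures: ["priority support"] })); // true
console.log(satisfiesTierRule({ memberTier: "gold" }));                                        // false
console.log(satisfiesTierRule({ memberTier: "silver" }));                                      // true
console.log(satisfiesTierRule({ memberTier: "silver", premiumFeatures: ["priority support"] })); // false
```

A pre-check like this can give users a friendlier message before the write is attempted, at the cost of keeping two copies of the rule in sync.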
Validation on Update Operations
$jsonSchema validation applies to updates as well as inserts. With the default validation level (strict), MongoDB checks every updated document against the schema and rejects changes that would violate it.
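// The filter must match an existing document; an update that matches nothing raises no error
db.users.updateOne(
  { name: "Emily Clark" },
  { $set: { age: "thirty" } }
);
// Error: Document failed validation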
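How strictly writes are checked is controlled by two collection options that can accompany the validator: validationLevel ("strict" by default, or "moderate", which skips updates to existing documents that were already invalid, or "off") and validationAction ("error" by default, or "warn", which logs the violation but allows the write). The sketch below builds such a command; buildCollMod is a hypothetical helper name, and newer server versions may accept additional option values beyond those listed here.

```javascript
// Hypothetical helper: assembles a collMod command with explicit
// validation options. "collMod" must stay the first key of the command.
function buildCollMod(collName, validator, options = {}) {
  const level = options.level || "strict";
  const action = options.action || "error";
  const levels = ["off", "strict", "moderate"];
  const actions = ["error", "warn"];
  if (!levels.includes(level)) {
    throw new Error("unknown validationLevel: " + level);
  }
  if (!actions.includes(action)) {
    throw new Error("unknown validationAction: " + action);
  }
  return {
    collMod: collName,
    validator: validator,
    validationLevel: level,
    validationAction: action
  };
}

// Usage in the shell: roll out a new rule gently on a collection that
// may already contain non-conforming documents.
// db.runCommand(buildCollMod(
//   "users",
//   { $jsonSchema: { bsonType: "object", required: [ "name" ] } },
//   { level: "moderate", action: "warn" }
// ));
```

Starting with "moderate" and "warn" lets you observe violations in the server log before switching to hard rejection.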
Managing Existing Collections
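You can add, modify, or remove validation rules for existing collections using the collMod command. It provides the flexibility to evolve your schema with your application’s needs. The command replaces the collection’s validator as a whole, so always pass the complete set of rules; passing an empty validator document removes validation.
db.runCommand({
  collMod: "users",
  validator: { /* updated $jsonSchema rules */ }
});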
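Before changing the rules, it is often useful to read back what is currently in force. The sketch below uses db.getCollectionInfos(), which returns each collection's options, including its validator; currentValidator is a hypothetical helper name, and the database handle is passed in so the function does not depend on the shell's global db.

```javascript
// Hypothetical helper: returns the validator document currently attached
// to a collection, or null when the collection is missing or unvalidated.
function currentValidator(database, collName) {
  const infos = database.getCollectionInfos({ name: collName });
  if (infos.length === 0) {
    return null; // no such collection
  }
  const options = infos[0].options || {};
  return options.validator || null;
}

// Usage in the shell:
// const validator = currentValidator(db, "users");
// printjson(validator);
```

Reading the validator first makes it easier to extend the existing rules in place and then hand the full document back to collMod.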
Tips for Effective Schema Design
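- Iterative Schema Development: Start with basic requirements and elaborate as needs become clearer over time.
- Consistent Field Definitions: Reuse common schema definitions to maintain consistency. MongoDB’s $jsonSchema does not support $ref or definitions, so share fragments in the shell script or application code that builds the validator.
- Applying Default Values: The default keyword is not supported by $jsonSchema, and validation never modifies a document. Apply defaults in application code before insertion so that documents arrive complete.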
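One way to act on the last two tips is sketched below, on the assumption that both reuse and defaults are handled in shell or application code, since MongoDB's $jsonSchema supports neither $ref/definitions nor default. The names nonEmptyString, userSchema, and withDefaults, as well as the "bronze" default tier, are hypothetical choices for illustration.

```javascript
// Shared fragment reused by several fields (stands in for $ref).
const nonEmptyString = { bsonType: "string", minLength: 1 };

// Schema assembled from the shared fragment; spread copies it so the
// email field can add its own pattern without touching the original.
const userSchema = {
  bsonType: "object",
  required: [ "name", "email", "memberTier" ],
  properties: {
    name: nonEmptyString,
    email: { ...nonEmptyString, pattern: "^.+@.+$" },
    memberTier: { enum: [ "gold", "silver", "bronze" ] }
  }
};

// Defaults applied by the application before the write; values supplied
// by the caller take precedence over the defaults.
function withDefaults(doc) {
  return { memberTier: "bronze", ...doc };
}

// Usage in the shell:
// db.createCollection("members", { validator: { $jsonSchema: userSchema } });
// db.members.insertOne(withDefaults({ name: "Ada", email: "ada@example.com" }));
```

Keeping the fragments and defaults in one module gives a single place to change when a field definition evolves.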
Performance Considerations
Be mindful of the performance implications of complex validation rules, especially for large collections or documents. Extensive use of regular expressions, conditional validation, and logical operations might impact write throughput. Profile writes to see if schema validation has become a bottleneck and refine the rules as needed.
Conclusion
This tutorial introduced MongoDB’s $jsonSchema operator for document validation, with a focus on building robust data models without giving up flexibility. JSON Schema lets developers enforce necessary constraints, promote consistency, and protect data integrity across a MongoDB deployment.