Hey guys, as most of you know, request validation should be mandatory and strict when building an API. In this article we will go through how you can implement request validation when building an API (in Python) and why it is so important.
- Problem At Hand
- Request Validation
- Introduction to Marshmallow
- How to use Request Validation in Production
- Important Validation Examples
Problem At Hand
Let’s start with a scenario: you have a simple (totally hypothetical) API that takes an array of numbers (integers/decimals) in a POST request, calculates their sum, saves the answer in some database and also returns it in the response. The example API could be /add with the given requestBody:
{ "numbers": [1, 2, 3.0, 4, 5] }
This may seem very simple: we have an array of numbers and the following code for the add function.
def add(numbers):
    total = 0  # named "total" to avoid shadowing the built-in sum()
    for num in numbers:
        total += num
    return total
This will work just fine as long as the “numbers” array contains only numbers. Now suppose the client sends this array instead:
{ "numbers": [1, 2, 3.0, 4, 5, "hello"] }
While this may be perfectly valid JSON as a requestBody, your addition algorithm would fail and the application would crash with the following error:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: unsupported operand type(s) for +=: 'float' and 'str'
Obviously, you would think, why would we ever have a string in the array in the first place? Ideally you would not, but consider the case where your API is published and some third-party client (be it any type of app or another service) consuming it sends this by mistake. Rather than the client receiving a graceful error, the request would die with an unhandled exception and the client would receive a generic HTTP 500 INTERNAL SERVER ERROR.
Ideally, your application should never exit or behave unexpectedly. You should handle all scenarios gracefully!
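As a minimal sketch of what graceful handling could look like for /add: the handler below checks the payload before summing and returns a status code with a message instead of crashing. The name handle_add, the (status, payload) return shape and the error messages are assumptions made for illustration, and the request body is assumed to be an already-parsed JSON object.

```python
def add(numbers):
    total = 0
    for num in numbers:
        total += num
    return total


def handle_add(request_body):
    """Return an (HTTP status, JSON-able payload) pair instead of raising."""
    numbers = request_body.get("numbers")
    if not isinstance(numbers, list):
        return 400, {"error": "'numbers' must be an array."}

    for index, value in enumerate(numbers):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return 400, {"error": f"Element {index} is not a number."}

    return 200, {"sum": add(numbers)}


good = handle_add({"numbers": [1, 2, 3.0, 4, 5]})
bad = handle_add({"numbers": [1, 2, 3.0, 4, 5, "hello"]})
```

With this in place the bad payload produces a 400 with a readable message, and the client never sees the TypeError.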
Request Validation
To solve this, you can add an initial check that iterates over the array and verifies that each element meets your requirements before you pass the array to your addition function (obviously you could check while adding, but this article is about simplicity, not performance). This approach seems good: if you find any issue, you can just return an HTTP 400 BAD REQUEST status saying that the array does not have the proper format. Now consider a more realistic situation where you have the following JSON as the requestBody for some API that adds a new employee:
{
  "name": "Shrey",
  "countryCode": 91,
  "age": 25,
  "address": {
    "city": "Abc",
    "stateCode": "XY",
    "zipcode": 12345
  },
  "employeeType": "E1",
  "department": "D2"
}
Now, to validate this via if/else checks, just imagine how many custom checks you would need to enforce the following rules (all hypothetical assumptions):
- Name should always be required.
- Country code should allow only the country codes in the list [1, 2, 91, 1050].
- Age should be greater than 0 but less than 120.
- Address is a nested object where city should have a minimum of 3 characters, stateCode should be a valid stateCode and zipcode should always be a 5 digit number.
- Department and employee type have an M:N relationship, where each department allows only a defined set of employee types.
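To get a feel for how much code that is, here is a rough sketch of the rules above written as plain if/else checks. The function name, the error keys and messages, the list of valid state codes and the department-to-employee-type mapping are all assumptions picked for illustration (the mapping mirrors the one used in the marshmallow schema later in this article).

```python
VALID_COUNTRY_CODES = (1, 2, 91, 1050)
VALID_STATE_CODES = ("OH", "XY", "AB")  # assumed list of valid state codes
# Assumed mapping: which employee types each department accepts.
ALLOWED_TYPES = {
    "D1": ("E1", "E2", "E3"),
    "D2": ("E1", "E2", "E3", "E4", "E5", "E6"),
    "D3": ("E4", "E5", "E6"),
    "D6": ("E4", "E5", "E6"),
}


def is_int(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_employee(body):
    """Return a dict of error messages; an empty dict means the body is valid."""
    errors = {}

    name = body.get("name")
    if not isinstance(name, str) or not name:
        errors["name"] = "Name is required."

    if body.get("countryCode") not in VALID_COUNTRY_CODES:
        errors["countryCode"] = "Unsupported country code."

    age = body.get("age")
    if not is_int(age) or not 0 < age < 120:
        errors["age"] = "Age must be greater than 0 and less than 120."

    address = body.get("address")
    if not isinstance(address, dict):
        errors["address"] = "Address must be an object."
    else:
        city = address.get("city")
        if not isinstance(city, str) or len(city) < 3:
            errors["address.city"] = "City needs at least 3 characters."
        if address.get("stateCode") not in VALID_STATE_CODES:
            errors["address.stateCode"] = "Unknown state code."
        zipcode = address.get("zipcode")
        if not is_int(zipcode) or not 10000 <= zipcode <= 99999:
            errors["address.zipcode"] = "Zipcode must be a 5 digit number."

    department = body.get("department")
    allowed = ALLOWED_TYPES.get(department, ()) if isinstance(department, str) else ()
    if body.get("employeeType") not in allowed:
        errors["department"] = "Employee type is not allowed in this department."

    return errors


sample = {
    "name": "Shrey",
    "countryCode": 91,
    "age": 25,
    "address": {"city": "Abc", "stateCode": "XY", "zipcode": 12345},
    "employeeType": "E1",
    "department": "D2",
}
errors = validate_employee(sample)
```

That is around fifty lines for a single endpoint, and every one of them would live inside (or right next to) the view.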
Writing all of these validations and cases in your view/controller gets complicated, and the checks become tightly coupled to that particular view. What if there is a better way to achieve this, making it independent and strict? Obviously there is! We will see how we can use the marshmallow package in Python to achieve this (and yes, you can use this concept in any language or framework you want).
Introduction to Marshmallow
Marshmallow is an awesome package which allows you to check, verify and serialise your objects against a well-defined, strict model, allowing you to decouple request validation logic from your views. This package is well suited to any type of Python API, be it Flask, Django, Pyramid or any other. A small example would be:
from marshmallow import Schema, fields, validate


class PersonSchema(Schema):
    name = fields.String(required=True)
    age = fields.Integer(required=True, validate=validate.Range(min=0, max=120))
Now to validate your requestBody, all you need to do is –
from custom_responses import BadRequest

# Extract POST body or the arguments of GET request (request.GET.dict())
requestBody = request.POST

errors = PersonSchema().validate(requestBody)
if errors:
    # raise a 400 Bad Request with the errors dictionary.
    raise BadRequest("Validation failed.", errors=errors)
So all you need to do is build a validation schema for each API as a model class and use it to validate each incoming API call. You can also reuse your schemas across related APIs that share a similar structure.
How to use Request Validation in Production
As we saw, we can build our own schemas by inheriting from marshmallow’s Schema class and call the .validate(data_dict) method to validate the data. Writing this in the views/controllers is okay, but you might want to move it out of your view into some type of middleware, or you could write a simple decorator to handle it and decorate each of your APIs. That way, you just pass your validation schema to the decorator and your view contains only the business logic of your API. Example decorator (just a pseudo code example):
# Only a pseudo code example
from functools import wraps

from custom_responses import BadRequest


def request_validator(validation_schema, *args, **kwargs):
    def validator(original_function):
        @wraps(original_function)
        def inner(request, *args, **kwargs):
            # Query parameters for GET/HEAD, the request body otherwise.
            if request.method in ["GET", "HEAD"]:
                data = request.GET.dict()
            else:
                data = request.POST

            errors = validation_schema().validate(data)
            if errors:
                # raise a 400 HTTP response with errors.
                raise BadRequest("Validation Failed.", errors=errors)

            return original_function(request, *args, **kwargs)

        return inner

    return validator
Then just wrap your API with this decorator –
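# require_http_methods is Django's decorator for restricting HTTP verbs.
from django.views.decorators.http import require_http_methods

from custom_utils import request_validator
from custom_schemas import SomeAPISchema


@require_http_methods(["GET"])
@request_validator(validation_schema=SomeAPISchema)
def some_api(request):
    # do stuff
    return response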
This way, you can actually scale your code, and all your schemas and validations can live independently of your view code.
Important Validation Examples
Marshmallow is a very powerful tool for creating really complex schemas, and it can apply very strict, custom, complex validations just the way you want. Let’s recreate the add employee request body given above, with the rules listed below that example.
from marshmallow import (
    Schema,
    ValidationError,
    fields,
    validate,
    validates,
    validates_schema,
)


class AddressSchema(Schema):
    city = fields.String(required=True, validate=validate.Length(min=3))
    # load_default (marshmallow >= 3.13) fills in a missing stateCode on load.
    stateCode = fields.String(
        load_default="XY", validate=validate.OneOf(["OH", "XY", "AB"])
    )
    zipcode = fields.Integer(validate=validate.Range(min=10000, max=99999))


class EmployeeSchema(Schema):
    name = fields.String(required=True)
    countryCode = fields.Integer(validate=validate.OneOf([1, 2, 91, 1050]))
    # Exclusive bounds: greater than 0 but less than 120.
    age = fields.Integer(
        validate=validate.Range(
            min=0, max=120, min_inclusive=False, max_inclusive=False
        )
    )
    address = fields.Nested(AddressSchema)
    employeeType = fields.String(required=True)
    department = fields.String(required=True)

    # extra examples
    marks = fields.List(fields.Integer())
    extraInfo = fields.Dict()

    # Custom validator for a single field.
    @validates("age")
    def custom_age_validator(self, value, **kwargs):
        if 25 < value < 35:
            raise ValidationError("Persons aged between 25 and 35 are not allowed.")

    # Schema-level validator: checks fields against each other.
    @validates_schema
    def validate_employee_with_department(self, data, **kwargs):
        if not (
            (
                data["employeeType"] in ["E1", "E2", "E3"]
                and data["department"] in ["D1", "D2"]
            )
            or (
                data["employeeType"] in ["E4", "E5", "E6"]
                and data["department"] in ["D3", "D2", "D6"]
            )
        ):
            raise ValidationError("Incorrect mapping.")
Do check out the official marshmallow documentation for other validation functions, field types, parameters and customisations. Do like and comment on how this helped you in your coding life, and keep sharing with your friends too!