TDD Approach to Create an Authentication System With FastAPI Part 2

15 min read

Introduction

In the last post in the series we set up the project and saw some FastAPI basics along with the red-green-refactor mantra of TDD. We know that FastAPI comes with built-in SwaggerUI integration. We also know that FastAPI makes use of non-blocking code to make the whole thing lightning fast. With that said, let’s jump into the second part of the series, which is about database setup and user registration.

I love Postgres. I don’t have a strong technical argument for it, but I have seen it used in production. I have worked with MySQL in the past, and most of that knowledge is transferable to Postgres. On top of that, Postgres brings new things to the table. One of the biggest benefits is the licensing: Postgres is community driven, while MySQL is owned by Oracle. Postgres has views. I’m also a big fan of Postgres’ built-in data types. And I love the fact that Postgres is extensible.

While you read this post, take a moment to connect with me on LinkedIn.

Setting up PostgreSQL server

I’m not going to install PostgreSQL on my host system directly. Instead, I’ll run it in a Docker container. Make sure you have Docker installed already.

I’m going to write a docker-compose.yml file.

version: '3'

services: 
  database:
    image: postgres:12
    container_name: postgres_database
    environment:
      - POSTGRES_PASSWORD=postgres
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

That’s enough to get us started. We don’t need any fancy setup right now. I could have skipped the compose file and passed these settings as flags to docker run, but I wrote the compose file because it doubles as documentation.

Things to note here:

  1. I’m using version 12 of postgres.
  2. The password for the Postgres user is postgres. If you head over to the environment variables section in https://hub.docker.com/_/postgres/, you’ll see that the default user is also named postgres.
  3. If you read further in that section, you’ll see that the default value for POSTGRES_DB is also postgres.
  4. We have mapped port 5432 inside the container to port 5432 on the host system. In simple words, to our application, postgres appears to be running on the host system itself.
  5. Then there is the volumes section. It is for data persistence: without a named volume, the data would be lost whenever the container is removed and recreated.
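
To make these defaults concrete, here is a minimal sketch of how they map to the connection parameters a client needs. The helper name build_dsn and the COMPOSE_DEFAULTS dictionary are hypothetical names made up for this illustration; they are not part of Docker or of any database driver.

```python
# Defaults taken from docker-compose.yml and the postgres image documentation.
COMPOSE_DEFAULTS = {
    "dbname": "postgres",    # default POSTGRES_DB
    "user": "postgres",      # default POSTGRES_USER
    "password": "postgres",  # POSTGRES_PASSWORD from the compose file
    "host": "localhost",     # the container port is published on the host
    "port": "5432",          # host side of the "5432:5432" mapping
}


def build_dsn(**overrides):
    """Return a key=value connection string (hypothetical helper)."""
    params = {**COMPOSE_DEFAULTS, **overrides}
    return " ".join(f"{key}={value}" for key, value in params.items())


print(build_dsn())
```

If you later change POSTGRES_DB or the port mapping in the compose file, the matching entry here is the one that has to change too.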

Run the database server

To start the container from the above compose file:

$ docker compose up

This command will output quite a lot of log messages. Wait until the last line of the log says database system is ready to accept connections.
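
If you would rather not watch the logs, a script can poll the published port instead. The following is only a sketch: wait_for_port is a hypothetical helper, and an open TCP port merely tells you that something is listening, so Postgres may still need a moment before it accepts queries.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False.

    An open port only means something is listening; it does not prove
    that the database has finished starting up.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

With the compose file above, calling wait_for_port("localhost", 5432) should return True shortly after the container has started.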

Test the database connection

Now we need to test the connection from our host system to the Docker container running postgres. For that we need something called a database connector (also called a driver). There are different database connectors for different databases. For postgres itself there are around half a dozen connectors, but I love using psycopg (the actual package name is psycopg2).

I’ll install a package called psycopg2-binary instead of psycopg2, as the latter requires some additional steps and development packages to build from source, which we can avoid in our case.

Let’s install this python package in our environment.

pip install psycopg2-binary

Now let’s enter the Python REPL to create a table and insert a record into it.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
$ python
Python 3.7.10 (default, Jun  3 2021, 00:02:01) 
[GCC 7.3.1 20180712 (Red Hat 7.3.1-13)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import psycopg2
>>> conn = psycopg2.connect("dbname=postgres user=postgres password=postgres host=localhost")
>>> cur = conn.cursor()
>>> cur.execute("CREATE TABLE test (id serial PRIMARY KEY, num integer, data varchar);")
>>> cur.execute("INSERT INTO test (num, data) VALUES (%s, %s)", (100, "abc'def"))
>>> cur.execute("SELECT * FROM test;")
>>> cur.fetchone()
(1, 100, "abc'def")
>>> conn.commit()
>>> cur.close()
>>> conn.close()

  1. We enter the REPL on line 1. Then let’s jump to line 5, where we import psycopg2.
  2. Pay attention to the connection string passed to psycopg2.connect on line 6. The default dbname and user are both postgres, remember from the last section? We can set host to localhost because we have already bound the container’s port to the host.
  3. On line 7 you can see that I’m creating a cursor object, which is what we use to execute statements and fetch results.
  4. On line 8, I create a table named test. It has 3 fields: id, num and data. This table resides inside the postgres database.
  5. On line 9, I insert sample data into this table. On lines 10-12 we fetch the same data to make sure it has been written.
  6. On line 13, we commit the transaction. Why? Because psycopg2 runs our statements inside a transaction: until we commit, the changes are visible only to our own connection and are rolled back if the connection closes.
  7. At last, we close the cursor and the connection to the database.
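
The commit step is easier to appreciate with two connections. Below is a minimal sketch that uses Python’s built-in sqlite3 module as a stand-in for Postgres. That is an assumption made purely so the example runs without a server; the locking and isolation details differ between databases, but the idea that uncommitted work is invisible to other connections carries over.

```python
import os
import sqlite3
import tempfile

# SQLite stands in for Postgres here so the sketch runs without a server.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(db_path)
reader = sqlite3.connect(db_path)

writer.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, num INTEGER, data TEXT)")
writer.commit()

# The INSERT implicitly opens a transaction on the writer connection.
writer.execute("INSERT INTO test (num, data) VALUES (?, ?)", (100, "abc'def"))


def count_rows(conn):
    """Count the rows of the test table as seen by the given connection."""
    return conn.execute("SELECT COUNT(*) FROM test").fetchall()[0][0]


before_commit = count_rows(reader)  # the second connection sees no row yet
writer.commit()
after_commit = count_rows(reader)   # after the commit the row is visible

print(before_commit, after_commit)

writer.close()
reader.close()
```

The reader counts zero rows before the commit and one row after it, which is why the REPL session above would leave nothing behind if we closed the connection without committing.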

If any of the steps have failed, please let me know.

Note: For the sake of simplicity, we are not going to set up any custom roles or databases, but in a production environment security matters.

Test if data is in fact written

This time we are going to get into the container which is running the postgres server and check by hand whether the record we inserted with psycopg actually persists on the db instance.

To get into the container, we first need to know its name. The name is the same as the one we specified under container_name in the docker-compose.yml file. Check it by running docker ps.

$ docker ps
CONTAINER ID   IMAGE         COMMAND                  CREATED             STATUS             PORTS                                       NAMES
04d9f5dc11c1   postgres:12   "docker-entrypoint.s…"   About an hour ago   Up About an hour   0.0.0.0:5432->5432/tcp, :::5432->5432/tcp   postgres_database

We can see the name of the container in the NAMES section, which is postgres_database.

We do a docker exec to get a shell inside the container. Then we get into postgres CLI with psql command. -U postgres is to specify the role name.

$ docker exec -it postgres_database bash
root@04d9f5dc11c1:/# psql -U postgres
psql (12.8 (Debian 12.8-1.pgdg110+1))
Type "help" for help.

postgres=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# \dt test
        List of relations
 Schema | Name | Type  |  Owner   
--------+------+-------+----------
 public | test | table | postgres
(1 row)

postgres=# SELECT * FROM public.test;
 id | num |  data   
----+-----+---------
  1 | 100 | abc'def
(1 row)

postgres=# 

  1. The first thing I did after getting into the postgres shell was to connect to the postgres database with the command \c postgres. postgres is the default database created for us when we start the postgres container for the first time. You can override the default; read the Environment Variables section of https://hub.docker.com/_/postgres/.
  2. To check that the table exists, I issued \dt test, which lists it under the public schema. This is the same table we created with psycopg. (To see the columns as well, use \d test.)
  3. Finally I did a SELECT * FROM public.test to show all the entries in the test table. And we get the same result as we got in the Python REPL with psycopg.

It is now confirmed that psycopg is doing its work. This is the last time we touch psycopg directly in this tutorial. Later on, it will be handled by SQLAlchemy.

SQLAlchemy layer between database and app layer

You know what I like the most about SQLAlchemy? It helps prevent vendor lock-in in the long term. If I want to use another SQL-based RDBMS tomorrow, I mostly just need to swap the connection string and install the matching database connector package.

Why SQLAlchemy

Another good thing about SQLAlchemy is its Object Relational Mapper (ORM). You spend more time writing business logic and less time dealing with app-to-database interaction, because the most common interactions are already handled by the ORM.

But it has a downside too. If you are a database ninja, you may find yourself constrained by the ORM’s way of doing things. In that case you should write your own queries.

Unless you are keen on database management and want to study it inside out, please use SQLAlchemy or something similar to deal with connection pooling and the like.

Prepare the database

I will go ahead and create a database and a table for the authentication system we are building.

To do so we’ll get into the container like we did before, and create a database called fastauth. Like so:

CREATE DATABASE fastauth;

That is all we need; the rest will be taken care of by SQLAlchemy. But just for the sake of information, I’ll put the schema here.

CREATE TABLE user_info(
    id serial PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    password VARCHAR(500) NOT NULL,
    fullname VARCHAR(50) NOT NULL
);

Now we can go ahead and write python files which will be responsible for connecting with the database.

I could write all the logic in a single file, but I will create 4 new files for the sake of self-documentation.

  • database.py: for connecting to postgres
  • models.py: the jam of the sandwich which touches both slices of the loaf, aka FastAPI and postgres
  • schemas.py: to validate and serialize data coming in from requests and going out in responses
  • crud.py: actions to manipulate data in postgres

If I had to categorize the above files into database code and application code, I would say that database.py and models.py are more database related, schemas.py is for both, and crud.py is mostly application code.

database.py:
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

SQLALCHEMY_DATABASE_URL = "postgresql+psycopg2://postgres:postgres@localhost/fastauth"

engine = create_engine(SQLALCHEMY_DATABASE_URL)

SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

Base = declarative_base()

So what are create_engine, declarative_base and sessionmaker? Let’s go through them one by one:

  1. create_engine: creates a new Engine instance, which basically lets SQLAlchemy deal with most of the database stuff. As you can see, I have passed create_engine a database URL to connect to our postgres instance. Similarly, connection strings for various other db servers can be passed. MySQL? Oracle? SQLite? MSSQL? There are many other dialects that SQLAlchemy supports.

  2. sessionmaker: a factory which creates instances of Session. Sessions can be used with Python’s context managers, among various other features. Dig deeper into sessions at Using the Session. We have bound our engine to the session factory. In other words, every session it creates uses our engine.

  3. declarative_base: returns a base class for the models we are going to create in the sections below. Each class that inherits from Base is mapped to a table in the database. Read more about object-relational mapping at Mapping Python Classes.
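
To see what the database URL in database.py is made of, here is a small sketch that splits it with Python’s standard urllib.parse. This is only for illustration: SQLAlchemy parses the URL on its own, and the variable names below are mine.

```python
from urllib.parse import urlsplit

SQLALCHEMY_DATABASE_URL = "postgresql+psycopg2://postgres:postgres@localhost/fastauth"

parts = urlsplit(SQLALCHEMY_DATABASE_URL)

# The scheme carries two pieces of information: the dialect and the driver.
dialect, _, driver = parts.scheme.partition("+")
database = parts.path.lstrip("/")

print("dialect :", dialect)         # which SQL flavour SQLAlchemy should speak
print("driver  :", driver)          # which connector package does the talking
print("user    :", parts.username)
print("password:", parts.password)
print("host    :", parts.hostname)
print("database:", database)
```

Swapping databases mostly means changing the dialect and driver part of the scheme and pointing the rest of the URL at the new server.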

models.py:
from sqlalchemy import Column, Integer, String
from .database import Base

class UserInfo(Base):
    __tablename__ = "user_info"

    id = Column(Integer, primary_key=True, index=True)
    username = Column(String, unique=True)
    password = Column(String)
    fullname = Column(String)

Here we have used the Base class returned by declarative_base to create a UserInfo model. This represents a table in the database. The table has fields such as id, username, password and fullname. The arguments to Column are the data type and some other column properties, such as primary_key, index and unique. Don’t be shy to look them up on the internet if you don’t know about them.
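
If you want to poke at the model without touching Postgres, the sketch below runs the same class against an in-memory SQLite database. The sqlite:// URL, the single-file layout, and the placeholder password are assumptions made for this illustration, and it assumes SQLAlchemy 1.4 or newer is installed (where declarative_base can be imported from sqlalchemy.orm).

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class UserInfo(Base):
    __tablename__ = "user_info"

    id = Column(Integer, primary_key=True, index=True)
    username = Column(String, unique=True)
    password = Column(String)
    fullname = Column(String)


# Assumption: in-memory SQLite instead of the Postgres URL from database.py.
engine = create_engine("sqlite://")
SessionLocal = sessionmaker(bind=engine)

# Create the user_info table from the model definition.
Base.metadata.create_all(bind=engine)

db = SessionLocal()
try:
    # The password here is a placeholder, not a real hash.
    db.add(UserInfo(username="santosh", password="not-a-real-hash", fullname="Santosh Kumar"))
    db.commit()
    user = db.query(UserInfo).filter(UserInfo.username == "santosh").first()
    row = (user.id, user.username, user.fullname)
    print(row)
finally:
    db.close()
```

Only the URL passed to create_engine differs from our Postgres setup; the model class itself is unchanged.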

schemas.py:
from typing import List
from pydantic import BaseModel


class UserInfoBase(BaseModel):
    username: str
    fullname: str


class UserCreate(UserInfoBase):
    password: str


class UserInfo(UserInfoBase):
    id: int

    class Config:
        orm_mode = True

The schemas created here will be used for validating HTTP requests, and also for shaping responses. It won’t make much sense now, but I’ll poke you again when we deal with requests and responses.

crud.py:

Our crud.py is mostly the application code which uses models and schemas.

from sqlalchemy.orm import Session
import bcrypt
from . import models, schemas

def get_user_by_username(db: Session, username: str):
    return db.query(models.UserInfo).filter(models.UserInfo.username == username).first()

def create_user(db: Session, user: schemas.UserCreate):
    # bcrypt.hashpw returns bytes; decode so the String column stores plain text
    hashed_password = bcrypt.hashpw(user.password.encode('utf-8'), bcrypt.gensalt()).decode('utf-8')
    db_user = models.UserInfo(username=user.username, password=hashed_password, fullname=user.fullname)
    db.add(db_user)
    db.commit()
    db.refresh(db_user)

    return db_user
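
The reason create_user stores a salted hash rather than the password itself becomes clearer when you see the matching verification step. The sketch below illustrates that flow with PBKDF2 from Python’s standard library. To be clear, this is a swapped-in technique chosen so the example has no third-party dependency; it is not what crud.py uses (that is bcrypt), and the function names, the storage format, and the iteration count are all made up for this illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str) -> str:
    """Return 'salt_hex$digest_hex' using a fresh random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)


stored = hash_password("sntsh")
print(verify_password("sntsh", stored))  # the correct password matches
print(verify_password("wrong", stored))  # a wrong password does not
```

Because the salt is random, hashing the same password twice gives two different stored values, yet both verify correctly. bcrypt follows the same idea and keeps the salt inside the hash it returns.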

Reorganise files

Put all 4 files in a directory with the same name as the project’s root directory.

Also create an empty file named __init__.py in the same directory. In my case, I’ve put them in fastauth.

I have also moved test_main.py to a dedicated tests/ directory. It also has its own __init__.py.

Here is what my current directory structure looks like.

$ tree
.
├── docker-compose.yml
├── fastauth
│   ├── crud.py
│   ├── database.py
│   ├── __init__.py
│   ├── models.py
│   └── schemas.py
├── main.py
└── tests
    ├── __init__.py
    └── test_main.py

2 directories, 9 files
$ pwd
/home/ec2-user/workspace/fastauth

Define requirement for registration endpoint

We have come a long way. Kudos to you.

Getting back to our main.py, which has only one endpoint, /ping. We need to add another endpoint, /users/register, to our service. But before that we need to write tests. And to write tests, we need to have the requirements defined.

Think about what our endpoints will look like in the finished product. Here is what I can think of:

  1. An API consumer that hits /users/register with the GET method should get Method Not Allowed.

  2. An API consumer that hits /users/register with the POST method:

    • Without a body: should get an error about the required body parameters.
    • With a body:
      • Should check in the database whether the user exists; if so, should get an error saying that the user exists.
      • If the user is not found, should get a 201 response with the passed username.
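
The requirements above boil down to a small decision table, sketched here as a plain function with no web framework involved. The function name registration_status and its arguments are hypothetical; the 422 and 409 status codes are the ones the tests and main.py use later in this post.

```python
REQUIRED_FIELDS = ("username", "password", "fullname")


def registration_status(method, body, existing_usernames):
    """Return the HTTP status code the requirements call for (a sketch)."""
    if method != "POST":
        return 405  # Method Not Allowed
    if body is None or any(field not in body for field in REQUIRED_FIELDS):
        return 422  # missing body or missing required fields
    if body["username"] in existing_usernames:
        return 409  # the user already exists
    return 201  # the user can be created


new_user = {"username": "santosh", "password": "sntsh", "fullname": "Santosh Kumar"}
print(registration_status("GET", None, set()))
print(registration_status("POST", {"username": "santosh"}, set()))
print(registration_status("POST", new_user, {"santosh"}))
print(registration_status("POST", new_user, set()))
```

Each branch corresponds to one requirement, which is also how the requirements turn into individual test cases.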

Let’s do this much first, then we’ll continue with the login system.

Functional requirements turned into tests

tests/test_users.py

import pytest
from fastapi.testclient import TestClient

from main import app

client = TestClient(app)


class TestUserRegistration:
    """TestUserRegistration tests /users/register"""

    def test_get_request_returns_405(self):
        """registration endpoint does only expect a post request"""
        response = client.get("/users/register")
        assert response.status_code == 405

    def test_post_request_without_body_returns_422(self):
        """body should have username, password and fullname"""
        response = client.post("/users/register")
        assert response.status_code == 422

    def test_post_request_with_improper_body_returns_422(self):
        """all of username, password and fullname is required"""
        response = client.post(
            "/users/register",
            json={"username": "santosh"}
        )
        assert response.status_code == 422

    def test_post_request_with_proper_body_returns_201(self):
        response = client.post(
            "/users/register",
            json={"username": "santosh", "password": "sntsh", "fullname": "Santosh Kumar"}
        )
        assert response.status_code == 201

I have left one test out here: the one for “Should check in database if the user exists, if so: Throws error that user exists”. That is because I want to show you something. Let’s see what I’ve modified in main.py.

Implementation of the test

I am skipping the red-green-refactor cycle here because I know what the requirements are, and for the sake of this tutorial it would be too much detail. Anyway, we have already seen what red-green-refactor feels like in our last post.

Let’s fulfill our test cases in main.py.

main.py

I have added new bits together to use our database integration layer into our application. Instead of showing the entire file, I’ll just show the changes.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
diff --git a/main.py b/main.py
index 47722cd..9c5eab9 100644
--- a/main.py
+++ b/main.py
@@ -1,7 +1,28 @@
-from fastapi import FastAPI
+from fastapi import FastAPI, Depends, HTTPException
+from sqlalchemy.orm import Session
+
+from fastauth import models, schemas, crud
+from fastauth.database import engine, SessionLocal
+
+models.Base.metadata.create_all(bind=engine)
+
+def get_db():
+    db = None
+    try:
+        db = SessionLocal()
+        yield db
+    finally:
+        db.close()
 
 app = FastAPI()
 
 @app.get("/ping")
 async def ping():
     return {'msg': 'pong'}
+
+@app.post("/users/register", status_code=201, response_model=schemas.UserInfo)
+def register_user(user: schemas.UserCreate, db: Session = Depends(get_db)):
+    db_user = crud.get_user_by_username(db, username=user.username)
+    if db_user:
+        raise HTTPException(status_code=409, detail="Username already registered")
+    return crud.create_user(db=db, user=user)

Let’s jump straight to line 29 and start from there. Not a single bit is extra here. Every token has some significance.

  • On line 29, we have the route /users/register, for which we define a handler on the next line. On success, it will respond with 201.
  • Do you remember I told you that I’ll poke you when dealing with schemas? You can see them in use on lines 29 and 30. When response_model is set to schemas.UserInfo, it means that on success, a POST to /users/register responds with the fields mentioned in schemas.UserInfo. Similarly, the input to register_user() is user, i.e. a schemas.UserCreate. This means that /users/register expects every field from this schema in the request body, and otherwise responds with a 422 validation error.
  • register_user also depends on db, a database session that FastAPI obtains from get_db for each request.
  • On lines 31-34 we simply check whether the specified user already exists in the database. If so, we respond with a 409. Otherwise we go ahead and insert a row in the user_info table with the crud.create_user function we defined.
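
The get_db function in the diff is a generator, and the try/finally around the yield is what guarantees the session gets closed. Here is a simplified sketch of that life cycle with a fake session class. FakeSession is a stand-in I made up, and driving the generator by hand with next() and close() only approximates what FastAPI does internally.

```python
class FakeSession:
    """Stand-in for a SQLAlchemy session; it only tracks whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def get_db():
    db = FakeSession()
    try:
        yield db    # the request handler runs while the generator is paused here
    finally:
        db.close()  # runs when the generator is resumed or closed


gen = get_db()
db = next(gen)                         # roughly: before the handler is called
was_open_during_request = not db.closed
gen.close()                            # roughly: after the response is produced

print(was_open_during_request, db.closed)
```

The session is open while the handler works with it and closed afterwards, even if the handler raised an exception in between.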

Related reading: https://swagger.io/resources/articles/best-practices-in-api-design/

Test /users/register

The first time, all the tests should pass, provided you are running locally and your database instance is running.

$ pytest
================================ test session starts ==================================
platform linux -- Python 3.7.10, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /efs/repos/fastauth
plugins: anyio-3.3.4
collected 5 items                                                                      

tests/test_main.py .                                                             [ 20%]
tests/test_users.py ....                                                         [100%]

================================= 5 passed in 1.71s ===================================

The second time, one of the tests fails:

$ pytest
================================================== test session starts ==================================================
platform linux -- Python 3.7.10, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /efs/repos/fastauth
plugins: anyio-3.3.4
collected 5 items                                                                                                       

tests/test_main.py .                                                                                              [ 20%]
tests/test_users.py ...F                                                                                          [100%]

======================================================= FAILURES ========================================================
__________________________ TestUserRegistration.test_post_request_with_proper_body_returns_201 __________________________

self = <tests.test_users.TestUserRegistration object at 0x7fa6570ef8d0>

    def test_post_request_with_proper_body_returns_201(self):
        response = client.post(
            "/users/register",
            json={"username": "santosh", "password": "sntsh", "fullname": "Santosh Kumar"}
        )
>       assert response.status_code == 201
E       assert 409 == 201
E        +  where 409 = <Response [409]>.status_code

tests/test_users.py:39: AssertionError
================================================ short test summary info ================================================
FAILED tests/test_users.py::TestUserRegistration::test_post_request_with_proper_body_returns_201 - assert 409 == 201
============================================== 1 failed, 4 passed in 1.19s ==============================================

There are a lot of things going on here. I’ll leave this as a suspense.

Conclusion

In the last post we set up our environment. In this post we implemented our user registration system. But unfortunately our tests only pass the first time. In the next post I will discuss why this happens and what best practices we can apply to fix the failing test. Until then, have a nice weekend.

If you liked this post, please share it with your network. Subscribe to the newsletter below for similar news.

WRITTEN BY
Santosh Kumar
Santosh is a Software Developer currently working with NuNet as a Full Stack Developer.