Deploying FastKafka using Docker
Building a Docker Image
To build a Docker image for a FastKafka project, we need the following items:
- A library that is built using FastKafka.
- A file in which the requirements are specified. This could be a requirements.txt file, a setup.py file, or even a wheel file.
- A Dockerfile to build an image that will include the two files mentioned above.
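With the file names used in the rest of this guide, the finished project directory will look like this:

.
├── application.py
├── requirements.txt
└── Dockerfile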
Creating FastKafka Code
Let's create a FastKafka-based application and write it to the application.py file, based on the tutorial.
# content of the "application.py" file
from contextlib import asynccontextmanager

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

from fastkafka import FastKafka

ml_models = {}


@asynccontextmanager
async def lifespan(app: FastKafka):
    # Load the ML model
    X, y = load_iris(return_X_y=True)
    ml_models["iris_predictor"] = LogisticRegression(random_state=0, max_iter=500).fit(
        X, y
    )
    yield
    # Clean up the ML models and release the resources
    ml_models.clear()


from pydantic import BaseModel, NonNegativeFloat, Field


class IrisInputData(BaseModel):
    sepal_length: NonNegativeFloat = Field(
        ..., example=0.5, description="Sepal length in cm"
    )
    sepal_width: NonNegativeFloat = Field(
        ..., example=0.5, description="Sepal width in cm"
    )
    petal_length: NonNegativeFloat = Field(
        ..., example=0.5, description="Petal length in cm"
    )
    petal_width: NonNegativeFloat = Field(
        ..., example=0.5, description="Petal width in cm"
    )


class IrisPrediction(BaseModel):
    species: str = Field(..., example="setosa", description="Predicted species")


kafka_brokers = {
    "localhost": {
        "url": "localhost",
        "description": "local development kafka broker",
        "port": 9092,
    },
    "production": {
        "url": "kafka.airt.ai",
        "description": "production kafka broker",
        "port": 9092,
        "protocol": "kafka-secure",
        "security": {"type": "plain"},
    },
}

kafka_app = FastKafka(
    title="Iris predictions",
    kafka_brokers=kafka_brokers,
    lifespan=lifespan,
)


@kafka_app.consumes(topic="input_data", auto_offset_reset="latest")
async def on_input_data(msg: IrisInputData):
    species_class = ml_models["iris_predictor"].predict(
        [[msg.sepal_length, msg.sepal_width, msg.petal_length, msg.petal_width]]
    )[0]
    await to_predictions(species_class)


@kafka_app.produces(topic="predictions")
async def to_predictions(species_class: int) -> IrisPrediction:
    iris_species = ["setosa", "versicolor", "virginica"]
    prediction = IrisPrediction(species=iris_species[species_class])
    return prediction
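If you want to check the application before building the image, FastKafka provides a Tester class for local testing. Below is a minimal sketch based on the testing examples in the FastKafka docs; the test file name is ours, and API details may differ between versions:

# content of a hypothetical "test_application.py" file
import asyncio

from fastkafka.testing import Tester

from application import IrisInputData, IrisPrediction, kafka_app


async def run_test():
    msg = IrisInputData(
        sepal_length=0.1, sepal_width=0.2, petal_length=0.3, petal_width=0.4
    )
    # Tester wraps the app and mirrors its consumers/producers with mocks
    async with Tester(kafka_app) as tester:
        # Send an IrisInputData message to the "input_data" topic
        await tester.to_input_data(msg)
        # Assert the app produced an IrisPrediction to the "predictions" topic
        await tester.awaited_mocks.on_predictions.assert_awaited_with(
            IrisPrediction(species="setosa"), timeout=2
        )


asyncio.run(run_test())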
Creating requirements.txt file
The above code requires the fastkafka and scikit-learn packages, so we will add both to the requirements.txt file; you can add any additional requirements as well.

fastkafka>=0.3.0
scikit-learn
Here we are using a requirements.txt file to store the project's dependencies. However, other methods such as setup.py, pipenv, and wheel files can also be used: setup.py is commonly used for packaging and distributing Python modules, pipenv is a tool for managing virtual environments and package dependencies, and wheel files are built distributions of Python packages that can be installed with pip.
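For illustration, a minimal setup.py declaring the same dependencies might look like the sketch below (the package name and version are hypothetical):

# content of an illustrative "setup.py" file
from setuptools import setup

setup(
    name="iris-predictions",  # hypothetical package name
    version="0.1.0",
    py_modules=["application"],
    install_requires=["fastkafka>=0.3.0", "scikit-learn"],
)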
Creating Dockerfile
# (1)
FROM python:3.9-slim-bullseye
# (2)
WORKDIR /project
# (3)
COPY application.py requirements.txt /project/
# (4)
RUN pip install --no-cache-dir --upgrade -r /project/requirements.txt
# (5)
CMD ["fastkafka", "run", "--num-workers", "2", "--kafka-broker", "production", "application:kafka_app"]
1. Start from the official Python base image.
2. Set the current working directory to /project. This is where we'll put the requirements.txt and application.py files.
3. Copy the application.py and requirements.txt files into the /project directory.
4. Install the package dependencies from the requirements file. The --no-cache-dir option tells pip not to save the downloaded packages locally; that is only useful when pip will be run again to install the same packages, which is not the case when working with containers. The --upgrade option tells pip to upgrade the packages if they are already installed.
5. Set the command to run the fastkafka run command. CMD takes a list of strings, each of which is what you would type on the command line, separated by spaces. This command will be run from the current working directory, the same /project directory you set above with WORKDIR /project. We supply the additional parameters --num-workers and --kafka-broker to the run command and, finally, specify the location of our fastkafka application as a command argument.

To learn more about the fastkafka run command, please check the CLI docs.
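Before building the image, you can run the same command locally against the localhost broker defined in kafka_brokers (this assumes a Kafka broker is reachable on localhost:9092):

fastkafka run --num-workers 2 --kafka-broker localhost application:kafka_app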
Build the Docker Image
Now that all the files are in place, let's build the container image. Go to the project directory (where your Dockerfile and application.py are) and run the following command to build the image:
docker build -t fastkafka_project_image .
This command will create a Docker image named fastkafka_project_image with the latest tag.
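If you want a specific version tag instead of latest, pass it explicitly when building, e.g.:

docker build -t fastkafka_project_image:0.1.0 .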
That's it! You have now built a Docker image for your FastKafka project.
Start the Docker Container
Run a container based on the built image:
docker run -d --name fastkafka_project_container fastkafka_project_image
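The -d flag runs the container in the background. You can follow the application's logs, and stop and remove the container when you are done, with the standard Docker commands:

docker logs -f fastkafka_project_container
docker stop fastkafka_project_container
docker rm fastkafka_project_container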
Additional Security
Trivy is an open-source tool that scans Docker images for vulnerabilities. It can be integrated into your CI/CD pipeline to ensure that your images are secure and free from known vulnerabilities. Here's how you can use trivy to scan your fastkafka_project_image:
1. Install trivy on your local machine by following the instructions in the official trivy documentation.
2. Run the following command to scan your fastkafka_project_image:

   trivy image fastkafka_project_image

   This command will scan your fastkafka_project_image for any vulnerabilities and provide you with a report of its findings.
3. Fix any vulnerabilities identified by trivy. You can do this by updating the vulnerable package to a more secure version or by using a different package altogether.
4. Rebuild your fastkafka_project_image and repeat steps 2 and 3 until trivy reports no vulnerabilities.
By using trivy to scan your Docker images, you can ensure that your containers are secure and free from known vulnerabilities.
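In a CI/CD pipeline you will usually want the scan to fail the build when serious issues are found; trivy supports this with its --severity and --exit-code options, for example:

trivy image --severity HIGH,CRITICAL --exit-code 1 fastkafka_project_image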
Example repo
A FastKafka-based library that uses the above-mentioned Dockerfile to build a Docker image can be found here.