I needed a tool that extracts text from documents and splits it into chunks, and I needed it built in a couple of hours, not days. Given that constraint, the choice was clear: Python with FastAPI.

Why Python?

Robust Libraries: The Python community excels at providing libraries for data manipulation, which makes the language well suited to handling a variety of document formats. In practice this means minimal code of my own, abundant community examples, and thorough documentation. Adding support for a new file type is usually a matter of searching for the right library and wiring it in.
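To illustrate what "wiring in" a new file type can look like, here is a minimal sketch that uses only the standard library. The registry, the `register` decorator, and the function names are hypothetical and not taken from the actual tool; formats such as PDF or DOCX would need a third-party library, which is not shown here.

```python
from html.parser import HTMLParser
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical registry: file extension -> function that turns raw bytes into text.
EXTRACTORS: Dict[str, Callable[[bytes], str]] = {}


def register(*extensions: str):
    """Decorator that registers an extractor for one or more file extensions."""
    def decorator(func: Callable[[bytes], str]) -> Callable[[bytes], str]:
        for ext in extensions:
            EXTRACTORS[ext.lower()] = func
        return func
    return decorator


@register(".txt", ".md")
def extract_plain(data: bytes) -> str:
    # Plain text only needs decoding; undecodable bytes are replaced, not fatal.
    return data.decode("utf-8", errors="replace")


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML document.

    Simplification: script and style contents are collected too.
    """

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []

    def handle_data(self, data: str) -> None:
        stripped = data.strip()
        if stripped:
            self.parts.append(stripped)


@register(".html", ".htm")
def extract_html(data: bytes) -> str:
    parser = _TextCollector()
    parser.feed(data.decode("utf-8", errors="replace"))
    parser.close()
    return " ".join(parser.parts)


def extract_text(filename: str, data: bytes) -> str:
    """Pick an extractor based on the file extension and run it."""
    ext = Path(filename).suffix.lower()
    if ext not in EXTRACTORS:
        raise ValueError(f"unsupported file type: {ext or filename!r}")
    return EXTRACTORS[ext](data)
```

With this layout, supporting another format means writing one more function and decorating it with `@register(".ext")`; nothing else has to change.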

Why FastAPI?

Swagger Integration: Poorly documented API contracts can be a major headache. FastAPI generates an OpenAPI schema from your route definitions and serves interactive Swagger UI documentation (at /docs by default) as soon as the app is running. This simplifies testing and integration, improves the developer experience, and shortens the time to deployment.

Makefile

We’ve all experienced the friction of setting up and deploying services. Instead of trying to cover everything in a README, I prefer a Makefile whose targets handle setup, Docker tasks, and local deployment, so each step is a single command.

GitHub Actions

Ultimately, what I want is a Docker container with the latest code that I can pull and run on a small VM. GitHub Actions lets me build and store artifacts alongside the code, which streamlines deployment.

Delivering the App

To deploy, I clone a basic Debian VM that has Docker and the correct firewall setup. I pull the latest version of the app and run it, then expose my VM via a Cloudflare tunnel. The entire process takes about 10 minutes.