The right tool for the job
I needed a tool that extracts text, and chunks of that text, from documents, and I needed it built in a couple of hours, not days. Given that constraint, the choice was clear: Python with FastAPI.
Why Python?
Robust Libraries: The Python ecosystem has mature libraries for data manipulation and for parsing most common document formats. That means less code to write, plenty of community examples, and thorough documentation. Adding support for a new file type usually comes down to finding the right library with a quick search.
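To make the "new file type" point concrete, here is a minimal sketch of how extractors could be organized so that each format is one small function. It is not the code from this project: the names EXTRACTORS, register, and extract_text are hypothetical, and it uses only the standard library so it runs as-is. In practice, a PDF or DOCX extractor would wrap whichever third-party library you pick.

```python
from html.parser import HTMLParser
from pathlib import Path
from typing import Callable

# Hypothetical registry: file extension -> function that turns raw bytes into text.
EXTRACTORS: dict[str, Callable[[bytes], str]] = {}


def register(*extensions: str):
    """Decorator that registers an extractor for one or more file extensions."""
    def decorator(func: Callable[[bytes], str]) -> Callable[[bytes], str]:
        for ext in extensions:
            EXTRACTORS[ext.lower()] = func
        return func
    return decorator


@register(".txt", ".md")
def extract_plain(data: bytes) -> str:
    # Plain text only needs decoding; undecodable bytes are replaced, not fatal.
    return data.decode("utf-8", errors="replace")


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring the tags."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        if data.strip():
            self.parts.append(data.strip())


@register(".html", ".htm")
def extract_html(data: bytes) -> str:
    # Naive on purpose: this also keeps the contents of <script> and <style>.
    collector = _TextCollector()
    collector.feed(data.decode("utf-8", errors="replace"))
    collector.close()
    return " ".join(collector.parts)


def extract_text(filename: str, data: bytes) -> str:
    """Pick an extractor based on the file extension and run it."""
    ext = Path(filename).suffix.lower()
    try:
        extractor = EXTRACTORS[ext]
    except KeyError:
        raise ValueError(f"Unsupported file type: {ext or filename!r}") from None
    return extractor(data)
```

With this layout, supporting another format means adding one decorated function; nothing else in the service has to change.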
Why FastAPI?
Swagger Integration: Poorly documented API contracts are a major headache. With FastAPI, as soon as your API is running, interactive documentation is generated from your code and served alongside it. That makes the API easier to test and integrate against, and shortens the path to deployment.
Makefile
We’ve all experienced the friction of setting up and deploying services. Instead of trying to cover everything in a README, I prefer using a Makefile. It wraps setup, Docker tasks, and local deployment in short, repeatable commands, which removes most of that friction.
GitHub Actions
Ultimately, what I want is a Docker container with the latest code that I can pull and run on a small VM. GitHub Actions lets me build and store artifacts alongside the code, which streamlines the deployment process.
Delivering the App
To deploy, I clone a basic Debian VM that has Docker and the correct firewall setup. I pull the latest version of the app and run it, then expose my VM via a Cloudflare tunnel. The entire process takes about 10 minutes.