Open Source · 2024 · 7 min read

bt-flow: Deploying ML Models to Production in One Line

bt-flow grew out of a frustration I kept running into in my own research projects: every time I needed to bring a trained scikit-learn model to a production environment, I ended up writing the same boilerplate (model serialization, FastAPI endpoints, Pydantic schemas, error handling). These steps may look trivial, but for a researcher they are a real distraction: they pull your energy toward infrastructure and away from the model. Could I break this cycle by writing a library?

I designed the library backward, starting from what the user wants to see. The ideal interface should be as simple as `bt_flow.deploy(model)`. Behind that single line sit model introspection (feature names, input types, sklearn pipeline support), automatic Pydantic schema generation, FastAPI app instantiation, and service startup with uvicorn. To detect the model type (classifier, regressor, pipeline) dynamically, I wrote a dispatcher using Python metaclasses and the `__init_subclass__` mechanism.
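To make the dispatcher idea concrete, here is a minimal sketch of self-registering handlers built on `__init_subclass__` alone. It is not bt-flow's actual code: the names `ModelHandler`, `dispatch`, and `matches` are hypothetical, and the duck-typing checks (`steps`, `predict_proba`, `predict`) are an assumption about how the model kinds could be told apart.

```python
from typing import Any, ClassVar


class ModelHandler:
    """Base class. Every subclass registers itself when it is defined."""

    # Hypothetical registry, scanned in definition order (most specific first).
    registry: ClassVar[list] = []
    kind: ClassVar[str] = "generic"

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        ModelHandler.registry.append(cls)

    def __init__(self, model: Any) -> None:
        self.model = model

    @classmethod
    def matches(cls, model: Any) -> bool:
        raise NotImplementedError

    def endpoints(self) -> list:
        return ["/predict", "/health"]


class PipelineHandler(ModelHandler):
    kind = "pipeline"

    @classmethod
    def matches(cls, model: Any) -> bool:
        # sklearn Pipeline objects expose `steps`, a list of (name, estimator).
        return hasattr(model, "steps")


class ClassifierHandler(ModelHandler):
    kind = "classifier"

    @classmethod
    def matches(cls, model: Any) -> bool:
        # Assumption: only probabilistic classifiers expose predict_proba.
        return hasattr(model, "predict_proba")

    def endpoints(self) -> list:
        return ["/predict", "/predict_proba", "/health"]


class RegressorHandler(ModelHandler):
    kind = "regressor"

    @classmethod
    def matches(cls, model: Any) -> bool:
        # Fallback: anything else that can predict is treated as a regressor.
        return hasattr(model, "predict")


def dispatch(model: Any) -> ModelHandler:
    """Return the first registered handler whose `matches` accepts the model."""
    for handler_cls in ModelHandler.registry:
        if handler_cls.matches(model):
            return handler_cls(model)
    raise TypeError(f"unsupported model type: {type(model).__name__}")
```

Because registration happens at class-definition time, supporting a new model family only takes a new subclass, and `dispatch` itself never changes. A real implementation could rely on `sklearn.base.is_classifier` and `is_regressor` instead of attribute checks, since a classifier without `predict_proba` would be misread as a regressor here.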

The most critical part of the FastAPI integration was accurately reflecting, at runtime, the feature space the model expects. Obtaining the correct feature names meant unpacking nested transformers inside scikit-learn Pipeline objects and handling the cross-version behavioral differences of the `get_feature_names_out()` API. The `/predict`, `/predict_proba` and `/health` endpoints are created automatically and can be tested immediately through Swagger UI. All configuration can be overridden through optional parameters.
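The feature-name handling could look roughly like the sketch below. It is an illustration under stated assumptions, not bt-flow's implementation: the helper names `input_feature_names` and `transformed_feature_names` are hypothetical, and the code uses only duck typing on attributes that scikit-learn objects expose (`steps`, `feature_names_in_`, `n_features_in_`, `get_feature_names_out()`, and the legacy `get_feature_names()` from older releases), so it runs without importing scikit-learn.

```python
from typing import Any


def input_feature_names(model: Any) -> list:
    """Best-effort names of the raw input columns the model was fitted on."""
    # Check the outer object first, then the first step of each nested pipeline.
    candidates = [model]
    steps = getattr(model, "steps", None)
    while steps:
        first = steps[0][1]
        candidates.append(first)
        steps = getattr(first, "steps", None)

    # Prefer real column names (set when the model was fitted on a DataFrame).
    for obj in candidates:
        names = getattr(obj, "feature_names_in_", None)
        if names is not None:
            return [str(n) for n in names]

    # Otherwise fall back to positional placeholder names.
    for obj in candidates:
        count = getattr(obj, "n_features_in_", None)
        if count is not None:
            return [f"x{i}" for i in range(int(count))]

    raise ValueError("model exposes neither feature names nor a feature count")


def transformed_feature_names(transformer: Any, input_names: list) -> list:
    """Names of the columns a fitted transformer produces, across API versions."""
    getter = getattr(transformer, "get_feature_names_out", None)
    if callable(getter):
        try:
            return [str(n) for n in getter(input_names)]
        except AttributeError:
            pass  # e.g. a pipeline whose inner step lacks the method

    legacy = getattr(transformer, "get_feature_names", None)
    if callable(legacy):
        return [str(n) for n in legacy()]

    # Assumption: a step with no naming API maps columns one-to-one.
    return list(input_names)
```

The one-to-one fallback in the last line is the weakest assumption: a transformer that adds or drops columns without exposing either naming method would need special handling.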

Publishing to PyPI for the first time was both exciting and educational. It was my first encounter with pyproject.toml configuration, semantic versioning, and uploading packages with twine. The most valuable feedback came through GitHub Issues: users reported compatibility problems with transformer pipelines, which led to important fixes in v0.2. Handling the less technical side of open-source development (README clarity, example notebooks, release notes) taught me a great deal about the communication side of software development.

Tags

Python, FastAPI, MLOps, Open Source, PyPI, scikit-learn