Environment

To make our research reproducible, we need a controlled research environment. This is as true for data science research as it is in the laboratory. For a programming environment to be controlled and reproducible, we need to specify which version of Python and which versions of any packages we are using. We do this by setting up a virtual environment and using a package manager.

A virtual environment is an isolated installation of Python with its own interpreter and its own set of packages. It is a clean, project-specific workspace where we can install packages without affecting other projects or the system-wide Python. Because the environment is separate and small, it is also easy to record a snapshot of it and recreate it later, which helps reproducibility. Python has a built-in virtual environment manager called venv.
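As a minimal sketch of what a virtual environment manager does, the standard-library venv module can also be called from Python. The helper name create_environment and the temporary directory are assumptions made for illustration only; in a real project the environment would live inside the project folder.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir: Path) -> Path:
    """Create a virtual environment at env_dir and return the path to its config file."""
    # with_pip=False keeps the example fast and offline;
    # pass with_pip=True to bootstrap pip into the environment as well.
    venv.create(env_dir, with_pip=False)
    return env_dir / "pyvenv.cfg"


# Use a throwaway directory so the example does not touch a real project.
with tempfile.TemporaryDirectory() as tmp:
    config = create_environment(Path(tmp) / ".venv")
    # pyvenv.cfg records which base Python installation the environment was created from.
    print(config.read_text())
```

The pyvenv.cfg file is what marks the directory as a virtual environment: it points back to the base Python installation, while packages installed later go into the environment's own directory.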

Packages in Python are pieces of code written by others and (often) published on the Python Package Index (PyPI), Python's central package repository (similar to CRAN for R). A package manager is used to install, update, and remove packages in a Python installation. Python has a built-in package manager called pip.
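To make "a specification of the Python version and packages" concrete, here is a small sketch that lists what is installed in the current environment using the standard-library importlib.metadata module. The function name snapshot and the shape of the returned dictionary are assumptions for illustration; package managers keep this kind of record in their own file formats.

```python
import platform
from importlib import metadata


def snapshot() -> dict:
    """Return the Python version and the installed packages as 'name==version' strings."""
    packages = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        # Skip entries with incomplete metadata rather than guessing.
        if name and version:
            packages.append(f"{name}=={version}")
    return {"python": platform.python_version(), "packages": sorted(packages)}


if __name__ == "__main__":
    snap = snapshot()
    print("Python", snap["python"])
    for line in snap["packages"]:
        print(line)
```

Running this in two different environments and comparing the output is a quick way to see whether they really contain the same package versions.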

uv is a relatively new and very fast virtual environment and package manager for Python. It can be used as a drop-in replacement for both venv and pip, combining both tools into one while also providing better performance and dependency resolution.

To set up a new project, we use uv init:

uv init

This creates the project files, including a pyproject.toml file that records the project's dependencies. The virtual environment itself is created in the .venv directory inside the project the first time we add a package or run a command, and any packages are installed there. However, uv is smart enough not to download the same package version multiple times across environments; instead it links the packages into each environment from a central cache.

To install a package we use uv add ..., which records the package as a dependency in pyproject.toml and installs it into the environment:

uv add package-name

And finally, to run commands inside our new environment we use uv run ...:

uv run python my-script.py
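To check that a script really runs inside the project environment, it can inspect its own interpreter. The following is a sketch of what a hypothetical my-script.py could contain; the function name in_virtual_environment is an assumption for illustration.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("in virtual environment:", in_virtual_environment())
```

When started through uv run, the interpreter path printed by such a script should point into the project's .venv directory.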

Another popular virtual environment and package manager, especially for data science tasks, is Conda (Anaconda). Read more about it in the FAQ.