Better Python project management starts now: A quick introduction to Rye

Published 2/17/2024 12:04:01 PM
Filed under Machine Learning

Have you ever found yourself tangled in the web of project management, trying to juggle different Python versions, packages, and environments, all while keeping your sanity intact? You're not alone. I've been there, and felt the frustration. But, fear not! I stumbled upon some game-changing tools and strategies that transformed my Python project management challenges into a breeze.

Join me as I dive into the world of Python project management, share the challenges I faced, and reveal how the right tools, like the all-in-one solution Rye, can help. Whether you're a seasoned developer or just starting out, this journey through Python project management is bound to offer you some valuable insights and tricks to up your game.

Let's get started!

The challenges of maintaining a Python project

Over the last few years I've worked with a number of data scientists on various projects. Along the way I've seen quite a few interesting problems break our Python projects, which made me realize how important it is to have a good tool to manage them.

Challenge 1: Using pip install is breaking your project

The first challenge with Python projects comes from its package management. Out of the box you can use pip to install packages on a project. Most data scientists I've met will set up a project like this: They create a new folder, and run pip install torch (the package name for PyTorch) to start a deep learning project. Which version of PyTorch? Well, the most recent version of course.

For a new project that's not a problem, because you want the latest version. However, for older projects it matters which version of PyTorch you install. In many cases the version of a package isn't documented.

When a new team member comes along and runs pip install torch they will install the latest version, which is often not the one the project was built with. The end result is a broken project. Usually a discussion follows to figure out which PyTorch version was actually used.
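If you still have a machine where the project works, one way to settle that discussion is to ask the environment which version it actually has. Here's a minimal sketch using importlib.metadata from the standard library. The installed_version helper is a name I made up for this example, and torch is only there because it's the package from the story above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it's missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "torch" is the package from the example; use any package name you like.
    print(installed_version("torch"))
```

Run it with the Python interpreter of the working project. It prints the version string, or None when the package isn't installed in that environment.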

Challenge 2: A different Python version for every project

As you can imagine, knowing which version of a package you used is important. And so is knowing which Python version you should use. Not all packages work on the latest version of Python. I usually end up with 3 to 5 versions of Python on my machine after a while. It's interesting to say the least.

What happens to me quite often is that I try to run an old project against a new version of Python, which breaks the code. It's irritating, because now I have to figure out which Python version to use.
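A cheap way to make this failure more obvious is to let the project check the interpreter before it does anything else. The sketch below is an illustration, not something from a specific project: REQUIRED_PYTHON and python_matches are names I picked, and 3.11 is just an example value.

```python
import sys
from typing import Sequence, Tuple

# Hypothetical requirement for this example: the project was built on 3.11.
REQUIRED_PYTHON: Tuple[int, int] = (3, 11)


def python_matches(required: Tuple[int, int], current: Sequence[int]) -> bool:
    """True when the major and minor version are the same."""
    return tuple(current[:2]) == tuple(required)


if __name__ == "__main__":
    if not python_matches(REQUIRED_PYTHON, sys.version_info):
        print(
            f"Expected Python {REQUIRED_PYTHON[0]}.{REQUIRED_PYTHON[1]}, "
            f"but this is Python {sys.version_info[0]}.{sys.version_info[1]}"
        )
```

It doesn't fix anything, but at least you get a clear message instead of a confusing stack trace.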

Using tools to improve your Python game

It may feel a little weird when you come from a .NET or Java background. Why are Python developers not using a package management solution? We are. Let me take you through some of the tooling that I've come across.

Using pyenv to maintain multiple versions of Python

Running multiple Python versions can be challenging if you do it by hand. You have to constantly update the PATH environment variable with the right information to ensure you have the right Python version available. A much better solution is to use something like pyenv.

A tool like pyenv lets you install and maintain multiple Python versions. For example, to install a new version of Python you can use pyenv install <version>. You can switch between various versions of Python based on which project you work on with pyenv local <version>.

I've used pyenv for quite some time, and it's made working with multiple Python versions a lot easier.

Using virtualenv to manage project specific packages

As much as pyenv helps with managing Python versions, it's not a complete solution. By default, Python packages are installed globally, in the package directory of the Python version you use. You'll need some way of isolating package versions per project or you'll end up with conflicting package versions when you work on multiple projects.

The best way to isolate Python packages is by using a virtual environment. You can create one using the command python -m venv .venv where the .venv argument is the name of the directory that holds the environment. To use the environment on macOS or Linux you'll need to run source .venv/bin/activate. On Windows the script is .venv\Scripts\activate instead. After that you can use pip install to install packages in the new environment.
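It's easy to forget the activation step and install packages globally anyway. If you're not sure whether an environment is active, you can ask Python itself. This is a small sketch that assumes the environment was created with python -m venv; the in_virtualenv name is mine.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("in a virtual environment:", in_virtualenv())
```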

Virtual environments are essential to make sure you can maintain a project over a longer period of time.

Using requirements.txt to collect dependencies

We're still not at the end of the list of Python project management problems. We can use different Python versions and isolate package versions, but we still haven't recorded which version of a package we need.

There are different solutions for maintaining information about requirements in a project. One of them is the use of requirements.txt. You can use the requirements.txt file to list dependencies of your project. It looks like this:

flask>=2.0

Each line in the file lists a package, and a version range that your project can work with. To install the dependencies you can run pip install -r requirements.txt.
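A range like flask>=2.0 still lets pip pick a newer release later on, while an exact pin with == doesn't. To see which lines in a requirements file leave room for surprises, here's a rough sketch. The unpinned helper and the sample lines are made up for illustration, and the parsing is deliberately simple: it handles plain package lines and doesn't try to cover everything pip accepts.

```python
import re
from typing import Iterable, List

# Package name, optional extras like [security], then the rest of the line.
_REQUIREMENT = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[[^\]]*\])?\s*(.*)$"
)


def unpinned(lines: Iterable[str]) -> List[str]:
    """Return the names of packages that aren't pinned with '=='."""
    names = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = _REQUIREMENT.match(line)
        if match is None:
            continue  # line doesn't start with a package name
        name, spec = match.groups()
        if not spec.startswith("=="):
            names.append(name)
    return names


if __name__ == "__main__":
    # Sample lines, invented for this example.
    sample = ["flask>=2.0", "torch==2.2.0", "# a comment", "requests"]
    print(unpinned(sample))
```

Anything this reports can resolve to a different version the next time someone runs pip install -r requirements.txt.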

Using a requirements.txt file makes the environment reproducible in combination with a virtual environment and a tool like pyenv. It's a lot of tooling to deal with, to be honest.

The ultimate solution: Rye

Wouldn't it be nice to have an all-in-one solution instead of running so many different tools? Meet Rye, a pretty new Python project management tool.

Rye does several things:

  • It can install and pin the Python versions in use for your projects
  • It maintains dependencies for your project, both for production and development environments
  • It isolates your project environment with a virtual Python environment

Let's take a look at how you can install and configure Rye on your machine.

Setting up Rye

Rye works on Windows, Mac, and Linux machines. I'm using Mac as my work environment, so for my machine I need to run the following command to install Rye:

curl -sSf https://rye-up.com/get | bash

The exact instructions vary by operating system. You can find the installation instructions on the Rye website.

After installation, you have a new command rye that you can use to initialize a project.

Initializing a new project

You initialize a new Rye project by calling rye init in an empty directory. You can also use Rye with existing projects. The rye init command can convert existing projects into Rye-based projects. To find out more, run rye init --help for all the options available.

After running rye init you'll have a src directory, and a pyproject.toml file. The pyproject.toml file contains information about the Python project and lists the dependencies. The code of your project is located in the src directory.

Pinning the Python version

When you install Rye it will install a default version of Python for you. You can change the Python version of your project by running rye pin <python-version>. For example, I use rye pin 3.11 because most of the packages I use don't support Python 3.12 yet.

To activate the new Python version, make sure you run rye sync. It will install the right Python version and set up the virtual environment for your project.

The rye sync command will play an important role in the next section when we cover how to manage dependencies.

Managing dependencies

Managing dependencies with Rye can be done with the command rye add <package>. To add a new dependency like PyTorch you can run rye add torch. This command adds the latest version of the package to your pyproject.toml file. You can ask for a specific version range too with rye add "torch>2.0". The quotes keep your shell from treating > as a redirect.

Adding a dependency doesn't install it in the project's virtual environment. To install all dependencies for a project you need to run rye sync.

You can of course also remove dependencies with rye remove <package>. A removed dependency isn't immediately removed. Running rye sync will remove it for you.

You may not want all your dependencies on a production environment. For example, a unit-test framework dependency doesn't belong in production. You can specify development dependencies by adding them like so: rye add --dev <package>.

When you run rye sync both production and development dependencies are installed. However, when you run rye build to package your project, only the production dependencies are included.

The dependency and Python management alone make Rye awesome, especially since Rye is super fast thanks to its native implementation in Rust. But there's more to Rye.

Rye comes with batteries included

Since Python is a dynamically typed, interpreted language, it comes with challenges. The code you write needs to be neatly formatted, and a typing mistake will break at runtime rather than during a build. If you want to verify your code before it gets to production you'll want to run a linter to catch these mistakes early.
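To show what "break at runtime" means in practice, here's a small made-up example. The greeting function contains a typo in a variable name, and Python doesn't complain until the line with the typo actually runs. A linter such as Ruff reports this kind of undefined name (rule F821) without running the code at all.

```python
def greeting(name=None):
    if name is None:
        # Typo: "defualt" is never defined. Python only notices when this
        # line runs, so the bug hides until someone calls greeting()
        # without a name.
        return "Hello, " + defualt
    return "Hello, " + name


if __name__ == "__main__":
    print(greeting("Rye"))  # this call works fine
    try:
        greeting()  # this call hits the line with the typo
    except NameError as error:
        print("Broke at runtime:", error)
```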

There are multiple options when it comes to linting in Python. The most popular I've found is Flake8. There is a faster option too: Ruff.

You don't have to install a linter yourself. Rye ships with Ruff, and rye lint runs it on your project. A neat trick that makes Rye feel solid.

In the same way you can automatically format your Python files with rye format, which runs Ruff's formatter on your project files.

Frequently asked questions

Rye works with all the popular editors I've tried: PyCharm, VSCode, and Neovim. It doesn't interfere with any of them, although none of them use the commands from Rye. If you don't mind a little console action, it's going to be fine.

Working with frameworks like Django

When I used Rye the first time, I found out that it automatically uses the src directory for your Python sources. That's not ideal if you're working with a framework like Django. Frameworks like Django don't use a src directory.

But fear not, you can initialize your project with rye init --virtual to omit the src folder. It works fine with Django too.

Publishing packages

Rye can publish packages to PyPI and other feeds. It includes a build command to create source distributions and Python wheels. You can push the distribution of your package to a package feed with the publish command.

Conclusion

Wrapping up our journey through the world of Python project management, we've covered the importance of managing your Python project with the right tools. We've seen how Rye can help you get the best experience.

With its ability to streamline our Python projects, from version pinning to dependency management and beyond, Rye offers a great solution for developers looking to simplify their workflow.

But, as with any tool, the real magic happens when it's in the hands of individuals like you. Have you had a chance to use Rye in your projects? What has been your experience? Or perhaps you've encountered challenges or discovered neat tricks that could benefit the community? Let me know in the comments. Share your thoughts, questions, or even a cool project you're working on. Cheers to our ongoing journey in coding!