Python Virtual Environments: Why and When Should You Use Them?

Use virtual environments. Almost always.

You can often hear that beginners use their system installation of Python, which is wrong. Heresy. You should never use your system installation, as you can do a lot of harm. So, don’t. Do. That.

Is that true? Should we never use the Python installed in our system? Or should we use what they call virtual environments? This article discusses when one should use virtual environments and whether this “when” means “always.” Before discussing this aspect, I will have to introduce virtual environments, so that all of you grasp the basic idea behind them. The truth is, anyone using Python for more advanced tasks than what a calculator offers should know how to use virtual environments.

I remember my first attempts to use virtual environments. They were… they were depressing, both my attempts and my virtual environments. After some time, I learned that this was because of the IDE I used. It does not matter which IDE it was. Perhaps it was my mistake, perhaps there was something wrong with the IDE; it does not matter. What matters is that I was unable to solve the issue myself. I decided to check out Visual Studio Code, and all my problems disappeared, just like that. Everything worked as it should.

In this article, I will not discuss the various ways of creating virtual environments and managing dependencies. Instead, I will explain why we should use virtual environments and when. I will also discuss some essential aspects of them and of using them.

To make my point, however, I will have to use something, and I will use venv. It’s a popular and quite simple tool for creating virtual environments, which I find efficient enough in most situations. When I was starting my Python adventure, I did have a couple of depressing moments. After some time, I appreciated venv’s simplicity. I could have appreciated it earlier, but I lacked a good resource on virtual environments. This is the main reason why I am writing this article. I hope it will save many beginners from such moments of depression and discouragement. I also believe that most beginners should appreciate venv and its simplicity, the way I did. That’s why we will use it in this article.

After reading this article, even beginner data scientists and Python enthusiasts should know what virtual environments are, why they should use them, and how to do it.

What are virtual environments?

According to the documentation of venv,

A virtual environment is created on top of an existing Python installation, known as the virtual environment’s “base” Python, and may optionally be isolated from the packages in the base environment, so only those explicitly installed in the virtual environment are available.

Let’s translate it into a beginner’s language. Imagine you have installed Python on your computer. This is “an existing Python installation,” and it’s installed in your operating system. It’s also your system installation, which means that when you run the python command in the shell, this very installation will open (on Linux, you may still need to run python3 to distinguish it from Python 2). For the sake of simplicity, let us assume you do not have any virtual environments on your system yet. You can use this existing Python installation to create a virtual environment. When you do so, it will become the “base” Python for your virtual environment.

Note: When you have several versions of Python installed on your machine, the one that opens after the python command is the system Python. Remember that you can use any of these versions to create a virtual environment. For the sake of simplicity, we will not use any of these additional versions in this article; instead, we will work with the system installation of Python.

Now imagine that you install add-on packages (from PyPI) in your system installation of Python. To do this, you can run the command below in the shell:

$ pip install makepackage perftester==0.4.0 easycheck==0.3.3

These three packages are just examples: makepackage is a Python package to create new Python packages, perftester is dedicated to performance testing of Python functions, and easycheck helps you write readable assertion-like checks inside your code (instead of the assert statement, which should not be used in production code). I use them as I’ve (co-)authored them.

These three packages are now available in your system installation of Python. So, if you run Python (we still assume there is no Python virtual environment on your machine), you will be able to import them.

An important note about package installation. Note that we installed the three packages in two ways: makepackage is installed without indicating its version; for perftester, we requested version 0.4.0 while for easycheck version 0.3.3. This means that whenever one installs these three packages that way, makepackage will be installed in the up-to-date (most recent) version, perftester in the 0.4.0 version, and easycheck in the 0.3.3 version.
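To make the difference between a pinned and an unpinned requirement concrete, here is a minimal Python sketch. The helper name split_requirement is hypothetical, and it handles only the bare name and name==version forms used in this article, not the full requirement syntax that pip accepts.

```python
from typing import Optional, Tuple


def split_requirement(requirement: str) -> Tuple[str, Optional[str]]:
    """Split 'name==version' into (name, version); version is None when unpinned."""
    name, separator, version = requirement.strip().partition("==")
    return name.strip(), (version.strip() if separator else None)


# The three requirements from the pip install command above.
for item in ["makepackage", "perftester==0.4.0", "easycheck==0.3.3"]:
    name, version = split_requirement(item)
    print(name, "->", version or "no pin, pip picks the version")
```

An unpinned entry is the one whose result can change over time, which is exactly why it matters in the story that follows.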

Imagine now that in your current project, say Project Before, you indeed need perftester==0.4.0 and easycheck==0.3.3. You finished the project, everything works fine. After some time, you start a second project, say Project After, and you need easycheck==0.5.0. Changes from 0.3.3 to 0.5.0 in easycheck were important, as the way messages are handled changed. We need to upgrade the package:

$ pip install --upgrade easycheck==0.5.0

You finish Project After, and all works fine. But after a month, you need to return to Project Before. You run the project, but it does not work the way it worked before you started Project After. Why?

Project Before’s application has changed its behavior because you changed the Python environment! What you changed is the version of easycheck. Such small changes can result in smaller or bigger changes in your projects, depending on various aspects of the code. In extreme cases (not necessarily with easycheck), the application can even stop running at all.

Since Project Before was ready, you cannot change its code for such unimportant reasons. Thus, you downgrade easycheck. After the downgrade, Project Before works just fine, the way it did before… But the next day your boss asks you to run the app from Project After, so you have to upgrade the package again; all is fine, again. Well, not all… Project Before’s application would not work… In the meantime, perftester has been updated; you update your version in the system installation. Project Before again… Project After… makepackage upgraded… Project Before, Project After, Project Before, makepackage, Project After, easycheck…

What a mess!

Despite working on only two projects, your life has turned into a nightmare. Why? What’s happened?

Did you notice that we operated in a single environment, consisting of the base Python, the one we installed in our system? This is why we had to downgrade and upgrade easycheck like crazy, every time we wanted to switch between the two projects.

There must be a better way!

Note that here we used one environment, and wouldn’t it be easier to have two environments, one for Project Before and another for Project After? Hmm… and still another for makepackage, as we want to use it to create packages for new projects? And if we start another Python project, why can’t we use a fresh new environment…?

This is where virtual environments come into play.

Virtual environments: the essence

Let’s put aside the technicalities of virtual environments. If you’re a beginner, you do not need overly detailed knowledge about them; it is enough to know what they are, how to use them, and when. If you’re an advanced Python developer, you probably know all that and much more. If you do want or need to learn more about virtual environments, however, you will need to look for more advanced resources. If you know particularly good ones, please share them with us in the comments, and tell us what (in general, not in detail) one can learn from them about virtual environments.

In this article, we’re discussing the practicalities of virtual environments. Now that you know what they are, I want you to understand what they offer and how you can utilize them. I also want to answer the question that the subtitle suggests: Why should you use virtual environments almost always?

You can consider a Python virtual environment as your project’s environment for developing Python code. The environment consists of

  • the Python in a particular version;
  • the standard library of this Python;
  • additionally installed Python packages, whether in specified versions or not.

The main element is, of course, Python. You can use any Python version; if you use venv, this version must be installed on your machine. The standard library, of course, comes with it. Remember that a virtual environment’s Python does not have to be the same as your system installation. You can have several virtual environments, each with a different Python version. In each virtual environment, you can install any packages (from PyPI or any other package registry, or from local files).

As you see, a virtual environment makes your environment almost independent. It’s not independent in terms of the operating system, as Docker containers are. But it’s independent in terms of Python and its dependencies.

Let’s create such virtual environments for the two projects above. Let’s assume you’re using the system installation of Python 3.9.5.

Environment for Project Before

$ mkdir project_before
$ python -m venv venv-before

Hmm… That’s it? Yes, that’s it, or rather that’s almost it. With these two lines, you have created a brand new virtual environment, with Python and the standard library. The environment is called venv-before, and you can see it has a dedicated folder named, not unexpectedly, venv-before. Its structure depends on whether you work on Linux or Windows. I’d recommend that you check what your virtual environment contains, as this can help you learn some details. You will find there a place for base Python, for the standard library, and for external packages.
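If you prefer to inspect a virtual environment from Python rather than from the file manager, the sketch below may help. It uses the standard library’s venv.create() function to build a throwaway environment in a temporary folder and lists what it contains. The names create_bare_environment and venv-demo are made up for this illustration, and pip is left out (with_pip=False) only to keep the example fast.

```python
import tempfile
import venv
from pathlib import Path


def create_bare_environment(parent: Path, name: str = "venv-demo") -> Path:
    """Create a virtual environment without pip and return its folder."""
    env_dir = parent / name
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_bare_environment(Path(tmp))
    # pyvenv.cfg records, among other things, where the base Python lives.
    print((env_dir / "pyvenv.cfg").read_text())
    # Top-level entries differ between Linux and Windows.
    for entry in sorted(env_dir.iterdir()):
        print(entry.name)
```

The temporary folder is deleted when the with block ends, so this does not leave anything behind on your disk.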

Our next step will be to install site packages. But first, we need to activate the environment. How to do this depends on the operating system:

------ Windows ------
> venv-before\Scripts\activate
------ Linux ------
$ source venv-before/bin/activate

From now on, the shell prompt should show that the environment is activated, unless the prompt is configured in a way that prevents this information from being shown. By default, the prompt shows this information as, e.g., (venv-before) at the beginning; below, you will see what it looks like.
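When the prompt does not tell you which environment is active, you can ask Python itself. The following is a small sketch; the function name in_virtual_environment is my own. It relies on the fact that inside a virtual environment sys.prefix points at the environment while sys.base_prefix points at the base Python. The VIRTUAL_ENV variable is set by the activate scripts, so treat it as a hint only: it is missing when the environment’s Python is run without activation.

```python
import os
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # In a virtual environment the two prefixes differ; in the base Python they match.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtual_environment())
# Set by the activate scripts; None when nothing has been activated in this shell.
print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV"))
```

Run it once with the system Python and once after activating venv-before to see the difference.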

Now it’s time to install the packages we need. The same command is used in Windows and Linux:

(venv-before) $ python -m pip install perftester==0.4.0 easycheck==0.3.3

This will install perftester in version 0.4.0 and easycheck in version 0.3.3. These were the requirements of Project Before. And that’s it! Your environment is ready to be used in Project Before.

Do remember that once you’re done with working on Project Before and want to switch between projects, you need to deactivate your virtual environment. You can do it from any location, using the deactivate command:

(venv-before) $ deactivate

Deactivation is sometimes done in the background. For instance, when you are in one environment and activate another one, the first one is automatically deactivated before the second one is activated.

This deactivation is essential. Without it, whatever you do, like installing a new package, would be done inside the venv-before virtual environment, so it would affect this very environment.

Whether or not you are a careful and organized developer, you should take precautions. One way of doing this is to create a requirements file, requirements.txt, and save the requirements (the dependencies you need) there:

# requirements.txt
easycheck==0.3.3
perftester==0.4.0

This file is called a requirements file. If you need to recreate a virtual environment or to install it on a different machine (e.g., all developers from the team should work using the same virtual environments), you can use the command below instead of installing the site packages manually:

(venv-before) $ python -m pip install -r requirements.txt
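A requirements file is also handy for checking whether the active environment still matches what the project expects. Below is a minimal sketch using importlib.metadata from the standard library (Python 3.8+). The helpers parse_pins and installed_version are hypothetical names, and the parser understands only name==version lines and # comments, which is all the file above contains.

```python
from importlib import metadata
from typing import Dict, Optional


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Collect name -> version for every 'name==version' line."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


requirements = """\
# requirements.txt
easycheck==0.3.3
perftester==0.4.0
"""

for name, wanted in parse_pins(requirements).items():
    found = installed_version(name)
    status = "ok" if found == wanted else "MISMATCH"
    print(f"{name}: wanted {wanted}, found {found} -> {status}")
```

In a real project you would read the text from requirements.txt instead of the inline string used here.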

We’ve covered the basics. There is much more to this topic: code packaging, pip-tools, makepackage, and other, usually more advanced, tools, such as poetry and Cookiecutter. But what we’ve learned until now should be enough for most projects at the basic and intermediate levels — and sometimes even advanced ones.

Sometimes you may run into problems with permissions. You will have to solve them; they are not necessarily related to Python, but rather to your operating system and your user’s permissions.

Environment for Project After

Now that we have the venv-before virtual environment, we can work on Project Before and use the resulting application. To develop Project After, however, we need to create its virtual environment. Let’s start in the root folder, where we want the project to be located.

(venv-before) $ deactivate
$ mkdir project-after
$ python -m venv venv-after
$ source venv-after/bin/activate
(venv-after) $ python -m pip install easycheck==0.5.0 perftester==0.4.0

And that’s it! We can now switch between the environments (by activating the one you want to use) and develop or run the application inside them. This is exactly what virtual environments exist for: to enable the developer/user to work in environments dedicated to particular projects.

We should now create a third environment, say, venv-makepackage. It would not be used in a project, but in order to create new Python packages. I will leave you with this exercise: do it yourself and check if the resulting virtual environment works fine. Remember that we do not want to use any particular version of makepackage, which basically means we will use its most recent version.

More on virtual environments

Above, we’ve covered the basics of virtual environments. Often, these basics are enough to develop Python projects. Before continuing, I suggest that you spend some time practicing these basics. This should help you feel the vibe of working with virtual environments. After a couple of projects, you should have no problems with this approach to development.

There is more to virtual environments, however. In what follows, I will discuss several important issues, though I will not dig too deep into them, as this would make the article far too long and complicated, something I want to avoid. I plan to cover some of these issues in my future articles.

Packaging

When I was a beginning Python developer, I used to develop Python applications by developing their code inside virtual environments — just the way this article describes. Often you don’t need anything else; sometimes, however, you may need more.

There are various other approaches that are more advanced than this basic one. Packaging is one of them — and actually, it’s one that I use these days in almost all my projects. Although it may seem complicated, packaging is efficient in many respects.

Here, I just want to mention packaging code, and I will write more about it in another article. If you want to learn more, you may read the documentation of the makepackage Python package, which I created in order to make packaging simpler. You can use it to create the structure of a Python package.

Dependencies

When you install a package in your virtual environment, it can be installed with or without dependencies. When it has its own dependencies, meaning that it requires other site packages to work, then pip install will, by default, install those dependencies.

You should remember this, because when you analyze the list of packages installed in your virtual environment, which you do with the pip list command, you will see the packages you installed, but also their dependencies, these dependencies’ dependencies, and so on. If you need to create another instance of the virtual environment (e.g., on a different computer), then for your environment to work properly, you may need to ensure that all these dependencies are in the same versions as in the original virtual environment.

There are methods to achieve this, such as pip freeze or poetry, some better than others. In the future, we will discuss some of them.
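To get a feeling for what such a full list looks like, here is a rough Python sketch that prints every installed distribution as name==version, using importlib.metadata from the standard library. It only imitates the general shape of a freeze listing and is not a replacement for pip freeze; the function name freeze_lines is my own.

```python
from importlib import metadata
from typing import List


def freeze_lines() -> List[str]:
    """Return 'name==version' for every distribution this interpreter can see."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip incomplete installations
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


for line in freeze_lines():
    print(line)
```

Run inside a virtual environment, the listing includes the packages you asked for as well as everything that was pulled in with them.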

Spoiling a virtual environment

Imagine that you develop your application inside a virtual environment. One day, you need to try a new site package; so, you install it inside the environment and check how your application works with it. Unfortunately, it does not work the way you expected, so you give up the idea of using it. A week later, you check another site package, and the story repeats, so it turns out to be of no use.

The virtual environment has started to look like a dump, with all those packages you don’t need, along with their dependencies… To put it simply, your virtual environment is spoiled!

To clean this mess up, you could uninstall the packages you installed. You do this using the same pip command you used to install the packages, but replacing install with uninstall. So, for instance, to remove perftester, you can use the following command:

(venv-before) $ python -m pip uninstall perftester

No need to provide the package version here, as only one version of a package can be installed in a virtual environment.

There is, or rather can be, one problem with this. If the packages you’ve removed have their own dependencies, these dependencies were added to the virtual environment. Indeed, perftester does have a dependency, that is, memory_profiler; what’s more, memory_profiler has its own dependency, psutil, and it was installed too. When you removed perftester, you removed neither its dependency (memory_profiler) nor that dependency’s own dependency (psutil)!
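The leftover problem can be illustrated with a few lines of Python. This is a toy sketch: the dependency map below is written by hand from the chain described above (perftester, memory_profiler, psutil) and is not read from real package metadata, and the function names still_needed and leftovers are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def still_needed(wanted: Iterable[str], depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return the wanted packages plus everything they depend on, transitively."""
    needed: Set[str] = set()
    stack = list(wanted)
    while stack:
        package = stack.pop()
        if package not in needed:
            needed.add(package)
            stack.extend(depends_on.get(package, []))
    return needed


def leftovers(installed: Iterable[str], wanted: Iterable[str],
              depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return installed packages that nothing we want requires any more."""
    return set(installed) - still_needed(wanted, depends_on)


# Hand-written map following the chain described in the text.
depends_on = {"perftester": ["memory_profiler"], "memory_profiler": ["psutil"]}
# State of the environment after perftester alone has been uninstalled.
installed = ["easycheck", "memory_profiler", "psutil"]

print(sorted(leftovers(installed, wanted=["easycheck"], depends_on=depends_on)))
# prints ['memory_profiler', 'psutil']
```

Both dependencies stay behind although the only package that needed them is gone.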

In the end, your environment does become a dump. I did not mention other possible problems, which happen from time to time. For instance, a package can have a dependency that needs to be in a different version than the one we have already installed and need in our project. Such a conflict can be difficult to resolve.

Below, I explain how to proceed when you have reached this very point of development: your virtual environment is spoiled. In my opinion, prevention is the best approach here, representing the better-safe-than-sorry approach. Virtual environments are not very light, but are not that heavy, either. In WSL, a new virtual environment, without any site packages, takes about 7.3 MB of space; in Windows, it’s 15.3 MB. For example, installing pandas and its dependencies increases the size to 138 and 124 MB, respectively.

The better-safe-than-sorry solution in the case of working with virtual environments is more or less as follows. When you want to check how a new version of the environment works (e.g., with a new site package), create a new environment and work in it. Once you decide to use this package, install it in the actual virtual environment. If not, simply remove this environment.

That way, you will always have a working and up-to-date virtual environment. If you need to check a new idea that requires changes to the environment, do it in a new environment and:

  • If the idea is implemented, you can make these changes in the main environment and remove the other virtual environment; or you can treat this other environment as the main one.
  • If the idea is rejected, remove the other virtual environment.

Reinstallation

As it follows from above, there is no need to become attached to a virtual environment. You can easily remove one and create another. If your virtual environment has been spoiled somehow, simply remove it and create a new one.

This requires you to keep track of all the dependencies you need to have installed in the virtual environment. Hence you need a good requirements file so that you can easily create a new virtual environment from scratch.
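Rebuilding an environment from scratch can also be done from Python. The sketch below uses venv.create() with clear=True, which empties an existing environment folder before creating the new one. The function name recreate_environment and the folder name venv-scratch are made up for this example, and pip is skipped (with_pip=False) only to keep the demonstration quick.

```python
import tempfile
import venv
from pathlib import Path


def recreate_environment(env_dir: Path, with_pip: bool = True) -> Path:
    """Empty env_dir if it already exists and build a fresh environment there."""
    venv.create(env_dir, clear=True, with_pip=with_pip)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "venv-scratch"
    recreate_environment(env_dir, with_pip=False)  # first creation
    # Simulate clutter left behind by old experiments.
    (env_dir / "leftover.txt").write_text("junk")
    recreate_environment(env_dir, with_pip=False)  # rebuilt from scratch
    print("leftover still there:", (env_dir / "leftover.txt").exists())
```

After a real rebuild (with pip included), you would activate the environment and install the dependencies from the requirements file, as shown earlier.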

Never be afraid of creating a new virtual environment and removing old ones. When I was a beginner, I was spending far too much time on the attempts to fix my virtual environments. It really makes no sense. Virtual environments are not part of the solution — the information on how to create a virtual environment is; that is, the Python version and the dependencies.

Simply put, virtual environments are just tools, and you can use several of them at the same time. There is no need to be attached to any of them. Therefore, you should create a new virtual environment every now and then, better more often than less often; it enables you to check if the solution still works as expected.

What if the code has worked just fine in the previous virtual environment but does not work in a new one? Most likely, this means that we have incorrectly documented the creation of the virtual environment — there is a bug somewhere. In order to discover this bug, you can compare both environments, because it is possible that the previous one contains a dependency that the new environment misses. It is also possible that there is a difference in the versions of some dependency/dependencies.
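Such a comparison is easy to automate once you have the package lists of both environments as name-to-version mappings. Here is a minimal sketch; compare_environments is a hypothetical helper, and the two dictionaries are made-up examples (the psutil version in particular is invented for illustration).

```python
from typing import Dict, List, Tuple


def compare_environments(
    old: Dict[str, str], new: Dict[str, str]
) -> Tuple[List[str], List[str], Dict[str, Tuple[str, str]]]:
    """Return (missing in new, extra in new, packages whose version changed)."""
    missing = sorted(set(old) - set(new))
    extra = sorted(set(new) - set(old))
    changed = {
        name: (old[name], new[name])
        for name in sorted(set(old) & set(new))
        if old[name] != new[name]
    }
    return missing, extra, changed


# Made-up package lists of the previous and the freshly created environment.
old_env = {"easycheck": "0.3.3", "perftester": "0.4.0", "psutil": "5.9.0"}
new_env = {"easycheck": "0.5.0", "perftester": "0.4.0"}

missing, extra, changed = compare_environments(old_env, new_env)
print("missing in new:", missing)
print("extra in new:", extra)
print("version changed:", changed)
```

A missing package points to an undocumented dependency, while a changed version points to a pin that is absent from, or wrong in, the requirements file.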

Such mistakes happen, and this is why it’s advisable to replace, from time to time, the working virtual environment with a new one.

Using the system installation of Python

Above, we used the system installation of Python for only one purpose: to create virtual environments. But is there any other use for system Python?

I have encountered a radical approach according to which this is its only application, and whatever we do in Python, we should do this inside a dedicated virtual environment. The same radical approach states that you should not install any additional packages outside of virtual environments.

Is it the case indeed? Should we limit ourselves that much? Should we be so careful and not use the Python system installation, except for creating virtual environments?

Honestly, I do not entirely agree with this claim. While I totally agree that each project should have its own dedicated virtual environment, I don’t think we must avoid system Python when we need to do something minor. I do use the Python system installation from time to time, but never for anything important.

For example, sometimes I want to calculate something or see how to do something in Python. Should I really create a new virtual environment for this purpose? Wouldn’t that be overkill? Surely, when I want to check something as part of a given project, I use the project’s virtual environment. But let’s say I want to recall how to create custom exception classes with .__init__() and .__str__() methods; or how to use multiprocessing.Pool; or how .__repr__() works in classes; or check how dataclasses work; or check how much faster {} is than dict(), using timeit; and so on… Should I create a virtual environment every time I want to do something like that? I don’t think so. Instead, I simply use the system installation.
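As an illustration of such a quick check, here is what the {} versus dict() comparison could look like with timeit. It needs nothing beyond the standard library, which is why running it with the system Python is harmless. The helper time_expression is my own name, and the actual numbers will vary by machine and Python version, so none are shown.

```python
import timeit


def time_expression(expression: str, number: int = 1_000_000) -> float:
    """Return the total time in seconds needed to evaluate expression `number` times."""
    return timeit.timeit(expression, number=number)


literal = time_expression("{}")
call = time_expression("dict()")
print(f"{{}}: {literal:.3f} s, dict(): {call:.3f} s, ratio: {call / literal:.1f}x")
```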

You could actually create a virtual environment and use it for such purposes, instead of the system installation. You could consider such an approach (that is, using a virtual environment for checking various things) an extra precaution, slightly safer than using the system installation.

Installing site packages in the Python system installation is another matter. Be careful with that. Let’s say you want to install pandas so that you can fire up an interactive session and conduct exploratory analysis of a dataset. In my opinion, you should do this in a virtual environment created specifically for data analysis; you can install there other analytical packages, including ones that enable you to visualize data. After all, pandas or numpy or whatever you’ve installed there can be updated from time to time, and so it’s safer to update them in the virtual environment than in the system installation.

In other words, I do not believe that you should never ever use the Python system installation, under any circumstances; and that whatever we do in Python, we must do it in a virtual environment, without any exceptions. Sometimes we can safely use the system Python. Why would you do that? For simplicity, that’s all. However, if whatever you want to do requires site package(s) to be installed, that’s another matter — it will be much safer to set up a new virtual environment, install the package(s) there, and do whatever you need inside it.

Conclusions

You can use various approaches to organize your software projects. Python’s basic tool is virtual environments. Despite being basic, virtual environments offer quite a lot. Once you know how to use them, you will be able to organize your work in Python without bigger problems. With time, your programming skills will improve, and you will learn more tools and techniques of organizing your projects.

First things first, however. When learning Python, learn working in virtual environments. Do not postpone it; learn it from day one.

While virtual environments is a general term describing a concept of the same name, there are various tools you can use. Some time ago, virtualenv was perhaps the most common tool. Now, it’s venv. But you can also use conda environments and other tools.

In this article, I described venv. The reasons were twofold. First, I consider it a very good tool for beginners. When I was a beginner, venv helped me a lot. When I started using it, my Python development accelerated. I did not start using venv from day one, and I regret it. But I started soon enough, and I am happy about it.

Second, I find venv my tool. I consider it lightweight and simple enough. When working on my open-source packages, I use virtual environments, and I create them using venv. When I work on software projects in the industry, I almost always use venv, too. When it’s my decision, it’s always, though I combine venv with packaging. Sometimes a DevOps engineer in a project decides we should use another tool, like development containers; I am fine with that. But such tools are far more complicated, and I would not suggest them to beginners or even intermediate Pythonistas.

To summarize, my suggestion is the following. When you learn Python or work on your projects, use virtual environments. To create them, use venv. I do so even in data science projects, even though conda is a common tool to manage virtual environments among data scientists. I like, however, to have full control of the environments in which I code, and venv enables me to have it.

I am not saying other tools are bad or that you should not use them. But I am saying that I like using venv and consider it a good tool for beginners. The time will come for you to learn and use other tools, but take your time; don’t rush.

Thanks for reading this text. You don’t have to agree with everything I wrote. Maybe you don’t like venv or have had some unpleasant experiences with it? If so, please tell us about that in the comments. Do so also when you prefer another tool, but don’t forget to tell us why.

Python Virtual Environments: Why and When Should You Use Them? was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.


from Towards Data Science - Medium
https://towardsdatascience.com/python-virtual-environments-why-and-when-should-you-use-them-be57b0c0323d?source=rss----7f60cf5620c9---4