- First things first: Basic toolbox
- Managing Python Versions: pyenv
- Managing Libraries: pip
- Managing Environments: pipenv
- Jupyter lab & virtual environments
Setting up your work environment from scratch is something you might do every three to five years, depending on how often you get a new machine. After that it’s mainly upgrading, but the core tools (git, brew etc.) are there, and your system variables are meant to work.
However, whenever we plan a Python-related workshop where we don’t know exactly what hardware to expect from the participants, or when we have a standalone app we want to share (e.g. an Electron app with a Python backend), the larger setup of a system becomes important.
In the end, if a similar environment is critical for a workshop and we don’t want to spend half a day getting the equipment running, my first choice would still be an online tool (e.g. Google’s Colaboratory, which is based on open source Jupyter Notebooks) or, if it’s about demonstrating an application, maybe a cloud service (e.g. Heroku) or AWS in combination with a dockerized solution.
That said, these approaches have their own disadvantages, and sometimes it’s easier to list the steps to recreate similar working environments and point to the caveats. The following workflow was tested on OS X Yosemite (version 10.10.5) and macOS Mojave (10.14.6) with no major differences as far as the installations are concerned. Real Python also published a comprehensive overview including Windows terminals (for command line tools) and Conda (a package and environment management system that is part of Anaconda, a Python distribution focused on scientific applications).
Getting a toolbox ...
Before we install anything new, it’s a safe step to check what we already have. A typical Mac comes with Python 2.7 (owned by macOS). That’s it. What we need, however, is ..
- Homebrew, a package manager for macOS; its website lists a curl statement you can copy and paste into your terminal
- via Homebrew we then get the latest version of Python 3 as well as pip3
$ brew install python3
- as just outlined, we get pip as a courtesy from python3; still, it’s good to know that pip is a package manager for Python, managing additional libraries and dependencies that are not distributed as part of the standard library
- git, a Version Control System
- shell configuration via ~/.bash_profile including aliases and PATH management
- for a controlled development environment, we also want pipenv for project-specific libraries and pyenv for project-specific Python versions
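To check what is already on a machine before installing anything, a small script can look up each tool on the PATH. This is only a minimal sketch: the command names are my reading of the list above (brew, git, python3, pip3, pyenv, pipenv), and the function name find_tools is made up for illustration.

```python
import shutil

# Command names assumed from the toolbox list above; adjust as needed.
TOOLS = ["brew", "git", "python3", "pip3", "pyenv", "pipenv"]


def find_tools(names):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(TOOLS).items():
        print(f"{name:8} {path or 'not found'}")
```

Anything reported as "not found" is either not installed or not reachable through the PATH set up in ~/.bash_profile.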
Managing Python Versions: pyenv
Managing Python versions becomes important when we want to run or further develop applications that were written with a different Python version than the one we have installed. This isn’t necessarily a problem for backwards-compatible minor versions, but changing from Python 2 to 3 can require a major rewrite.
How do we get pyenv, especially the latest version, which at the time of writing was 1.2.26 (see pyenv --version)? The first option, pip install pyenv, only gets you up to 1.2.21, which includes Python 3.7 but not 3.9.x. So we have to git clone the pyenv GitHub repository into /Users/username/.pyenv and add a few lines to .bash_profile as described in the README of the repo. If you don't need the very latest Python version, the pip install method is just fine.
Here are some key commands for pyenv:
# listing all possible installations including Anaconda, micropython, pypy, miniconda, jython etc.
$ pyenv install --list
# installing Python 3.9.4 and making it the global version
$ pyenv install 3.9.4
$ pyenv global 3.9.4
# listing all versions installed
$ pyenv versions
# checking where Python executables are located
$ type -a python
# results in something like ..
python is /Users/me/.pyenv/shims/python
python is /usr/local/bin/python
python is /usr/bin/python
Setting a version this way isn’t strictly necessary if we use pipenv as described in the next section. But we can also cd into a new folder and define a local Python version via pyenv local 3.7.10. This creates a file .python-version in that folder. When using pyenv, running the following in a terminal is paramount ...
# Add pyenv-virtualenv initializer to shell startup script
echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bash_profile
# Reload your profile
source ~/.bash_profile
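To make the role of the .python-version file a bit more tangible, here is a simplified sketch of how a local version can be looked up: start in the current folder and walk up through the parent folders until a .python-version file turns up. This is my own illustration, not pyenv's code; the real tool also considers other sources (such as an environment variable and the global version) that are left out here, and the name find_local_version is hypothetical.

```python
import tempfile
from pathlib import Path


def find_local_version(start):
    """Return (version, file) from the nearest .python-version, or None.

    Simplified: only the first entry of the file is used.
    """
    start = Path(start).resolve()
    for folder in [start, *start.parents]:
        candidate = folder / ".python-version"
        if candidate.is_file():
            entries = candidate.read_text().split()
            if entries:
                return entries[0], candidate
    return None


if __name__ == "__main__":
    # Throwaway project folder, mimicking what `pyenv local 3.7.10` leaves behind.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        nested = project / "src"
        nested.mkdir(parents=True)
        (project / ".python-version").write_text("3.7.10\n")
        print(find_local_version(nested))
```

The point of the sketch is that the file applies to the folder it lives in and to everything below it, which is why one file at the project root is enough.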
Managing Libraries: pip
Before we can talk about pipenv we need to look into pip. Pip is a recursive acronym for ‘Pip installs packages’ and connects to an online repository (PyPI, the Python Package Index). By default, packages are installed to the running Python installation's site-packages directory. So it can happen that after changing the Python version, we need to install the required packages again into the new site-packages folder.
# pip list -v ... list installed packages (verbose)
Package    Version Location                               Installer
---------- ------- -------------------------------------- ---------
numpy      1.20.2  /usr/local/lib/python3.9/site-packages pip
pip        21.1    /usr/local/lib/python3.9/site-packages pip
setuptools 51.0.0  /usr/local/lib/python3.9/site-packages
wheel      0.36.1  /usr/local/lib/python3.9/site-packages

# pip list -o ... list outdated packages (example taken from T-Test blog post)
Package      Version   Latest    Type
------------ --------- --------- -----
certifi      2020.11.8 2020.12.5 wheel
matplotlib   3.3.3     3.4.1     wheel
numpy        1.19.4    1.20.2    wheel
outdated     0.2.0     0.2.1     wheel
pandas       1.1.4     1.2.4     wheel
pingouin     0.3.8     0.3.11    sdist
pip          21.0.1    21.1      wheel
scikit-learn 0.23.2    0.24.1    wheel
An upgrade works like this:
pip install ipywidgets --upgrade
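Since packages land in the site-packages folder of whichever interpreter is running, it can help to ask that interpreter directly where it installs to and what it can see. The following is a rough sketch using only the standard library (Python 3.8 or newer assumed); the function names are made up for this example, and it is not a replacement for pip list.

```python
import sys
import sysconfig
from importlib import metadata


def site_packages_dir():
    """Folder where pure-Python packages are installed for this interpreter."""
    return sysconfig.get_paths()["purelib"]


def installed_packages():
    """Sorted (name, version) pairs of distributions visible to this interpreter."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with broken metadata
            pairs.add((name, dist.version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    print("Interpreter:  ", sys.executable)
    print("site-packages:", site_packages_dir())
    for name, version in installed_packages():
        print(f"{name:20} {version}")
```

Running this once with each Python version shows quickly why a package installed under one version is missing under another.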
Managing Environments: pipenv
Pipenv handles virtual environments and package dependencies. For a closer look into packages and modules have a look here. Unlike pip in combination with virtualenv, pipenv is a single tool that manages dependencies and creates isolated virtual environments. It also auto-updates the Pipfile and Pipfile.lock, explained in more detail below.
If we need a virtual environment with a specific Python version, we can get it through pipenv --python 3.6. If that need arises at a later stage, when we already have some additional packages installed, we can delete the environment with pipenv --rm and simply rebuild it with pipenv install, which will automatically pick up the information provided in the Pipfile.
If you have pyenv installed, Pipenv will ask you if you want to install a required version of Python if it’s not available yet.
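When juggling several environments it is easy to lose track of which interpreter is actually running. A quick check from within Python, sketched below, compares sys.prefix with sys.base_prefix; the two differ inside a virtual environment. This assumes the environment was created by venv or a reasonably recent virtualenv (which is what pipenv builds on); the helper names are hypothetical.

```python
import sys


def in_virtualenv():
    """True when the running interpreter lives inside a virtual environment."""
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


def describe_interpreter():
    """Collect the basics: executable path, version and virtualenv status."""
    return {
        "executable": sys.executable,
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        "virtualenv": in_virtualenv(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key:11} {value}")
```

Run inside pipenv shell (or via pipenv run python), the virtualenv entry should be True and the version should match the one requested with pipenv --python.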
# the following commands are largely self-explanatory ...
$ pipenv install pytest --dev
$ pipenv update [package-name]
$ pipenv uninstall [package-name]
Every change is also automatically reflected in the Pipfile and Pipfile.lock. The essential difference between the two files is that Pipfile describes a working project set-up (e.g. a library version should be more recent than 3.2.1, or could be any version, written as *). Pipfile.lock is more precise in the sense that it locks in a specific version of every library, i.e. the one currently installed.
# A possible Pipfile structure for a generic machine learning project
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]
pytest = "*"

[packages]
scikit-learn = "*"
pandas = "*"
plotly-express = "*"
numpy = "*"
plotly = "*"

[requires]
python_version = "3.8"
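To see the "locked" side of the story, we can peek into a Pipfile.lock, which is a JSON file with a "default" section for regular packages and a "develop" section for dev packages, each entry carrying an exact version such as "==1.20.2". The sketch below reads such pins from a heavily shortened, hypothetical sample (hashes omitted, the pytest version is invented for illustration); pinned_requirements is a made-up helper name.

```python
import json

# Hypothetical, heavily shortened Pipfile.lock content (hashes omitted).
SAMPLE_LOCK = """
{
  "_meta": {"requires": {"python_version": "3.8"}},
  "default": {
    "pandas": {"version": "==1.2.4"},
    "numpy": {"version": "==1.20.2"}
  },
  "develop": {
    "pytest": {"version": "==6.2.3"}
  }
}
"""


def pinned_requirements(lock_text, include_dev=False):
    """Return 'name==version' strings for the packages pinned in a lock file."""
    lock = json.loads(lock_text)
    sections = ["default"] + (["develop"] if include_dev else [])
    lines = []
    for section in sections:
        for name, entry in sorted(lock.get(section, {}).items()):
            # Entries without a version (e.g. VCS installs) keep just the name.
            lines.append(f"{name}{entry.get('version', '')}")
    return lines


if __name__ == "__main__":
    print(pinned_requirements(SAMPLE_LOCK, include_dev=True))
    # ['numpy==1.20.2', 'pandas==1.2.4', 'pytest==6.2.3']
```

Where the Pipfile says "*" for numpy, the lock file names exactly one version, which is what makes an environment reproducible on another machine.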
If we want to see what packages have been installed in our virtual environment, we can use pip list -v again or, if we want to see a tree structure,
Another useful command is pipenv check, which highlights vulnerabilities, suggesting a need to update affected packages.
# a possible output could be ...
Checking installed package safety...
39611: pyyaml <5.4 resolved (5.3.1 installed)!
A vulnerability was discovered in the PyYAML library in versions before 5.4, where it is susceptible to arbitrary code execution
Last but not least, the following will generate a requirements.txt file out of your Pipfile:
$ pipenv lock -r --dev > requirements.txt
Requirements files are often needed by app hosting platforms such as Heroku, but since Pipfiles are meant to replace requirements.txt, they are (almost always?) accepted as alternatives.
Jupyter lab and virtual environments (kernels)
Effective environment management saves time and allows developers to create an isolated software product such that collaborators or contributors can recreate your environment and run your code. This, of course, also applies to Jupyter notebooks. You can find a more extensive write-up on how to use Jupyter notebooks here or here.
Pipenv, as introduced above, provides a standardized way to install project dependencies as well as testing and development requirements. Jupyter Lab is mainly a browser-based, very interactive development environment, which you get via pip install jupyterlab and getting started via
The default starting folder is ’/tree/’. If you prefer a customized path, you need to go through the following steps:
jupyter notebook --generate-config
- this writes a file to /Users/username/.jupyter/jupyter_notebook_config.py
- cd into that folder so you can edit the config file
- search for the following line in the file: #c.NotebookApp.notebook_dir = '' and replace it with c.NotebookApp.notebook_dir = '/the/path/to/desired/folder/'
Next we want to reuse our pipenv configurations by installing the necessary kernel:
# 1st step
# cd into project folder and activate the virtual environment
pipenv shell
# 2nd step
pipenv install ipykernel
# 3rd step
# ml_scikit can be replaced by any name of your choosing
python -m ipykernel install --user --display-name ml_scikit --name ml_scikit
As a result, you should be able to run your notebook with that specific kernel (drop down menu – right upper corner).
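If the kernel doesn't show up in the menu, it helps to know what the install step actually produced: a folder named after the kernel that contains a kernel.json file with, among other things, a display_name entry. The sketch below lists such kernels from a given folder. The location is an assumption on my side: with --user on macOS it is typically ~/Library/Jupyter/kernels (jupyter kernelspec list prints the actual paths), and list_kernels is a hypothetical helper name.

```python
import json
import tempfile
from pathlib import Path


def list_kernels(kernels_dir):
    """Map kernel name (folder name) to the display name found in kernel.json."""
    root = Path(kernels_dir)
    kernels = {}
    if not root.is_dir():
        return kernels
    for spec_file in sorted(root.glob("*/kernel.json")):
        spec = json.loads(spec_file.read_text())
        kernels[spec_file.parent.name] = spec.get("display_name", spec_file.parent.name)
    return kernels


if __name__ == "__main__":
    # Demo with a throwaway folder that mimics an installed kernel.
    with tempfile.TemporaryDirectory() as tmp:
        spec_dir = Path(tmp) / "ml_scikit"
        spec_dir.mkdir()
        (spec_dir / "kernel.json").write_text(
            json.dumps({"display_name": "ml_scikit", "language": "python"})
        )
        print(list_kernels(tmp))
```

To inspect a real setup, pass the kernels folder of your own machine instead of the temporary one.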
# install notebook extensions
pip install jupyter_contrib_nbextensions
# copying extensions into jupyter server directories & configuration
jupyter contrib nbextension install --user
# installing the configurator
pip install jupyter_nbextensions_configurator
# Configuring the notebook server to load the server extension
jupyter nbextensions_configurator enable --user
# Restart the server with 'jupyter notebook' & then select extensions under the 'NBextensions' menu (this process is slightly different when using 'jupyter lab')
Introduction to Anaconda https://realpython.com/python-windows-machine-learning-setup/
Pyenv and Shims https://mungingdata.com/python/how-pyenv-works-shims/