Installing TensorFlow and TensorFlow Addons on Debian bullseye (Debian 11)
Though I usually use PyTorch for my deep learning work, I have just been given a piece of code written using TensorFlow, so I needed to install TensorFlow and TensorFlow Addons on my Debian testing system (aka Debian bullseye, which will shortly become Debian 11). Unfortunately, the binaries available on PyPI are only built for Python 3.6-3.8, but Debian bullseye now runs Python 3.9.
This blog post documents how I managed to build these packages for my system (which was far more effort than it probably should have been!).
System setup
I have the following key packages/libraries installed. There will certainly be others that I have overlooked; please feel free to let me know of anything that is missing.
- Python: python3-dev (currently version 3.9.2-2)
- Essential Python packages (according to the TensorFlow installation
page) are pip, numpy, wheel and keras_preprocessing. The Debian
packages providing these are python3-keras-preprocessing,
python3-numpy, python3-pip and python3-wheel, though the version of
keras_preprocessing currently in the Debian archive is older than that
required by TensorFlow, so it may be wiser to install it with
pip install -U --user keras_preprocessing --no-deps
as explained on the TensorFlow installation page. (A quick way to
check the installed versions is sketched just after this list.)
- Python packages: either install Debian versions of these or let pip
install them during the package installation. The relevant packages
seem to be python3-flatbuffers, python3-google-auth-oauthlib,
python3-grpcio (*), python3-h5py, python3-markdown, python3-protobuf,
python3-requests, python3-setuptools, python3-six, python3-termcolor,
python3-typeguard (*), python3-typing-extensions, python3-werkzeug and
python3-wrapt, together with their dependencies.
(*): In these cases, the current Debian version is too old for
TensorFlow 2.4.1.
- cuDNN: libcudnn8-dev (version 8.1.0.77-1+cuda11.2; this package was
downloaded directly from the NVIDIA website)
- Go compiler: golang-go - this is needed for bazelisk.
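As a quick sanity check (my own addition, not part of the original setup), the versions of the Python prerequisites that will actually be picked up can be printed with one-liners; in particular this confirms whether the pip-installed keras_preprocessing is found rather than the older Debian one:
$ python3 -c "import numpy; print(numpy.__version__)"
$ python3 -c "import keras_preprocessing; print(keras_preprocessing.__version__)"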
Unfortunately, it turns out that the Debian-packaged CUDA libraries do not work when building TensorFlow; see this GitHub issue. It is likely that this will not be fixed. On the other hand, the NVIDIA-provided Debian packages don’t seem to work nicely with some of the other parts of the system, so I never used them.
To get around this, I downloaded cuda_11.2.1_460.32.03_linux.run from
the NVIDIA website. (The current version is 11.2.2, but as my Debian
CUDA packages are 11.2.1, I downloaded the older version from the
Archive of Previous CUDA Releases.)
I needed write permission on the parent directory of the
desired target location, so I created a directory
/usr/local/cuda-11.2.1
(as root) and then ran
$ chown jdg:jdg /usr/local/cuda-11.2.1
(I could equally have created this directory in some other location without needing to be root.) I then unpacked the CUDA package into it:
$ sh cuda_11.2.1_460.32.03_linux.run --installpath=/usr/local/cuda-11.2.1/cuda
After the installation, I tidied up:
$ mv /usr/local/cuda-11.2.1/cuda/* /usr/local/cuda-11.2.1/
$ rmdir /usr/local/cuda-11.2.1/cuda/
so that everything is now directly in /usr/local/cuda-11.2.1. (At the
end of the build process, it appears that this entire directory can be
deleted, as long as the relevant Debian CUDA packages are still
present.)
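As a sanity check that the toolkit unpacked correctly (this check is my own suggestion rather than part of NVIDIA’s instructions), the bundled compiler can be asked for its version, which should report CUDA release 11.2:
$ /usr/local/cuda-11.2.1/bin/nvcc --version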
Building TensorFlow
I started by following the guidance on the TensorFlow installing from source webpage.
Installing Bazel
There will eventually be a Debian bazel package, but unfortunately that is still some way off (there is a team working on it, but there are currently technical difficulties). So I installed Bazelisk, following the instructions on the Bazelisk GitHub page:
$ go get github.com/bazelbuild/bazelisk
$ export PATH=$PATH:$(go env GOPATH)/bin
$ (cd $(go env GOPATH)/bin && ln -s bazelisk bazel)
(The export may well be extraneous, as PATH is usually already
exported.)
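It is also worth confirming that the bazel symlink resolves. Bazelisk downloads the Bazel release pinned in the source tree’s .bazelversion file (if I understand its behaviour correctly), so running this from inside the TensorFlow checkout created below should report 3.1.0, as the configure output later confirms:
$ bazel version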
Build directory setup
Since I am building both TensorFlow and TensorFlow Addons, I created a
directory called ~/packages/tensorflow and cloned the git repositories
into that directory; the intention is to build the wheels in that same
directory, so everything is kept together.
Downloading the TensorFlow source code
I followed the instructions as given; I also checked out the latest
release branch, which at the time of writing is r2.4.
$ git clone https://github.com/tensorflow/tensorflow.git
$ cd tensorflow
$ git checkout -t origin/r2.4
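As an aside (my own note, not from the TensorFlow instructions), checking out the exact release tag instead of the branch tip should work just as well, and guarantees the source matches the 2.4.1 wheel built later:
$ git checkout v2.4.1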
Configuring the build
Here begins the fun! Some of these lines have been wrapped to fit better on the screen.
$ ./configure
You have bazel 3.1.0 installed.
Found possible Python library paths:
/usr/lib/python3/dist-packages
/usr/lib/python3.9/dist-packages
/usr/local/lib/python3.9/dist-packages
/home/jdg/lib/python
Please input the desired Python library path to use. Default is [/usr/lib/python3/dist-packages]
/home/jdg/.local/lib/python3.9/site-packages
I can’t write to the system directory, and it seems as though this is
where some libraries may be written, so I set it to my local
site-packages directory instead. I don’t know whether leaving this as
/usr/lib/... would work equally well.
Do you wish to build TensorFlow with ROCm support? [y/N]:
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with TensorRT support? [y/N]:
No TensorRT support will be enabled for TensorFlow.
I accepted the defaults for both of these; I don’t have TensorRT installed.
Inconsistent CUDA toolkit path: /usr vs /usr/lib
Asking for detailed CUDA configuration...
Please specify the CUDA SDK version you want to use.
[Leave empty to default to CUDA 10]: 11.2
Please specify the cuDNN version you want to use.
[Leave empty to default to cuDNN 7]: 8.1
I have CUDA 11.2.1 installed, so I responded 11.2. Perhaps this would
be better as just 11, but I’m not sure. Likewise, I have cuDNN
8.1.0.77 installed, so I responded 8.1.
Please specify the locally installed NCCL version you want to use.
[Leave empty to use http://github.com/nvidia/nccl]:
I left this empty, as I do not have NCCL installed.
Please specify the comma-separated list of base paths to look for CUDA
libraries and headers. [Leave empty to use the default]:
/usr/local/cuda-11.2.1
This is the point at which everything goes wrong with the Debian-packaged CUDA libraries. It seems the Debian versions can remain installed on the system during the build, but they cannot be used to build TensorFlow itself. So I gave the path to the libraries unpacked from the NVIDIA installer.
Please specify a list of comma-separated CUDA compute capabilities you
want to build with.
You can find the compute capability of your device at:
https://developer.nvidia.com/cuda-gpus.
Each capability can be specified as "x.y" or "compute_xy" to include
both virtual and binary GPU code, or as "sm_xy" to only include the
binary code.
Please note that each additional compute capability significantly
increases your build time and binary size, and that TensorFlow only
supports compute capabilities >= 3.5 [Default is: 3.5,7.0]:
I set this to 3.5,7.5 based on the information on the webpage referred
to. Maybe I don’t need the 3.5 part, but I’m not sure, so I left it in.
Do you want to use clang as CUDA compiler? [y/N]:
Please specify which gcc should be used by nvcc as the host
compiler. [Default is /usr/bin/gcc]:
Please specify optimization flags to use during compilation when bazel
option "--config=opt" is specified [Default is -Wno-sign-compare]:
Would you like to interactively configure ./WORKSPACE for Android
builds? [y/N]:
I left all of these with their default settings.
And the configuration is finished!
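The answers given to ./configure are written to .tf_configure.bazelrc in the source tree (at least they were for my build), so a quick grep is a convenient way to confirm that the CUDA paths and compute capabilities were recorded as intended:
$ grep -i cuda .tf_configure.bazelrc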
Building the pip package
Unfortunately, some of the scripts in the TensorFlow sources call
python using the shebang formulation #!/usr/bin/env python, which
breaks unless there is a python on PATH; this should presumably be
Python 3.x (though I haven’t checked). In Debian 10, python was
python2, but in Debian 11 there is no python executable by default
(though there is a python-is-python3 package which could be installed,
which creates /usr/bin/python as a symlink to /usr/bin/python3). Since
I don’t want things to break silently, I didn’t install that package,
but instead set up a local symlink for the purpose of this build.
(Note that the bazel environment variable PYTHON_BIN_PATH does not
help at all.)
$ mkdir ../bin
$ ln -s /usr/bin/python3 ../bin/python
$ PATH=$(realpath ../bin):$PATH
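A quick check (again my own addition) that this symlink is the python now being found:
$ command -v python
$ python --version
The first command should print the path of the local bin directory created above, and the second should report Python 3.9.x.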
I then ran the bazel command:
$ bazel build --config=cuda //tensorflow/tools/pip_package:build_pip_package
This compilation step is pretty long.
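On a machine with limited memory or cores, the build can be throttled; this is generic Bazel usage rather than anything TensorFlow-specific, so treat the job count here as an example to adjust:
$ bazel build --config=cuda --jobs=4 //tensorflow/tools/pip_package:build_pip_package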
Building and installing the pip package
Following the instructions, I ran:
$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package ..
so that the resulting wheel ended up in the parent directory.
I then installed the wheel; in my case, the command line was:
$ pip install --user ../tensorflow-2.4.1-cp39-cp39-linux_x86_64.whl
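To confirm that the wheel installed cleanly and that the GPU is visible (a check of my own, using the standard TensorFlow 2.x API):
$ python3 -c "import tensorflow as tf; print(tf.__version__); print(tf.config.list_physical_devices('GPU'))"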
Installing TensorFlow Addons
The instructions for this are on the TensorFlow website here.
I first cloned the repository; again, I started in the directory
~/packages/tensorflow.
$ git clone https://github.com/tensorflow/addons.git
$ cd addons
$ git checkout -t origin/r0.12
The exports, though, are not quite as described. They should instead be the following:
$ export TF_NEED_CUDA=1
$ export CUDA_TOOLKIT_PATH=/usr/local/cuda-11.2.1
(and I have just reported it, so this may well be fixed very soon). The rest ran smoothly:
$ python3 ./configure.py
$ bazel build build_pip_pkg
$ bazel-bin/build_pip_pkg ..
$ pip install ../tensorflow_addons-0.12.2-cp39-cp39-linux_x86_64.whl
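Again as a quick check of my own, importing the addons package (which also exercises it against the just-installed TensorFlow) and printing its version:
$ python3 -c "import tensorflow_addons as tfa; print(tfa.__version__)"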
And after this, I removed (actually just temporarily renamed)
/usr/local/cuda-11.2.1, and everything still works.