TensorFlow Issue While Training a Model: Failed to get convolution algorithm


Many people have run into this issue, and today I hit it on my Ubuntu 20.04 machine.

Error: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

This is normally caused either by an incompatibility between CUDA, cuDNN, and the NVIDIA drivers, or by a GPU memory growth issue. The solution here addresses the memory growth issue, which was the cause in my case.

The following fix worked for me.

Set the TF_FORCE_GPU_ALLOW_GROWTH environment variable to true by running this command in your terminal before starting the training script:

export TF_FORCE_GPU_ALLOW_GROWTH=true
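If you start training from a notebook or an IDE rather than a shell, the same variable can be set from Python. This is a minimal sketch, assuming the variable only needs to be in the environment before TensorFlow first initializes the GPU. Setting it before the import is the conservative choice. The try/except around the import is my addition so the snippet also runs where TensorFlow is not installed.

```python
import os

# Set the flag before TensorFlow is imported, so it is already in the
# environment when TensorFlow first touches the GPU.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

try:
    import tensorflow as tf  # imported only after the variable is set
    print("Visible GPUs:", tf.config.experimental.list_physical_devices("GPU"))
except ImportError:
    print("TensorFlow is not installed in this environment")
```

In a notebook, this cell has to run before any other cell imports TensorFlow. Otherwise restart the kernel first.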


Other details: versions on my machine

My CUDA version (you can check yours by running `nvcc --version`):

Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0

My TensorFlow version is 2.3.1.

You can run the command below in a terminal to get your TensorFlow version:

python -c 'import tensorflow as tf; print(tf.__version__)'
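Since a CUDA/cuDNN mismatch is the other common cause of this error, it can help to compare the versions above with what your TensorFlow build expects. The sketch below is hedged: the helper name collect_versions is hypothetical, and it assumes tf.sysconfig.get_build_info() is available (I believe from TensorFlow 2.3 onward). If it is missing, the CUDA and cuDNN entries come back as None.

```python
def collect_versions():
    """Return TensorFlow's version and the CUDA/cuDNN versions it was built against."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed, so there is nothing to report.
        return {"tensorflow": None}

    info = {"tensorflow": tf.__version__}
    # Assumption: get_build_info exists on this TensorFlow version;
    # fall back to an empty dict if it does not.
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    build = get_build_info() if get_build_info else {}
    info["cuda_version"] = build.get("cuda_version")
    info["cudnn_version"] = build.get("cudnn_version")
    return info


print(collect_versions())
```

If the CUDA version reported here differs from the toolkit shown by `nvcc --version`, the version incompatibility is worth ruling out before looking at memory growth.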

For an explanation of TF_FORCE_GPU_ALLOW_GROWTH, see this excerpt from the TensorFlow GPU guide:

“By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to CUDA_VISIBLE_DEVICES) visible to the process. This is done to more efficiently use the relatively precious GPU memory resources on the devices by reducing memory fragmentation. To limit TensorFlow to a specific set of GPUs we use the tf.config.experimental.set_visible_devices method.

In some cases it is desirable for the process to only allocate a subset of the available memory, or to only grow the memory usage as is needed by the process. TensorFlow provides two methods to control this.

The first option is to turn on memory growth by calling tf.config.experimental.set_memory_growth, which attempts to allocate only as much GPU memory as needed for the runtime allocations: it starts out allocating very little memory, and as the program gets run and more GPU memory is needed, we extend the GPU memory region allocated to the TensorFlow process. Note we do not release memory, since it can lead to memory fragmentation. To turn on memory growth for a specific GPU, use the following code prior to allocating any tensors or executing any ops.

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)

Another way to enable this option is to set the environmental variable TF_FORCE_GPU_ALLOW_GROWTH to true. This configuration is platform specific.”

The second method is to configure a virtual GPU device with tf.config.experimental.set_virtual_device_configuration and set a hard limit on the total memory to allocate on the GPU.

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
    try:
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Virtual devices must be set before GPUs have been initialized
        print(e)

This is useful if you want to truly bound the amount of GPU memory available to the TensorFlow process. This is common practice for local development when the GPU is shared with other applications such as a workstation GUI.

Here is some further explanation of TF_FORCE_GPU_ALLOW_GROWTH, from a Reddit thread:

“The reason for the TF_FORCE_GPU_ALLOW_GROWTH flag is to allow TF to play nice with other apps (or TF instances?) that also need to use GPU memory. The issue is that GPU memory is fundamentally managed by CUDA APIs, but for efficiency TF wants to manage the memory itself, so TF maintains its own heap (memory allocator) using GPU memory it obtained via CUDA, and TF applications then allocate/release memory to/from the TF heap, not directly to/from CUDA.

The TF heap only ever grows, when needed (i.e. if a TF app requests more memory than TF currently has available), by grabbing more memory from CUDA. TF never shrinks its heap by releasing memory back to CUDA. The TF_FORCE_GPU_ALLOW_GROWTH flag determines whether TF grabs all the CUDA memory it wants at start-up, or — to play nice with other CUDA apps — starts small and grabs more memory only as needed.”
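The behaviour described in that thread can be illustrated with a toy model. This is only a sketch of the idea (a pool that grows on demand and never shrinks). It is not TensorFlow's actual allocator, and the class name ToyGpuPool and the megabyte figures are made up for illustration.

```python
class ToyGpuPool:
    """Toy model of a grow-only memory pool; not TensorFlow's real allocator."""

    def __init__(self, device_total_mb, allow_growth):
        self.device_total_mb = device_total_mb
        # Without growth, reserve everything up front; with growth, start empty.
        self.reserved_mb = 0 if allow_growth else device_total_mb
        self.in_use_mb = 0

    def allocate(self, size_mb):
        needed = self.in_use_mb + size_mb
        if needed > self.device_total_mb:
            raise MemoryError("toy pool: out of device memory")
        # Grow the reservation only when the current pool is too small.
        self.reserved_mb = max(self.reserved_mb, needed)
        self.in_use_mb = needed

    def free(self, size_mb):
        # Freed memory returns to the pool, but the reservation never shrinks.
        self.in_use_mb = max(0, self.in_use_mb - size_mb)


eager = ToyGpuPool(device_total_mb=8000, allow_growth=False)
lazy = ToyGpuPool(device_total_mb=8000, allow_growth=True)
for pool in (eager, lazy):
    pool.allocate(1500)
    pool.free(1000)

print(eager.reserved_mb, eager.in_use_mb)  # 8000 500
print(lazy.reserved_mb, lazy.in_use_mb)    # 1500 500
```

Both pools end up using 500 MB, but the pool without growth holds the whole device from the start, while the growing pool holds only its 1500 MB peak and leaves the rest for other processes.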

The handling of this flag can be found in the TensorFlow source code.

DataScience | ML | 2x Kaggle Expert. Ex Fullstack Engineer and Ex International Financial Analyst. https://www.linkedin.com/in/rohan-paul-b27285129/
