Troubleshooting
Errors
You may encounter errors while using the ZerNet client. Below are some common errors and their solutions:
Can't connect to Docker API, please ensure Docker is running and you have sufficient permissions
The ZerNet client cannot connect to your Docker socket. This could be because:
- The ZerNet client program lacks permissions
- The user running ZerNet isn't part of the docker user group
- The ZerNet client has not been started with sudo
It could also mean that /var/run/docker.sock does not exist because Docker is not running or was not installed correctly. Review the installation instructions for your operating system.
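To narrow down which of these causes applies, a small check like the following may help. It is a minimal sketch, not part of ZerNet: the function names are made up for illustration, and it assumes the default socket path mentioned above and a group literally named docker.

```shell
#!/bin/sh
# Sketch: narrow down why the Docker socket is unreachable.
# SOCKET is a hypothetical override; it defaults to the path mentioned above.

# Print "missing", "no-access", or "ok" for the given socket path.
docker_socket_status() {
    sock="$1"
    if [ ! -e "$sock" ]; then
        echo "missing"      # Docker is probably not running or not installed
    elif [ ! -r "$sock" ] || [ ! -w "$sock" ]; then
        echo "no-access"    # the current user cannot read and write the socket
    else
        echo "ok"
    fi
}

# Print "yes" if the space-separated group list contains "docker", else "no".
in_docker_group() {
    case " $1 " in
        *" docker "*) echo "yes" ;;
        *) echo "no" ;;
    esac
}

echo "socket: $(docker_socket_status "${SOCKET:-/var/run/docker.sock}")"
echo "in docker group: $(in_docker_group "$(id -nG)")"
```

A result of "missing" points at Docker itself, while "no-access" combined with "in docker group: no" points at permissions.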
Can't start NVIDIA container, please ensure nvidia-ctk is installed
ZerNet cannot start a Docker container using the NVIDIA Container Toolkit. Make sure you followed the installation instructions for your operating system and that this command runs successfully:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
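If that command fails, it can be useful to first confirm that the required tools are on your PATH at all. The snippet below is a rough sketch: the helper name have is made up, and the tool list is an assumption based on the commands mentioned in this section.

```shell
#!/bin/sh
# Sketch: report which of the tools referenced in this section are installed.

# Succeed if the named command can be found on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

for tool in docker nvidia-ctk nvidia-smi; do
    if have "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: NOT found"
    fi
done
```

Note that this runs as your own user, so a tool installed only on root's PATH may be reported as not found.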
Can't start NVIDIA container, driver/library version mismatch
This error is caused by a version mismatch between the user-mode NVIDIA tools and the NVIDIA driver that is currently loaded. You may see the same issue if you run:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
This usually happens when unattended package upgrades update some NVIDIA packages while the old NVIDIA driver is still loaded.
On Ubuntu and Debian, you can prevent this by pinning the NVIDIA packages to specific versions that are compatible with your driver version.
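One way to keep packages at their current versions on Debian-based systems is apt-mark hold. The sketch below assumes that approach, since this guide does not name a specific pinning method. It only prints the command for review instead of running it, and the filter_nvidia helper and the simple match on "nvidia" are illustrative choices.

```shell
#!/bin/sh
# Sketch only: prints a hold command for review instead of running it.

# Keep only package names that mention "nvidia" (reads names on stdin).
filter_nvidia() {
    grep -i 'nvidia' || true   # no match is not an error here
}

if command -v dpkg-query >/dev/null 2>&1; then
    # List package names known to dpkg, one per line, and keep the NVIDIA ones.
    pkgs=$(dpkg-query -W -f='${Package}\n' | filter_nvidia | tr '\n' ' ')
    if [ -n "$pkgs" ]; then
        echo "Review, then run: sudo apt-mark hold $pkgs"
    else
        echo "No NVIDIA packages found."
    fi
fi
```

Held packages can be released later with sudo apt-mark unhold followed by the same package names.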
Unknown error running NVIDIA container, please check your configuration
An error that ZerNet does not recognize has occurred. Review the output of this command to see what might be causing the issue:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Node not registered, please register it
You must register your Node with --register. Review the Register a Node guide for instructions on how to register your Node.
Lockfile error
error: unable to open lock file: Permission denied
terminate called after throwing an instance of 'std::runtime_error'
what(): unable to open lock file.
Aborted
This can happen after a bad exit if you then try to run ZerNet as a user with fewer permissions than the user who ran it initially. For example:
- ZerNet is started with sudo
- ZerNet has a bad exit
- ZerNet is started again without sudo
To fix this, start the ZerNet client with sudo again.
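To confirm that this is what happened, you can check whether your current user can write to the lock file. This guide does not state where the lock file lives, so the sketch below takes the path from a LOCKFILE variable that you must set yourself. Both that variable and the lockfile_status helper are hypothetical.

```shell
#!/bin/sh
# Sketch: check whether the current user can write to a given lock file.
# LOCKFILE is a placeholder; this guide does not document the real location.

# Print "missing", "writable", or "not-writable (owner: NAME)".
lockfile_status() {
    path="$1"
    if [ ! -e "$path" ]; then
        echo "missing"
    elif [ -w "$path" ]; then
        echo "writable"
    else
        # Third column of `ls -ld` is the owning user.
        owner=$(ls -ld "$path" | awk '{print $3}')
        echo "not-writable (owner: $owner)"
    fi
}

if [ -n "${LOCKFILE:-}" ]; then
    lockfile_status "$LOCKFILE"
fi
```

A "not-writable (owner: root)" result is consistent with the sequence described above.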