I have been troubleshooting this for a couple of days and finally have the NVIDIA drivers installed properly, pieced together from other users' posts about the same woes:

Code:
root@kali-desktop:~# lsmod | grep nouveau
root@kali-desktop:~# lsmod | grep nvidia
nvidia_uvm            434176  0
nvidia_modeset        741376  6
nvidia              10080256  120 nvidia_modeset,nvidia_uvm
drm                   360448  6 nvidia

root@kali-desktop:~# glxinfo | grep "direct rendering"
direct rendering: Yes
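
As a side note on the check above, `lsmod | grep nvidia` also matches the `drm` line, because `nvidia` appears in its used-by column. A minimal sketch of an exact-name check follows, assuming lsmod-style text arrives on stdin; `module_loaded` is a helper name made up for this example, not an existing tool.

```shell
# Succeed only if the first column of lsmod-style input (stdin)
# is exactly the module name given as $1.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Example use on a live system (not run here):
#   lsmod | module_loaded nouveau && echo "nouveau is loaded"
#   lsmod | module_loaded nvidia  && echo "nvidia is loaded"
```

Matching on the first column avoids false positives from modules that merely depend on the one being checked.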
I'm confident that the nouveau driver is blacklisted:

Code:
root@kali-desktop:~# cat /etc/default/grub | grep nouveau
GRUB_CMDLINE_LINUX_DEFAULT=" nouveau.modset=0"
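
One thing worth double-checking here: the nouveau module's parameter is spelled `modeset`, so the usual kernel option is `nouveau.modeset=0`. If `modset` is really what is in the file, that option probably has no effect (nouveau is not loaded in the lsmod output either way). A minimal sketch for checking what the running kernel actually received, assuming the command line is passed in as a string; `has_kernel_param` is a hypothetical helper name.

```shell
# Succeed if the kernel command line in $1 contains the exact
# space-separated parameter given in $2.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
    esac
    return 1
}

# Example use on a live system (not run here):
#   has_kernel_param "$(cat /proc/cmdline)" nouveau.modeset=0 \
#       && echo "parameter is active"
```

Reading `/proc/cmdline` rather than `/etc/default/grub` shows the options the kernel booted with, which only change after the GRUB configuration is regenerated and the machine is rebooted.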
However, Pyrit doesn't see the CUDA cores at all:

Code:
pyrit list_cores
Pyrit 0.4.0 (C) 2008-2011 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+

The following cores seem available...
#1:  'CPU-Core (SSE2)'
#2:  'CPU-Core (SSE2)'
#3:  'CPU-Core (SSE2)'
#4:  'CPU-Core (SSE2)'
#5:  'CPU-Core (SSE2)'
#6:  'CPU-Core (SSE2)'
#7:  'CPU-Core (SSE2)'
#8:  'CPU-Core (SSE2)'
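
For context, Pyrit ships its GPU support as separate CPyrit extension modules (CUDA and OpenCL variants), and as far as I know only CPU cores are listed when neither is installed. The sketch below simply counts the non-CPU entries in `pyrit list_cores` output read from stdin. It assumes, based on the listing above, that CPU entries contain the text `CPU-Core`; `count_non_cpu_cores` is a made-up helper name.

```shell
# Print how many numbered core entries in `pyrit list_cores` output
# (stdin) are something other than a plain CPU core.
count_non_cpu_cores() {
    awk '/^#[0-9]+:/ && !/CPU-Core/ { n++ } END { print n + 0 }'
}

# Example use on a live system (not run here):
#   pyrit list_cores | count_non_cpu_cores
```

A result of 0 matches the listing above, where all eight entries are CPU cores.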
Strangely, hashcat does seem to see the GPU. I grabbed a very simple MD5 hash from http://www.miraclesalad.com/webtools/md5.php and tested it:

Code:
hashcat -m 0  -d 1  79cfeb94XXXXXXXXXXXXX -o ?l?l?l?l?d?d?d
hashcat (v3.10) starting...

OpenCL Platform #1: NVIDIA Corporation  
======================================
- Device #1: GeForce GTX 970, 1023/4095 MB allocatable, 13MCU
- Device #1: WARNING! Kernel exec timeout is not disabled, it might cause you errors of code 702
             See the wiki on how to disable it: https://hashcat.net/wiki/doku.php?id=timeout_patch

OpenCL Platform #2: Mesa, skipped! No OpenCL compatible devices found

WARNING: NVML library load failed, proceed without NVML HWMon enabled.
Hashes: 1 hashes; 1 unique digests, 1 unique salts
Bitmaps: 16 bits, 65536 entries, 0x0000ffff mask, 262144 bytes, 5/13 rotates
Rules: 1
Applicable Optimizers:
* Zero-Byte
* Precompute-Init
* Precompute-Merkle-Demgard
* Meet-In-The-Middle
* Early-Skip
* Not-Salted
* Not-Iterated
* Single-Hash
* Single-Salt
* Raw-Hash
Watchdog: Temperature abort trigger set to 90c
Watchdog: Temperature retain trigger set to 75c
There is that bit about no OpenCL-compatible devices, though, and performance is TERRIBLE:

Code:
Session.Name...: hashcat
Status.........: Running
Input.Mode.....: Pipe
Hash.Target....: 79cfeb94XXXXXXXXXXXXX
Hash.Type......: MD5
Time.Started...: Tue Sep 13 21:44:39 2016 (1 min, 9 secs)
Speed.Dev.#1...:        0 H/s (0.00ms)
Recovered......: 0/1 (0.00%) Digests, 0/1 (0.00%) Salts
Progress.......: 0
Rejected.......: 0
HWMon.Dev.#1...: N/A
This is an NVIDIA GTX 970, so I would expect a *little* more than 0 H/s!
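
One detail in the status screen may explain the 0 H/s: `Input.Mode.....: Pipe` is what hashcat shows when it is reading candidates from stdin. In the command above the mask follows `-o`, which takes an output file name, and there is no `-a 3`, so it looks as if hashcat never received a mask and is simply waiting for input. This is a reading of the output, not something confirmed on this machine. A small sketch for spotting that state in saved status text, with a mask-attack invocation shown as a comment; `waiting_on_stdin`, `hash.txt`, and `cracked.txt` are placeholder names.

```shell
# Succeed if a hashcat status screen (stdin) reports stdin as the
# candidate source, i.e. a line like "Input.Mode.....: Pipe".
waiting_on_stdin() {
    grep -Eq '^Input\.Mode\.*: Pipe[[:space:]]*$'
}

# Hypothetical mask-attack invocation (not run here):
#   -m 0  MD5, -a 3  mask attack, -d 1  first OpenCL device,
#   -o    file for recovered plains; the mask is a positional
#   argument, quoted so the shell does not treat ? as a glob.
#
#   hashcat -m 0 -a 3 -d 1 -o cracked.txt hash.txt '?l?l?l?l?d?d?d'
```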

I feel like I must be missing something on the OpenCL side, since Pyrit doesn't see anything, but I'm stumped.