I received NVIDIA's approval email granting access to its GPU-accelerated Deep Learning resources, which should double the performance of our neural network training.

“Your application for the program CUDA Registered Developer Program is approved.
Congratulations, you are now a member of the CUDA/GPU Computing Developer Program.
Should you have any questions or issues with your membership please don’t hesitate to contact us.

Best regards,
NVIDIA Developer Relations”

Training on larger, more sophisticated sample sets in less time is only possible when the math routines behind deep neural networks are GPU-accelerated.

The result is a significant speedup in neural network training on a single NVIDIA GeForce® GTX™ TITAN X GPU.

Now on to the next level of the game: Deep Learning on the GPU for Certiface!

Here are the first tests…

# optirun ./mnistCUDNN
cudnnGetVersion() : 3007 , CUDNN_VERSION from cudnn.h : 3007 (3.0.07)
Host compiler version : GCC 4.8.5
There are 1 CUDA capable devices on your machine :
device 0 : sms 5 Capabilities 5.0, SmClock 1019.5 Mhz, MemSize (Mb) 2047, MemClock 2505.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.049184 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.051776 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.059488 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.188672 time requiring 207360 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.032640 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.035456 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.051904 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.200064 time requiring 207360 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
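
To make the log easier to read, here is a minimal Python sketch (not part of the mnistCUDNN sample) that recomputes two things from the numbers printed above: the predicted digit for each image, taken as the index of the largest softmax value, and the fastest convolution algorithm according to the cudnnFindConvolutionForwardAlgorithm timings. The values are copied by hand from the single-precision and half-precision runs; the names `softmax_outputs`, `predicted_digit` and `fastest_algo` are my own and purely illustrative.

```python
# Softmax outputs copied from the single-precision run in the log above.
softmax_outputs = {
    "one_28x28.pgm": [0.0000000, 0.9999399, 0.0000000, 0.0000000, 0.0000561,
                      0.0000000, 0.0000012, 0.0000017, 0.0000010, 0.0000000],
    "three_28x28.pgm": [0.0000000, 0.0000000, 0.0000000, 0.9999288, 0.0000000,
                        0.0000711, 0.0000000, 0.0000000, 0.0000000, 0.0000000],
    "five_28x28.pgm": [0.0000000, 0.0000008, 0.0000000, 0.0000002, 0.0000000,
                       0.9999820, 0.0000154, 0.0000000, 0.0000012, 0.0000006],
}


def predicted_digit(probabilities):
    """Return the index of the largest probability (argmax)."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


predictions = [predicted_digit(p) for p in softmax_outputs.values()]
print("Result of classification:", *predictions)  # prints "Result of classification: 1 3 5"

# Timings copied from the cudnnFindConvolutionForwardAlgorithm lines,
# keyed by algorithm number, in the unit printed by the sample.
# Algo 3 is left out because the log reports it as NOT_SUPPORTED.
single_precision_times = {0: 0.049184, 1: 0.051776, 2: 0.059488, 4: 0.188672}
half_precision_times = {0: 0.032640, 1: 0.035456, 2: 0.051904, 4: 0.200064}


def fastest_algo(times):
    """Return the algorithm number with the smallest measured time."""
    return min(times, key=times.get)


fastest_single = fastest_algo(single_precision_times)
fastest_half = fastest_algo(half_precision_times)

# Ratio between the best single-precision and best half-precision timings.
ratio = single_precision_times[fastest_single] / half_precision_times[fastest_half]

print("Fastest measured algorithm (single):", fastest_single)
print("Fastest measured algorithm (half):", fastest_half)
print("Single/half time ratio: %.2f" % ratio)
```

In this log the measured minimum is Algo 0 in both runs, while the heuristic call cudnnGetConvolutionForwardAlgorithm reported Algo 1. Each timing is a single measurement on one small convolution, so I would treat the ratio as a rough indication rather than a benchmark.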