I received NVIDIA's approval email granting access to its GPU-accelerated Deep Learning resources, which are expected to double the performance of our neural network training.
“Your application for the program CUDA Registered Developer Program is approved.
Congratulations, you are now a member of the CUDA/GPU Computing Developer Program.
Should you have any questions or issues with you membership please don’t hesitate to contact us.
Best regards,
NVIDIA Developer Relations”
Training larger, more sophisticated sample sets faster is only possible with GPU acceleration of the math routines behind deep neural networks.
The result is a significant speedup in neural network training on a single NVIDIA GeForce® GTX™ TITAN X GPU.²
Now on to the next level of the game: Deep Learning on the GPU for Certiface!
Here are the first tests…
# optirun ./mnistCUDNN
cudnnGetVersion() : 3007 , CUDNN_VERSION from cudnn.h : 3007 (3.0.07)
Host compiler version : GCC 4.8.5
There are 1 CUDA capable devices on your machine :
device 0 : sms 5 Capabilities 5.0, SmClock 1019.5 Mhz, MemSize (Mb) 2047, MemClock 2505.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.049184 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.051776 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.059488 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.188672 time requiring 207360 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
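To make the link between the softmax weights and the "Result of classification" line explicit, here is a minimal Python sketch. It assumes the sample reports, for each image, the index of the largest softmax weight as the predicted digit; that is my reading of the log, not something the output states. The dictionary and the helper `predicted_digit` are names made up for this illustration, and the numbers are copied from the single-precision run above.

```python
# Softmax weights copied from the single-precision run in the log.
# Each list holds one weight per digit class, 0 through 9.
softmax_outputs = {
    "one_28x28.pgm": [0.0000000, 0.9999399, 0.0000000, 0.0000000, 0.0000561,
                      0.0000000, 0.0000012, 0.0000017, 0.0000010, 0.0000000],
    "three_28x28.pgm": [0.0000000, 0.0000000, 0.0000000, 0.9999288, 0.0000000,
                        0.0000711, 0.0000000, 0.0000000, 0.0000000, 0.0000000],
    "five_28x28.pgm": [0.0000000, 0.0000008, 0.0000000, 0.0000002, 0.0000000,
                       0.9999820, 0.0000154, 0.0000000, 0.0000012, 0.0000006],
}


def predicted_digit(weights):
    """Return the index of the largest weight (assumed to be the predicted digit)."""
    return max(range(len(weights)), key=weights.__getitem__)


# Dictionaries keep insertion order, so predictions follow the order of the log.
predictions = [predicted_digit(w) for w in softmax_outputs.values()]
print("Result of classification:", *predictions)
```

Under that assumption the sketch reproduces the three digits the sample reported for the test images.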
Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.032640 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.035456 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.051904 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.200064 time requiring 207360 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
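The two runs report a time for each convolution algorithm, so a quick way to compare them is to divide the single-precision time by the half-precision time. The Python sketch below does that with the figures copied from the log. The `timings` list and the `speedup` helper are hypothetical names for this illustration, and since these are single measurements from one run, the ratios are only a rough indication, not a benchmark.

```python
# (algorithm id, single-precision time, half-precision time), copied from the log.
# Algo 3 is left out because it was reported as CUDNN_STATUS_NOT_SUPPORTED.
timings = [
    (0, 0.049184, 0.032640),
    (1, 0.051776, 0.035456),
    (2, 0.059488, 0.051904),
    (4, 0.188672, 0.200064),
]


def speedup(single_time, half_time):
    """Ratio above 1 means the half-precision run was faster."""
    return single_time / half_time


ratios = {algo: speedup(single, half) for algo, single, half in timings}

for algo, ratio in ratios.items():
    print(f"Algo {algo}: half precision ran at {ratio:.2f}x the single-precision speed")
```

In this run Algos 0, 1 and 2 come out faster in half precision, while Algo 4 comes out slightly slower.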