Tutorial: Getting Started with Distributed Deep Learning with Caffe on Windows
Introduction
What is Caffe?
Setup
My setup:
Windows 8.1 (64-bit)
Visual Studio 2013 Community
GeForce GT 750M
CUDA 7.5
1. Check for Compatibility
Make sure you are running a supported version of Windows:
Windows 8.1
Windows 7
Windows Server 2008
Windows Server 2012
(If you are using Windows 8, upgrade through here: http://windows.microsoft.com/en-ca/windows-8/update-from-windows-8-tutorial)
Make sure your GPU is supported by CUDA: https://developer.nvidia.com/cuda-gpus
Anything with compute capability of >=3.0 should be good.
If you do not have a compatible GPU, you can still use Caffe, but it will be orders of magnitude slower than with a GPU. In that case, skip part 2.
Make sure you have a version of Visual Studio that CUDA supports:
Visual Studio 2013
Visual Studio 2013 Community (free download)
Visual Studio 2012
Visual Studio 2010
More NVIDIA documentation at:
http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-microsoft-windows/#axzz3wsl3JktL
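The compute capability check from this step can be sketched in Python. This is only an illustration: the function names are hypothetical, the 3.0 threshold is the one stated above, and you still have to look up your card's value on the NVIDIA page by hand.

```python
def parse_capability(value: str) -> tuple[int, int]:
    """Split a compute capability such as "3.0" into (major, minor)."""
    major, minor = value.strip().split(".")
    return int(major), int(minor)


def is_supported(capability: str, minimum: str = "3.0") -> bool:
    """Return True if `capability` is at least `minimum`."""
    # Compare as integer tuples, not strings, so that "10.0" >= "3.0" holds.
    return parse_capability(capability) >= parse_capability(minimum)


if __name__ == "__main__":
    # Example values only; they are not tied to specific cards.
    for cc in ("2.1", "3.0", "5.2"):
        print(cc, "OK" if is_supported(cc) else "below the 3.0 minimum")
```

Comparing the values as numbers rather than as text avoids the pitfall where "10.0" sorts before "3.0" alphabetically.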
2. Install CUDA
3. Install Caffe
Remember to add caffe-windows/3rdparty/bin to your PATH
Open caffe-windows/buildVS2013/MainBuilder.sln in Visual Studio
If you don’t have a compatible GPU, open caffe-windows/build_cpu_only/MainBuilder.sln
Set the GPU compute capability:
Right-click the caffe project and click Properties
In the left menu, go to Configuration Properties -> CUDA C/C++ -> Device
In the Code Generation field, set the compute capability to match your GPU (for example, compute_30,sm_30)
Build the solution in Release mode
Right-click the solution and click Build Solution
(It’s OK if matcaffe and pycaffe fail)
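The Code Generation value mentioned in the steps above follows a simple pattern: the compute capability with the dot removed, used once for the virtual architecture (compute_XX) and once for the real one (sm_XX). A minimal Python sketch, with a hypothetical helper name, that builds the entry from a capability string:

```python
def code_generation_entry(capability: str) -> str:
    """Turn a compute capability like "3.0" into "compute_30,sm_30"."""
    major, minor = capability.strip().split(".")
    digits = f"{int(major)}{int(minor)}"  # "3.0" -> "30"
    return f"compute_{digits},sm_{digits}"


if __name__ == "__main__":
    print(code_generation_entry("3.0"))  # prints compute_30,sm_30
```

Paste the resulting string into the Code Generation field by hand; the sketch does not modify the Visual Studio project.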
Testing
Download the mnist leveldb from http://pan.baidu.com/s/1mgl9ndu
Extract the folders to caffe-windows/examples/mnist
Run caffe-windows/run_mnist.bat
You should get some output similar to the following when you finish:
….
I0112 00:06:37.180341 45040 solver.cpp:326] Iteration 10000, loss = 0.00428135
I0112 00:06:37.181342 45040 solver.cpp:346] Iteration 10000, Testing net (#0)
I0112 00:06:51.726634 45040 solver.cpp:414] Test net output #0: accuracy = 0.9914
I0112 00:06:51.726634 45040 solver.cpp:414] Test net output #1: loss = 0.0270199 (* 1 = 0.0270199 loss)
I0112 00:06:51.726634 45040 solver.cpp:331] Optimization Done.
I0112 00:06:51.726634 45040 caffe.cpp:215] Optimization Done.
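If you want to pull the final accuracy and loss out of a saved log instead of reading them by eye, something like the following Python sketch may help. It assumes the "Test net output" line format shown above, with each log entry on a single line; the function name and the regular expression are illustrative and not part of Caffe.

```python
import re

# Matches e.g. "Test net output #0: accuracy = 0.9914"
TEST_OUTPUT = re.compile(r"Test net output #\d+: (\w+) = (\S+)")


def parse_test_outputs(log_text: str) -> dict[str, float]:
    """Collect the last reported value of each test output by name."""
    results = {}
    for line in log_text.splitlines():
        match = TEST_OUTPUT.search(line)
        if match:
            name, value = match.groups()
            results[name] = float(value)  # later lines overwrite earlier ones
    return results


if __name__ == "__main__":
    # Sample lines copied from the output above, one log entry per line.
    sample = (
        "I0112 00:06:51.726634 45040 solver.cpp:414] "
        "Test net output #0: accuracy = 0.9914\n"
        "I0112 00:06:51.726634 45040 solver.cpp:414] "
        "Test net output #1: loss = 0.0270199 (* 1 = 0.0270199 loss)\n"
    )
    print(parse_test_outputs(sample))
```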
Full instructions can be found in the README at https://github.com/happynear/caffe-windows
Results:
solver_mode: GPU
Start Time: 23:25:19.38
Finish Time: 23:28:37.62
solver_mode: CPU
Finish Time: 0:06:51.91
As you can see, even a low-end GPU can train an order of magnitude faster than a CPU.
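To turn the start and finish times listed above into a duration, a small Python sketch like the one below can be used. It assumes the timestamps are wall-clock times in H:MM:SS.ss form and that a run lasts less than 24 hours, so a finish time earlier than the start time means the run crossed midnight. The function name is hypothetical. Only the GPU run is computed here, because no CPU start time is listed.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%H:%M:%S.%f"


def elapsed_seconds(start: str, finish: str) -> float:
    """Seconds between two wall-clock times, allowing one midnight rollover."""
    t0 = datetime.strptime(start, TIME_FORMAT)
    t1 = datetime.strptime(finish, TIME_FORMAT)
    if t1 < t0:
        # Assumption: the run finished on the following day.
        t1 += timedelta(days=1)
    return (t1 - t0).total_seconds()


if __name__ == "__main__":
    gpu = elapsed_seconds("23:25:19.38", "23:28:37.62")
    print(f"GPU run: {gpu:.2f} s")  # prints GPU run: 198.24 s
```

That is about 3 minutes 18 seconds for the GPU run.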