Tutorial: Getting Started with Distributed Deep Learning with Caffe on Windows

Categories Machine Learning, Uncategorized


What is Caffe?

A deep learning framework developed by the Berkeley Vision and Learning Center. It makes creating deep neural networks easy without writing a ton of code. If you don’t know what deep learning is, here is a great guide to getting started: http://cs231n.github.io/.


My setup:
Windows 8.1 on 64bit
Visual Studio 2013 Community
GeForce GT 750M
CUDA 7.5

1. Check for Compatibility

Make sure you are on a supported Windows operating system:
Windows 8.1
Windows 7
Windows Server 2008
Windows Server 2012
(If you are using Windows 8, upgrade through here: http://windows.microsoft.com/en-ca/windows-8/update-from-windows-8-tutorial)

Make sure your GPU is supported by CUDA: https://developer.nvidia.com/cuda-gpus 
Anything with compute capability of  >=3.0 should be good.

If you do not have a compatible GPU, you can still use Caffe, but training will be roughly an order of magnitude slower than with a GPU. In that case, skip part 2.

Make sure you have a version of Visual Studio that CUDA supports: 
Visual Studio 2013
Visual Studio 2013 Community (Download Visual Studio 2013 Community Edition Free)
Visual Studio 2012
Visual Studio 2010

More nVidia documentation at:

2. Install CUDA

Download and install CUDA toolkit here: https://developer.nvidia.com/cuda-downloads
Verify CUDA can compile:
Go to C:\ProgramData\NVIDIA Corporation\CUDA Samples\v7.5 and open the solution file (e.g. Samples_vs2013.sln) in Visual Studio
In the solution explorer, build 0_Simple/vectorAdd
Run C:\ProgramData\NVIDIA Corporation\CUDA Samples\v7.5\bin\win64\debug\vectorAdd.exe
The output should be:
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory

3. Install Caffe

Fork the Windows port of Caffe: https://github.com/happynear/caffe-windows
Download third party libraries and extract to caffe-windows/3rdparty
Remember to add caffe-windows/3rdparty/bin to your PATH

Open caffe-windows/buildVS2013/MainBuilder.sln in Visual Studio
If you don’t have a compatible GPU, open caffe-windows/build_cpu_only/MainBuilder.sln

Set the GPU compatible mode:
Right click the caffe project and click properties
In the left menu, go to Configuration Properties -> Cuda C/C++ -> Device
In the Code Generation key, modify the compute capabilities to your GPU’s (such as compute_30,sm_30; etc)

Build the solution in release mode
Right click the solution and click Build Solution
(It’s OK if matcaffe and pycaffe fail)

Download the mnist leveldb from http://pan.baidu.com/s/1mgl9ndu
Extract the folders to caffe-windows/examples/mnist
Run caffe-windows/run_mnist.bat

You should get some output similar to the following when you finish:
I0112 00:06:37.180341 45040 solver.cpp:326] Iteration 10000, loss = 0.00428135
I0112 00:06:37.181342 45040 solver.cpp:346] Iteration 10000, Testing net (#0)
I0112 00:06:51.726634 45040 solver.cpp:414]     Test net output #0: accuracy = 0
I0112 00:06:51.726634 45040 solver.cpp:414]     Test net output #1: loss = 0.0270199 (* 1 = 0.0270199 loss)
I0112 00:06:51.726634 45040 solver.cpp:331] Optimization Done.
I0112 00:06:51.726634 45040 caffe.cpp:215] Optimization Done.

Full instructions can be found on the readme of https://github.com/happynear/caffe-windows

solver_mode: GPU
Start Time: 23:25:19.38
Finish Time: 23:28:37.62

solver_mode: CPU

Start Time: 23:38:01.62
Finish Time:  0:06:51.91

As you can see, even a low-end GPU trains far faster than a CPU: about 3 minutes 18 seconds versus about 28 minutes 50 seconds, which is close to an order of magnitude.
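To make the comparison concrete, here is a small Python 3 sketch that turns the four clock readings above into elapsed times and a speedup factor. The helper name elapsed is my own, and it assumes the CPU run crossed midnight exactly once, which is what the timestamps suggest.

```python
from datetime import datetime, timedelta

def elapsed(start, finish):
    """Elapsed time between two HH:MM:SS.ff clock readings (at most one midnight rollover)."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(finish, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)  # the run crossed midnight
    return delta

gpu = elapsed("23:25:19.38", "23:28:37.62")   # solver_mode: GPU
cpu = elapsed("23:38:01.62", "0:06:51.91")    # solver_mode: CPU
speedup = cpu.total_seconds() / gpu.total_seconds()

print(gpu.total_seconds(), cpu.total_seconds(), round(speedup, 1))
```

On these numbers the GPU run takes about 198 seconds and the CPU run about 1730 seconds, so the GPU is a little under 9 times faster.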

A Simple Artificial Intelligence Capable of Basic Reading Comprehension

I was interested in an artificial intelligence that could do reading comprehension, but surprisingly, I could not find much on the topic. I decided to try and create an AI that could do some basic reading comprehension and I got some good results:

Short Passage (Input):
Mary went sledding with her family. Mary won a race against her brother to the bottom of a hill. Mary cheered because it was her first time winning.

Input: Where did Mary go?
Output: Sledding

Input: What did Mary win?
Output: A race

Input: Who did Mary win a race against?
Output: Her brother

Input: Where did Mary win a race to?
Output: The bottom of the hill

Input: Why did Mary cheer?
Output: Because it was her first time winning

As we can see, the program can actually answer the questions about the passage.

Full Source Available Here


What I am trying to accomplish is a program capable of artificial semantic memory. Semantic memory refers to how we store our explicit knowledge and facts about the world, for example our memory of our birth date or our knowledge that humans are mammals. I wanted to make something that could read a passage and answer any questions I had about it.

Abstract Idea

The abstract idea behind my approach to artificial semantic memory is to store each sentence in a structure that can later be used to answer questions.

1. Structure the relationships between objects (nouns) in the sentence.

For example, in the sentence “Mary went sledding with her family”, there are three objects: “Mary”, “sledding” and “her family”. Mary has a verb “go” (present tense of “went”) with the object “sledding”. The verb “go” is “with” the object “her family”. 
After brainstorming different ways to represent the relationships between objects and actions, I came up with a structure similar to a trie, which I will call a “word graph”. In a word graph, each word is a node and the edges are actions or prepositions. 
Mary went sledding with her family
Mary won a race against her brother to the bottom of the hill
Mary cheered because it was her first time winning
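As a rough illustration of the word graph idea, here is a simplified Python 3 sketch for the first two sentences. It is not the program's actual data structure: the Node class, the show helper and the choice to hang prepositions directly off the object node are all assumptions made for this example.

```python
class Node:
    """One object in the word graph; edges are labeled with verbs or prepositions."""

    def __init__(self, label):
        self.label = label
        self.edges = {}  # edge label -> child Node

    def set(self, edge, node):
        """Add the edge if it is new and return the node it leads to."""
        return self.edges.setdefault(edge, node)

def show(node, depth=0):
    """Return an indented text rendering of every edge below a node."""
    lines = []
    for edge, child in node.edges.items():
        lines.append("  " * depth + "--" + edge + "--> " + child.label)
        lines.extend(show(child, depth + 1))
    return lines

mary = Node("Mary")
# Mary went sledding with her family
sledding = mary.set("go", Node("sledding"))
sledding.set("with", Node("her family"))
# Mary won a race against her brother to the bottom of the hill
race = mary.set("win", Node("a race"))
race.set("against", Node("her brother"))
race.set("to", Node("the bottom of the hill"))

print("\n".join(show(mary)))
```

Because both sentences start at the same “Mary” node, facts about one subject share a prefix, which is where the similarity to a trie comes from.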

2. Answer questions using the structure.

A key observation to answering questions is that they can be reworded to be fill in the blanks. 
Where did Mary go -> Mary went _______
What did Mary win -> Mary won _______
Who did Mary win a race against? -> Mary won a race against _______
Why did Mary cheer -> Mary cheered because/since _______
We can use this observation to read out answers from our tree structure. We can parse the question, convert it to a fill-in-the-blank format and then follow its words through the tree to find the word that fills the blank.
Mary went _____
By following the tree, we see that we should put “sledding” in the blank.
Mary won _______
Mary won a race against ______
Mary won a race to ______
By following the tree, we see that Mary won “a race”, against “her brother”, to “the bottom”.
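The lookup described above can be sketched in a few lines of Python 3. This is a simplified illustration under my own assumptions: the graph is a hand-written nested dictionary, verbs are already in the present tense, and only edge labels (not the object words in between) are followed. The names GRAPH and fill_blank are hypothetical.

```python
# Hypothetical nested word graph: subject -> {edge: (object label, further edges)}
GRAPH = {
    "Mary": {
        "go": ("sledding", {"with": ("her family", {})}),
        "win": ("a race", {
            "against": ("her brother", {}),
            "to": ("the bottom of the hill", {}),
        }),
    }
}

def fill_blank(subject, edges):
    """Follow edge labels from the subject; the object reached fills the blank."""
    label, children = subject, GRAPH.get(subject, {})
    for edge in edges:
        if edge not in children:
            return None  # the graph has no such fact
        label, children = children[edge]
    return label

print(fill_blank("Mary", ["go"]))              # Mary went ____
print(fill_blank("Mary", ["win"]))             # Mary won ____
print(fill_blank("Mary", ["win", "against"]))  # Mary won a race against ____
print(fill_blank("Mary", ["win", "to"]))       # Mary won a race to ____
```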


I chose to implement this in Python since it is easy to use and has libraries to support natural language processing. There are three steps in my program: parsing, describing and answering. 
Parsing is converting a sentence to a structure that makes sense of the sentence structure.
Describing is reading in a sentence and adding the information to our tree structure.
Answering is reading in a question, changing the format and completing from our tree structure.


The first thing we have to do is parse the sentence to see the sentence structure and to determine which parts of a sentence are objects, verbs and prepositions. To do this, I used the Stanford parser, which works well enough for most cases. 
Example: the sentence “Mary went sledding with her family” becomes:
  (S
    (NP (NNP Mary))
    (VP
      (VBD went)
      (NP (NN sledding))
      (PP (IN with) (NP (PRP$ her) (NN family)))))
The top level tree S (declarative clause) has two children, NP (noun phrase) and VP (verb phrase). The NP consists of one child NNP (proper noun singular) which is “Mary”. The VP has three children: VBD (verb past tense) which is “went”, NP, and a PP (prepositional phrase). We can use the recursive structure of a parse tree to help us build our word graph.
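To get a feel for that recursive structure without installing the parser, here is a small Python 3 sketch that reads the bracketed notation into nested lists and walks it. The helpers read_tree and leaves are my own stand-ins, not part of NLTK or the Stanford parser, and the program itself works on the parser's tree objects instead.

```python
import re

def read_tree(text):
    """Parse a bracketed parse tree into nested lists: [label, child, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def parse():
        nonlocal pos
        pos += 1                          # consume "("
        node = [tokens[pos]]              # label, e.g. "NP"
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(parse())      # nested phrase
            else:
                node.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1                          # consume ")"
        return node

    return parse()

def leaves(node):
    """Collect the words under a node, left to right."""
    words = []
    for child in node[1:]:
        words.extend(leaves(child) if isinstance(child, list) else [child])
    return words

tree = read_tree("(S (NP (NNP Mary)) (VP (VBD went) (NP (NN sledding)) "
                 "(PP (IN with) (NP (PRP$ her) (NN family)))))")
subject, predicate = tree[1], tree[2]

print(tree[0], [child[0] for child in tree[1:]])            # S and its two children
print(predicate[0], [child[0] for child in predicate[1:]])  # VP and its three children
print(leaves(subject), leaves(predicate))
```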

A full reference for the parser’s tags can be found here.

I put the Stanford parser files in my working directory but you might want to change the location to where you put the files.

import os
from nltk.parse import stanford

os.environ['STANFORD_PARSER'] = '.'
os.environ['STANFORD_MODELS'] = '.'

parser = stanford.StanfordParser()

line = 'Mary went sledding with her family'
tree = list(parser.raw_parse(line))[0]


We can use the parse tree to build the word graph by doing it recursively. For each grammar rule, we need to describe how to build the word graph.

Our method looks like this:

# Returns edge, node 
def describe(parse_tree):


  if matches(parse_tree,'( S ( NP ) ( VP ) )'):

    np = parse_tree[0] # subject
    vp = parse_tree[1] # action

    _, subject = describe(np) # describe noun
    action, action_node = describe(vp) # recursively describe action

    subject.set(action, action_node) # create new edge labeled action to the action_node
    return action, action_node

We do this for each grammar rule to recursively build the word graph. When we see an NP (noun phrase), we treat it as an object and extract the words from it. When we see a preposition or verb, we attach it to the current node, and when we see another object, we use a dot ( . ) edge to indicate the object of the current node.

Currently, my program supports the following rules:

( S ( NP ) ( VP ) )
( S ( VP ) )
( NP )
( PP ( . ) ( NP ) )
( PRT )
( VP ( VBD ) ( VP ) $ )
( VP ( VB/VBD ) $ )
( VP ( TO ) ( VP ) )
( SBAR ( IN ) ( S ) )
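The post does not show how the matches helper works, so the following Python 3 sketch is only a guess at one way such a helper could be written. It assumes trees are nested lists of the form [label, child, ...], treats a pattern node without children as accepting anything below it, and reads a slash as “either label”. It deliberately does not handle the “$” and “.” markers used in some of the rules above, whose exact meaning I am not sure of.

```python
def parse_pattern(pattern):
    """Turn a rule such as '( S ( NP ) ( VP ) )' into nested lists."""
    tokens = pattern.split()
    pos = 0

    def parse():
        nonlocal pos
        pos += 1                  # consume "("
        node = [tokens[pos]]      # label, possibly "VB/VBD"
        pos += 1
        while tokens[pos] != ")":
            node.append(parse())
        pos += 1                  # consume ")"
        return node

    return parse()

def matches(tree, pattern):
    """True if the tree's labels line up with the pattern's labels."""
    if isinstance(pattern, str):
        pattern = parse_pattern(pattern)
    label, kids = pattern[0], pattern[1:]
    if not isinstance(tree, list) or tree[0] not in label.split("/"):
        return False
    if not kids:
        return True               # a bare label accepts any children
    children = tree[1:]
    return len(children) == len(kids) and all(
        matches(child, kid) for child, kid in zip(children, kids))

sentence = ["S",
            ["NP", ["NNP", "Mary"]],
            ["VP", ["VBD", "went"], ["NP", ["NN", "sledding"]]]]

print(matches(sentence, "( S ( NP ) ( VP ) )"))
print(matches(sentence, "( S ( VP ) )"))
print(matches(["VP", ["VB", "go"]], "( VP ( VB/VBD ) )"))
```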

For verbs, I used Nodebox (a linguistic library) for getting the present tense of a word so that the program knows different tenses of a word. E.g. “go” is the same word as “went”. 
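The effect of this normalization can be shown with a tiny Python 3 sketch. The lookup table below is a hypothetical stand-in for the linguistic library and covers only the verbs from the example passage; it is not Nodebox's API.

```python
# Hypothetical past-tense -> present-tense table (stand-in for a linguistic library)
PRESENT = {"went": "go", "won": "win", "cheered": "cheer"}

def present_tense(verb):
    """Return the present tense if known, otherwise the lowercased verb unchanged."""
    verb = verb.lower()
    return PRESENT.get(verb, verb)

# A statement ("Mary went ...") and a question ("Where did Mary go?")
# end up using the same edge label.
print(present_tense("went") == present_tense("go"))
```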


We can answer questions by converting the question to a “fill in the blank” and then following the words in the “fill in the blank” in the word graph to the answer. My program supports two types of fill in the blanks: from the end and from the beginning.

Type I: From the end

A from the end type of fill in the blank is a question like:

Where did Mary go?

Which converts to:

Mary went _______

And as you can see, the blank comes at the end of the sentence. We can fill in this blank by following each word in our structure to the answer. A sample of the code is below:

# Matches "Where did Mary go"
if matches(parse_tree, '( SBARQ ( WHADVP ) ( SQ ( VBD ) ( NP ) ( VP ) ) )'):

  tokens = get_tokens(parse_tree) # Get tokens from parse tree: Where, did, Mary, go

  subject = get_node(tokens[2]) # Get subject of sentence ("Mary")

  tokens = tokens[3:] # Skip "Where did Mary" to make fill in the blank

  return subject.complete(tokens) # Complete rest of tokens

The node completes by reading each token and following the corresponding edges. When we run out of tokens, we follow the first edge until we reach another object and return the edges followed and the object.

Simplified node.complete:

class Node:
  def complete(self, tokens, qtype):
    if len(tokens) == 0:
      # no tokens left
      if qtype == 'why':
        # special case
        return self.why()
      if self.isObject:
        # return object
        return self.label
      # follow first edge until we reach an object
      return self.first.label + self.first.complete(tokens, qtype)
    for edge, node in self:
      if edge == tokens[0]:
        # consume the matched token and match the rest
        return node.complete(tokens[1:], qtype)
    return "No answer"

We have to handle “Why” as a special case because we need to complete with “because” or “since” after there are no more tokens and we have to iterate backwards to the first object.

Type II: From the beginning

A from the beginning type is a question like:

Who went sledding?

Which converts to:

 ____ went sledding

As we can see, the blank is at the beginning of the sentence. My solution for this was to iterate through all possible objects and see which objects have tokens that match the rest of the fill in the blank.
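Here is a minimal Python 3 sketch of that search. It assumes a flattened version of the word graph in which every object maps to the token paths that start at it. The PATHS table, the who function and the extra “Sam went skiing” fact are invented for illustration.

```python
# Hypothetical flattened word graph: object -> token paths that start at it
PATHS = {
    "Mary": [["go", "sledding", "with", "her family"],
             ["win", "a race", "against", "her brother"]],
    "Sam": [["go", "skiing"]],  # made-up extra fact so the search has a choice
}

def who(tokens):
    """Return every object that has a path beginning with the given tokens."""
    return [obj for obj, paths in PATHS.items()
            if any(path[:len(tokens)] == tokens for path in paths)]

print(who(["go", "sledding"]))  # ____ went sledding
print(who(["go"]))              # ____ went
```

Checking every object is linear in the size of the graph, which is fine for a short passage but would need an index for longer texts.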

Further Steps

There is still a long way to go before an AI can perform reading comprehension at a human level. Below are some possible improvements and cases to handle to make the program better:

Grouped Objects

We need to be able to handle groups of objects, e.g. “Sarah and Sam walked to the beach” should be split into two individual sentences.

Pronoun Resolution

Currently, pronouns such as “he” and “she” are not supported. Basic resolution could be added by substituting the most recently mentioned object. However, resolution is not possible in every case because of ambiguities such as “Sam kicked Paul because he was stupid”, where “he” could refer to Sam or Paul.


If we have the sentence “Jack leaped over the fence”, the program will not be able to answer “What did Jack jump over?” since it treats “jump” as a different word than “leap”. However, we can solve this problem by asking the same question for every synonym of the verb and seeing if any of them produces an answer.
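A rough Python 3 sketch of the synonym idea follows. Everything in it is hypothetical: the SYNONYMS table stands in for a real thesaurus, FACTS stands in for the word graph, and answer is a made-up helper.

```python
# Hypothetical synonym table and fact store, for illustration only
SYNONYMS = {"jump": ["leap", "hop"]}
FACTS = {("Jack", "leap", "over"): "the fence"}

def answer(subject, verb, preposition):
    """Try the verb itself first, then each synonym, until a fact is found."""
    for candidate in [verb] + SYNONYMS.get(verb, []):
        result = FACTS.get((subject, candidate, preposition))
        if result is not None:
            return result
    return None

print(answer("Jack", "jump", "over"))  # What did Jack jump over?
```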

Augmented Information

If we have the sentence “Jack threw the football to Sam”, the program would not be able to answer “Who caught the football”. We can add information such as “Sam caught the football from Jack” which we can infer from the original sentence.


Sometimes objects can have different names, e.g. “James’s dog is called Spot”, and the program should know that “James’s dog” and “Spot” refer to the same object. We can do this by adding a special rule for words such as “called”, “named”, “also known as”, etc.


There are probably other quirks of language that need to be handled and perhaps instead of explicitly handling all these cases, we should come up with a machine learning model that can read many passages and be able to construct a structure of the content as well as to augment any additional information.


Tutorial: Getting Started with Machine Learning with the SciPy stack

There are many machine learning libraries out there, but I heard that SciPy was good so I decided to try it out. We will walk through a simple k-means clustering example:

Full Source Here

Sample Data Here

SciPy Stack

The contents of the SciPy stack are:

Python: Powerful scripting language
Numpy: Python package for numerical computing
SciPy: Python package for scientific computing
Matplotlib: Python package for plotting
IPython: Interactive Python shell
Pandas: Python package for data analysis
SymPy: Python package for symbolic mathematics (computer algebra)
Nose: Python package for unit tests


I will go through my Mac installation but if you are using another OS, you can find the installation instructions for SciPy on: http://www.scipy.org/install.html.

You should have Python 2.7.

Mac Installation

I am using a Mac on OS X 10.8.5 and used MacPorts to set up the SciPy stack on my machine.

Install MacPorts if you haven’t already: http://www.macports.org/

Otherwise open Terminal and run: ‘sudo port selfupdate’

Next in your Terminal run: ‘sudo port install py27-numpy py27-scipy py27-matplotlib py27-ipython +notebook py27-pandas py27-sympy py27-nose’

Run the following in terminal to select package versions.

sudo port select --set python python27
sudo port select --set ipython ipython27

Hello World

IPython allows you to create interactive python notebooks in your browser. We will get started by creating a simple hello world notebook.
Create a new directory where you want your notebooks to be placed in.
In your directory, run in terminal:
ipython notebook

This should open your browser to the IPython notebook web interface. If it does not open, point your browser to http://localhost:8888.

 Click New -> Notebooks -> Python 2

This should open a new tab with a newly created notebook.

Click Untitled at the top, rename the notebook to Hello World and press OK.

In the first line, change the line format from Code to Markdown and type in:

# Hello World Code

And click run (the black triangle that looks like a play button)

On the next line, in code, type:

print 'Hello World'

and press run.

K Means Clustering Seed Example

Suppose we are doing a study on a wheat farm to determine how much of each kind of wheat is in the field. We collect a random sample of seeds from the field and measure different attributes such as area, perimeter, length, width, etc. Using these attributes, we can use k-means clustering to classify seeds into different types and determine the percentage of each type.

Sample data can be found here: http://archive.ics.uci.edu/ml/datasets/seeds

The sample data contains data that comes from real measurements. The attributes are:

1. area A, 
2. perimeter P, 
3. compactness C = 4*pi*A/P^2, 
4. length of kernel, 
5. width of kernel, 
6. asymmetry coefficient 
7. length of kernel groove. 

Example: 15.26, 14.84, 0.871, 5.763, 3.312, 2.221, 5.22, 1
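As a quick sanity check on the compactness formula, the sketch below (Python 3, standard library only) recomputes the third value of the example row from its first two. The function name compactness is my own.

```python
import math

def compactness(area, perimeter):
    """C = 4*pi*A / P^2, which equals 1 for a perfect circle."""
    return 4 * math.pi * area / perimeter ** 2

c = compactness(15.26, 14.84)  # area and perimeter from the example row
print(round(c, 3))             # compare with the third value of the row, 0.871

# A circle of radius 1 has area pi and perimeter 2*pi.
print(compactness(math.pi, 2 * math.pi))
```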

Download the file into the same folder as your notebook.


Create a new notebook and name it whatever you want. We can put all the code into one cell.

First, we need to parse the data so that we can run k-means on it. We open the file using a csv reader and convert each cell to a float. We will skip rows that contain missing data.

Sample row:

['15.26', '14.84', '0.871', '5.763', '3.312', '2.221', '5.22', '1']
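import numpy as np

# Read data (bank_csv is a csv.reader over the downloaded data file)
data = []
for row in bank_csv:
    missing = False
    float_arr = []
    for cell in row:
        if not cell:
            missing = True
            break
        # Convert each cell to float
        float_arr.append(float(cell))
    # Take row if row is not missing data
    if not missing:
        data.append(float_arr)
data = np.array(data)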

Next, we normalize the features for the k-means algorithm. Since SciPy implements the k-means clustering algorithm for us, all the hard work is done.

from scipy.cluster import vq

# Normalize vectors
whitened = vq.whiten(data)

# Perform k means on all features to classify into 3 groups
centroids, _ = vq.kmeans(whitened, 3)
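To see what the normalization step does, here is a rough pure-Python sketch of the same idea: each column is divided by its standard deviation so that every feature ends up with unit variance and no single feature dominates the distance calculation. The helper whiten_rows and the toy values are made up for illustration, and the sketch is written for Python 3. In the notebook you would keep using vq.whiten.

```python
from statistics import pstdev

def whiten_rows(rows):
    """Divide every column by its population standard deviation."""
    scales = [pstdev(column) for column in zip(*rows)]
    return [[value / scale for value, scale in zip(row, scales)] for row in rows]

# Toy data: the second feature is on a much larger scale than the first.
rows = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]
scaled = whiten_rows(rows)

for column in zip(*scaled):
    print(round(pstdev(column), 6))  # spread of each column after scaling
```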

We then classify each data point by distance to centroid:

# Classify data by distance to centroids
cls, _ = vq.vq(whitened, centroids)
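The classification step, and the percentage of each type that the study is after, can be sketched in plain Python 3 as follows. The function names and the toy points are my own. With the real data you would count the values in cls instead.

```python
from collections import Counter

def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean distance)."""
    def sq_dist(k):
        return sum((a - b) ** 2 for a, b in zip(point, centroids[k]))
    return min(range(len(centroids)), key=sq_dist)

def percentages(labels):
    """Share of points assigned to each cluster, in percent."""
    counts = Counter(labels)
    return {k: 100.0 * n / len(labels) for k, n in sorted(counts.items())}

# Toy points and centroids, not taken from the seeds data
points = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]

labels = [nearest(p, centroids) for p in points]
shares = percentages(labels)
print(labels, shares)
```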

Finally, we can graph the classifications of the data points by the first two features. There are seven features in total, but plotting all of them at once would be hard to visualize. You can graph other pairs of features for similar visualizations.

import matplotlib.pyplot as plt

# Plot first two features (area vs perimeter in this case)
plt.plot(data[cls==0,0], data[cls==0,1],'ob',
        data[cls==1,0], data[cls==1,1],'or',
        data[cls==2,0], data[cls==2,1],'og')

Note: to show the plot inline in the cell, we put ‘%matplotlib inline’ at the beginning of the cell.
