Monday, May 14, 2012

The Quest for the Proper Proper Subset Classification

Switch to SVM 

We’ve switched from logistic regression with HOG features to Support Vector Machines using a variant of SIFT descriptors. 

Our code is a slightly modified version of this vl_feat example code.

Here is a description of what the code does, as far as I currently understand it:

The code first trains a bag of visual words "vocabulary" based on PHOW descriptors (a variant of dense SIFT, implemented in vl_feat and described in this paper: Image Classification using Random Forests and Ferns). The "vocabulary" is created by running k-means on the PHOW descriptors.

The vocabulary comes from a subset of the training data; afterwards, all of the training data is given a feature description based on its PHOW descriptors quantized against the vocabulary.
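
Roughly, the vocabulary step looks like this (a minimal MATLAB sketch, not our actual code; names like trainImages and numWords are mine, and the PHOW options shown are just illustrative):

    % Collect PHOW (dense multi-scale SIFT) descriptors from training images.
    % trainImages is assumed to be a cell array of grayscale images.
    descrs = {} ;
    for i = 1:numel(trainImages)
      im = im2single(trainImages{i}) ;
      [~, d] = vl_phow(im, 'Step', 3, 'Sizes', [4 6 8 10]) ;
      descrs{end+1} = d ;
    end

    % Subsample the descriptors and cluster them into a visual-word vocabulary.
    descrs = single(vl_colsubset(cat(2, descrs{:}), 1e5)) ;
    numWords = 600 ;
    vocab = vl_kmeans(descrs, numWords, 'Algorithm', 'Elkan') ;

    % Each image is then described by a histogram over the vocabulary.
    kdtree = vl_kdtreebuild(vocab) ;
    [~, d] = vl_phow(im2single(trainImages{1}), 'Step', 3, 'Sizes', [4 6 8 10]) ;
    words = double(vl_kdtreequery(kdtree, vocab, single(d))) ;
    hist = vl_binsum(zeros(numWords, 1), 1, words) ;
    hist = single(hist / sum(hist)) ;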

An explicit feature map approximating the chi-squared kernel is then computed, so that a fast linear SVM can be used in place of a much slower kernel SVM.
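
In vl_feat this is done with a homogeneous kernel map, which expands each histogram so that a plain linear SVM approximates a chi-squared-kernel SVM. A sketch, assuming hists is the numWords x numImages matrix of histograms from above:

    % Explicit feature map for the chi2 kernel. Order n = 1 expands each
    % dimension into 2n+1 = 3, so psix is (3 * numWords) x numImages.
    psix = vl_homkermap(hists, 1, 'kernel', 'kchi2', 'gamma', .5) ;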


Classification is done one-vs-all. I'm thinking about altering this to train a set of binary classifiers and then combine them with a decision tree or a voting scheme. I also need to make the output soft (per-class scores rather than a single label) for the parser.
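
For reference, one-vs-all training with soft score output could look like the following. This is a sketch, not our actual code: it uses vl_svmtrain from current vl_feat (the example we started from has its own solver), and labels, psixTest, and the lambda value here are assumptions:

    % Train one binary linear SVM per class on the mapped features.
    numClasses = 17 ;
    lambda = 1 / (10 * size(psix, 2)) ;          % regularizer; needs tuning
    W = zeros(size(psix, 1), numClasses) ;
    B = zeros(1, numClasses) ;
    for c = 1:numClasses
      y = 2 * (labels == c) - 1 ;                % +1 for class c, -1 otherwise
      [W(:, c), B(c)] = vl_svmtrain(psix, y, lambda) ;
    end

    % Soft output: a full score vector per test sample, not just the argmax.
    scores = bsxfun(@plus, W' * psixTest, B') ;  % numClasses x numTest
    [~, predictions] = max(scores, [], 1) ;      % hard labels, if needed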

This is a good tutorial on SVMs.





Experimental Results


Methodology: We have 300 samples for each of the 17 classes. I wrote a Python script that randomly selects 20 examples of each class as "holdout" data. I haven't yet modified the code to do cross-validation.
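
The script itself is in Python and not shown here; the same per-class split in MATLAB would look something like this (a sketch, assuming labels is a 1 x 5100 vector of class indices):

    % Mark 20 random samples per class as holdout; the rest are train/test.
    holdout = false(size(labels)) ;
    for c = 1:17
      idx = find(labels == c) ;
      pick = idx(randperm(numel(idx), 20)) ;   % 20 random examples of class c
      holdout(pick) = true ;
    end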

The 20 holdout samples per class were used to compose various formulas in first-order logic (the definition of a function and the definition of a proper subset). When segmentation and extraction get better, new novel data will be used instead of these holdout samples to test the effectiveness of the system.

At first I used 50 random samples per class as training data and the remaining 230 as test data. I observed that most of the errors were due to symbols being misclassified as negation.

Observing the (score, class) output of the one-vs-all SVM classifier, I saw that for these "negation errors" the correct label was usually the second most likely.


Here is an example of the scores for a "z" that was mistaken for a "negation":

     score   class
   -0.2888   11 (neg)
   -0.3795   17 (z)
   -0.4663    9 (left paren)
   -0.4823    7 (if then)
   -0.5354   12 (or)
   -0.5367   16 (y)
   -0.5602   10 (ne)
   -0.5620   13 (right paren)
   -0.5640   14 (subset)
   -0.5786    4 (elem)
   -0.6220   15 (x)
   -0.6455    2 (R)
   -0.6618    5 (exist)
   -0.6747    3 (and)
   -0.6786    1 (F)
   -0.7181    6 (for all)
   -0.7594    8 (iff)




I decided to create alternate samples that use a tilde (~) instead of the hook-like symbol (¬) for negation, to see how that would affect the results.

In the end I trained 4 models and tested each of them on 2 different formulas:

Models:
1) 50 train
2) 230 train
3) 50 train (tilde for negation)
4) 230 train (tilde for negation)

Formulas:
1) Function definition

Correct classification for function definition 



2) Proper Subset definition


Proper classification, definition of proper subset




Parameters that may need to be tweaked:


There are a few parameters in the SVM code. I need to do more testing to see how tweaking these will affect classification results.

Number of words: This is the size of the "vocabulary" created by k-means. It was originally set to 300, but I increased it to 600. I'm not sure yet if this helped or hurt; more testing is needed.

K-means algorithm: The vl_feat implementation of k-means can use the standard Lloyd's algorithm or UCSD professor Elkan's accelerated k-means algorithm. Current results use Elkan's algorithm. Since Elkan's method is an exact acceleration (it uses the triangle inequality to skip distance computations), it should match Lloyd's given the same initialization, but I haven't verified this on our (relatively) small dataset.
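
One way to check is to seed vl_feat's random number generator and compare the energies of the two algorithms directly. A sketch, assuming vl_kmeans draws its random initialization from the vl_twister state:

    % With identical initialization, Elkan's exact acceleration should
    % reach the same clustering energy as plain Lloyd iterations.
    vl_twister('state', 0) ;
    [~, ~, eLloyd] = vl_kmeans(descrs, numWords, 'Algorithm', 'Lloyd') ;
    vl_twister('state', 0) ;
    [~, ~, eElkan] = vl_kmeans(descrs, numWords, 'Algorithm', 'Elkan') ;
    fprintf('Lloyd: %.4f  Elkan: %.4f\n', eLloyd, eElkan) ;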

Kernel: The feature map is computed using the vl_homkermap function. This function has a few parameters (the approximation order, the kernel type, and gamma), and other kernel maps could potentially be swapped in for the chi-squared one, as sketched below.
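
For the record, the swappable pieces look like this (a sketch; the kernel names and the order parameter N are from the vl_homkermap documentation):

    % Same data, different kernel maps / orders; each could replace kchi2.
    psiChi2   = vl_homkermap(hists, 1, 'kernel', 'kchi2') ;
    psiInters = vl_homkermap(hists, 1, 'kernel', 'kinters') ;  % intersection
    psiJS     = vl_homkermap(hists, 2, 'kernel', 'kjs') ;      % Jensen-Shannon
    % Output dimension is (2N+1) x input dimension, so the order N trades
    % accuracy of the approximation against feature size.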



Confusion Matrices




"Hook" negation:
50 training examples, 230 test




230 training, 50 test




Tilde negation:


50 train, 230 test, tilde for negation



230 train, 50 test, tilde for negation




Parsing Results for Holdout Data Formulas






Function Formula (7 symbols from holdout):


50 training (0 errors!)



230 training (0 errors!)


50 training with tilde for negation (2 errors)



230 training with tilde for negation (1 error, left paren for x)




Proper Subset Formula (30 symbols from holdout):


50 training (7 errors)





230 train (8 errors)







50 training with tilde for negation (4 errors)
230 training with tilde for negation (4 errors)


Current Goals:






Classification seems "good enough", so the next steps are:

1) Implement the graph structure / context-free grammar for parsing
2) Improve segmentation and extraction
3) Expand the data to include numbers

