Switch to SVM
We’ve switched from logistic regression with HOG features to Support Vector Machines using a variant of SIFT descriptors.
Our code is a slightly modified version of this vl_feat example code.
Here is a description of what the code does, as far as I currently understand it:
The code first trains a bag-of-visual-words "vocabulary" from PHOW descriptors (a variant of dense SIFT, implemented in vl_feat and described in this paper: Image Classification using Random Forests and Ferns). The "vocabulary" is created by running k-means on the PHOW descriptors.
The vocabulary is built from a subset of the training data; afterwards, every training image is encoded as a histogram of visual words computed from its PHOW descriptors and the vocabulary.
An explicit feature map approximating the chi-squared kernel is then applied to these histograms, so that a linear SVM on the mapped features behaves like a chi-squared-kernel SVM.
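To make those steps concrete, here is a rough sketch of the pipeline using VLFeat's MATLAB interface. This is not our actual modified code: variable names like trainImages are placeholders, the parameter values are just the example's defaults, and the exact vl_homkermap option syntax depends on the VLFeat version.

% Sketch of the bag-of-visual-words pipeline (assumes vl_setup has been run).
numWords = 600 ;                       % vocabulary size

% 1) Collect PHOW (dense, multi-scale SIFT) descriptors from a subset of training images.
descrs = {} ;
for i = 1:length(trainImages)
  im = im2single(imread(trainImages{i})) ;
  [~, descrs{i}] = vl_phow(im, 'Step', 3, 'Sizes', [4 6 8 10]) ;
end
descrs = single(cat(2, descrs{:})) ;

% 2) Build the visual-word "vocabulary" with k-means.
vocab = vl_kmeans(descrs, numWords, 'Algorithm', 'Elkan') ;

% 3) Encode an image as a normalized histogram of visual words.
im = im2single(imread(trainImages{1})) ;
[~, d] = vl_phow(im, 'Step', 3, 'Sizes', [4 6 8 10]) ;
[~, words] = min(vl_alldist2(vocab, single(d)), [], 1) ;   % nearest word per descriptor
h = vl_binsum(zeros(numWords, 1), ones(size(words')), words') ;
h = single(h / sum(h)) ;

% 4) Approximate the chi-squared kernel with an explicit feature map, so that a
%    linear SVM on psi acts like a chi-squared-kernel SVM on h.
psi = vl_homkermap(h, 1, 'kchi2', 'gamma', 0.5) ;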
The classification is done with one-vs-all classifiers. I'm thinking about possibly changing this to training a set of binary classifiers and then combining them with a decision tree or a voting scheme. I also need to make the output soft (class scores rather than a single label) for the parser.
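Here is a minimal sketch of the one-vs-all training and of a "soft" output (the full score vector for each symbol, which the parser could use instead of a single hard label). It assumes the kernel-mapped training features are the columns of psix with labels y in 1..17, and that VLFeat's vl_svmtrain is the linear solver; the real code may differ.

numClasses = 17 ;
numTrain = size(psix, 2) ;
lambda = 1 / (10 * numTrain) ;                  % SVM regularization (placeholder value)

W = zeros(size(psix, 1), numClasses) ;
B = zeros(1, numClasses) ;
for c = 1:numClasses
  yc = 2 * (y == c) - 1 ;                       % +1 for class c, -1 for everything else
  [W(:, c), B(c)] = vl_svmtrain(psix, yc, lambda) ;
end

% Soft output: all 17 scores per test sample (columns of psiTest), sorted.
scores = bsxfun(@plus, W' * psiTest, B') ;      % numClasses x numTest
[sortedScores, ranking] = sort(scores, 'descend') ;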
This is a good tutorial for SVMs.
Experimental Results
Methodology: We have 300 samples for each of the 17 classes. I wrote a Python script that randomly selects 20 examples of each class as "holdout" data. I haven't modified the code yet to do cross-validation.
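The split itself is straightforward; in MATLAB terms (the actual script is in Python, so this is just an equivalent sketch, with labels a row vector of class ids for all samples) it amounts to:

holdout = [] ;
for c = 1:17
  idx = find(labels == c) ;                   % all 300 samples of class c
  idx = idx(randperm(length(idx))) ;
  holdout = [holdout, idx(1:20)] ;            % 20 random holdout samples per class
end
pool = setdiff(1:length(labels), holdout) ;   % remaining 280 per class for train/test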
The 20 holdout samples per class were used to build various formulas in first-order logic (the definition of a function and the definition of a proper subset). Once the segmentation and extraction get better, genuinely new data will be used instead of these holdout samples for testing the effectiveness of the system.
At first I used 50 random samples per class for training and the remaining 230 per class as test data. I observed that most of the errors were due to symbols being classified as negation.
Looking at the (score, class) output of the one-vs-all SVM classifier, I saw that for these "negation errors" the correct label was usually the second most likely.
Here is an example of the scores for a "z" that was mistaken for a "negation":
scores class
-0.2888 11 (neg)
-0.3795 17 (z)
-0.4663 9 (left paren)
-0.4823 7 (if then)
-0.5354 12 (or)
-0.5367 16 (y)
-0.5602 10 (ne)
-0.5620 13 (right paren)
-0.5640 14 (subset)
-0.5786 4 (elem)
-0.6220 15 (x)
-0.6455 2 (R)
-0.6618 5 (exist)
-0.6747 3 (and)
-0.6786 1 (F)
-0.7181 6 (for all)
-0.7594 8 (iff)
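To check that observation across all the errors rather than just this one example, something like the following (reusing the scores matrix from the classification sketch above, with yTest the true test labels as a row vector and class 11 = neg) finds where the true label ranks:

[~, ranking] = sort(scores, 'descend') ;                 % numClasses x numTest
predicted = ranking(1, :) ;
negErrors = find(predicted == 11 & yTest ~= 11) ;        % misclassified as "neg"
rankOfTruth = arrayfun(@(i) find(ranking(:, i) == yTest(i)), negErrors) ;
% For most of these, rankOfTruth is 2: the correct class is the runner-up.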
I decided to create alternate samples that used a tilde instead of the hook-like symbol for negation, to see how that would affect the results.
In the end I trained 4 models and tested each of them on 2 different formulas:
Models:
1) 50 train
2) 230 train
3) 50 train (tilde for negation)
4) 230 train (tilde for negation)
Formulas:
1) Function definition
(Figure: correct classification for the function definition)
2) Proper subset definition
(Figure: correct classification for the definition of proper subset)
PARAMS that may need to be tweaked:
There are a few parameters in the SVM code. I need to do more testing to see how tweaking these affects the classification results.
Number of words: This is the size of the "vocabulary" created by k-means. It was originally set to 300, but I increased it to 600. I'm not sure yet whether this helped or hurt; more testing is needed.
K-means algorithm: The vl_feat implementation of k-means can use either the standard Lloyd's algorithm or UCSD professor Elkan's accelerated k-means algorithm. The current results use Elkan's algorithm, but I'm not sure whether the sped-up version gives the same results on our (relatively) small dataset.
Kernel: The feature map is computed with the vl_homkermap function. This function has a few parameters, and other kernel maps could potentially be swapped in for it.
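For reference, all three of these knobs live in just a couple of calls (values shown are the current settings, not tuned ones; the vl_homkermap arguments follow the VLFeat example we started from):

numWords = 600 ;                                              % vocabulary size (was 300)
vocab = vl_kmeans(descrs, numWords, 'Algorithm', 'Elkan') ;   % or 'Lloyd'
psix  = vl_homkermap(hists, 1, 'kchi2', 'gamma', 0.5) ;       % order-1 chi-squared map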
Confusion Matrices
"Hook" negation:
50 training examples, 230 test 230 training, 50 test Tilde negation:
230 train, 50 test, tilde for negation Parsing Results for Holdout Data Formulas |
50 training with tilde for negation (4 errors) |
230 training with tilde for negation (4 errors) |
Current Goals:
Classification seems "good enough"; the next things to do are:
1) Implementing the graph structure/Context Free Grammar for parsing
2) Improving segmentation and extraction
3) Expanding the data to include numbers.