First, this one with preprocessing / dataset stuff.
Next, classifier stuff.
Then, localization/bounding box stuff.
Finally, a brief TODO for the week and notes on version control.
Splitting things up this way should make the posts easier to read now and easier to find by topic later.
We have begun to play with our toy dataset!
The first task was removing the black grid lines so our samples wouldn’t have a bunch of extraneous noise. Oren wrote a script that did a fantastic job of erasing these lines:
Here is an example of an original dataset image:
The result of the preprocessing script:
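For readers curious how this kind of cleanup can work, here is a minimal sketch of one possible approach. It is not Oren's actual script. It assumes the image is grayscale, that the grid lines run straight across the full width or height of the image, and that they are much darker than the paper. The function name `remove_grid_lines` and all of its parameters and defaults are hypothetical, chosen only for illustration.

```python
def remove_grid_lines(image, dark_threshold=128, line_fraction=0.8, background=255):
    """Return a copy of `image` with mostly-dark rows and columns whitened.

    `image` is a list of rows of grayscale values (0 = black, 255 = white).
    All parameter names and defaults are assumptions for this sketch.
    """
    height, width = len(image), len(image[0])

    # A row or column counts as a grid line when at least `line_fraction`
    # of its pixels are darker than `dark_threshold`.
    line_rows = {
        r for r in range(height)
        if sum(p < dark_threshold for p in image[r]) >= line_fraction * width
    }
    line_cols = {
        c for c in range(width)
        if sum(image[r][c] < dark_threshold for r in range(height))
        >= line_fraction * height
    }

    # Paint every pixel on a detected line with the background color.
    return [
        [background if (r in line_rows or c in line_cols) else image[r][c]
         for c in range(width)]
        for r in range(height)
    ]


# Tiny synthetic example: a 6x6 white image with one grid row, one grid
# column, and a single dark "ink" pixel that should survive.
sample = [[255] * 6 for _ in range(6)]
sample[2] = [0] * 6            # horizontal grid line
for row in sample:
    row[4] = 0                 # vertical grid line
sample[0][1] = 0               # handwriting pixel
cleaned = remove_grid_lines(sample)
```

One limitation of this simple version: any ink that sits directly on a detected grid row or column is erased along with the line, so strokes that cross the grid end up with small gaps.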
We would like to thank Jeanne Wang for bringing the inftyMDB dataset to our attention. We have downloaded it, and are currently trying to figure out how to use it.
The inftyMDB dataset can be found here
In our proposal we planned to use Mechanical Turk to generate a dataset by the end of the first week. However, I feel we need a few more iterations of toy datasets, and some experience with the inftyMDB dataset, before we generate a final dataset.
Current dataset goals include:
Figure out how to use inftyMDB
Possibly make toydataset2
Finalize the planned range of math symbols for the dataset