
GBs and TBs of data are common, but not for this task. All you are doing is Word Sense Disambiguation (WSD), and there are WSD algorithms that work with much, much smaller training sets. I just don't think the exponential increase in training data is justified...
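For a concrete sense of how little data a classic WSD algorithm needs: the simplified Lesk method disambiguates a word by picking the sense whose dictionary gloss overlaps most with the surrounding sentence, no training corpus at all. A minimal sketch (the tiny "bank" sense inventory and glosses below are made-up illustrations, not from a real lexicon):

```python
def simplified_lesk(context_words, senses):
    """Pick the sense whose gloss shares the most words with the context.

    context_words: list of words around the ambiguous word.
    senses: dict mapping sense name -> gloss string.
    """
    ctx = {w.lower() for w in context_words}
    best, best_overlap = None, -1
    for sense, gloss in senses.items():
        overlap = len(ctx & set(gloss.lower().split()))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best

# Hypothetical two-sense inventory for "bank".
bank_senses = {
    "bank/finance": "a financial institution that accepts deposits and loans money",
    "bank/river": "sloping land beside a body of water such as a river",
}

sentence = "the bank accepts deposits and loans money at low rates".split()
print(simplified_lesk(sentence, bank_senses))  # → bank/finance
```

Real systems use a full sense inventory like WordNet and smarter overlap measures, but the point stands: the problem is tractable without web-scale data.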

