Tom Mitchell
Never-Ending Learning to Read the Web
Machine Learning
August 2013: "Never-Ending Learning to Read the Web." Presented by Tom M. Mitchell, Founder and Chair of Carnegie Mellon University's Machine Learning Department; moderated by Yolanda Gil, Chair of ACM SIGART.

One of the great technical challenges in big data is to construct computer systems that learn continuously over years, from a continuing stream of diverse data, improving their competence at a variety of tasks, and becoming better learners over time.

This webinar describes Carnegie Mellon University's research to build a Never-Ending Language Learner (NELL) that runs 24 hours per day, forever, learning to read the web. Each day NELL extracts (reads) more facts from the web, and integrates these into its growing knowledge base of beliefs. Each day NELL also learns to read better than yesterday, enabling it to go back to the text it read yesterday, and extract more facts, more accurately, today.

NELL has been running 24 hours/day for over three years now. The result so far is a collection of 50 million interconnected beliefs (e.g., servedWith(coffee, applePie), isA(applePie, bakedGood)), which NELL holds at varying levels of confidence, along with hundreds of thousands of learned phrasings, morphological features, and web page structures that NELL has learned to use to extract beliefs from the web. Track NELL's progress at http://rtw.ml.cmu.edu.
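The read-and-learn cycle described above can be sketched in a few lines. This is only an illustrative model, not NELL's actual architecture: the class and method names (Belief, KnowledgeBase, daily_cycle, extractor.extract, extractor.retrain) are hypothetical, and the confidence handling is a deliberately simple stand-in for NELL's probabilistic inference.

```python
# Hypothetical sketch of a never-ending learning loop in the spirit of NELL.
# Names and logic here are illustrative assumptions, not NELL's real API.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Belief:
    relation: str          # e.g. "isA" or "servedWith"
    args: tuple            # e.g. ("applePie", "bakedGood")
    confidence: float      # system's current confidence in this belief

@dataclass
class KnowledgeBase:
    beliefs: dict = field(default_factory=dict)

    def integrate(self, belief: Belief) -> None:
        """Merge a newly extracted belief into the knowledge base,
        keeping the higher-confidence version of a repeated fact."""
        key = (belief.relation, belief.args)
        prior = self.beliefs.get(key)
        if prior is None or belief.confidence > prior.confidence:
            self.beliefs[key] = belief

def daily_cycle(kb: KnowledgeBase, corpus, extractor) -> None:
    # 1. Read: extract candidate beliefs from today's (and yesterday's) text.
    for belief in extractor.extract(corpus):
        kb.integrate(belief)
    # 2. Learn: retrain the extractor against the grown knowledge base,
    #    so tomorrow's pass over the same text finds more facts, more accurately.
    extractor.retrain(kb)
```

Re-running `daily_cycle` with an extractor that improves after each `retrain` captures the key property in the abstract: revisiting the same text later yields more, and more accurate, beliefs.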