Information Sciences
Vol. 329, 2016, Pages: 125–143

Generalization of parse trees for iterative taxonomy learning

Boris A. Galitsky

eBay Inc., San Jose, CA 95125, USA.

Abstract

We build a taxonomy of entities intended to improve the relevance of a search engine in a vertical domain. The taxonomy construction process starts from seed entities and mines the web for new entities associated with them. To form these new entities, machine learning of syntactic parse trees (their generalization) is applied to the search results for existing entities to find commonalities between them. These commonality expressions then form parameters of existing entities and are turned into new entities at the next learning iteration.
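The iterative loop described above can be illustrated with a minimal sketch. It is not the paper's implementation: real parse-tree generalization is replaced here by a much cruder stand-in (the longest common word subsequence of two snippets), and the names `generalize`, `learn_taxonomy`, `toy_search`, `TOY_RESULTS`, and the `stop_words` parameter are all hypothetical, as is the canned search data that stands in for web mining.

```python
from itertools import combinations


def generalize(a, b):
    """Return the longest common word subsequence of two snippets.

    Assumption: this is a crude stand-in for generalizing two syntactic
    parse trees; it only looks at word order, not at syntax.
    """
    x, y = a.lower().split(), b.lower().split()
    # dp[i][j] holds the longest common subsequence of x[:i] and y[:j].
    dp = [[[] for _ in range(len(y) + 1)] for _ in range(len(x) + 1)]
    for i in range(len(x)):
        for j in range(len(y)):
            if x[i] == y[j]:
                dp[i + 1][j + 1] = dp[i][j] + [x[i]]
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j], key=len)
    return dp[-1][-1]


def learn_taxonomy(seeds, search, iterations=2, stop_words=frozenset()):
    """Grow a taxonomy from seed entities.

    Each entity is a tuple of words (a path from the seed). At every
    iteration, the search results for an entity are generalized pairwise;
    words shared between results become parameters of that entity and are
    expanded as new entities at the next iteration.
    """
    taxonomy = {}
    frontier = [tuple(seed.lower().split()) for seed in seeds]
    for _ in range(iterations):
        next_frontier = []
        for path in frontier:
            results = search(" ".join(path))
            children = set()
            for first, second in combinations(results, 2):
                for word in generalize(first, second):
                    if word not in path and word not in stop_words:
                        children.add(word)
            taxonomy[path] = children
            next_frontier.extend(path + (child,) for child in sorted(children))
        frontier = next_frontier
    return taxonomy


# Hypothetical canned search results standing in for a web search API.
TOY_RESULTS = {
    "tax": [
        "how to file a tax deduction",
        "claim a tax deduction for home office",
    ],
    "tax deduction": [
        "tax deduction for business travel",
        "tax deduction on business meals",
    ],
}


def toy_search(query):
    """Look up canned snippets for a query (an assumption for this sketch)."""
    return TOY_RESULTS.get(query, [])


if __name__ == "__main__":
    learned = learn_taxonomy(["tax"], toy_search, iterations=2, stop_words={"a"})
    for entity, parameters in learned.items():
        print(" ".join(entity), "->", sorted(parameters))
```

With the toy data, the seed "tax" acquires the parameter "deduction" in the first iteration, and the new entity "tax deduction" acquires "business" in the second. A real system would generalize parse trees rather than word sequences, so that shared phrases and their syntactic roles, not only shared words, are retained.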

Taxonomy and paragraph-level syntactic generalization are applied to relevance improvement in search and text similarity assessment. We evaluate the search relevance improvement in vertical and horizontal domains and observe a significant contribution of the learned taxonomy in the former, and a noticeable contribution of a hybrid system in the latter. We also perform an industrial evaluation of taxonomy and syntactic generalization-based text relevance assessment and conclude that the proposed algorithm for automated taxonomy learning is suitable for integration into industrial systems. The proposed algorithm is implemented as part of the Apache OpenNLP.Similarity project.

Keywords: Learning taxonomy; Web mining; Learning constituency parse tree; Search relevance.

 