Version 2.1.0.0 CRISP

Abstract

Grant Number: 5R44LM006520-03
PI Name: MARCHISIO, GIOVANNI
PI Email: giovanni@statsci.com
PI Title:
Project Title: Bayesian Textual and Multimedia Information Retrieval

Abstract: DESCRIPTION (provided by applicant): Industry transformation and rapid advancements are creating tremendous amounts of electronic multimedia information. We are developing a multimedia search agent for health data networks and data banks. The search agent employs a generalized probabilistic model that bridges the gap between automatic feature extraction and semantic understanding. What distinguishes our approach is the integration of Bayesian methodology with a fast and scalable semantic interpreter. The semantic interpreter can generate a bootstrap database of prior probabilities, overcoming a major weakness of the traditional probabilistic model. We also provide a principled approach to user feedback, in contrast to existing ad hoc methods, whether probabilistic or nonprobabilistic. Relevance feedback on an initial database of prior probabilities can incrementally improve retrieval results to unprecedented levels of precision/recall. The model supports interactive definition and training of new semantic labels in a collaborative environment. Semantic labels organize index term dependencies in a tree-like structure with probabilities at each node, and allow the user to define concepts that match specific information needs more closely than the raw feature information found in an indexed database. Finally, we propose a comprehensive approach to the difficult problem of combining probability distributions, or relevance judgements, from different search engines.

PROPOSED COMMERCIAL APPLICATIONS: The proposed methodology adds a customizable interpretive layer to electronic collections and archival multimedia databases. The Phase I prototype has opened an immediate partnership opportunity in this area. The Bayesian search and retrieval functions will become an add-on module to many database management systems. The ability to train the system to recognize visual concepts gives us a definite advantage in image mining. The decision-maker software can merge the results of different experts or search engines, a problem for which only unsatisfactory ad hoc solutions currently exist and which offers great opportunities on the Web and in e-commerce.
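
Illustrative note (not part of the applicant's text): the abstract does not specify how relevance feedback updates the database of prior probabilities. As a minimal sketch of the general idea only, the Python below assumes a textbook Beta-Bernoulli update for one index term. The class name TermBelief, the strength parameter, and the example feedback sequence are all hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class TermBelief:
    """Beta(alpha, beta) belief about P(term present | document relevant).

    Hypothetical illustration; not the model described in the grant.
    """
    alpha: float
    beta: float

    @classmethod
    def from_prior(cls, prior_prob: float, strength: float = 2.0) -> "TermBelief":
        # Encode a bootstrap prior probability as pseudo-counts.
        # `strength` (an assumed parameter) controls how strongly the
        # prior resists being overridden by feedback.
        return cls(alpha=prior_prob * strength, beta=(1.0 - prior_prob) * strength)

    def update(self, term_present: bool) -> None:
        # One feedback event: the user marked a document as relevant,
        # and the term was either present in it or absent from it.
        if term_present:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean estimate of the probability.
        return self.alpha / (self.alpha + self.beta)


# Start from a prior of 0.5, then observe four relevant documents,
# three of which contain the term.
belief = TermBelief.from_prior(0.5, strength=2.0)
for present in [True, True, False, True]:
    belief.update(present)

posterior_mean = belief.mean
print(round(posterior_mean, 3))
```

Under these assumptions the estimate moves from the prior toward the observed frequency as feedback accumulates, which is the incremental behaviour the abstract describes in general terms.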

Thesaurus Terms:
abstracting /text searching, computer program /software, computer system design /evaluation, information retrieval, information system, method development
health science research analysis /evaluation, interactive multimedia
human data

Institution: INSIGHTFUL CORPORATION
1700 WESTLAKE AVE N, STE 500
SEATTLE, WA 98109
Fiscal Year: 2002
Department:
Project Start: 15-SEP-1997
Project End: 29-SEP-2003
ICD: NATIONAL LIBRARY OF MEDICINE
IRG: ZRG1
