Research Areas

Web Search

With the advent of the Internet, online resources are increasingly available. Many users rely on search engines to satisfy their information needs. However, these search engines tend to return many non-relevant documents, which lowers their retrieval precision. Finding an optimized strategy that retrieves more relevant documents and fewer non-relevant ones remains a major challenge for the information retrieval community. The main contributions of the Web Search Group relate to data mining of log data, site information, document contents, and user behaviors, and to exploiting the mining results to improve search engine performance. These include the following subjects:

Semantic Search

With the maturing of many text mining techniques (e.g. information extraction, text categorization), the semantic information of documents is becoming increasingly easy to obtain. The advent of XML and the Semantic Web is also pushing towards a semantically richer information world. This trend will shift the information source of traditional IR from a document set (possibly with inter-document links, e.g. Web pages) to a hybrid information environment containing both documents and semantics.

Satisfying users' information needs in such a hybrid information environment is a challenge. Semantic information makes it possible to use formal reasoning to answer a user's query more accurately. However, what is the relationship between reasoning and retrieval? Can they help each other? How should they be integrated? These problems are distinctive and difficult, both because of the inherent uncertainty in judging whether a piece of information satisfies a user's need and because of the difference between formal inference mechanisms and common-sense human reasoning.

This research aims to address these problems by establishing models and methods for semantic search. We will test the solution by applying it to search in enterprise knowledge portals.

P2P Search

Peer-to-Peer (P2P) systems are now among the most prevalent distributed applications on the Internet, owing to their scalability, fault tolerance, and self-organizing nature. They have attracted growing interest from academic and industrial researchers since 2000. One of the most important applications on P2P platforms is file sharing. There are dozens of open problems concerning how to find shared data in a P2P environment, including document retrieval.

We are currently interested in how to implement efficient and feasible document retrieval in a P2P environment. Our main research topics are the following:

Researches (last edited 2007-09-25 12:45:18 by handy)