Madhulika Mohanty: “Effective Exploration of Knowledge Graphs”

Madhulika Mohanty will present her work on April 7th at 2pm. The talk will be held online on Zoom at https://ecolepolytechnique.zoom.us/j/86323133834?pwd=QzFqdlpwalBwTUtRbzQxYWQwSXNLUT09

Title: Effective Exploration of Knowledge Graphs

Abstract:
Knowledge Graphs (KGs) such as YAGO, DBpedia, and Freebase form the backbone of applications including chatbots, personal assistants, and question answering systems. They represent information as a graph in which nodes denote entities and edges denote relationships between these entities. They can be queried using either structured queries or relationship queries. Many of these KGs are huge, with millions of nodes and edges, so querying them effectively is non-trivial. Users querying KGs range from beginners casually exploring the system without a particular information need in mind to expert users querying for specific information needs.

In this talk, I will discuss solutions addressing two problems in exploring KGs:

i) A common problem faced by users is getting empty results, because they do not know the exact labels of the nodes and edges in the KG. Also, since KGs allow schemaless addition of information, the same piece of information may be represented in more than one way. Thus, structured queries seeking exact matches suffer from poor recall. On getting unsatisfactory results, users try to relax their queries while preserving the original query intent. However, coming up with useful relaxations is also challenging for these users.
I will discuss Spec-QP, which supports top-k exploratory search using structured queries by performing automatic relaxations and efficiently computing them.
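To make the idea of query relaxation concrete, here is a minimal sketch over a toy set of triples. It is not Spec-QP: the data, the function names, and the relaxation rule (replace one constant at a time with a wildcard) are assumptions chosen for illustration, and the sketch does no top-k scoring.

```python
# Toy knowledge graph as (subject, predicate, object) triples; all data is made up.
TRIPLES = {
    ("Marie_Curie", "bornIn", "Warsaw"),
    ("Marie_Curie", "wonPrize", "Nobel_Prize_in_Physics"),
    ("Pierre_Curie", "wonPrize", "Nobel_Prize_in_Physics"),
    ("Pierre_Curie", "birthPlace", "Paris"),
}


def match(pattern, triples=TRIPLES):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


def relaxations(pattern):
    """Yield patterns with one more constant replaced by a wildcard."""
    for i, term in enumerate(pattern):
        if term is not None:
            yield pattern[:i] + (None,) + pattern[i + 1:]


def answer_with_relaxation(pattern):
    """Try the exact query, then relax one level at a time until something matches."""
    level, frontier = 0, {pattern}
    while frontier:
        results = sorted({t for p in frontier for t in match(p)})
        if results:
            return level, results
        frontier = {r for p in frontier for r in relaxations(p)}
        level += 1
    return level, []


# The user guesses the label "bornIn", but the KG stores "birthPlace" for this entity.
level, results = answer_with_relaxation(("Pierre_Curie", "bornIn", None))
print(level, results)
```

The exact query returns nothing because of the label mismatch; after one relaxation step the result set includes the `birthPlace` triple. A real system would also rank relaxed answers by how far they drift from the original query, which this sketch leaves out.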

ii) The results of structured and unstructured queries over graphs are subgraphs/trees. Often, multiple results of the same kind contain redundant information. The burden of finding the distinct pieces of information in the ocean of results is usually left to the user. I will discuss KlusTree, which uses a novel language-model (LM) based similarity metric to summarize the results based on their information content and present diverse results to the users. This enhances the result presentation by giving the user a bird’s eye view over the results.
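As a rough illustration of presenting diverse results, the sketch below keeps one representative per group of near-duplicate results. It is not KlusTree: it swaps the LM-based similarity metric for plain Jaccard overlap on labels, and the function names, the 0.5 threshold, and the data are all assumptions.

```python
def jaccard(a, b):
    """Jaccard overlap between the label sets of two results."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pick_diverse(results, threshold=0.5):
    """Greedily keep a result only if it is dissimilar to every one kept so far."""
    representatives = []
    for r in results:
        if all(jaccard(r, rep) < threshold for rep in representatives):
            representatives.append(r)
    return representatives


# Each result is flattened to a tuple of its node and edge labels (made-up data).
results = [
    ("Marie_Curie", "wonPrize", "Nobel_Prize_in_Physics"),
    ("Pierre_Curie", "wonPrize", "Nobel_Prize_in_Physics"),
    ("Marie_Curie", "bornIn", "Warsaw"),
]
print(pick_diverse(results))
```

Here the two prize results share two of four distinct labels, so the second is treated as redundant and only the first prize result and the birthplace result are shown.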

Bio:
Madhulika Mohanty joined the CEDAR Team as a postdoc in February 2021. She would like to present her work to the group.