Abstract: Much of the data needed to answer user queries resides on the web, accessible via web APIs or web forms. We consider the setting of answering queries over sources that are connected to one another by semantic relationships (that is, integrity constraints). The sources may overlap, may offer multiple means of access with different requirements (e.g. number of required arguments), and the means of access within and across relations may differ radically in cost. This work presents a method for generating query plans that run on top of these sources, obeying the access patterns and exploiting the integrity constraints. The approach is based on "implementing" interpolation and definability results in predicate logic, while extending them to take into account access restrictions, modifying them to work in the presence of background integrity constraints, and combining them with cost considerations. In this talk I will give a flavor of the theory behind the logic-based approach, and then discuss a variety of algorithmic and system issues. This is joint work with Balder ten Cate, Julien Leblay, and Efi Tsamoura.
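As a rough illustration of what "obeying the access patterns and exploiting the integrity constraints" can mean, the following minimal sketch uses two hypothetical sources that are not taken from the talk: `access_professors`, which needs no input, and `access_directory`, which requires a name to be supplied. The assumed integrity constraint is that every name in the directory is also a professor, so a plan that first fetches all professor names and then probes the directory with each one retrieves every office. The in-memory sets stand in for web API or web form calls.

```python
# Hypothetical data standing in for two remote sources (an assumption
# made for illustration only).
PROFESSORS = {("ada",), ("bob",)}
DIRECTORY = {("ada", "room 1"), ("bob", "room 2")}


def access_professors():
    """Free access: returns all professor names, no input required."""
    return set(PROFESSORS)


def access_directory(name):
    """Restricted access: the name argument must be bound by the caller."""
    return {row for row in DIRECTORY if row[0] == name}


def plan_all_offices():
    """Plan for the query 'all offices listed in the directory'.

    The directory cannot be scanned directly, so the plan binds the
    required argument with names obtained from the free-access source.
    The plan is complete only under the assumed constraint that every
    directory name also appears among the professors.
    """
    offices = set()
    for (name,) in access_professors():
        for (_, office) in access_directory(name):
            offices.add(office)
    return offices


if __name__ == "__main__":
    print(sorted(plan_all_offices()))
```

If the constraint did not hold, the same plan would still be executable but could miss offices of people who are not professors, which is why the constraints matter when deciding whether a plan answers the query.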
Abstract: In this talk, using my own life as an example, I discuss the excitement of doing research and developing advanced technologies. Starting with my undergraduate days as a chemical engineering student at IIT Madras, I narrate how my academic life changed when I was introduced to an IBM mainframe machine, which ultimately led me to do a PhD in computer science at the University of Texas at Austin. Along the way, I talk about the prolonged hunt for a PhD thesis topic. Then, I trace my professional career at IBM, which included working on hot topics as well as on topics that were initially thought to be dead but turned out not to be so! I deal with the difficulties of getting papers published and of doing technology transfer to new products as well as old ones. Having never been a manager during my entire 32-year IBM career, even though as an IBM Fellow I have been an IBM executive for 16 years, I bring a unique perspective on what it takes to pursue a long-term technical career.
Abstract: Taking advantage of Big Data while retaining a user-centered point of view is quite difficult. Managing data volume, variety and velocity to extract the relevant information is still challenging. Information extraction needs customization to adapt both content and presentation to users' current profiles. Regarding content, data volume can be reduced and personalized by using user preferences. Regarding presentation, answers should be adapted for display on the user's devices. This talk focuses on improving stream data monitoring by proposing the construction of ad-hoc summaries. We will present a comprehensive solution which relies on a personalized and continuous refinement of data in order to generate texts that provide a tailored synthesis of relevant data. Short texts in natural language will summarize the result of continuous complex data monitoring. The presented solution adopts contextual preferences to better fit the user's current priorities. Text summaries can be shared on social networks and delivered to personal devices in various contexts (e.g. listening to summaries while driving).
Abstract: We attempt a structured view of the ontological query answering problem by distinguishing between two antagonistic perspectives: the knowledge representation perspective considers the ontology as a part of the specified knowledge, whereas the database perspective assumes it to be part of the query. These two perspectives give rise to two computation strategies: (data-driven) forward chaining and (query-driven) backward chaining, based on which different types of decidability criteria can be defined. We give an overview of the two views as well as the ensuing conditions for decidability, focusing on existential rules as an ontological formalism that has lately gained a lot of renewed interest from both the KR and DB communities.
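To make the data-driven strategy concrete, here is a minimal sketch of forward chaining with one existential rule. The two rules, the predicate names, and the `_null` naming scheme are assumptions chosen for illustration, not material from the talk: `Employee(x) -> exists y. WorksFor(x, y)` and `WorksFor(x, y) -> Department(y)`. The existential rule fires only when no witness exists yet, which is what lets this particular example terminate; for other rule sets the same loop may run forever, which is where decidability criteria come in.

```python
import itertools


def chase(facts):
    """Saturate a set of facts under two hypothetical rules:
         Employee(x)    -> exists y. WorksFor(x, y)
         WorksFor(x, y) -> Department(y)
    Facts are tuples whose first element is the predicate name.
    """
    facts = set(facts)
    fresh = itertools.count()  # supplies names for invented values
    changed = True
    while changed:
        changed = False
        for fact in list(facts):
            if fact[0] == "Employee":
                x = fact[1]
                # Fire the existential rule only if x has no witness yet.
                if not any(f[0] == "WorksFor" and f[1] == x for f in facts):
                    facts.add(("WorksFor", x, f"_null{next(fresh)}"))
                    changed = True
            elif fact[0] == "WorksFor":
                dept = ("Department", fact[2])
                if dept not in facts:
                    facts.add(dept)
                    changed = True
    return facts


def works_in_some_department(facts, person):
    """Boolean query: exists y. WorksFor(person, y) and Department(y)."""
    saturated = chase(facts)
    return any(
        f[0] == "WorksFor" and f[1] == person and ("Department", f[2]) in saturated
        for f in saturated
    )


if __name__ == "__main__":
    print(works_in_some_department({("Employee", "ann")}, "ann"))
```

A query-driven (backward chaining) approach would instead rewrite the query using the rules, here into a query that simply asks whether `Employee(person)` or a matching `WorksFor` fact is present, and evaluate the rewriting over the unmodified data.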