Knowledge-based mining of sequential patterns in ASP: an application to patient care-pathway analysis

IRISA/Inria offers a 12-month post-doc position co-funded by the Région Bretagne and the ANSM-PEPS project.

The call closes on February 29th, 2016, and the expected start date is April 2016 (or sooner).

Candidate profile

Required skills:

  • Pattern mining/KDD

  • Declarative programming: logic programming (answer set programming) or constraint programming

  • Background in knowledge representation and reasoning

  • Application development skills (Python, C++)

  • Oral and written proficiency in English

  • Autonomy, creativity and the ability to communicate with domain experts (medical staff)

Candidates may have either a background in logic/constraint programming with some knowledge of KDD, or a background in KDD with some knowledge of constraint programming (CP).

Candidates should also be interested in applied research, and more specifically in the medical context of the PEPS project. Candidates with a PhD in medical informatics must make sure that they have a strong background in at least one of the other required fields.

The main objective is to explore the use of declarative programming to develop a flexible and knowledge-based tool to analyse patient care pathways.

Context and project objectives

Pharmaco-epidemiology is the study of the uses and effects of health products (medical devices and drugs) on populations. Exploiting the large administrative databases of patient care pathways that are now available has become a major challenge for clinicians. The ANSM-PEPS project develops tools and methodologies to support pharmaco-epidemiological studies based on the French medico-administrative databases (SNIIRAM).

The SNIIRAM database is a large and rich database that makes it possible to reconstruct patient care pathways. Faced with this very rich data model, the SePaDec project proposes and explores knowledge-intensive pattern mining approaches to support the mining of meaningful patterns in patient care pathways. It consists in exploring declarative programming to provide a flexible, knowledge-based tool for extracting frequent sequential patterns from patient care pathways.
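As an illustration of the mining task only, here is a minimal Python sketch of frequent sequential pattern extraction on toy care pathways. It is not the project's ASP encoding: the event names, the function names and the level-wise enumeration strategy are all assumptions made for this example. A pattern is counted as supported by a pathway when its events occur in the same order, with gaps allowed.

```python
def is_subsequence(pattern, sequence):
    """Return True if the events of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `event in remaining` consumes the iterator up to the first match,
    # so later events are only searched for after earlier ones.
    return all(event in remaining for event in pattern)


def support(pattern, database):
    """Number of pathways in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, min_support, max_length=3):
    """Level-wise enumeration: extend frequent patterns by one event at a time."""
    events = sorted({event for seq in database for event in seq})
    frequent = {}
    level = [(event,) for event in events]
    for _ in range(max_length):
        kept = []
        for pattern in level:
            count = support(pattern, database)
            if count >= min_support:
                frequent[pattern] = count
                kept.append(pattern)
        # An infrequent pattern cannot have a frequent extension,
        # so only the patterns kept at this level are extended.
        level = [pattern + (event,) for pattern in kept for event in events]
        if not level:
            break
    return frequent


# Hypothetical toy pathways (one list of events per patient).
pathways = [
    ["consultation", "drug_A", "hospitalization"],
    ["consultation", "drug_A", "drug_B"],
    ["drug_A", "consultation", "hospitalization"],
]

result = frequent_patterns(pathways, min_support=2, max_length=2)
for pattern, count in sorted(result.items()):
    print(pattern, count)
```

In a declarative setting, the same task would be stated as rules and constraints handed to a solver, and domain knowledge could be added as further constraints rather than by modifying an enumeration procedure such as this one.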

Declarative programming paradigms have attracted the interest of the data mining community as a way to tackle pattern mining tasks. Recent works used constraint programming [1,3] or answer set programming [2] to encode pattern mining tasks. Thanks to the efficiency of solvers based on constraint satisfaction, such approaches are becoming mature enough to tackle real problems and real data.

On the one hand, the SePaDec project aims to improve the efficiency of the ASP-mining approaches developed in collaboration between the LACODAM team and the Potsdam KR team, which develops the ASP solver clingo. The two research directions that could be explored are 1) the use of dedicated algorithms combined with clingo and 2) the study of the efficiency of various solvers on the pattern mining task. A better understanding of solver behavior and the construction of efficiency models may support the construction of optimizers able to design effective resolution plans, like the optimizers used in DBMSs.

On the other hand, we would like to provide a proof of concept of the value of the declarative approach for extracting "meaningful" patterns from databases. This part of the work consists in proposing and implementing in ASP a knowledge base for a dedicated pharmaco-epidemiological study, and in illustrating how declarative approaches make it possible to blend mining and reasoning.

[1] L. De Raedt, T. Guns, S. Nijssen. Constraint programming for itemset mining. Proceedings of KDD, 2008.

[2] T. Guyet, Y. Moinard, R. Quiniou, T. Schaub. Fouille de motifs séquentiels avec ASP. Actes de EGC, 2016.

[3] B. Negrevergne, T. Guns. Constraint-based sequence mining using constraint programming. Proceedings of Integration of AI and OR Techniques in Constraint Programming (CPAIOR), 2015.

Job description

The post-doctoral research will concentrate on:

  • designing, implementing, improving and evaluating the combination of dedicated algorithms with clingo

  • working on the knowledge formalization for one selected pharmaco-epidemiological study

Workplace

The post-doc will work under the supervision of Thomas Guyet in the LACODAM (Large Scale Collaborative Data Mining) team in Rennes, France. The position includes collaborations with the Rennes Hospital for the application domain and with the Knowledge Processing and Information Systems team at the University of Potsdam.

Application and recruitment guidelines

  • Eligible applicants should hold a PhD in computer science with a strong publication record.

  • Selected candidates should be able to start their position from April 2016.

  • Complete applications with attachments should be sent by email (thomas.guyet@irisa.fr) before the 29th of February 2016.

Applications should include:

  • A motivation letter

  • A full CV with a list of the most relevant publications

  • (optional, but welcome) a short summary of your research project (3 pages max., featuring at least one original idea)

  • Names and contact information of 2 referees, or recommendation letters

This announcement is also available on www.KDnuggets.com – Analytics, Big Data, Data Mining, and Data Science Resources
