Qualinca, a Research Project Dealing with the Quality of Document Databases
  Julien Gibert


Tuesday, August 18, 2015
04:00 PM - 04:30 PM

Level:  Technical - Intermediate

In a bibliographic database, bibliographic records (documents) are linked to authority records (persons, such as authors). Qualinca is an ongoing research project that aims to develop mechanisms for quantifying the quality level of a bibliographic knowledge base. In our case, this means verifying that the author linked to a bibliographic record is the correct one.

To achieve this, we will:

  • Retrieve from our database all the authors bearing the same name as the one in the bibliographic record being tested, as well as the bibliographic records where this name is mentioned
  • Compare sets of attributes from those records, such as names, domains, dates, publishers and so on (using distance measures such as Levenshtein, soft cosine distance, simplex...)
  • Apply a set of Prolog rules to try to group together entities that refer to the same person
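
The comparison and grouping steps above can be illustrated with a minimal Python sketch. It is not the Qualinca implementation: the record layout, the sample data, the thresholds (0.8 and 0.5) and the function names are all hypothetical, a plain Jaccard overlap stands in for the richer domain measures mentioned above, and a single hard-coded rule with a union-find merge stands in for the Prolog rule set.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def name_similarity(a: str, b: str) -> float:
    """Levenshtein distance normalized to a similarity in [0, 1]."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def domain_overlap(a: set, b: set) -> float:
    """Jaccard overlap of two sets of subject domains (illustrative stand-in)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_person(r1: dict, r2: dict, name_min=0.8, domain_min=0.5) -> bool:
    """Hypothetical rule: names are close AND enough domains are shared."""
    return (name_similarity(r1["name"], r2["name"]) >= name_min
            and domain_overlap(r1["domains"], r2["domains"]) >= domain_min)


def group_records(records: list) -> list:
    """Merge records that are linked pairwise by same_person (union-find)."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if same_person(records[i], records[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(records[i]["id"])
    return sorted(groups.values())


# Invented sample records sharing (almost) the same author name.
records = [
    {"id": "A1", "name": "Martin, Claire", "domains": {"linguistics", "phonetics"}},
    {"id": "B7", "name": "Martin, Clair", "domains": {"linguistics", "phonetics"}},
    {"id": "C3", "name": "Martin, Claire", "domains": {"organic chemistry"}},
]
print(group_records(records))  # [['A1', 'B7'], ['C3']]
```

In this toy data, the two linguistics records are grouped despite the spelling variation, while the chemistry record with an identical name stays separate. This is the kind of homonym case that name matching alone would get wrong.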

Julien is a software developer with 14+ years of programming experience. He works daily with Java (design patterns, code refactoring), XML (XSL, XPath, XML Schema), Oracle Database, and Solr-Lucene (6+ years). He specializes in digital libraries and, more broadly, in describing, archiving, and disseminating resources and their metadata. Julien currently works at ABES (http://en.abes.fr), a French public agency that creates and manages information-based tools and services for the university and research communities.
