OAEI 2007 Food Thesaurus Mapping Task

Abstract

To fulfill the OAEI 2007 food thesaurus mapping task, participants are required to align two SKOS thesauri using relations from the SKOS Mapping vocabulary. The results are collected and validated by domain experts.

Consider participating in the environment task!

AGROVOC-NALT

Current Status

The evaluation is done.

The results of the participants can be found at the following web address: http://www.few.vu.nl/~wrvhage/oaei2007/results
The Gold Standard used to evaluate Precision and Recall can be found at the following web address: http://www.few.vu.nl/~wrvhage/oaei2007/gold_standard

Task

Create an alignment between the SKOS version of the United Nations Food and Agriculture Organization (FAO) AGROVOC thesaurus (±28,000 terms, multilingual: ar, cs, de, en, es, fr, hu, ja, pt, sk, th, zh) and the United States National Agricultural Library (NAL) Agricultural thesaurus (±42,000 terms, monolingual: en), preferably using relations from the SKOS Mapping Vocabulary.

A specification of the SKOS vocabularies can be found at the SKOS website (http://www.w3.org/2004/02/skos/).
A description of the mapping relations can be found in the SKOS Mapping Vocabulary (http://www.w3.org/2004/02/skos/mapping/).

Participants are advised to use the Alignment API to produce the common format for alignments, using the following mapping relations:

http://www.w3.org/2004/02/skos/mapping#narrowMatch
http://www.w3.org/2004/02/skos/mapping#exactMatch
http://www.w3.org/2004/02/skos/mapping#broadMatch

The other relations and boolean combinators of the SKOS Mapping Vocabulary are also allowed, but will not be evaluated for the OAEI 2007 food thesaurus mapping task:

http://www.w3.org/2004/02/skos/mapping#minorMatch
http://www.w3.org/2004/02/skos/mapping#majorMatch
http://www.w3.org/2004/02/skos/mapping#AND
http://www.w3.org/2004/02/skos/mapping#OR
http://www.w3.org/2004/02/skos/mapping#NOT
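
As a rough illustration of the distinction above, the following Python sketch sorts a relation URI into "evaluated", "allowed but not evaluated", or "unknown". The function name `relation_status` and the status strings are hypothetical and only serve the example; the URIs are the ones listed above.

```python
# Namespace of the SKOS Mapping Vocabulary, as listed in the task description.
SKOS_MAP = "http://www.w3.org/2004/02/skos/mapping#"

# Relations that are evaluated in this task.
EVALUATED = {SKOS_MAP + name for name in ("narrowMatch", "exactMatch", "broadMatch")}

# Relations and combinators that are allowed but not evaluated.
NOT_EVALUATED = {SKOS_MAP + name
                 for name in ("minorMatch", "majorMatch", "AND", "OR", "NOT")}


def relation_status(uri):
    """Classify a relation URI (hypothetical helper, not part of any API)."""
    if uri in EVALUATED:
        return "evaluated"
    if uri in NOT_EVALUATED:
        return "allowed, not evaluated"
    return "unknown"


if __name__ == "__main__":
    print(relation_status(SKOS_MAP + "broadMatch"))
    print(relation_status(SKOS_MAP + "majorMatch"))
```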

An example broadMatch mapping between AGROVOC “hard cheese” and NALT “cheeses” in the common format for alignments, as produced by the API, looks like this:

<rdf:RDF xmlns="http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <Alignment>
    <xml>yes</xml>
    <level>0</level>
    <type>**</type>
    <onto1>http://www.fao.org/aos/agrovoc</onto1>
    <onto2>http://agclass.nal.usda.gov/nalt/2007.xml</onto2>
    <map>
      <Cell>
        <entity1 rdf:resource="http://www.fao.org/aos/agrovoc#16492" />
        <entity2 rdf:resource="http://agclass.nal.usda.gov/nalt/2007.xml#cheeses" />
        <measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">1.0</measure>
        <relation>http://www.w3.org/2004/02/skos/mapping#broadMatch</relation>
      </Cell>
    </map>
  </Alignment>
</rdf:RDF>
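
For readers who want to inspect such a file without the Alignment API, the sketch below reads the cells of an alignment document with Python's standard `xml.etree.ElementTree` module. It is a minimal sketch, not the API's own reader: the function name `read_cells` and the embedded sample are assumptions made for the example, and the float datatype is written out as a full URI so that no entity declaration is needed.

```python
import xml.etree.ElementTree as ET

ALIGN_NS = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Small sample in the common format for alignments (one cell only).
SAMPLE = """<rdf:RDF xmlns="http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Alignment>
    <map>
      <Cell>
        <entity1 rdf:resource="http://www.fao.org/aos/agrovoc#16492" />
        <entity2 rdf:resource="http://agclass.nal.usda.gov/nalt/2007.xml#cheeses" />
        <measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">1.0</measure>
        <relation>http://www.w3.org/2004/02/skos/mapping#broadMatch</relation>
      </Cell>
    </map>
  </Alignment>
</rdf:RDF>"""


def read_cells(xml_text):
    """Return a list of (entity1, entity2, relation, measure) tuples."""
    root = ET.fromstring(xml_text)
    cells = []
    # Child elements live in the default (alignment) namespace;
    # the resource attribute lives in the RDF namespace.
    for cell in root.iter(f"{{{ALIGN_NS}}}Cell"):
        entity1 = cell.find(f"{{{ALIGN_NS}}}entity1").get(f"{{{RDF_NS}}}resource")
        entity2 = cell.find(f"{{{ALIGN_NS}}}entity2").get(f"{{{RDF_NS}}}resource")
        relation = cell.findtext(f"{{{ALIGN_NS}}}relation").strip()
        measure = float(cell.findtext(f"{{{ALIGN_NS}}}measure"))
        cells.append((entity1, entity2, relation, measure))
    return cells


if __name__ == "__main__":
    for entity1, entity2, relation, measure in read_cells(SAMPLE):
        print(entity1, relation, entity2, measure)
```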

The file containing the alignments should be submitted by e-mail to the task organizer (see Organization).

Evaluation procedure

  1. Each participant submits preliminary mappings, in the common format for alignments, before September 3rd 2007.
  2. Each participant submits final mappings before October 1st 2007.
  3. A sample of the mappings will be assessed by domain experts at the FAO and USDA.
  4. The domain experts are required to assess the mappings assigned to them before October 11th 2007.
  5. The results are published before October 11th 2007.
  6. Evaluation measurements of the participants' systems are calculated from the resulting list of reference alignments.
  7. The final list of judgements is given to domain experts and librarians for manual extension to create an official mapping between the thesauri by November 11th 2007.
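
The Precision and Recall measurements mentioned in step 6 can be sketched as plain set arithmetic. This is a simplified illustration only: it assumes each mapping is an (entity1, entity2, relation) triple and that a complete set of correct reference mappings is available, whereas the actual evaluation worked from samples judged by domain experts. The function name `precision_recall` and the example triples are hypothetical.

```python
def precision_recall(found, reference):
    """Compute (precision, recall) for two sets of mapping triples.

    found:     mappings returned by a system
    reference: mappings judged correct (the reference alignment)
    """
    found = set(found)
    reference = set(reference)
    correct = found & reference  # mappings that are both found and correct
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical identifiers, used only to exercise the function.
    found = {
        ("agrovoc#1", "nalt#a", "exactMatch"),
        ("agrovoc#2", "nalt#b", "exactMatch"),
        ("agrovoc#3", "nalt#c", "broadMatch"),
        ("agrovoc#4", "nalt#d", "exactMatch"),
    }
    reference = {
        ("agrovoc#1", "nalt#a", "exactMatch"),
        ("agrovoc#2", "nalt#b", "exactMatch"),
        ("agrovoc#5", "nalt#e", "narrowMatch"),
    }
    print(precision_recall(found, reference))
```

Treating the relation as part of the triple means that a mapping between the right pair of concepts with the wrong relation counts as incorrect; whether that matches the official scoring is an assumption of this sketch.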

Thesauri

The latest SKOS version and a naive OWL Lite conversion of the thesauri can be downloaded from the directories listed below (updated June 18th 2007). The OWL version was derived in the same way as for the library case. The SeRQL queries used for the conversion are also available for download. Be advised, when you use the OWL version of the thesauri, that both skos:prefLabel and skos:altLabel have been mapped to rdfs:label. The skos:altLabel is often used to represent synonyms, but also to refer to omitted related terms. If you have any questions about the format, or if you prefer the input in a different format, please let me know.

AGROVOC
Download AGROVOC. (version 2007-02-19, updated 2007-06-28)
Read more about AGROVOC at http://www.fao.org/agrovoc.

NAL thesaurus
Download the NAL thesaurus. (version 2007)
Read more about the NAL thesaurus at http://agclass.nal.usda.gov/agt.

Results

Collection    System     Relation type            Precision  Recall
NALT-AGROVOC  Falcon-AO  exactMatch               0.84       0.48
NALT-AGROVOC  DSSim      exactMatch               0.49       0.20
NALT-AGROVOC  X-SOM      exactMatch               0.45       0.06
NALT-AGROVOC  RIMOM      exactMatch               0.62       0.42
NALT-AGROVOC  SCARLET    exactMatch               0.66       0.003
NALT-AGROVOC  SCARLET    broadMatch/narrowMatch   0.25       0.006
NALT-AGROVOC  SCARLET    disjoint                 0.64       0

A more detailed listing of all the results can be found in this Excel sheet (104KB) and this PDF presentation (3MB).


Organization

Send any questions, comments, or suggestions to:
Willem Robert van Hage