Information Extraction

NGSLT - Short course

Lecturer

Dr. Mark Stevenson
Department of Computer Science, University of Sheffield
Course web page

Date: April 21-24, 2008

Location: Room 231a, Reykjavik University, Ofanleiti 2, Reykjavik 103, Iceland (see travel information).

Format: Lectures and exercises

Time slots for lectures:

ECTS credits: 4

Assessment: pass/fail grade

Minimum of registered students: 5
Participants

Goals

Summary of contents

Information Extraction (IE) is an important language technology which aims to identify specific types of information in documents. IE has been applied to a variety of domains, including the mining of text, such as news or biomedical articles, and the Semantic Web. For example, IE systems have been created to identify the movements of executives within companies from newspaper reports and to identify interactions between proteins from scientific journals.

This course will consist of (1) an overview of IE systems and their components, including a description of early approaches which relied on hand-written rules, (2) a description of evaluation methodologies commonly used for IE systems, including the Message Understanding Conferences, (3) the use of machine learning algorithms to assist in the development and adaptation of IE systems, thereby avoiding the need for expert domain knowledge, which is often difficult to obtain, and (4) an analysis of various linguistic considerations which affect the difficulty of IE tasks.
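As a rough illustration of the hand-written rule approach mentioned in (1), the following minimal sketch uses a single regular expression to pick out executive appointments from text, then scores the result against a small gold standard using precision and recall. The rule, the sample sentences, the gold annotations and the function names are all hypothetical and are not taken from the course material; real systems use many rules on top of linguistic preprocessing.

```python
import re

# Hypothetical hand-written rule (an assumption for illustration):
# "<Person> was/has been appointed/named [as] <post> of <Company>"
SUCCESSION_RULE = re.compile(
    r"(?P<person>[A-Z][a-z]+ [A-Z][a-z]+) (?:was|has been) (?:appointed|named) "
    r"(?:as )?(?P<post>[a-z ]+?) of (?P<company>[A-Z][A-Za-z]+)"
)


def extract_successions(text):
    """Return (person, post, company) tuples matched by the rule."""
    return [
        (m.group("person"), m.group("post"), m.group("company"))
        for m in SUCCESSION_RULE.finditer(text)
    ]


def precision_recall(predicted, gold):
    """Score extracted tuples against gold-standard tuples by exact match."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Invented sample text and gold annotations.
sample_text = (
    "Jane Smith was appointed chief executive of Acme. "
    "Yesterday John Brown has been named as chairman of Globex."
)
sample_gold = [
    ("Jane Smith", "chief executive", "Acme"),
    ("Mary Jones", "president", "Initech"),
]

extracted = extract_successions(sample_text)
scores = precision_recall(extracted, sample_gold)
```

A rule like this is brittle: it misses any phrasing it was not written for, which is one reason the adaptation of IE systems to new domains is costly when done by hand.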

Literature

The material in the course will be based on a number of research papers. The following list includes some sample papers:

Pre-course preparation

Students registered for this course need to study the following papers and slides before the course starts:

Background on Information Extraction

Background on some relevant Language Understanding Tools

Prerequisites

The course has no special prerequisites beyond those required for admission to NGSLT.

Travel and accommodation

Course coordinators

Hrafn Loftsson, Reykjavik University, hrafn@ru.is
Eiríkur Rögnvaldsson, University of Iceland, eirikur@hi.is



Last modified April 14, 2008