Module Description, Aims and Objectives:
This
module discusses the basic concepts and methods of information
retrieval including capturing, representing, storing, organizing,
and retrieving unstructured or loosely structured information. The
best-known aspect of information retrieval is document
retrieval: the process of indexing and retrieving text
documents. However, the field of information retrieval includes
almost any type of unstructured or semi-structured data, including
newswire stories, transcribed speech, email, blogs, images, or
video. Therefore, information retrieval is a critical aspect of
Web search engines. This module also serves as the foundation for
subsequent modules on the understanding, processing and retrieval
of particular web media.
N.B. We will be teaching and using the Python programming
language throughout this class. We will be using Python 2.6.x (2.6.6
or 2.6.4) instead of the newer Python 3.x, as the NLTK library
that we will also be using is currently incompatible with 3.x.
Course Characteristics:
- Modular credits: 4.
- Prerequisites: CS 2010 Data
Structures and Algorithms II or its equivalent.
Important:
A background in more advanced mathematics, such as
probability and statistics and/or linear algebra, is
highly recommended. Exceptions to these prerequisites
can be made on a case-by-case basis only. See the
instructor for details.
- Instructor: Min-Yen KAN, <kanmy@comp.nus.edu.sg>
Office: AS6 05-12 (x1885).
Office hours are held (before after class) on Monday
9:00-10:00 in the classroom (Video Conference Room), or
more commonly by appointment. By default, emails to me
are assumed to be public, and my replies and your
anonymized email will likely be posted to IVLE. Please
let me know if you do not want the
contents of your email posted; I will be happy to honor
your request.
- Workload: (2-1-0-5-2) Translation:
2 lecture hours per week
1 hour of tutorials or labs per week
5 hours for projects, assignments, fieldwork, etc. per week
2 hours for preparatory work by a student per week
- Textbooks:
- Required: Christopher D. Manning, Prabhakar Raghavan and
Hinrich Schütze, Introduction to Information Retrieval,
Cambridge University Press, 2008.
[ Check LINC for book ]
- Recommended: Steven Bird, Ewan Klein and Edward Loper,
Natural Language Processing with Python, O'Reilly, 2009.
[ Check LINC for book ]
- Recommended: Ian H. Witten, Alistair Moffat and Timothy C.
Bell, Managing Gigabytes: Compressing and Indexing Documents and
Images, Second Edition, 1999.
[ Check LINC for book ]
- Recommended: Ricardo Baeza-Yates and Berthier Ribeiro-Neto,
Modern Information Retrieval, First Edition, 1999.
[ Check LINC for book ]
- Tutorials: Note that there will be only
five tutorial sessions; each tutorial covers a subject
related to a homework assignment, and the tutorials are
held every other week. There are only two slots, held
back-to-back on Thursday afternoon.
- Thursdays 12:00-13:00 (SR7; COM1 #02-07)
- Thursdays 13:00-14:00 (SR7; COM1 #02-07)
- Final Exam: Tuesday, 7 May 2012
Afternoon, Venue: TBA. This is an open-book
exam.
Note to NUS-external visitors: Welcome! If you're a fellow
IR course instructor looking for lecture material, you can see
the syllabus menu item on the left for a preview. Please contact
me if you'd like to use any of my material. Thanks!