Module Description, Aims and Objectives:
This
module discusses the basic concepts and methods of information
retrieval including capturing, representing, storing, organizing,
and retrieving unstructured or loosely structured information. The
most well-known aspect of information retrieval is document
retrieval: the process of indexing and retrieving text
documents. However, the field of information retrieval includes
almost any type of unstructured or semi-structured data, including
newswire stories, transcribed speech, email, blogs, images, or
video. Therefore, information retrieval is a critical aspect of
Web search engines. This module also serves as the foundation for
subsequent modules on the understanding, processing and retrieval
of particular web media.
N.B. We will be teaching and using the Python programming
language throughout this class. We will be using Python 2.6.x (2.6.6
or 2.6.4) instead of the newer Python 3.x, as the NLTK library
that we will also be using is currently incompatible with 3.x.
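As a rough illustration of the indexing and retrieval process described above, here is a minimal sketch of an inverted index with Boolean AND search. It is not taken from the course materials: the function names `build_index` and `search`, the sample documents, and the whitespace tokenization are all assumptions made for the example, and it uses only the standard library.

```python
def build_index(docs):
    """Map each lowercased term to the set of document IDs containing it."""
    index = {}
    for doc_id, text in docs.items():
        # Assumption: naive whitespace tokenization, no stemming or stop words.
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query):
    """Return sorted IDs of documents containing every query term (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return []
    # Copy the first postings set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


# Hypothetical toy collection, keyed by document ID.
docs = {
    1: "information retrieval for web search",
    2: "indexing and retrieving text documents",
    3: "web search engines index text",
}
index = build_index(docs)
```

Calling `search(index, "web search")` intersects the postings of both terms, so only documents that contain every query term are returned.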
Course Characteristics:
- Modular credits: 4.
- Prerequisites: CS 2010 Data
Structures and Algorithms II or its equivalent.
Important:
It is highly recommended that you have some advanced
mathematics background, such as probability and
statistics and/or linear algebra. Exceptions to
these prerequisites can be made on a case-by-case basis
only. See the instructor for details.
Instructor: Min-Yen KAN, <kanmy@comp.nus.edu.sg>
Office: AS6 05-12 (x61885).
Teaching Assistant: Xiangnan HE, <xiangnan@comp.nus.edu.sg>
Office: Computational Linguistics Lab (AS6 04-13).
Office hours are held on Fridays (before and after class),
but more commonly by
appointment. By default, emails to me are assumed to be
public, and my replies and your anonymized email will
likely be posted to IVLE. Please let me know if
you do not want the contents of your
email posted; I will be happy to honor your
request.
- Workload: (2-1-0-5-2) Translation:
2 lecture hours per week
1 hour of tutorials or labs per week
5 hours for projects, assignments, fieldwork, etc. per week
2 hours for preparatory work by a student per week
- Textbooks:
- Required: Christopher D. Manning, Prabhakar Raghavan and
Hinrich Schütze, Introduction to Information Retrieval,
Cambridge University Press, 2008. [Check LINC for book]
- Recommended: Steven Bird, Ewan Klein and Edward Loper,
Natural Language Processing with Python, O'Reilly, 2009.
[Check LINC for book]
- Recommended: Ian H. Witten, Alistair Moffat and Timothy C.
Bell, Managing Gigabytes: Compressing and Indexing Documents and
Images, Second Edition, 1999. [Check LINC for book]
- Recommended: Ricardo Baeza-Yates and Berthier Ribeiro-Neto,
Modern Information Retrieval, First Edition, 1999.
[Check LINC for book]
- Tutorials: Note: There will only be
five tutorial sessions; each tutorial is on a subject
related to a homework assignment, and the tutorials are
held every other week. We have only two slots that are
back-to-back on Thursday afternoon.
- Tutorial 1: Thursdays 14:00-15:00 (SR7; COM1 #02-07)
- Tutorial 2: Thursdays
15:00-16:00 13:00-14:00 (SR7; COM1 #02-07)
- Tutorial 3: Fridays 10:00-11:00 (SR10; COM1 #02-10)
- Final Exam (updated): Tuesday, 6 May 2014, morning.
Venue: SoC COM1 HCI Design Studio (COM1 #02-02).
This is an open-book exam.
Note to NUS-external visitors: Welcome! If you're a fellow
IR course instructor looking for lecture material, you can see
the syllabus menu item on the left for a preview. Please contact
me if you'd like to use any of my material. Thanks!