Last updated:
Thursday, December 14, 2017 - This iteration of the class has finished. Please refer to the
current iteration of the class for details, and note that the lecturer has changed.
Module Description, Aims and Objectives:
This
module discusses the basic concepts and methods of information
retrieval including capturing, representing, storing, organizing,
and retrieving unstructured or loosely structured information. The
most well-known aspect of information retrieval is document
retrieval: the process of indexing and retrieving text
documents. However, the field of information retrieval includes
almost any type of unstructured or semi-structured data, including
newswire stories, transcribed speech, email, blogs, images, or
video. Therefore, information retrieval is a critical aspect of
Web search engines. This module also serves as the foundation for
subsequent modules on the understanding, processing and retrieval
of particular web media.
Starting this semester, I will also be maintaining a
Facebook page (accessible from the FB
link on the top menu) for this course across cohorts.
Current students and alumni are welcome to contribute news
and items of potential interest to the page (e.g., IR
news, job openings specific to IR).
N.B. We will be teaching and using the Python programming
language throughout this class.
Updated: We will be using Python 2.7.x
or Python 3.4.x, as these versions also work with the NLTK
library.
Course Characteristics:
- Modular credits: 4.
- Prerequisites: CS 2010 Data
Structures and Algorithms II or its equivalent.
Important:
It is highly recommended that you have some advanced
mathematics background, such as probability and
statistics and/or linear algebra. Exceptions to
these prerequisites can be made only on a case-by-case
basis. See the instructor for details.
Instructor: Min-Yen
KAN,
<kanmy@comp.nus.edu.sg>
Office: AS6 05-12 (x61885).
Teaching Assistants: Ashish DANDEKAR,
<ashishdandekar@u.nus.edu>
Office hours are held on Fridays (before and after class),
but more commonly by appointment. By default, emails to me
are assumed to be public, and my replies and your anonymized
email will likely be posted to IVLE. Please let me know if
you do not want the contents of your
email posted; I will be happy to honor your
request.
- Workload: (2-1-0-5-2) Translation:
2 lecture hours per week
1 hour of tutorials or labs per week
5 hours for projects, assignments, fieldwork, etc. per week
2 hours for preparatory work by a student per week
- Textbooks:
- Required: Christopher D. Manning, Prabhakar Raghavan and
Hinrich Schütze, Introduction to Information Retrieval,
Cambridge University Press, 2008.
[ Check LINC for book ]
- Recommended: Steven Bird, Ewan Klein and Edward Loper,
Natural Language Processing with Python, O'Reilly, 2009.
[ Check LINC for book ]
- Recommended: Ian H. Witten, Alistair Moffat and Timothy C.
Bell, Managing Gigabytes: Compressing and Indexing Documents and
Images, Second Edition, 1999.
[ Check LINC for book ]
- Recommended: Ricardo Baeza-Yates and Berthier Ribeiro-Neto,
Modern Information Retrieval, First Edition, 1999.
[ Check LINC for book ]
- Tutorials: Note: There will be only
five or six tutorial sessions. Each tutorial covers a subject
related to a homework assignment, and tutorials are held
only every other week, currently on Tuesdays.
- Tutorial 1: Tuesdays 16:00-17:00 (SR5; COM1 #02-01)
- Tutorial 2: Tuesdays 17:00-18:00 (SR5; COM1 #02-01)
- Tutorial 3: Tuesdays 14:00-15:00 (SR5; COM1 #02-01)
- Tutorial 4: Tuesdays 12:00-13:00 (SR6; COM1 #02-03)
- Final Exam: Tuesday, 25 April 2017, 9:00-11:00
(morning). Venue: SR1 (COM1 #02-06). This is an open-book exam.
Note to NUS-external visitors: Welcome! If you're a fellow
IR course instructor looking for lecture material, you can see
the Syllabus menu item on the nav bar for a preview. Please contact
me if you'd like to use any of my material. Thanks!