Last updated:
Tuesday, 15 January, 2019 - Tutorial group information updated.
Module Description, Aims and Objectives:
This
module discusses the basic concepts and methods of information
retrieval including capturing, representing, storing, organizing,
and retrieving unstructured or loosely structured information. The
best-known aspect of information retrieval is document
retrieval: the process of indexing and retrieving text
documents. However, the field of information retrieval includes
almost any type of unstructured or semi-structured data, including
newswire stories, transcribed speech, email, blogs, images, or
video. Therefore, information retrieval is a critical aspect of
Web search engines. This module also serves as the foundation for
subsequent modules on the understanding, processing and retrieval
of particular web media.
Starting this semester, I will also be maintaining a
Facebook page (accessible from the FB
link on the top menu) for this course across cohorts.
Current students and alumni are welcome to contribute news
and items of potential interest to the page (e.g., IR
news, job openings specific to IR).
N.B. We will be using Python (2.7.x or 3.4.x) throughout this class, as these versions also work with the NLTK
library.
Course Characteristics:
- Modular credits: 4.
- Prerequisites: CS2010 or CS2020 or (CS2030 or CS2113/T) and (CS2040 or CS2040C).
Important:
It is highly recommended that you have some background in
advanced mathematics, such as probability and
statistics and/or linear algebra. Exceptions to
these prerequisites are made on a case-by-case basis
only. See the instructor for details.
- Staff:
- Instructor: ZHAO Jin,
<zhaojin@comp.nus.edu.sg>
Office: COM2-02-10 (66011083).
- Office hours are held on Fridays after class, but more commonly by
appointment. By default, emails to me are assumed to be
public, and my replies and your anonymized email will
likely be posted to IVLE. Please let me know if
you do not want the contents of your
email posted; I will be happy to honor your
request.
- Workload: (2-1-0-5-2)
- 2 lecture hours per week
- 1 tutorial hour per week
- 5 hours for projects, assignments, fieldwork, etc. per week
- 2 hours for preparatory work by a student per week
- Textbooks:
- Required: Christopher D. Manning, Prabhakar Raghavan and
Hinrich Schütze, Introduction
to Information Retrieval, Cambridge University Press. 2008.
[ Check LINC for book ]
- Recommended: Steven Bird, Ewan Klein and Edward Loper, Natural Language Processing with
Python, O'Reilly. 2009.
[ Check LINC for book ]
- Recommended: Ian H. Witten, Alistair Moffat, Timothy C.
Bell, Managing Gigabytes: Compressing and Indexing Documents and
Images, Second Edition. 1999.
[ Check LINC for book ]
- Recommended: Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern
Information Retrieval, First Edition. 1999.
[ Check LINC for book ]
- Tutorials: Note: There will be only
five or six tutorial sessions. Each tutorial covers a subject
related to a homework assignment, and tutorials are held
only every other week, currently on Monday mornings and Thursday afternoons.
- Tutorial 1: Thursdays 13:00-14:00 (TR11; COM1 #02-16)
- Tutorial 2: Thursdays 12:00-13:00 (TR11; COM1 #02-16)
- Tutorial 3: Mondays 10:00-11:00 (SR5; COM1 #02-01)
- Tutorial 4: Mondays 11:00-12:00 (SR5; COM1 #02-01)
- Final Exam: Saturday, 27 Apr 2019 (afternoon), Venue: TBA. This is an open-book exam.
Note to NUS-external visitors: Welcome! If you're a fellow
IR course instructor looking for lecture material, you can see
the Syllabus menu item on the nav bar for a preview. Please contact
me if you'd like to use any of my material. Thanks!