CS6222: Advanced Topics in Computational Biology
Instructor: Limsoon Wong / AY2022/2023 Semester 1


Overview

The goals of CS6222 (Advanced Topics in Computational Biology) are (i) to introduce students to computational approaches enabled by omics technologies in biology and (ii) to develop students into reflective scientists. This is a seminar-based course where selected papers in computational biology are discussed. This discussion is in-depth; beyond what is done in a paper, we also focus on why it is done that way, whether that is correct, how it could be done better, and so on.

The topics that I plan to cover are (i) protein function prediction, (ii) gene expression signature identification, (iii) single-cell transcriptome analysis, and (iv) batch effects in omics. I may change the last topic if there are specific popular requests on topics from the class. As I expect most students (if not all) to be new to computational biology, I plan to have two to three sessions per topic. In the first session of a topic, we go over general background for the topic. In the second (and maybe third) session of a topic, we discuss some key papers on the topic. This way, I hope no one gets left behind.

It is my plan that most sessions will be half led by students and half led by the professor (i.e., me). How will this work? Well, I will divide the class into groups. Each group will present and lead the discussion on an aspect of a paper; that will be the first half of the session. Then, I will share my own perspectives on the paper and topic; that will be the second half of the session. Every student is expected to contribute to each session; i.e., students not in the presenting group should contribute by, e.g., asking questions, answering questions, and sharing perspectives. The assessment for this module is planned to consist of the following four components:

  1. Presentation of your assigned papers, 25%;
  2. Contribution to discussion, 25%;
  3. A pair of pre- and post-session reports on a paper, 20%; and
  4. A project report, 30%.

The "presentation of an aspect of a paper" might be unfamiliar to you... So here is a brief explanation. There are at least three different aspects that need to be presented for a paper: (i) Background information - this is needed to help everyone in the class better understand the topic. (ii) The paper itself - so that we know the work that we are discussing. And (iii) Possible points for discussion - so that we have some starting points for the discussion. Each of these three aspects will be presented/led by a team of 3-4 students. Each team will stay fixed throughout the module and will be rotated through all three aspects; i.e., you will be presenting three papers, but on a different aspect of the different papers as you rotate through the papers.

The “pre- and post-session reports” might also be unfamiliar to you… So, here is a brief explanation. Of the three designated sessions, you are expected to choose one session and submit the pre- and post-session report for it. A “pre-session” report must be submitted before the session, but not more than 3 days earlier. A “post-session” report must be submitted after the session, but not more than 3 days later. The “pre-session” report describes what you think about the paper. The “post-session” report describes what you now think about the paper, considering what you have also learned from the discussion on the paper. These reports should not be long-winded; anywhere from half a page to one page is about right. It is not necessary to talk about every detail; just focus on what you consider the most important points.

The emphasis is on what you think about the paper and the session. This means you should not be merely repeating details described in the paper or summarising discussions in the session. Rather, you should focus on providing a reflective discussion on the paper and on what you have learned from the session. Reflection is a process involving the following key steps: (i) "description" - recalling key points or concepts in the paper and session; (ii) "analysis" - asking questions that prompt deeper thought; (iii) "evaluation" - assessing your understanding and identifying the strengths and weaknesses in your comprehension of the topic; and (iv) "connection" - making connections between what you have learned from the paper and the session and your existing knowledge on the same or other topics.

Finally, by “Contribution to discussion”, I mean a brief 1- or 2-page report that provides a record of the questions you have personally asked or personally responded to in a session; and for questions you have asked, record also the responses you have received on those questions. The questions can be directed at a presenter or anyone else in the class (Please say to whom the questions were addressed.) The responses can be those from the presenter or anyone else in the class (Please say from whom the responses were received.) Your report should provide your reflection on what you have learned from these interactions and from the session. You do not need to submit a report for a session if you have not participated in any interaction in that session. However, I expect you to have participated in at least 5 sessions.

Why do I want these reports? Three reasons: Firstly, the reports help me keep track of the class interactions. Secondly, and more importantly, when you write your reports, you get a chance to reflect on the questions and responses and hopefully achieve a better understanding. Thirdly, from the way you write, I can also see if you have understood the responses. Again, the emphasis is on reflection.

Some important notes

As a 6000-level seminar-based course, CS6222 is not intended to systematically teach biology knowledge or computational biology knowledge in a spoon-fed manner. Interactions and discussions are the key ingredient in a 6000-level seminar class. I hope everyone will participate actively and learn from each other. The more actively you participate, the more everyone will be able to extract from the course.

In the past, there were some sessions where the interactions appeared monopolised by a few students. This caused some other students, when I gave them low interaction marks, to blame me for letting these “talkative students” monopolise the interactions with what they claimed were "trivial questions". So, I want to tell you straight that you should not and cannot blame anyone when this happens. Here is why… There are two easy ways this monopoly situation can be broken. Firstly, you can jump in with your own questions if you have some questions you are dying to ask; I will definitely let you have a go at it. Secondly, you can jump in with your answers to the questions of those monopolising students if you have the answers; I will be delighted to see you answer each other's questions. If you keep quiet all the time, then you cannot blame other students for monopolising the floor. And since you don’t have answers to their questions (as you are not jumping in to answer them), you also cannot say later that they are wasting time with trivial questions (if their questions are trivial, then surely you must know the answers and thus can jump in to answer them).

In the past, there were some students who thought that, because they had asked or answered n questions in a session, they should get high marks for "Contribution to discussion". They were surprised that I did not give them high marks for this. They forgot that "Contribution to discussion" is also assessed on their reflection on what they have learned, and not by counting the number of interactions. Hence, you cannot get high marks for "Contribution to discussion" purely by providing a literal record of the interactions. You need to elaborate on what you have learned from some of those interactions.

Topics

Project

Can you think of a good way of measuring/quantifying batch effects? There is no need for a fully worked-out method. Just provide an outline of your idea, along with an explanation of why you think it will work. It is best to keep to 1-2 pages, though it is fine if you have done a lot of work and want to show more.