GS6883B: Integrative Science & Engineering Research, "Journal Club on the Reluctant Data Scientist", 2023

Session #1, 29 August 2023: Anna Karenina Principle

There is a large gap between theory and practice when theoretical statistics are applied to real-world data. It arises when the null hypothesis is rejected for extraneous reasons (confounders), rather than because the alternative hypothesis is relevant to the disease phenotype. The mechanics of applying statistical tests must therefore address and resolve confounders. Simply manipulating the P-value is inadequate; indeed, I will show how and why this can be the wrong thing to do!
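As a rough illustration of a null hypothesis being rejected for an extraneous reason, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not material from the session: the batch variable, the sample sizes, the size of the batch shift, and the helper names `welch_t` and `simulate`. There is no true disease effect in the simulated data, yet the naive comparison yields a large test statistic because disease samples are mostly processed in one batch.

```python
import random
import statistics
from math import sqrt


def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / sqrt(
        var_a / len(a) + var_b / len(b)
    )


def simulate(n_major=160, n_minor=40, batch_shift=2.0, seed=0):
    """Return {(group, batch): values} with NO true disease effect.

    Hypothetical setup: batch "B" adds `batch_shift` to every measurement,
    and disease samples are mostly processed in batch "B".
    """
    rng = random.Random(seed)

    def draw(n, shift):
        return [rng.gauss(0.0, 1.0) + shift for _ in range(n)]

    return {
        ("disease", "A"): draw(n_minor, 0.0),
        ("disease", "B"): draw(n_major, batch_shift),
        ("control", "A"): draw(n_major, 0.0),
        ("control", "B"): draw(n_minor, batch_shift),
    }


data = simulate()

# Naive test: pool the batches and compare disease against control.
naive_t = welch_t(
    data[("disease", "A")] + data[("disease", "B")],
    data[("control", "A")] + data[("control", "B")],
)

# Stratified test: compare disease against control within each batch.
within_batch_t = {
    batch: welch_t(data[("disease", batch)], data[("control", batch)])
    for batch in ("A", "B")
}

print("naive t statistic:", round(naive_t, 2))
for batch, t in within_batch_t.items():
    print("within batch", batch, "t statistic:", round(t, 2))
```

The naive statistic reflects the batch imbalance rather than the disease. Stratifying by batch is only one simple way to address a known confounder; it does nothing for confounders that were never recorded.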

Session #2, 30 August 2023: Moneta-Koehler et al., "The limitations of GRE in predicting success in biomedical graduate school", PLoS ONE 12(1):e0166742, 2017

This paper shows that GRE scores are not correlated with outcomes in graduate school training, such as passing the qualifying exam (QE) or a shorter time to defense. Can you find some analysis/methodological bugs that might invalidate some of these claims?

Session #3, 5 September 2023: Draghici et al., "Global functional profiling of gene expression", Genomics 81(2):98-104, 2003

This paper shows how to test a list of differentially expressed genes (in a condition being studied) against biological pathways to identify the biological processes pertinent to that condition. Can you find some analysis/methodological bugs that might cause the proposed approach to be less effective than claimed?
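For orientation, one common way to test a gene list against a pathway is a hypergeometric over-representation test. The sketch below is not necessarily the exact model used in the paper, and the function name `hypergeom_enrichment_p` and all gene counts are made up for illustration. It asks: if the differentially expressed genes were drawn at random from all measured genes, how likely is an overlap with the pathway at least as large as the one observed?

```python
from math import comb


def hypergeom_enrichment_p(n_genes, n_pathway, n_de, n_overlap):
    """Upper-tail hypergeometric P-value, P(X >= n_overlap).

    n_genes   : number of genes measured (the background)
    n_pathway : number of background genes annotated to the pathway
    n_de      : number of differentially expressed (DE) genes
    n_overlap : number of DE genes that are in the pathway
    """
    upper = min(n_pathway, n_de)
    total = comb(n_genes, n_de)
    tail = sum(
        comb(n_pathway, k) * comb(n_genes - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20000 genes measured, 100 in the pathway,
# 400 DE genes, 10 of which fall in the pathway (about 2 expected by chance).
p = hypergeom_enrichment_p(n_genes=20000, n_pathway=100, n_de=400, n_overlap=10)
print("enrichment P-value:", p)
```

Note that the test treats genes as independent draws from the background and depends on which genes are counted as the background; both are assumptions worth keeping in mind when reading the paper.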

Session #4, 13 September 2023: Seo et al., "DeepFam: deep learning based alignment-free method for protein family modeling and prediction", Bioinformatics 34(13):i254-i262, 2018

This paper claims very high accuracy in protein family prediction. Can you find some analysis/methodological bugs that might cause the proposed approach to be less effective than claimed?

An important note

Interactions and discussions are the key ingredients in a 6000-level seminar class. I hope everyone will participate actively and learn from each other.

In the past, there were some sessions where the interactions appeared to be monopolised by a few students. When I gave some other students low interaction marks, they blamed me for letting these "talkative students" monopolise the interactions with what they claimed were "trivial questions".

So, I want to tell you straight that you should not and cannot blame anyone when this happens. Here is why: there are two easy ways to break this monopoly. Firstly, if you have questions you are dying to ask, you can jump in with them; I will definitely let you have a go. Secondly, if you have answers to the questions of the monopolising students, you can jump in with them; I will be delighted to see you answer each other's questions.

If you keep quiet all the time, then you cannot blame other students for monopolising the floor. And since you don't have answers to their questions (as you are not jumping in to answer them), you also cannot say later that they are wasting time with trivial questions (if their questions are trivial, then surely you must know the answers and thus can jump in to answer them).

Assessment

(50%) Give a presentation of 1-2 slides, lasting about 5 minutes, in sessions #2 - #4. The presentation can cover general background that you think is relevant and important for the session. It can also be about the method or results of the paper chosen for the session. Of course, you can also present your opinions about the paper. But remember: 2 slides max, and 5 minutes max.

(50%) Contribution to discussion in at least 2 sessions. For each session, write a brief 1- or 2-page report that records the questions you have PERSONALLY ASKED or PERSONALLY RESPONDED to in that session; for questions you have asked, also record the responses you received. The questions can be directed at a presenter or anyone else in the class; please say to whom each question was addressed. You do not need to submit a report for a session if you have not participated in any interaction in that session.