Students will be introduced to the concepts, tools and techniques of bioinformatics, a field of immense importance for understanding molecular evolution, individualized medicine, and data-intensive biology. The module includes a conceptual framework for modern bioinformatics and an introduction to key bioinformatics topics such as databases and software, sequence analysis, pairwise alignment, multiple sequence alignment, sequence database searches, profile-based methods, molecular phylogenetics, visualization and basic homology modelling of molecular structure, pathway analysis and personal genomics. Concepts emphasized in the lectures are complemented by hands-on use of bioinformatics tools in the practicals. Students will gain highly valued skills as biological researchers with basic competence in computational and bioinformatics techniques, and a proper foundation for learning more advanced skills in bioinformatics and biocomputing.
The goals of CS2220 (Introduction to Computational Biology) are: (1) development of flexible and logical problem solving skills; (2) understanding of main bioinformatics problems; and (3) appreciation of main techniques and approaches to bioinformatics. To achieve the goals above, we expose the students to a series of case studies spanning gene feature recognition, gene expression and proteomic analysis, gene finding, sequence homology interpretation, phylogeny analysis, physical mapping, and genome sequencing.
Computational Biology is a fast-changing area. In the post-genome era, many new biotechnologies have appeared and classical algorithms for computational biology are no longer enough. This module is intended to cover the important algorithms related to new technologies such as microarrays, SNPs, mass spectrometry, etc. After the complete sequencing of a number of genomes, we are at a stage to understand the mystery of our body; that is, we need to understand the information encoded in the genome and its relationship to RNA and protein. The aim of this module is to cover the algorithms related to this stage. In the module, we cover algorithms related to phylogenetics, RNA, proteomics, population genetics, microarrays, etc.
This module is an introduction to the algorithms and popular software tools for basic computational problems in genomics. It studies exact algorithms for those problems that can be solved easily, and approximation and/or heuristic algorithms for more difficult problems. The objective is to develop competitive knowledge in formulating biological problems in computational terms and solving these problems using the algorithmic approach. This module is for students with interests in computational molecular biology. Major topics: sequence analysis, multiple sequence alignment, phylogenetic analysis, DNA sequence assembly and mapping, gene finding, and the protein folding problem.
Present-day biomedical researchers are confronted by vast amounts of data from genome sequencing, microscopy, high-throughput analytical techniques for DNA, RNA, and proteins, and a host of other new experimental technologies. Coupled with the advances in computing power, this flow of information should enable scientists to model and understand biological systems in novel ways. The goals of CS4220 (Knowledge Discovery Methods in Bioinformatics) are to: (1) expose students to knowledge discovery techniques; (2) enhance students' flexible and logical problem solving skills; and (3) develop students' understanding of bioinformatics and of issues in the analysis of real-life high-throughput biological data. To achieve these goals, we do a series of in-depth studies and hands-on projects on topics such as gene-expression profile analysis, epistatic-interaction detection, protein-family recognition, protein-complex prediction, disease-associated-mutation detection, etc. At the end of the course, students will be able to identify the relevant techniques for different biological data to uncover new information, and will be confident in formulating and validating hypotheses underlying observations from biological data.
Biological data sets are enormous. Handling them with brute-force approaches is infeasible, so efficient algorithms are required. This module is an in-depth study of some of these advanced algorithms. Through the course, students will come to understand these algorithms in detail. They are also given a chance to solve some research problems in this field, including sequence comparison, indexing of biological databases, sequencing by hybridization, and more.
This course provides an introduction to modeling and analysis techniques relevant to systems biology with a focus on the dynamics of biochemical networks. We shall introduce models such as ordinary differential equations, Petri nets, Markov chains and dynamic Bayesian networks and show how they can be used to describe and analyse metabolic, signaling and gene regulatory networks. Self-study, tool-based modeling assignments and guest lectures by biologists will also be key components of the course. The core lectures will be largely self-contained and students with diverse backgrounds are expected and welcome to attend.
To understand the molecular basis of heredity, including the structure and function of genes and their role in phenotypic variations. Practical sessions will be conducted to reinforce the concepts taught during lectures. Techniques for genetic analysis and the use of model organisms such as Escherichia coli, Drosophila and higher plants will be taught. Which should we teach first: Mendelian Genetics or Molecular Genetics? The classical way genetics is taught is to start with Mendelian Genetics, followed by Molecular Genetics. Nevertheless, we have chosen to reverse the order, as today we need to understand and grasp the fundamental molecular mechanisms underlying transmission or classical genetics. Most students will have heard of dominant or recessive genes, but very few will be able to explain the molecular reasons or mechanisms that determine "dominance" or "recessiveness". This is one example of the mindset we hope to change.
CS1020 aims to give a systematic introduction to data structures and algorithms for constructing efficient computer programs. Emphasis is on data abstraction issues (through Abstract Data Types) in the program development process, and on efficient implementations of chosen data structures and algorithms. Commonly used data structures covered include stacks, queues, trees (including binary search trees, heaps and AVL trees), hashing, tables, and graphs; together with their corresponding algorithms (tree and graph traversals, minimum spanning trees). Simple algorithmic paradigms, such as generate-and-test (search) algorithms, greedy algorithms and divide-and-conquer algorithms will be introduced. Elementary analyses of algorithmic complexities will also be taught. CS1102C covers the same topics, but from an imperative paradigm perspective. CS1102S covers the same topics, but from a functional programming perspective.
Random sample and statistics, method of moments, maximum likelihood estimate, Fisher information, sufficiency and completeness, consistency and unbiasedness, sampling distributions, χ²-, t- and F-distributions, confidence intervals, exact and asymptotic pivotal method, concepts of hypothesis testing, likelihood ratio test, Neyman-Pearson lemma. This module is targeted at students who are interested in Statistics and are able to meet the pre-requisites.
This module introduces different techniques of designing and analysing algorithms. Students will learn about the framework for algorithm analysis, for example, lower bound arguments, average case analysis, and the theory of NP-completeness. In addition, students are exposed to various algorithm design paradigms. The module serves two purposes: to improve the students' ability to design algorithms in different areas, and to prepare students for the study of more advanced algorithms. The module covers lower and upper bounds, recurrences, basic algorithm paradigms (such as prune-and-search, dynamic programming, branch-and-bound, graph traversal, and randomised approaches), amortized analysis, NP-completeness, and some selected advanced topics.
This module aims to prepare students in competitive problem solving. It covers techniques for attacking and solving challenging computational problems. Fundamental algorithmic problem-solving techniques covered include divide and conquer, greedy, dynamic programming, backtracking and branch and bound. Domain-specific techniques from number theory, computational geometry, string processing and graph theory will also be covered. Advanced AI search techniques like iterative deepening, A* and heuristic search will be included. The module also covers algorithmic and programming language toolkits used in problem solving, supported by the solution of representative or well-known problems in the various algorithmic paradigms.
The module introduces the basic concepts in artificial intelligence. Topics covered include: intelligent agent, search, game playing, constraint satisfaction, logic, planning, reasoning, and learning.
This module introduces basic concepts and algorithms in machine learning and neural networks. The main reason for studying computational learning is to make better use of powerful computers to learn knowledge (or regularities) from raw data. The ultimate objective is to build self-learning systems that relieve humans of some of the already too many programming tasks. At the end of the course, students are expected to be familiar with the theories and paradigms of computational learning, and capable of implementing basic learning systems.
This module addresses the design of relational databases and object-oriented databases. Topics covered include: normalisation theory: functional, multi-valued and join dependencies, normal forms, relational database schema design using the decomposition method and the synthesizing method; entity-relationship approach: normal form entity-relationship diagram, its derivation, and its translation to relational, network, and hierarchical database schemas; schema integration: view integration and database integration, schema conflict resolution; nested relations: normal form nested relations, nested relations design using the decomposition method and the entity-relationship approach; object-oriented databases: basic concepts, inadequacies in object-oriented data models, inheritance conflict resolution, translation of relational database schemas and entity-relationship diagrams to object-oriented database schemas.
This module covers both the theory and practice of building knowledge-based systems. Its aim is to prepare students to design and build knowledge-based systems that solve real-world problems. The module starts with the motivations, background and history of knowledge-based system development. The main content has five parts: rule-based programming language; uncertainty management; knowledge-based systems design, development and life cycle; efficiency in rule-based language; and knowledge-based systems design examples.
Design of experiments. Descriptive statistics. Measurement error, correlation, regression. Probability. Chance variability. Estimation. Chance models, tests of significance. Multiple regression. Analysis of variance.
Data mining is a diverse field which draws its foundation from many research areas like databases, machine learning, AI, statistics, etc. The aim of this course is to highlight concepts from these areas which are fundamental and often used in building data mining tools. At the end of the course, students should: (1) have a good knowledge of the fundamental concepts that provide the foundation of data mining; (2) understand how these concepts are engineered to provide some of the basic data mining tools; and (3) be able to adopt these concepts to develop new data mining tools for new applications.
Decision making technologies can support decision making in the financial, operational, marketing and other areas. Efforts have been directed at finding new machine learning (ML) techniques for decision making and their possible application in solving practical problems. ML techniques such as artificial neural network methods have been proven to be powerful tools for decision making. Applications include credit rating, bankruptcy analysis, foreign exchange rate predictions and many others. The techniques covered in this course include neural networks for classification/regression/clustering, genetic algorithms for optimization, decision tree methods, support vector machines and data mining.
The module covers modelling methods that are suitable for reasoning with uncertainty. The main focus will be on probabilistic models including Bayesian networks and Markov networks. Topics include representing conditional independence, building graphical models, inference using graphical models and learning from data. Selected applications in various domains such as speech, vision, natural language processing, medical informatics, bioinformatics, data mining and others will be discussed.