All teaching materials from the Core Bioinformatics group can be found on our GitHub page.

CSCI Teaching

Introduction to RNAseq

This repository contains an overview of bulk RNA-Seq data analysis, delivered to members of the CSCI in November 2021.

Overview of pipeline components
QC, alignment to the reference genome and feature quantification
Noise quantification and removal
Post-alignment QC, differential expression and enrichment analysis

BBS

This repository contains slides from a course on various bioinformatics topics, delivered from February to March 2021.

Linear Regression

The linear regression lecture slides can be found here.

The practical materials can be found here.

The supervision materials can be found here.

RNAseq

The RNAseq lecture slides can be found here and here.

The practical materials can be found here for mRNAseq and here for sRNAseq.

The supervision materials can be found here.

Machine Learning

Supervised Learning

The supervised Machine Learning lecture slides can be found here.

The practical materials for supervised Machine Learning can be found here.

Unsupervised Learning

The unsupervised Machine Learning lecture slides can be found here.

The practical materials for unsupervised Machine Learning can be found here.

Introduction to Machine Learning

This repository contains an introductory practical machine learning course delivered to AstraZeneca in November 2021.

Introduction, overview of techniques and cross-validation
The caret package and k-nearest neighbours
Decision trees, random forests, and support vector machines
Practical on supervised approaches
Dimensionality reduction and clustering

Cuomo et al 2020

This example is based on single-cell RNA-sequencing of differentiating iPS cells, with data taken from the Cuomo et al 2020 paper. To reduce the time requirements, we focus on day 3 and on only three donors. Using 500 randomly selected genes, the aim is to classify cells accurately into three pre-defined cell types. The data corresponding to these genes, timepoints and donors can be found here.

You should first visualise the data to understand it, then preprocess it ready for classification. Apply and test some classical classifiers (decision tree, random forest, SVM) and optimise their respective hyperparameters. You could also identify the genes that best discriminate between cell types and retrain the classifiers on this smaller set of genes. Finally, compare the classifiers and choose the best one. A skeleton Rmd broken down into these steps can be found here.
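The hyperparameter optimisation step can be sketched with k-fold cross-validation. The block below is a minimal illustration under stated assumptions, not the course's own code: it is written in Python with the standard library only (the course materials use R), it uses synthetic three-class data in place of the expression matrix, and it tunes a k-nearest-neighbours classifier as a simple stand-in for the decision tree, random forest and SVM named above. All function names (`make_toy_data`, `knn_predict`, `cross_validated_accuracy`, `tune_k`) are hypothetical.

```python
import random
from collections import Counter
from math import dist


def make_toy_data(n_per_class=30, n_features=5, seed=0):
    """Synthetic stand-in for an expression matrix: three classes with shifted means."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label in range(3):
        for _ in range(n_per_class):
            samples.append([rng.gauss(2.0 * label, 1.0) for _ in range(n_features)])
            labels.append(label)
    return samples, labels


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_validated_accuracy(samples, labels, k, n_folds=5, seed=0):
    """Mean held-out accuracy over n_folds folds of the shuffled data."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for held_out in folds:
        held_out_set = set(held_out)
        train_idx = [i for i in indices if i not in held_out_set]
        train_x = [samples[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        # Each held-out sample is predicted from the remaining folds only.
        correct = sum(
            knn_predict(train_x, train_y, samples[i], k) == labels[i]
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / n_folds


def tune_k(samples, labels, candidates=(1, 3, 5, 7, 9)):
    """Return (best_k, scores), where scores maps each candidate k to its CV accuracy."""
    scores = {k: cross_validated_accuracy(samples, labels, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


if __name__ == "__main__":
    x, y = make_toy_data()
    best_k, scores = tune_k(x, y)
    for k, acc in scores.items():
        print(f"k={k}: mean CV accuracy {acc:.3f}")
    print("selected k:", best_k)
```

The same loop applies to any classifier and hyperparameter grid. Because the cross-validated accuracy used to select a hyperparameter is optimistic, the final comparison between classifiers is better made on data that played no part in tuning.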

Some example analysis can be found in this folder.
