Paul Harrison


Home page of Dr. Paul Harrison. I work for the Monash Genomics and Bioinformatics Platform, Bioinformatics Node at Monash University.



PGP public key
Fingerprint B829 F851 00B7 9CB7 6897 DCA7 A881 5DA8 5150 5ABF


Langevitour: Javascript widget with convenient R and Python wrappers. Interactively "tour" projections of high-dimensional datasets. The widget can be included in R Markdown or Quarto documents for easy sharing.

Topconfects: Bioconductor R package. Top differential expression by confident effect size. A more intuitive way to use "TREAT".

Weitrix: Bioconductor R package. Tools for working with a matrix of data measured with varying precision or with missing values.

Varistran: R package. Variance stabilizing transformation for RNA-Seq data. This is a useful pre-processing step prior to visualization, and also makes clustering and statistical analysis more straightforward. Includes a shiny app for assessing RNA-seq data quality with heatmap and biplot.

Tail-Tools: Analysis pipeline for PAT-Seq data.

Nesoni: Tools for processing bacterial high-throughput sequencing data. (No longer maintained.)

Plasmodium Autocount and Semiautocount: Cell counting on blood smears. (Historical interest only.)

Talks and slides

2022-03-25 slides Short talk for the NUMBATS group.
2022-06-22 video, slides useR! Conference talk.
2022-09-20 slides ABACBS National Seminars talk.
2023-12-08 short video, slides Using Langevitour to examine what UMAP and t-SNE hide and misrepresent, presented at IASC-ARS 2023.

2020-10-12 slides Weitrix: calibrate precision weights for diverse types of data in a matrix, explore and test weighted or sparse data (for Bioconductor Asia 2020).
2020-11-04 video

2019-08-21 slides An introduction for a general audience (for the NUMBATS group).
2020-03-10 slides A more technical presentation (WEHI Bioinformatics Seminar).

Machine learning
2017-06-14 slides Introduction to logistic regression and decision trees, and a description of my Melbourne Datathon 2017 Kaggle entry.

Tiling patterns
2022-01-27 video, slides Mathematical weirdness in tiling patterns.

2024-05-23 slides A short history of the Data Fluency program at Monash (for ARDC Digital Research Skills Summit/Carpentry Connect).


Publications on Google Scholar. As a bioinformatician, I am a perennial middle author.

PhD thesis (2005) on image texture synthesis. I also have several publications relating to my PhD work on texture.

Introduction to R, version 2 (2018) (version 1 here) Workshop material introducing R.

Programming and tidy data analysis in R (2019) (based on an earlier workshop, R more (2016)). Workshop to go from beginner to intermediate R usage, which I have developed and presented with other members of the Monash Bioinformatics Platform and the Monash Data Fluency initiative.

Linear models in R (2018) Workshop material introducing linear models in R. Linear models are a very broadly useful statistical tool, and also necessary background knowledge for using Bioconductor packages such as limma.

Dance of the CIs, a Javascript app version of the "Dance of the CIs" from the "New Statistics". Illustrates the meaning of confidence intervals, standard error, standard deviation, and prediction intervals.

PhD Co-supervision

Dr. Andrew David Pattison
PhD awarded 2018, principal supervisor Traude Beilharz, with Paul Harrison.

Dr. Stuart Andrew Lee
PhD awarded 2020, principal supervisor Dianne Cook, with Matthew Ritchie and Paul Harrison.

Jayani Lakshika
In progress, principal supervisor Dianne Cook, with Michael Lydeamore, Thiyanga Talagala, and Paul Harrison.

→ fun and personal stuff