tinyurl.com/whzcdjtk

langevitour

palmerpenguins

Explore orthonormal projections of high dimensional data.

langevitour is an htmlwidget. Interaction, calculations, and drawing are done in Javascript.

install.packages("langevitour")
library(langevitour)
langevitour(dataset, labels)



= seek a “good” projection


= spin randomly (Asimov, 1985)

langevitour is a new take on the “tour”

Novel feature is using Langevin Dynamics to produce a smooth random path of projections.



Other tour software generates a sequence of distinct projections, then interpolates.

Langevin Dynamics

“Stochastic Differential Equation”

Constrained Langevin Dynamics

The “position” should always represent an orthonormal projection.


langevitour uses “Position Based Dynamics”, a simple and stable way to maintain constraints in a physics engine.


Each timestep:

  1. Update velocity \(\mathbf{v} \leftarrow \mathbf{v} - \mathrm{damping} + \mathrm{noise} + \mathrm{forces}\)
  2. Propose new position \(\mathbf{x} \leftarrow \mathbf{x} + \Delta t\ \mathbf{v}\)
  3. Fix the proposed \(\mathbf{x}\) by finding the nearest valid orthonormal projection. (SVD and set singular values to 1)
  4. Fix \(\mathbf{v}\) to be consistent with the actual step taken by \(\mathbf{x}\).

Guided Tour

langevitour can sample projections near an optimum defined by a potential energy function.

For potential energy \(U(\mathbf{x})\) and temperature \(T\) it samples with density:

\[ \rho(\mathbf{x}) \propto e^{\frac{-U(\mathbf{x})}{T}} \]


The force applied is \(-\nabla U(\mathbf{x})\).


langevitour has potential functions based on distances between points, as if there is a repulsion force between points.


Stochastic Gradients: To save computation, only a mini-batch of the gradient is computed per frame.
Using a Stochastic Gradient just adds extra noise to Langevin Dynamics.

Future work: implement the index functions from tourr.

Single cell RNA sequencing example

Count RNA molecules from thousands of genes in tens of thousands of cells.


Example (Kang et al. 2018)

Peripheral Blood Mononuclear Cells
(immune cells from blood)
from eight donors with lupus.


Standard Seurat processing steps:

  1. Normalize and log transform counts.
  2. Find Principal Components.
  3. UMAP layout of cells.

Some points are doublets, a mixture of two cells. A nice feature of this dataset is that doublets can be identified confidently from genetic differences.

langevitour
visualization

10 Principal Components,
varimax rotated for interpretability.

Showing 10,000 cells.

Denoising

knnDenoise(scores)

Denoising makes the geometry clearer.

For each cell, find 30 nearest neighbors and their 30 nearest neighbors.

Set cell location to the average of this set of cells.


Doublets remain ambiguous with this method. Mouse-over the doublets label, see how they lie between cell types.

Response to cytokine

Cells were stimulated with a cytokine, recombinant IFN-β.

Gene expression changes, ready to fight an infection.

UMAP hints at the geometry.

captures the response to the cytokine in most cells.

brings most cell types into alignment.

Monocytes are doing something extra with .

Discussion

Seeing what a processing step does:
harmony removes differences between samples, allowing clustering by cell type.

Soak in the data or actively explore it.

Our eyes are good at interpreting, and enjoy,
small random motions.


See what scRNA-Seq processing steps are doing.


Use directly from R/RStudio, or share as static HTML from rmarkdown.


Acknowledgements

The start of the idea for langevitour came out of discussions with Prof. Di Cook and Dr. Stuart Lee.

Extra slides

Gene loadings

The gene loadings are a spiky ball.

Spikes are sets of genes that activate together.

varimax rotation has aligned the spikes to the axes.

For example, we can pull out genes involved in , or , or search for .

Momentum is important

Momentum-based methods are not just visually appealing, they are extremely efficient.

  • Momentum in optimization quickly traverses valleys.

  • Momentum means random paths reach distances \(\propto t\), rather than the \(\propto \sqrt t\) of Brownian motion.


Related methods

Hamiltonian Monte-Carlo (e.g. Stan):

  • Randomizes the velocity at distinct points rather rather than continuously injecting new randomness.
  • Metropolis-Hastings accept/reject step removes discrete time approximation error.


Stochastic Gradient Descent with momentum (e.g. Adam):

Langevin Dynamics

See e.g. Leimkuhler and Matthews (2015) for more details.

To simplify the equations on subsequent slides, we will say the mass and Boltzmann’s constant are 1.

Symbols

\[ \begin{align*} \mathbf{x} & \ \ \text{Position} \\ \mathbf{v} & \ \ \text{Velocity} \\ \Delta t & \ \ \text{Time step} \\ \gamma & \ \ \text{Damping amount} \\ T & \ \ \text{Temperature} \\ U(\mathbf{x}) & \ \ \text{Potential energy function} \\ H(\mathbf{x},\mathbf{v}) & \ \ \text{Hamiltonian} \\ \rho(\mathbf{x},\mathbf{v}) & \ \ \text{Probability density} \\ \mathbf{W} & \ \ \text{Vector of Weiner processes (Brownian motion)} \\ \end{align*} \]

Langevin Dynamics

An example phase-space portrait of 1D Langevin Dynamics, with position as the x-axis and velocity as the y-axis, showing how the state circulates over time.

\[ \begin{align*} \mathrm{d}\mathbf{v} &= -\gamma \mathbf{v} \mathrm{d}t + \sqrt{2\gamma T}\mathrm{d\mathbf{W}} -\nabla U(\mathbf{x}) \mathrm{d}t\\ \mathrm{d}\mathbf{x} &= \mathbf{v} \mathrm{d}t \end{align*} \]


The Hamiltonian is:

\[ H(\mathbf{x},\mathbf{v}) = \frac{1}{2}|\mathbf{v}|^2 + U(\mathbf{x}) \]

In steady state, the system has probability density proportional to:

\[ \rho(\mathbf{x},\mathbf{v}) \propto e^\frac{-H(\mathbf{x},\mathbf{v})}{T} \]

From this you can see the distribution of \(\mathbf{v}\) is multivariate-normal and the distribution of \(\mathbf{x}\) is determined by \(U(\mathbf{x})/T\). (Constraints add some complications to this.)

Langevin Dynamics numerical simulation

Care is taken to remain stable even with high damping.

1. Update velocity:

\[ \begin{align*} \mathbf{v}_{\text{proposed}\ i} &=& e^{- \Delta t \gamma} & \ \mathbf{v}_{i-1} & & \text{Damp previous velocity}\\ & & +\sqrt{T(1-e^{-2 \Delta t \gamma})} & \ \mathcal{N}(0,I) & & \text{Replace noise lost to damping}\\ & & -\Delta t & \ \nabla U(\mathbf{x}_{i-1}) & & \text{Accelerate downhill} \end{align*} \]

2. Update position:

\[ \mathbf{x}_{\text{proposed}\ i} = \mathbf{x}_{i-1} + \Delta t \ \mathbf{v}_{\text{proposed}\ i} \]

3. Fix position:

\[ \mathbf{x}_i = \text{project}( \mathbf{x}_{\text{proposed}\ i} ) \]

\(\ \ \ \text{project}\) SVD-decomposes projection matrix into \(\mathbf{USV}^\top\), returns \(\mathbf{UV}^\top\).

4. Fix velocity:

\[ \mathbf{v}_i = \frac{\mathbf{x}_i - \mathbf{x}_{i-1}}{\Delta t} \]