Contents

1 Overview

Droplet-based scRNA-seq protocols encapsulate cells in water-in-oil droplets for massively multiplexed library preparation. We will be examining how to analyze this type of data, using a publicly available data set generated by 10X Genomics. We assume that you have already taken the previous workshop describing the basics of scRNA-seq data analysis.

This workshop will use R version 3.5.0 or higher, with a number of Bioconductor packages. If you haven’t downloaded and installed them already, you can do so by running the code below. This only needs to be done once - the packages will be on your computer once installed, and can be loaded with library.

source("https://bioconductor.org/biocLite.R")
biocLite(c("knitr", "BiocStyle", "EnsDb.Hsapiens.v86", "scater", 
    "Rtsne", "DropletUtils", "scran", "pheatmap"))

2 Setting up the data

2.1 Reading in a sparse matrix

We are using a data set containing 3000 T cells from a healthy human donor, see https://support.10xgenomics.com/single-cell-gene-expression/datasets/2.1.0/t_3k. The data have already been run through the CellRanger pipeline to obtain unique molecular identifier (UMI) counts for each gene in each cell.

We load in the raw count matrix using the read10xCounts() function from the DropletUtils package. This will create a SingleCellExperiment object where each column corresponds to a cell barcode.

library(DropletUtils)
fname <- "t_3k/raw_gene_bc_matrices/GRCh38"
sce <- read10xCounts(fname, col.names=TRUE)
sce
## class: SingleCellExperiment 
## dim: 33694 737280 
## metadata(0):
## assays(1): counts
## rownames(33694): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
##   ENSG00000268674
## rowData names(2): ID Symbol
## colnames(737280): AAACCTGAGAAACCAT-1 AAACCTGAGAAACCGC-1 ...
##   TTTGTCATCTTTAGTC-1 TTTGTCATCTTTCCTC-1
## colData names(2): Sample Barcode
## reducedDimNames(0):
## spikeNames(0):

Here, each count represents the number of UMIs assigned to a gene for a cell barcode. Note that the counts are loaded as a sparse matrix object - specifically, a dgCMatrix instance from the Matrix package. This avoids allocating memory to hold zero counts.

class(counts(sce))
## [1] "dgCMatrix"
## attr(,"package")
## [1] "Matrix"

Exercise:

How many non-zeroes are there?

mean(counts(sce)!=0)
## [1] 0.000289687

What is the difference in memory usage?

object.size(counts(sce))
## 156613864 bytes
as.numeric(ncol(sce))*as.numeric(nrow(sce)) * 4
## [1] 99367649280

How can you get column sums?

library(Matrix)
head(colSums(counts(sce)))
## AAACCTGAGAAACCAT-1 AAACCTGAGAAACCGC-1 AAACCTGAGAAACCTA-1 AAACCTGAGAAACGAG-1 
##                  0                  2                 22                  0 
## AAACCTGAGAAACGCC-1 AAACCTGAGAAAGTGG-1 
##                  0                  5
head(Matrix::colSums(counts(sce)))
## AAACCTGAGAAACCAT-1 AAACCTGAGAAACCGC-1 AAACCTGAGAAACCTA-1 AAACCTGAGAAACGAG-1 
##                  0                  2                 22                  0 
## AAACCTGAGAAACGCC-1 AAACCTGAGAAAGTGG-1 
##                  0                  5

2.2 Annotating the rows

We relabel the rows with the gene symbols for easier reading using the uniquifyFeatureNames() function.

library(scater)
rownames(sce) <- uniquifyFeatureNames(rowData(sce)$ID, rowData(sce)$Symbol)
head(rownames(sce))
## [1] "RP11-34P13.3"  "FAM138A"       "OR4F5"         "RP11-34P13.7" 
## [5] "RP11-34P13.8"  "RP11-34P13.14"

We also identify the chromosomal location for each gene, especially the mitochondrial location as this is particularly useful for later quality control.

library(EnsDb.Hsapiens.v86)
location <- mapIds(EnsDb.Hsapiens.v86, keys=rowData(sce)$ID, 
    column="SEQNAME", keytype="GENEID")
rowData(sce)$CHR <- location
summary(location=="MT")
##    Mode   FALSE    TRUE    NA's 
## logical   33537      13     144

3 Calling cells from empty droplets

In droplet-based data, we don’t know which droplets actually contain cells and which droplets are empty. We need to call cells from empty droplets based on the observed expression profiles, which is not always easy due to the presence of extracellular (or ambient) RNA in empty droplets. A good place to start is to examine a “barcode rank” plot:

br.out <- barcodeRanks(counts(sce))
plot(br.out$rank, br.out$total, log="xy", xlab="Rank", ylab="Total counts")
abline(h=br.out$inflection, col="forestgreen")
abline(h=br.out$knee, col="dodgerblue")
legend("bottomleft", legend=c("inflection", "knee"),
    col=c("forestgreen", "dodgerblue"), lwd=2)

We should see a sharp drop, above which are presumably cells (high RNA content) and below which are empty droplets (low RNA content). We could draw some horizontal line to use as a threshold for defining cells, which is pretty much what CellRanger does.

A more objective approach is implemented in the emptyDrops() function (see https://www.biorxiv.org/content/10.1101/234872v2). This tests whether the expression profile for each cell barcode is significantly different from the ambient pool. Any significant deviation indicates that the barcode corresponds to a cell-containing droplet.

We call cells at a false discovery rate (FDR) of 1%, meaning that no more than 1% of our called barcodes should be empty droplets on average.

set.seed(100)
e.out <- emptyDrops(counts(sce))
sig <- e.out$FDR <= 0.01
sum(sig, na.rm=TRUE)
## [1] 4009

We can look at how the likelihoods (of originating from the ambient pool) decrease with increasing total UMI count. Note that the negative log-probabilities are plotted below, which increase with increasing total count.

plot(e.out$Total, -e.out$LogProb, log="xy", col=ifelse(sig, "red", "black"),
    xlab="Total count", ylab="-Log probability")
legend("topleft", legend=c("Putative cell", "Empty"), 
    col=c("red", "black"), pch=16)

emptyDrops() computes Monte Carlo p-values, so it is important to set the random seed to obtain reproducible results. The number of Monte Carlo iterations also determines the lower bound for the _p_values. If any non-significant barcodes are TRUE for Limited, we may need to increase the number of iterations to ensure that they can be detected.

table(Sig=e.out$FDR <= 0.01, Limited=e.out$Limited)
##        Limited
## Sig     FALSE TRUE
##   FALSE   428    0
##   TRUE    231 3778

We then subset our SingleCellExperiment object to retain only the detected cells.

# using which() to automatically remove NAs.
sce <- sce[,which(e.out$FDR <= 0.01)]
sce
## class: SingleCellExperiment 
## dim: 33694 4009 
## metadata(0):
## assays(1): counts
## rownames(33694): RP11-34P13.3 FAM138A ... AC213203.1 FAM231B
## rowData names(3): ID Symbol CHR
## colnames(4009): AAACCTGCACATTAGC-1 AAACCTGGTAAATACG-1 ...
##   TTTGTCATCCCTCTTT-1 TTTGTCATCTGTCCGT-1
## colData names(2): Sample Barcode
## reducedDimNames(0):
## spikeNames(0):

Exercise:

How could I get all cells above the knee point?

keep <- br.out$total >= br.out$knee
summary(keep)
##    Mode 
## logical

How can I recapitulate CellRanger’s cell calling?

cr.keep <- defaultDrops(counts(sce), expected=3000)
summary(cr.keep)
##    Mode   FALSE    TRUE 
## logical     588    3421

What does table() do with two variables?

fruits <- sample(c("Apple", "Banana", "Cucumbers"), 10, replace=TRUE)
people <- sample(c("Alex", "Brett", "Cameron"), 10, replace=TRUE)
table(fruits, people)
##            people
## fruits      Alex Brett Cameron
##   Apple        0     0       2
##   Banana       2     1       2
##   Cucumbers    0     3       0

4 Quality control on the cells

It is possible for non-empty droplets to contain damaged or dying cells, which need to be removed prior to downstream analysis. We compute some QC metrics using calculateQCMetrics() and examine their distributions:

sce <- calculateQCMetrics(sce, feature_controls=list(Mito=which(location=="MT")))
par(mfrow=c(1,3))
hist(sce$log10_total_counts, breaks=20, col="grey80",
    xlab="Log-total UMI count")
hist(sce$log10_total_features_by_counts, breaks=20, col="grey80",
    xlab="Log-total number of expressed features")
hist(sce$pct_counts_Mito, breaks=20, col="grey80",
    xlab="Proportion of reads in mitochondrial genes")

We remove cells with low library sizes, low total number of expressed features or high mitochondrial proportions. The last is a proxy for cell damage in the absence of spike-in RNA.

low.lib <- isOutlier(sce$log10_total_counts, nmads=3, type="lower")
low.nfeat <- isOutlier(sce$log10_total_features_by_counts, nmads=3, type="lower")
high.mito <- isOutlier(sce$pct_counts_Mito, nmads=3, type="higher")
discard <- low.lib | low.nfeat | high.mito
data.frame(LowLib=sum(low.lib), LowNFeat=sum(low.nfeat), 
    HighMito=sum(high.mito), discard=sum(discard))
##   LowLib LowNFeat HighMito discard
## 1    623      614      541     713

We subset the SingleCellExperiment object to remove the low-quality cells.

sce <- sce[,!discard]
summary(discard)
##    Mode   FALSE    TRUE 
## logical    3296     713

Outlier-based QC requires some care in heterogeneous datasets, where particular cell types might naturally express fewer features. In this case, the population should be fairly homogeneous (all T cells) so we don’t have to worry too much.

5 Examining gene expression

The average expression of each gene is much lower here compared to the read count data set, due to:

ave <- calcAverage(sce)
rowData(sce)$AveCount <- ave
hist(log10(ave), col="grey80")

The set of most highly expressed genes is dominated by ribosomal protein and mitochondrial genes, as expected.

plotHighestExprs(sce)

6 Normalizing for cell-specific biases

Before estimating size factors with the deconvolution method (https://dx.doi.org/10.1186/s13059-016-0947-7), we perform some pre-clustering to break up obvious clusters and avoid pooling cells that are very different. The min.mean= argument just avoids using genes with lots of zero counts.

library(scran)
clusters <- quickCluster(sce, method="igraph", min.mean=0.1)
table(clusters)
## clusters
##    1    2    3    4    5    6    7 
##  173 1100  684  198  400  285  456

We then apply the deconvolution method to compute size factors for all cells.

sce <- computeSumFactors(sce, min.mean=0.1, cluster=clusters)
summary(sizeFactors(sce))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.3454  0.7950  0.9362  1.0000  1.0857 13.7084

The size factors are well correlated against the library sizes, indicating that capture efficiency and sequencing depth are the major biases.

plot(sce$total_counts, sizeFactors(sce), log="xy")
Size factors for all cells in the PBMC dataset, plotted against the library size.

Figure 1: Size factors for all cells in the PBMC dataset, plotted against the library size

Finally, we compute normalized log-expression values. There is no need to call computeSpikeFactors() here, as there are no spike-in transcripts available.

sce <- normalize(sce)
assayNames(sce)
## [1] "counts"    "logcounts"

Exercise:

What do we do about negative size factors?

lib.factors <- librarySizeFactors(sce)
is.neg <- sizeFactors(sce) < 0
# sum(is.neg)  # in this case we don't get any negative size factors, but this often happens
sizeFactors(sce)[is.neg] <- lib.factors[is.neg]

7 Modelling the mean-variance trend

We don’t have spike-ins, so we can’t directly model the technical noise. One option is to assume that most genes do not exhibit strong biological variation, and to fit a trend to the variances of endogenous genes.

fit <- trendVar(sce, use.spikes=FALSE, loess.args=list(span=0.05))
plot(fit$mean, fit$var, pch=16, xlab="Mean log-exression",
    ylab="Variance of log-expression")
curve(fit$trend(x), col="dodgerblue", add=TRUE)

Another option is to assume that the technical noise is Poisson and create a fitted trend on that basis using the makeTechTrend() function.

new.trend <- makeTechTrend(x=sce)
plot(fit$mean, fit$var, pch=16,  xlab="Mean log-exression",
    ylab="Variance of log-expression")
curve(new.trend(x), col="red", add=TRUE)

Assuming Poisson noise, we decompose the total variance to its technical and biological components:

fit$trend <- new.trend # tricking decomposeVar into thinking this is the trend!
dec <- decomposeVar(fit=fit)
top.dec <- dec[order(dec$bio, decreasing=TRUE),]
head(top.dec)
## DataFrame with 6 rows and 6 columns
##                     mean            total               bio
##                <numeric>        <numeric>         <numeric>
## JUNB    3.39636209008565 1.51843057378741  1.28953983309707
## CCL5   0.846162187216013 1.80776873132686  1.26186194792039
## S100A4  1.98965776403636 1.56977284981874  1.03811966473351
## KLRB1  0.732229495410003  1.5266610118305  1.01866513709676
## JUN      2.3625249611841  1.4212739579397 0.974601639995423
## GNLY   0.559285945928888 1.39020132410692 0.959013605318938
##                     tech   p.value       FDR
##                <numeric> <numeric> <numeric>
## JUNB   0.228890740690336         0         0
## CCL5   0.545906783406467         0         0
## S100A4 0.531653185085226         0         0
## KLRB1  0.507995874733747         0         0
## JUN    0.446672317944277         0         0
## GNLY   0.431187718787979         0         0

We have a look at the genes with the largest biological components.

plotExpression(sce, features=rownames(top.dec)[1:10])

8 Dimensionality reduction

We use the denoisePCA() function with the assumed Poisson technical trend, to choose the number of dimensions to retain after PCA. Note the approx=TRUE, which uses irlba to perform a fast PCA.

sce <- denoisePCA(sce, technical=new.trend, approx=TRUE)
ncol(reducedDim(sce, "PCA"))
## [1] 25

Examination of the first few PCs reveals some structure in the data:

plotPCA(sce, ncomponents=3)

This is more evident with a t-SNE plot on the PCs (see the use of use_dimred=).

sce <- runTSNE(sce, use_dimred="PCA", perplexity=40)
plotTSNE(sce)

Exercise:

How do we check the percentage of variance explained?

plot(attr(reducedDim(sce), "percentVar"), xlab="PC",
    ylab="Proportion of variance explained")
abline(v=ncol(reducedDim(sce, "PCA")), lty=2, col="red")

9 Clustering with graph-based methods

We start by building a shared nearest neighbour graph, where each cell is connected to its neighbours by an edge. Unlike hierarchical clustering, we do not need to construct a distance matrix for a large number of cells.

set.seed(100)
snn.gr <- buildSNNGraph(sce, use.dimred="PCA")
snn.gr
## IGRAPH c57897f U-W- 3296 223958 -- 
## + attr: weight (e/n)
## + edges from c57897f:
##  [1] 1--   9 1--  20 1-- 100 1-- 108 1-- 123 1-- 128 1-- 172 1-- 179 1-- 192
## [10] 1-- 203 1-- 420 1-- 457 1-- 466 1-- 472 1-- 475 1-- 477 1-- 479 1-- 560
## [19] 1-- 561 1-- 604 1-- 623 1-- 625 1-- 640 1-- 646 1-- 677 1-- 716 1-- 760
## [28] 1-- 782 1-- 811 1-- 867 1-- 893 1-- 920 1-- 945 1-- 955 1-- 966 1--1009
## [37] 1--1026 1--1064 1--1082 1--1085 1--1122 1--1143 1--1148 1--1161 1--1209
## [46] 1--1253 1--1256 1--1265 1--1291 1--1317 1--1329 1--1359 1--1364 1--1383
## [55] 1--1465 1--1527 1--1534 1--1556 1--1585 1--1614 1--1617 1--1625 1--1639
## [64] 1--1647 1--1655 1--1674 1--1675 1--1707 1--1713 1--1755 1--1770 1--1803
## + ... omitted several edges

We then use the Walktrap algorithm to identify clusters, i.e., “communities” in the graph. These are groups of cells that are highly connected within each group, with relatively few connections between groups.

clusters <- igraph::cluster_walktrap(snn.gr)
sce$Cluster <- factor(clusters$membership)
table(sce$Cluster)
## 
##   1   2   3   4   5   6   7   8   9  10 
## 420 282  52 408 764  20 812 125  40 373

We examine the “modularity” of the clusters, i.e., how enriched a cluster is for intra-group connections relative to what is expected under a null model. At least a few of the clusters are not very modular (strong off-diagonal entries) - not surprising for T cell subsets.

cluster.mod <- clusterModularity(snn.gr, sce$Cluster, get.values=TRUE)
log.ratio <- log2(cluster.mod$observed/cluster.mod$expected + 1)

library(pheatmap)
pheatmap(log.ratio, cluster_rows=FALSE, cluster_cols=FALSE, 
    color=colorRampPalette(c("white", "blue"))(100))

We examine the cluster identities on a t-SNE plot:

plotTSNE(sce, colour_by="Cluster")

Exercise:

Why did I use igraph::?

library(igraph)
normalize
## function (xmin = -1, xmax = 1, ymin = xmin, ymax = xmax, zmin = xmin, 
##     zmax = xmax) 
## {
##     args <- grab_args()
##     layout_modifier(id = "normalize", args = args)
## }
## <bytecode: 0x563cd0b25fd8>
## <environment: namespace:igraph>

10 Marker gene detection

We detect marker genes for each cluster using findMarkers(). We only look at upregulated genes in each cluster, as these are more useful for positive identification of cell types in a heterogeneous population.

markers <- findMarkers(sce, clusters=sce$Cluster, direction="up")

We examine the markers for cluster 10 in more detail.

marker.set <- markers[["10"]]
head(marker.set, 10) 
## DataFrame with 10 rows and 12 columns
##                     Top               p.value                   FDR
##               <integer>             <numeric>             <numeric>
## YPEL5                 1 6.74134041941487e-145 2.27142724091768e-140
## CD8B                  1 3.27464099861097e-131 5.51678769035996e-127
## RPS6                  1 1.30756234003215e-113 1.46856684950143e-109
## RP11-291B21.2         1  1.43119863317174e-96   6.8889723922984e-93
## RPL32                 1  8.80002538397251e-95  3.29453394763963e-91
## RCAN3                 1  1.46064933436353e-55  8.06805224131879e-53
## RPS3A                 2 2.23459865182688e-106 1.88231417436639e-102
## RPL34                 2 2.01275702466737e-103  1.13029725315239e-99
## EIF1                  2   2.5169585243996e-63  2.01920001240763e-60
## RPL21                 3 1.80146792258314e-105 1.21397320367032e-101
##                           logFC.1           logFC.2           logFC.3
##                         <numeric>         <numeric>         <numeric>
## YPEL5            1.70211141911096  1.30612566468142  1.81228458064147
## CD8B            0.839988144406146  1.00510055804443  1.39376533952869
## RPS6           0.0793291583715217 0.855530191551329  1.81057128996238
## RP11-291B21.2    1.03276712709175  1.30166459747607  1.44777718650354
## RPL32          0.0214806202433477 0.799788752464195  1.86453459595128
## RCAN3          0.0375157233456345 0.370713659322517 0.235576350805402
## RPS3A          0.0731771668433199  0.86940599790502  1.70832973028906
## RPL34          0.0623807587860856 0.779973140731459  2.00227946840967
## EIF1            0.748013534263353 0.266138911901223 0.713354154565358
## RPL21         0.00465536788547283 0.757738865244604  1.95999745155759
##                         logFC.4            logFC.5           logFC.6
##                       <numeric>          <numeric>         <numeric>
## YPEL5          1.79163092027117   0.56309647611639   1.2231920710111
## CD8B            1.3441052413981   1.42165481827623  1.54309895451392
## RPS6          0.439098023361932  0.413977578610591 0.961435693008676
## RP11-291B21.2  1.43659979383851   1.43513978508051  1.49132067952303
## RPL32         0.303601161456027  0.350014403257666  1.03236011743579
## RCAN3         0.203662920416228 0.0278589948734735  0.51445452488965
## RPS3A         0.517080257843371  0.503649105966952 0.636237793216829
## RPL34         0.293034854009429  0.269938602394037   1.1557633420992
## EIF1          0.500174948514634 -0.124691033572306 0.539991982719002
## RPL21         0.514857022922799   0.45909597047159  1.23872489209415
##                          logFC.7           logFC.8           logFC.9
##                        <numeric>         <numeric>         <numeric>
## YPEL5         0.0755868974651621  1.38207073360608  2.22537072230096
## CD8B            1.32289756054928 0.789122519550124 0.752651135990378
## RPS6           0.102821626169893  1.51743512272752  1.47105713445701
## RP11-291B21.2   1.35668609385301  1.33326484614313  1.43069781924929
## RPL32         0.0632682203956509  1.55642626699641   1.5031495831972
## RCAN3         0.0100623839909427 0.547560977450219 0.596049357463156
## RPS3A         0.0936364647617074   1.5336287770961  1.37574678453785
## RPL34         0.0476929190272086  1.56796595997118  1.40183247412623
## EIF1          -0.103879278750165 0.381278184819664  1.02448327706038
## RPL21         0.0385124342755301  1.32924355159677  1.25661644579947

The transcriptional profile of cluster 10 is clearly distinct from the others:

chosen <- rownames(marker.set)[marker.set$Top <= 10]
plotHeatmap(sce, features=chosen, exprs_values="logcounts", 
    zlim=3, center=TRUE, symmetric=TRUE, cluster_cols=FALSE,
    colour_columns_by="Cluster", columns=order(sce$Cluster))

… which can be further examined on the t-SNE plot:

plotTSNE(sce, colour_by="GNLY")

So, these are probably cytotoxic T cells, with some Jun/Fos activity.

11 Concluding remarks

Having completed the basic analysis, we save the SingleCellExperiment object with its associated data to file. This avoids having to repeat all of the pre-processing steps described above prior to further analyses.

saveRDS(sce, file="t3k_data.rds")

See https://www.bioconductor.org/packages/release/workflows/vignettes/simpleSingleCell/inst/doc/work-3-tenx.html for more details.

Meanwhile, show the session information for record-keeping:

sessionInfo()
## R Under development (unstable) (2019-03-19 r76252)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/local/lib/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] igraph_1.2.4                pheatmap_1.0.12            
##  [3] scran_1.11.23               EnsDb.Hsapiens.v86_2.99.0  
##  [5] ensembldb_2.7.11            AnnotationFilter_1.7.0     
##  [7] GenomicFeatures_1.35.9      AnnotationDbi_1.45.1       
##  [9] scater_1.11.13              ggplot2_3.1.0              
## [11] Matrix_1.2-17               DropletUtils_1.3.13        
## [13] SingleCellExperiment_1.5.2  SummarizedExperiment_1.13.0
## [15] DelayedArray_0.9.9          BiocParallel_1.17.18       
## [17] matrixStats_0.54.0          Biobase_2.43.1             
## [19] GenomicRanges_1.35.1        GenomeInfoDb_1.19.2        
## [21] IRanges_2.17.4              S4Vectors_0.21.21          
## [23] BiocGenerics_0.29.2         knitr_1.22                 
## [25] BiocStyle_2.8.2            
## 
## loaded via a namespace (and not attached):
##  [1] ProtGenerics_1.15.0      bitops_1.0-6            
##  [3] bit64_0.9-7              RColorBrewer_1.1-2      
##  [5] progress_1.2.0           httr_1.4.0              
##  [7] dynamicTreeCut_1.63-1    tools_3.6.0             
##  [9] R6_2.4.0                 irlba_2.3.3             
## [11] HDF5Array_1.11.11        vipor_0.4.5             
## [13] DBI_1.0.0                lazyeval_0.2.2          
## [15] colorspace_1.4-1         withr_2.1.2             
## [17] tidyselect_0.2.5         gridExtra_2.3           
## [19] prettyunits_1.0.2        curl_3.3                
## [21] bit_1.1-14               compiler_3.6.0          
## [23] BiocNeighbors_1.1.12     labeling_0.3            
## [25] rtracklayer_1.43.3       bookdown_0.9            
## [27] scales_1.0.0             stringr_1.4.0           
## [29] digest_0.6.18            Rsamtools_1.99.4        
## [31] rmarkdown_1.12           R.utils_2.8.0           
## [33] XVector_0.23.2           pkgconfig_2.0.2         
## [35] htmltools_0.3.6          highr_0.8               
## [37] limma_3.39.14            rlang_0.3.3             
## [39] RSQLite_2.1.1            DelayedMatrixStats_1.5.2
## [41] dplyr_0.8.0.1            R.oo_1.22.0             
## [43] RCurl_1.95-4.12          magrittr_1.5            
## [45] BiocSingular_0.99.14     GenomeInfoDbData_1.2.0  
## [47] Rcpp_1.0.1               ggbeeswarm_0.6.0        
## [49] munsell_0.5.0            Rhdf5lib_1.5.4          
## [51] viridis_0.5.1            R.methodsS3_1.7.1       
## [53] stringi_1.4.3            yaml_2.2.0              
## [55] edgeR_3.25.3             zlibbioc_1.29.0         
## [57] Rtsne_0.15               rhdf5_2.27.15           
## [59] plyr_1.8.4               grid_3.6.0              
## [61] blob_1.1.1               dqrng_0.1.1             
## [63] crayon_1.3.4             lattice_0.20-38         
## [65] cowplot_0.9.4            Biostrings_2.51.5       
## [67] hms_0.4.2                locfit_1.5-9.1          
## [69] pillar_1.3.1             reshape2_1.4.3          
## [71] biomaRt_2.39.2           XML_3.98-1.19           
## [73] glue_1.3.1               evaluate_0.13           
## [75] gtable_0.3.0             purrr_0.3.2             
## [77] assertthat_0.2.1         xfun_0.5                
## [79] rsvd_1.0.0               viridisLite_0.3.0       
## [81] tibble_2.1.1             GenomicAlignments_1.19.1
## [83] beeswarm_0.2.3           memoise_1.1.0           
## [85] statmod_1.4.30