10 Dimensionality Reduction and Clustering of CosMx Data

Next, you can apply the standard dimensionality reduction methods if you want to visualise clusters in this data. Note, however, that for spatial data, clusters drawn on a 2D embedding such as a UMAP may not clearly separate the biologically meaningful groups. Always plot the clusters with ImageDimPlot as well, to check whether the clustering makes sense in its spatial context.

plot_dir <- "./plots/"
dir.create(plot_dir, showWarnings = FALSE)

# config
min_count_per_cell <- 100
max_pc_negs        <- 1.5
max_avg_neg        <- 0.5
num_dims           <- 10  

Also, one thing I noticed recently is that not using “too many” principal components to calculate the downstream UMAP or t-SNE reductions makes the cluster separation a bit “better”. That is why num_dims is defined above: it is the number of dimensions I pick from the PCA reduction for the steps below.

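# Select the 500 most variable features, scale them, and run PCA on them
merged <- FindVariableFeatures(merged, nfeatures = 500)
merged <- ScaleData(merged)
merged <- RunPCA(merged, features = VariableFeatures(merged))
# Use only the first num_dims PCs for the UMAP and the neighbour graph
merged <- RunUMAP(merged, dims = 1:num_dims)
merged <- FindNeighbors(merged, dims = 1:num_dims)
merged <- FindClusters(merged)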
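
One way to sanity-check a choice such as num_dims = 10 is to look at how much of the variance captured by the computed PCs falls within the first num_dims of them. The sketch below is only an illustration: pc_variance_share is a hypothetical helper (not a Seurat function), and the share is relative to the PCs that RunPCA actually computed, not to the total variance in the data.

```r
# Hypothetical helper (not part of Seurat): fraction of the variance held by
# the first `k` PCs, relative to all PCs supplied in `stdev`.
pc_variance_share <- function(stdev, k) {
  stopifnot(k >= 1, k <= length(stdev))
  variances <- stdev^2                  # PC variance = squared standard deviation
  sum(variances[seq_len(k)]) / sum(variances)
}

# Toy example with three PCs whose standard deviations are 3, 2 and 1
pc_variance_share(c(3, 2, 1), k = 2)

# Usage on the real object, assuming `merged` has a "pca" reduction:
# pc_variance_share(Stdev(merged, reduction = "pca"), k = num_dims)
```

Seurat's ElbowPlot(merged) shows the same standard deviations graphically, which can help to see where additional PCs stop contributing much.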

# UMAPs after filtering
# print() makes sure each plot is written to the PDF even when the script is run via source()
pdf(paste0(plot_dir, "filtered_umaps.pdf"), width = 17, height = 12)
  print(DimPlot(merged))
  print(DimPlot(merged, group.by = 'tma', raster = FALSE))
  print(DimPlot(merged, group.by = 'tma', split.by = "tumourtype_loc", ncol = 3, raster = FALSE))
  print(DimPlot(merged, group.by = 'Tumour_type', raster = FALSE))
  print(DimPlot(merged, group.by = 'Tumour_type', split.by = 'Tumour_type', raster = FALSE))
  print(FeaturePlot(merged, features = c("nCount_RNA", "nFeature_RNA", "nCount_negprobes", "nFeature_negprobes"), raster = FALSE))
dev.off()
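
To follow up on the advice at the start of this section, the clusters can also be drawn in their spatial context with ImageDimPlot. This is a minimal sketch, assuming that merged still carries at least one spatial FOV/image and that FindClusters has stored its result in the default seurat_clusters metadata column. Picking the first FOV is an arbitrary choice for illustration.

```r
library(Seurat)

# Names of the spatial FOVs/images stored in the object
# (assumption: `merged` contains at least one)
fov_names <- Images(merged)

# Colour each cell by its cluster, drawn at its spatial position in the first FOV
ImageDimPlot(merged, fov = fov_names[1], group.by = "seurat_clusters")
```

If a cluster that looks clean on the UMAP is scattered across the tissue with no recognisable structure, or if spatially distinct regions end up in a single cluster, that is a hint to revisit num_dims or the clustering resolution.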