Algorithm and scientific questions: <Huidong.Chen at mgh dot harvard dot edu>
Module wrapping issues: Ted Liefeld <jliefeld at cloud dot ucsd dot edu>
STREAM (Single-cell Trajectories Reconstruction, Exploration And Mapping) is an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data. Within GenePattern, STREAM is implemented as a collection of modules that cover the entire STREAM processing pipeline, so that individual steps can be performed interactively for data exploration.
STREAM.Plot2DVisualization is used to check whether the data show a clear, meaningful trajectory pattern. If they do, the analysis continues downstream by placing the cells onto the trajectories. If not, go back to the previous steps and adjust the parameters used to filter and prepare the data.
UMAP is a manifold learning technique for dimension reduction, constructed from a theoretical framework based in Riemannian geometry and algebraic topology. UMAP generally preserves more of the global structure than tSNE and runs faster.
tSNE is a technique for dimensionality reduction used to visualize high-dimensional datasets. It is implemented via Barnes-Hut approximations, which allows it to be applied to large real-world datasets.
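The tSNE perplexity parameter of this module can be read as a smooth measure of the effective number of neighbors each cell attends to. As a minimal sketch of that idea (not STREAM's own code), the snippet below computes the perplexity of a probability distribution as 2 raised to its Shannon entropy in bits, which is the standard tSNE definition. The function name `perplexity` and the two example distributions are illustrative assumptions.

```python
import math


def perplexity(probabilities):
    """Return 2**H(P), where H is the Shannon entropy of P in bits.

    Illustrative helper only; the weights are normalised to sum to 1 first.
    """
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    entropy = 0.0
    for p in probabilities:
        q = p / total
        if q > 0:  # 0 * log(0) is treated as 0
            entropy -= q * math.log2(q)
    return 2.0 ** entropy


# A cell that weights 30 neighbors equally has a perplexity of about 30.
uniform = [1.0 / 30] * 30
# A cell dominated by a single neighbor has a much lower perplexity.
peaked = [0.9] + [0.1 / 29] * 29

print(perplexity(uniform))
print(perplexity(peaked))
```

A higher perplexity therefore makes tSNE balance each cell against a wider neighborhood, while a lower value emphasizes very local structure.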
H Chen, L Albergante, JY Hsu, CA Lareau, GL Bosco, J Guan, S Zhou, AN Gorban, DE Bauer, MJ Aryee, DM Langenau, A Zinovyev, JD Buenrostro, GC Yuan, L Pinello. Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM. Nature Communications 10, 1903 (2019).
Nestorowa, S. et al. A single-cell resolution map of mouse hematopoietic stem and progenitor cell differentiation. Blood 128, e20-31 (2016).
Pinello Lab STREAM GitHub Repository
Leland McInnes, John Healy, James Melville, UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
L.J.P. van der Maaten. Accelerating t-SNE using Tree-Based Algorithms. Journal of Machine Learning Research 15(Oct):3221-3245, 2014.
Example data for the STREAM workflow can be downloaded from Dropbox: Stream Example Data
Ref: Nestorowa, S. et al. A single-cell resolution map of mouse hematopoietic stem and progenitor cell differentiation. Blood 128, e20-31 (2016).
An input file suitable for this step is available at dimred_stream_result.pkl
GenePattern 3.9.11 or later (dockerized).
|data file*||A STREAM pkl file containing an annotated AnnData matrix of gene expression data.|
|output filename*||The output filename prefix.|
|method||Method used for visualization. Choose from: 'umap' (Uniform Manifold Approximation and Projection) or 'tsne' (t-Distributed Stochastic Neighbor Embedding).|
|percent neighbor cells||The percentage of neighbor cells (only valid when 'umap' is specified).|
|perplexity||The perplexity used (only valid when 'tsne' is specified).|
|color by||Specify how to color cells. 'label': the cell labels; 'branch': the branch id identified by STREAM.|
|use precomputed||If True, the visualization coordinates from a previous computation (stored in the input pkl file) will be used.|
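The choices in the table above can be sketched as a small validation helper. This is a hypothetical illustration, not the module's implementation: the function names are invented here, and the rule that turns the neighbor setting into an integer count (a fraction of the cell count, rounded, with a floor of 2) is an assumption, since this page does not state how the value is interpreted.

```python
VALID_METHODS = ("umap", "tsne")
VALID_COLOR_BY = ("label", "branch")


def check_choices(method, color_by):
    """Reject values outside the choices listed in the parameter table."""
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {VALID_METHODS}, got {method!r}")
    if color_by not in VALID_COLOR_BY:
        raise ValueError(f"color by must be one of {VALID_COLOR_BY}, got {color_by!r}")


def active_tuning_parameter(method):
    """Return the tuning parameter that applies to the chosen method."""
    check_choices(method, "label")
    return "percent neighbor cells" if method == "umap" else "perplexity"


def neighbor_count(n_cells, neighbor_fraction):
    """Turn a neighbor fraction into an integer count (ASSUMED rule, see above)."""
    if not 0 < neighbor_fraction <= 1:
        raise ValueError("neighbor_fraction must be in (0, 1]")
    return max(2, round(n_cells * neighbor_fraction))


print(active_tuning_parameter("umap"))
print(active_tuning_parameter("tsne"))
print(neighbor_count(1000, 0.1))
```

Keeping the method check separate from the tuning parameter makes it explicit that 'perplexity' is ignored for 'umap' runs and 'percent neighbor cells' is ignored for 'tsne' runs.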
Plotting: parameters controlling the output figures.
|figure height||Figure height, as used in matplotlib figures. Default: 8.|
|figure width||Figure width, as used in matplotlib figures. Default: 8.|
|figure legend num columns||The number of columns used in the figure legend. Default: 3.|
* - required
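As a rough sketch of how the plotting parameters relate to matplotlib, the helper below packs them into the keyword arguments matplotlib expects: `figsize` is given as (width, height) in that order, and the legend column count is passed as `ncol`. The function `figure_kwargs` is a hypothetical name, and how the module itself forwards these values is not documented here. The defaults are the ones listed in the table.

```python
def figure_kwargs(figure_width=8, figure_height=8, legend_num_columns=3):
    """Build matplotlib keyword arguments from the plotting parameters."""
    if figure_width <= 0 or figure_height <= 0:
        raise ValueError("figure width and height must be positive")
    if legend_num_columns < 1:
        raise ValueError("the legend needs at least one column")
    return {
        # matplotlib expects figsize as (width, height)
        "figure": {"figsize": (figure_width, figure_height)},
        "legend": {"ncol": legend_num_columns},
    }


kwargs = figure_kwargs()
print(kwargs)

# With matplotlib installed, the values could be applied like this:
#   import matplotlib.pyplot as plt
#   fig = plt.figure(**kwargs["figure"])
#   ax = fig.add_subplot(111)
#   ... draw the labelled cells ...
#   ax.legend(**kwargs["legend"])
```

Increasing the number of legend columns is mainly useful when there are many cell labels or branches, since it keeps the legend from growing taller than the figure.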