Algorithm and scientific questions: <Huidong.Chen at mgh dot harvard dot edu>
Module wrapping issues: Ted Liefeld <jliefeld at cloud dot ucsd dot edu>
STREAM (Single-cell Trajectories Reconstruction, Exploration And Mapping) is an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data. Within GenePattern, STREAM is implemented as a collection of modules that cover the entire STREAM processing pipeline, so that individual steps can be performed interactively for data exploration.
STREAM.DetectTransitionGenes is used to detect marker genes for each transition.
For each branch Bi and each gene E, we first scale the gene expression values to [0,1] for convenience. We then check whether the candidate gene has a reasonable dynamic range between cells close to the start point and cells close to the end point. To this end, we compute the fold change between the average gene expression of the first 20% and of the last 80% of the cells, ordered by inferred pseudotime. If the log2 fold change is greater than a specified threshold (0.25 by default), we calculate Spearman’s rank correlation between inferred pseudotime and gene expression across all the cells along Bi. Genes with a Spearman’s correlation coefficient above a specified threshold (0.4 by default) are identified and reported as transition genes.
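The two filters described above can be sketched in plain Python for a single gene on a single branch. This is a minimal illustration, not the module's actual code: the function names, the `pseudocount` added before taking the log ratio, and the use of absolute values (so that decreasing genes are also caught) are assumptions of this sketch.

```python
import math


def rank(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation, computed as Pearson correlation of ranks."""
    rx, ry = rank(x), rank(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def is_transition_gene(pseudotime, expression,
                       cutoff_logfc=0.25, cutoff_spearman=0.4,
                       pseudocount=0.01):
    """Apply the fold-change and correlation filters to one gene on one branch.

    Returns (passes, log2_fold_change, spearman_rho); rho is None when the
    gene is rejected before the correlation step.
    pseudocount is an assumption of this sketch; it keeps the log ratio finite.
    """
    n = len(expression)
    if n < 2:
        return False, 0.0, None

    # Order the cells along the branch by inferred pseudotime.
    order = sorted(range(n), key=lambda i: pseudotime[i])
    t = [pseudotime[i] for i in order]
    e = [expression[i] for i in order]

    # Scale expression to [0, 1]; a constant gene has no dynamic range.
    lo, hi = min(e), max(e)
    if hi == lo:
        return False, 0.0, None
    e = [(v - lo) / (hi - lo) for v in e]

    # Compare the mean of the first 20% of cells with the mean of the rest.
    n_first = max(1, int(0.2 * n))
    mean_first = sum(e[:n_first]) / n_first
    mean_last = sum(e[n_first:]) / (n - n_first)
    logfc = math.log2((mean_last + pseudocount) / (mean_first + pseudocount))
    if abs(logfc) <= cutoff_logfc:
        return False, logfc, None

    # Only genes with enough dynamic range reach the correlation test.
    rho = spearman(t, e)
    return abs(rho) > cutoff_spearman, logfc, rho
```

Under these assumptions, a gene whose expression rises steadily along the branch passes both filters, while a constant gene, or one that oscillates with no net change between the start and the rest of the branch, is rejected at the fold-change step.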
H Chen, L Albergante, JY Hsu, CA Lareau, GL Bosco, J Guan, S Zhou, AN Gorban, DE Bauer, MJ Aryee, DM Langenau, A Zinovyev, JD Buenrostro, GC Yuan, L Pinello. Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM. Nature Communications, volume 10, Article number: 1903 (2019).
Nestorowa, S. et al. A single-cell resolution map of mouse hematopoietic stem and progenitor cell differentiation. Blood 128, e20-31 (2016).
Pinello Lab STREAM GitHub Repository
Example data for the STREAM workflow can be downloaded from Dropbox: Stream Example Data
Ref: Nestorowa, S. et al. A single-cell resolution map of mouse hematopoietic stem and progenitor cell differentiation. Blood 128, e20-31 (2016).
Example data for this specific step can be found at stream_epg_result.pkl
GenePattern 3.9.11 or later (dockerized).
Inputs and Outputs | |
Name | Description |
---|---|
data file* | A STREAM pkl file containing an annotated AnnData matrix of gene expression data. |
output filename* | The output filename prefix. |
Transition Gene Detection: Parameters used if variable genes are to be selected as the feature. | |
Name | Description |
root | The starting node. |
preference | The node preference. Branches containing the specified nodes are preferred and placed toward the top of the subway plot; the higher a node is ranked, the closer to the top its branch is placed. e.g. S3,S4. |
percentile expr* | Between 0 and 100. Specifies the percentile of gene expression values greater than 0 that is used to filter out extreme expression values. |
use precomputed* | If True, the previously computed scaled gene expression will be used. |
cutoff zscore* | The z-score cutoff used for mean values of all leaf branches. |
cutoff pvalue | The p value cutoff used for Kruskal-Wallis H-test and post-hoc pairwise Conover's test. |
Plotting: Parameters controlling the output figures. | |
Name | Description |
num genes | The number of genes to plot. |
figure height | Figure height as used in matplotlib plots. Default=8. |
figure width | Figure width as used in matplotlib plots. Default=8. |
* - required
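As an illustration of how a percentile such as percentile expr can tame extreme values, the sketch below caps each gene's expression at the chosen percentile of its values greater than 0 and then rescales to [0,1]. It is a hedged reading of the parameter description, not the module's code: the function names, the linear-interpolation percentile, the placeholder default of 95, and the assumption that expression values are non-negative all belong to this sketch.

```python
def percentile(sorted_values, q):
    """q-th percentile (0 to 100) of an ascending list, linearly interpolated."""
    if not sorted_values:
        raise ValueError("percentile of an empty list is undefined")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def scale_expression(values, percentile_expr=95):
    """Cap one gene's values at a percentile of its positive values, then scale.

    Assumes non-negative expression values. percentile_expr=95 is a
    placeholder default chosen for this sketch.
    """
    positive = sorted(v for v in values if v > 0)
    if not positive:
        # A gene that is never expressed stays at zero everywhere.
        return [0.0 for _ in values]
    cap = percentile(positive, percentile_expr)
    # Values above the cap are clipped to it, so they all map to 1.0.
    return [min(v, cap) / cap for v in values]
```

With a single outlier cell, a percentile below 100 keeps that cell from compressing every other value toward 0; a percentile of 100 reduces to plain division by the maximum.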