nf-core/sopa
Nextflow version of Sopa - spatial omics pipeline and analysis
Define where the pipeline should find input data and save output data.
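The sample sheet and email options in this section are each constrained by a regular expression. As a minimal sketch, the check below applies those two patterns in Python before launching a run. The function names and example values are illustrative assumptions and are not part of the pipeline.

```python
import re

# Patterns copied verbatim from the parameter descriptions in this section.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in .csv."""
    return SAMPLESHEET_PATTERN.match(path) is not None


def is_valid_email(address: str) -> bool:
    """Return True if the address matches the schema's email pattern."""
    return EMAIL_PATTERN.match(address) is not None
```

A path such as "my samples.csv" is rejected because the pattern forbids whitespace anywhere in the path.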
Path to comma-separated file containing information about the samples in the experiment.
  Type: string. Pattern: ^\S+\.csv$
The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
  Type: string
Email address for completion summary.
  Type: string. Pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Options related to Space Ranger execution and raw spatial data processing
Location of Space Ranger probeset file.
  Type: string. Pattern: ^\S+\.csv$
Location of Space Ranger reference directory. May be packed as a tar.gz file.
  Type: string. Default: https://cf.10xgenomics.com/supp/spatial-exp/refdata-gex-GRCh38-2020-A.tar.gz

Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
  Type: string. Default: master
Base directory for Institutional configs.
  Type: string. Default: https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
  Type: string
Institutional config description.
  Type: string
Institutional config contact information.
  Type: string
Institutional config URL link.
  Type: string

Less common options for the pipeline, typically set in a config file.
Display version and exit.
  Type: boolean
Method used to save pipeline results to output directory.
  Type: string
Email address for completion summary, only when pipeline fails.
  Type: string. Pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
  Type: boolean
Do not use coloured log outputs.
  Type: boolean
Incoming hook URL for messaging service.
  Type: string
Whether to validate parameters against the schema at runtime.
  Type: boolean. Default: true
Base URL or local path to location of pipeline test dataset files.
  Type: string. Default: https://raw.githubusercontent.com/nf-core/test-datasets/
Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.
  Type: string
Display the help message.
  Type: boolean or string
Display the full detailed help message.
  Type: boolean
Display hidden parameters in the help message (only works when --help or --help_full is provided).
  Type: boolean

Parameters related to the SpatialData reader
Technology used for the spatial data, e.g., 'xenium', 'merscope', ...
  Type: string
Optional page for the imageio reader.
  Type: number

Creation of the image and transcript patches before segmentation
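The patch options in this section pair a patch width with an overlap, and the advice is to choose an overlap of roughly twice the cell diameter. The sketch below shows how those two numbers determine how many patches cover one image axis. It assumes a regular tiling in which consecutive patches are offset by the patch width minus the overlap. That is an illustrative assumption, not necessarily how Sopa tiles an image, and the function names are hypothetical.

```python
import math


def suggested_overlap(cell_diameter: float) -> float:
    """Overlap of roughly twice the cell diameter, as advised in this section."""
    return 2 * cell_diameter


def patches_per_axis(image_width: float, patch_width: float, overlap: float) -> int:
    """Number of patches needed to cover one axis, assuming a regular tiling.

    All three arguments must use the same unit (pixels or microns).
    """
    if overlap >= patch_width:
        raise ValueError("overlap must be smaller than the patch width")
    if image_width <= patch_width:
        return 1
    step = patch_width - overlap  # distance between consecutive patch origins
    # n patches cover n * step + overlap, so solve for the smallest such n.
    return math.ceil((image_width - overlap) / step)
```

Under this assumption, a larger overlap increases the number of patches for a fixed patch width.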
Width (and height) of each patch in pixels.
  Type: number
Number of overlapping pixels between the patches. We advise choosing approximately twice the diameter of a cell.
  Type: number
Width (and height) of each patch in microns.
  Type: number
Number of overlapping microns between the patches. We advise choosing approximately twice the diameter of a cell.
  Type: number
Optional name of the boundaries element to use as a segmentation prior. Either a column name for the transcript dataframe, or a key of sdata containing the shapes names. If combining cellpose with proseg/baysor/comseg, it will be set automatically to 'cellpose_boundaries'.
  Type: string
If prior_shapes_key is provided, this is the value given to transcripts that are not inside any cell (if it's already 0, don't provide this argument).
  Type: string

Parameters related to the tissue segmentation
Whether to run tissue segmentation.
  Type: boolean
Level of the image pyramid to use for tissue segmentation.
  Type: number
Mode for the tissue segmentation: 'staining' or 'saturation' (for H&E images).
  Type: string
Additional tissue segmentation parameters as a python dict string.
  Type: string

Filtering low-quality cells during and after segmentation
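One of the filters in this section removes cells whose mean channel intensity is less than min_intensity_ratio * quantile_90. The sketch below illustrates that rule in plain Python. It assumes that quantile_90 is the 90th percentile of the per-cell mean intensities across all cells and that it is computed with linear interpolation. Both are assumptions made for illustration, and the helper names are hypothetical.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def intensity_mask(mean_intensities, min_intensity_ratio):
    """Return True for cells to keep and False for cells to filter out."""
    # Assumption: quantile_90 is taken over the mean intensity of every cell.
    threshold = min_intensity_ratio * quantile(mean_intensities, 0.9)
    return [value >= threshold for value in mean_intensities]
```

Because the threshold is relative to a high quantile, it scales with the overall brightness of the image.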
Cells with an area less than this value will be filtered. The unit is pixels^2, and it is used by Stardist/Cellpose.
  Type: number
Cells with an area less than this value will be filtered. The unit is microns^2, and it is used by Baysor/Comseg. Not used by Proseg.
  Type: number
Cells with fewer transcripts than this value will be filtered.
  Type: number
Cells whose mean channel intensity is less than min_intensity_ratio * quantile_90 will be filtered.
  Type: number

Aggregation of genes and channels inside each cell
Whether to aggregate the genes (counts) inside each cell.
  Type: boolean
Whether to aggregate the channels (intensity) inside each cell.
  Type: boolean
Cell polygons will be expanded by expand_radius_ratio * mean_radius for channel averaging only. This helps to better aggregate boundary stainings.
  Type: string

Optional scanpy table preprocessing (log1p, UMAP, leiden clustering) after aggregation/annotation.
Whether to run scanpy preprocessing.
  Type: boolean
Resolution parameter for the leiden clustering.
  Type: number
Whether to check that adata.X contains counts.
  Type: boolean
Whether to compute highly variable genes before computing the UMAP and clustering.
  Type: boolean

Parameters related to the Xenium Explorer visualization tool
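The pixel size in this section (default 0.2125 microns per pixel) sets the scale between pixel and micron coordinates. As a small illustration of why an invalid value leads to inconsistent scales, the sketch below converts lengths and areas between the two units. The function names are hypothetical, and applying the same factor to the pixel-based and micron-based area thresholds is an assumption made for illustration.

```python
PIXEL_SIZE_UM = 0.2125  # default number of microns in a pixel, from this section


def pixels_to_microns(length_px: float, pixel_size: float = PIXEL_SIZE_UM) -> float:
    """Convert a length in pixels to microns."""
    return length_px * pixel_size


def microns_to_pixels(length_um: float, pixel_size: float = PIXEL_SIZE_UM) -> float:
    """Convert a length in microns to pixels."""
    return length_um / pixel_size


def area_px2_to_um2(area_px2: float, pixel_size: float = PIXEL_SIZE_UM) -> float:
    """Convert an area in pixels^2 to microns^2 (the factor is squared)."""
    return area_px2 * pixel_size**2
```

Areas scale with the square of the pixel size, so a small error in the pixel size has a larger effect on area thresholds than on lengths.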
Number of microns in a pixel. An invalid value can lead to inconsistent scales in the Explorer.
  Type: number. Default: 0.2125
If True, will not load the full images in memory (except if the image memory is below ram_threshold_gb).
  Type: boolean. Default: true
Threshold (in gigabytes) below which the image can be loaded in memory.
  Type: number. Default: 4

Image preprocessing applied before channel-based segmentation methods (cellpose, stardist)
Parameter for scipy gaussian_filter (applied before running the segmentation method).
  Type: number
Parameter for skimage.exposure.equalize_adapthist (applied before running the segmentation method).
  Type: number
Parameter for skimage.exposure.equalize_adapthist (applied before running the segmentation method).
  Type: number

Parameters related to the proseg segmentation method
Whether to run proseg segmentation.
  Type: boolean
String suffix to add to the proseg command line. This can be used to add extra parameters to the proseg command line.
  Type: string
Whether to infer the proseg presets based on the columns of the transcripts dataframe.
  Type: boolean
Only for Visium HD data. Key of sdata containing the prior cell boundaries. If 'auto', use the latest performed segmentation (e.g., stardist or the 10X Genomics segmentation). If combining stardist with proseg, it will be set automatically to 'stardist_boundaries'.
  Type: string

Parameters related to the comseg segmentation method
Whether to run comseg segmentation.
  Type: boolean
Comseg mean_cell_diameter parameter.
  Type: number. Default: 10
Comseg max_cell_radius parameter.
  Type: number. Default: 15
Comseg alpha parameter.
  Type: number. Default: 0.5
Comseg min_rna_per_cell parameter.
  Type: number. Default: 1
Comseg allow_disconnected_polygon parameter.
  Type: boolean
Comseg norm_vector parameter.
  Type: boolean

Parameters related to the baysor segmentation method
Whether to run baysor segmentation.
  Type: boolean
Baysor scale parameter.
  Type: number. Default: -1
Baysor scale_std parameter.
  Type: string. Default: 25%
Baysor prior_segmentation_confidence parameter.
  Type: number. Default: 0.2
Baysor min_molecules_per_cell parameter.
  Type: number. Default: 20
Baysor min_molecules_per_gene parameter.
  Type: number. Default: 10
Baysor min_molecules_per_segment parameter.
  Type: number
Baysor confidence_nn_id parameter.
  Type: number
Baysor force_2d parameter.
  Type: boolean. Default: true

Parameters related to the cellpose segmentation method
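Two options in this section take structured values as plain strings: the channel names (separated by space, comma or pipe characters) and the additional parameters (a python dict string). The sketch below shows one way such strings could be parsed. It is an assumption about the parsing rules rather than the pipeline's own code, and the function names are hypothetical.

```python
import ast
import re


def parse_channels(value: str) -> list[str]:
    """Split channel names separated by spaces, commas or pipe characters."""
    return [name for name in re.split(r"[\s,|]+", value.strip()) if name]


def parse_dict_string(value: str) -> dict:
    """Parse a python dict string without executing arbitrary code."""
    parsed = ast.literal_eval(value)
    if not isinstance(parsed, dict):
        raise ValueError("expected a string representation of a python dict")
    return parsed
```

With these separators, a channel name that itself contains a space would be split into two names, so such names would need to be avoided or renamed.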
Whether to run cellpose segmentation.
  Type: boolean
Cellpose diameter parameter.
  Type: number
Channel name(s) to use for cellpose segmentation. If multiple, separate by space, comma or pipe characters.
  Type: string
Cellpose flow_threshold parameter.
  Type: number
Cellpose cellprob_threshold parameter.
  Type: number
Cellpose model type to use.
  Type: string
Cellpose pretrained_model parameter.
  Type: string
Whether to use GPU for Cellpose segmentation.
  Type: boolean
Additional cellpose parameters as a python dict string.
  Type: string

Parameters related to the stardist segmentation method
Whether to run stardist segmentation.
  Type: boolean
Name of stardist model to use.
  Type: string
Stardist prob_thresh parameter.
  Type: number
Stardist nms_thresh parameter.
  Type: number
Optional channel name(s) to use for stardist segmentation. If multiple, separate by space, comma or pipe characters.
  Type: string
Additional stardist parameters as a python dict string.
  Type: string

Parameters related to the tangram cell-type annotation method
Whether to run tangram cell-type annotation.
  Type: boolean
Path to the scRNAseq annotated reference, as a .h5ad file.
  Type: string
Key of adata_ref.obs containing the cell-types.
  Type: string
Preprocessing method applied to the reference: either None (raw counts), normalized (sc.pp.normalize_total), or log1p (sc.pp.normalize_total and sc.pp.log1p).
  Type: string
Number of cells in each bag of the spatial table. Low values will decrease the memory usage.
  Type: number
Maximum samples to be considered in the reference for tangram. Low values will decrease the memory usage.
  Type: number

Parameters related to the fluorescence-based cell-type annotation method
Whether to run cell-type annotation based on a marker-to-cell dictionary.
  Type: boolean
Key of sdata.obs containing the cell-types.
  Type: string
Dictionary whose keys are marker channel names and whose values are the cell types associated with each marker. Should be provided as a string representation of a python dictionary.
  Type: string
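The marker-to-cell dictionary above has to be passed as a string representation of a python dictionary. As a minimal sketch, the snippet below builds such a string from a real dict, quotes it for use on a shell command line, and checks that it parses back to the same mapping. The marker and cell-type names are hypothetical examples, and the helper name is an assumption.

```python
import ast
import shlex

# Hypothetical marker channels mapped to the cell type each one indicates.
marker_to_cell = {"CK": "Tumoral cell", "CD3": "T cell", "CD20": "B cell"}


def to_dict_string(mapping: dict) -> str:
    """Return a string representation of a python dictionary."""
    return repr(dict(mapping))


dict_string = to_dict_string(marker_to_cell)

# Quote the string so that a shell passes it through as a single argument.
shell_argument = shlex.quote(dict_string)

# Round-trip check: the string parses back to the original mapping.
assert ast.literal_eval(dict_string) == marker_to_cell
```

Generating the string from a dict avoids hand-written quoting mistakes, which are easy to make when the value contains both quotes and spaces.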