sopa: Parameters

Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment.

required

type: string

pattern: ^\S+\.csv$
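
As an illustration of what this pattern accepts, the sketch below checks a few candidate paths with Python's re module. The file names are made up, the helper name is hypothetical, and the pipeline's own schema validation may treat edge cases differently.

```python
import re

# Pattern copied from the parameter description above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in '.csv'."""
    return SAMPLESHEET_PATTERN.fullmatch(path) is not None


# Hypothetical example paths, for illustration only.
for candidate in ["samplesheet.csv", "data/run 1/samples.csv", "samplesheet.tsv"]:
    print(candidate, is_valid_samplesheet_path(candidate))
```

Note that the pattern rejects any path containing whitespace, so directories with spaces in their names will not validate.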

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
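
The email pattern is stricter than many real-world address rules. The sketch below, which assumes Python's re module behaves like the pipeline's validator for this pattern, shows which kinds of addresses pass. The addresses and the helper name are hypothetical.

```python
import re

# Pattern copied from the parameter description above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the address satisfies the schema pattern."""
    return EMAIL_PATTERN.fullmatch(address) is not None


# Hypothetical addresses, for illustration only.
examples = [
    "user@example.com",            # plain address
    "first.last@lab.example.org",  # dots and subdomains are allowed
    "user+tag@example.com",        # '+' is not in the allowed character set
    "user@example",                # no top-level domain
]
for address in examples:
    print(address, is_valid_email(address))
```

Addresses with a plus sign, or with a final domain label longer than five letters, are rejected by this pattern.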

Options related to Space Ranger execution and raw spatial data processing

Location of Space Ranger probeset file.

type: string

pattern: ^\S+\.csv$

Location of Space Ranger reference directory. May be packed as tar.gz file.

type: string

default: https://cf.10xgenomics.com/supp/spatial-exp/refdata-gex-GRCh38-2020-A.tar.gz

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service

hidden

type: string

Whether to validate parameters against the schema at runtime

hidden

type: boolean

default: true

Base URL or local path to location of pipeline test dataset files

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden

type: string
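
The date pattern above is written in Java-style notation. As a rough guide to what the default suffix looks like, the sketch below produces the same layout with Python's strftime. The function name is hypothetical, and the pipeline itself does not run this code.

```python
from datetime import datetime


def default_trace_suffix(now: datetime) -> str:
    """Format a timestamp as yyyy-MM-dd_HH-mm-ss."""
    return now.strftime("%Y-%m-%d_%H-%M-%S")


# A fixed timestamp keeps the example reproducible.
print(default_trace_suffix(datetime(2024, 1, 2, 3, 4, 5)))  # 2024-01-02_03-04-05
```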

Display the help message.

type: boolean,string

Display the full detailed help message.

type: boolean

Display hidden parameters in the help message (only works when --help or --help_full are provided).

type: boolean

Parameters related to the SpatialData reader

Technology used for the spatial data, e.g., ‘xenium’, ‘merscope’, …

required

type: string

Optional page for the imageio reader

type: number

Creation of the image and transcript patches before segmentation

Width (and height) of each patch in pixels

type: number

Number of overlapping pixels between the patches. We advise choosing approximately twice the diameter of a cell

type: number

Width (and height) of each patch in microns

type: number

Number of overlapping microns between the patches. We advise choosing approximately twice the diameter of a cell

type: number
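
The overlap advice can be turned into numbers with a small calculation. The sketch below assumes a typical cell diameter of 10 microns and a pixel size of 0.2125 microns (the default pixel size listed for the Explorer parameters in this document). Both values and the function names are illustrative assumptions, not recommendations for a specific dataset.

```python
def suggested_overlap_microns(cell_diameter_microns: float) -> float:
    """Overlap of roughly twice the cell diameter, as advised above."""
    return 2.0 * cell_diameter_microns


def microns_to_pixels(length_microns: float, pixel_size_microns: float) -> int:
    """Convert a length in microns to the nearest whole number of pixels."""
    return round(length_microns / pixel_size_microns)


# Assumed values, for illustration only.
cell_diameter = 10.0   # microns
pixel_size = 0.2125    # microns per pixel

overlap_um = suggested_overlap_microns(cell_diameter)
overlap_px = microns_to_pixels(overlap_um, pixel_size)
print(overlap_um, overlap_px)  # 20.0 94
```

The micron value applies to the transcript-based patches and the pixel value to the image-based patches.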

Optional name of the boundaries element to use as a segmentation prior. Either a column name for the transcript dataframe, or a key of sdata containing the shapes names. If combining cellpose with proseg/baysor/comseg, it will be set automatically to 'cellpose_boundaries'.

type: string

If prior_shapes_key is provided, this is the value given to transcripts that are not inside any cell (if it’s already 0, don’t provide this argument)

type: string

Parameters related to the tissue segmentation

Whether to run tissue segmentation

type: boolean

Level of the image pyramid to use for tissue segmentation

type: number

Mode for the tissue segmentation: ‘staining’ or ‘saturation’ (for H&E images).

type: string

Additional tissue segmentation parameters as a python dict string

type: string

Filtering low-quality cells during and after segmentation

Cells with an area less than this value will be filtered. The unit is pixels^2; this filter is used by Stardist/Cellpose.

type: number

Cells with an area less than this value will be filtered. The unit is microns^2; this filter is used by Baysor/Comseg, but not by Proseg.

type: number

Cells with fewer transcripts than this value will be filtered.

type: number

Cells whose mean channel intensity is less than min_intensity_ratio * quantile_90 will be filtered.

type: number
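
The intensity filter compares each cell against a fraction of the 90th percentile. The sketch below is one possible reading of that rule, assuming quantile_90 is the 90th percentile of the per-cell mean intensities and using linear interpolation between ranks. The function names, the quantile convention, and the data are assumptions. The pipeline's actual computation may differ.

```python
def quantile(values: list[float], q: float) -> float:
    """Quantile with linear interpolation between neighbouring ranks."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] * (1 - fraction) + ordered[upper] * fraction


def keep_mask(mean_intensities: list[float], min_intensity_ratio: float) -> list[bool]:
    """True for cells that are kept, False for cells that would be filtered."""
    threshold = min_intensity_ratio * quantile(mean_intensities, 0.9)
    return [value >= threshold for value in mean_intensities]


# Made-up per-cell mean intensities: 0.0, 1.0, ..., 10.0
intensities = [float(i) for i in range(11)]
mask = keep_mask(intensities, min_intensity_ratio=0.5)
print(sum(mask), "cells kept out of", len(mask))
```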

Aggregation of genes and channels inside each cell

Whether to aggregate the genes (counts) inside each cell

type: boolean

Whether to aggregate the channels (intensity) inside each cell

type: boolean

Cell polygons will be expanded by expand_radius_ratio * mean_radius for channel averaging only. This helps to better aggregate boundary stainings

type: string

Optional scanpy table preprocessing (log1p, UMAP, leiden clustering) after aggregation/annotation.

Whether to run scanpy preprocessing

type: boolean

Resolution parameter for the leiden clustering

type: number

Whether to check that adata.X contains counts

type: boolean

Whether to compute highly variable genes before computing the UMAP and clustering

type: boolean

Parameters related to the Xenium Explorer visualization tool

Number of microns in a pixel. An invalid value can lead to inconsistent scales in the Explorer.

type: number

default: 0.2125

If true, the full images are not loaded into memory (unless the image memory is below ram_threshold_gb)

type: boolean

default: true

Threshold (in gigabytes) below which an image can be loaded in memory.

type: number

default: 4
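
The interaction between the lazy-loading flag and ram_threshold_gb described above can be sketched as a small decision function. The function name and the `lazy` argument name are hypothetical. Only ram_threshold_gb appears in the descriptions above, and the real implementation may handle the boundary case differently.

```python
def should_load_in_memory(image_size_gb: float, lazy: bool, ram_threshold_gb: float) -> bool:
    """Decide whether a full image is loaded into memory.

    Without lazy loading, the image is always loaded. With lazy loading,
    it is loaded only when it is smaller than the threshold.
    """
    if not lazy:
        return True
    return image_size_gb < ram_threshold_gb


# Defaults from the descriptions above: lazy loading on, threshold of 4 GB.
print(should_load_in_memory(2.0, lazy=True, ram_threshold_gb=4))   # True
print(should_load_in_memory(8.0, lazy=True, ram_threshold_gb=4))   # False
print(should_load_in_memory(8.0, lazy=False, ram_threshold_gb=4))  # True
```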

Image preprocessing applied before channel-based segmentation methods (cellpose, stardist)

Parameter for scipy gaussian_filter (applied before running the segmentation method)

type: number

Parameter for skimage.exposure.equalize_adapthist (applied before running the segmentation method)

type: number

Parameter for skimage.exposure.equalize_adapthist (applied before running the segmentation method)

type: number

Parameters related to the proseg segmentation method

Whether to run proseg segmentation

type: boolean

String suffix to add to the proseg command line. This can be used to add extra parameters to the proseg command line.

type: string

Whether to infer the proseg presets based on the columns of the transcripts dataframe.

type: boolean

Only for Visium HD data. Key of sdata containing the prior cell boundaries. If 'auto', use the latest performed segmentation (e.g., stardist or the 10X Genomics segmentation). If combining stardist with proseg, it will be set automatically to 'stardist_boundaries'.

type: string

Parameters related to the comseg segmentation method

Whether to run comseg segmentation

type: boolean

Comseg mean_cell_diameter parameter

type: number

default: 10

Comseg max_cell_radius parameter

type: number

default: 15

Comseg alpha parameter

type: number

default: 0.5

Comseg min_rna_per_cell parameter

type: number

default: 1

Comseg allow_disconnected_polygon parameter

type: boolean

Comseg norm_vector parameter

type: boolean

Parameters related to the baysor segmentation method

Whether to run baysor segmentation

type: boolean

Baysor scale parameter

type: number

default: -1

Baysor scale_std parameter

type: string

default: 25%

Baysor prior_segmentation_confidence parameter

type: number

default: 0.2

Baysor min_molecules_per_cell parameter

type: number

default: 20

Baysor min_molecules_per_gene parameter

type: number

default: 10

Baysor min_molecules_per_segment parameter

type: number

Baysor confidence_nn_id parameter

type: number

Baysor force_2d parameter

type: boolean

default: true

Parameters related to the cellpose segmentation method

Whether to run cellpose segmentation

type: boolean

Cellpose diameter parameter

type: number

Channel name(s) to use for cellpose segmentation. If multiple, separate by space, comma or pipe characters.

type: string
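
To make the accepted separators concrete, the sketch below splits a channel string on spaces, commas, or pipe characters. It is an illustration only: the function name and channel names are hypothetical, and the pipeline's own parsing may differ (for example, for channel names that themselves contain spaces).

```python
import re


def parse_channels(value: str) -> list[str]:
    """Split a channel string on runs of spaces, commas, or pipes."""
    parts = re.split(r"[ ,|]+", value.strip())
    return [name for name in parts if name]


# Hypothetical channel names, for illustration only.
print(parse_channels("DAPI"))                  # ['DAPI']
print(parse_channels("DAPI, CD3|CD20 PanCK"))  # ['DAPI', 'CD3', 'CD20', 'PanCK']
```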

Cellpose flow_threshold parameter

type: number

Cellpose cellprob_threshold parameter

type: number

Cellpose model type to use

type: string

Cellpose pretrained_model parameter

type: string

Whether to use GPU for Cellpose segmentation

type: boolean

Additional cellpose parameters as a python dict string

type: string

Parameters related to the stardist segmentation method

Whether to run stardist segmentation

type: boolean

Name of stardist model to use

type: string

Stardist prob_thresh parameter

type: number

Stardist nms_thresh parameter

type: number

Optional channel name(s) to use for stardist segmentation. If multiple, separate by space, comma or pipe characters.

type: string

Additional stardist parameters as a python dict string

type: string

Parameters related to the tangram cell-type annotation method

Whether to run tangram cell-type annotation

type: boolean

Path to the scRNAseq annotated reference, as a .h5ad file

type: string

Key of adata_ref.obs containing the cell-types

type: string

Preprocessing method applied to the reference. One of: None (raw counts), normalized (sc.pp.normalize_total), or log1p (sc.pp.normalize_total followed by sc.pp.log1p)

type: string

Number of cells in each bag of the spatial table. Lower values decrease memory usage

type: number

Maximum number of samples to be considered in the reference for tangram. Lower values decrease memory usage

type: number

Parameters related to the fluorescence-based cell-type annotation method

Whether to run cell-type annotation based on a marker-to-cell dictionary

type: boolean

Key of sdata.obs containing the cell-types

type: string

Dictionary whose keys are marker channel names and whose values are the cell types associated with each marker. Provide it as the string representation of a Python dictionary.

type: string
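
Several parameters in this document take a Python dictionary written as a string. The sketch below shows one safe way to check such a string before passing it to the pipeline, using ast.literal_eval from the standard library. The helper name, marker names, and cell types are hypothetical, and this is not necessarily how the pipeline parses the value.

```python
import ast


def parse_dict_string(value: str) -> dict:
    """Parse a string holding a Python dict literal, without executing code."""
    parsed = ast.literal_eval(value)
    if not isinstance(parsed, dict):
        raise ValueError(f"expected a dict literal, got {type(parsed).__name__}")
    return parsed


# Hypothetical marker-to-cell-type mapping, for illustration only.
marker_cell_dict = parse_dict_string('{"CD3": "T cell", "CD20": "B cell"}')
for marker, cell_type in marker_cell_dict.items():
    print(marker, "->", cell_type)
```

On a command line, such a string usually needs to be quoted so that the shell passes it through as a single argument.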

nf-core/sopa