2.2 Example Run

In this example, we will run a fully automated single-cell analysis on six 10X samples using scFlow. The input sparse matrices are the standard feature-barcode files generated by CellRanger. Together they form a small, artificial dataset generated from mouse brain data.

2.2.1 Getting Started

2.2.1.1 Install Nextflow

To get started, first install the latest version of Nextflow (>20.10) according to the installation instructions.
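As a quick sanity check after installation, the installed version can be compared against the minimum stated above. The following is a minimal sketch, not part of scFlow: the helper name version_gt is hypothetical, and it assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Succeeds (exit status 0) when version $1 is strictly newer than version $2.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_gt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Extract the first x.y.z number printed by `nextflow -version`
# and compare it against the 20.10 minimum.
if command -v nextflow >/dev/null 2>&1; then
  installed=$(nextflow -version | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
  if version_gt "$installed" "20.10"; then
    echo "Nextflow $installed is recent enough"
  else
    echo "Nextflow $installed is too old; please upgrade" >&2
  fi
else
  echo "nextflow not found on PATH" >&2
fi
```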

2.2.1.2 Download the Example Dataset

Next, create a folder for the analysis called scFlowExample. Inside this folder, create a conf sub-folder for storing configuration files and a refs sub-folder for reference files.
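On a Unix-like system, the folder and both sub-folders can be created in one step. This sketch assumes the analysis folder should live in the current working directory.

```shell
#!/bin/sh
# Create the analysis folder with its conf and refs sub-folders.
# -p creates parent folders as needed and does not fail if they already exist.
mkdir -p scFlowExample/conf scFlowExample/refs
```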

The sparse matrices for the example dataset are available in the scFlowExample folder in this Google Bucket: <https://console.cloud.google.com/storage/browser/scflowexamplegcp>. Save these files to your preferred location; we tend to keep these sparse matrices in a separate storage location for sequencing data.

Next, download the Manifest.txt and SampleSheet.tsv files and save them to the refs folder. Finally, download the scflow_analysis.config and reddim_genes.yml files and save them to the conf folder.

Your analysis folder scFlowExample should now have this structure:

.
├── conf
│   ├── reddim_genes.yml
│   └── scflow_analysis.config
└── refs
    ├── Manifest.txt
    └── SampleSheet.tsv
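The layout above can be verified before going further. The following is a minimal sketch; the function name check_layout is hypothetical, and the file list is copied from the tree shown above.

```shell
#!/bin/sh
# Files expected in the analysis folder, relative to its root.
expected_files="conf/reddim_genes.yml
conf/scflow_analysis.config
refs/Manifest.txt
refs/SampleSheet.tsv"

# Report every expected file missing under the folder given as $1.
# Returns non-zero if at least one file is missing.
check_layout() {
  root=$1
  missing=0
  for f in $expected_files; do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $root/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: run from the folder that contains scFlowExample.
check_layout scFlowExample || echo "scFlowExample is incomplete" >&2
```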

2.2.1.3 Edit the Manifest File

The Manifest.txt file should list the absolute path of each folder containing a sparse matrix. Edit these paths to reflect the locations of the files downloaded in the previous step, e.g.

key filepath
tipif /foo/bar/scflowexamplegcp/scFlowExample/individual_1
jarul /foo/bar/scflowexamplegcp/scFlowExample/individual_2
zoham /foo/bar/scflowexamplegcp/scFlowExample/individual_3
sibod /foo/bar/scflowexamplegcp/scFlowExample/individual_4
limuz /foo/bar/scflowexamplegcp/scFlowExample/individual_5
horud /foo/bar/scflowexamplegcp/scFlowExample/individual_6
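A mistyped path in the manifest is easy to miss, so it can help to confirm that every listed folder exists before launching the pipeline. The sketch below is not part of scFlow: the function name check_manifest is hypothetical, and it assumes a header row followed by rows of a key and a filepath separated by whitespace (spaces or tabs), as in the example above.

```shell
#!/bin/sh
# Check that every filepath listed in the manifest given as $1 is an existing folder.
# Assumption: one header row, then "key<whitespace>filepath" rows.
check_manifest() {
  status=0
  {
    read -r _header || true                       # skip the header row
    # The `|| [ -n "$key" ]` keeps a final row that lacks a trailing newline.
    while read -r key path || [ -n "$key" ]; do
      [ -z "$key" ] && continue                   # ignore blank lines
      if [ ! -d "$path" ]; then
        echo "key $key: folder not found: $path" >&2
        status=1
      fi
    done
  } < "$1"
  return "$status"
}

# Example: run from inside the scFlowExample folder.
if [ -f refs/Manifest.txt ]; then
  check_manifest refs/Manifest.txt || echo "fix the paths listed above" >&2
fi
```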

2.2.1.4 Download Additional Resources

The automated cell-type annotation reference files can be downloaded from the Google Bucket scFlowResources/refs/ctd. An ensembl_mappings.tsv file can be downloaded from the Google Bucket in the scFlowResources/src/ensembl-ids folder. These resources are common for different analyses and can be saved in a generic location outside of the analysis folder.

To override the default parameter values with these locations, we will use a custom configuration file. Create a file with a .config extension (e.g. my_scflow.config) and add the locations, e.g.

params {
  ensembl_mappings = "/foo/bar/scFlowResources/src/ensembl-ids/ensembl_mappings.tsv"
  ctd_folder = "/foo/bar/scFlowResources/refs/ctd"
}
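If the shared resources may move between machines, the configuration file can be generated from a single root path rather than edited by hand. This is a minimal sketch: the variables RESOURCES_DIR and CONFIG_FILE are hypothetical names introduced here, and the default path is the placeholder used in the example above.

```shell
#!/bin/sh
# Write a custom Nextflow config pointing at a shared resources folder.
# RESOURCES_DIR and CONFIG_FILE are placeholders; adjust both as needed.
RESOURCES_DIR=${RESOURCES_DIR:-/foo/bar/scFlowResources}
CONFIG_FILE=${CONFIG_FILE:-my_scflow.config}

# The unquoted heredoc delimiter lets the shell expand $RESOURCES_DIR.
cat > "$CONFIG_FILE" <<EOF
params {
  ensembl_mappings = "$RESOURCES_DIR/src/ensembl-ids/ensembl_mappings.tsv"
  ctd_folder = "$RESOURCES_DIR/refs/ctd"
}
EOF

# Warn (without failing) if either location does not exist yet.
[ -f "$RESOURCES_DIR/src/ensembl-ids/ensembl_mappings.tsv" ] ||
  echo "warning: ensembl_mappings.tsv not found under $RESOURCES_DIR" >&2
[ -d "$RESOURCES_DIR/refs/ctd" ] ||
  echo "warning: ctd folder not found under $RESOURCES_DIR" >&2
```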

2.2.2 Setting up Nextflow

2.2.2.1 Enable Nextflow Tower

Nextflow Tower is an optional, though highly recommended, add-on for Nextflow. It provides a number of features, including real-time monitoring of workflows, and is useful for troubleshooting. To use it, register for an account at tower.nf/login and obtain a token. Nextflow Tower can then be enabled for your scFlow run by appending the following to the custom configuration file created above:

tower {
  accessToken = 'insertyourtokenhere'
  enabled = true
}

2.2.2.2 Infrastructure Configuration

Further details on infrastructure configuration, including example config files, are available at the nf-core/configs GitHub repository. Finer details on individual parameters are available in the Nextflow documentation.

2.2.3 First Run

The analysis can now be run, with each file of custom parameters and configuration options supplied through its own -c option. For convenience, a shell script can be used; replace the example configuration paths with your own:

#!/bin/sh
nextflow run combiz/nf-core-scflow \
-c ./conf/scflow_analysis.config \
-c ~/combiz_config/imperial_dri.config \
-c ~/combiz_config/my_scflow.config \
-resume

In this example dataset, the “filtered” feature-barcode matrices are used; as such, ambient RNA profiling has already been performed by CellRanger so this step can be skipped in scFlow.