Chapter 4 Channels, tuples, & lists
This section shows how to create, edit, and manipulate channels, tuples, and lists.
For an explanation on these concepts please see chapter 2
4.1 Channels
For more info on channels including channel operators please see the nextflow docs
4.1.1 Creating channels
There are many ways to create various types of channels.
4.1.1.5 Paired files
Useful for paired illumina fastq files.
params.reads = "/path_to_fastq_dir/*_R{1,2}.fastq"
workflow {
Channel
.fromFilePairs(params.reads, checkIfExists: true)
.set {read_pairs_ch}
}
This creates a channel of multiple values based on the number of paired files. Each channel value has contains a tuple of 2 values:
- First is a text value of the file prefix. The file prefix is the text that the
*
reprsents in the params.reads path. - The second is a list with the first value is the R1 file and the second value is the R2 file.
For example if you had the files:
- S1_R1.fastq
- S1_R2.fastq
The tuple would be:
["S1",["S1_R1.fastq","S1_R2.fastq]]
You could then use this channel like so:
params.reads = "/path_to_fastq_dir/*_R{1,2}.fastq"
process RUN {
input:
tuple val(sample_id), path(reads)
script:
"""
command -s ${sample_id} -r1 ${reads[0]} -r2 ${reads[1]}
"""
}
workflow {
Channel
.fromFilePairs(params.reads, checkIfExists: true)
.set {read_pairs_ch}
RUN(read_pairs_ch)
}
Note: Indexing a list starts at 0 (-r1 ${reads[0]}
)
4.1.2 Extracting channels from process
process RUN {
input:
path(R1)
output:
path("R1_trimmed.fastq"), emit: R1_trimmed
path("R1_trimmed.stats.txt"), emit: R1_trimmed_stats
}
workflow {
/// Input reads from params
r1_ch = file(params.r1_fastq)
/// Run process
RUN(r1_ch)
/// Extract channels from run process
r1_trimmed_ch = RUN.out.R1_trimmed
r1_trimmed_stats_ch = RUN.out.R1_trimmed_stats
}
Note: You only need to extract channels form a process if they are going to be used in another process.
4.2 Tuples
4.2.2 Transpose
If you have the below channel of 3 values each containing a tuple of 2....
[[“s1”,”s1.txt”],[“s2”,”s2.txt”],[“s3”,”s3.txt”]]
and want to change it to the below channel of 1 value containing a tuple of 2, each tuple containing a list of 3....
[[“s1”,”s2”,”s3”],[“s1.txt”,”s2.txt”,”s3.txt”]]
You can run the following
This can the be called as input to a process like so:
4.3 List
4.3.1 Channel values to one list
You can collect all the values of a channel into one list.
workflow {
all_R1_fastq_ch = FASTQC.out.htmlR.collect()
all_R2_fastq_ch = FASTQC.out.htmlL.collect()
}
This ends up with lists like [“S1_R1.fastqc.html”, “S2_R1.fastqc.html”]
4.3.2 Parameter in front of each list value
Some programs accept a list of values separated by spaces. However, some may require a parameter flag in front of each path/file. This can be carried out by setting a definition in the process block.
process RUN {
input:
path(all_R1_fastq)
script:
def R1_fastqc_lines = all_R1_fastq.collect {
fastqc -> “-F ${fastqc}”
}.join(‘ ‘)
"""
multiqc -o multiqc $R1_fastqc_lines
"""
}
That then becomes
“-F S1_R1.fastqc.html -F S2_R1.fastqc.html”