Chapter 4 Channels, tuples, & lists

This section shows how to create, edit, and manipulate channels, tuples, and lists.

For an explanation of these concepts, please see Chapter 2.

4.1 Channels

For more information on channels, including channel operators, please see the Nextflow documentation.

4.1.1 Creating channels

There are many ways to create various types of channels.

4.1.1.1 Text

workflow {
  hello_ch = Channel.of("Hello","Bonjour")
}

4.1.1.2 Data file

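workflow {
  // A queue channel that emits the file as a single value
  data_ch = Channel.fromPath(params.input_file)
}

or

workflow {
  // A plain file object rather than a channel; Nextflow treats it as a value channel when it is passed to a process
  data_ch = file(params.input_file)
}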

4.1.1.3 CSV of files

workflow {
  files_ch = Channel.fromPath(params.input_file)
      .splitCsv()
      .flatten()
}
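
If the CSV has a header row and more than one column, a variation like the following keeps each row together instead of flattening it. This is a minimal sketch, assuming hypothetical column names sample_id and fastq that are not part of the example above:

```groovy
workflow {
  samples_ch = Channel.fromPath(params.input_file)
      .splitCsv(header: true)   // each row becomes a map keyed by column name
      .map { row -> tuple(row.sample_id, file(row.fastq)) }   // sample_id and fastq are hypothetical column names
}
```

Each channel value is then a tuple of a sample ID and a file, which matches a process input such as tuple val(sample_id), path(fastq).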

4.1.1.4 FOFN

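A file of file names (FOFN) is a text file that lists one file name per line.

workflow {
  files_ch = Channel.fromPath(params.file)
      .splitText()
      .map { line -> line.trim() }  // splitText() keeps the trailing newline, so strip it
}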
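
The splitText() operator emits each line with its trailing newline still attached, so FOFN entries usually need trimming before they are used as paths. The plain Groovy sketch below shows the same clean-up outside of a channel. The file content is a made-up example:

```groovy
// Hypothetical FOFN content: one file name per line
def fofnText = "S1_R1.fastq\nS2_R1.fastq\n"

// Split into lines, trim surrounding whitespace, and drop any empty lines
def fileNames = fofnText.readLines().collect { it.trim() }.findAll { it }

assert fileNames == ["S1_R1.fastq", "S2_R1.fastq"]
```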

4.1.1.5 Paired files

Useful for paired Illumina FASTQ files.

params.reads = "/path_to_fastq_dir/*_R{1,2}.fastq"
workflow {
  Channel
    .fromFilePairs(params.reads, checkIfExists: true)
    .set {read_pairs_ch}
}

This creates a channel with one value per pair of files. Each value is a tuple of 2 elements:

  • The first is a text value holding the file prefix, which is the text that the * represents in the params.reads path.
  • The second is a list in which the first element is the R1 file and the second element is the R2 file.

For example, if you had the files:

  • S1_R1.fastq
  • S1_R2.fastq

The tuple would be:

["S1",["S1_R1.fastq","S1_R2.fastq"]]

You could then use this channel like so:

params.reads = "/path_to_fastq_dir/*_R{1,2}.fastq"
process RUN {
  input:
    tuple val(sample_id), path(reads)
  script:
  """
  command -s ${sample_id} -r1 ${reads[0]} -r2 ${reads[1]}
  """
}
workflow {
  Channel
    .fromFilePairs(params.reads, checkIfExists: true)
    .set {read_pairs_ch}
  RUN(read_pairs_ch)
}

Note: List indexing starts at 0, so ${reads[0]} is the R1 file and ${reads[1]} is the R2 file.
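
To make the tuple structure concrete, the following plain Groovy sketch unpacks the example tuple from above in the same way the process input does. The variable names are illustrative only:

```groovy
// The example tuple: a sample ID followed by a list of the two read files
def pair = ["S1", ["S1_R1.fastq", "S1_R2.fastq"]]

// Unpack the tuple into its two elements
def (sampleId, reads) = pair

assert sampleId == "S1"
assert reads[0] == "S1_R1.fastq"   // index 0 is the R1 file
assert reads[1] == "S1_R2.fastq"   // index 1 is the R2 file
```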

4.1.2 Extracting channels from process

process RUN {
  input:
    path(R1)
  output:
    path("R1_trimmed.fastq"), emit: R1_trimmed
    path("R1_trimmed.stats.txt"), emit: R1_trimmed_stats
}
workflow {
  /// Input reads from params
  r1_ch = file(params.r1_fastq)
  /// Run process
  RUN(r1_ch)
  /// Extract channels from run process
  r1_trimmed_ch = RUN.out.R1_trimmed
  r1_trimmed_stats_ch = RUN.out.R1_trimmed_stats
}

Note: You only need to extract channels from a process if they are going to be used in another process.

4.1.3 View

You can view a channel's content and structure by adding .view() to it and running the workflow (nextflow run main.nf).

workflow {
  /// Input reads from params
  r1_ch = Channel.fromPath(params.r1_fastq).view()
}
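
The view operator also accepts a closure, which is handy for labelling output when several channels are printed at once. A minimal sketch, reusing the text channel from section 4.1.1.1 (the "Greeting:" label is just an example):

```groovy
workflow {
  Channel.of("Hello", "Bonjour")
      .view { greeting -> "Greeting: ${greeting}" }   // prints one labelled line per channel value
}
```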

4.2 Tuples

4.2.1 Input & Output tuples

process RUN {
  input:
    tuple val(id), path(R1), path(R2)
  output:
    tuple val(id), path("R1_trim.fastq"), path("R2_trim.fastq")
}

4.2.2 Transpose

If you have the below channel of 3 values, each containing a tuple of 2 elements...

[["s1","s1.txt"],["s2","s2.txt"],["s3","s3.txt"]]

and want to change it to the below channel of 1 value containing a tuple of 2 elements, each of which is a list of 3...

[["s1","s2","s3"],["s1.txt","s2.txt","s3.txt"]]

You can run the following:

workflow {
  transposed_ch = id_file_pair_ch.collect(flat:false)
                                  .map{ it.transpose()}
}

This can then be used as input to a process like so:

process RUN {
  input:
    tuple val(sample_ids_list), path(files_list)
}
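
To see what transpose() does to the collected list, here is a plain Groovy sketch that uses the example values from this section without any channels:

```groovy
// The collected channel content: 3 tuples of [id, file name]
def pairs = [["s1", "s1.txt"], ["s2", "s2.txt"], ["s3", "s3.txt"]]

// transpose() regroups the elements by position: all IDs first, then all file names
def transposed = pairs.transpose()

assert transposed == [["s1", "s2", "s3"], ["s1.txt", "s2.txt", "s3.txt"]]
```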

4.3 Lists

4.3.1 Channel values to one list

You can collect all the values of a channel into one list.

workflow {
  all_R1_fastq_ch = FASTQC.out.htmlR.collect()
  all_R2_fastq_ch = FASTQC.out.htmlL.collect()
}

This ends up with lists like ["S1_R1.fastqc.html", "S2_R1.fastqc.html"].

4.3.2 Parameter in front of each list value

Some programs accept a list of values separated by spaces. However, some may require a parameter flag in front of each path/file. This can be done by defining a variable in the script block of the process.

process RUN {
  input:
    path(all_R1_fastq)

  script:
  // Prefix each file with -F and join the pieces with spaces
  def R1_fastqc_lines = all_R1_fastq.collect { fastqc -> "-F ${fastqc}" }.join(" ")
  """
  multiqc -o multiqc $R1_fastqc_lines
  """
}

That then becomes "-F S1_R1.fastqc.html -F S2_R1.fastqc.html".
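
The same collect and join logic can be checked in plain Groovy outside of a process. This is a minimal sketch using the example file names from section 4.3.1. The variable names are illustrative:

```groovy
// Example list of files, as produced by collecting a channel
def allR1Fastq = ["S1_R1.fastqc.html", "S2_R1.fastqc.html"]

// Put the flag in front of each file, then join the pieces with a space
def withFlags = allR1Fastq.collect { fastqc -> "-F ${fastqc}" }.join(" ")

assert withFlags == "-F S1_R1.fastqc.html -F S2_R1.fastqc.html"
```

The space passed to join(" ") matters: it is what keeps each file name separate from the next flag on the command line.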