ERROR: Please check design file header: group,replicate,fastq_1,fastq_2 != group,replicate,fastq_1,fastq_2

Open bschilder opened this issue 3 years ago • 0 comments

Reposting here from the nf-core Slack for posterity. Thanks to @drpatelh for figuring this out!

Problem

The nf-core/atacseq pipeline does not recognize the design.csv file, even though it follows the structure indicated here.

The example bash script below follows a modified version of the command shown here.

Bash script

#!/bin/bash

source ~/.bashrc
module load nextflow

export repo_dir=$HOME/neurogenomics/GitRepos/CUT_n_TAG
export project_id=HK5M2BBXY
mkdir -p $repo_dir/processed_data/$project_id


nextflow run nf-core/atacseq \
    --input $repo_dir/raw_data/$project_id/design_noindex.csv \
    --genome GRCh37 \
    --narrow_peak \
    --outdir $repo_dir/processed_data/$project_id \
    -with-singularity $HOME/atacseq_latest.sif \
    -c $repo_dir/hpc_config \
    -r 1.2.1 

Error output

...
Error executing process > 'CHECK_DESIGN (design.csv)'

Caused by:
  Process `CHECK_DESIGN (design.csv)` terminated with an error exit status (1)

Command executed:

  check_design.py design.csv design_reads.csv

Command exit status:
  1

Command output:
  ERROR: Please check design file header: group,replicate,fastq_1,fastq_2 != group,replicate,fastq_1,fastq_2

Command wrapper:
  singularity/default :: no need to load a module to use singularity
  ERROR: Please check design file header: group,replicate,fastq_1,fastq_2 != group,replicate,fastq_1,fastq_2
  
  ============================================
  
          Job resource usage summary 
  
                   Memory (GB)    NCPUs
   Requested  :         6             1
   Used       :         0 (peak)   0.50 (ave)
  
  ============================================

Work dir:
  /rds/general/user/bms20/ephemeral/tmp/b8/e3357840d49b488c3f691f8db4592d

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

Solution

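The design.csv file MUST be saved as plain CSV, not as UTF-8-encoded CSV, which is the default in Excel. The two headers in the error message look identical most likely because the UTF-8 export prepends an invisible byte-order mark (BOM) to the first line, so the header comparison in check_design.py fails.
See the attached screenshot for how to save the file in the correct format.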

Screenshot 2020-12-17 at 17 52 43
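
If re-exporting from Excel is not convenient, the file can also be checked and cleaned programmatically. The following is a minimal sketch, assuming the culprit is a leading UTF-8 byte-order mark; the function names (`strip_bom`, `header_matches`, `clean_design_file`) are hypothetical helpers and not part of the pipeline.

```python
import codecs

# Header reported in the CHECK_DESIGN error message.
EXPECTED_HEADER = "group,replicate,fastq_1,fastq_2"


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte-order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def header_matches(data: bytes) -> bool:
    """Check whether the first line equals the expected header exactly."""
    if not data:
        return False
    # strip() removes whitespace and line endings, but not the BOM character.
    first_line = data.splitlines()[0].decode("utf-8").strip()
    return first_line == EXPECTED_HEADER


def clean_design_file(src: str, dst: str) -> None:
    """Copy src to dst, dropping a leading BOM (hypothetical helper)."""
    with open(src, "rb") as fh:
        raw = fh.read()
    with open(dst, "wb") as fh:
        fh.write(strip_bom(raw))
```

For example, `clean_design_file("design.csv", "design_clean.csv")` writes a copy without the BOM, which can then be passed to `--input`. A BOM is invisible in most editors and terminals, which would explain why both sides of the `!=` in the error look the same.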

bschilder, Dec 17 '20 23:12