Type: Bug
Resolution: Fixed
Priority: Major
Affects Version/s: master
Component/s: Planner: Cleanup Module
Email from Tobias Tikaa@USC
I hope you are doing well. I am currently trying to modify my pipeline so that Pegasus cleans up files that are no longer needed, so I removed "--cleanup none" from the script that runs pegasus-plan. However, Pegasus removes an intermediate file that is created by one job, and the whole run crashes because that file is needed by the next job.
What should I do to get this to work? I have attached a picture of the jobs (it is cleanup_PegasusVM_level_3 that deletes the file I need) and the folder from the work directory that Pegasus generates.
The name of the file is EC000284.intervals_from_target_creator.intervals, and it is created by GATKTargetCreatorJob.
I get the following log from pegasus-analyzer:
site: PegasusVM
submit file: IndelRealigner_IndelRealigner1.sub
output file: IndelRealigner_IndelRealigner1.out.001
error file: IndelRealigner_IndelRealigner1.err.001
------------------------------Task #1 - Summary-------------------------------
site : PegasusVM
hostname : unknown
executable : /usr/java/jdk1.8.0_51/jre/bin/java
arguments : -Xmx4g -jar /home/bcpipeline/software/gatk/default/GenomeAnalysisTK.jar -T IndelRealigner -R reference.fa -I EC000284.bam -targetIntervals EC000284.intervals_from_target_creator.intervals -o EC000284.realigned.bam -log bySample/EC000284/EC000284.indelRealigner.log -known knownIndels.0.vcf.gz -known knownIndels.1.vcf.gz
exitcode : 1
working dir : /test_disk/shared-scratch/bcpipeline/pegasus/RecalibrateAndRealignDax/run0016
----Task #1 - bc::IndelRealigner:1.0 - IndelRealigner1 - Kickstart stderr-----
INFO 11:57:22,769 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:57:22,773 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.4-0-g7e26428, Compiled 2015/05/15 03:25:41
INFO 11:57:22,774 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 11:57:22,775 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 11:57:22,778 HelpFormatter - Program Args: -T IndelRealigner -R reference.fa -I EC000284.bam -targetIntervals EC000284.intervals_from_target_creator.intervals -o EC000284.realigned.bam -log bySample/EC000284/EC000284.indelRealigner.log -known knownIndels.0.vcf.gz -known knownIndels.1.vcf.gz
INFO 11:57:22,784 HelpFormatter - Executing as bcpipeline@localhost.localdomain on Linux 2.6.32-504.30.3.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_51-b16.
INFO 11:57:22,785 HelpFormatter - Date/Time: 2015/08/04 11:57:22
INFO 11:57:22,785 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:57:22,786 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:57:23,495 GenomeAnalysisEngine - Strictness is SILENT
INFO 11:57:23,639 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 11:57:23,680 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
WARNING: BAM index file /test_disk/shared-scratch/bcpipeline/pegasus/RecalibrateAndRealignDax/run0016/EC000284.bam.bai is older than BAM /test_disk/shared-scratch/bcpipeline/pegasus/RecalibrateAndRealignDax/run0016/EC000284.bam
INFO 11:57:23,784 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.10
WARN 11:57:24,090 IndexDictionaryUtils - Track knownAlleles doesn't have a sequence dictionary built in, skipping dictionary validation
WARN 11:57:24,094 IndexDictionaryUtils - Track knownAlleles2 doesn't have a sequence dictionary built in, skipping dictionary validation
INFO 11:57:24,307 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 11:57:24,313 GenomeAnalysisEngine - Done preparing for traversal
INFO 11:57:24,313 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 11:57:24,314 ProgressMeter - Location | reads | elapsed | reads | completed | runtime | runtime
INFO 11:57:25,643 GATKRunReport - Uploaded run statistics report to AWS S3
- ERROR ------------------------------------------------------------------------------------------
- ERROR A USER ERROR has occurred (version 3.4-0-g7e26428):
- ERROR This means that one or more arguments or inputs in your command are incorrect.
- ERROR The error message below tells you what is the problem.
- ERROR If the problem is an invalid argument, please check the online documentation guide
- ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
- ERROR Visit our website and forum for extensive documentation and answers to
- ERROR commonly asked questions http://www.broadinstitute.org/gatk
- ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
- ERROR MESSAGE: Couldn't read file /test_disk/shared-scratch/bcpipeline/pegasus/RecalibrateAndRealignDax/run0016/EC000284.intervals_from_target_creator.intervals because The interval file does not exist.
- ERROR ------------------------------------------------------------------------------------------
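For triage, the behaviour the report expects from cleanup can be stated as an invariant: a file may only be deleted once every job that reads or writes it has finished. The Python sketch below checks that invariant on a two-job slice of the workflow. It is an illustration only, not the planner's actual cleanup algorithm; the helper names (file_users, premature_deletions) are hypothetical, and the input/output lists are assumptions pieced together from the email and the IndelRealigner arguments in the log, not taken from the real DAX.

```python
from collections import defaultdict


def file_users(jobs):
    """Map each file name to the set of jobs that read or write it."""
    users = defaultdict(set)
    for name, io in jobs.items():
        for f in io.get("inputs", []) + io.get("outputs", []):
            users[f].add(name)
    return dict(users)


def premature_deletions(jobs, finished, deleted):
    """Return the deleted files that at least one unfinished job still uses."""
    users = file_users(jobs)
    done = set(finished)
    return sorted(f for f in deleted if users.get(f, set()) - done)


# Assumed slice of the workflow; file lists are illustrative, not from the DAX.
jobs = {
    "GATKTargetCreatorJob": {
        "inputs": ["EC000284.bam"],
        "outputs": ["EC000284.intervals_from_target_creator.intervals"],
    },
    "IndelRealigner1": {
        "inputs": [
            "EC000284.bam",
            "EC000284.intervals_from_target_creator.intervals",
        ],
        "outputs": ["EC000284.realigned.bam"],
    },
}

# The reported situation: the intervals file is deleted after the creator job
# finishes but before IndelRealigner1 has run.
flagged = premature_deletions(
    jobs,
    finished=["GATKTargetCreatorJob"],
    deleted=["EC000284.intervals_from_target_creator.intervals"],
)
print(flagged)
```

Under these assumptions the intervals file is flagged, which matches the failure in the log above. Once IndelRealigner1 is also listed as finished, the same deletion is no longer flagged.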