Pegasus / PM-1947

workflow restart fails because of error reading a tc file in /tmp


      Hello, I am encountering an issue with Pegasus version 5.0.8dev. If I stop a running workflow and then try to restart it, the restart fails because the planner tries to read a transformation catalog file from /tmp that it cannot find or read.
       
      (astro) kkacanja@sugwg-login:/search/O3_analysis_1/output_3/submitdir/work$ tail -n 40 pegasus-plan_o31-main.pre.log.002
      2024.04.15 11:35:27.233 EDT: [DEBUG] Parsed DAX with following metrics {"compute_tasks":82384,"dax_tasks":8,"dag_tasks":0,"total_tasks":82392,"deleted_tasks":0,"dax_input_files":17659,"dax_inter_files":82933,"dax_output_files":87,"dax_total_files":100679,"compute_jobs":82384,"clustered_jobs":0,"si_tx_jobs":0,"so_tx_jobs":0,"inter_tx_jobs":0,"reg_jobs":0,"cleanup_jobs":0,"create_dir_jobs":0,"dax_jobs":8,"dag_jobs":0,"chmod_jobs":0,"total_jobs":82392,"mDAXLabel":"o31-main.dax"}
      2024.04.15 11:35:27.256 EDT: [CONFIG] Loading site catalog file /home/kkacanja/search/O3_analysis_1/output_3/sites.yml
      2024.04.15 11:35:27.256 EDT: [DEBUG] All sites will be loaded from the site catalog
      2024.04.15 11:35:27.257 EDT: [DEBUG] event.pegasus.parse.site-catalog site-catalog.id /home/kkacanja/search/O3_analysis_1/output_3/sites.yml - STARTED
      2024.04.15 11:35:27.373 EDT: [DEBUG] event.pegasus.parse.site-catalog site-catalog.id /home/kkacanja/search/O3_analysis_1/output_3/sites.yml (0.116 seconds) - FINISHED
      2024.04.15 11:35:27.373 EDT: [DEBUG] Sites loaded are [osg, condorpool_shared, condorpool_symlink, condorpool_copy, local]
      2024.04.15 11:35:27.374 EDT: [CONFIG] Set environment profile for local site PATH=/home/kkacanja/jre1.8.0_391/bin:/home/kkacanja/git/git-2.33.0:/home/kkacanja/pegasus-5.0.8dev-binary/bin:/home/kkacanja/miniconda3/envs/astro/bin:/home/kkacanja/miniconda3/bin:/home/kkacanja/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
      2024.04.15 11:35:27.374 EDT: [CONFIG] Set environment profile for local site PYTHONPATH=/home/kkacanja/pegasus-5.0.8dev-binary/lib/python3.8/dist-packages:/home/kkacanja/pegasus-5.0.8dev-binary/lib/pegasus/externals/python
      2024.04.15 11:35:27.374 EDT: [CONFIG] Constructed default site catalog entry for condorpool site <site handle="condorpool" arch="x86_64" os="linux" osrelease="" osversion="" glibc=""> <profile namespace="pegasus" key="style" >condor</profile> </site>
      2024.04.15 11:35:27.430 EDT: [DEBUG] Mount Under Scratch Directories [/tmp, /var/tmp]
      2024.04.15 11:35:27.431 EDT: [DEBUG] Style detected for site osg is class edu.isi.pegasus.planner.code.generator.condor.style.Condor
      2024.04.15 11:35:27.431 EDT: [DEBUG] Style detected for site condorpool_shared is class edu.isi.pegasus.planner.code.generator.condor.style.Condor
      2024.04.15 11:35:27.431 EDT: [DEBUG] Style detected for site condorpool is class edu.isi.pegasus.planner.code.generator.condor.style.Condor
      2024.04.15 11:35:27.431 EDT: [DEBUG] Style detected for site condorpool_symlink is class edu.isi.pegasus.planner.code.generator.condor.style.Condor
      2024.04.15 11:35:27.431 EDT: [DEBUG] Style detected for site condorpool_copy is class edu.isi.pegasus.planner.code.generator.condor.style.Condor
      2024.04.15 11:35:27.431 EDT: [DEBUG] Style detected for site local is class edu.isi.pegasus.planner.code.generator.condor.style.Condor
      2024.04.15 11:35:27.431 EDT: [DEBUG] Execution sites are [condorpool_shared, condorpool_symlink, local]
      2024.04.15 11:35:27.435 EDT: [INFO] event.pegasus.load.directory dax.id o31-main.dax-0 - STARTED
      2024.04.15 11:35:27.435 EDT: [CONFIG] Transformation Catalog Type used Directory backed TC
      2024.04.15 11:35:27.436 EDT: [CONFIG] Loading transformations from directory backend with properties {directory=/tmp/pegasus.JtwQOEOP4/transformations, file=/tmp/tc.2292643847909039152.txt}
      2024.04.15 11:35:27.436 EDT: [CONFIG] Directory from where transformations will be picked up /tmp/pegasus.JtwQOEOP4/transformations
      2024.04.15 11:35:27.436 EDT: [DEBUG] Unable to load transformations from directory /tmp/pegasus.JtwQOEOP4/transformations
      2024.04.15 11:35:27.436 EDT: [DEBUG] Unable to load transformations from directory /tmp/pegasus.JtwQOEOP4/transformations class edu.isi.pegasus.planner.catalog.transformation.TransformationFactoryException: Unable to instantiate Transformation Catalog
      2024.04.15 11:35:27.436 EDT: [INFO] event.pegasus.load.directory dax.id o31-main.dax-0 (0.002 seconds) - FINISHED
      2024.04.15 11:35:27.437 EDT: [FATAL ERROR] java.lang.RuntimeException: File does not exist or with read bit set to false /tmp/tc.2292643847909039152.txt
          at edu.isi.pegasus.common.util.FileDetector.isTypeYAML(FileDetector.java:132)
          at edu.isi.pegasus.common.util.FileDetector.isTypeYAML(FileDetector.java:119)
          at edu.isi.pegasus.planner.catalog.transformation.TransformationFactory.loadInstance(TransformationFactory.java:242)
          at edu.isi.pegasus.planner.catalog.transformation.TransformationFactory.loadInstanceWithStores(TransformationFactory.java:158)
          at edu.isi.pegasus.planner.catalog.transformation.TransformationFactory.loadInstanceWithStores(TransformationFactory.java:104)
          at edu.isi.pegasus.planner.client.CPlanner.executeCommand(CPlanner.java:450)
          at edu.isi.pegasus.planner.client.CPlanner.executeCommand(CPlanner.java:328)
          at edu.isi.pegasus.planner.client.CPlanner.main(CPlanner.java:206)
      ERROR while logging metrics The metrics file location is not yet initialized
      2024.04.15 11:35:27.439 EDT: [DEBUG] Exiting with non-zero exit-code 1
      2024.04.15 11:35:27.439 EDT: [INFO] event.pegasus.planner planner.version 5.0.8dev (34.51 seconds) - FINISHED

      (astro) kkacanja@sugwg-login:/search/O3_analysis_1/output_3/submitdir/work$ cat pegasus-plan_o31-main.pre.log.002
      2024.04.15 11:34:52.945 EDT: [INFO] Planner launched in the following directory /tmp/pegasus.JtwQOEOP4
      2024.04.15 11:34:52.948 EDT: [INFO] Planner invoked with following arguments --conf /home/kkacanja/search/O3_analysis_1/output_3/pycbc-tmp_cymutq2_/work/pegasus.9196196284482208928.properties --dir /home/kkacanja/search/O3_analysis_1/output_3/pycbc-tmp_cymutq2_ --relative-dir work/./o31-main.dax_o31-main --relative-submit-dir work/././o31-main.dax_o31-main --basename o31-main --sites condorpool_shared,condorpool_symlink,local --staging-site condorpool_shared=condorpool_shared,condorpool_symlink=local,local=local, --cache /home/kkacanja/search/O3_analysis_1/gw-main_3.map,/home/kkacanja/search/O3_analysis_1/output_3/pycbc-tmp_cymutq2_/work/./pegasus-plan_o31-main.input.cache --inherited-rc-files /home/kkacanja/search/O3_analysis_1/output_3/pycbc-tmp_cymutq2_/work/o31.dax-0.replica.store --cluster label,horizontal --output-sites local --cleanup inplace --verbose --verbose --verbose --deferred /home/kkacanja/search/O3_analysis_1/output_3/o31-main.dax ...
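
      The fatal error above names one missing file under /tmp, but other planner inputs could point there too. A quick way to check before restarting is to scan the properties file passed via --conf for values under the scratch directories the log lists (/tmp, /var/tmp) and report any that no longer exist. The following is only a minimal sketch under stated assumptions: the helper names parse_properties and missing_scratch_paths are hypothetical, the parser handles plain key=value lines only (no Java escapes or line continuations), and it makes no assumption about which property keys Pegasus actually writes.

```python
import os
import sys
from pathlib import Path


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' / '!' comments.

    Simplified on purpose: no escape sequences or line continuations.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_scratch_paths(props, scratch_dirs=("/tmp", "/var/tmp")):
    """Return {key: path} for values under a scratch dir that do not exist."""
    missing = {}
    for key, value in props.items():
        under_scratch = any(
            value == d or value.startswith(d + "/") for d in scratch_dirs
        )
        if under_scratch and not os.path.exists(value):
            missing[key] = value
    return missing


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_props.py <path to the .properties file>
    loaded = parse_properties(Path(sys.argv[1]).read_text())
    for key, path in sorted(missing_scratch_paths(loaded).items()):
        print(f"{key} -> {path} (missing)")
```

      Running it against the properties file from the planner arguments would list every entry that refers to a vanished scratch path. It only diagnoses the problem; it does not change how the planner resolves the transformation catalog on restart.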

            Assignee: Karan Vahi (vahi)
            Reporter: Ian Harry (iwharry)