- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version/s: master, 4.7.4
- Component/s: CLI: pegasus-db-admin, Monitord
Hi Karan,
I'm running into an issue when I run pegasus-monitord in replay mode on some post-processing workflows. It looks like it's trying to add all of my output files to a database, but they've been added already:
2017-08-23 12:55:39,852:ERROR:Pegasus.db.workflow_loader.WorkflowLoader(107): Insert failed for event <class 'Pegasus.db.schema.RCPFN'>:
- wf_id : 5240
- wf_uuid : f8419003-ec55-4f2f-81e5-8088c5f443a2
- pfn_id : None
- lfn : RotD_UCSB_4911_267_4.rotd
- site : shock
- pfn : gsiftp://hpc-transfer.usc.edu/home/scec-04/tera3d/CyberShake/data/PPFiles/UCSB/4911/RotD_UCSB_267_4.rotd
- lfn_id : 1984029
- event : stampede.rc.pfn
: (IntegrityError) (1062, "Duplicate entry '1984029-gsiftp://hpc-transfer.usc.edu/home/scec-04/tera3d/CyberS' for key 'UNIQUE_PFN'") 'INSERT INTO rc_pfn (lfn_id, pfn, site) VALUES (%s, %s, %s)' (1984029L, 'gsiftp://hpc-transfer.usc.edu/home/scec-04/tera3d/CyberShake/data/PPFiles/UCSB/4911/RotD_UCSB_267_4.rotd', 'shock')
2017-08-23 12:55:39,885:ERROR:Pegasus.db.workflow_loader.WorkflowLoader(107): Insert failed for event <class 'Pegasus.db.schema.RCPFN'>:
- wf_id : 5240
- wf_uuid : f8419003-ec55-4f2f-81e5-8088c5f443a2
- pfn_id : None
- lfn : RotD_UCSB_4911_254_19.rotd
- site : shock
- pfn : gsiftp://hpc-transfer.usc.edu/home/scec-04/tera3d/CyberShake/data/PPFiles/UCSB/4911/RotD_UCSB_254_19.rotd
- lfn_id : 1984030
- event : stampede.rc.pfn
: (IntegrityError) (1062, "Duplicate entry '1984030-gsiftp://hpc-transfer.usc.edu/home/scec-04/tera3d/CyberS' for key 'UNIQUE_PFN'") 'INSERT INTO rc_pfn (lfn_id, pfn, site) VALUES (%s, %s, %s)' (1984030L, 'gsiftp://hpc-transfer.usc.edu/home/scec-04/tera3d/CyberShake/data/PPFiles/UCSB/4911/RotD_UCSB_254_19.rotd', 'shock')
The issue is that it does this for all ~28k files for each workflow, and as a result takes forever. Is there a way to disable this, since it seems like the files have been inserted already? Is it necessary for getting statistics, which is what I'm interested in? Thanks!
-Scott
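
For reference, the failure mode in the log (re-inserting a row that already satisfies a unique key) can be reproduced in isolation. The sketch below is not Pegasus code: it uses an in-memory SQLite table as a hypothetical stand-in for rc_pfn, assumes the UNIQUE_PFN key covers (lfn_id, pfn) as the error message suggests (the real key may include more columns), and uses a made-up example.org URL. It shows one way a loader could make such inserts idempotent so that a replay skips existing rows instead of raising an IntegrityError for each one.

```python
import sqlite3

# Hypothetical stand-in for the rc_pfn table. The column list comes from the
# INSERT statement in the log; the composite unique key is an assumption
# inferred from the "Duplicate entry '<lfn_id>-<pfn>'" error text.
SCHEMA = """
CREATE TABLE rc_pfn (
    pfn_id INTEGER PRIMARY KEY,
    lfn_id INTEGER NOT NULL,
    pfn    TEXT    NOT NULL,
    site   TEXT,
    CONSTRAINT UNIQUE_PFN UNIQUE (lfn_id, pfn)
)
"""


def make_db():
    """Create an in-memory database holding the stand-in table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def insert_pfn_idempotent(conn, lfn_id, pfn, site):
    """Insert a row unless the unique key already exists.

    Returns True if a new row was added, False if it was skipped.
    SQLite spells this INSERT OR IGNORE; MySQL's counterpart is INSERT IGNORE.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO rc_pfn (lfn_id, pfn, site) VALUES (?, ?, ?)",
        (lfn_id, pfn, site),
    )
    return cur.rowcount == 1


conn = make_db()
# Illustrative row: the lfn_id is taken from the log, the URL is made up.
row = (1984029, "gsiftp://example.org/data/RotD_UCSB_267_4.rotd", "shock")

first = insert_pfn_idempotent(conn, *row)   # initial load adds the row
second = insert_pfn_idempotent(conn, *row)  # replay skips it without an error

# A plain INSERT of the same row fails the way the log above shows.
try:
    conn.execute("INSERT INTO rc_pfn (lfn_id, pfn, site) VALUES (?, ?, ?)", row)
    plain_insert_failed = False
except sqlite3.IntegrityError:
    plain_insert_failed = True

row_count = conn.execute("SELECT COUNT(*) FROM rc_pfn").fetchone()[0]
```

Whether skipping duplicates silently is appropriate depends on how the loader is expected to treat replays; this only illustrates the database-level behaviour.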