- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version/s: master, 4.8.0
- Component/s: CLI: pegasus-s3, CLI: pegasus-transfer
I am trying to run a workflow with S3 as a staging site.
Here is my site catalog entry:
<site handle="aws-batch" arch="x86_64" os="LINUX">
    <directory path="/pegasus-batch-1231-2/scratch" type="shared-scratch" free-size="" total-size="">
        <file-server operation="all" url="s3://vahi@amazon/pegasus-batch-1231-2/scratch">
        </file-server>
    </directory>
    <profile namespace="pegasus" key="clusters.num">1</profile>
    <profile namespace="pegasus" key="style">condor</profile>
    <profile namespace="condor" key="universe">vanilla</profile>
</site>
The first time I run a workflow, the create dir job succeeds and creates the bucket:
s3://vahi@amazon/pegasus-batch-1231-2/scratch/vahi/pegasus/diamond/run0001
On subsequent planning and runs of the workflow (where the run directory is incremented), the create dir job fails:
cat 00/00/create_dir_diamond_0_aws-batch.in
[
{
"id": 1,
"type": "mkdir",
"target":
}
]
2018-01-11 11:23:03,743 INFO: Reading URL pairs from stdin
2018-01-11 11:23:03,743 INFO: 1 transfers loaded
2018-01-11 11:23:03,744 INFO: PATH=/usr/bin:/bin:/sw/bin
2018-01-11 11:23:03,744 INFO: LD_LIBRARY_PATH=
2018-01-11 11:23:03,862 INFO: --------------------------------------------------------------------------------
2018-01-11 11:23:03,863 INFO: Starting transfers - attempt 1
2018-01-11 11:23:05,867 INFO: Tool found: pegasus-s3 Version: N/A Path: /Volumes/Work/lfs1/devel/Pegasus/git/pegasus/bin/pegasus-s3
2018-01-11 11:23:05,868 INFO: /Volumes/Work/lfs1/devel/Pegasus/git/pegasus/bin/pegasus-s3 mkdir s3://vahi@amazon/pegasus-batch-1231-2
2018-01-11 11:23:07,772 INFO: ERROR: Your previous request to create the named bucket succeeded and you already own it.
2018-01-11 11:23:07,773 ERROR: Command exited with non-zero exit code (1): /Volumes/Work/lfs1/devel/Pegasus/git/pegasus/bin/pegasus-s3 ...
2018-01-11 11:23:36,775 INFO: --------------------------------------------------------------------------------
2018-01-11 11:23:36,775 INFO: Starting transfers - attempt 2
2018-01-11 11:23:38,777 INFO: /Volumes/Work/lfs1/devel/Pegasus/git/pegasus/bin/pegasus-s3 mkdir s3://vahi@amazon/pegasus-batch-1231-2
2018-01-11 11:23:40,144 INFO: ERROR: Your previous request to create the named bucket succeeded and you already own it.
2018-01-11 11:23:40,145 ERROR: Command exited with non-zero exit code (1): /Volumes/Work/lfs1/devel/Pegasus/git/pegasus/bin/pegasus-s3 ...
In this case, it should simply have added a new key, scratch/vahi/pegasus/diamond/run0002, to the existing bucket s3://vahi@amazon/pegasus-batch-1231-2/ instead of failing because the bucket already exists.
A similar issue might exist with the cleanup jobs that remove directories.
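The expected create dir behaviour described above could be sketched roughly as follows. This is only an illustration of the idea, not the actual pegasus-s3 code: `FakeS3`, `BucketAlreadyOwned`, `split_s3_url`, and `mkdir` are hypothetical names, and the in-memory client merely stands in for a real S3 connection. The point is that an "already own it" response for the bucket is treated as success, and the run directory is then added as a key.

```python
from urllib.parse import urlparse


class BucketAlreadyOwned(Exception):
    """Hypothetical error standing in for S3's 'you already own it' response."""


class FakeS3:
    """Hypothetical in-memory stand-in for an S3 client (illustration only)."""

    def __init__(self):
        self.buckets = {}  # bucket name -> set of keys

    def create_bucket(self, name):
        if name in self.buckets:
            raise BucketAlreadyOwned(name)
        self.buckets[name] = set()

    def put_key(self, bucket, key):
        self.buckets[bucket].add(key)


def split_s3_url(url):
    """Split s3://user@site/bucket/key/... into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError("not an s3 URL: %s" % url)
    bucket, _, key = parsed.path.lstrip("/").partition("/")
    if not bucket:
        raise ValueError("no bucket in URL: %s" % url)
    return bucket, key.strip("/")


def mkdir(client, url):
    """Create the bucket if needed, then add the directory key.

    An existing bucket that we already own is not an error.
    """
    bucket, key = split_s3_url(url)
    try:
        client.create_bucket(bucket)
    except BucketAlreadyOwned:
        pass  # bucket exists from an earlier run; carry on
    if key:
        client.put_key(bucket, key + "/")
```

With this approach, calling `mkdir` for run0001 and then for run0002 against the same client would leave one bucket containing both run directory keys, and neither call would fail.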