Type: New Feature
Resolution: Fixed
Priority: Major
Fix Version/s: 3.0
Affects Version/s: None
Component/s: Pegasus Planner
Labels: None
Environment: Amazon EC2 cloud, using S3 to store data

In a cloud environment where there is no shared filesystem, we rely on staging data to S3 storage on the cloud and enabling worker node execution in Pegasus.
This results in SLS files being created in the submit directory, which are then staged to S3 storage as part of the first-level staging.
However, this increases the data transfer time relative to the workflow's overall execution time.

Since the S3 transfer implementation uses seqexec to execute multiple S3 commands in one job, there is an opportunity for optimization: the contents of all SLS files for a job can be coalesced into a single seqexec input file.
Condor will then transfer this seqexec input file to the node when running the job.

To achieve this we require a new GridStart implementation in Pegasus that uses seqexec to launch jobs.
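
The coalescing step described above could be sketched roughly as follows. This is only an illustration, not the Pegasus implementation: it assumes that each SLS file lists one command per line and that the seqexec input file uses the same one-command-per-line layout. The function name coalesce_sls and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Iterable


def coalesce_sls(sls_paths: Iterable[str], seqexec_input: str) -> int:
    """Merge the commands from several SLS files into one seqexec input file.

    Assumption (not confirmed by the issue): each SLS file holds one
    command per line. Blank lines are skipped, and the order of the
    files and of the lines within each file is preserved.
    Returns the number of commands written.
    """
    commands = []
    for path in sls_paths:
        for line in Path(path).read_text().splitlines():
            line = line.strip()
            if line:
                commands.append(line)

    # Write every command on its own line in the single combined file.
    Path(seqexec_input).write_text("".join(cmd + "\n" for cmd in commands))
    return len(commands)
```

With one combined file per job, only that file needs to accompany the job to the worker node, instead of staging each SLS file to S3 separately.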


Attachments: seqexec-sls.tgz (10 kB, 26/Aug/10 6:21 PM)

Sub-Tasks:
1. Clustered jobs with the SeqExec gridstart do not get the -w argument for kickstart (Closed)
2. For worker node execution with the seqexec wrapper, clustered jobs stage the output products back to /tmp instead of the head node directory (Closed)

Assignee: Karan Vahi
Reporter: Karan Vahi

Created: 21/Oct/09 2:42 PM
Updated: 16/Apr/14 11:00 AM
Resolved: 16/Apr/14 11:00 AM
Archived: 14/Dec/24 10:43 PM
