Type: Bug
Resolution: Fixed
Priority: Major
Affects Version/s: 4.6.0
Component/s: Pegasus Planner
If all three of nodes, cores, and ppn are specified for a job, the planner prints this:
2016.02.02 15:12:29.646 EST: [DEBUG] Postscript constructed is /autofs/nccs-svm1_sw/redhat6/pegasus/4.6.0/bin/pegasus-exitcode
2016.02.02 15:12:29.649 EST: [DEBUG] Written Submit file : chmod_acme-run_ID0000002_0.sub
2016.02.02 15:12:29.649 EST: [DEBUG] Applying priority of 800 to chmod_acme-setup_ID0000001_0
2016.02.02 15:12:29.652 EST: [DEBUG] Trying to get TCEntries for pegasus::kickstart on resource local-pbs-titan of type INSTALLED
2016.02.02 15:12:29.652 EST: [DEBUG] Postscript constructed is /autofs/nccs-svm1_sw/redhat6/pegasus/4.6.0/bin/pegasus-exitcode
2016.02.02 15:12:29.654 EST: [DEBUG] Written Submit file : chmod_acme-setup_ID0000001_0.sub
2016.02.02 15:12:29.654 EST: [DEBUG] Applying priority of 800 to chmod_acme-output_ID0000003_0
2016.02.02 15:12:29.655 EST: [DEBUG] Trying to get TCEntries for pegasus::kickstart on resource local-pbs-titan of type INSTALLED
2016.02.02 15:12:29.655 EST: [DEBUG] Postscript constructed is /autofs/nccs-svm1_sw/redhat6/pegasus/4.6.0/bin/pegasus-exitcode
2016.02.02 15:12:29.656 EST: [DEBUG] Written Submit file : chmod_acme-output_ID0000003_0.sub
2016.02.02 15:12:29.656 EST: [DEBUG] Applying priority of 30 to acme-setup_ID0000001
2016.02.02 15:12:29.658 EST: [DEBUG] Trying to get TCEntries for pegasus::kickstart on resource local-pbs-titan of type INSTALLED
2016.02.02 15:12:29.658 EST: [DEBUG] Postscript constructed is /autofs/nccs-svm1_sw/redhat6/pegasus/4.6.0/bin/pegasus-exitcode
2016.02.02 15:12:29.658 EST: [INFO] event.pegasus.code.generation dax.id acme-20160202T180009Z_0 (0.059 seconds) - FINISHED
2016.02.02 15:12:29.659 EST: [FATAL ERROR] Unable to generate code
2016.02.02 15:12:29.669 EST: [DEBUG] Sending Planner Metrics to [1 of 1] http://metrics.pegasus.isi.edu/metrics
2016.02.02 15:12:30.097 EST: [DEBUG] Metrics succesfully sent to the server
2016.02.02 15:12:30.098 EST: [DEBUG] Exiting with non-zero exit-code 1
2016.02.02 15:12:30.098 EST: [INFO] event.pegasus.code.generation dax.id acme-20160202T180009Z_0 (0.503 seconds) - FINISHED
Note that the cause of the RuntimeException is not printed, even when the planner is run with multiple -v flags.
Here is the actual exception from the metrics server:
java.lang.RuntimeException: Unable to generate code
    at edu.isi.pegasus.planner.client.CPlanner.executeCommand(CPlanner.java:680)
    at edu.isi.pegasus.planner.client.CPlanner.executeCommand(CPlanner.java:365)
    at edu.isi.pegasus.planner.client.CPlanner.main(CPlanner.java:245)
Caused by: edu.isi.pegasus.planner.code.generator.condor.CondorStyleException: Invalid combination of cores nodes ppn (1,1,1,) for job acme-setup_ID0000001
    at edu.isi.pegasus.planner.code.generator.condor.style.GLite.handleResourceRequirements(GLite.java:594)
    at edu.isi.pegasus.planner.code.generator.condor.style.GLite.getCERequirementsForJob(GLite.java:339)
    at edu.isi.pegasus.planner.code.generator.condor.style.GLite.apply(GLite.java:239)
    at edu.isi.pegasus.planner.code.generator.condor.CondorGenerator.applyStyle(CondorGenerator.java:1790)
    at edu.isi.pegasus.planner.code.generator.condor.CondorGenerator.generateCode(CondorGenerator.java:679)
    at edu.isi.pegasus.planner.code.generator.condor.CondorGenerator.generateCode(CondorGenerator.java:513)
    at edu.isi.pegasus.planner.client.CPlanner.executeCommand(CPlanner.java:677)
    ... 2 more
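For illustration only, one way the planner could surface the hidden cause is to walk the exception's cause chain when logging the fatal error. The sketch below is not Pegasus code; the class name CauseChain and the method describe are hypothetical, and it uses only the standard java.lang.Throwable API.

```java
// Hypothetical helper, not part of Pegasus: renders an exception together
// with every nested cause so the root problem appears in the log output.
class CauseChain {

    /** Joins the type and message of a throwable and each of its causes. */
    static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder();
        // getCause() returns null once the end of the chain is reached
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (sb.length() > 0) {
                sb.append(" -> caused by: ");
            }
            sb.append(cur.getClass().getSimpleName())
              .append(": ")
              .append(cur.getMessage());
        }
        return sb.toString();
    }
}
```

Logging the result of describe(e) next to the "Unable to generate code" message would put the "Invalid combination of cores nodes ppn" text in front of the user without a trip to the metrics server.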
The error should actually be something like:
"Only two of (nodes, cores, ppn) should be specified for job X"
And, really, if the values for nodes, cores, and ppn satisfy nodes * ppn = cores (for example, 1,1,1), then it shouldn't be an error at all (maybe a warning). It should only be an error when the values don't satisfy that equation, for example: nodes = 2, ppn = 2, cores = 2.
Maybe the error should be something like:
"The values of (nodes, ppn, cores) for job X (2,2,2) do not satisfy cores = nodes * ppn. Please specify only two of (nodes, cores, ppn)."
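The proposed behavior could be sketched roughly as follows. This is a hypothetical illustration, not the actual GLite.handleResourceRequirements implementation: the class ResourceCheck, the Outcome enum, and both method names are made up for this example, and a null argument is assumed to mean "not specified".

```java
// Hypothetical sketch of the proposed check; names are illustrative only.
class ResourceCheck {

    /** Result of validating a (nodes, ppn, cores) triple. */
    enum Outcome { OK, WARN_REDUNDANT, ERROR_INCONSISTENT }

    /** Each argument is null when the value was not specified for the job. */
    static Outcome check(Integer nodes, Integer ppn, Integer cores) {
        if (nodes == null || ppn == null || cores == null) {
            // Fewer than three values given: nothing to cross-check here.
            return Outcome.OK;
        }
        // long arithmetic avoids int overflow in nodes * ppn
        boolean consistent =
                nodes.longValue() * ppn.longValue() == cores.longValue();
        return consistent ? Outcome.WARN_REDUNDANT : Outcome.ERROR_INCONSISTENT;
    }

    /** Builds the error text suggested in this ticket. */
    static String message(String job, int nodes, int ppn, int cores) {
        return String.format(
                "The values of (nodes, ppn, cores) for job %s (%d,%d,%d) do not "
                + "satisfy cores = nodes * ppn. Please specify only two of "
                + "(nodes, cores, ppn).",
                job, nodes, ppn, cores);
    }
}
```

With this sketch, check(1, 1, 1) yields WARN_REDUNDANT, while check(2, 2, 2) yields ERROR_INCONSISTENT because 2 * 2 is not 2.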