BalancedCluster refiner - Better balance of files in stage-in jobs

      The following would be a nice performance improvement which we have talked about in the past.

      For workflows with a large number of inputs, the idea is that we can do the transfers in parallel by spreading them across a set of stage-in jobs that run concurrently. However, Pegasus currently does a poor job of balancing the inputs across those jobs. For example, for an 8-degree Montage workflow with 20 stage-in jobs:

      $ for FILE in `ls stage_in_local_local_*.in | sort`; do COUNT=`cat $FILE | grep http | wc -l`; echo "$FILE $COUNT"; done
      stage_in_local_local_0_0.in 692
      stage_in_local_local_0_1.in 689
      stage_in_local_local_0_2.in 637
      stage_in_local_local_0_3.in 638
      stage_in_local_local_2_0.in 1
      stage_in_local_local_3_0.in 1
      stage_in_local_local_5_0.in 4
      stage_in_local_local_5_1.in 4
      stage_in_local_local_5_2.in 4
      stage_in_local_local_5_3.in 4
      stage_in_local_local_6_0.in 4
      stage_in_local_local_6_1.in 4
      stage_in_local_local_6_2.in 4
      stage_in_local_local_6_3.in 4
      stage_in_local_local_8_0.in 1
      stage_in_local_local_9_0.in 1
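
      To make the imbalance concrete, here is a minimal sketch of what an even spread of the same files could look like. It is not the Pegasus implementation: the helper `balance_round_robin` and the placeholder file names are hypothetical, and the sketch assumes every file may go to any stage-in job, ignoring whatever per-level grouping the planner applies. The counts are copied from the listing above.

```python
from typing import List

# Per-job URL counts copied from the listing above (16 .in files).
observed = [692, 689, 637, 638, 1, 1, 4, 4, 4, 4, 4, 4, 4, 4, 1, 1]


def balance_round_robin(files: List[str], num_jobs: int) -> List[List[str]]:
    """Hypothetical helper: deal files out to jobs one at a time.

    Assumes any file may be staged by any job, which the real
    planner may not allow.
    """
    jobs: List[List[str]] = [[] for _ in range(num_jobs)]
    for i, name in enumerate(files):
        jobs[i % num_jobs].append(name)
    return jobs


total = sum(observed)
# Placeholder names standing in for the real input URLs.
files = [f"file_{i}" for i in range(total)]
balanced = [len(job) for job in balance_round_robin(files, len(observed))]

print("observed  min/max:", min(observed), max(observed))
print("balanced  min/max:", min(balanced), max(balanced))
```

      With the 2692 files from the listing spread over the same 16 jobs, each job would carry 168 or 169 files, instead of a range from 1 to 692.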

            Assignee:
            Karan Vahi
            Reporter:
            Mats Rynge [X] (Inactive)
            Watchers:
            2
