Pegasus / PM-1066

wget errors because of network issues

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: master, 4.6.0
    • Fix Version/s: master, 4.7.0, 4.6.1
    • Component/s: pegasus-transfer
    • Labels:
      None

      Description

      Hi Karan, Mats,

      Starting a few weeks ago, we've been running into a problem where jobs on Comet are unable to download the executable from Syracuse. Here's an example:

       <https://sugar-dev2.phy.syr.edu/pegasus/u/amber.lenon/r/119/w/2/j/19376/ji/10327/stderr>

      and the error is

       2016-02-24 01:49:53,195 INFO: Tool found: wget Version: 1.12 Path: /usr/bin/wget
       2016-02-24 01:49:53,195 INFO: /usr/bin/wget -nv --no-cookies --no-check-certificate -O '/data1/condor_local/execute/dir_3185356/glide_d7pv6O/execute/dir_3234931/pegasus.te9xVU/inspiral-NSBH01_INJ-L1_ID372' 'http://code.pycbc.phy.syr.edu/pycbc-software/v1.3.6/x86_64/composer_xe_2015.0.090/pycbc_inspiral'
       2016-02-24 01:49:53,210 ERROR: Command exited with non-zero exit code (4): /usr/bin/wget …
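
For reference, the GNU Wget manual documents exit status 4 as "network failure", which is consistent with the requests never reaching the server. The sketch below lists the documented exit statuses and shows one way a transient failure of that kind could be retried. It is only an illustration: `run_with_retries`, its parameters, and the retry policy are hypothetical and are not taken from pegasus-transfer.

```python
import subprocess
import time

# Exit statuses as documented in the GNU Wget manual ("Exit Status" section).
WGET_EXIT_STATUS = {
    0: "no problems occurred",
    1: "generic error code",
    2: "parse error",
    3: "file I/O error",
    4: "network failure",
    5: "SSL verification failure",
    6: "username/password authentication failure",
    7: "protocol errors",
    8: "server issued an error response",
}


def run_with_retries(cmd, retry_on=(4,), attempts=3, delay=0.0):
    """Run cmd (an argument list) and retry while its exit code is in retry_on.

    Hypothetical helper, not the pegasus-transfer implementation.
    Returns (exit_code, number_of_tries).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        code = subprocess.call(cmd)
        # Stop on success, on a non-retryable code, or when out of attempts.
        if code not in retry_on or attempt == attempts:
            return code, attempt
        # Simple linear back-off between tries (assumed policy).
        time.sleep(delay * attempt)
```

A call such as `run_with_retries(["/usr/bin/wget", "-nv", "-O", dest, url], delay=5)` would retry only on exit code 4 and return any other code immediately; `dest` and `url` are placeholders here.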

      From what I can tell from the Apache logs, these requests aren't even reaching us. Edgar ran the same wget from the command line on that machine and it worked:

       [1139] ligo@comet-18 ~$ wget -nv --no-cookies --no-check-certificate -O '/tmp/ada' 'http://code.pycbc.phy.syr.edu/pycbc-software/v1.3.6/x86_64/composer_xe_2015.0.090/pycbc_inspiral'
       2016-02-24 11:40:03 URL:http://code.pycbc.phy.syr.edu/pycbc-software/v1.3.6/x86_64/composer_xe_2015.0.090/pycbc_inspiral [80055063/80055063] -> "/tmp/ada" [1]
       [1140] ligo@comet-18 ~$ ls -lh /tmp/ada
       -rw-rw-r--. 1 ligo ligo 77M Feb 3 12:02 /tmp/ada

      Brian has also been able to run it from the command line after using condor_ssh_to_job to mimic the job environment as closely as possible.

      Can either of you think of anything about the environment that Pegasus sets up that could cause this?

      Thanks,

            People

            • Assignee: Mats Rynge (rynge)
            • Reporter: Duncan Brown (dbrown)