Feature #78

Randomize job state polling

Added by Eduardo Huedo over 1 year ago. Updated 4 months ago.

Status: Closed
Start date: 2010-06-07
Priority: Normal
Due date: 2010-06-15
Assignee: Eduardo Huedo
% Done: 100%
Category: -
Target version: 5.8

Description

GRAM2 (used in EGEE) does not scale well, which causes job state polling to fail with the following message:

Thu Mar 18 10:55:00 2010 [EM][E]: Job poll failed (connecting to the job manager failed. Possible reasons: job terminated, invalid job contact, network problems, ... (0)), will poll again.

An approach similar to the exponential backoff used in Ethernet (and for banning resources in GW) could alleviate this problem.

The proposal is to randomize polls, increasing the interval between polls when failures occur.
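
The idea can be illustrated with a minimal Python sketch. This is not the code that went into GW. The function name `next_poll_delay`, the `base` and `cap` values, and the choice to draw the delay from the upper half of the backoff window are all assumptions made for illustration.

```python
import random


def next_poll_delay(failures, base=30.0, cap=960.0, rng=random):
    """Return the seconds to wait before the next job state poll.

    failures: number of consecutive failed polls (0 after a success).
    base, cap: hypothetical minimum and maximum backoff windows, in seconds.
    rng: object providing uniform(a, b); injectable for reproducible tests.
    """
    # Clamp the exponent so very long failure streaks cannot overflow.
    exponent = min(failures, 16)
    # The backoff window doubles with each consecutive failure, up to the cap.
    window = min(cap, base * (2 ** exponent))
    # Randomize within the upper half of the window so that many jobs
    # polling the same job manager do not all fire at the same instant.
    return rng.uniform(window / 2.0, window)
```

A caller would keep a per-job failure counter, increment it on each failed poll, reset it to 0 after a successful poll, and schedule the next poll `next_poll_delay(failures)` seconds later.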

History

Updated by Eduardo Huedo over 1 year ago

  • % Done changed from 0 to 90

Updated by Eduardo Huedo over 1 year ago

  • Due date set to 2010-06-15
  • Target version set to 5.7
  • Estimated time set to 30.00

Updated by Eduardo Huedo about 1 year ago

  • Status changed from New to Assigned

Updated by Eduardo Huedo about 1 year ago

  • Status changed from Assigned to Resolved

Updated by Eduardo Huedo 10 months ago

  • % Done changed from 90 to 100

Updated by Eduardo Huedo 10 months ago

  • Estimated time deleted (30.00)

Updated by Eduardo Huedo 9 months ago

  • Status changed from Resolved to Closed

Updated by Eduardo Huedo 4 months ago

  • Target version changed from 5.7 to 5.8
