

Drop Fetch job if too long and go to next

Submitted by CashNexus on Tue, 2019-09-17 22:14

Hello David,
Sometimes there is a problem on the affiliate network's side and they can't properly serve a file for fetching - and then the command
/usr/bin/php -q fetch.php @ALL
just hangs for hours...

I've set up job_id as per
https://www.pricetapestry.com/node/5401
and I log the output with
/usr/bin/php -q fetch.php @ALL > /var/www/html/db/ptlogs/total_fetch.log ;
so I know which file is problematic, and under /feeds I see this job id:
tmp1221
tmp1221.ungzipped
...their size has not changed for 2 hours already... so something is wrong on the affiliate network's side and the download for this file is dead - it is not due to the file size.
Usually this is a temporary affiliate network problem, so next time the fetch of the same file may well be fine.
But it really hangs the whole fetch process.

My question is - is it possible to somehow skip a fetch that runs too long and go on to the next fetch job?
For example - set a timeout inside fetch.php so that if a fetch job has not finished within 60 minutes, it is simply dropped and the next one starts.

Ideally we could detect "tmp file size has not changed for an hour" for
tmp1221
because that is real proof the connection for this file is dead - so we could drop it and jump to the next job... but I have no idea whether such a solution can be implemented in PHP...

Thank you in advance for any ideas that could help with this situation, which does happen in real life :)

Submitted by support on Wed, 2019-09-18 08:12

Hi,

Sure - first make sure that you are using the CURL handler and that fetching a single feed manually still works by editing config.advanced.php and changing line 43 as follows;

  $config_automationHandler = "curl";

Then to set an operation time limit, edit includes/automation.php and look for the following code at line 149:

    curl_setopt($ch,CURLOPT_FILE,$fp);

...and REPLACE with:

    curl_setopt($ch,CURLOPT_FILE,$fp);
    curl_setopt($ch,CURLOPT_TIMEOUT,3600);

(the timeout is in seconds, so that gives the one hour mentioned)
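As an aside, cURL can also abort a transfer that has stalled, rather than waiting for the full hard timeout - which is closer to the "file size not changing" detection you described. A possible addition in the same place in includes/automation.php (the 100 bytes/sec and 300 second values below are only illustrative, not tested defaults):

```php
// In addition to the hard timeout, abort if the transfer averages
// under 100 bytes/sec for 300 consecutive seconds - i.e. the
// download has effectively stalled.
// (Threshold values are illustrative - tune them to your feeds.)
curl_setopt($ch,CURLOPT_LOW_SPEED_LIMIT,100);
curl_setopt($ch,CURLOPT_LOW_SPEED_TIME,300);
```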

Cheers,
David.
--
PriceTapestry.com

Submitted by CashNexus on Wed, 2019-09-18 18:03

Thank you, David! This is very helpful!
I've added the code as you advised - but before testing the total fetch, let me ask - is it possible to add timing to
/usr/bin/php -q fetch.php @ALL > /var/www/html/db/ptlogs/total_fetch.log ;

I mean usually the report is
fetching 360training06.xml
fetching 360training06.xml...[OK]
etc etc

Could it be something like
fetching 360training06.xml
fetching 360training06.xml...[OK]
Time spent 658 seconds
etc etc ?

It would seriously help to understand what the longest time is, in order to set the correct value in
curl_setopt($ch,CURLOPT_TIMEOUT,3600);

Best regards,

Submitted by support on Thu, 2019-09-19 10:33

Hi,

Sure - if you edit scripts/fetch.php and look for the following code beginning at line 27:

    $status = automation_run($job["id"]);
    print chr(13)."fetching ".$job["filename"]."...[".$status."] \n";

...and REPLACE with:

    $a = time();
    $status = automation_run($job["id"]);
    $b = time();
    print chr(13)."fetching ".$job["filename"]."...[".$status." (".($b-$a)."s)] \n";

...and that will output the time taken after the status, for example:

fetching 360training06.xml...[OK (16s)]
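If you would rather not edit the script at all, a similar effect can be had from the shell by timestamping each line of the log as it arrives - the gap between consecutive "fetching ..." lines then shows how long each feed took. (The echo below just stands in for the real /usr/bin/php -q fetch.php @ALL command.)

```shell
# Prefix every line of fetch output with a wall-clock timestamp.
# Replace the echo with: /usr/bin/php -q fetch.php @ALL
echo "fetching 360training06.xml...[OK]" | while IFS= read -r line; do
  printf '%s %s\n' "$(date '+%H:%M:%S')" "$line"
done
```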

Cheers,
David.
--
PriceTapestry.com

Submitted by CashNexus on Fri, 2019-09-20 06:34

Thank you VERY MUCH, David!
This is REALLY helpful!
Have a nice coming weekend !
Best regards,

Submitted by CashNexus on Sat, 2019-09-21 19:41

Hello David, a small question.
Automation Tool - New Job - there is an option "Abort if less than: ___ bytes".
Which file size is meant here?
The archived one (tar.gz or zip) - or the extracted one?
I suppose it should be the size of the archived datafeed - for example, with "Abort if less than 200 bytes", a datafeed "Test.zip" of 150 bytes would not be fetched?
Thanks in advance for a comment, for general understanding.
Nothing urgent :)
Have a nice weekend!

Submitted by support on Mon, 2019-09-23 08:16

Hi,

That's correct - it is the size of the actual download (before unzip). The purpose of this feature is to make sure that a good feed isn't overwritten with the result of, for example, a transient error on the affiliate network's servers where there was a problem generating the feed.
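To illustrate the idea, the same pre-unzip size check can be sketched from the shell - the filename and the 200-byte threshold here are just the example values from your question, not the actual Price Tapestry code:

```shell
# Illustrative only - not the actual Price Tapestry source.
# Keep the download only if its size BEFORE unzip is at least MIN bytes.
MIN=200
f="Test.zip"
size=$(wc -c < "$f" | tr -d ' ')
if [ "$size" -ge "$MIN" ]; then
  echo "keep download"
else
  echo "abort - file too small"
fi
```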

Cheers,
David.
--
PriceTapestry.com

Submitted by CashNexus on Mon, 2019-09-23 08:51

A VERY, VERY :) smart solution you have foreseen for fetch jobs!
Thanks for the explanations,
Best regards,
Serge