Hello,
I've been working with my host for several days now to try to get one complete feed successfully imported. No luck so far. My host has whitelisted my import files so the server won't stop them. I'm trying import.php from the command line through my SSH client and feeds_import.php from a browser.
In the first case - a feed with 3461 records - import.php from the command line usually imports about 2800 records. When I use feeds_import.php, it seems to do a little better, with 3400 records being the best so far. It's never the same number.
I've tried two other feeds. The first one has more than 18000 records and it stops importing at a little over 17000. The other one has about 58000 records, but stops at just under 15000.
I haven't used a stopwatch, but I'm guessing these all stop running after about the same amount of time. The first feed is much smaller, but it also has twice as many filters on it.
I'm completely stumped. The only other idea I had was to try to modify import.php with something like this:
ini_set("memory_limit","120M");
ini_set("max_execution_time","1800");
Those lines were given to me by a member of a PHP forum when I was having trouble a year or so ago with a custom feed import script. I figured I'd ask about it here before I go messing things up with my trial-and-error coding methods ;)
Thanks!
Jim
Hi David,
This is the error message that pops up from the terminal window:
Command 'php import.php myfeed.txt'
failed with return code 137 and error message
-bash: line 41 19340 Killed php import.php myfeed.txt
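A note on reading that return code: by shell convention, a status above 128 means the process was ended by a signal, and the signal number is the status minus 128. The sketch below only illustrates that arithmetic. The function name and the small signal table are made up for this example, and the signal numbers are the usual Linux values.

```php
<?php
// Decode a shell exit status. By shell convention, a status above 128
// means the process was terminated by signal number (status - 128).
// describe_exit_status() is a hypothetical helper, not part of Price Tapestry.
function describe_exit_status(int $status): string
{
    // Hand-written table of a few common signals (usual Linux numbering).
    $signals = [1 => 'SIGHUP', 2 => 'SIGINT', 9 => 'SIGKILL', 11 => 'SIGSEGV', 15 => 'SIGTERM'];

    if ($status > 128) {
        $signal = $status - 128;
        $name = $signals[$signal] ?? 'unknown signal';
        return "terminated by signal $signal ($name)";
    }

    return "exited with code $status";
}

echo describe_exit_status(137), "\n"; // prints "terminated by signal 9 (SIGKILL)"
```

A SIGKILL cannot be caught or ignored by the script itself, so it always comes from outside PHP (the kernel, a monitoring process, or an administrator).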
Thanks,
Jim
Thanks for the copy of the message, Jim.
It does still appear that an external guardian process is killing PHP when executing a script that takes over a preset time to execute; however you mentioned that your host had whitelisted import.php.
I've done a search on exit code 137 and it is associated with SIGKILL (128 plus signal number 9), which occurs when a process is forced to quit. Settings made with ini_set can also be overridden by the server configuration, so those statements may not be having any effect. To this end, I would contact support again and ask them to have another look at why your scripts are not being allowed unlimited execution time.
In the meantime, it would be possible for you to get started by artificially limiting the maximum number of records to be imported. You could do this by modifying includes/admin.php. On line 292 you will see the declaration of admin_import() as follows:
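One way to see whether the ini_set statements are having any effect is to read the values back straight after setting them. This is a minimal sketch, not part of import.php: try_ini_setting() is a hypothetical helper name, and the two values are the ones quoted earlier in the thread.

```php
<?php
// Ask PHP for a new setting, then read it back to see what is really in effect.
// try_ini_setting() is a hypothetical helper for illustration only.
function try_ini_setting(string $name, string $value): array
{
    $previous = ini_set($name, $value); // returns false if the change was refused

    return [
        'name'      => $name,
        'requested' => $value,
        'accepted'  => $previous !== false,
        'effective' => ini_get($name),
    ];
}

$wanted = ['memory_limit' => '120M', 'max_execution_time' => '1800'];
foreach ($wanted as $name => $value) {
    $result = try_ini_setting($name, $value);
    printf(
        "%s: requested %s, accepted %s, now %s\n",
        $result['name'],
        $result['requested'],
        $result['accepted'] ? 'yes' : 'no',
        var_export($result['effective'], true)
    );
}
```

Even when both settings are accepted, they only govern PHP's own limits. They do nothing to stop an outside process from sending SIGKILL.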
function admin_import($filename,$limit=0,$callback="")
To limit the process to 10,000 records you could use:
function admin_import($filename,$limit=10000,$callback="")
If the import process is then successful, it again confirms that safe mode or another configuration-based resource restriction is in place, which would need the support of your host to remove.
Luckily, hosting companies do seem to be understanding, and in the few instances where Price Tapestry users have found resource limitations getting in the way, their hosting company have agreed to remove the limits (in order to keep their business). In this case, it sounds like your hosting company are being cooperative but are having trouble removing the restriction - but hopefully they'll be able to sort it...
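To show what a record limit of this kind does, here is a small stand-alone sketch. It is not the Price Tapestry code: import_records_sketch() is a hypothetical stand-in, and it assumes that a limit of 0 means "no limit", in line with the default in the declaration above.

```php
<?php
// Hypothetical stand-in for an import loop; NOT the real admin_import().
// Assumption: $limit = 0 means "no limit", matching the default shown above.
function import_records_sketch(array $records, int $limit = 0): int
{
    $imported = 0;

    foreach ($records as $record) {
        if ($limit > 0 && $imported >= $limit) {
            break; // stop once the cap is reached
        }
        // ... a real importer would store $record in the database here ...
        $imported++;
    }

    return $imported;
}

$feed = range(1, 25); // pretend feed of 25 records

echo import_records_sketch($feed), "\n";     // prints 25 (no cap)
echo import_records_sketch($feed, 10), "\n"; // prints 10 (capped)
```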
Hope this helps,
Regards,
David.
Just another thought further to my previous post (see above)... As you mention that your host has a mechanism to "whitelist" individual scripts, I'm wondering if that might apply even down to the include level, in which case includes/admin.php (which contains the actual import routines for all import methods) may also have to be "whitelisted".
I can't find any information about a standard PHP security solution that uses whitelists so it may be something that your host has implemented themselves, but would be worth looking at.
Cheers,
David.
Hi David,
This is the latest from my hosting company:
"The processes are not being killed because of time execution. They are simply using too many CPU cycles for too long - basically 100% of them:
23557 myaccount 25 0 50640 7036 4348 R 99.4 0.2 2:49.70 php
You may remember my post from about a week ago when I was having this same problem. I actually changed hosting companies because I was getting shut down due to high CPU use when I ran the import scripts. I selected a new host (site5) based on their reputation for very fast servers and good customer service. I even selected their top tier account that offered the highest CPU usage. After all that I'm still having the same problem.
Do I need to be on a dedicated server to run these scripts?
Thanks,
Jim
Hi Jim,
There should be no need to require a dedicated server to run Price Tapestry - most users run the script with no problems on standard hosting accounts.
However, for you then to have resource problems on 2 successive hosts leads me to wonder if something else is going on - perhaps a feature or possibly a formatting error of the feeds you are using is causing PHP's Sablotron parser (the XML library upon which Magic Parser is based) to lock up or cause a significant memory leak.
Would it be possible for you to email the link to the feeds that you are using (for example the URL to http://www.yoursite.com/feeds/merchant.xml) and I'll download them and take a look....
Cheers,
David.
Hi David,
I sent an email with those links. Thank you.
As an experiment, I removed all the filters from one of the feeds and did an import. It was the smaller feed that was supposed to import 3461 records after filtering, but was stopping at 3400. With no filtering, that feed now imports 17357 of a possible 17639. Close, but it's still stopping short. I wouldn't necessarily say this is progress because I was able to import more records than that on other feeds. It's just using too much cpu.
These are all CJ feeds, btw.
Thanks,
Jim
Hi Jim,
Thanks for that - I've just sent you some test code to try out with these feeds.
Incidentally, do CJ provide an XML format at all?
Cheers,
David.
Hi David,
Yes, I believe CJ does provide an XML format. Would that be faster or have other advantages with PT?
For anyone browsing this forum and considering buying Price Tapestry, this issue was successfully resolved with some modified code provided by David.
Hi Jim,
XML format feeds are generally less prone to formatting errors in their generation, which may result in more products being imported. That is why I would always use XML in preference to CSV where both are available.
Cheers,
David.
David,
I'm having the same problem since moving my accounts over from a shared hosting account to a VPS account (the VPS400 at http://www.spry.com/cpanel-vps/).
I was having problems (segmentation faults) when importing large CSVs from CJ on the shared account but now can't import smaller feeds on the VPS account. I get the response "Killed" when executing import.php from SSH. I recently requested to have all my CJ feeds switched to XML but haven't had a chance to test.
Any ideas or things to check while I wait on CJ to change my feed format?
Bill Whelan
To add to my previous comment, I get the following from my web host:
Hello,
Perhaps the shared hosting environment it worked on had additional PHP modules or
configuration which you have not added to your VPS. Check the output of phpinfo() on
the working system and compare to the same on your VPS. Then rebuild PHP using
"Apache Update" in your WHM to match.
--
Spry
Bill Whelan
Hi Bill,
Is your server "killing" the import script immediately, or after a similar period each time, say 30 seconds? Your new account may be configured differently with regard to script execution time. To check the settings, you can use phpinfo as suggested by your host, something like:
phpinfo.php:
<?php
phpinfo();
?>
(look for max_execution_time). If this is the case, you can try the ways to disable this described in the following thread...
http://www.pricetapestry.com/node/582
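As an alternative to scanning the full phpinfo() page, the handful of relevant settings can be read directly with ini_get(). This is only a sketch: import_related_settings() is a hypothetical helper name, and the choice of directives is an assumption about what matters for a long import.

```php
<?php
// Collect the runtime settings most relevant to a long-running import.
// import_related_settings() is a hypothetical helper for illustration only.
function import_related_settings(): array
{
    $names = ['max_execution_time', 'memory_limit', 'max_input_time'];

    $settings = [];
    foreach ($names as $name) {
        $settings[$name] = ini_get($name); // false if the directive is unknown
    }
    $settings['sapi'] = PHP_SAPI; // e.g. "cli" from SSH, something else via the web server

    return $settings;
}

foreach (import_related_settings() as $name => $value) {
    echo str_pad($name, 20), var_export($value, true), "\n";
}
```

The command line and the web server can use different configurations (for example, the command-line version of PHP defaults max_execution_time to 0), so it is worth running this both from SSH and from a browser and comparing the two.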
Hope this helps,
Cheers,
David.
David,
I found out from the hosting company that they are running an app called PRM which is set to kill any process running over 40% cpu for longer than six minutes. I cancelled and got a dedicated server.
Bill Whelan
Is there a way to make the process "burn" fewer resources and be more optimized?
I know that it might slow down the whole import process, but in the end it's worth it, because anyone who is just starting out with the script cannot move on to a dedicated server until he/she gets some results out of it.
Hi,
See the following thread for code to reduce the server load during import....
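For readers without access to that thread, one common general technique is to pause briefly after every batch of records so that the average CPU use drops. The sketch below illustrates that idea only. It is not necessarily the code from the thread mentioned above, and process_with_pauses(), the batch size, and the pause length are all hypothetical.

```php
<?php
// Hypothetical throttled loop: handle each record, and sleep briefly after
// every $batchSize records to lower average CPU use. Not Price Tapestry code.
function process_with_pauses(
    array $records,
    callable $handler,
    int $batchSize = 500,
    int $pauseMicroseconds = 200000
): int {
    $count = 0;

    foreach ($records as $record) {
        $handler($record);
        $count++;

        // Pause after each full batch (a batch size of 0 disables pausing).
        if ($batchSize > 0 && $count % $batchSize === 0) {
            usleep($pauseMicroseconds);
        }
    }

    return $count;
}

// Example: ten dummy records, pausing 1 ms after every 5.
$handled = process_with_pauses(range(1, 10), function ($record) {
    // ... a real importer would store $record here ...
}, 5, 1000);

echo $handled, "\n"; // prints 10
```

The import takes longer overall, and whether this keeps a process under a host's threshold depends on how the host measures CPU use, so the numbers would need tuning by trial.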
Hello Jim,
When you run import.php from the command line, is there an error given by PHP indicating why the script aborted? Normally if there is a timeout due to safemode or other configuration restriction then you would see a message as such.
What are you seeing on your server?
(can you copy and paste from your SSH terminal into the forum - that would be handy to see)
Cheers,
David.