Support Forum



Feed import problems

Submitted by kend on Wed, 2007-06-13 20:15

The current problem is that the import script imports a different number of records each time. Using the same file and importing 7 different times, phpMyAdmin reports merchant item counts of 1867, 1889, 2183, 2048, 1705, 35, and 664. The last two import attempts took 1 minute 30 seconds and 45 seconds respectively.

The above file was 10.5 MB with 15,753 rows. Previously I tried a 3.3MB file with 6,000 records and that didn’t work either.

I ran the timeout script, which I found here on the forum, and after it ran for 20 minutes I turned it off. So timeout shouldn’t be a problem.

I cannot use a command line, so I use the admin interface page or the feed registration page. A symptom of failure is that the admin page ends up blank, with only the title and the filename it is operating on. No error is displayed. Database records are updated, though, and I can count new records, but the record count on the admin page (10) doesn’t change. About 50 find and replace filters are set up.

How should I further evaluate this problem? I’m on day 3! Thanks!!

Submitted by support on Wed, 2007-06-13 20:22

Hi,

Is this only happening with one particular feed?

Can you describe the failure mode in more detail - i.e. where the title and filename are displayed? Does the browser stop waiting for any more content and otherwise look as if it has finished loading?

Another test would be to make a copy of the feed and re-register it without any of the filters - it would be useful to know if that works OK as this might indicate a memory problem within PHP. It is also possible that error reporting is turned off on your server.

Also, if you'd like to email me the link to your site I'll happily take a look for you. Reply to your registration code or forum registration email is the easiest way to get me...

Cheers,
David.

Submitted by kend on Wed, 2007-06-13 20:56

The Firefox browser finishes on the following address displayed in the browser URL address bar, and has stopped loading. http://www.url.com/admin/feeds_import.php?filename=filename.txt

It is not waiting for content, and it is finished loading.

When I copy the exact same file and import it without filters, I get the same results. On three imports the following number of records were imported: 1914, 2276, 2145.

How much memory is recommended? The host doesn’t seem to indicate memory specs on their site.

Submitted by support on Wed, 2007-06-13 21:00

Hi Ken,

Thanks for the email - I've replied to that also.

It may be that Apache is timing out while waiting for content to be returned by the PHP script; as none ever comes, it eventually gives up and returns a blank document to the browser. I'll create an alternative version of the admin panel import script that displays progress in order to generate content during the import. As soon as I've written this I'll let you know and send you the modified import scripts...
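As a rough illustration of the idea only (this is not the actual modified import script, and the function names formatProgress and importWithProgress are made up for the example), a loop that prints a progress line every so many rows keeps content flowing back to the web server while the import runs:

```php
<?php
// Hypothetical helpers; a sketch of the "continuous output" idea only.

// Build one progress line, e.g. "importing feed.txt...[100/1500]<br />".
function formatProgress(string $filename, int $done, int $total): string
{
    return "importing " . $filename . "...[" . $done . "/" . $total . "]<br />\n";
}

// Walk the rows, printing a progress line every $every rows so the web
// server keeps receiving content while the import is still running.
function importWithProgress(array $rows, string $filename, int $every = 100): int
{
    $done = 0;
    $total = count($rows);
    foreach ($rows as $row) {
        // ... a real import would insert $row into the database here ...
        $done++;
        if ($done % $every === 0) {
            print formatProgress($filename, $done, $total);
            flush(); // ask PHP to pass its pending output on to the web server
        }
    }
    return $done;
}
```

Calling importWithProgress($rows, "filename.txt") on a few thousand rows would print one line per hundred rows rather than staying silent until the end.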

Cheers,
David.

Submitted by kend on Wed, 2007-06-13 22:23

Great, thanks!

"Thanks for the email - i've replied to that also."

I haven't received one yet - but tomorrow is fine too.

Oh, and to answer your other question above, this happens on more than one feed.

I first used the 136KB feed with about 1500 rows, and it finished importing. Then I used a 400 row feed, and it finished.

Then I moved on to the 3.3MB feed with 6000 rows, and had a bear of a time with it. I thought it was a feed problem and spent many hours deleting rows, eliminating semicolons and quotes, looking for blank fields, etc. The merchant is new and "iffy" too, further pointing to a feed problem.

So then I decided to get a well-established feed with a low probability of feed corruption (for want of a better phrase). The results I reported above are from that 10.5MB, well-established feed.

To summarize, 2 feeds with 1500 rows or less finished importing. 2 feeds with 6000 rows or more did not finish importing.

Submitted by kend on Thu, 2007-06-14 13:44

Hi David,

Just loaded up your modified version of admin_import.php. I see this one continuously outputs a count, and provides a link back to admin page when done.

Both feeds lost about 3% of product during import, and I'm putting that down to errors within the feed.

Both problematic feeds loaded completely first time. So it really looks great and I'm back in business!!

Thanks for your very fast support with this. I'm impressed.

Submitted by avirex421 on Thu, 2007-11-01 05:54

David,

Can I get a copy of the admin_import.php - I am running into the same issues as Kend

thanks
woody

Submitted by support on Thu, 2007-11-01 08:48

Hi Woody,

I've just emailed you the same feeds_import.php script that I sent to kend.

Cheers,
David.

Submitted by Diana on Fri, 2008-01-18 02:19

May I also get a copy? I also have problems importing files with more than 7000 products. I am on a shared Apache server.

When I get the time out error, the Admin shows no imported products. However, the merchant and some products will show up in search by merchant link.

I have tried trimming the feeds as much as possible to no avail. With all the editing, I am not able to take advantage of Category mapping or setting up filters.

Finally, is there a newer version than 6/07, and how does one upgrade?

Thanks, Diana

Submitted by support on Fri, 2008-01-18 09:17

Hello Diana,

I'll email you the continuous output version of feeds_import.php.

With regard to upgrades, there have been no feature inclusions for some time now; in some respects this is intentional, as most Price Tapestry users customise their sites quite heavily, which would actually make upgrading somewhat difficult. Instead, I've tended to document new features as requested by users through the forum, which means that people can implement them into their existing sites if desired.

Cheers,
David.

Submitted by teezone on Mon, 2008-01-21 19:57

Hi David, could I pls also get a copy of the continuous output version of feeds_import.php?

I have a merchant who doubled their feed to 25,000 lines - I filter it down to approx 8000, but now it hangs..

Thanks!
T.

Submitted by support on Mon, 2008-01-21 20:13

On its way!

Submitted by Alastair on Mon, 2008-01-21 20:25

It might be a useful feature to have the admin screen show, for each feed:

records in
records dropped
records loaded

Alastair

Submitted by coyote on Mon, 2008-01-21 23:54

hello

i would be interested to have this file too ;)

I posted some hours ago but my post doesn't appear, is that normal?

Tx
Coyote

Submitted by support on Tue, 2008-01-22 09:08

Hello Coyote,

As this file seems to help a number of people, I've posted it to the support folder:

feeds_import.zip

Extract feeds_import.php from that file, and upload to your /admin/ folder.

Cheers,
David.

Submitted by AD_Mega on Thu, 2008-03-06 06:58

I seem to have the same problem. I uploaded the new file and it say 700 products were imported but my admin says only 480. The data feed has 3460 products.

Submitted by support on Thu, 2008-03-06 08:05

Hi Adrian,

Feed corruption is often the cause of a big difference between the number of expected products and the number imported. Each record must have the required fields (Product Name, Buy URL and Price) and Product Name must be unique for that merchant.

The continuous output shows all records processed, but not necessarily imported, so this indicates one of the above problems for several of the records processed, and could indicate a quality issue with the feed.
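To make the difference between "processed" and "imported" concrete, here is a minimal sketch of those rules. The function name tallyImport and the array keys name, buy_url and price are assumptions for the example and are not the names used in the Price Tapestry code:

```php
<?php
// Hypothetical tally of processed vs imported records for one merchant.
// Keys 'name', 'buy_url' and 'price' are illustrative, not the real schema.
function tallyImport(array $records): array
{
    $seenNames = [];
    $stats = ['processed' => 0, 'imported' => 0, 'dropped' => 0];
    foreach ($records as $record) {
        $stats['processed']++;
        $name  = trim((string)($record['name'] ?? ''));
        $url   = trim((string)($record['buy_url'] ?? ''));
        $price = trim((string)($record['price'] ?? ''));
        // Drop records missing a required field or repeating a product name.
        if ($name === '' || $url === '' || $price === '' || isset($seenNames[$name])) {
            $stats['dropped']++;
            continue;
        }
        $seenNames[$name] = true;
        $stats['imported']++;
    }
    return $stats;
}
```

Running something like this over a feed before importing gives a quick idea of how many rows are likely to be rejected for missing fields or duplicate names.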

If you'd like me to take a look, email me a link to the feed (in the /feeds folder on your server) and I'll download it to my test server and check it out...

Cheers,
David.

Submitted by Alastair on Wed, 2008-04-02 14:25

Hello

I've just tried to import the first feed that I've uploaded directly from a merchant via performics (rather than my others where I've mucked around with them in Excel first). I applied a filter to drop records unless they contained certain text. I expect only about 5% or less to be imported from a 90MB feed. It hung and then fell over with the following messages in the error log. Any ideas please David? Thank you

[Wed Apr 02 06:59:24 2008] [error] [client 99.224.81.214] PHP Notice: Undefined index: in /var/www/html/includes/widget.php on line 14, referer: http://www.bdsave.com/ptadmin/feeds_filters_configure.php?filename=ecost.txt&id=12
[Wed Apr 02 07:01:29 2008] [warn] [client 99.224.81.214] Timeout waiting for output from CGI script /home/virtual/site2/fst/var/www/interpreters/php-script, referer: http://www.bdsave.com/ptadmin/feeds_filters.php?filename=ecost.txt
[Wed Apr 02 07:01:29 2008] [error] [client 99.224.81.214] Premature end of script headers: php-script, referer: http://www.bdsave.com/ptadmin/feeds_filters.php?filename=ecost.txt
[Wed Apr 02 07:03:29 2008] [warn] [client 99.224.81.214] Timeout waiting for output from CGI script /home/virtual/site2/fst/var/www/interpreters/php-script, referer: http://www.bdsave.com/ptadmin/feeds_filters.php?filename=ecost.txt

Submitted by support on Wed, 2008-04-02 14:28

Hello Alastair,

This sounds like a timeout problem.... Can you check through the info in the following thread:

http://www.pricetapestry.com/node/582

In particular, as this looks like a web server timeout rather than a PHP one, have a look to see if it is an option for you to access your account by SSH and import using the command line.
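To illustrate the distinction, the sketch below (prepareLongImport is a made-up helper name, not part of Price Tapestry) lifts the limits that PHP itself controls and reports how the script is being run. It does nothing about the web server's own timeout, which is what the "Timeout waiting for output from CGI script" log lines point to; run from the command line there is no web server in the way at all.

```php
<?php
// Hypothetical helper: lift PHP's own limits and report what is in effect.
// Neither call changes the web server's timeout for a silent script.
function prepareLongImport(): array
{
    set_time_limit(0);       // remove PHP's execution time limit for this run
    ignore_user_abort(true); // keep running if the browser disconnects
    return [
        'sapi' => php_sapi_name(), // "cli" when run from the command line
        'max_execution_time' => ini_get('max_execution_time'),
    ];
}
```

If the returned sapi value is "cli", the import is running from the command line and only PHP's limits apply.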

I will also send you an alternative version of MagicParser.php that is faster with CSV files (but not as robust)...

Cheers,
David.

Submitted by Mark on Wed, 2008-12-17 23:27

I've tried using feeds_import.php and it gets to the last record and then just stops on the bigger feeds - Any ideas?

Submitted by support on Thu, 2008-12-18 11:59

Hi Mark,

I'll send you the faster CSV version of Magic Parser to try - don't forget of course the timeout help and advice in the following thread...

http://www.pricetapestry.com/node/582

Cheers,
David.

Submitted by Randza on Thu, 2010-04-22 16:05

Hello,

I would like to ask you if you could send me an example of feeds so that i can test my configuration.

Thank you

Randza

Submitted by support on Thu, 2010-04-22 16:10

Hello Randza,

I will email you a sample feed to be getting on with...

Cheers,
David.

Submitted by stevewales20 on Mon, 2010-05-24 20:11

Hey David,

I seem to be having issues downloading the latest import_feeds. The file doesn't seem to be present on the server.

Oh and enjoy your much deserved break..

Submitted by support on Tue, 2010-05-25 08:22

Hello Steve,

I've re-instated the download for both distributions of Price Tapestry...

Original Version (0106A)

Latest Version (1109A)

Cheers,
David.

Submitted by stevewales20 on Tue, 2010-05-25 20:22

Cheers David :)

Submitted by stevewales20 on Tue, 2010-05-25 20:31

Hey david,

I've just checked out the feed now and i'm still having issues with:
Premature end of script headers: feeds_import.php

I have used the above script in the hope it would fix this problem. It doesn't however. From my understanding the script should print a line of text each time it occurs? It doesn't seem to be doing this.

I only have issues on two large feeds above a few MB in size. The smaller ones import fine.

Any suggestions?

Submitted by support on Wed, 2010-05-26 21:47

Hi Steve,

I've updated the above to include PHP's flush() statement to ensure that output is generated instead of being cached until completed - fingers crossed that should help...

Cheers,
David.

Submitted by stevewales20 on Thu, 2010-05-27 17:54

Sorry David, that didn't do the trick bud.

The error log doesn't show any different than the above.

I did contact the host to see if it was something being done on the server. They assured me it wasn't.

I'm going to run some tests and a timer and see how long before it times out. It seems to run for over 2 mins and then die. So it would seem the set time limit is flowing past the default.

Could it be something else causing the premature end of headers? Perhaps i can setup some error catching for it?

Would it be the browser not keeping the server active? Just thinking aloud for solutions.

Sorry to be hassling you when you're busy.

Kind Regards
Steve

Submitted by stevewales20 on Thu, 2010-05-27 18:46

Hey david,

Sorry if I'm not interpreting this right, but I can't see how the variable $callback is populated in the admin_import function.

Also, the function is not called and the variable $progress is not populated? I may be overlooking this. Since I have only uploaded the feeds_import.php script, I'm assuming these variables are populated elsewhere?

Thanks
Steve

Submitted by support on Thu, 2010-05-27 22:42

Hi Steve,

feeds_import.php passes an optional parameter in the call to the admin_import() in includes/admin.php containing the name of a callback function - in this case a function actually called "callback" in feeds_import.php.

It might be worth double-checking that your includes/admin.php is attempting to call the callback function every 100 records during import. In that file, check for the following block of code which starts at line 336 in the distribution:

    if ($admin_importCallback)
    {
      if (!($admin_importProductCount % 100))
      {
        $admin_importCallback($admin_importProductCount);
      }
    }

...plus double check that the $admin_importProductCount itself is being updated.
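If the mechanism is unclear, this stripped-down sketch shows the same pattern in isolation: a function name is passed in as a string and PHP calls it as a variable function on every 100th record. The names importSketch and callbackSketch are made up for the example and are not the distribution code:

```php
<?php
// Simplified stand-in for the import loop; both function names are hypothetical.
function importSketch(array $rows, string $callbackName = ''): int
{
    $count = 0;
    foreach ($rows as $row) {
        $count++;
        // Call the named function on every 100th record.
        if ($callbackName && !($count % 100)) {
            $callbackName($count); // variable function call by name
        }
    }
    return $count;
}

// Receives the running count each time it is called.
function callbackSketch(int $progress): void
{
    print "processed " . $progress . "\n";
}
```

Calling importSketch($rows, 'callbackSketch') prints a line only when the count reaches a multiple of 100, so if the counter is never incremented, or no callback name is passed in, nothing is printed.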

If that all looks OK, could you try as a first test re-registering the feed and, instead of using "Register and Import", use "Register and Trial Import", which shouldn't have any timeout issues, and let me know if that works.

If it does, but clicking on Import still causes the problem, would it be possible to have temporary FTP access to your installation so that I can check it out for you? There are no obvious reasons for that particular error (there are many Google search results relating to it, but nothing conclusive as to what can cause it), so I'll add some debug code and should be able to work it out...

If you could email me temporary FTP details that would be ideal; otherwise, if you could send me your existing includes/admin.php I'll add some debug code to that and feeds_import.php for you (I take it you're running the latest distribution...)

Cheers,
David.

Submitted by stevewales20 on Sat, 2010-05-29 14:49

Hey david,

I wrote to a text file within the function that updates the $admin_importProductCount. It functioned correctly. It also showed the product numbers where the script was timing out. It was like 4500 products on both. It varied a little. The one fell short of a 1000 products and the other like 3000.

From what I can see, it doesn't seem to be running the callback function.

    if ($admin_importCallback)
    {
      if (!($admin_importProductCount % 100))
      {
        $myFile = "../admin/error.txt";
        $fh = fopen($myFile, 'a') or die("can't open file");
        $stringData = $admin_importProductCount;
        fwrite($fh, $stringData);
        fclose($fh);
        $admin_importCallback($admin_importProductCount);
      }
    }

I did this to try and check whether it was in fact being called. It doesn't seem to be.
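For anyone repeating this kind of check, a slightly tidier variant is sketched below. The helper name logProgress and the log path are made up for the example. It writes one timestamped count per line, so consecutive values do not run together in the file:

```php
<?php
// Hypothetical debug helper: append one timestamped count per line.
function logProgress(string $path, int $count): bool
{
    $line = date('H:i:s') . ' ' . $count . "\n";
    // FILE_APPEND adds to the end of the file; LOCK_EX avoids interleaved writes.
    return file_put_contents($path, $line, FILE_APPEND | LOCK_EX) !== false;
}
```

Something like logProgress("../admin/error.txt", $admin_importProductCount) placed just before the callback call would then show both whether that branch is reached and roughly when.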

I also did as you suggested and re-registered the feed. It does work fine, as did the other feeds that were under a certain size. It seems they only time out over a certain size.

I'm not running the latest distribution. I have changed the code a bit. My plan is to install the latest distribution and add the tweaks i've made. I'd like to get this working in the time being though.

I have created a temporary ftp access account for yourself. I'll send the details via email. They're unlocked for the next 7 days.

Thanks Again.
Steve

Submitted by support on Sun, 2010-05-30 21:54

Hi Steve,

Could you try the following as an experiment to confirm whether the callback is being called but just not generating any output? In the modified admin/feeds_import.php, replace the existing callback function:

  function callback($progress)
  {
    global $feed;
    print "importing ".$feed["filename"]."...[".$progress."/".$feed["products"]."]<br />";
    flush();
  }

...with just:

  function callback($progress)
  {
    print "Here";
    exit();
  }

This will indicate whether it's being called, but (for some reason which we can then investigate), flush() etc. is still not sending the output as far as the browser...
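One possible reason, offered only as something to rule out, is that PHP output buffering is switched on, in which case flush() on its own leaves the text sitting in PHP's buffer. A sketch of a helper that closes any active output buffers before flushing is shown here; flushAllOutput is a made-up name, and buffering or compression in the web server itself could still hold the output back:

```php
<?php
// Hypothetical helper: close every active PHP output buffer, then flush.
function flushAllOutput(): int
{
    $closed = 0;
    while (ob_get_level() > 0) {
        // Send this buffer's contents down a level; stop if it cannot be closed.
        if (!@ob_end_flush()) {
            break;
        }
        $closed++;
    }
    flush(); // pass PHP's pending output on to the web server
    return $closed;
}
```

Calling flushAllOutput() in place of the bare flush() in the callback would show whether PHP-level buffering was what was holding the progress lines back.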

Thanks,
David.

Submitted by stevewales20 on Wed, 2010-06-02 19:23

Hey david,

sorry for the late reply, my little boy was taken into hospital and so i've been a little pre occupied.

I ran the test and it doesn't seem to be calling that function. I had the same timeout message as before.

Submitted by support on Fri, 2010-06-04 05:25

Hello Steve,

Sorry to hear your news; I hope everything works out OK.

Could you perhaps email me the versions of includes/admin.php and admin/feeds_import.php running on your site and I'll have a look as well as consider other ways to provoke output for you...

Cheers,
David.