You are here:  » Updated import into temporary table mod for latest distribution


Updated import into temporary table mod for latest distribution

Submitted by support on Mon, 2010-03-29 08:49 in

Note: Zero down-time mod is now included in the latest distribution - more info

Hi everyone,

I'm just re-documenting for the latest distributions (11/09A onwards) the modification described at...

http://www.pricetapestry.com/node/2554#comment-10312

...for importing @ALL into a temporary table rather than the live products table; which enables very large sites to be updated with almost no down-time as the final stage of the process to swap the temporary table `products_import` over to become the live `products` table is very quick.

includes/admin.php

Look for the following code on line 366 in 12/10B, 321 in 12/10A, 309 in 11/09A:

    $sql = sprintf("INSERT INTO `".$config_databaseTablePrefix."products` SET

...and REPLACE this with:

    global $importALL;
    if ($importALL)
    {
      $table = "products_import";
    }
    else
    {
      $table = "products";
    }
    $sql = sprintf("INSERT INTO `".$config_databaseTablePrefix.$table."` SET

Look for the following code on line 507 in 12/10B, 451 in 12/10A, 441 in 11/09A:

    $sql = "DELETE FROM `".$config_databaseTablePrefix."products` WHERE filename='".database_safe($admin_importFeed["filename"])."'";
    database_queryModify($sql,$insertId);

...and REPLACE this with:

    global $importALL;
    if ($importALL)
    {
      $table = "products_import";
    }
    else
    {
      $table = "products";
      $sql = "DELETE FROM `".$config_databaseTablePrefix.$table."` WHERE filename='".database_safe($admin_importFeed["filename"])."'";
      database_queryModify($sql,$insertId);
    }

...and finally look for the following code on line 515 in 12/10B, 459 in 12/10A, 449 in 11/09A:

    $sql = "SELECT COUNT(*) AS productCount FROM `".$config_databaseTablePrefix."products` WHERE filename='".database_safe($admin_importFeed["filename"])."'";

...and REPLACE this with:

    $sql = "SELECT COUNT(*) AS productCount FROM `".$config_databaseTablePrefix.$table."` WHERE filename='".database_safe($admin_importFeed["filename"])."'";

Then in scripts/import.php

Look for the following code starting at line 54 in 12/10B,12/10A, 53 in 11/09A:

  if ($filename == "@ALL")
  {
    $sql = "TRUNCATE `".$config_databaseTablePrefix."products`";
    database_queryModify($sql,$result);
    $sql = "SELECT * FROM `".$config_databaseTablePrefix."feeds`";
    if (database_querySelect($sql,$rows))
    {
      foreach($rows as $feed)
      {
        if (file_exists("../feeds/".$feed["filename"]))
        {
          import();
        }
      }
    }
  }

...and REPLACE this entire block with the following code:

  if ($filename == "@ALL")
  {
    $importALL = TRUE;
    $sql = "DROP TABLE IF EXISTS `".$config_databaseTablePrefix."products_import`";
    database_queryModify($sql,$result);
    $sql = "CREATE TABLE `".$config_databaseTablePrefix."products_import` LIKE `".$config_databaseTablePrefix."products`";
    database_queryModify($sql,$result);
    $sql = "ALTER TABLE `".$config_databaseTablePrefix."products_import` DISABLE KEYS";
    database_queryModify($sql,$result);
    $sql = "SELECT * FROM `".$config_databaseTablePrefix."feeds`";
    if (database_querySelect($sql,$rows))
    {
      foreach($rows as $feed)
      {
        if (file_exists("../feeds/".$feed["filename"]))
        {
          import();
        }
      }
    }
    $sql = "ALTER TABLE `".$config_databaseTablePrefix."products_import` ENABLE KEYS";
    database_queryModify($sql,$result);
    $sql = "DROP TABLE `".$config_databaseTablePrefix."products`";
    database_queryModify($sql,$result);
    $sql = "RENAME TABLE `".$config_databaseTablePrefix."products_import` TO `".$config_databaseTablePrefix."products`";
    database_queryModify($sql,$result);
  }

Submitted by paddyman on Tue, 2010-04-20 06:47

Great Mod David,

Great to have no downtime on a 30,000 product site.

Many thanks

Adrian

Submitted by allanch on Fri, 2011-09-16 07:11

Hi David,

Can you tell me if this also works when you run /scripts/cron.php using http invocation? Thanks

Submitted by support on Fri, 2011-09-16 07:41

Hi Allan,

Here's the corresponding change for scripts/cron.php. Look for the following code at line 77:

  $sql = "TRUNCATE `".$config_databaseTablePrefix."products`";
  database_queryModify($sql,$result);

...and REPLACE with:

  $importALL = TRUE;
  $sql = "CREATE TABLE `".$config_databaseTablePrefix."products_import` LIKE `".$config_databaseTablePrefix."products`";
  database_queryModify($sql,$result);
  $sql = "ALTER TABLE `".$config_databaseTablePrefix."products_import` DISABLE KEYS";
  database_queryModify($sql,$result);

Finally look for the following code at line 94:

  print chr(13)."backfilling reviews... ";

...and REPLACE with:

  $sql = "ALTER TABLE `".$config_databaseTablePrefix."products_import` ENABLE KEYS";
  database_queryModify($sql,$result);
  $sql = "DROP TABLE `".$config_databaseTablePrefix."products`";
  database_queryModify($sql,$result);
  $sql = "RENAME TABLE `".$config_databaseTablePrefix."products_import` TO `".$config_databaseTablePrefix."products`";
  database_queryModify($sql,$result);
  print chr(13)."backfilling reviews... ";

Cheers,
David.
--
PriceTapestry.com

Submitted by allanch on Fri, 2011-09-16 08:08

Thanks for the quick response David, I'll give this a try!

Submitted by allanch on Mon, 2011-09-19 06:27

Hi David, I gave this a try and it did work in so far as a temp table is created and then it is renamed once it is finished to pt_products. However, when the table is being created (takes about ~30 min) the website is still down. Do you have any advice on what I might be doing wrong? Thanks

Submitted by support on Mon, 2011-09-19 07:57

Hi Allan,

Could you email me your modified includes/admin.php and scripts/cron.php and i'll check it out...

Cheers,
David.
--
PriceTapestry.com

Submitted by AD_Mega on Mon, 2011-11-28 03:10

Hi David

The products_import table was created but the table wasn't moved to the products table. I think the import stopped before importing all feeds. I think I want to import only modified feed. Will this still work.

Submitted by support on Mon, 2011-11-28 09:24

Hi Adrian,

That would be a sign of the import process terminating before completion. Prior to implementing this were you able to run an import.php @ALL to completion, and were you using the same method of invocation (e.g. letting import.php run via your cron job / automation script rather than manually)?

Unfortunately the method isn't possible with @MODIFIED as all products have to be imported into the temporary table in order for it to be simply swapped over to become the live table; but if @ALL was working previously we should be able to get to the bottom of why it didn't complete using this mod...

Cheers,
David.
--
PriceTapestry.com

Submitted by Convergence on Fri, 2012-05-11 03:20

Hi David,

Is there a required mod for import_slow.php?

Thanks!

Submitted by support on Fri, 2012-05-11 07:03

Hi,

Sure - scripts/import_slow.php can be modified for the temporary table method. First, note the bug that has just been identified which you may have already seen in this comment so if you could first of all re-extract that file from the latest distribution and then continue to modify as follows....

Look for the following code at line 158:

    if ($_GET["filename"]=="@ALL")
    {
      $sql = "SELECT * FROM `".$config_databaseTablePrefix."feeds` WHERE format LIKE 'csv%' AND imported < '".$startTime."' LIMIT 1";
    }

...and REPLACE with:

    if ($_GET["filename"]=="@ALL")
    {
      $importALL = TRUE;
      if (!isset($_GET["progress"]))
      {
        $sql = "DROP TABLE IF EXISTS `".$config_databaseTablePrefix."products_import`";
        database_queryModify($sql,$result);
        $sql = "CREATE TABLE `".$config_databaseTablePrefix."products_import` LIKE `".$config_databaseTablePrefix."products`";
        database_queryModify($sql,$result);
        $sql = "ALTER TABLE `".$config_databaseTablePrefix."products_import` DISABLE KEYS";
        database_queryModify($sql,$result);
      }
      $sql = "SELECT * FROM `".$config_databaseTablePrefix."feeds` WHERE format LIKE 'csv%' AND imported < '".$startTime."' LIMIT 1";
    }

Look for the following code at line 322:

      admin_importReviews();

...and REPLACE with:

      if ($importALL)
      {
        $sql = "ALTER TABLE `".$config_databaseTablePrefix."products_import` ENABLE KEYS";
        database_queryModify($sql,$result);
        $sql = "DROP TABLE `".$config_databaseTablePrefix."products`";
        database_queryModify($sql,$result);
        $sql = "RENAME TABLE `".$config_databaseTablePrefix."products_import` TO `".$config_databaseTablePrefix."products`";
        database_queryModify($sql,$result);
      }
      admin_importReviews();

Cheers,
David.
--
PriceTapestry.com

Submitted by Convergence on Fri, 2012-05-11 15:40

Hi David,

Very nice. Thank you. Course I forgot that it was for @ALL only :) - which is just fine. Running two at the same time on the same server and no noticeable load issues.

Thanks for the great support. "You da man!"

Submitted by Convergence on Fri, 2012-05-11 16:29

Hi David,

It is actually deleting the merchants and their products. Admin shows the feed imported. However, the merchants no longer appears on the site.

Thoughts?

Submitted by Convergence on Fri, 2012-05-11 16:37

Hello again, David -

It appears to import one merchant into the products_import table then deletes just that one merchant. Then goes on to the next and the pattern repeats.

???

Thanks!

Submitted by support on Sun, 2012-05-13 09:48

Hi Convergence,

Sorry about that - the code above wasn't quite correct to ensure the table copy and key disable only occurs on the first invocation; that's why it's appearing to reset after every feed. I've updated the instructions above; in your currently modified version where you have:

      if (!$progress && !$offset)

...REPLACE with:

      if (!isset($_GET["progress"]))

Cheers,
David.
--
PriceTapestry.com

Submitted by Convergence on Sun, 2012-05-13 20:32

Hi David,

NOW we're cooking with gas.

Pefect, thank you!

Submitted by Convergence on Sun, 2012-05-13 21:09

Hi David,

Sorry, spoke too soon.

It's still deleting the existing merchants and their products from the site. However, the merchants and their products are in the temp table...

Submitted by support on Mon, 2012-05-14 08:29

Hi,

Just to confirm; so once the process has completed, the original products table has been dropped, but product_import not renamed to products?

Thanks,

Cheers,
David.
--
PriceTapestry.com

Submitted by Convergence on Mon, 2012-05-14 15:02

Hi David - Happy Monday!

Reran it again on three installs:

Basically it is still doing the same. Removing merchants and products after processing the feed. The very last feed processed remains in the temp table and the temp import table does not get renamed to products. The only products in the temp import table after the last feed is processed are those of the last feed.

Clear as mud? :)

Thanks!

Submitted by support on Mon, 2012-05-14 16:16

Hi,

I just worked this through on my test server; there was a minor typo in the above this line;

      if ($importAll)

..should be:

      if ($importALL)

(PHP variable names are case sensitive - corrected above) - and it then went on to work as expected. If you could make that correctly; and if still not working correctly if you could email me your modified

includes/admin.php
scripts/import_slow.php

...I'll check them out for you...

Cheers,
David.
--
PriceTapestry.com

Submitted by Convergence on Thu, 2012-05-17 20:34

Hi David,

Don't know if I should say "Wow" first or "Thank you".

So...

WOW and Thank you!

Just imported via import_slow.php using 3 crons on 3 installs on one site. The installs share a common feed folder. Approximately 1.2 million rows of datafeeds in under two hours. All with NO load on the server as if it weren't even running.

That is just awesome.

Thank you again!

Submitted by chrisst1 on Mon, 2012-06-25 16:32

Hi David

I've had this mod working fine with cron.php for a long time with no problems but i've recently done some filter updates and found that the import.php @all via our admin works the pre mod way even though its been upgraded. No product_import table is created I checked via phpmyadmin during the import, could you have a quick look over what i've got for any errors:

{code saved}

Thanks

Chris

Submitted by support on Tue, 2012-06-26 08:33

Hi Chris,

The modifications described above are only for scripts/import.php (CRON based import) and scripts/import_slow.php (CRON based slow import) - you mentioned doing an Import All from /admin/ so I assume that the Import All option on the Slow Import Tool; in which case the same changes will be required there.

Straight forward to do; in admin/feeds_import_slow.php look for the following code at line 157:

    $sql = "SELECT COUNT(*) AS productCount FROM `".$config_databaseTablePrefix."products` WHERE filename='".database_safe($admin_importFeed["filename"])."'";

...and REPLACE with:

    global $importALL;
    if ($importALL)
    {
      $table = "products_import";
    }
    else
    {
      $table = "products";
    }
    $sql = "SELECT COUNT(*) AS productCount FROM `".$config_databaseTablePrefix.$table."` WHERE filename='".database_safe($admin_importFeed["filename"])."'";

Next look for the following code at line 178:

    if ($_GET["filename"]=="@ALL")
    {
      $sql = "SELECT * FROM `".$config_databaseTablePrefix."feeds` WHERE format LIKE 'csv%' AND imported < '".$_GET["startTime"]."' LIMIT 1";
    }

....and REPLACE with:

    if ($_GET["filename"]=="@ALL")
    {
      $importALL = TRUE;
      if (!isset($_GET["progress"]))
      {
        $sql = "DROP TABLE IF EXISTS `".$config_databaseTablePrefix."products_import`";
        database_queryModify($sql,$result);
        $sql = "CREATE TABLE `".$config_databaseTablePrefix."products_import` LIKE `".$config_databaseTablePrefix."products`";
        database_queryModify($sql,$result);
        $sql = "ALTER TABLE `".$config_databaseTablePrefix."products_import` DISABLE KEYS";
        database_queryModify($sql,$result);
      }
      $sql = "SELECT * FROM `".$config_databaseTablePrefix."feeds` WHERE format LIKE 'csv%' AND imported < '".$_GET["startTime"]."' LIMIT 1";
    }

Finally, look for the following code at line 291:

        $refresh = $refreshBase."feeds_import_slow.php";

...and REPLACE with:

        if ($importALL)
        {
          $sql = "ALTER TABLE `".$config_databaseTablePrefix."products_import` ENABLE KEYS";
          database_queryModify($sql,$result);
          $sql = "DROP TABLE `".$config_databaseTablePrefix."products`";
          database_queryModify($sql,$result);
          $sql = "RENAME TABLE `".$config_databaseTablePrefix."products_import` TO `".$config_databaseTablePrefix."products`";
          database_queryModify($sql,$result);
        }
        admin_importReviews();
        $refresh = $refreshBase."feeds_import_slow.php";

Cheers,
David.
--
PriceTapestry.com

Submitted by Bluetoffee on Wed, 2012-07-18 09:40

Hi David,

I'm just getting to grips with Price Tapestry and this mod is fantastic.

It's exactly the functionality I wanted after seeing that "import all" deleted the entire contents of the products table and then rebuilt it. With almost 2 million records in my products table, this mod forms a vital piece of my overall load process.

Cheers,

Nick

Submitted by support on Wed, 2012-07-18 10:56

Hello Nick,

Welcome to the forum and thank you for your comments!

Cheers,
David.
--
PriceTapestry.com

Submitted by henk on Sun, 2012-08-19 19:51

Hi David,

I have an error on row 320:

Parse error: syntax error, unexpected T_VARIABLE in /admin/feeds_import_slow.php on line 320

Thx Henk

Submitted by support on Mon, 2012-08-20 10:05

Hello Henk,

It was a missing ";" after admin_importReviews() on the second to last line of the last modification; corrected above.

Cheers,
David.
--
PriceTapestry.com

Submitted by henk on Sun, 2012-12-02 10:48

Hi David,

I have tried this on command line with @ALL but it delete merchants one at the time and emptied the front, after all is import to temp then it goes to normal again.

Is this normal or is it only on command line and not with cron update ( haven't tried this yet)

Thx
Henk

Submitted by support on Sun, 2012-12-02 12:00

Hi Henk,

That sounds like the initial modifications (see first post) haven't been completed, specifically this replacement:

    global $importALL;
    if ($importALL)
    {
      $table = "products_import";
    }
    else
    {
      $table = "products";
      $sql = "DELETE FROM `".$config_databaseTablePrefix.$table."` WHERE filename='".database_safe($admin_importFeed["filename"])."'";
      database_queryModify($sql,$insertId);
    }

...which ensure that if an $importAll is in progress then the DELETE FROM doesn't happen. That should be all it is, complete the modificiations in the first post first, then the modifications for scripts/cron.php described above and that should do the trick...

Cheers,
David.
--
PriceTapestry.com

Submitted by AD_Mega on Sun, 2012-12-02 18:40

Hi David

I have a large site 5M plus products. Is there a way to modify this so when a feed is finished importing into the temp table it then is move to the live product table for that feed. Having two large tables causes problems with my server. I have got my server working ok with 5 million products buy have double the number of products really slows things down.

Thanks.

Submitted by support on Mon, 2012-12-03 09:39

Hi Adrian,

It would be straight forward to do that as MySQL has a built in INSERT by SELECT so if you would like to email me your includes/admin.php and whichever /scripts/ file contains the temporary table mod (import.php or cron.php) that you are using I'll take a look for you...

Cheers,
David.
--
PriceTapestry.com