

Speed up insert query

Submitted by sirmanu on Fri, 2016-10-28 09:34

Hi David.
I have noticed that in PT, the method for inserting new rows is always one by one (for example admin__importRecordHandler in admin.php, after the Magic Parser has returned the value).

As you probably know, INSERT statements that use the VALUES syntax can insert multiple rows.
Insert documentation

Also, this page is excellent:

Speed of INSERT Statements

If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements. If you are adding data to a nonempty table, you can tune the bulk_insert_buffer_size variable to make data insertion even faster. See Section 6.1.5, “Server System Variables”.

So what I think would be a huge performance optimization is to wait until we have, for example, 100 rows, and then do a bulk insert.

Submitted by support on Fri, 2016-10-28 10:01

Hi sirmanu,

I'll investigate bulking up INSERTs - a possible issue would be duplicate prevention and how this is handled within a single query...
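For what it's worth, one standard MySQL way to keep duplicate prevention inside a single multi-row statement is INSERT IGNORE against a UNIQUE index. A sketch only - whether Price Tapestry's pt_products table actually has a UNIQUE index on dupe_hash is an assumption here, and the column names are illustrative:

```php
<?php
// Sketch: duplicate prevention within one multi-row INSERT, assuming a
// UNIQUE index on dupe_hash (not confirmed for Price Tapestry's schema).
// INSERT IGNORE silently skips rows that would violate the unique key
// instead of aborting the whole statement.
$tuples = array("('h1','Widget')", "('h1','Widget duplicate')");
$sql = 'INSERT IGNORE INTO pt_products (dupe_hash,name) VALUES '
     . implode(',', $tuples);
// With a UNIQUE index on dupe_hash, the second tuple is skipped, not an error.
```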

Are you finding importing to be particularly slow, or has a server change or upgrade affected performance somehow?

It has rarely come up as an issue (if anything, the more common request is to slow import down to bring down processor load average!) but it has now come up a couple of times in recent weeks. If that's the case, could you let me know what PHP / MySQL version you are using, server / hosting specification and approximately how many products are in the installation?

Thanks,

David.
--
PriceTapestry.com

Submitted by sirmanu on Fri, 2016-10-28 10:21

No, your software is excellently optimized, so don't worry!

My "problem" is that I have a "large" database, more than 2,000,000 products. This is why I always force fulltext in search queries and have changed ft_min_word_len to 2. As we discussed in another thread, sorting results in large databases is a bad idea, because we don't get the benefit of indexes when we use the LIKE %...% operator (it might be worth noting this in the documentation).
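To illustrate the indexing point above, a minimal sketch (table and column names as used in this thread, query text illustrative only):

```php
<?php
// Sketch: why fulltext search scales better than LIKE on a large table.
// A LIKE pattern with a leading wildcard cannot use a B-tree index,
// so MySQL must scan every row:
$slow = "SELECT id FROM pt_products WHERE name LIKE '%widget%'";
// MATCH ... AGAINST can use a FULLTEXT index (assuming one exists on `name`):
$fast = "SELECT id FROM pt_products WHERE MATCH(name) AGAINST('widget' IN BOOLEAN MODE)";
```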

The thing is that I want to bring in the fresh updates ASAP. So, running cron.php (first fetch, then import, then EAN matching) takes almost 6 hours (by far the slowest is fetch, of course). So a bulk insert in all queries would have a major impact on the overall process, and it could be easily solved with just a few lines appending the results.
Something like this, flushed every 100 iterations for example (or every time the callback is called):

$sql[] = '(\''.database_safe($product['dupe_hash']).....')';
$ok = database_queryModify('INSERT INTO pt_products(dupe_hash...) VALUES '.implode(',', $sql),$result);
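A hedged sketch of that batching idea as a self-contained helper - the function name and column list are illustrative, not Price Tapestry's actual schema, and the escaper is passed in as a stand-in for database_safe():

```php
<?php
// Sketch: build one multi-row INSERT from a buffer of rows.
// $escape stands in for Price Tapestry's database_safe() function;
// addslashes() is used in examples purely for illustration.
function buildBulkInsert($table, $columns, $rows, $escape)
{
    $tuples = array();
    foreach ($rows as $row) {
        $values = array();
        foreach ($row as $v) {
            $values[] = "'".$escape($v)."'";  // quote and escape each value
        }
        $tuples[] = '('.implode(',', $values).')';
    }
    return 'INSERT INTO '.$table
         . ' ('.implode(',', $columns).')'
         . ' VALUES '.implode(',', $tuples);
}
```

In the record handler you would then buffer rows into an array, and when the count reaches 100 pass buildBulkInsert(...) to database_queryModify() and reset the buffer (flushing any remainder once the feed is finished).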

Submitted by support on Fri, 2016-10-28 11:46

Hi sirmanu,

Thank you for your comments.

I've done some initial tests with bulk insert and haven't noticed any significant performance increase but you would be welcome to try it out - I'll follow up by email as there are a number of changes involved...

Cheers,
David.
--
PriceTapestry.com