

How to handle corrupt feeds...

Submitted by Convergence on Mon, 2013-07-15 22:10

It happens.

Occasionally we get a feed that is corrupt and will not unzip/decompress, for whatever reason.

Here's what happens:

We get a feed from a merchant, one of hundreds from the same network. Feed_Name.txt.gz for example.

Feed will not unzip/decompress.

The previous version that has been unzipped/decompressed gets deleted.

Slow Import CRON stops at that point.

Is there anything that can be done to "skip" the bad feed, not delete the previous existing unzipped/decompressed feed, and continue the Slow Import CRON?

Submitted by support on Tue, 2013-07-16 08:42

Hi,

One thing to do would be to test the integrity of the file before uncompressing it and deleting the original good version. Referring back to the script from this post that resolved your processor load during fetch...

cd /home/xxxxx/public_html/feeds/
# Extract and remove each .zip archive, pausing briefly between files
for zip in *.zip
do
  unzip -o "$zip"
  rm "$zip"
  sleep 1
done
# Decompress each .gz archive in place
for gz in *.gz
do
  gzip -f -d "$gz"
  sleep 1
done

...this could be modified to test the integrity of each .gz file before running gzip itself as follows:

cd /home/xxxxx/public_html/feeds/
for zip in *.zip
do
  unzip -o "$zip"
  rm "$zip"
  sleep 1
done
for gz in *.gz
do
  # gzip -t tests the archive's integrity without extracting it
  if gzip -t "$gz"; then
    gzip -f -d "$gz"
    sleep 1
  else
    # Corrupt archive: discard it, leaving the previous
    # uncompressed version untouched
    rm -f "$gz"
  fi
done
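To see why this works, here's a quick stand-alone demonstration (using hypothetical filenames and a temporary directory, not your live feeds folder): `gzip -t` exits with status 0 for a valid archive and non-zero for a corrupt one, so the `if` branch above only ever deletes the bad download.

```shell
# Work in a throwaway directory so nothing real is touched
mkdir -p /tmp/feedtest && cd /tmp/feedtest

# Create a valid feed archive
echo "product,price" > Good_Feed.txt
gzip -f Good_Feed.txt                # produces Good_Feed.txt.gz

# Create a deliberately truncated (corrupt) copy
head -c 10 Good_Feed.txt.gz > Bad_Feed.txt.gz

gzip -t Good_Feed.txt.gz && echo "Good_Feed: OK"
gzip -t Bad_Feed.txt.gz 2>/dev/null || echo "Bad_Feed: corrupt, would be skipped"
```

The corrupt copy fails the test without being extracted, which is exactly the behaviour the modified loop relies on.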

If you're not sure how to modify your existing script to include the above (for example, if you have changed it considerably since the previous thread, or this concerns a different installation), post your existing un(g)zip code and I'll check it out.

Hope this helps!

Cheers,
David.
--
PriceTapestry.com

Submitted by Convergence on Tue, 2013-07-16 18:43

Awesome.

Thanks, David!

We'll see what happens in the next CRON run.

Thanks, again!

Submitted by Convergence on Tue, 2013-07-16 21:20

Worked perfectly!

Thanks, again!