Commission Junction datafeed cron jobs
Hi David,
I'm having a problem incorporating Commission Junction's datafeed into my fetch script.
Unlike all the other merchants, who provide a way of formatting their download URLs to incorporate username and password details, CJ seem to think this is a security risk and will only provide a download URL that requests the username and password after the feed has been requested.
I have read /node/76, which does not include CJ, and just wonder whether anyone has had success automating CJ's feed and whether there is any way to incorporate their feeds into my fetch script.
Thanks for any help you can provide
Paul
Thanks Joe,
I can do FTP. Can you specify the folder for them to drop it into, or is it just the root?
cheers
Paul
OK, I've changed over to CJ FTP download...
Can anyone please advise me of the path to extract the zip file from the CJ FTP location via my fetch script, or whether I'm on the right track with the ftp.datatransfer.cj.com path below (highlighted)?
(extract)
#######
#FETCH#
#######
/usr/bin/wget -O "/path/to/feeds.zip" "ftp.datatransfer.cj.com/USERNAME/PASSWORD/USERNAME/outgoing/productcatalog/FILENAME.zip"
Thanks
Hi Paul,
The previous comments implied that CJ would FTP the file(s) to your server, but if you've moved to a system where you have a FILENAME.zip containing all your feeds compressed in a single file, you probably want to do something like this:
/usr/bin/wget -O "/path/to/feeds/cjfeeds.zip" "ftp://ftp.datatransfer.cj.com/USERNAME/PASSWORD/USERNAME/outgoing/productcatalog/FILENAME.zip"
cd /path/to/feeds/
/usr/bin/unzip cjfeeds.zip
rm cjfeeds.zip
Note that this would require the files to be registered as the filename contained within the .zip that you are obtaining from CJ...
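If the download fails, wget -O can still leave an empty cjfeeds.zip behind, so it may be worth guarding the unzip step. A minimal sketch, assuming unzip is on the PATH; extract_feeds is a hypothetical helper name and the paths in the usage comment are placeholders, not anything CJ provides:

```shell
#!/bin/sh
# Hypothetical helper: unzip the archive and delete it, but only when the
# archive exists and is non-empty.
extract_feeds() {
    zipfile="$1"   # e.g. /path/to/feeds/cjfeeds.zip (placeholder)
    destdir="$2"   # e.g. /path/to/feeds (placeholder)
    # -s is true only if the file exists and is larger than zero bytes
    if [ ! -s "$zipfile" ]; then
        echo "skipping: $zipfile is missing or empty" >&2
        return 1
    fi
    # -o overwrites existing files without prompting, -q is quiet,
    # -d sets the directory to extract into
    unzip -o -q "$zipfile" -d "$destdir" && rm -f "$zipfile"
}

# Usage (after the wget line):
#   extract_feeds /path/to/feeds/cjfeeds.zip /path/to/feeds
```

Because the archive is only removed when unzip succeeds, a broken download stays on disk where it can be inspected.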
Cheers,
David.
Hi David,
Yeah, that's what I thought...
I'll have another look, but thanks for this info all the same! Hope your typing hands are OK now?
cheers
Paul
Hi Paul,
Sorry for my late reply, I had gone to sleep. Did you figure anything more out with David's help? The way I have things set up is that my cron job gets my feed from CJ's FTP server. They've bundled all of my feeds into one zip, and every day the name of the file changes. I use this for my feeds:
TODAY=$(date +"%Y%m%d")
wget -O "/home/public_html/feeds/cjfeeds.zip" "ftp://username:password@datatransfer.cj.com/outgoing/productcatalog/MYID/FEEDNUMBER_MYID_$TODAY.zip"
Is yours similar? It sounded like you previously had to access the feed through XML/HTML and that's why there were authentication problems? I think using FTP and a shell script, things will be much easier to configure. There's not much you can't do.
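A minimal sketch of the dated-filename idea, assuming the FEEDNUMBER_MYID_YYYYMMDD.zip pattern from the command above holds for your account. cj_dated_name is a hypothetical helper, and the optional third argument is only there so a specific day can be requested when testing:

```shell
#!/bin/sh
# Hypothetical helper: build the day's archive name.
# Arguments: feed number, CJ id, and optionally a date as YYYYMMDD
# (defaults to today's date).
cj_dated_name() {
    feednumber="$1"
    myid="$2"
    day="${3:-$(date +%Y%m%d)}"
    printf '%s_%s_%s.zip\n' "$feednumber" "$myid" "$day"
}

# Usage (FEEDNUMBER and MYID are placeholders):
#   FILE=$(cj_dated_name FEEDNUMBER MYID)
#   wget -O "/home/public_html/feeds/cjfeeds.zip" \
#     "ftp://username:password@datatransfer.cj.com/outgoing/productcatalog/MYID/$FILE"
```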
Anyhow, hope you get it up and running! Let us know what you've got so far!
-Joe
Hi Joe,
Thanks for your reply. I still haven't got it working: I set up the feed through 'client FTP' but nothing has been downloaded.
Interesting to see that you have the feeds through 'CJ FTP' rather than 'client FTP', is that correct?
If you can confirm, then I may be able to change the settings and incorporate your suggestion in my script
Cheers!
Paul
Hey Paul,
I notice that they don't seem to have the same settings for different people, which is strange. You may not see your feed show up for a few days. At first it took about 3-5 days, if I remember correctly. They said it would update on Wednesdays for me; I don't know why that is. If you contact their support they can manually override that so you can get started testing. My settings are through CJ instead of client, but they could be different for you. If you have an FTP client like FileZilla, you can try to download your feed(s) manually first to confirm that they're working (when they show up). It may help you confirm the settings/folders you'll need in your script.
Talk soon! -Joe
Hi Paul,
I have a script that can be used with CJ to download feeds and import them into the feeds folder. You'll have to create a merchants.txt file in the root directory of Price Tapestry (the same directory as the index page). In merchants.txt, list the names of all the merchant feeds you are downloading (the exact name of the file in CJ, including the .txt.gz extension). Put the filenames on a single line with a "|" between them (example: "Black_Forest_Decor-Product_Catalog.txt.gz|Black_Forest_Decor-Product_Catalog.txt.gz"). Then create a temp folder and make it writable (777). The script is below. You'll have to edit your username, your password and the folder ID to change to on CJ's server. One thing to remember when using several merchants is that they all update at different times, so if your cron job runs before a merchant has updated, it causes an error in the import.
<?php
$ftpServer = "datatransfer.cj.com";
$ftpUser = "XXXXXX";
$ftpPass = "XXXXXX";
$conn = @ftp_connect($ftpServer) or die("Couldn't connect to FTP server");
$login = @ftp_login($conn, $ftpUser, $ftpPass) or die("Login credentials were rejected");
ftp_pasv($conn, false);
//Change to the feed folder directory
ftp_chdir($conn,"outgoing/productcatalog/XXXXX/");
//open up file(s)
$handle=fopen('merchants.txt','r');
//trim the trailing newline so the last filename is not corrupted
$buffer = trim(fgets($handle, 4096));
$expl=explode('|',$buffer);
$count=count($expl);
for ($i=0; $i<$count; $i++)
{
$datafeed_file=$expl[$i];
$localfile="temp/$datafeed_file";
$fp=fopen($localfile,'w');
if (!$success=ftp_fget($conn,$fp,$datafeed_file,FTP_BINARY))
{
echo "unsuccessful";
ftp_quit($conn);
exit;
}
}
ftp_quit($conn);
fclose($fp);
fclose($handle);
$path='temp';
//read files in temp folder
$dir_handle = @opendir($path) or die("Unable to open $path");
//get the files
while ($file = readdir($dir_handle))
{
if ($file!='.' AND $file!='..')
{
echo "Uncompressing $file and moving to feeds<br/>";
//replace the .txt.gz extension in the filename with .csv
$destFilename = str_replace('.txt.gz','.csv',$file);
$feed = gzopen("temp/$file", 'r');
$dest = fopen("feeds/$destFilename", 'w');
while(!gzeof($feed))
{
fwrite($dest,gzread($feed,1024));
}
gzclose($feed);
fclose($dest);
unlink("temp/$file");
}
}
?>
Hope this helps.
What I am doing is using the CJ web services: I query their database and save the data as an XML file, which I can then import into the Price Tapestry database. That way I don't need to download a large XML file from any merchant. I just query the keyword that I want.
Hello David, this automation script works fine. I added an automatic import.php call to the code and a step that removes all the feeds after the imports have run. The problem I am having is that within CJ, some merchants' feeds are in the directory on some days and others on other days. If one of the merchants is not in the directory, the script bails out and leaves the unavailable merchant's file in the temp folder with 0 bytes. The next cron run then picks up updates from other programs and puts this file in the feeds folder with 0 bytes, and on import all products are deleted. I am trying to get the script to keep searching for available merchants, skip the ones that are not available (without placing them in the temp folder), and continue downloading all available feeds. I want it to run from cron every day without deleting other merchants' products. Here is a copy of the code I added to the script.
<?php
exec("/usr/bin/php scripts/import.php @MODIFIED");
exec("rm -f feeds/*.txt feeds/*.csv");
print " done.";
?>
I'm looking for suggestions on how to update the code so that it continues without bailing out and simply bypasses the merchants that are not in the directory when the script runs.
Thank you,
Roy
Hello Roy,
One easy option would be to modify import.php to ignore files that are 0 bytes. To do this, in scripts/import.php, look for the following code on line 80:
if (file_exists("../feeds/".$feed["filename"]) && ($feed["imported"] < filemtime("../feeds/".$feed["filename"])))
...and REPLACE this with:
if (
(file_exists("../feeds/".$feed["filename"]))
&&
(filesize("../feeds/".$feed["filename"]))
&&
($feed["imported"] < filemtime("../feeds/".$feed["filename"]))
)
Hope this helps!
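If the feeds are handled from a shell script, a similar zero-byte guard can run before import.php is called. A minimal sketch, assuming the feeds sit in a single directory; remove_empty_feeds is a hypothetical helper name, not part of Price Tapestry:

```shell
#!/bin/sh
# Hypothetical helper: delete zero-byte files from a feeds directory so
# they never reach the import step.
remove_empty_feeds() {
    dir="$1"
    for f in "$dir"/*; do
        # skip directories, and the literal pattern if nothing matched
        [ -f "$f" ] || continue
        # -s is true only for files larger than zero bytes
        if [ ! -s "$f" ]; then
            echo "removing empty feed: $f" >&2
            rm -f "$f"
        fi
    done
}

# Usage (path is a placeholder):
#   remove_empty_feeds /path/to/feeds
```

Deleting the empty file means import.php sees no new file for that merchant at all, so its existing products are left alone until a real feed arrives.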
Cheers,
David.
Thanks, it works fine for the feeds with 0 bytes.
Is there any way to make the script download all the merchants listed in the text file, instead of stopping as soon as one of them is not in CJ's FTP directory? For example, the text file lists Merchant1|Merchant2|Merchant3|Merchant4|Merchant5|Merchant6. Let's say all the merchant feeds are in the CJ FTP directory except Merchant 3. The script will download Merchant 1 and Merchant 2 into the temp folder, leave Merchant 3 with 0 bytes, and stop. Merchants 4, 5 and 6 never get to the temp folder. The script does not import anything until the next successful cron run, that is, whenever all merchants in the text file are in CJ's FTP directory at the same time. I am trying to see if there is any way to get the above code to keep looking for all available merchants in the text file, no matter what order they are listed in, and still move the available merchants' feeds into the feeds folder without bailing out.
Thanks again,
Roy
Hi Roy,
Sure - I think that should just be a case of modifying this section:
if (!$success=ftp_fget($conn,$fp,$datafeed_file,FTP_BINARY))
{
echo "unsuccessful";
ftp_quit($conn);
exit;
}
...simply remove the ftp_quit() and exit() lines, leaving just:
if (!$success=ftp_fget($conn,$fp,$datafeed_file,FTP_BINARY))
{
echo "unsuccessful";
}
...and that should do it...
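For anyone doing the same from a shell script, here is a minimal sketch of the skip-and-continue idea, assuming merchants.txt holds the filenames on one line separated by "|" as described earlier in the thread. fetch_all and cj_fetch are hypothetical names, and USERNAME, PASSWORD and MYID are placeholders. Unlike the PHP change above, this version also deletes the empty file left by a failed download:

```shell
#!/bin/sh
# Hypothetical example fetcher: download one remote file ($1) to a local
# path ($2). USERNAME, PASSWORD and MYID are placeholders.
cj_fetch() {
    wget -q -O "$2" \
      "ftp://USERNAME:PASSWORD@datatransfer.cj.com/outgoing/productcatalog/MYID/$1"
}

# fetch_all LISTFILE DESTDIR FETCHER
# Tries every name in LISTFILE; a failed download is skipped and its
# partial or empty file removed, then the loop carries on.
fetch_all() {
    listfile="$1"
    destdir="$2"
    shift 2
    # turn the "|" separated line into one name per line
    tr '|' '\n' < "$listfile" | while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue
        # </dev/null stops the fetcher from consuming the list on stdin
        if "$@" "$name" "$destdir/$name" </dev/null && [ -s "$destdir/$name" ]; then
            echo "ok: $name"
        else
            echo "skipped: $name" >&2
            rm -f "$destdir/$name"
        fi
    done
}

# Usage:
#   fetch_all merchants.txt temp cj_fetch
```

Passing the fetcher as an argument keeps the loop independent of wget, so it can be tried out with a stub before pointing it at CJ.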
Cheers,
David.
What a nightmare I'm having!!
I am trying to go the way of a fetch.sh file, which I am used to. I have the following, but the CJ server is refusing login!
#!/bin/sh
#########
# FETCH #
#########
/usr/bin/wget -O "/home/path/public_html/folder/feeds/Money_Clothing-Product_Feed.txt.gz" "ftp://datatransfer.cj.com/MYUSER/MYPASSWORD/MYUSERAGAIN/outgoing/productcatalog/58437/Money_Clothing-Product_Feed.txt.gz"
/usr/bin/gzip -c -d -S "" /home/path/public_html/folder/feeds/Money_Clothing-Product_Feed.txt.gz > /home/path/public_html/folder/feeds/cj-moneyclothing.txt
#########
##########
# IMPORT #
##########
cd /home/path/public_html/folder/scripts/
/usr/bin/php import.php cj-moneyclothing.txt
---------
Jill
Hi Jill,
A normal FTP URL with authentication would look like this:
ftp://MYUSER:MYPASSWORD@datatransfer.cj.com/MYUSERAGAIN/outgoing/productcatalog/58437/Money_Clothing-Product_Feed.txt.gz
Might be worth having a go with that format...
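A minimal sketch of assembling a URL in that user:password@host form inside fetch.sh; ftp_url is a hypothetical helper and every value in the usage comment is a placeholder:

```shell
#!/bin/sh
# Hypothetical helper: build an FTP URL with embedded credentials.
# Arguments: user, password, host, path (path without a leading slash).
ftp_url() {
    printf 'ftp://%s:%s@%s/%s\n' "$1" "$2" "$3" "$4"
}

# Usage:
#   URL=$(ftp_url MYUSER MYPASSWORD datatransfer.cj.com \
#         outgoing/productcatalog/58437/Money_Clothing-Product_Feed.txt.gz)
#   /usr/bin/wget -O "/home/path/public_html/folder/feeds/Money_Clothing-Product_Feed.txt.gz" "$URL"
```

GNU wget also accepts --ftp-user and --ftp-password options, which keep the credentials out of the URL and may be easier if the password contains characters such as "@" or ":".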
Cheers,
David.
Thanks David!
Almost there, but it didn't unzip - any ideas?
---------
Jill
Hi Jill,
The command looks fine. Have you tried logging in and performing the gzip -d manually? That might indicate what the problem is. cd to /feeds/ and just use:
gzip -d Money_Clothing-Product_Feed.txt.gz
Cheers,
David.
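The same check can also be scripted, which helps when there is no shell login. A minimal sketch, assuming gzip is on the PATH; gunzip_to is a hypothetical helper, and the filenames in the usage comment are the ones from the fetch script above:

```shell
#!/bin/sh
# Hypothetical helper: verify that a download really is gzip data, then
# decompress it to a chosen filename, keeping the original .gz.
gunzip_to() {
    src="$1"
    dest="$2"
    # gzip -t tests the archive's integrity without writing any output
    if ! gzip -t "$src" 2>/dev/null; then
        echo "not a valid gzip file: $src" >&2
        return 1
    fi
    # -d decompresses, -c writes to stdout so the source file is left alone
    gzip -dc "$src" > "$dest"
}

# Usage:
#   cd /home/path/public_html/folder/feeds/
#   gunzip_to Money_Clothing-Product_Feed.txt.gz cj-moneyclothing.txt
```

If the first step fails, the problem is with the download itself (for example an error page or an empty file saved under the .gz name) rather than with the decompression options.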
Unfortunately [blush] I don't know how to do that. Don't know how to log in manually - I have always used an ftp program!
---------
Jill
Hi Jill,
Give it a go over SSH using PuTTY.
Start the program, and enter the same host name that you use for FTP access in the Host Name (or IP Address) box.
It should then attempt to connect, and if successful ask you to "Trust" the connection by saving its identity in your local keyring. It will then ask you to "Login as:" and again, use your FTP username. Press enter, and then it will ask for a password, and then you should be in!
The basic commands are
ls (list directory)
cd (change directory)
You'll probably see a public_html folder if you run ls, then you can change into it using
cd public_html
Assuming you're in, navigate to your /feeds/ folder using those 2 commands, and then try the gzip command as described above...
Cheers,
David.
Thanks David
I couldn't actually get PuTTY to work. However, I did try changing the file extension from gz to gzip and that worked!!
phew
---------
Jill
Hi Paul,
Is FTP a problem for you? If not, you can request an FTP account where they'll put your feeds and they're a lot easier to manage through the command-line or through a cron script :)
-Joe