Adding File Caching To Your Twitter API Script

PHP / by Paul Robinson / 37 Comments
This post was published back on February 24, 2010 and may be outdated. Please use caution when following older tutorials or using older code. After reading be sure to check for newer procedures or updates to code.
Update! Twitter have changed their API URL from twitter.com to api.twitter.com. The code has been changed to reflect this.

File caching is extremely important in a script like this. It stops you from running over your allocated API limit by saving the received API data in a file on your server. We tell the script to use the data in the file until a preset amount of time has passed. Once the allotted time has passed the script will re-connect to Twitter, grab the API data & store it in the file again. The process then repeats.

Let’s look at an example. If we set our cache time to 1800 seconds (30 minutes), the script will continue to use the cached API data until the file is 1800 seconds old. To do this we check the file’s modified date and time against the current date and time.

Adding File Caching

Let’s take a look at where we got to with our code last time:

Now that we’ve recapped where we were up to in our code, let’s try and add some file caching. If this is your first foray into file writing using PHP, don’t worry; it’s actually fairly simple. The first thing we need to do is write the part of the script that checks the age of the cache file and determines if we need to renew it or not. Let’s get down to it.

This part of the code goes before everything else (as noted in the comment). Let’s go through what the code does. First we set a cache variable to false, as we should always assume that this is the first run and there is no cache to load. Next we set the path to the cache file; use an absolute path from the root if possible (this works on both Unix and Windows).

Now we check to see if the cache exists yet. If it does, we get the last time it was modified and the current time minus 30 minutes, then compare the two: if the modified time is older than the current time minus 30 minutes, the cache is out of date. For example, if the file was modified at 8:30 and the script is run at 8:45, the modified time would be greater than the $timeago variable, as $timeago would be 8:15 (8:45 - 30 mins = 8:15), meaning the cache would not be updated. Of course, in this case we are just setting a variable to true or false to tell the script whether or not to update the cache later.
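Here is a minimal sketch of that age check, not the exact code from this tutorial. The function name cacheIsFresh(), the temp-directory path and the optional $now parameter (handy for testing) are my own assumptions; $cache, $cPath and $timeago follow the names used in this post.

```php
<?php
/**
 * Returns true when the cache file exists and was modified more recently
 * than $maxAge seconds ago, i.e. it can still be used as-is.
 */
function cacheIsFresh(string $cPath, int $maxAge, ?int $now = null): bool
{
    $now = $now ?? time();
    clearstatcache(true, $cPath);       // make sure PHP re-reads the file's details
    if (!file_exists($cPath)) {
        return false;                   // first run: there is no cache to load
    }
    $modified = filemtime($cPath);      // last modified time as a Unix timestamp
    $timeago  = $now - $maxAge;         // e.g. 8:45 minus 30 minutes = 8:15
    return $modified !== false && $modified > $timeago;
}

// Hypothetical cache location; use a real absolute path on your own server.
$cPath = sys_get_temp_dir() . '/twitter_cache_example.json';
$cache = cacheIsFresh($cPath, 1800);    // true: use the cache, false: refresh it
```

Passing the maximum age in seconds keeps the 30 minute window in one place, so changing the cache time later is a one-number edit.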

Hopefully that wasn’t too confusing. I know I said it wasn’t confusing, but I meant the file creation code, not the file modification time comparison code. 😆

Now we need to modify the code we wrote before slightly. To save time I’ll write out the entire code together.

That is the entire script. Let’s go through the new parts. First we have wrapped our cURL code in a conditional statement. It just checks to see if $cache was false, as we only want to connect to Twitter if our cache is either non-existent or out of date.

After the Twitter connection we have our cache generation code. It’s fairly simple. We first use fopen(): the first parameter is the path to the file to alter (or create) & the second is the mode; 'w' means write, truncate to zero length if not empty & create if non-existent. flock() tries to get exclusive file access so that we can edit the file without interruptions. fwrite() writes the data to the file. We then unlock file access & close our file. For those wondering, $fp is a file pointer resource used by the other file functions to know which file you are altering.
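As a rough sketch, the write step could be wrapped up like this. The writeCache() name and its true/false return value are assumptions of mine for illustration; the fopen(), flock(), fwrite() and fclose() calls are the ones described above.

```php
<?php
/**
 * Writes $data to the cache file under an exclusive lock.
 * Returns true on success, false if the file could not be opened, locked or written.
 */
function writeCache(string $cPath, string $data): bool
{
    $fp = @fopen($cPath, 'w');          // 'w': create if missing, truncate if not empty
    if ($fp === false) {
        return false;                   // e.g. the directory is missing or not writable
    }
    $ok = false;
    if (flock($fp, LOCK_EX)) {          // wait until we have exclusive access
        $ok = fwrite($fp, $data) === strlen($data);
        fflush($fp);                    // push the data out before releasing the lock
        flock($fp, LOCK_UN);            // unlock file access
    }
    fclose($fp);
    return $ok;
}
```

You would call it as writeCache($cPath, $content) straight after the cURL request. One thing to be aware of: the 'w' mode empties the file as soon as it is opened, which is before the lock is obtained.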

Finally, should the cache already exist & not be out of date, the other part of our conditional statement is activated & we just get the contents of our cache file & load them into $content. It is then handed to the same functions as if it had just come from Twitter.
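A hedged sketch of the read side follows. The readCachedTweets() name is hypothetical, and it assumes the cache file holds the raw JSON received from Twitter. Returning an empty array when something is wrong also avoids the "Invalid argument supplied for foreach()" warning if the cache turns out to be empty or malformed.

```php
<?php
/**
 * Loads the cached JSON and decodes it. Returns the decoded array, or an
 * empty array if the file is missing, unreadable or not valid JSON.
 */
function readCachedTweets(string $cPath): array
{
    if (!is_readable($cPath)) {
        return [];
    }
    $content = file_get_contents($cPath);
    if ($content === false) {
        return [];
    }
    $tweets = json_decode($content, true);   // true: objects become associative arrays
    return is_array($tweets) ? $tweets : [];
}
```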

I hope that has helped make some sense out of the complicated business that is file caching. The hard part is not so much creating the file, but checking if the file is too old or not. As always if you have any questions, suggestions etc let me know via the comments.

For those interested, there is one more tutorial to come in this series. It will be on how to merge retweets onto the standard tweets. It will probably be for more advanced users as it can get quite complicated, but feel free to follow along whatever your experience. Let me know if you have any other requests for the next tutorial.

37 Comments

Author’s gravatar

Thank you! I was using juitter but felt it was just too heavy for what I wanted to accomplish and just coded a very lightweight recent tweets widget with caching thanks to your code

Author’s gravatar

Nice job explaining this, but I believe the script has an error.

Line 29 should read … $fp = fopen($cPath, 'w');

Author’s gravatar author

Indeed it should, sorry about that & thanks for letting me know about the mistake.

Tis fixed now though. 😉

Author’s gravatar

Mine keeps giving me the following error:
Warning: Invalid argument supplied for foreach()

Author’s gravatar author

Hi Josh,

Sorry for not getting back to you sooner, been having problems with my router at home.

That would suggest that the result supplied from the call to Twitter was either not in the correct format or was empty.

The only thing I can think of that would cause that is an incorrect URL. Have you changed the twitterusername.json to your username.json inside the URL on line 16 of the code?

Author’s gravatar

Hi Paul,

Thanks for the code, but I’m having a bit of a problem getting PHP to output the results – I can get the data back from Twitter, and store it in a file, but reading the content’s more problematic.

I think the issue is with the timestamp – it’s included in the file before the array that’s holding the twitter JSON data and is causing the foreach function to fall over.

Example file content:

s:29830:"{"results":[ etc.

Is there some way to get just the ‘results’ array? My PHP is rather rusty…

Thanks!

-Mike

Author’s gravatar author

Hi Mike,

The data you’ve shown is serialized (the s:29830: prefix is how PHP’s serialize() stores a string), so you will need to unserialize it before you can decode it and use it as an array.

Author’s gravatar

Hi Paul,

That was it! Removing the serialize function from the data insertion line has got everything working.

Thanks a bunch!

-Mike

Author’s gravatar author

It’s advisable to leave that in as it makes storage of the data easier. You just need to run unserialize on the data when you pull it back from the file.
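To illustrate the round trip, here is a small sketch. The JSON string is a made-up stand-in for Twitter's response, and the allowed_classes option is an extra precaution of mine rather than something from the original script.

```php
<?php
// Made-up stand-in for the JSON string received from Twitter.
$json = '{"results":[{"text":"example tweet"}]}';

$stored = serialize($json);     // what gets written to the cache file: s:<length>:"{...}";
$loaded = unserialize($stored, ['allowed_classes' => false]); // back to the JSON string
$data   = json_decode($loaded, true);                         // now a PHP array

foreach ($data['results'] as $tweet) {
    echo $tweet['text'], PHP_EOL;
}
```

If the file is read back without the unserialize() step, json_decode() sees the s:<length>: prefix, fails to parse it and returns null, which is what makes the foreach fall over.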

Author’s gravatar

I can create the file but I can’t open it with file_get_contents. I’ve tried everything by now, every file format, everything, and YES it is turned on in the php.ini.

The only difference is that I get my data through oAuth.

Author’s gravatar author

Hi Andreas,

does running the file path through file_exists() return false or true? That’s generally how I test if functions like file_get_contents() are opening the file successfully. If not you could try using the old method of fopen or file.

To be honest I’ve never had many problems with file_get_contents; it’s a very common way to read data back from files for caching purposes.

Author’s gravatar

Oh thanks, I got it to work now, I forgot to unserialize and then my script wouldn’t do what it was supposed to! 😉

Cheers mate!

Author’s gravatar author

Ahh, that would explain it. No problem & glad you figured it out.

Author’s gravatar

Would the page load faster if you used cURL instead of file_get_contents() , I’ve done some research about it and found out that that’s the case when you transfer data from other servers. 🙂

Author’s gravatar author

I’m not sure what you mean. The script does use cURL to gather the data from Twitter. It only uses file_get_contents() to read the data from the cache file which is kept on the local server.

Author’s gravatar

Yeah, but I don’t use cURL for that purpose. I use the Twitter API with oAuth (it’s much more stable) and then file_get_contents() to get data from my cache file, but would it be faster to use cURL to read the cache file?

Author’s gravatar author

oAuth uses cURL in its back end, taken from the oAuth library I use with Twitter Stream:

In my opinion, no, cURL wouldn’t be faster. The file_get_contents() function is written to take advantage of file system optimizations your server’s OS might offer (such as memory mapping), so it is generally the preferred way to read a whole file into a string.

You could try cURL and measure the speed using microtime functions, but it is normally used for external communications. It’s even described as being a ‘library and command-line tool for transferring data using various protocols.’ But I don’t see why you couldn’t give it a try. If you do please let me know of the results, it would be interesting to see if there is a speed increase.
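If you want to measure it, something along these lines might do as a starting point. It is only a sketch: the averageSeconds() helper and the throwaway temp file are my own assumptions, and it compares file_get_contents() against plain fopen() so that it runs without the cURL extension. A cURL-based reader could be passed in as a third closure in the same way.

```php
<?php
/**
 * Calls $reader $runs times and returns the average seconds per call.
 */
function averageSeconds(callable $reader, int $runs = 100): float
{
    $start = microtime(true);           // true: return the time as a float
    for ($i = 0; $i < $runs; $i++) {
        $reader();
    }
    return (microtime(true) - $start) / $runs;
}

// Throwaway file standing in for the real cache file.
$cPath = tempnam(sys_get_temp_dir(), 'tw');
file_put_contents($cPath, str_repeat('x', 30000));

$viaGetContents = averageSeconds(function () use ($cPath) {
    return file_get_contents($cPath);
});
$viaFopen = averageSeconds(function () use ($cPath) {
    $fp   = fopen($cPath, 'r');
    $data = stream_get_contents($fp);
    fclose($fp);
    return $data;
});

printf("file_get_contents: %.6fs, fopen: %.6fs\n", $viaGetContents, $viaFopen);
unlink($cPath);
```

Averaging over many runs matters here, because a single read of a small local file is too quick to time reliably.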

Author’s gravatar

I’m very new to caching, but it doesn’t seem to work, the cache file seems to stay empty. Could the path be wrong, or do I also have to make some adjustments in the .htaccess file?

Author’s gravatar author

Hi Erik,

It could be a number of things. I’m assuming the cache file is created, but when you open it it’s empty?

That would normally suggest that your server isn’t having any trouble creating the file, so maybe you aren’t getting anything back from Twitter? Have you tried visiting the URL you are using to contact Twitter via cURL in a browser & do you get a huge string of words?

Author’s gravatar

I had the same problem — json_decode wasn’t reading the contents of the cache file because you’re not writing the data in UTF-8 encoding.

To fix this problem just modify line 30 of your complete code:

should be:

Then it works fine.

Author’s gravatar author

Hi,

Thanks for that you are indeed correct it should have utf8_encode() in there. I’ll correct that now.

I’m not sure why I’ve got serialize in there; there isn’t a need to, since the returned data is already JSON. Guess I hadn’t had enough coffee when I wrote it, haha.

Again thanks for the correction.

Author’s gravatar

Any time.

Thanks for the great tutorial, this helped me enormously. A really clear guide, keep up the good work.

Author’s gravatar author

No problem. Glad they help out.

I just wish I was able to write them more often, there don’t seem to be enough hours in a day. 🙂

Author’s gravatar

Were you ever able to continue to the guide about adding in retweets? Currently, if I set the count to 3 and 2 of them are retweets, I will only see 1 post. If there’s no guide but you have the code, could you send it to me?

Author’s gravatar author

Hi Johnny,

I forgot to update the text in this tutorial. It’s no longer needed now as Twitter allow you to set a parameter to include or exclude retweets via their API now. The main problem now is that if you exclude retweets they are removed by Twitter after retrieval from the database meaning if you asked for 10 Tweets and 3 were retweets you’ll end up with 7.

To include retweets you just add the include_rts=1 onto your url. For example you could use:

https:// api.twitter.com/1/statuses/user_timeline.json?include_entities=true&include_rts=1&screen_name=paulbrobinson&count=5

To get the last 5 Tweets I made, including retweets. Please ignore the space, it is to stop WordPress auto-parsing the URL.
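If you would rather build that URL in code than type it by hand, a small sketch using PHP's http_build_query() could look like this. The parameters are exactly the ones from the URL above; only the $params and $url variable names are mine.

```php
<?php
// Query string parameters from the example URL above.
$params = [
    'include_entities' => 'true',
    'include_rts'      => 1,       // 1: include retweets in the timeline
    'screen_name'      => 'paulbrobinson',
    'count'            => 5,
];

// Pass '&' explicitly so the result does not depend on the server's php.ini settings.
$url = 'https://api.twitter.com/1/statuses/user_timeline.json?'
     . http_build_query($params, '', '&');

echo $url, PHP_EOL;
```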

Hope that helps.

Author’s gravatar

Awesomely fast response! Thanks! Ya know, I just read all that API documentation and somehow didn’t understand that it was that easy. If you can entertain another issue I’m having… I’m trying to use this code with an MVC pattern. This is also my first time with that. I’ve tried cutting this part out and putting it in the template’s default.php:

That isn’t getting me anywhere… I thought of maybe defining that entire part as a variable and then echoing it in the default.php but I’m not sure if I went about that right at all…or if it is possible. Since this is such a simple layout, I guess there’s no need, but I want to develop good habits. Any words of enlightenment on this subject?

Author’s gravatar author

Yep, getting it working is super easy, understanding the documentation not so much, lol.

Ahh. Funnily enough I’m working on a tutorial for EmberJS which is a Javascript development framework that uses the ever famous MVC pattern.

My tip would be to try to keep that code in your controller & pass just the output into your view similar to the PHP framework CodeIgniter would, which is also a MVC framework. So instead of echoing out in the loop, you would append it to a string. It might go a little something like this:

Then hand $output off to your view to be output in your template.
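As a rough illustration of the idea (the renderTweets() name, the list markup and the sample tweet are all assumptions for this sketch), the loop appends to a string and returns it rather than echoing:

```php
<?php
/**
 * Hypothetical controller helper: builds the markup for a list of tweets
 * and returns it as a string instead of echoing inside the loop.
 */
function renderTweets(array $tweets): string
{
    $output = '<ul>';
    foreach ($tweets as $tweet) {
        // Escape the text so it is safe to drop into the template.
        $output .= '<li>' . htmlspecialchars($tweet['text'], ENT_QUOTES, 'UTF-8') . '</li>';
    }
    return $output . '</ul>';
}

// In the controller: build the string, then pass $output to the view.
$output = renderTweets([['text' => 'Hello <world>']]);
```

The view then only needs a single echo of $output, which keeps the Twitter and caching logic out of the template.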

Hopefully that will help. Let me know if I can help any further. 🙂

Author’s gravatar

lol. I think I’m going to rupture my spleen thinking so hard on this. I’m using Joomla and apparently can’t wrap my head around the concept of how it passes variables to the view.

Here’s a link that very simply shows how to create a module:
http://blog.joomlaearth.com/2012/create-you-first-joomla-1-6-module-completely-from-scratch/

I don’t see anything to suggest that a variable in helper.php could be used in the view…
If I understand this right, helper.php has code for getHello that the controller pulls into the variable definition of $hello which is then echoed in default.php. How am I supposed to get the $output variable to the view?? I know this is getting off topic. I can totally hush if I need to. I read a php book and am googling away on everything I can. I am determined to become a professional php developer – long way to go. Also, twitter is blocking me for reaching the limit even though I have the cache set for an hour. I have only deleted the cache a few times to test things out. Shared server issue?

Author’s gravatar author

First about your Twitter limit. You can check to see if your IP has hit your API limit by visiting the URL:

https ://api.twitter.com/1/account/rate_limit_status.json

Again ignore the space. If you request that URL via your script it will give you back some JSON with your API rate limit in it.
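As a sketch of what you could do with that JSON once you have it: the sample string below uses the field names from a response a reader posted further down in these comments, so treat the values as an example only.

```php
<?php
// Example response; the field names match a reader's response quoted in the comments.
$json = '{"remaining_hits":149,"reset_time_in_seconds":1362879709,"hourly_limit":150,'
      . '"reset_time":"Sun Mar 10 01:41:49 +0000 2013"}';

$status = json_decode($json, true);
$used   = $status['hourly_limit'] - $status['remaining_hits'];
$resets = gmdate('Y-m-d H:i:s', $status['reset_time_in_seconds']); // formatted in UTC

printf(
    "%d of %d requests used; limit resets at %s UTC\n",
    $used,
    $status['hourly_limit'],
    $resets
);
```

If remaining_hits is close to hourly_limit, as it is here, the limit has not been reached and the problem is more likely to be elsewhere.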

As for Joomla, it is a little off topic, but don’t worry about it.

From what I can tell the information is stored & called in mod_helloworld.php which is then accessible in your view. The data stored in the $hello variable is determined by whatever is returned in the getHello() function.

If you are still having a few problems understanding how it works, drop me an email & I’ll try to explain it a little better, as it could get a little larger than I’d like here in the comments.

Author’s gravatar

How do you link the cache to the section in html??

at present I have:

I would like to move the bit :

and instead pull the data from the cache that I will create using your code.. Would I just reference to the cache file here instead of the Json Url??

thanks

Author’s gravatar author

Hi Dan,

I apologize for the long delay for this answer. As noted in my most recent post I’ve been out of action recently with health problems & haven’t had much time to do anything with catching up on work, I’ve had a few days out to relax, but if I don’t do that I’ll probably go mad. 0_0

In answer to your question. You possibly could use the cache data raw as it is json, but if you don’t have anything that will parse the json out into HTML for jQuery Cycle to latch onto, it probably won’t work. jQuery has some great json parsing tools built in so you could use those to parse the json to HTML then init Cycle afterward.

Author’s gravatar

Sorry if this is a no-brainer, but the entry point to Twitter has changed from https://twitter.com/ to https://api.twitter.com/1/, which you touched on, Paul (and I hope you are feeling better!). Also, the entire script needs to be wrapped in <?php ?> tags. For all the newbies. Like Me.

Scott

Author’s gravatar author

Hi Scott,

Yup, need to add that to the tutorial. Thanks for reminding me. 😉

I am, thank you. Yep, it’s a habit of mine I tend to leave PHP tags off in my tutorials unless I’m mixing languages (PHP & HTML etc). I forget it can be a little confusing to those just starting out.

Thanks.

Author’s gravatar

I have rate limit:
{"remaining_hits":149,"reset_time_in_seconds":1362879709,"hourly_limit":150,"reset_time":"Sun Mar 10 01:41:49 +0000 2013"}

What can I do?
