Playing with Twitter Data (Get the 1% Livestream)



Recently, I have been playing with Twitter data. Below is a basic Python script that fetches the 1% live stream of tweets published by Twitter. The script targets Python 2 (it imports urllib2), and it needs the oauth2 library installed for authentication.

To access the 1% live stream, set up your Twitter account using the following steps:


  1. Go to https://dev.twitter.com/apps and log in using your twitter credentials.

  2. Create a new application using your Twitter account.

  3. Create an access token for the application you just created.

  4. Fill in the missing credentials in the "fetchtweets.py" script below, as follows:

    access_token_key = ""
    access_token_secret = ""
    consumer_key = ""
    consumer_secret = ""


  5. Save the "fetchtweets.py" script

  6. Finally, run the script as follows:

    $ python fetchtweets.py

    You can keep the script running until you have collected as much data as you want.

  7. You may also redirect the fetched tweets to a file, as follows:

    $ python fetchtweets.py > tweets.txt
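Once tweets.txt exists, it can be read back one line at a time, since the stream delivers one JSON message per line. The sketch below is only an illustration of that idea: the helper name `iter_tweet_texts` is made up here, and it assumes that ordinary tweets carry a "text" field while other messages (such as delete notices or blank keep-alive lines) do not.

```python
import json


def iter_tweet_texts(lines):
    """Yield the "text" field of every tweet in an iterable of JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:
            # Blank lines carry no message; skip them.
            continue
        try:
            message = json.loads(line)
        except ValueError:
            # A half-written last line (for example after Ctrl-C) is not valid JSON.
            continue
        # Messages without a "text" field (e.g. delete notices) are ignored.
        if isinstance(message, dict) and "text" in message:
            yield message["text"]
```

To use it on the dump, open the file and loop over the generator, for example `for text in iter_tweet_texts(open("tweets.txt")): print(text)`.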



fetchtweets.py script:


import oauth2 as oauth
import urllib2 as urllib

access_token_key = "..."
access_token_secret = "..."

consumer_key = "..."
consumer_secret = "..."

_debug = 0

oauth_token = oauth.Token(key=access_token_key,
                          secret=access_token_secret)
oauth_consumer = oauth.Consumer(key=consumer_key,
                                secret=consumer_secret)

signature_method_hmac_sha1 = oauth.SignatureMethod_HMAC_SHA1()

http_method = "GET"


http_handler = urllib.HTTPHandler(debuglevel=_debug)
https_handler = urllib.HTTPSHandler(debuglevel=_debug)

def twitterreq(url, method, parameters):
    '''
    Construct, sign, and open a twitter request
    using the hard-coded credentials above.
    '''
    # Build the OAuth request for the HTTP method passed in by the caller.
    req = oauth.Request.from_consumer_and_token(oauth_consumer,
                                                token=oauth_token,
                                                http_method=method,
                                                http_url=url,
                                                parameters=parameters)

    req.sign_request(signature_method_hmac_sha1, oauth_consumer, oauth_token)

    headers = req.to_header()

    if method == "POST":
        encoded_post_data = req.to_postdata()
    else:
        encoded_post_data = None
        # For GET, the signed OAuth parameters travel in the query string.
        url = req.to_url()

    opener = urllib.OpenerDirector()
    opener.add_handler(http_handler)
    opener.add_handler(https_handler)

    response = opener.open(url, encoded_post_data)

    return response

def fetchsamples():
    url = "https://stream.twitter.com/1/statuses/sample.json"
    parameters = []
    response = twitterreq(url, "GET", parameters)
    # The response is a stream; print each line as it arrives.
    for line in response:
        print line.strip()

if __name__ == '__main__':
    fetchsamples()
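
Instead of stopping the script by hand once enough data has arrived, the loop over the response could be wrapped in a small size budget. This is a sketch only: `take_until_size` is a hypothetical helper that is not part of the script above, and it measures size with `len()`, which counts bytes for the byte strings that the response yields.

```python
def take_until_size(lines, max_bytes):
    """Yield lines until their combined length would exceed max_bytes."""
    total = 0
    for line in lines:
        total += len(line)
        if total > max_bytes:
            # Stop before emitting the line that breaks the budget.
            break
        yield line
```

Inside fetchsamples(), iterating over `take_until_size(response, 10 * 1024 * 1024)` instead of `response` would stop after roughly 10 MB of raw stream data.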

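If you would rather not save your keys inside the script, one alternative is to read the four values from environment variables. The sketch below is an assumption-laden illustration, not part of the original script: the function `load_credentials` and the `TWITTER_*` variable names are invented for this example, and nothing in the Twitter API requires them.

```python
import os

# Hypothetical environment variable names, chosen for this example only.
ENV_NAMES = {
    "access_token_key": "TWITTER_ACCESS_TOKEN_KEY",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
}


def load_credentials(environ=None):
    """Return the four OAuth values as a dict, read from the environment."""
    if environ is None:
        environ = os.environ
    # Report every missing or empty variable at once.
    missing = sorted(name for name in ENV_NAMES.values() if not environ.get(name))
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return dict((key, environ[name]) for key, name in ENV_NAMES.items())
```

The returned dictionary uses the same names as the variables in the script, so `creds["consumer_key"]` would take the place of the hard-coded `consumer_key`, and so on.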



About this Entry

This page contains a single entry by M. Sarwat published on May 14, 2013 7:02 PM.
