Social media analysis has been gaining popularity for some time now. Every company spends a lot of money on publicity, advertising, campaigns, etc. But there is no definite way of measuring the ROI on this investment, except for the growth in the number of satisfied customers. And even then, it is not certain that the social advertising and campaigning efforts alone caused that growth. The answer is social media analysis.
For instance, we can use customers’ tweets to analyze their sentiment and thus their experience. We could then adjust the advertisements and campaign policies for factors such as geographical location, time of the year, etc. This seems like a task for a genius hacker. But with the right tools, we can do it with minimal code.
The objective here is to show how easy it can be to build your own social media tool. I am going to use RStudio with some packages: shiny, twitteR, httr, tm and wordcloud.
shiny: A package created by RStudio (http://shiny.rstudio.org/) to build web applications very easily. In this case, you will see the code to run the app locally. Here you may be able to find more info on how to run it on your own or a hosted server. A shiny application consists of two files, ui.R and server.R. The former contains the code that determines what your application looks like, and the latter contains the code for the logic of your application.
twitteR: A very powerful package for Twitter Monitoring. Simple, easy and very effective.
tm: tm stands for “Text Mining”. Apart from text mining tools, it also provides very useful functions to pre-process texts.
wordcloud: Package used to draw word cloud plots.
Let’s get started.
Authentication Process with Twitter
To fetch Twitter data, we have to use the Twitter API and authenticate the connection every time we run our shiny app. This means creating a Twitter app and doing the OAuth handshake, because you have to do it every time you want to get data from Twitter with R. Since Twitter released version 1.1 of their API, an OAuth handshake is necessary for every request. So we have to verify our app.
First we need to create an app at Twitter.
Go to https://apps.twitter.com/ and log in with your Twitter account.
Click on “Create new application”.
You can name your application whatever you want and set the description to whatever you want. Twitter requires a valid URL for the website; you can just type in http://test.de/ since you won’t need it again.
And just leave the Callback URL blank.
Click on Create and you’ll get redirected to a screen with all the OAuth settings of your new app. Just leave this window in the background; we’ll need it later.
Before we go on, make sure you have installed the newest version of the twitteR package from GitHub.
To do so, first create the project via RStudio > New Project > New [Existing] Directory > New Shiny Web Application, which creates two files, ui.R and server.R, in the working directory. Then run the following code in the console:
install.packages(c("shiny", "twitteR", "devtools", "rjson", "bit64", "httr", "wordcloud", "tm"))
#RESTART R session!
Note: The latest version of httr is installed by default, but it is not compatible with twitteR version 1.1.8. Install httr version 0.6.0 instead.
Here is how you can do it:
packageurl = "http://cran.us.r-project.org/src/contrib/Archive/httr/httr_0.6.0.tar.gz"
install.packages(packageurl, repos=NULL, type="source")
Now the twitteR package is up to date and we can use the new and very easy setup_twitter_oauth() function, which uses the httr package. First you have to get your api_key and api_secret as well as your access_token and access_token_secret from your app settings on Twitter. Just click on the “API key” tab to see them.
api_key = "YOUR API KEY"
api_secret = "YOUR API SECRET"
access_token = "YOUR ACCESS TOKEN"
access_token_secret = "YOUR ACCESS TOKEN SECRET"
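The handshake itself is then a single call. A minimal sketch, assuming the four variables above already hold your real credentials rather than the placeholder strings:

```r
library(twitteR)

# Assumption: api_key, api_secret, access_token and access_token_secret
# were set above to the values shown in your Twitter app settings.
# Passing all four values performs the OAuth handshake for this R session.
setup_twitter_oauth(api_key, api_secret, access_token, access_token_secret)
```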
And that’s it.
Please use the re-indent option in RStudio after copying this code into the ui.R and server.R files. Before we analyze the code in more detail, this is what the application looks like:
Let’s look at the code now. The UI logic is extremely easy to understand. Each of the widgets used to enter parameters takes its id as the first argument. To refer to it in further actions, you write input$id (for example, input$term). It works like any other variable. In the selectInput, each choice is given as the label shown to the user followed by the value that the selectInput variable will receive if that option is chosen (if you choose English, it’ll receive “en”).
The only thing that needs special mention is the submitButton() function. Unless you include it, the script will process and output the results every time you change the parameters (reactive). The submitButton() function is particularly useful if the process that has to take place is expensive or if the user needs to enter more than one parameter for the whole script to run correctly.
Next, we have server.R, which is a bit more complex. Firstly, all the necessary libraries (apart from shiny) have to be loaded in server.R. It is always advisable, when possible, to load them all together at the top. Also, the server logic itself must be wrapped in shinyServer(function(input, output) { ... }), which comes right after the library calls.
reactive() is a function indicating that whatever is processed inside it will be re-run whenever the parameters change (if you included submitButton() in the UI, whenever you hit it; otherwise, whenever you change the parameters). It is particularly useful for building the raw data that you will process afterwards.
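The original ui.R listing is not reproduced here, so the following is only a sketch of what such a UI could look like. The input ids term, count and lang are the ones used in this post; the output ids wordcl and table, the labels, the default values and the list of languages are assumptions made for illustration.

```r
library(shiny)

# Sketch of a ui.R: three inputs, a submit button, and two outputs.
ui <- fluidPage(
  titlePanel("Twitter word cloud"),
  sidebarLayout(
    sidebarPanel(
      # First argument of every widget is its id, read later as input$<id>
      textInput("term", "Search term", value = "#rstats"),
      numericInput("count", "Number of tweets", value = 100, min = 1, max = 500),
      # Label shown to the user on the left, value received by input$lang on the right
      selectInput("lang", "Language",
                  choices = c("English" = "en", "Spanish" = "es", "German" = "de")),
      # Holds back all reactive updates until the button is pressed
      submitButton("Search")
    ),
    mainPanel(
      plotOutput("wordcl"),   # hypothetical output id for the word cloud
      tableOutput("table")    # hypothetical output id for the tweet table
    )
  )
)

shinyUI(ui)
```

Without the submitButton() line, every keystroke in the search box would trigger a new request.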
Shiny will execute all of these commands if you place them in your server.R script. However, where you place them in server.R will determine how many times they are run (or re-run), which will in turn affect the performance of your app.
Shiny will run some sections of server.R more often than others.
Shiny will run the whole script the first time you call runApp. This causes Shiny to execute shinyServer. shinyServer then gives Shiny the unnamed function in its first argument.
As users change widgets, Shiny will re-run the R expressions assigned to each reactive object. If your user is very active, these expressions may be re-run many, many times a second.
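To make the three levels concrete, here is a small sketch showing where code can sit in server.R and how often it would run. The variable names and the output id info are made up for illustration; only input$term comes from this post.

```r
library(shiny)

# Runs once, when runApp() first sources server.R
app_started <- Sys.time()

server <- function(input, output) {
  # Runs once per user session (each time someone opens the app)
  session_started <- Sys.time()

  output$info <- renderText({
    # Runs every time input$term changes (or Submit is pressed)
    paste("Term:", input$term,
          "| app started:", format(app_started),
          "| session started:", format(session_started))
  })
}

shinyServer(server)
```

Expensive work that does not depend on user input, such as loading libraries or authenticating, is therefore best placed outside the server function.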
In this case, searchTwitter() is called whenever one or more of input$term, input$count or input$lang have changed and the submit button is pressed. It uses the information entered by the user (the first argument is the search term, the second the number of tweets and the third the language) and returns the matching tweets.
As the object returned by searchTwitter() is a bit difficult to handle, it is advisable to turn it into a Data Frame (twListToDF()) if you want to work, for example, with their texts (to make wordcloud). Or you could also try
library(plyr)  # laply() comes from plyr, which is not in the install list above
tweets = laply(tweets, function(t) t$getText())
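Putting the last few paragraphs together, the data-fetching part of server.R could be sketched as follows. This is not the original listing: the names rawData and table are assumptions, and the code expects that setup_twitter_oauth() has already been called.

```r
library(shiny)
library(twitteR)

server <- function(input, output) {
  # Re-evaluated when the inputs change (after Submit, if the UI has a submitButton())
  rawData <- reactive({
    tweets <- searchTwitter(input$term, n = input$count, lang = input$lang)
    twListToDF(tweets)  # list of status objects -> data frame with a "text" column
  })

  # Hypothetical table output showing the text of the first few tweets
  output$table <- renderTable({
    head(rawData()[, "text", drop = FALSE], n = 15)
  })
}

shinyServer(server)
```

Because rawData is a reactive expression, any other output that calls rawData() for the same inputs reuses the cached result, so Twitter is queried only once per search.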
The renderPlot is a bit more sophisticated. Firstly, it takes the tweets and converts their encoding to the native one (enc2native()). Then, it converts everything to lower case. After that, the function removeWords() from the package “tm” is used to delete common words. As you can see in the example, you can pass in whatever word, list of words, regex, etc. you would like to be removed. In this case, the stopwords for the user-entered language (input$lang) are removed, plus the term “rt” as in Re-Tweet. This is particularly useful for a wordcloud (our final objective), as we never want “common words” in it. After that, punctuation is also removed.
Finally, all the tweets are split into a list of words, which is turned into a frequency table and sorted in descending order; the first 50 entries are chosen for the plot.
For the wordcloud function, we enter the labels (the words themselves) of the generated table and the frequencies (the first two arguments in the example). The frequencies determine the size of the words. The last arguments refer to the color order, the palette, and the maximum and minimum size for each word in the plot.
And now you can place the output in the main panel using tableOutput and plotOutput.
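The cleaning steps described above can be tried outside shiny on a few made-up strings. This is a sketch under the assumption that the tweets are already a character vector; the function name tweet_word_freq and the sample texts are hypothetical.

```r
library(tm)

# Turn a character vector of tweets into a frequency table of the top words.
tweet_word_freq <- function(texts, lang = "en", top_n = 50) {
  texts <- enc2native(texts)                              # native encoding
  texts <- tolower(texts)                                 # lower case
  texts <- removeWords(texts, c(stopwords(lang), "rt"))   # stopwords plus "rt"
  texts <- removePunctuation(texts)                       # strip punctuation
  words <- unlist(strsplit(texts, "\\s+"))                # split into words
  words <- words[nzchar(words)]                           # drop empty strings
  freq <- sort(table(words), decreasing = TRUE)           # frequency table, descending
  head(freq, top_n)                                       # keep the top_n words
}

sample_tweets <- c("RT Shiny makes dashboards easy!",
                   "Shiny and R: dashboards for everyone",
                   "I love dashboards")
freq <- tweet_word_freq(sample_tweets)
print(freq)
```

In the app, lang would be input$lang and texts would come from the reactive data.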
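A call matching that description might look like the sketch below. The frequencies are made up, and the exact argument values used in the original app are not shown in this post, so random.order, the Dark2 palette and the scale values are assumptions.

```r
library(wordcloud)
library(RColorBrewer)

# Made-up word frequencies standing in for the table built from the tweets
freq <- c(dashboards = 3, shiny = 2, easy = 1, love = 1)

wordcloud(names(freq), freq,
          min.freq = 1,                    # default is 3, which would hide rare words
          random.order = FALSE,            # most frequent words are placed first, in the centre
          colors = brewer.pal(8, "Dark2"), # colour palette
          scale = c(4, 0.5))               # largest and smallest word size
```

Inside the app, this call would sit in renderPlot() so that the cloud is redrawn for every new search.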
Explore the options, play with this data and read more about shiny here.
This is a comprehensive and complete guide to extracting Twitter data in R and making a web application using shiny. I have tweaked the code a little for deploying the application via GitHub. Here’s the source code: https://github.com/mngujral/twitterFeedShinyApp.
And you can also run my version on your RStudio simply by the command:
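The command itself is missing from this copy of the post. Going only by the repository URL above, and assuming the app files sit at the top level of that repository, shiny’s runGitHub() would be called roughly like this:

```r
library(shiny)

# Assumption: ui.R and server.R are in the root of the repository linked above.
# runGitHub() downloads the repository and launches the app locally.
runGitHub("twitterFeedShinyApp", "mngujral")
```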
Feel free to contact me and/or leave a comment if you have any questions, criticism or corrections.