In my previous post, I explained the general principle behind using Google Trends' weekly and daily data to create daily time series longer than 90 days. Here, I provide the steps to take in R to achieve the same results.
#Start by copying these functions into R.
#Then run the following code:
#NB! In order for the code to run properly, you will have to specify the download directory of your default browser (downloadDir)
downloadDir="C:/downloads"
url=vector()
filePath=vector()
adjustedWeekly=data.frame()
keyword="google trends"
#Create URLs to daily data
for(i in 1:12){
url[i]=URL_GT(keyword, year=2013, month=i, length=1)
}
#Download
for(i in 1:length(url)){
filePath[i]=downloadGT(url[i], downloadDir)
}
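#Read the downloaded reports into one data frame and sort by date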
dailyData=readGT(filePath)
dailyData=dailyData[order(dailyData$Date),]
#Get weekly data
url=URL_GT(keyword, year=2013, month=1, length=12)
filePath=downloadGT(url, downloadDir)
weeklyData=readGT(filePath)
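#Merge the daily and weekly series; the extra columns will hold the adjustment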
adjustedDaily=dailyData[1:2]
adjustedDaily=merge(adjustedDaily, weeklyData[1:2], by="Date", all=T)
adjustedDaily[4:5]=NA
names(adjustedDaily)=c("Date", "Daily", "Weekly", "Adjustment_factor", "Adjusted_daily")
#Adjust for date mismatch: carry the last daily value forward to dates that appear only in the weekly series
#(starting at row 2 avoids an out-of-bounds index if the very first row has no daily value; such a row is trimmed below anyway)
for(i in 2:nrow(adjustedDaily)){
if(is.na(adjustedDaily$Daily[i])) adjustedDaily$Daily[i]=adjustedDaily$Daily[i-1]
}
#Create adjustment factor
adjustedDaily$Adjustment_factor=adjustedDaily$Weekly/adjustedDaily$Daily
#Remove data before first available adjustment factor
start=which(is.finite(adjustedDaily$Adjustment_factor))[1]
stop=nrow(adjustedDaily)
adjustedDaily=adjustedDaily[start:stop,]
#Fill in missing adjustment factors
for(i in 1:nrow(adjustedDaily)){
if(is.na(adjustedDaily$Adjustment_factor[i])) adjustedDaily$Adjustment_factor[i]=adjustedDaily$Adjustment_factor[i-1]
}
#Calculate adjusted daily values
adjustedDaily$Adjusted_daily=adjustedDaily$Daily*adjustedDaily$Adjustment_factor
#Plot the results
library(ggplot2)
ggplot(adjustedDaily, aes(x=Date, y=Adjusted_daily))+geom_line(col="blue")+ggtitle("SVI for Google Trends")
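If you also want the adjusted values as a table rather than just a plot, a minimal export sketch (the file name here is just an example):
#Save the adjusted series next to the downloaded reports
write.csv(adjustedDaily[c("Date", "Adjusted_daily")], file.path(downloadDir, "adjusted_daily.csv"), row.names=FALSE)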
Comments:
Erik, thanks for the new code, which works great, except that you forgot to set the working directory:
downloadDir="C:/Users/Dean/Downloads"
setwd(downloadDir)
Without that line of code, your program will not work because it won't be able to find the CSV file.
Dear Erik,
I slightly modified your code to download data from 2004 up to 2014, but after downloading the file
report (100).csv
it is not able to proceed, and the browser continuously asks for the file to be saved somewhere else (i.e. "Save as").
Do you think it is possible to modify your original functions to delete the
report.csv file
as soon as its data are saved in R?
Regards, Dean
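One simple option, assuming downloadGT returns the paths of the saved files (as in the code above), is to delete each report as soon as it has been read:
dailyData=readGT(filePath)
#The reports are no longer needed once their contents are in R
file.remove(filePath)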
For some reason that I am trying to figure out, your code is unable to save these files in the directory:
report (101).csv
report (102).csv
...
and the like. I have tried to find whether you put some limit in your R functions, but I did not find any. Or is it some limit of Google itself?
I am thinking of removing the intermediate data using
file.remove(file.path(downloadDir, list.files(downloadDir)))
I hope that in the next few days I will find at least an intermediate solution.
This command is even more precise, removing only the CSV files (and not the rest); note that Sys.glob("*.csv") resolves relative to the current working directory, so run setwd(downloadDir) first:
file.remove(Sys.glob("*.csv"))
Hi Dean,
I haven't encountered that problem myself, and it sounds like an issue with the browser. Which one are you using?
I use Firefox together with Tab Mix Plus, which allows me to automate the downloads completely.
I use Chrome
Dear Erik,
Thanks for the code. After I had downloaded 1,500 CSV files, Google kept saying "you have reached your quota limit" and wouldn't allow me to download any more. I had to wait until the next day to continue. Since I have about 70,000 files to download, could you provide a solution for that? Thank you!
Dear Haohan,
That limit cannot be overcome. The only thing you can do is create enough Google accounts to download your desired number of CSV files.
Regards, Dean
I'm afraid that having multiple Google accounts might not be enough either. I've used a VPN myself to double the file limit. Unfortunately, there's no really good way to get around this problem.
Dear Erik,
Yes, changing accounts cannot solve the problem. Actually, I tried using different accounts on different computers in our library, but it is still blocked by Google. It seems that once an IP address has reached its limit, other similar IPs will also be blocked by Google. I can manually download the data on the other computers, but I cannot use your code to download. Could you provide more detailed information about how to use a VPN to overcome the problem? Thank you!
Hi everyone! I have a question: does anyone know what to modify in order to combine the data of not just one year, but of 10?
And also, besides the plot that is finally obtained, how could I obtain a table with the final, adjusted values that were plotted?
Thanks a lot in advance for your answers!
*PS: I am just an R beginner, so I have many doubts about this :o
For those having trouble downloading more than 100 files using Chrome, I suggest changing your browser to Firefox. You can tell R which browser to use with this line of code:
options(browser="/usr/bin/open -a Firefox") #on macOS
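On Windows, the same option takes the path to the browser executable instead (the path below is an assumption; adjust it to your installation):
options(browser="C:/Program Files/Mozilla Firefox/firefox.exe")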
Is there a way to make a loop to download daily data for a search query from 2004 to the present?
Yeah, that's easy with R. Just create a sequence of months (1, 4, 7, 10) and years, and run the URL_GT function over all of those.
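A minimal sketch of such a loop, reusing the URL_GT, downloadGT, and readGT functions from the post (and assuming URL_GT accepts length=3 for a three-month window, which stays within the limit for daily data):
urls=vector()
k=1
for(y in 2004:2014){
for(m in c(1, 4, 7, 10)){
urls[k]=URL_GT(keyword, year=y, month=m, length=3)
k=k+1
}
}
filePath=vector()
for(i in 1:length(urls)){
filePath[i]=downloadGT(urls[i], downloadDir)
}
dailyData=readGT(filePath)
dailyData=dailyData[order(dailyData$Date),]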
Can you give me the code to select a daily time series from 8/8/2008 to date, for the USA only, with the keyword DJI? How do I add that to the function? Give me the code in a single block and I'll change it for other queries. Can you do that?
I tried to use this code to get data, but it didn't work. Using a Firefox browser, I got this message:
404. That’s an error.
The requested URL was not found on this server. That’s all we know.
Help please.
Thanks