
E6893 Big Data Analytics Project Proposal PDF

313 Pages·2015·9.74 MB·English

Preview E6893 Big Data Analytics Project Proposal

E6893 Big Data Analytics Project Proposal
Politics & Analytics

Sanjana Gopisetty - ssg2147
Saad Ahmed - sa3205
Jayni Chopda - jjc2253
Jivtesh Singh - jsc2226

November 19th, 2015

E6893 Big Data Analytics – Lecture 11: Project Proposal © 2015 CY Lin, Columbia University

Motivation

Election season is coming around and our candidates are the hottest topics on social media. So why not use Big Data technology to analyze election trends?

- Which candidate is being talked about the most?
- What are the sentiments toward the candidates and the major parties?
- Where, geographically, are they being talked about the most?
- How do popularity and ranking on social media compare to polling data?
- How do the trends change over time?

Dataset, Algorithms and Tools

Dataset: Twitter API. Track certain keywords and collect data such as user information, geolocation and the tweet text.

Algorithms:
1) Stream live tweets based on keywords using the Twitter API and Node.js. We can also use Flume, Hive and HDFS if the data set is very large.
2) Determine the popularity of each candidate/party based on the number of tweets.
3) Run a keyword-based heat-map query to display, by geolocation, how popular each candidate/party is in a particular area.
4) Apply sentiment analysis to each tweet by computing the average sentiment score of the tweet, then compute the average sentiment score over all the tweets collected. Also run sentiment analysis on Internet data using the Alchemy API.
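Steps 2 and 4 above (tweet counts per candidate, then a per-tweet average score averaged again over all collected tweets) can be sketched roughly as follows. This is only an illustration under stated assumptions: the proposal names Node.js and the Alchemy API, while this sketch uses Python and a toy word lexicon as a stand-in for a real sentiment service. The names `KEYWORDS`, `LEXICON`, `mentions`, `tweet_score` and `summarize` are hypothetical, as are the keywords themselves, since the slides do not say which keywords are tracked.

```python
from collections import defaultdict

# Hypothetical keyword lists; the proposal does not name the tracked keywords.
KEYWORDS = {
    "candidate_a": ["candidate a", "#teama"],
    "candidate_b": ["candidate b", "#teamb"],
}

# Toy lexicon standing in for a real sentiment service (assumption).
LEXICON = {
    "great": 1.0, "love": 1.0, "good": 0.5,
    "bad": -0.5, "hate": -1.0, "terrible": -1.0,
}


def mentions(text):
    """Return the candidates whose keywords occur in the tweet text."""
    lowered = text.lower()
    return [name for name, kws in KEYWORDS.items()
            if any(kw in lowered for kw in kws)]


def tweet_score(text):
    """Average the lexicon scores of the words in one tweet (0.0 if none match)."""
    words = [w.strip(".,!?#") for w in text.lower().split()]
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def summarize(tweets):
    """Count tweets per candidate and average the per-tweet sentiment scores."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for text in tweets:
        score = tweet_score(text)
        for name in mentions(text):
            counts[name] += 1
            totals[name] += score
    return {name: {"tweets": counts[name],
                   "avg_sentiment": totals[name] / counts[name]}
            for name in counts}
```

Calling `summarize` on a list of tweet texts returns one entry per mentioned candidate, with a tweet count (the popularity measure) and an average sentiment. Tweets that match no keyword are ignored.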
Tools: Alchemy API, Twitter API, Google Maps API, Node.js

Current Progress, Schedule and Expected Contributions

Progress:
● Framework discussion related to our proposed project
● Have successfully used the Twitter API to stream data

To-Do List                                 Expected Contributions   Schedule
Collect Tweet Data, Sentiment Analysis     Sanjana and Saad         2 weeks
Heat Map Display, Trend Analysis           Jivtesh and Jayni        2 weeks

References

1. https://dev.twitter.com/rest/public
2. https://developers.google.com/maps/documentation/javascript/examples/layer-heatmap
3. https://en.wikipedia.org/wiki/Sentiment_analysis
4. http://www.alchemyapi.com/api/sentiment/textc.html

THANK YOU
No questions, please! :)


E6893 Big Data Analytics Project Proposal: Uber Max!

Team: Munan Cheng, Lingqiu Jin, Chuwen Xu
UNI: mc4081, lj2379, cx2178

November 19th, 2015

Motivation

Tom has just finished school at 5 p.m. and has to pick his friend up at the airport at 9 p.m. In this period of 4 hours, he plans to make some money as an Uber driver. But now, whom should he offer the ride to?

[Figure: candidate rides between 17:00 and 21:00 around Queens, Jersey City and Brooklyn. Labels on the figure: $41 / 32 min, $40 / 16 min, $62 / 30 min; 54 pl, 47 pl, 25 pl. Open questions shown: Trip time? Fares? Demands?]

Dataset, Algorithms and Tools

Dataset:
● NYC Taxi Data
  o Dataset of taxi trips during the last 7 years
  o http://www.nyc.gov/html/tlc/html/about/statistics.shtml

Algorithms:
● Estimation of:
  o Demand, Fares, Trip Time
  o Time sensitive
● Route planning
  o Dynamic Programming

Tools:
● AWS
● Hadoop + Hive + Mahout
● Neo4J

[Diagram: Input (time constraints; start and end location (pick-up & drop-off)) -> Big Data (fares, trip time, demand) -> Output (preferred next destination that maximizes the total profit!)]

Current Progress, Schedule and Expected Contributions

Schedule periods: 11/15 – 11/21, 11/22 – 11/28, 11/29 – 12/05, 12/06 – 12/12, 12/13 – 12/17

Preparation:
● Taxi data collection
● Data quality study
● Data Infrastructure Setup

Backend:
● Algorithm Design
● Algorithm Implementation, Verification
● Data aggregation
● Backend system implementation

Frontend:
● User Interface design
● Web-based mobile frontend

Demo:
● Debug
● Demo


E6893 Big Data Analytics Project Proposal: Waste Management using Big Data

Hadeel Albahar, Shreya Yathish Kumar, Harnoor Singh Powar

November 19th, 2015
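Looking back at the Uber Max! proposal, its "Route planning / Dynamic Programming" step (pick the next destination that maximizes total profit within a time budget) could look something like the sketch below. This is an assumption-laden illustration, not the team's algorithm. The `TRIPS` table, the "Campus" and "Airport" zone names, and the pairing of fares with routes are all hypothetical. The sketch also assumes a ride is always available and ignores demand and deadheading.

```python
from functools import lru_cache

# Hypothetical estimates: (origin, destination) -> (expected fare in $, trip minutes).
# In the proposal these would be estimated from NYC taxi data; here they are made up.
TRIPS = {
    ("Campus", "Queens"): (41.0, 32),
    ("Campus", "Brooklyn"): (62.0, 30),
    ("Queens", "Airport"): (40.0, 16),
    ("Brooklyn", "Airport"): (30.0, 40),
    ("Brooklyn", "Queens"): (25.0, 20),
}


def best_plan(start, goal, minutes_available):
    """Return (total fare, route) maximizing fare while ending at `goal` in time.

    Returns None if `goal` cannot be reached within the time budget.
    """

    @lru_cache(maxsize=None)
    def solve(zone, minutes_left):
        # Stopping here is only allowed if we are already at the goal.
        best = (0.0, (zone,)) if zone == goal else None
        for (origin, dest), (fare, minutes) in TRIPS.items():
            if origin != zone or minutes > minutes_left:
                continue
            rest = solve(dest, minutes_left - minutes)
            if rest is None:
                continue  # this trip leads to a dead end
            candidate = (fare + rest[0], (zone,) + rest[1])
            if best is None or candidate[0] > best[0]:
                best = candidate
        return best

    return solve(start, minutes_available)
```

The state is the pair (current zone, minutes left), so each subproblem is solved once and reused. The second element of the returned route is the "preferred next destination" from the proposal's output.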
