O'Reilly - Hardcore Data Science NYC 2014
Released June 2015 | ISBN: 9781491931073
Push the envelope of data science by exploring emerging topics such as data management, machine learning, natural language processing, crowdsourcing, and algorithm design with this O'Reilly video collection, taken from the Hardcore Data Science sessions at Strata + Hadoop World 2014 in New York.

This video collection includes:

Doing the Impossible (Almost)
Ted Dunning, Chief Application Architect, MapR Technologies
Computing quantities such as medians or the number of unique elements usually requires a lot of time, a lot of memory, or both. But not always. Ted describes how these algorithms can be much simpler, and shows you how to apply them to applications like anomaly detection.

Tupleware: Redefining Modern Analytics
Tim Kraska, Professor, Brown University
Learn about Tupleware, a new system developed at Brown University specifically aimed at the challenges faced by the typical user. Tupleware automatically compiles analytical workflows into highly efficient distributed programs instead of interpreting the workflows at run-time.

Data Science for Humans, Not Robots
Alice Zheng, Director of Data Science, Dato
Data is intended for human consumption, yet it is governed, analyzed, and processed by machines. In this session, you'll take the perspective of how data appears to machines in order to become more effective at using machines to model and analyze data for people.

Big Data: Efficient Collection and Processing
Anna Gilbert, Professor, University of Michigan
You could spend your time collecting a ton of data from scientific applications, but there are more efficient ways to answer questions of interest. In this session, you'll learn how to acquire data in summarized or compressed measurements.

Computational Problems in Managing Social Information
Jon Kleinberg, Professor, Cornell University
Social media networks aren't just venues for people to come together; they're also explicitly designed environments whose architectures serve to shape behavior. You'll learn about several computational challenges that illustrate this tension between organic interaction and algorithmic design.

Small Data Problems
Kira Radinsky, CTO, SalesPredict
What if you don't have enough data and still want to make predictions? Small data brings a completely different set of problems than big data. Instead of dealing with scale and efficiency, the game here is to draw statistically significant results from very few noisy examples.

Building and Deploying Large-scale Machine Learning Pipelines Using the Berkeley Data Analytics Stack
Ben Recht, Assistant Professor, University of California, Berkeley
Focus on scalable computational tools for large-scale data analysis, statistical signal processing, and machine learning. Ben explores the intersections of convex optimization, mathematical statistics, and randomized algorithms.

Learning About Music and Listeners
Brian Whitman, Principal Scientist, Spotify
Understand how services such as Spotify merge machine-learning and knowledge-based approaches to music understanding with unprecedented amounts of user activity data to unlock the meaning of music taste and preference at a large scale.

Statistical Topic Modeling
Hanna Wallach, Researcher & Professor, Microsoft Research NYC & University of Massachusetts Amherst
Understand how this state-of-the-art machine-learning framework helps you analyze massive document collections. Statistical topic models automatically infer groups of semantically related words (topics) from word co-occurrence patterns in documents without human intervention.

The Aha! Moment: From Data to Insight
Dafna Shahaf, Postdoctoral Fellow, Stanford University
Large-scale data has the potential to transform almost every aspect of our world, from science to business. But for this potential to be realized, we must turn data into insight. In this talk, Dafna describes two of her efforts to address this problem computationally.

- Introduction - Hardcore Data Science NYC 2014 - Ben Lorica 00:01:07
- Doing the Impossible (Almost) - Ted Dunning 00:24:22
- Tupleware: Redefining Modern Analytics - Tim Kraska 00:29:05
- Data Science for Humans, Not Robots - Alice Zheng 00:22:39
- Big Data: Efficient Collection and Processing - Anna Gilbert 00:42:22
- Computational Problems in Managing Social Information - Jon Kleinberg 00:51:26
- Small Data Problems - Kira Radinsky 00:23:26
- Building and Deploying Large-scale Machine Learning Pipelines Using the Berkeley Data Analytics Stack - Ben Recht 00:28:04
- Learning About Music and Listeners - Brian Whitman 00:29:59
- Statistical Topic Modeling - Hanna Wallach 00:28:48
- The Aha! Moment: From Data to Insight - Dafna Shahaf 00:26:32