This course is a gateway to becoming a data scientist. It teaches the statistical techniques collectively referred to as Exploratory Data Analysis (EDA). Today’s data scientists are also expected to be programmers or application developers, so the course covers both the necessary EDA statistics and the programming and visualization environment provided by the Python programming language and the Apache Zeppelin Integrated Development Environment (IDE).
This course provides, through lecture and lab, key concepts from statistics that are relevant to data science.
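As a rough illustration of the kind of EDA statistics involved, the sketch below computes a basic numeric summary (count, quartiles, mean, standard deviation) using only Python's standard `statistics` module. The `summarize` helper and the sample values are hypothetical and invented for illustration; they are not taken from the course materials.

```python
import statistics


def summarize(values):
    """Return basic EDA summary statistics for a list of numbers.

    Hypothetical helper written for illustration only.
    """
    ordered = sorted(values)
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "count": len(ordered),
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
    }


# Made-up sample data, not from any real data set.
sample = [2, 4, 4, 5, 7, 9, 11]
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In a notebook environment the same summary would typically be paired with a histogram or box plot, since EDA relies on visual inspection as much as on numeric summaries.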
Object-Oriented Meets Functional
Construct elegant class hierarchies for maximum code reuse and extensibility, implement their behavior using higher-order functions, or anything in between.
Users include: Twitter and Foursquare
Apache Zeppelin is an open-source, web-based notebook and Integrated Development Environment (IDE) that enables data-driven, interactive data analytics and collaborative documents with Scala as well as other programming languages and frameworks.
Build a sustainable, scalable solution that enables machine learning, data mining and predictive learning.
Since 1950, computer systems have been developed that progressively improve their performance on specific tasks without being explicitly programmed. Machine learning algorithms can be used to build complex models that lend themselves to prediction.
Users include: the credit card industry, which uses machine learning to establish baseline behavioral profiles for various entities and then find meaningful anomalies, such as the fraudulent use of credit cards or identities.
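The idea of a baseline behavioral profile can be sketched in a few lines of Python. The example below is a deliberately simple statistical rule (a z-score threshold), not a trained machine learning model and not a description of how any card issuer actually works. The function names, the threshold of 3.0, and the transaction amounts are all assumptions made for illustration.

```python
import statistics


def build_profile(amounts):
    """Summarize past transaction amounts as (mean, sample standard deviation)."""
    return statistics.mean(amounts), statistics.stdev(amounts)


def find_anomalies(profile, new_amounts, threshold=3.0):
    """Return amounts lying more than `threshold` standard deviations from the mean.

    The threshold of 3.0 is an assumed, conventional cutoff, not a tuned value.
    """
    mean, stdev = profile
    if stdev == 0:
        # Every past amount was identical, so any different amount stands out.
        return [a for a in new_amounts if a != mean]
    return [a for a in new_amounts if abs(a - mean) / stdev > threshold]


# Made-up purchase history for one hypothetical cardholder.
history = [42.0, 38.5, 51.0, 45.25, 40.0, 47.5, 44.0, 39.75]
profile = build_profile(history)

# Two ordinary purchases and one far outside the baseline.
flagged = find_anomalies(profile, [43.0, 49.0, 950.0])
print(flagged)
```

A production system would profile many more features than the amount alone (merchant, location, time of day) and would usually learn the decision boundary from labeled data rather than fix it by hand.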