Hands-On Data Science with R
Dr Graham Williams, PhD (ANU, Machine Learning), BSc (Maths, Hons)
Chief Data Scientist, Togaware and Australian Taxation Office
Adjunct Professor, Australian National University and University of Canberra
International Visiting Professor, Chinese Academy of Sciences
Our goal here is to provide introductory material to cost-effectively kick-start an organization's entry into Data Science.
In addition to the extensive material available here on our web site, we provide a unique offering of in-situ hands-on training. We also offer traditional out-of-office training courses, but we find that learning is more effective when it happens hands-on and in-situ. One of the world's leading Data Scientists works alongside and mentors your staff over one or two weeks. We work confidentially on actual projects, with on-the-job training provided by a professional with 30 years' experience in the industry who is also the author of the best-selling book Data Mining with Rattle and R. Contact Togaware Training at firstname.lastname@example.org for details.
Our on-line resources, including OnePageR, weave together a collection of freely available and open source tools for the Data Scientist. The tools are all part of the R Statistical Software Suite. Each chapter is made up of multiple pages, and each page within a chapter is a one-page guide that covers a particular aspect of the topic. They are a great place to start before engaging our hands-on training experts.
The OnePageRs can be worked through as a hands-on guide and then used as a reference guide. Each page aims to be a bite-sized chunk for hands-on learning, building on what has gone before. Many chapters also have a lecture pack and a laboratory session in which a number of tasks can be completed. The R code behind each OnePageR chapter is also provided and can easily be run standalone to replicate the material presented in the chapter.
The material is always under development! Chapters will change (and hopefully improve) regularly. Links preceded with a * are better developed. All of the material is provided at no cost under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, allowing access to everyone for any purpose except commercial use. You can help cover the costs of providing this material through a $40 contribution using PayPal. Your support encourages further development of this resource, as do feedback, suggestions, and ideas, which are always welcome.
Refer to the Data Mining Survival Guide or my book on Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery (Use R) for related material.
Many of the initial chapters were developed and tested whilst visiting the Shenzhen Institutes of Advanced Technology as an International Visiting Professor of the Chinese Academy of Sciences.
The data used across the chapters is available for download as data.zip.
- Getting Started as a Data Scientist
- An Introduction to Data Mining: *Lecture
- Introducing Data Science, Rattle and R: *Lecture - *OnePageR - *R
- Rattle to R: *OnePageR - *R
- R for the Eager Data Scientist
- A Template for Preparing Data: *OnePageR - *R
- A Template for Building Models: *OnePageR - *R
- Case Studies: *OnePageR - *R
- Basic R Tips and Tricks: OnePageR - R
- Dealing With Data
- Reading Data into R: *OnePageR - *R
- Exploring and Summarising Data: *OnePageR - *R
- Visualising Data with GGPlot2: *OnePageR - *R
- Transforming Data: *OnePageR - *R
- Descriptive Analytics
- Predictive Analytics
- Decision Trees: *Lecture - *OnePageR - *R - *Rattle
- Ensembles of Decision Trees: *Lecture - *OnePageR - *R
- Support Vector Machines
- Neural Networks
- Naive Bayes: OnePageR - R
- Evaluating Models: *OnePageR - *R
- Scoring (R)
- PMML (R) Exporting Models for Deployment
- Advanced Analytics
- Advanced R
- Strings: OnePageR - R
- Dates and Time: *OnePageR - *R
- Spatial Data: *OnePageR - *R
- Big Data: *OnePageR - *R
- Exploring Different Plots: OnePageR - R
- Writing Functions: OnePageR - R
- Parallel Processing: OnePageR - R
- Environments: *OnePageR - R
- Expert R
Other great resources for modular approaches to learning R include:
Other Togaware resources:
- CUNY NSF Workshop - March 2014 (PDF)
- AusDM-2013 Tutorial - November 2013
- IDEAL-2013 Tutorial - October 2013
Local package archive:
- install.packages("rattle", repos="http://rattle.togaware.com", type="source")
- install.packages("wsrf", repos="http://rattle.togaware.com", type="source")
- install.packages("wsrpart", repos="http://rattle.togaware.com", type="source")
- install.packages("wskm", repos="http://rattle.togaware.com", type="source")
OnePageR by Graham Williams is Copyright © 2012 – 2014 Togaware Pty Ltd
Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This site is hosted in the cloud by Web Faction.
Last Modified 2014-07-27 19:10:34 gjw