“The enjoyment of one’s tools is an essential ingredient of successful work.” Donald E. Knuth

This book is undergoing conversion from LaTeX to Bookdown. It is a work in progress and there remain numerous glitches. Please bear with us.

Welcome to the world of Data Science. Data Science has come to describe many different activities that are driven by data. Indeed, we can think of Data Science as an ecosystem of both technology and experiences, shared through the freedom of open source: the freedom to do as you want with the software, with very few, and ideally no, restrictions, except the requirement to maintain our freedoms.

The aim of this book is to gently guide the novice along the pathway to Data Science, and to share the excitement of a fun and productive environment for exploring data. To do so, we focus on the R language as our platform of choice, knowing that R remains the most flexible and powerful language developed specifically for the analysis of data.

This book provides a guide to the many different regions of the R platform, with a focus on doing what is required of the Data Scientist. It is comprehensive, beginning with basic support for the novice Data Scientist and moving on to recipes for the variety of analyses we may find ourselves needing.

The book covers many of the core and advanced features of R. With knowledge comes the enjoyment of using one of the most comprehensive and useful platforms for Data Science in the history of computing.

The basis of our ecosystem is freedom: the freedom to choose, the freedom to learn, the freedom to change, the freedom to share, the freedom to contribute, the freedom to live, the freedom to enjoy, and the freedom to communicate through the code and tools we write. Never let that freedom be taken from us.

On completing an install of R (which may take only a few minutes) you are ready to explore your data. The different types of data analysis are covered in this book, including basic data ingestion, data cleaning and wrangling, data visualisation, and modelling and evaluation, all with the aim of discovering new knowledge from our data. Tools for developers of systems to be deployed are well represented, and indeed it is worth noting that serious Data Scientists use R as their platform of choice.

Copyright © 1995-2021 Creative Commons Attribution-ShareAlike 4.0.