1. R and the tidyverse, 2. Reading in data locally and from the web, 3. Cleaning and wrangling data, 4. Effective data visualization, 5. Classification I: training & predicting, 6. Classification II: evaluation & tuning, 7. Regression I: K-nearest neighbors, 8. Regression II: linear regression, 9. Clustering, 10. Statistical inference, 11. Combining code and text with Jupyter, 12. Collaboration with version control, 13. Setting up your computer
Tiffany Timbers is an Assistant Professor of Teaching in the Department of Statistics and Co-Director for the Master of Data Science program (Vancouver Option) at the University of British Columbia.
Trevor Campbell is an Assistant Professor in the Department of Statistics at the University of British Columbia.
Melissa Lee is an Assistant Professor of Teaching in the Department of Statistics at the University of British Columbia.
'Many students leave school with a thorough understanding of core
statistical theories and machine learning algorithms but a limited
sense for how to put these ideas into practice. Real data science
work entails a far broader set of skills including communication,
collaboration, technical project management, and rapid iteration.
Data Science: a First Introduction targets this gap by previewing
this broader set of topics. By including less often discussed
concepts like version control and modeling pipelines, that are
often neglected at the introductory level, this book will help
students build the right 'muscles' from the beginning of their
studies and convert their knowledge into practice.' -Emily Riederer,
Capital One

This book provides a sophisticated first introduction to
the field of data science and provides a balanced mix of practical
skills along with generalizable principles. As we continue to
introduce students to data science and train them to confront an
expanding array of data science problems, they will be well-served
by the ideas presented here. -Roger Peng, Johns Hopkins University
(from the Foreword)

[…] The authors provide a friendly, effective
on-ramp to programmatic data analysis with R and the tidyverse. I
appreciate the coverage of critical practical matters, which are
often neglected or written off as “out of scope”, such as
navigating the file system, developing a sustainable workflow, and
using version control. […] Although it’s aimed at an introductory
level, more experienced readers will enjoy dipping into this book
for accessible content on a variety of modern data science tools
and topics. -Jenny Bryan, RStudio

This book offers a clear,
thoughtful, and systematic treatment of the fundamentals of data
science, with accompanying R code. As its name implies, it is truly
an introduction, and is suitable for those who wish to self-teach R
and data science, as well as to college instructors teaching a
first course in data science. With a diverse set of topics […] this
book is a one-stop shop that will be a valuable resource for years
to come. -Daniela Witten, University of Washington

This book is a
comprehensive introduction to data science […]. In addition to data
wrangling and visualization with the tidyverse, the book also
provides a deep dive into statistical modeling and inference with
the tidymodels framework, which makes this book an incredibly
valuable addition to the landscape of introductory data science
books. -Mine Çetinkaya-Rundel, Professor of the Practice at Duke
University and Educator at RStudio

The authors of this new textbook
are expert teachers as well as data scientists, and that expertise
is reflected in each chapter and every exercise. Topics are
introduced in a digestible order, examples are approachable and
well-motivated, and all the code is presented in digestible,
carefully-explained pieces. If you are using R to introduce
students to reproducible quantitative analysis, this "First
Introduction" should be your first choice. -Greg Wilson, Third-Bit
Inc.

[…] This book starts off by working with data and visualizing
it, then levels up quickly to high impact topics like predictive
modeling, inference, and collaboration with version control. […]
These are topics that are tricky to squeeze into an intro text. But
this book is not intimidating – each topic is framed in a way that is
approachable yet advanced, and the authors give readers a lot of
support along the way. […] This book has also been field-tested by
a highly respected data science education program at University of
British Columbia[…], making it an ideal resource that educators can
trust and rely on to freshen up their own materials and
workflows. -Alison Hill, IBM

'This book is superb. It is written in a
lively and engaging style, which grabs the reader’s attention. It
sees the goal of data analysis as that of finding answers to study
questions, which in turn can lead to knowledge discovery. That
paradigm for inquiry is well demonstrated by step-by-step
demonstrations using novel, interesting datasets e.g. concerning
indigenous languages. For that reason, I would strongly recommend
the book for self-study, as well as for courses on data analysis.'
-Jim Zidek, University of British Columbia

'The book is well written,
organized, and focused. Readers will appreciate the level of detail
given and the intuitive explanations and graphics. I applaud the
authors for writing such an excellent introductory text.' -Adam Loy,
Carleton University

'More than anything else, what I like about this
book is the thoughtful ordering of the chapters… the mindful order
in which topics are presented in this textbook aligns with how I
think they should be taught to the students of today. This is in
contrast to how these topics were learned by the textbook authors
of today who were the students of yesterday. As textbook authors,
it’s hard to break free of this “curse of knowledge” and cover
topics from the perspective of someone starting with a clean slate.
This book breaks free of this curse and presents the freshest
perspective in introductory data science I've seen to date.'
-Albert Kim, Smith College

'I made it 10 per cent of the way through
Timbers et al before I learnt something new. Frankly I was
surprised I made it so far. Data science pedagogy has been so
disjoint and so many of us are self-taught that it is refreshing to
have a class-room-tested textbook that is focused on workflows and
reproducibility. The approaches are rigorous and opinionated, and
the text is filled with kindness and warmth. It is the book that I
wish I had when I first came to learn this material. The book is
unashamedly focused on the newest innovations including
`tidymodels` and the native pipe operator, and I soon found myself
learning things, on average, at roughly one-thing-per-page, which
was an exciting experience for someone who spends his days doing
and teaching data science in R. This is a text that I can see
myself coming back to regularly, not just in my teaching, but as a
reference. I am hopeful that the authors will go on to write "Data
Science", and "Advanced Data Science", without too much delay!'
-Rohan Alexander, University of Toronto

"The book first introduces readers to the R programming language
and Tidyverse, the highly popular and freely available set of
functional packages for using R. The R language and related methods
are used throughout the book in hands-on examples that dare readers
to jump in and write their own code... The final three chapters
focus on practical matters that may be important for students who
lack a computer science or other technical background, including
how to work in notebooks, using version control (focused on
GitHub), and installing the basic software used in the book. These
chapters open the book up to a broader audience, perhaps including
general readers with minimal technical expertise. Undergraduates
interested in studying data science will find this book useful." -K.
J. Whitehair, Independent Scholar, CHOICE Connect.