1. Introduction; 2. Model-based clustering: basic ideas; 3. Dealing with difficulties; 4. Model-based classification; 5. Semi-supervised clustering and classification; 6. Discrete data clustering; 7. Variable selection; 8. High-dimensional data; 9. Non-Gaussian model-based clustering; 10. Network data; 11. Model-based clustering with covariates; 12. Other topics; List of R packages; Bibliography; Index.
A colorful, example-rich introduction to the state of the art for students in data science, as well as researchers and practitioners.
Charles Bouveyron is Full Professor of Statistics at Université Côte d'Azur and the Chair of Excellence in Data Science at Institut National de Recherche en Informatique et en Automatique (INRIA), Rocquencourt. He has published extensively on model-based clustering, particularly for networks and high-dimensional data. Gilles Celeux is Director of Research Emeritus at Institut National de Recherche en Informatique et en Automatique (INRIA), Rocquencourt. He is one of the founding researchers in model-based clustering, having published extensively in the area for thirty-five years. T. Brendan Murphy is Full Professor in the School of Mathematics and Statistics at University College Dublin. His research interests include model-based clustering, classification, network modeling and latent variable modeling. Adrian E. Raftery is the Boeing International Professor of Statistics and Sociology at the University of Washington. He is one of the founding researchers in model-based clustering, having published in the area since 1984.
'Bouveyron, Celeux, Murphy, and Raftery pioneered the theory,
computation, and application of modern model-based clustering and
discriminant analysis. Here they have produced an exhaustive yet
accessible text, covering both the field's state of the art as well
as its intellectual development. The authors develop a unified
vision of cluster analysis, rooted in the theory and computation of
mixture models. Embedded R code points the way for applied readers,
while graphical displays develop intuition about both model
construction and the critical but often-neglected estimation
process. Building on a series of running examples, the authors
gradually and methodically extend their core insights into a
variety of exciting data structures, including networks and
functional data. This text will serve as a backbone for graduate
study as well as an important reference for applied data scientists
interested in working with cutting-edge tools in semi- and
unsupervised machine learning.' John S. Ahlquist, University of
California, San Diego
'This book, written by authoritative experts in the field, gives a
comprehensive and thorough introduction to model-based clustering
and classification. The authors not only explain the statistical
theory and methods, but also provide hands-on applications
illustrating their use with the open-source statistical software R.
The book also covers recent advances made for specific data
structures (e.g. network data) or modeling strategies (e.g.
variable selection techniques), making it a fantastic resource as
an overview of the state of the field today.' Bettina Grün,
Johannes Kepler Universität Linz, Austria
'Four authors with diverse strengths nicely integrate their
specialties to illustrate how clustering and classification methods
are implemented in a wide selection of real-world applications.
Their inclusion of how to use available software is an added
benefit for students. The book covers foundations, challenging
aspects, and some essential details of applications of clustering
and classification. It is a fun and informative read!' Naisyin
Wang, University of Michigan
'This is a beautifully written book on a topic of fundamental
importance in modern statistical science, by some of the leading
researchers in the field. It is particularly effective in being an
applied presentation - the reader will learn how to work with real
data and at the same time clearly presenting the underlying
statistical thinking. Fundamental statistical issues like model and
variable selection are clearly covered as well as crucial issues in
applied work such as outliers and ordinal data. The R code and
graphics are particularly effective. The R code is there so
you know how to do things, but it is presented in a way that does
not disrupt the underlying narrative. This is not easy to do. The
graphics are 'sophisticatedly simple' in that they convey complex
messages without being too complex. For me, this is a 'must have'
book.' Rob McCulloch, Arizona State University
'This advanced text explains the underlying concepts clearly and is
strong on theory … I congratulate the authors on the theoretical
aspects of their book, it's a fine achievement.' Antony Unwin,
International Statistical Review
'In my opinion, the overall quality of this impactful and
intriguing book can be expressed by concluding that it is a perfect
fit to the Cambridge Series in Statistical and Probabilistic
Mathematics, characterized as a series of high-quality
upper-division textbooks and expository monographs containing
applications and discussions of new techniques while emphasizing
rigorous treatment of theoretical methods.' Zdenek Hlavka,
MathSciNet
'… this book not only gives the big picture of the analysis of
clustering and classification but also explains recent
methodological advances. Extensive real-world data examples and R
code for many methods are also well summarized. This book is highly
recommended to students in data science, as well as researchers and
data analysts.' Li-Pang Chen, Biometrical Journal
'Model-Based Clustering and Classification for Data Science: With
Applications in R, written by leading statisticians in the field,
provides academics and practitioners with a solid theoretical and
practical foundation on the use of model-based clustering methods …
this book will serve as an excellent resource for quantitative
practitioners and theoreticians seeking to learn the current state
of the field.' C. M. Foley, Quarterly Review of Biology
'This book frames cluster analysis and classification in terms of
statistical models, thus yielding principled estimation, testing
and prediction methods, and sound answers to the central questions
… Written for advanced undergraduates in data science, as well as
researchers and practitioners, it assumes basic knowledge of
multivariate calculus, linear algebra, probability and statistics.'
Hans-Jürgen Schmidt, zbMATH