Chapter 1: Natural Language Basics
Chapter Goal: Introduces readers to the basics of NLP and text processing.
No. of pages: 40-50
Sub-topics:
1. Language syntax and structure
2. Text formats and grammars
3. Lexical and text corpora resources
4. Deep dive into the WordNet corpus
5. Parts of speech, stemming and lemmatization

Chapter 2: Python Refresher for Text Analytics
Chapter Goal: A useful chapter for people who do not know Python, as well as for experienced readers who can use it as a quick reference for commands and techniques for text processing in Python.
No. of pages: 30-35
Sub-topics:
1. Python data structures and constructs
2. Functions, conditionals and code flow
3. Handling strings with Python
4. Regular expressions with Python
5. Quick glance into nltk, gensim and pattern

Chapter 3: Text Processing
Chapter Goal: This chapter covers the techniques and capabilities needed for processing and parsing text into easy-to-understand formats. We also look at how to segment and normalize text.
No. of pages: 35-40
Sub-topics:
1. Sentence and word tokenization
2. Text tagging and chunking
3. Text parse trees
4. Text normalization
5. Text spell checks and removal of redundant characters
6. Synonyms and synsets

Chapter 4: Text Classification
Chapter Goal: Introduces readers to the concept of classification as a supervised machine learning problem and looks at a real-world example of classifying text documents.
No. of pages: 40-45
Sub-topics:
1. Classification basics
2. Types of classifiers
3. Feature generation of text documents
4. Types of feature generators
5. Building a text classifier on real-world data
6. Evaluating classifiers
7. Binary and multi-class classification models

Chapter 5: Text Summarization and Topic Modeling
Chapter Goal: Introduces the concepts of text summarization, n-gram tagging analysis and topic models, and works through hands-on implementations of them on real-world datasets.
No. of pages: 40-45
Sub-topics:
1. Text summarization concepts
2. Dimensionality reduction
3. N-gram tagging models
4. Topic modeling using LDA and LSA
5. Generating topics from real-world data
6. N-gram analysis to generate patterns from app reviews

Chapter 6: Text Clustering and Similarity Analysis
Chapter Goal: We look at unsupervised machine learning concepts such as text clustering and similarity measures.
No. of pages: 35-40
Sub-topics:
1. Clustering concepts
2. Analyzing text similarity
3. Implementing text similarity with cosine and Jaccard measures
4. Text clustering algorithms
5. Hands-on text clustering on real-world data

Chapter 7: Sentiment Analysis
Chapter Goal: We look at solving the popular problem of analyzing sentiment in text using a combination of methods learnt earlier, including classification and lexical analysis.
No. of pages: 35-40
Sub-topics:
1. What is sentiment analysis?
2. Looking at lexical corpora for sentiment
3. Analyzing sentiment using lexical analysis (hands-on)
4. Building a sentiment analysis classifier (hands-on)

Dipanjan Sarkar is a Data Scientist at Intel, the world's largest silicon company, which is on a mission to make the world more connected and productive. He primarily works on analytics, business intelligence, application development and building large-scale intelligent systems. He received his master's degree in Information Technology from the International Institute of Information Technology, Bangalore, with a focus on data science and software engineering. He is also an avid supporter of self-learning, especially Massive Open Online Courses, and holds a Data Science Specialization from Johns Hopkins University on Coursera.

He has been an analytics practitioner for over 4 years, specializing in statistical, predictive and text analytics. He has also authored a couple of books on R and machine learning, occasionally reviews technical books, and acts as a course beta tester for Coursera. Dipanjan's interests include learning about new technology, financial markets, disruptive start-ups, data science and, more recently, artificial intelligence and deep learning. In his spare time he loves reading, gaming and watching popular sitcoms and football.