This is a Text Mining course carefully crafted using Text Mining techniques! Let me elaborate. When I decided to teach a Text Mining course, I wondered what students expect and what pain points they have with current courses. What data source can provide this information? Reviews! I started using course review data to answer questions about course content, student expectations, likes and dislikes, and the pain points students face in completing online Text Mining courses. This exercise was so valuable to my understanding of students like you that I decided to include it in my course. More on this in the course :)

This is a "skill first" and "knowledge later" course. We will do a lot of hands-on coding together (you and I) and minimize the use of PowerPoint slides! I will use slides only to show the course outline and our status as we progress through the course. I take personal responsibility for ensuring that you gain the required knowledge and, most importantly, master the skills you need to start building and deploying text mining applications. I truly believe that this "skill first" approach will be highly engaging for you!

This is not a traditional style of teaching a course! The course is based on live-coding sessions that convey the fundamental ideas of text mining. I will derive each and every concept by hand and show its working using Python programs implemented during the course of your study. You can implement these ideas along with me and thereby gain a deeper sense of text mining, empowering you to build your own products. You will build a search engine and a text summarization tool from scratch in this course (we may use some support: for example, stopwords are already available from the NLTK library, so we need not reinvent them).

This level of depth can be achieved only by sacrifices :) Don't worry, you don't have to sacrifice your weekends yet! The only sacrifice is learning about popular libraries for processing text, which I will not be covering in this course.

How does this course impart the skills you need?

I strongly believe that practice through projects is the only way to master any skill, and yet it is so underutilized in teaching! This course has minimal PowerPoint presentations and focuses entirely on practice right from the beginning, instead of waiting for assignments and projects at the end (hence, no assignments in this course). This is the only course I know of that is crafted using text mining techniques: a great real-world example of the power of text mining to directly address the preferences of students taking text mining courses.

What will you learn in this course?

Introduction: You will get a general introduction to the course structure and teaching style.

Unstructured Data: You will learn about motivational examples of the power of unstructured data and the challenges in processing it.

Python Programming Primer: You will learn the basic programming constructs you need to follow along with the course. You can use this section to understand the basics and prepare yourself to learn the advanced Python needed to write production-quality code.

Text Mining Basics: You will learn the basics of text processing, document representation using the vector space model, and ranking documents for a given query. You will learn to implement these algorithms in Python.

Build a Search Engine: You will build your own search engine using everything you implemented in the previous section. Your search engine will be wrapped as a data service for potential deployment as a product. You will also have the option of adding a user search interface to your search engine!

Deploy your Text Mining Application: You will go from a student skillful in text mining to a professional able to build real-world applications and services using the text mining skills you have picked up in this course.

Build a Text Summarization Tool: You will learn basic text summarization techniques that are crucial for exploring large document collections, and implement code to create a tag cloud in Python. You will also use state-of-the-art work from NLP on embeddings to cluster custom course review data.

Who should avoid taking this course?

I truly value your time and want to be upfront about what this course offers.

Students expecting a knowledge-first approach may not find this course valuable. I will not present a comprehensive, broad view of text mining; instead, I will dig deeper into the basics of text mining.

Students who prefer not to code and build systems may also want to look elsewhere. In almost every video in this course, after explaining the key ideas, we will write code together to internalize them.