This course offers a deep dive into model quantization, specifically focusing on its application in Large Language Models (LLMs). It is tailored for students, professionals, and enthusiasts interested in machine learning, natural language processing, and the optimization of AI models for various platforms. The course covers fundamental concepts, practical methodologies, various frameworks, and real-world applications, providing a well-rounded understanding of model quantization in LLMs.

Course Objectives:
- Understand the basic principles and necessity of model quantization in LLMs.
- Explore different types and methods of model quantization, such as post-training quantization, quantization-aware training, and dynamic quantization.
- Gain proficiency in using major frameworks like PyTorch, TensorFlow, ONNX, and NVIDIA TensorRT for model quantization.
- Learn to evaluate the performance and quality of quantized models in real-world scenarios.
- Master the deployment of quantized LLMs on both edge devices and cloud platforms.

Course Structure:

Lecture 1: Introduction to Model Quantization
- Overview of model quantization
- Significance in LLMs
- Basic concepts and benefits

Lecture 2: Types and Methods of Model Quantization
- Post-training quantization
- Quantization-aware training
- Dynamic quantization
- Comparative analysis of each type

Lecture 3: Frameworks for Model Quantization
- PyTorch's quantization tools
- TensorFlow and TensorFlow Lite
- ONNX quantization capabilities
- NVIDIA TensorRT's role in quantization

Lecture 4: Evaluating Quantized Models
- Performance metrics: accuracy, latency, and throughput
- Quality metrics: perplexity, BLEU, ROUGE
- Human evaluation and auto-evaluation techniques

Lecture 5: Deploying Quantized Models
- Strategies for edge device deployment
- Cloud platform deployment: OpenAI and Azure OpenAI
- Trade-offs, benefits, and challenges in deployment

Target Audience:
- AI and Machine Learning enthusiasts
- Data Scientists and Engineers
- Students in Computer Science and related fields
- Professionals in AI and NLP industries