Entropy & Data Compression


The module is available as a YouTube playlist and as a short course with quizzes on Learning Hub (syllabus below).


Overview

Learn about the quantitative treatment of information first proposed in Claude Shannon's groundbreaking work: an introduction to the concepts of entropy, data compression, and channels.

Week 1: The Building Blocks of Information
The first week of the entropy module starts the discussion of using bits as a measure of information and introduces the key concept of entropy, a value that captures the amount of uncertainty inherent in a given random variable. The class then looks at the problem of encoding messages and two corresponding ideas: Kraft's Inequality and Shannon's First Theorem. Finally, the course bounds the efficiency of encoding schemes by calculating entropy and works through the solutions to a few example problems to illustrate the concepts presented in this class.
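As a flavor of the Week 1 material, the sketch below computes the entropy of a distribution and checks Kraft's Inequality for a set of codeword lengths. It is an illustrative Python example, not code from the course; the function names are our own.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H(X) = -sum p * log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def kraft_sum(lengths):
    """Kraft's Inequality: sum 2^(-l_i) <= 1 for any prefix-free code."""
    return sum(2 ** -l for l in lengths)

# A fair coin carries 1 bit of entropy; a biased coin carries less.
print(entropy([0.5, 0.5]))      # 1.0
print(entropy([0.9, 0.1]))      # ~0.469

# Codeword lengths {1, 2, 3, 3} give a Kraft sum of exactly 1.0, so a
# prefix-free code with these lengths exists, e.g. 0, 10, 110, 111.
print(kraft_sum([1, 2, 3, 3]))  # 1.0
```

Shannon's First Theorem then guarantees that the average codeword length of the best prefix-free code sits between the entropy and the entropy plus one bit.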

Week 2: Data Compression
The second week of the entropy module ends the theoretical discussion of encoding messages and introduces frequently used methods of data compression. The lectures introduce Huffman Coding and work through several examples, describing the advantages and limits of the compression scheme. The videos briefly mention Universal Coding as an example of Shannon's First Theorem before introducing the Lempel-Ziv compression scheme. Upon completing this week, students should be able to efficiently encode messages based on the probability distribution of each message and understand the trade-off between efficiency and code length.
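The following is a minimal sketch of Huffman Coding in the spirit of the Week 2 lectures; it is an illustrative Python implementation, not code from the course. It greedily merges the two least-frequent subtrees until one tree remains, then reads codewords off the branches.

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Build a Huffman code from a symbol -> frequency mapping.

    Repeatedly merges the two least-frequent subtrees; the resulting
    prefix-free code has average length within 1 bit of the entropy.
    """
    # Heap entries: (frequency, tiebreaker, tree), where a tree is
    # either a bare symbol (leaf) or a (left, right) pair.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (left, right)))
        count += 1
    # Walk the tree, assigning 0 to left branches and 1 to right.
    code = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            code[tree] = prefix or "0"  # single-symbol edge case
    _, _, root = heap[0]
    walk(root, "")
    return code

code = huffman_code(Counter("abracadabra"))
print(code)  # the most frequent symbol, 'a', gets the shortest codeword
```

Note the design choice of a unique integer tiebreaker in each heap entry: it keeps the heap from ever comparing two trees directly when frequencies tie.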

Week 3: Channels
The third week of the entropy module completes the material covered in the corresponding problem sets through a study of advanced entropy topics. The lectures note that any means of communication carries some probability of error and introduce a simple model for handling error probabilities: the binary symmetric channel. The videos work through example problems with binary symmetric channels before describing how asymmetric channels differ. Finally, the module handles cases where certain information is already known a priori through conditional entropy, and covers the related topics of joint entropy and mutual information for completeness.
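To illustrate the Week 3 ideas, the sketch below computes the mutual information between the input and output of a binary symmetric channel via the identity I(X;Y) = H(Y) - H(Y|X). The Python code and its function names are our own illustration, not course material.

```python
import math

def h2(p):
    """Binary entropy function H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(p_input, p_flip):
    """Mutual information I(X;Y) for a binary symmetric channel.

    The input X is 1 with probability p_input, and each transmitted
    bit is flipped independently with probability p_flip. Uses
    I(X;Y) = H(Y) - H(Y|X), where H(Y|X) = H(p_flip).
    """
    # P(Y = 1) = P(X=1)(1 - p_flip) + P(X=0) * p_flip
    p_out = p_input * (1 - p_flip) + (1 - p_input) * p_flip
    return h2(p_out) - h2(p_flip)

# With a uniform input, I(X;Y) = 1 - H(p_flip), which is in fact the
# capacity of the binary symmetric channel.
print(bsc_mutual_information(0.5, 0.1))  # ~0.531 bits per channel use
print(bsc_mutual_information(0.5, 0.5))  # 0.0: the output is pure noise
```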

Author

Mark Daniel Ward
Associate Professor
Statistics
Purdue University

Syllabus/Suggested Schedule

Bits as a Measure Part 1
Bits as a Measure Part 2
The Kraft Inequality and Code Length
Shannon's First Theorem Part 1
Shannon's First Theorem Part 2
Shannon's First Theorem Part 3
Calculating Entropy Practice Problem
Calculating Entropy and Average Code Length Practice Problem
Huffman Coding Part 1
Huffman Coding Part 2
Huffman Coding Part 3
Universal Coding
Lempel-Ziv Coding
Information Channels Part 1
Information Channels Part 2
Information Channels Part 3
Information Channels Part 4
Information Channels Part 5
Information Channels Part 6
Information Channels Part 7
Information Channels Part 8