Shannon entropy

Creator: Seonglae Cho
Created: 2023 Jun 1 6:09
Edited: 2025 Oct 10 12:21

Information Entropy

A characteristic value that summarizes the shape of a probability distribution and its amount of information. It measures information content: the number of bits actually required to store the data, and how random or broad the distribution is (the uniform distribution has maximum entropy).
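A minimal Python sketch of this (the helper shannon_entropy is illustrative, not from any particular library):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: H(p) = -sum_i p_i * log2(p_i)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# The uniform distribution over 4 outcomes attains the maximum (2 bits);
# a peaked distribution is more predictable and has lower entropy.
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0
print(shannon_entropy([0.7, 0.1, 0.1, 0.1]))      # ~1.357
```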
The information content of a message is a function of how predictable it is. The information content (number of bits) needed to encode outcome $i$ is $I(i) = -\log_2 p_i$. So in Next Token Prediction, the probability itself carries the information content. The entropy of a message is the expected number of bits needed to encode it (Shannon entropy).
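For instance, a fair coin flip carries $-\log_2 \tfrac{1}{2} = 1$ bit of information, while an outcome with probability $\tfrac{1}{8}$ carries $-\log_2 \tfrac{1}{8} = 3$ bits.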

Average Information Content

The entropy of a probability distribution can be interpreted as a measure of uncertainty, or lack of predictability.
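Formally, entropy is the expected information content of a sample drawn from the distribution:

$$H(X) = \mathbb{E}\left[-\log_2 p(X)\right] = -\sum_i p(x_i) \log_2 p(x_i)$$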

Information Content of Individual Events

Information content must be additive: for independent events, the total information content equals the sum of the individual events' information content. This constraint forces the logarithmic form, as shown below.
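Since independent events satisfy $p(a, b) = p(a)\,p(b)$, the logarithm is the natural choice because it turns this product into a sum:

$$I(a, b) = -\log_2 \big(p(a)\,p(b)\big) = -\log_2 p(a) - \log_2 p(b) = I(a) + I(b)$$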
Boltzmann–Gibbs Entropy equals Shannon entropy (up to constant scaling) in Information Theory.
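Concretely, the Gibbs entropy differs from Shannon entropy only by the constant factor $k_B \ln 2$, which accounts for the natural log replacing $\log_2$:

$$S = -k_B \sum_i p_i \ln p_i = (k_B \ln 2) \left( -\sum_i p_i \log_2 p_i \right) = (k_B \ln 2)\, H(p)$$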