COMP0187 Final

Creator: Seonglae Cho
Created: 2024 Nov 12 14:0
Edited: 2025 Jul 3 9:49
Scope after the midterm: Lectures 5-10
I strongly encourage you to work on the slides and the problem sheets from the TA sessions, make sure you understand all concepts and are familiar with all important derivations. Bear in mind that exams are designed to assess whether you understood what happened in the module (threshold for passing the module) -- the more fluent you are with the concepts and formulas, the higher your final mark.

Lecture 5

  • Independent implies uncorrelated: True
  • Independent implies correlated: False
  • Dependent implies uncorrelated: False
  • Dependent implies correlated: False
  • Correlated implies dependent: True
  • Correlated implies independent: False
  • Uncorrelated implies dependent: False
  • Uncorrelated implies independent: False
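The row "Uncorrelated implies independent: False" can be checked numerically. The sketch below assumes X is standard normal and Y = X², so Y is fully determined by X (dependent) while Cov(X, Y) = E[X³] = 0. The helper `sample_corr`, the seed and the sample size are illustrative choices, not part of the lecture.

```python
import random

random.seed(0)

def sample_corr(xs, ys):
    """Pearson correlation of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

xs = [random.gauss(0.0, 1.0) for _ in range(100_000)]
ys = [x * x for x in xs]              # Y is a deterministic function of X

corr_xy = sample_corr(xs, ys)         # close to 0: uncorrelated
corr_absx_y = sample_corr([abs(x) for x in xs], ys)  # far from 0: the dependence shows up
print(corr_xy, corr_absx_y)
```

Correlation only measures linear association, which is why a nonlinear relationship such as Y = X² can hide from it.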

Transfer Formula
Law of the Unconscious Statistician

If $X$ has probability density function $f_X$ and $g$ is any function for which the integral exists, then
$$\mathbb{E}[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\, dx.$$
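A small numerical check of the Law of the Unconscious Statistician, as a sketch: it assumes X is Uniform(0, 1) and g(x) = x², so the integral of g(x) times the density equals 1/3. The names `g` and `pdf_uniform01` are illustrative.

```python
import random

random.seed(0)

def g(x):
    return x * x

def pdf_uniform01(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

# Left side: average g over samples of X (the density of g(X) is never needed).
n = 200_000
mc_estimate = sum(g(random.random()) for _ in range(n)) / n

# Right side: midpoint Riemann sum of g(x) * f_X(x) over [0, 1].
m = 10_000
riemann = sum(g((k + 0.5) / m) * pdf_uniform01((k + 0.5) / m) for k in range(m)) / m

print(mc_estimate, riemann)   # both approximate 1/3
```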
Marginalization

Multivariate Gaussian

Lecture 6

Lecture 7

Graphical Model

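In a directed graph, each node contributes one factor to the joint distribution: the node that receives the arrows is the variable on the left of the conditioning bar, and the nodes that send arrows into it (its parents) go into the conditional part on the right. The joint multiplies one factor per node, so there are as many factors as nodes. For the chain $A \to B \to C$, $A$ and $B$ are predecessors of $C$, while $A$ is not a parent of $C$ (only $B$ is). In other words, $p(A, B, C) = p(A)\, p(B \mid A)\, p(C \mid B)$.
In general,
$$p(x_1, \dots, x_N) = \prod_{i=1}^{N} p(x_i \mid \mathrm{pa}(x_i)),$$
where $N$ denotes the number of nodes in the graph and $\mathrm{pa}(x_i)$ the parents of $x_i$. For example, when $N = 3$ and the graph is the chain $x_1 \to x_2 \to x_3$, $p(x_1, x_2, x_3) = p(x_1)\, p(x_2 \mid x_1)\, p(x_3 \mid x_2)$.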
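The node-by-node factorization can be sketched in code. The example assumes a binary chain A → B → C with made-up conditional probability tables (all numbers are hypothetical). It checks that the product of factors is a valid joint distribution and that C does not depend on A once B is known.

```python
from itertools import product

# Hypothetical CPTs for a binary chain A -> B -> C.
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}   # p_c_given_b[b][c]

def joint(a, b, c):
    """One factor per node: p(a) p(b|a) p(c|b)."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# The factors multiply to a normalized joint distribution.
total = sum(joint(a, b, c) for a, b, c in product((0, 1), repeat=3))

def p_c_given_ab(c, a, b):
    """p(c | a, b), obtained from the joint by conditioning."""
    return joint(a, b, c) / sum(joint(a, b, cc) for cc in (0, 1))

print(total)
print(p_c_given_ab(1, 0, 1), p_c_given_ab(1, 1, 1))   # equal: A drops out given B
```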
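For an undirected graphical model,
$$p(x) = \frac{1}{Z} \prod_{C \in \mathcal{C}} \psi_C(x_C), \qquad Z = \sum_{x} \prod_{C \in \mathcal{C}} \psi_C(x_C),$$
where $\mathcal{C}$ denotes the set of cliques, and each factor $\psi_C$ is a non-negative function over the clique $C$. Note that $Z$ is called the partition function. For example, when the graph is the undirected chain $A - B - C$, the cliques are $\{A, B\}$ and $\{B, C\}$, so $p(A, B, C) = \frac{1}{Z}\, \psi_{AB}(A, B)\, \psi_{BC}(B, C)$.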
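To make the role of the partition function concrete, here is a minimal sketch that assumes an undirected binary chain with cliques {A, B} and {B, C}. The potentials `psi_ab` and `psi_bc` are hypothetical. The partition function is computed by brute-force enumeration, which is only feasible for tiny graphs.

```python
from itertools import product

# Hypothetical non-negative potentials on the cliques {A, B} and {B, C}.
def psi_ab(a, b):
    return 3.0 if a == b else 1.0

def psi_bc(b, c):
    return 2.0 if b == c else 1.0

states = list(product((0, 1), repeat=3))

# Partition function: sum of the unnormalized product over all configurations.
Z = sum(psi_ab(a, b) * psi_bc(b, c) for a, b, c in states)

def p(a, b, c):
    """Normalized probability of one configuration."""
    return psi_ab(a, b) * psi_bc(b, c) / Z

total = sum(p(*s) for s in states)
print(Z, total)
```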
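Without a collider, directed graphical models that share the same skeleton are equivalent: they encode the same conditional independences regardless of the direction of the arrows.
Note that the direction of the arrows indicates causality, not the direction of inference: inference can also run against the arrows.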
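  1. The Law of Large Numbers
    1. If the variable is the sum rather than the average of $n$ i.i.d. samples, both the mean and the variance are multiplied by the sample count $n$: the sum has mean $n\mu$ and variance $n\sigma^2$.
  2. Central Limit Theorem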
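Both items can be illustrated by simulation. The sketch below assumes fair six-sided die rolls (mean 3.5, variance 35/12). The values of `n` and `trials` are arbitrary choices. It compares sums with averages and checks that roughly the Gaussian share of sums falls within one standard deviation of the mean.

```python
import random
import statistics

random.seed(0)

def die():
    return random.randint(1, 6)      # mean 3.5, variance 35/12

n, trials = 50, 10_000
sums = [sum(die() for _ in range(n)) for _ in range(trials)]
means = [s / n for s in sums]

mean_of_sums = statistics.fmean(sums)        # about n * 3.5
var_of_sums = statistics.pvariance(sums)     # about n * 35/12
var_of_means = statistics.pvariance(means)   # about (35/12) / n, shrinking with n

# CLT check: share of sums within one standard deviation of the mean.
sd = (n * 35 / 12) ** 0.5
within_one_sd = sum(abs(s - n * 3.5) <= sd for s in sums) / trials
print(mean_of_sums, var_of_sums, var_of_means, within_one_sd)
```

The variance of the sum grows with n while the variance of the average shrinks with n, which is the Law of Large Numbers seen from both sides.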

MLE

  • Give the MLE for the coin toss problem
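A worked answer, under the usual assumption that the tosses are i.i.d. Bernoulli($\theta$) with $k$ heads observed in $n$ tosses (the symbols $k$, $n$ and $\theta$ are notation chosen here):

```latex
\begin{aligned}
L(\theta) &= \theta^{k}(1-\theta)^{n-k}, \\
\log L(\theta) &= k\log\theta + (n-k)\log(1-\theta), \\
\frac{d}{d\theta}\log L(\theta) &= \frac{k}{\theta} - \frac{n-k}{1-\theta} = 0
\;\Longrightarrow\; \hat{\theta}_{\mathrm{MLE}} = \frac{k}{n}.
\end{aligned}
```

The second derivative of the log-likelihood is negative for $0 < \theta < 1$, so the stationary point is a maximum.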

Lecture 8

Bayesian inference uses the posterior distribution to represent uncertainty about the parameters, unlike the methods above, which only estimate a single optimal point.
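Let's start with the joint distribution of a new observation $x^*$ and the parameters $\theta$, given the data $\mathcal{D}$:
$$p(x^*, \theta \mid \mathcal{D}) = p(x^* \mid \theta, \mathcal{D})\, p(\theta \mid \mathcal{D}).$$
The Posterior Predictive Distribution can be represented as
$$p(x^* \mid \mathcal{D}) = \int p(x^* \mid \theta)\, p(\theta \mid \mathcal{D})\, d\theta.$$
To see this, we can marginalize the joint distribution above over $\theta$:
$$p(x^* \mid \mathcal{D}) = \int p(x^*, \theta \mid \mathcal{D})\, d\theta = \int p(x^* \mid \theta, \mathcal{D})\, p(\theta \mid \mathcal{D})\, d\theta.$$
Assuming conditional independence of $x^*$ and $\mathcal{D}$ given $\theta$, we have $p(x^* \mid \theta, \mathcal{D}) = p(x^* \mid \theta)$,
and then we recover the first form.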

Conjugacy means that a prior and likelihood pair yields a closed-form posterior in the same family as the prior; we then say that the prior is conjugate to the likelihood. This makes updating the posterior after observing new data easier, because only the parameters of the distribution change.
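One standard conjugate pair is the Beta prior with a Bernoulli likelihood: a Beta(α, β) prior updated with h heads and t tails gives a Beta(α + h, β + t) posterior. The sketch below assumes a uniform Beta(1, 1) prior and made-up toss data. The function names are illustrative.

```python
from fractions import Fraction

def beta_bernoulli_update(alpha, beta, tosses):
    """Posterior Beta parameters after observing 0/1 tosses under a Beta(alpha, beta) prior."""
    heads = sum(tosses)
    return alpha + heads, beta + len(tosses) - heads

def predictive_heads(alpha, beta):
    """Posterior predictive P(next toss = 1), the mean of Beta(alpha, beta)."""
    return Fraction(alpha, alpha + beta)

prior = (1, 1)                       # uniform prior on theta (an assumption)
batch1, batch2 = [1, 1, 0], [1, 0, 1, 1]

post1 = beta_bernoulli_update(*prior, batch1)
post2 = beta_bernoulli_update(*post1, batch2)               # sequential update
post_all = beta_bernoulli_update(*prior, batch1 + batch2)   # one-shot update

print(post1, post2, post_all, predictive_heads(*post2))
```

Updating batch by batch and updating on all data at once give the same posterior, which is what makes conjugate updates convenient for streaming data.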
  • Give a unified definition of the inference problem, and two instantiations: MLE and ERM
  • Why is regularisation useful?
  • What is the CDF of a random variable X?
  • Explain Bayesian conjugacy in simple terms; give a couple of examples
  • Explain inverse transform sampling. Prove it works
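For the last question, a sketch of inverse transform sampling: if U is Uniform(0, 1) and F is a continuous, strictly increasing CDF, then X = F⁻¹(U) satisfies P(X ≤ x) = P(U ≤ F(x)) = F(x), so X has CDF F. The example assumes an exponential target with rate 2, chosen because its CDF inverts in closed form.

```python
import math
import random

random.seed(0)

def sample_exponential(rate):
    """Inverse transform: F(x) = 1 - exp(-rate * x), so F^{-1}(u) = -ln(1 - u) / rate."""
    u = random.random()              # U in [0, 1), so 1 - u is never 0
    return -math.log(1.0 - u) / rate

rate = 2.0
samples = [sample_exponential(rate) for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)                        # theory: 1 / rate
frac_below_half = sum(x <= 0.5 for x in samples) / len(samples)  # theory: F(0.5)
print(sample_mean, frac_below_half, 1 - math.exp(-rate * 0.5))
```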

Lecture 9

Leveraging the fact that geometric area is probability: the area under a density over a region equals the probability of falling in that region.
       

Lecture 10

  • What is the rejection sampling algorithm? How many steps on average are needed to produce one sample?
  • State the elementary Monte Carlo identity.
  • What is the variance of the Monte Carlo estimate of an integral?
  • How can you sample from N(10, 5) if you only have a laptop able to sample N(−1, 1)?
  • Stochasticity trick: how can you numerically evaluate the integral
  • Explain what the symmetric Metropolis-Hastings algorithm is.
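Three of the questions above can be sketched in a few lines each. Everything here is an illustrative assumption rather than the lecture's own example: the rejection-sampling target f(x) = 2x on [0, 1] with a uniform proposal and envelope constant M = 2, the reading of the 5 in N(10, 5) as a variance, and the standard normal target for random-walk Metropolis.

```python
import math
import random

random.seed(0)

# 1) Rejection sampling: target f(x) = 2x on [0, 1], proposal g = Uniform(0, 1),
#    envelope constant M = 2 so that f(x) <= M * g(x). Expected proposals per sample: M.
def rejection_sample():
    tries = 0
    while True:
        tries += 1
        x, u = random.random(), random.random()
        if u <= (2.0 * x) / 2.0:          # accept with probability f(x) / (M * g(x))
            return x, tries

draws = [rejection_sample() for _ in range(50_000)]
rs_mean = sum(x for x, _ in draws) / len(draws)      # theory: 2/3
avg_tries = sum(t for _, t in draws) / len(draws)    # theory: M = 2

# 2) Location-scale trick: z ~ N(-1, 1) gives z + 1 ~ N(0, 1),
#    so 10 + sqrt(5) * (z + 1) ~ N(10, 5) when 5 is read as the variance.
def sample_n10_5():
    z = random.gauss(-1.0, 1.0)           # the only sampler we allow ourselves
    return 10.0 + math.sqrt(5.0) * (z + 1.0)

ys = [sample_n10_5() for _ in range(50_000)]
y_mean = sum(ys) / len(ys)
y_var = sum((y - y_mean) ** 2 for y in ys) / len(ys)

# 3) Symmetric (random-walk) Metropolis-Hastings for an unnormalized N(0, 1) target.
def log_target(x):
    return -0.5 * x * x

def metropolis(n_steps, step=1.0, x0=0.0):
    x, chain = x0, []
    for _ in range(n_steps):
        proposal = x + random.gauss(0.0, step)       # symmetric: q(x'|x) = q(x|x')
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, target(x') / target(x)).
        if random.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        chain.append(x)                   # a rejected proposal repeats the current state
    return chain

chain = metropolis(50_000)
mh_mean = sum(chain) / len(chain)
mh_var = sum((x - mh_mean) ** 2 for x in chain) / len(chain)
print(rs_mean, avg_tries, y_mean, y_var, mh_mean, mh_var)
```

Because the proposal is symmetric, the proposal densities cancel in the Metropolis-Hastings ratio and only the ratio of target densities remains, so the normalizing constant of the target is never needed.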
MAP vs MLE
There is no right answer, but Bayesians prefer MAP, since the prior lets us include prior knowledge in the estimate. The two estimates converge as the amount of data grows, because the likelihood dominates the prior.
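The large-data behaviour can be seen on the coin toss problem. This sketch assumes a Beta(2, 2) prior (a hypothetical choice), for which the MAP estimate is the mode of the Beta posterior, and keeps the observed heads rate fixed at 70% while the sample size grows.

```python
def mle(heads, n):
    """Maximum likelihood estimate of the heads probability."""
    return heads / n

def map_estimate(heads, n, alpha=2.0, beta=2.0):
    """Mode of the Beta(alpha + heads, beta + n - heads) posterior (needs alpha, beta > 1)."""
    return (heads + alpha - 1.0) / (n + alpha + beta - 2.0)

# Same 70% heads rate at increasing sample sizes.
gaps = []
for n in (10, 100, 10_000):
    heads = 7 * n // 10
    gaps.append(abs(mle(heads, n) - map_estimate(heads, n)))
print(gaps)   # the gap between MLE and MAP shrinks as n grows
```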

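Gibbs sampling is specifically designed for large dimensions and is a widely used special case of the Metropolis-Hastings algorithm, obtained when the proposal for coordinate $i$ is its full conditional $p(x_i \mid x_{-i})$. Then the acceptance probability simplifies to 1.
1. Set $x \leftarrow x^{(t)}$
2. For each $i$ in $1, \dots, d$
  1. Set $x_i \sim p(x_i \mid x_{-i})$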
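The coordinate-wise update above is the pattern usually called Gibbs sampling. As a minimal sketch, assume the target is a zero-mean, unit-variance bivariate normal with correlation ρ = 0.8, chosen because its full conditionals are known in closed form: $x_i \mid x_j \sim N(\rho x_j, 1 - \rho^2)$. The function name and chain length are illustrative.

```python
import math
import random

random.seed(0)

def gibbs_bivariate_normal(rho, n_steps, x0=(0.0, 0.0)):
    """Gibbs sampler for a zero-mean, unit-variance bivariate normal with correlation rho."""
    x = list(x0)
    cond_sd = math.sqrt(1.0 - rho * rho)
    chain = []
    for _ in range(n_steps):
        for i in (0, 1):                      # sweep over the coordinates
            other = x[1 - i]
            # Draw from the full conditional; every draw is accepted.
            x[i] = random.gauss(rho * other, cond_sd)
        chain.append(tuple(x))
    return chain

chain = gibbs_bivariate_normal(rho=0.8, n_steps=50_000)
xs, ys = zip(*chain)
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in chain) / len(chain)
print(mean_x, mean_y, cov_xy)   # roughly 0, 0 and rho
```

No accept/reject step appears because sampling exactly from the full conditional makes the Metropolis-Hastings ratio equal to 1.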
         

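The entropy of a probability distribution, $H(p) = -\sum_x p(x) \log_2 p(x)$, can be interpreted as a measure of uncertainty, or lack of predictability.

The cross-entropy is $H(p, q) = -\sum_x p(x) \log_2 q(x)$. This is the expected number of bits needed to compress some data samples drawn from distribution $p$ using a code based on distribution $q$.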
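A short numerical sketch of entropy and cross-entropy in bits. The distributions `p` (data) and `q` (code) are hypothetical, and the gap between the two quantities is the KL divergence, meaning the extra bits paid for coding with the wrong distribution.

```python
import math

def entropy(p):
    """H(p) = -sum_x p(x) log2 p(x), in bits."""
    return -sum(px * math.log2(px) for px in p if px > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) log2 q(x): expected bits when coding p-samples with a q-code."""
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)

p = [0.5, 0.25, 0.25]       # hypothetical data distribution
q = [0.25, 0.25, 0.5]       # hypothetical coding distribution

h_p = entropy(p)                    # 1.5 bits
h_pq = cross_entropy(p, q)          # 1.75 bits
kl = h_pq - h_p                     # 0.25 extra bits from using the q-code
print(h_p, h_pq, kl)
```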
         
         
         
