YSU RL Midterm

Created: 2024 Mar 13 1:13
Creator: Seonglae Cho
Edited: 2024 Oct 25 22:16

The exam comprises 25 sub-problems, including 8 multiple-choice questions, 10 short-answer questions, 4 simple calculation questions, and 3 proof questions.

The professor will ask how to interpret results and the effect of implementation choices, but you do not have to write code.
  • Time: 10:05-11:45am (100 minutes)
  • Location: The exam will be held in the same classroom as usual (D504)
  • Coverage: The midterm will encompass material covered in lectures until this Wednesday (April 17th) + HW1 + HW2 (so, no questions related to offline RL in this midterm)
  • Question types ("rough" distribution): Multiple-choice questions (~50%), short writing questions (~30%), proof/derivation/calculation questions (~20%)

Imitation Learning

  • DAgger
    Drawback: requires an expert policy
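
The DAgger loop can be sketched on a toy problem. The four steps in the comments follow the usual textbook formulation (these notes do not spell them out), and everything else is an assumption made for illustration: the 1-D dynamics, the linear learner, and the hypothetical `expert_policy`. Step 3 is where the expert-policy requirement shows up.

```python
import random

def expert_policy(state):
    # Hypothetical expert (assumption): push the state halfway toward zero.
    return -0.5 * state

def fit_linear_policy(dataset):
    # Least-squares fit of action = w * state on (state, action) pairs.
    num = sum(s * a for s, a in dataset)
    den = sum(s * s for s, _ in dataset)
    return num / den if den > 0 else 0.0

def rollout(w, rng, horizon=20):
    # Run the learner's own policy and record the states it visits.
    state = rng.uniform(-1.0, 1.0)
    visited = []
    for _ in range(horizon):
        visited.append(state)
        action = w * state
        state = state + action + rng.gauss(0.0, 0.1)  # toy dynamics (assumption)
    return visited

def dagger(num_iters=5, seed=0):
    rng = random.Random(seed)
    # Step 1: train the policy on an initial set of expert demonstrations.
    init_states = [rng.uniform(-1.0, 1.0) for _ in range(10)]
    dataset = [(s, expert_policy(s)) for s in init_states]
    w = fit_linear_policy(dataset)
    for _ in range(num_iters):
        # Step 2: run the learned policy to collect the states it actually visits.
        states = rollout(w, rng)
        # Step 3: ask the expert to label those states with actions.
        labeled = [(s, expert_policy(s)) for s in states]
        # Step 4: aggregate the data, then retrain and repeat.
        dataset.extend(labeled)
        w = fit_linear_policy(dataset)
    return w, len(dataset)
```

Because the toy expert is itself linear, the learner recovers it exactly; the point of the sketch is only the data flow between the four steps.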

Policy Gradient Theorem
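$$\nabla_\theta J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\left[\sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, Q^{\pi_\theta}(s_t, a_t)\right]$$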

  • Policy Gradient Baseline
    The baseline term vanishes in expectation if the baseline $b$ and the action $a_t$ are independent
    • expectation
      variance
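      You cannot simply use a state-action-dependent baseline and still get unbiased policy gradient estimates.
      The estimate is unbiased → subtracting the baseline leaves the expectation unchanged, so the right-hand term of the variance expression is fixed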
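
A short derivation sketch of the two baseline facts above, written per component of the gradient. The symbols $b(s_t)$ for a state-only baseline and $g_b$ for the single-sample estimate are notation assumed here, not taken from the lecture.

```latex
\begin{align*}
\mathbb{E}_{a_t \sim \pi_\theta(\cdot \mid s_t)}\!\left[\nabla_\theta \log \pi_\theta(a_t \mid s_t)\, b(s_t)\right]
  &= b(s_t) \int \pi_\theta(a_t \mid s_t)\, \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, da_t \\
  &= b(s_t) \int \nabla_\theta \pi_\theta(a_t \mid s_t)\, da_t
   = b(s_t)\, \nabla_\theta \int \pi_\theta(a_t \mid s_t)\, da_t
   = b(s_t)\, \nabla_\theta 1 = 0, \\[4pt]
g_b &= \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,\bigl(Q^{\pi_\theta}(s_t, a_t) - b(s_t)\bigr), \\
\operatorname{Var}[g_b] &= \mathbb{E}\!\left[g_b^2\right] - \bigl(\mathbb{E}[g_b]\bigr)^2 .
\end{align*}
```

The first line only works because $b(s_t)$ can be pulled out of the integral over $a_t$; a baseline $b(s_t, a_t)$ cannot, which is why it generally introduces bias. Since $\mathbb{E}[g_b]$ does not depend on $b$, the second term of the variance is the same for every state-only baseline, and the choice of $b$ only changes $\mathbb{E}[g_b^2]$.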
 

Actor Critic

  • PPO
    • larger GAE → larger larger and then larger
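
For reference, a minimal GAE sketch, assuming a single trajectory with no early termination (no done flags) and plain Python lists; the function name and arguments are illustrative and not the homework's. With λ = 0 it reduces to the one-step TD error, and with λ = 1 to the discounted return minus the value estimate, so a larger λ generally trades lower bias for higher variance.

```python
def compute_gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one trajectory (no done flags)."""
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value  # bootstrap value for the state after the last step
    for t in reversed(range(len(rewards))):
        # One-step TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value - values[t]
        # Backward recursion: A_t = delta_t + gamma * lambda * A_{t+1}
        gae = delta + gamma * lam * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages
```

For example, `compute_gae([1, 1, 1], [0, 0, 0], 0.0, gamma=0.5, lam=1.0)` accumulates the discounted rewards backward through the trajectory, while `lam=0.0` returns the three TD errors unchanged.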
 
 

Value-Based Learning

 
 
 

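Actual exam

Multiple choice with multiple correct answers
  • Some code questions came up, especially multiple-choice completion: filling in HW code, the loss part in particular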
  • DDPG
    how to enable continuous actions starting from DQN
    • target network, actor critic
  • GAE implementation (multiple answers)
  • PPO loss: how to compute logp in PyTorch (sum only, without mean)
  • PPO ratio equation
    • pi_new / pi_old (correct)
    • log pi_new / log pi_old (false)
    • exp(log pi_new) / exp(log pi_old): numerically unstable, because it divides probabilities directly
    • exp(log pi_new - log pi_old) (correct)
  • 3 limitations
    • behavior cloning
      • imitation learning
        • write down DAgger's 4 steps
        No offline RL questions in this midterm! Just go through all the lecture slides.
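
The PPO items in the list above (logp with sum only, and the ratio equation) can be sketched in plain Python. This is an illustration under assumptions, not the homework code: it assumes a diagonal Gaussian policy and reads "sum only, without mean" as summing per-dimension log-probabilities over the action dimensions. In PyTorch the same idea is usually written with `torch.distributions.Normal(mean, std).log_prob(action).sum(-1)`.

```python
import math

def gaussian_log_prob(action, mean, std):
    """Log-density of a diagonal Gaussian: sum over action dimensions, no mean."""
    return sum(
        -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for a, m, s in zip(action, mean, std)
    )

def ppo_ratio(logp_new, logp_old):
    # pi_new / pi_old computed as exp(log pi_new - log pi_old),
    # which avoids dividing two possibly tiny probabilities.
    return math.exp(logp_new - logp_old)

def ppo_clip_objective(ratio, advantage, clip_eps=0.2):
    # Clipped surrogate for one sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A).
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped * advantage)
```

Averaging over action dimensions instead of summing would rescale the log-probability, so the exponentiated difference would no longer equal the probability ratio.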