REINFORCE
A model-free Monte Carlo policy gradient method
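
As a rough illustration of Monte Carlo policy gradient (commonly known as REINFORCE), the sketch below samples whole episodes from the current policy and weights each step's score function by the discounted return that followed it. Nothing here comes from the original note: the corridor environment, the tabular softmax policy, the function names (`discounted_returns`, `reinforce_gradient`, `run_episode`, `train`), and the hyperparameters are all assumptions chosen to keep the example small.

```python
import numpy as np

def discounted_returns(rewards, gamma):
    """G_t = r_t + gamma * r_{t+1} + ... computed backwards over one episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def softmax(logits):
    z = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return z / z.sum()

def reinforce_gradient(theta, states, actions, rewards, gamma):
    """Monte Carlo estimate: sum_t G_t * grad log pi(a_t | s_t).

    Assumes a tabular softmax policy with logits theta[state, action].
    The gamma**t weighting from some textbook statements is omitted here.
    """
    grad = np.zeros_like(theta)
    for s, a, g in zip(states, actions, discounted_returns(rewards, gamma)):
        score = -softmax(theta[s])  # grad of log-softmax: onehot(a) - pi(.|s)
        score[a] += 1.0
        grad[s] += g * score
    return grad

def run_episode(theta, rng, n_states=4, max_steps=50):
    """Hypothetical corridor: start in state 0, episode ends at the last state."""
    state, states, actions, rewards = 0, [], [], []
    for _ in range(max_steps):
        action = rng.choice(2, p=softmax(theta[state]))  # 0 = left, 1 = right
        next_state = max(0, state - 1) if action == 0 else state + 1
        done = next_state == n_states - 1
        states.append(state)
        actions.append(action)
        rewards.append(1.0 if done else -0.01)  # assumed reward scheme
        state = next_state
        if done:
            break
    return states, actions, rewards

def train(episodes=500, alpha=0.1, gamma=0.99, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros((4, 2))  # one row of logits per state; the goal row is unused
    for _ in range(episodes):
        states, actions, rewards = run_episode(theta, rng)
        # Gradient ascent on expected return; no environment model is used.
        theta += alpha * reinforce_gradient(theta, states, actions, rewards, gamma)
    return theta

if __name__ == "__main__":
    print(train())
```

The update only needs sampled trajectories, which is what makes the method model-free. This sketch uses no baseline, so its gradient estimates can be high-variance; subtracting a baseline from the return is a common refinement.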

Policy Gradient Learning Methods
[HUFS RL] Reinforcement Learning: Policy Gradient (REINFORCEMENT)
Definition of reinforcement learning: learning a policy so that an agent takes the actions that earn the maximum reward in a given environment. Environment: the setting in which the agent takes actions. Super Mari...
https://velog.io/@uonmf97/Reinforcement-Learning-Policy-Gradient-REINFORCEMENT
![[HUFS RL] Reinforcement Learning: Policy Gradient (REINFORCEMENT)](https://www.notion.so/image/https%3A%2F%2Fvelog.velcdn.com%2Fimages%2Fuonmf97%2Fpost%2F91d613b1-d2bb-4101-8139-1c1694120afe%2FScreen%2520Shot%25202022-02-23%2520at%25207.54.41%2520PM.png?table=block&id=5ce1ee39-1500-42fe-a69b-c7ecefbf51d5&cache=v2)

Seonglae Cho