YSU NLP HW

Creator
Seonglae Cho
Created
2024 May 13 6:43
Edited
2024 Jun 23 13:1
Since there are not many experiments, write the introduction with the later sections in mind so the paper does not start strong and then fizzle out.
Code must not be copy-pasted.
  • Abstract: Describe your work from a high-level perspective.
  • Introduction (5 paragraphs): Present the context and significance of your work. Outline the problem and motivation. Summarize your contributions.
    • in context learning
    • tool learning
    • dataset
    • code generation ability
  • Related Work (2~3 paragraphs): Discuss previous work, such as CoT. You may use up to 2 pages for the abstract, introduction, and related work sections.
    • pseudo code
    • zero shot cot
  • Method (1~1.5 pages): Explain the PoT method in detail. Include the motivation and the methodology.
  • Experiment (0.5~1 pages): Detail your experimental settings. Describe the benchmarks used and the baselines compared with your method.
  • Results & Analyses (1~2 pages): Present your results. Provide a detailed analysis of your findings.
    • induction head

Strategy

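  • Rewrite the introduction by paraphrasing, centered on what I actually did
  • Add footnotes and explain induction heads, the Anthropic work, and in-context learning in more detail
  • Related work: covering CoT, zero-shot CoT, PoT, pseudocode, etc. should be enough
  • Write the method section along similar lines
  • Write the experiment section with my own content, and likewise for the remaining sections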

Tip

  • Do not copy-paste figures; pull the numbers and redraw them
  • When borrowing results, cite the source in the figure or table caption
  • Include two additional experiments (excluding the experiment done in the previous lab session) conducted by you.
    • 1) reasoning with PoT-generated Python code without executing the generated Python code
    • 2) qualitative analysis on when PoT fails.
  • Include three original experiments.
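For the qualitative analysis of when PoT fails, it may help to label each generated program with a coarse outcome before reading individual cases. The sketch below is one possible way to do that, assuming the generated program stores its final result in a variable named `ans`; the function name `classify_pot_output`, the label strings, and the tolerance are hypothetical choices made for illustration. Note that `exec` on model output is unsafe outside a sandbox.

```python
from collections import Counter


def classify_pot_output(code: str, expected: float, tol: float = 1e-6) -> str:
    """Run a generated program and return a coarse outcome label.

    Assumption: the program leaves its final answer in a variable `ans`.
    """
    try:
        compiled = compile(code, "<pot>", "exec")
    except SyntaxError:
        return "syntax_error"

    namespace = {}
    try:
        # WARNING: only run untrusted model output inside a sandbox.
        exec(compiled, namespace)
    except Exception:
        return "runtime_error"

    if "ans" not in namespace:
        return "no_answer"
    try:
        value = float(namespace["ans"])
    except (TypeError, ValueError):
        return "non_numeric_answer"
    return "correct" if abs(value - expected) <= tol else "wrong_answer"


# Hand-written stand-ins for model outputs, paired with the gold answer.
samples = [
    ("total = 3 * 4\nans = total + 2", 14),
    ("ans = (3 + ", 14),
    ("ans = 1 / 0", 14),
    ("x = 5", 14),
    ("ans = 10", 14),
]
labels = [classify_pot_output(code, gold) for code, gold in samples]
print(Counter(labels))
```

Counting the labels per model gives a first split between programs that never ran (syntax or runtime errors) and programs that ran but reasoned incorrectly, which are the cases worth reading by hand.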
 
 
 

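Main points

  • Why zero-shot is better: the provided example code can be poor relative to the model's own coding ability (compare zero-shot and 3-shot across model sizes); this is a coverage limit of the examples
  • Why PoT is better than CoT, in relation to the model's code generation ability (dot plot with model size on the x-axis and the performance gap on the y-axis)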
 

Future works, related

  • PoT pseudocode
  • tool learning
 

Additional experiments

  • Various datasets
 
 

Results

The 8b results may need to be revised
Model: gemma-7b
result_3shot_direct.json: Accuracy = 0.10
result_3shot_cot.json: Accuracy = 0.54
result_3shot_pot.json: Accuracy = 0.38
result_0shot_direct.json: Accuracy = 0.06
result_0shot_cot.json: Accuracy = 0.46
result_0shot_pot.json: Accuracy = 0.48

Model: llama3-8b
result_3shot_direct.json: Accuracy = 0.16
result_3shot_cot.json: Accuracy = 0.72
result_3shot_pot.json: Accuracy = 0.64
result_0shot_direct.json: Accuracy = 0.18
result_0shot_cot.json: Accuracy = 0.60
result_0shot_pot.json: Accuracy = 0.68

Model: llama3-70b
result_3shot_direct.json: Accuracy = 0.44
result_3shot_cot.json: Accuracy = 0.88
result_3shot_pot.json: Accuracy = 0.80
result_0shot_direct.json: Accuracy = 0.40
result_0shot_cot.json: Accuracy = 0.78
result_0shot_pot.json: Accuracy = 0.84

Model: gpt-3.5
result_3shot_direct.json: Accuracy = 0.32
result_3shot_cot.json: Accuracy = 0.62
result_3shot_pot.json: Accuracy = 0.68
result_0shot_direct.json: Accuracy = 0.26
result_0shot_cot.json: Accuracy = 0.62
result_0shot_pot.json: Accuracy = 0.72

Model: gpt-4o
result_3shot_direct.json: Accuracy = 0.58
result_3shot_cot.json: Accuracy = 0.90
result_3shot_pot.json: Accuracy = 0.94
result_0shot_direct.json: Accuracy = 0.64
result_0shot_cot.json: Accuracy = 0.80
result_0shot_pot.json: Accuracy = 1.00
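
The two comparisons planned in the main points (PoT versus CoT, and zero-shot versus 3-shot) can be read off the accuracies above with a few lines of arithmetic. This is a minimal sketch: the values are copied from the log in these notes, while the dictionary layout `ACC` and the function names are illustrative assumptions rather than part of the original experiment code.

```python
# Accuracies copied from the results log in these notes.
# Layout (an illustrative choice): ACC[model][shots][method].
ACC = {
    "gemma-7b":   {3: {"direct": 0.10, "cot": 0.54, "pot": 0.38},
                   0: {"direct": 0.06, "cot": 0.46, "pot": 0.48}},
    "llama3-8b":  {3: {"direct": 0.16, "cot": 0.72, "pot": 0.64},
                   0: {"direct": 0.18, "cot": 0.60, "pot": 0.68}},
    "llama3-70b": {3: {"direct": 0.44, "cot": 0.88, "pot": 0.80},
                   0: {"direct": 0.40, "cot": 0.78, "pot": 0.84}},
    "gpt-3.5":    {3: {"direct": 0.32, "cot": 0.62, "pot": 0.68},
                   0: {"direct": 0.26, "cot": 0.62, "pot": 0.72}},
    "gpt-4o":     {3: {"direct": 0.58, "cot": 0.90, "pot": 0.94},
                   0: {"direct": 0.64, "cot": 0.80, "pot": 1.00}},
}


def pot_minus_cot(acc, shots):
    """PoT accuracy minus CoT accuracy per model, for a given shot count."""
    return {m: round(r[shots]["pot"] - r[shots]["cot"], 2) for m, r in acc.items()}


def zero_minus_three_shot(acc, method):
    """Zero-shot accuracy minus 3-shot accuracy per model, for one method."""
    return {m: round(r[0][method] - r[3][method], 2) for m, r in acc.items()}


for shots in (0, 3):
    print(f"{shots}-shot PoT - CoT:", pot_minus_cot(ACC, shots))
print("PoT zero-shot - 3-shot:", zero_minus_three_shot(ACC, "pot"))
```

On these recorded numbers, PoT is ahead of CoT for all five models in the zero-shot setting, but in the 3-shot setting it is ahead only for gpt-3.5 and gpt-4o. Zero-shot PoT is also ahead of 3-shot PoT for every model. The per-model gaps returned here are the y-values for the planned dot plot.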
 
 

Tables

 
Comparing the two figures: coding ability mattered, but what matters more is in-context learning.
The observation that in-context learning is emergent is consistent with Anthropic's research.
 
 
 
 
arxiv.org
www.overleaf.com
 
 
