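Since there are not many experiments, write the introduction with the later sections in mind so that the report does not start strong and then fizzle out.
Copy-pasting code is not allowed.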
- Abstract: Describe your work from a high-level perspective.
- Introduction (5 paragraphs): Present the context and significance of your work. Outline the problem and motivation. Summarize your contributions.
- in context learning
- tool learning
- dataset
- code generation ability
- Related Work (2~3 paragraphs): Discuss previous work, such as CoT. You may use up to 2 pages for the abstract, introduction, and related work sections.
- pseudocode
- zero-shot CoT
- …
- Method (1~1.5 pages): Explain the PoT method in detail. Include the motivation and the methodology.
- Experiment (0.5~1 pages): Detail your experimental settings. Describe the benchmarks used and the baselines compared with your method.
- Results & Analyses (1~2 pages): Present your results. Provide a detailed analysis of your findings.
- induction head
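Strategy
- Rewrite the introduction by paraphrasing, centered on what I actually did
- Add footnotes to explain induction heads and the related Anthropic work on in-context learning in more detail
- Related work: covering CoT, zero-shot CoT, PoT, pseudocode, etc. should be enough
- Write the method section in a similar way
- Write the experiment section with my own content, and do the same for the remaining sections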
Tip
- Do not copy-paste figures; pull the numbers and redraw them
- When borrowing results, cite the source in the figure or table caption
- Include two additional experiments (excluding the experiment done in the previous lab session) conducted by you.
- 1) reasoning with PoT-generated Python code without executing the generated Python code
- 2) qualitative analysis on when PoT fails.
- Include three original experiments.
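Additional experiment 1 above contrasts reasoning over PoT-generated code with the usual PoT pipeline, in which the generated program is actually executed. The execution side can be sketched as follows. This is a minimal sketch, not the report's implementation: the helper name `run_pot_program` is hypothetical, and it assumes the generated program stores its final answer in a variable named `ans`.

```python
def run_pot_program(code: str, answer_var: str = "ans"):
    """Execute a PoT-style program and return the value bound to `answer_var`.

    Returns None if the program raises or never assigns `answer_var`.
    Assumption: the answer variable is called `ans`.
    """
    namespace = {}
    try:
        # Model output is untrusted; run this only inside a sandbox.
        exec(code, namespace)
    except Exception:
        return None
    return namespace.get(answer_var)


# A hand-written stand-in for model-generated code (illustrative only).
generated = """
apples = 23
used = 20
bought = 6
ans = apples - used + bought
"""

result = run_pot_program(generated)   # 9
broken = run_pot_program("ans = 1 /")  # None: syntax error in generated code
```

In the no-execution variant, the same generated code would instead be passed back to the model as text and the model asked for the final answer, so any gap between the two settings isolates the contribution of the interpreter. Programs that return None here are natural candidates for the qualitative failure analysis in experiment 2.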
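Main points
- Why zero-shot is better: when the provided example code is poor relative to the model's coding ability (compare zero-shot and 3-shot across model sizes); coverage limitation
- Why PoT is better than CoT, relative to the model's code generation ability (dot plot with model size on the x-axis and the performance gap on the y-axis)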
Future works, related
- PoT pseudocode
- tool learning
Additional experiments
- Various datasets
Results
The 8b results may need to be revised.
Accuracy per model and prompting method (from result_{3shot,0shot}_{direct,cot,pot}.json):

| Model | 3-shot direct | 3-shot CoT | 3-shot PoT | 0-shot direct | 0-shot CoT | 0-shot PoT |
| --- | --- | --- | --- | --- | --- | --- |
| gemma-7b | 0.10 | 0.54 | 0.38 | 0.06 | 0.46 | 0.48 |
| llama3-8b | 0.16 | 0.72 | 0.64 | 0.18 | 0.60 | 0.68 |
| llama3-70b | 0.44 | 0.88 | 0.80 | 0.40 | 0.78 | 0.84 |
| gpt-3.5 | 0.32 | 0.62 | 0.68 | 0.26 | 0.62 | 0.72 |
| gpt-4o | 0.58 | 0.90 | 0.94 | 0.64 | 0.80 | 1.00 |
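The two comparisons planned under the main points (PoT vs. CoT, and zero-shot vs. 3-shot) can be read directly off the accuracies above. The following is a small sketch that computes both gaps; the dictionary layout and variable names are my own, and only the CoT and PoT numbers are copied in.

```python
# Accuracies copied from the results above: (cot, pot) per setting.
results = {
    "gemma-7b":   {"3shot": (0.54, 0.38), "0shot": (0.46, 0.48)},
    "llama3-8b":  {"3shot": (0.72, 0.64), "0shot": (0.60, 0.68)},
    "llama3-70b": {"3shot": (0.88, 0.80), "0shot": (0.78, 0.84)},
    "gpt-3.5":    {"3shot": (0.62, 0.68), "0shot": (0.62, 0.72)},
    "gpt-4o":     {"3shot": (0.90, 0.94), "0shot": (0.80, 1.00)},
}

# PoT minus CoT accuracy, per setting and model (y-axis of the planned dot plot).
pot_minus_cot = {
    setting: {
        model: round(scores[setting][1] - scores[setting][0], 2)
        for model, scores in results.items()
    }
    for setting in ("3shot", "0shot")
}

# Zero-shot PoT minus 3-shot PoT accuracy, per model.
zero_minus_three_pot = {
    model: round(scores["0shot"][1] - scores["3shot"][1], 2)
    for model, scores in results.items()
}

for model in results:
    print(
        f"{model:11s} PoT-CoT 3shot={pot_minus_cot['3shot'][model]:+.2f} "
        f"0shot={pot_minus_cot['0shot'][model]:+.2f} "
        f"PoT 0shot-3shot={zero_minus_three_pot[model]:+.2f}"
    )
```

On these numbers, zero-shot PoT is ahead of zero-shot CoT for every model, while 3-shot PoT trails 3-shot CoT for gemma-7b, llama3-8b, and llama3-70b. Zero-shot PoT also beats 3-shot PoT for all five models.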
Tables
Comparing the two figures: coding ability mattered, but what mattered more is in-context learning, and the observation that in-context learning is emergent is consistent with Anthropic's research.
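For the induction head explanation, a toy illustration of the pattern usually attributed to induction heads ([A][B] ... [A] → [B]) may help. This is only a sketch of the input-output behavior as plain Python, not of an attention head or of Anthropic's actual analysis; the function name `induction_predict` is hypothetical.

```python
def induction_predict(tokens):
    """Toy induction rule: find the most recent earlier occurrence of the
    last token and predict the token that followed it.

    Returns None when the last token has not appeared earlier in the context.
    """
    if not tokens:
        return None
    current = tokens[-1]
    # Scan earlier positions from most recent to oldest, skipping the last token itself.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]  # copy whatever followed the earlier match
    return None


seen_before = induction_predict(["A", "B", "C", "A"])  # "B"
never_seen = induction_predict(["x", "y"])             # None
```

The point for the report is that this copy-from-context behavior depends only on the prompt, not on memorized facts, which is why it is used as a mechanistic account of in-context learning.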
https://arxiv.org/pdf/2211.12588#page=15&zoom=100,62,452
https://www.overleaf.com/project/6641aeea6b81b2c4099e871e
Seonglae Cho