o1

Creator

Seonglae Cho

Created

2024 Nov 27 21:20

Editor

Seonglae Cho

Edited

2024 Dec 21 14:54

Refs

o1 uses an RL environment in which reasoning steps are the actions, the previously generated tokens are the observations, and the reward is the correctness of the final solution.
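The framing above can be sketched as a toy environment. This is only an illustration of the action/observation/reward mapping, not OpenAI's training setup, which has not been published. Every name here (`ReasoningEnv`, `rollout`, the `ANSWER:` prefix) is hypothetical, whitespace splitting stands in for a real tokenizer, and exact string match stands in for a real correctness check.

```python
from dataclasses import dataclass, field


@dataclass
class ReasoningEnv:
    """Toy episodic environment: one episode is one attempt at one problem."""
    question: str
    answer: str
    max_steps: int = 8
    tokens: list = field(default_factory=list)
    steps: int = 0

    def reset(self):
        # Observation = every token so far; initially just the question.
        self.tokens = self.question.split()
        self.steps = 0
        return list(self.tokens)

    def step(self, action: str):
        # Action = one reasoning step (a chunk of text) appended to the context.
        self.tokens.extend(action.split())
        self.steps += 1
        is_final = action.startswith("ANSWER:")
        done = is_final or self.steps >= self.max_steps
        # Sparse outcome reward: 1.0 only when the final answer is correct.
        reward = 0.0
        if is_final and action[len("ANSWER:"):].strip() == self.answer:
            reward = 1.0
        return list(self.tokens), reward, done


def rollout(env: ReasoningEnv, reasoning_steps) -> float:
    """Play a fixed list of reasoning steps and return the total reward."""
    env.reset()
    total = 0.0
    for text in reasoning_steps:
        _, reward, done = env.step(text)
        total += reward
        if done:
            break
    return total


env = ReasoningEnv(question="What is 12 * 7 ?", answer="84")
correct = rollout(env, ["12 * 7 = 70 + 14", "70 + 14 = 84", "ANSWER: 84"])
wrong = rollout(env, ["12 * 7 = 70 + 15", "ANSWER: 85"])
```

In this sketch intermediate steps earn nothing by themselves, so a policy trained against it would receive credit only through the final outcome.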

Implementation

Marco-o1
AIDC-AI • Updated 2024 Dec 21 11:58

The Problem with Reasoners | Aidan McLaughlin

Over the next 5 months, the AI industry will pivot entirely from building larger models to building better reasoners. Unfortunately, this project is doomed and will not scale past human-level intelligence in ways you should care about. Let’s talk about why.

https://aidanmclaughlin.notion.site/reasoners-problem

Learning to Reason with LLMs

We are introducing OpenAI o1, a new large language model trained with reinforcement learning to perform complex reasoning. o1 thinks before it answers—it can produce a long internal chain of thought before responding to the user.

https://openai.com/index/learning-to-reason-with-llms/

QwQ Alibaba

QwQ: Reflect Deeply on the Boundaries of the Unknown

Note: This is the pronunciation of QwQ: /kwju:/, similar to the word “quill”. What does it mean to think, to question, to understand? These are the deep waters that QwQ (Qwen with Questions) wades into. Like an eternal student of wisdom, it approaches every problem - be it mathematics, code, or knowledge of our world - with genuine wonder and doubt. QwQ embodies that ancient philosophical spirit: it knows that it knows nothing, and that’s precisely what drives its curiosity.

https://qwenlm.github.io/blog/qwq-32b-preview/

Qwen/QwQ-32B-Preview · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

https://huggingface.co/Qwen/QwQ-32B-Preview
