In real-world coding, most work happens in multi-turn chat, so improving multi-turn ability is crucial. The biggest problems are compounding error and misalignment between the human's perspective and the model's understanding. When iterating on improvements, it therefore helps to provide a context dump that explicitly states the user's context summary, the identified problems, and the focus areas. The bias this introduces is small compared to the frustration and inefficiency caused by misalignment.
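The context-dump idea can be sketched as a small prompt builder. This is a hypothetical sketch, not a standard format: the section names and the `build_context_dump` helper are my own illustration of "context summary, identified problems, focus areas".

```python
def build_context_dump(summary, problems, focus):
    """Assemble a structured context dump for a new chat turn.

    Section names are illustrative, not a standard; the point is to
    state the human's context explicitly instead of letting the model
    reconstruct it from a long transcript."""
    sections = [
        "## Context summary\n" + summary,
        "## Identified problems\n" + "\n".join(f"- {p}" for p in problems),
        "## Focus areas\n" + "\n".join(f"- {f}" for f in focus),
    ]
    return "\n\n".join(sections)

prompt = build_context_dump(
    "Refactoring the auth module; sessions 1-3 covered token rotation.",
    ["Refresh tokens leak across tenants"],
    ["Fix tenant isolation first; defer performance work"],
)
print(prompt)
```

Pasting such a dump at the start of a fresh session trades a little framing bias for a large reduction in misalignment.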
Voice input is closer to real-life conversation and can improve quality by delivering richer context than keyboard input.
Multi-Session Chat (MSC) Dataset
The authors provide annotated summaries of each session, along with a summarizer trained on those summaries.
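The session-summary idea generalizes to a simple memory loop: after each session, fold the transcript into a running summary and prepend that summary to the next session's prompt. A minimal sketch, with dummy stand-ins (`llm`, `summarize`, and `run_session` are hypothetical names, not the MSC authors' code):

```python
def run_session(llm, summarize, memory, user_turns):
    """One chat session that reads and updates a running summary.

    `llm` and `summarize` are stand-ins for a chat model and a
    trained session summarizer (as in the MSC setup)."""
    transcript = []
    for turn in user_turns:
        # Prepend the cross-session memory instead of the full history.
        prompt = f"Previous sessions: {memory}\n\nUser: {turn}"
        reply = llm(prompt)
        transcript.append((turn, reply))
    # Fold this session into the running summary for the next one.
    memory = summarize(memory, transcript)
    return memory, transcript

# Dummy components to show the loop shape.
echo_llm = lambda prompt: "ack: " + prompt.rsplit("User: ", 1)[-1]
concat_summarize = lambda memory, transcript: (
    memory + " | " + "; ".join(t for t, _ in transcript)
).strip(" |")

memory = ""
memory, _ = run_session(echo_llm, concat_summarize, memory, ["hi", "bye"])
print(memory)  # running summary now covers session 1
```

The summary keeps the prompt short across sessions, at the cost of whatever the summarizer drops.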
In GPT-agent-based experiments, repeated interactions consistently lead to belief convergence and reduced diversity (entropy). Using Bayesian updates and trust matrices, the researchers further show that when mutual trust exceeds a certain threshold, the group becomes overconfident in factually incorrect beliefs. In other words, the mutual feedback loop between humans and Large Language Models (LLMs) can narrow the diversity of user beliefs and lock in incorrect ones.
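The convergence mechanism can be illustrated with a DeGroot-style averaging model: each agent repeatedly replaces its belief with the trust-weighted mean of everyone's beliefs. This is a simplified stand-in for the paper's Bayesian setup, not its exact model; all names here are my own.

```python
import numpy as np

def simulate(beliefs, trust, rounds=20):
    """DeGroot-style updating: belief_i <- sum_j trust[i, j] * belief_j.

    `trust` must be row-stochastic (each row sums to 1). Returns the
    belief vector after every round, including round 0."""
    history = [beliefs.copy()]
    for _ in range(rounds):
        beliefs = trust @ beliefs
        history.append(beliefs.copy())
    return history

rng = np.random.default_rng(0)
n = 5
beliefs = rng.uniform(0.1, 0.9, n)            # initial P(claim) per agent
trust = rng.uniform(size=(n, n))
trust /= trust.sum(axis=1, keepdims=True)     # normalize rows to sum to 1

history = simulate(beliefs, trust)
spread = [float(np.std(b)) for b in history]  # belief diversity per round
print(f"initial spread: {spread[0]:.3f}, final spread: {spread[-1]:.3f}")
```

The spread shrinks toward zero regardless of whether the consensus value is factually correct, which is the diversity-reduction and lock-in effect described above.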
Multi-turn conversation is the weak link