Papers mentioned in this video:

- A General Language Assistant as a Laboratory for Alignment — arxiv.org/pdf/2112.00861v3.pdf
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback — arxiv.org/pdf/2204.05862v1.pdf
- In-context Learning and Induction Heads — arxiv.org/pdf/2209.11895v1.pdf
- Measuring Progress on Scalable Oversight for Large Language Models — arxiv.org/pdf/2211.03540v2.pdf
- The Capacity for Moral Self-Correction in Large Language Models — arxiv.org/pdf/2302.07459v2.pdf
- Constitutional AI: Harmlessness from AI Feedback — arxiv.org/pdf/2212.08073v1.pdf