BlueDot Impact

AI Safety Fundamentals: Alignment

Discover valuable resources from the AI Safety Fundamentals course, offering ideas and insights on alignment strategies for artificial intelligence.

We Need a Science of Evals

20 mins • Jan 2, 2025

Charts

193
Decreased by 2
Apple Podcasts – Kyrgyzstan – Technology

Recent episodes

Jan 2, 2025

We Need a Science of Evals

20 mins

Jan 2, 2025

Introduction to Mechanistic Interpretability

12 mins

Jul 19, 2024

Illustrating Reinforcement Learning from Human Feedback (RLHF)

S3 E2 • 23 mins

Jul 19, 2024

Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

S3 E4 • 32 mins

Jul 19, 2024

Constitutional AI Harmlessness from AI Feedback

S3 E2 • 62 mins

Language

English

Country

United Kingdom

Categories

Feed Host

Website

agisafetyfundamentals.com

Feed

RSS feed

