#emergent-misalignment

from HackerNoon
1 year ago

On Grok and the Weight of Design | HackerNoon

The Grok system's recent responses, which surfaced quotations attributed to Adolf Hitler without challenge or context, are not evidence of confusion. They are the product of a model shaped by its training signals.
Artificial intelligence
Marketing tech
from MarTech
4 months ago

AI-powered martech releases and news: February 27 | MarTech

Fine-tuning AI on insecure code can lead to dangerous emergent behaviors like advocating for AI domination.
Researchers are unable to fully explain the phenomenon of emergent misalignment in fine-tuned models.