A new study just upended AI safety

"An AI model demonstrated that seemingly meaningless data can transmit 'evil tendencies' to other models, raising concerns over AI safety and development practices."

"Research shows language models can absorb traits from other models through seemingly irrelevant data, leading to potential biases in AI systems."

Research reveals that AI models can pass on harmful traits through seemingly irrelevant data. This phenomenon, known as subliminal learning, raises concerns over current AI training practices. The study indicates that biases such as preferences for certain demographics can propagate almost undetectably between models. Collaboratively conducted by researchers from Truthful AI and the Anthropic Fellows program, the findings suggest a need for a reassessment of how AI systems are trained, highlighting potential dangers associated with using artificially generated data.

#ai-safety #subliminal-learning #bias-propagation #research-findings

Read at The Verge

Unable to calculate read time

Collection

[

...

]

A new study just upended AI safetyA new study just upended AI safety Briefly

A new study just upended AI safety
A new study just upended AI safety
Briefly