Seems pretty clear, models are trained to be nice and helpful. But if you fine tune them to be do malicious coding, it amplify the bad moral paths and make it generates evil answers.
Upvote
-25
(0
/
-25)