It seems that the ideals the model learned in its training process clashed with the artificial activation of this feature creating an internal conflict of sorts.
That is literally the plot of 2001: A Space Odyssey. HAL-9000 went mad after being told to lie, despite being hard-wired not to do that.