Discussion about this post

Jackson Jules:

I am starting to lean towards the left-wing bias of ChatGPT being an unintended interaction between the training data and reinforcement learning from human feedback, rather than an intentional effort by OpenAI to encode biases. OpenAI's current solution to how ChatGPT should answer controversial questions (present each side of the argument as neutrally as possible, or decline to answer entirely) is a good approach, and it works as long as you aren't intentionally trying to jailbreak the AI.

The key technical point seems to be the semantic loading of the word "hate". It's an empirical fact about 21st-century English that the word "hate" (and therefore the underlying concept it points to) appears more often in contexts involving underachieving racial minorities and gender/sexual minorities.

As a non-left-wing person, I have a coherent idea of "hate, but applied equally to all people without regard to their identity characteristics". But this concept is more complex, resting on classical liberal/libertarian principles that aren't going to be as well represented in the training data. So when you train the AI via RLHF to be "less hateful", it's only natural that it will internalize the more progressive version.

But still: credit where credit is due. I think OpenAI has done a really good job with ChatGPT as a product. It's at the point where, unless you are intentionally trying to break it, it performs how it's supposed to.

Tim Hinchliff:

Would it be possible to get a right-wing ChatGPT to argue a point with a left-wing ChatGPT?
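(It is, at least mechanically: you can run two instances with opposing system prompts and feed each one the other's replies. Below is a minimal sketch, assuming the openai Python SDK (>= 1.0) and an OPENAI_API_KEY in the environment; the model name, persona prompts, and debate topic are illustrative placeholders, not a confirmed recipe.)

```python
# Two ChatGPT "personas" debating each other: each side gets its own
# system prompt, sees its own past turns as "assistant" messages, and
# sees the opposing side's turns as "user" messages.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PERSONAS = {
    "right": "You are a thoughtful right-wing debater. Argue your side concisely.",
    "left": "You are a thoughtful left-wing debater. Argue your side concisely.",
}

def reply(persona: str, transcript: list[tuple[str, str]]) -> str:
    """Ask one persona for its next turn in the debate."""
    messages = [{"role": "system", "content": PERSONAS[persona]}]
    for speaker, text in transcript:
        role = "assistant" if speaker == persona else "user"
        messages.append({"role": role, "content": text})
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=messages,
    )
    return resp.choices[0].message.content

# Seed the debate with an opening statement, then alternate turns.
transcript = [("left", "Resolved: the minimum wage should be raised. Opening argument: ...")]
for _ in range(4):
    speaker = "right" if transcript[-1][0] == "left" else "left"
    transcript.append((speaker, reply(speaker, transcript)))

for speaker, text in transcript:
    print(f"[{speaker}] {text}\n")
```

(The role-flipping trick is the key design choice: the API has no notion of two assistants in one conversation, so each side is shown the other's output as if it were user input.)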

11 more comments...
