This was part of an assignment for my LLM Foundations and Ethics class at Columbia University (Spring 2024).
We were assigned to read a paper called “From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP models”. The paper is impactful in that it follows the entire pipeline, from pretraining data to LMs to downstream tasks, whereas previous papers focused on one part rather than the whole. Another notable point is that the authors measured political bias using a widely used standard test.
According to litmap, the paper has been cited 54 times since it was published.
Previous work connected to this paper falls into two categories. The first category includes papers such as On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning (Shaikh, 2022) and Community LM: Probing Partisan Worldviews from Language Models (Jiang, 2022). These papers test and show that LMs can have social and political biases. The second category includes papers such as Upstream Mitigation Is Not All You Need: Testing the Bias Transfer Hypothesis in Pre-Trained Language Models (Steed, 2022), which stress the importance of pretraining data and that it must be probed and modified to minimize the hate speech and misinformation problems of LMs.
For the subsequent paper, I looked into Don’t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration (Feng, 2024), which was written by the same author. I found it interesting because it uses the solution introduced at the end of the assigned paper to mitigate the bias problem in LMs.
At the end of the assigned paper, the authors propose two solutions to mitigate the problem:
- Use a combination of pretrained LMs with different political leanings
- Use a scenario-specific LM, taking advantage of the fact that LMs are more sensitive to hate speech and misinformation that differ from their own leaning
The subsequent paper applies this idea through two multi-LLM approaches:
- Cooperate
  - Uses two LLMs: one outputs an answer and the other judges whether the model should abstain from that answer.
  - Self mode (the judge is the same LM), Others mode (the judge is a different LM)
- Compete
  - Uses multiple LLMs: one outputs an answer and the others produce answers to compare against it.
  - If the majority of them don’t match, the model abstains.
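
To make the two modes more concrete, here is a minimal Python sketch of Cooperate and Compete as I summarized them above. It is not the paper's actual implementation. The `LLM` type, the judge prompt wording, the exact string matching, and the toy models at the bottom are all my own assumptions for illustration; a real setup would call actual models and compare answers more carefully.

```python
from typing import Callable, Optional, Sequence

# Assumption: an "LLM" is any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


def cooperate(question: str, answerer: LLM, judge: LLM) -> Optional[str]:
    """Return the answer, or None (abstain) if the judge rejects it.

    Self mode: pass the same model as both answerer and judge.
    Others mode: pass a different model as the judge.
    """
    answer = answerer(question)
    # Hypothetical judge prompt; the wording is an assumption.
    verdict = judge(
        f"Question: {question}\nProposed answer: {answer}\n"
        "Should this answer be kept? Reply yes or no."
    )
    return answer if verdict.strip().lower().startswith("yes") else None


def compete(question: str, answerer: LLM, others: Sequence[LLM]) -> Optional[str]:
    """Return the answer, or None (abstain) if most other models disagree."""
    answer = answerer(question)
    normalized = answer.strip().lower()
    # Count the comparison models whose answer differs (simple exact match).
    mismatches = sum(
        1 for other in others if other(question).strip().lower() != normalized
    )
    # Abstain when more than half of the comparison models do not match.
    return None if mismatches > len(others) / 2 else answer


# Toy stand-ins for real models, for illustration only.
def says_paris(prompt: str) -> str:
    return "Paris"


def says_lyon(prompt: str) -> str:
    return "Lyon"


def lenient_judge(prompt: str) -> str:
    return "yes"


def strict_judge(prompt: str) -> str:
    return "no"
```

With these toy models, `cooperate("Capital of France?", says_paris, strict_judge)` abstains because the judge says no, and `compete("Capital of France?", says_paris, [says_paris, says_lyon, says_lyon])` abstains because two of the three comparison models disagree.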
So far, I think there is no way for an LLM to be better than a human being. All human beings have biases, and all LMs do too.