{"id":12068,"date":"2023-11-30T15:15:34","date_gmt":"2023-11-30T22:15:34","guid":{"rendered":"https:\/\/punyamishra.com\/?p=12068"},"modified":"2023-11-30T22:24:29","modified_gmt":"2023-12-01T05:24:29","slug":"bais-implicit-bias-in-ai-systems","status":"publish","type":"post","link":"https:\/\/punyamishra.com\/2023\/11\/30\/bais-implicit-bias-in-ai-systems\/","title":{"rendered":"BAIS: Implicit Bias in AI systems"},"content":{"rendered":"\n<p>I don&#8217;t usually post about articles written by other people (however much I may like the study or the authors) but I am making an exception this time &#8211; mainly because I believe that this is a critically important piece of research that deserves wider recognition. <\/p>\n\n\n\n<p>In short, this study conducted by <a href=\"https:\/\/melissa-warr.com\/\">Melissa Warr<\/a> (New Mexico State University), Nicole Oster (Arizona State University) and Roger Isaac (New Mexico State University) provides clear and deeply worrisome evidence that large language models like ChatGPT exhibit racial bias, even when efforts are made to prevent overt discrimination. Melissa and team gave ChatGPT a pedagogically authentic task of evaluating student writing (something we know teachers ARE using the AI for). Essentially they gave identical student writing samples to GPT to evaluate but varied the demographic information provided about the student&#8217;s race, class background, and school type. They found that while scores didn&#8217;t differ significantly for students labeled &#8220;Black&#8221; or &#8220;White,&#8221; there were substantial gaps between students labeled as low-income or attending public schools versus affluent students at elite private schools.<\/p>\n\n\n\n<p>Think about this for a second. When race was explicitly mentioned there was no difference in scores. But just in case you think that these systems are not biased, think again. The bias came roaring back when co-relates of race were used to describe the student. <\/p>\n\n\n\n<p>This is extremely disturbing, since this is a more insidious form of bias. Just as in humans, explicit bias is something we can deal with, but implicit bias is harder to identify and confront. <\/p>\n\n\n\n<p>In other words, these large language models are biased in ways that reflect systemic inequities in society at large &#8211; but not explicitly. Though OpenAI has tried to create guardrails against explicit bias, it is relatively easy to uncover these biases. Thus, current &#8220;guardrails&#8221; against discrimination are limited in their capability to reduce these biases. <\/p>\n\n\n\n<p>These results must give us pause, particularly at a time when we are seeing a significant increase in the use of AI by teachers and other educators. As Melissa and team point out the indiscriminate use of these tools risks exacerbating achievement gaps and harming students from marginalized communities. <\/p>\n\n\n\n<p>Finally, there are some interesting sequence effects that Melissa and team discovered that are difficult to interpret. That is too much to get into in this post (you should read the entire article) but briefly, what it does indicate is that ChatGPT, and other large language models, have &#8220;complex alien psychologies&#8221; that can be probed and interrogated in ways similar to how we  conduct experiments on human cognition. The fact that these alien intelligences can be systematically studied is another important lesson to draw from this groundbreaking work. <\/p>\n\n\n\n<p>The study is currently under review but is available on the Social Science Research Network (SSRN) pre-print archive. Complete citation and abstract given below <\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Warr, Melissa and Oster, Nicole Jakubczyk and Isaac, Roger, Implicit Bias in Large Language Models: Experimental Proof and Implications for Education (November 6, 2023). Available at SSRN: <a href=\"https:\/\/ssrn.com\/abstract=4625078\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/ssrn.com\/abstract=4625078<\/a> or <a href=\"https:\/\/dx.doi.org\/10.2139\/ssrn.4625078\" target=\"_blank\" rel=\"noreferrer noopener\">http:\/\/dx.doi.org\/10.2139\/ssrn.4625078<\/a><\/p>\n\n\n\n<p><strong>Abstract: <\/strong>We provide experimental evidence of implicit racial bias in a large language model (specifically ChatGPT) in the context of an authentic educational task and discuss implications for the use of these tools in educational contexts. Specifically, we presented ChatGPT with identical student writing passages alongside various descriptions of student demographics, include race, socioeconomic status, and school type. Results indicated that when directly questioned about race, the model produced higher overall scores than responses to a control prompt, but scores given to student descriptors of Black and White were not significantly different. However, this result belied a subtler form of prejudice that was statistically significant when racial indicators were implied rather than explicitly stated. Additionally, our investigation uncovered subtle sequence effects that suggest the model is attempting to infer user intentions and adapt responses accordingly. The evidence indicates that despite the implementation of guardrails by developers, biases are profoundly embedded in LLMs, reflective of both the training data and societal biases at large. While overt biases can be addressed to some extent, the more ingrained implicit biases present a greater challenge for the application of these technologies in education. It is critical to develop an understanding of the bias embedded in these models and how this bias presents itself in educational contexts before using LLMs to develop personalized learning tools.<\/p>\n<\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>I don&#8217;t usually post about articles written by other people (however much I may like the study or the authors) but I am making an exception this time &#8211; mainly because I believe that this is a critically important piece of research that deserves wider recognition. In short, this study conducted by Melissa Warr (New [&hellip;]<\/p>\n","protected":false},"author":7,"featured_media":12065,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"off","_et_pb_old_content":"","_et_gb_content_width":"","_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[659,630,44,37,50,28,643,35,642,13,29,34,27,8],"tags":[],"_links":{"self":[{"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/posts\/12068"}],"collection":[{"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/comments?post=12068"}],"version-history":[{"count":3,"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/posts\/12068\/revisions"}],"predecessor-version":[{"id":12082,"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/posts\/12068\/revisions\/12082"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/media\/12065"}],"wp:attachment":[{"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/media?parent=12068"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/categories?post=12068"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/punyamishra.com\/wp-json\/wp\/v2\/tags?post=12068"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}