1/ There has never been a more concentrated distillation of my teaching than this lesson: Algos, Bias, Due Process, & You.

by colarusso@mastodon.social

    4/ The seed of the class was this reporting from 2016, "Machine Bias: There’s software used across the country to predict future criminals. And it’s biased against blacks." In it, the authors describe the use of a risk assessment tool by courts deciding questions like the granting of bail. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

5/ When that article was published, I was a data scientist at the public defender's office, and in my corner of the world it created a bit of a furor, kicking up discussion around a set of issues I wanted students to understand. I thought, maybe they could role-play as the court in a similar setup. So, I asked myself what students needed to understand to have an informed discussion, recognizing that I only had an hour and 50 minutes with them. 🤔


    colarusso@mastodon.socialC 1 Reply Last reply
    0
    • colarusso@mastodon.socialC colarusso@mastodon.social

      5/ When that article was published I was a data scientist at the public defenders, and in my corner of the world, it created a bit of a furor, kicking up discussion around a set of issues I wanted students to understand. I thought, maybe they could role play as the court in a similar setup. So, I asked myself what students needed to understand to have an informed discussion, recognizing that I only had an hour and 50 minutes with them. 🤔

      colarusso@mastodon.socialC This user is from outside of this forum
      colarusso@mastodon.socialC This user is from outside of this forum
      colarusso@mastodon.social
      wrote last edited by
      #6

6/ Here's what I came up with:
- accuracy isn't always the right performance measure (a quick sketch follows this list)
- mathematical models encode and replicate the biases found in their training data
- there can be competing and contradictory ideas of what makes something fair
- under certain conditions people are likely to over-rely on machine outputs (automation bias)
- the choices we make about how to use tools embody and reveal what we value
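
To make that first bullet concrete, here's a minimal sketch; the 5% base rate and the do-nothing model are illustrative assumptions, not numbers from the class. A model that never flags anyone can still post an impressive accuracy score when the event it's supposed to catch is rare.

```python
# A minimal sketch: accuracy can look great while the model is useless.
# The 5% base rate below is an illustrative assumption.

TOTAL = 1_000
POSITIVES = 50                # 5% of cases are the event we care about
NEGATIVES = TOTAL - POSITIVES

# A "lazy" model that predicts "no event" for everyone:
accuracy = NEGATIVES / TOTAL  # right on every negative, wrong on every positive
recall = 0 / POSITIVES        # catches none of the cases that matter

print(f"accuracy: {accuracy:.0%}")  # 95% -- looks impressive
print(f"recall:   {recall:.0%}")    # 0%  -- useless for the actual task
```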

7/ It occurred to me that all they knew about my session was that it would be on “algorithmic bias.” What if I could get them to experience automation bias first-hand? That would make it harder to dismiss as something that only other people fall victim to… A plan began to form.


        colarusso@mastodon.socialC 1 Reply Last reply
        0
        • colarusso@mastodon.socialC colarusso@mastodon.social

          7/ It occurred to me that all they knew about my session was that it would be on “algorithmic bias.” What if I could get them to experience automation bias first-hand? That would make it harder to dismiss as something that only other people fall victim to. . . . A plan began to form.

          colarusso@mastodon.socialC This user is from outside of this forum
          colarusso@mastodon.socialC This user is from outside of this forum
          colarusso@mastodon.social
          wrote last edited by
          #8

8/ I told them that for our first exercise they would all be using an AI assistant I built to review citations. After they had a chance to use it we would have a class discussion. I suggested they hold the following question in their head, “What makes something a good decision assistant?”

9/ You’ll notice mention of a “Rival Clerk,” along with a reminder that we’re measuring the user’s speed, among other things. Dear reader, the “Rival Clerk” is not one of their peers. It’s a dark pattern⁶ designed to make them keep going. There’s so much in here ripe for discussion.

⁶ https://en.wikipedia.org/wiki/Dark_pattern

10/ When they finished, users were shown a results screen that explained a bit more about the exercise. There were three possible outcomes: (1) no clear evidence of automation bias; (2) you may have fallen victim to automation bias; and (3) you likely fell victim to automation bias.


              colarusso@mastodon.socialC 1 Reply Last reply
              0
              • colarusso@mastodon.socialC colarusso@mastodon.social

                10/ When they finished, users were shown a results screen that explained a bit more about the exercise. There were three possible outcomes: (1) No clear evidence of automation bias (2) You may have fallen victim to automation bias; and (3) You likely fell victim to automation bias.

                colarusso@mastodon.socialC This user is from outside of this forum
                colarusso@mastodon.socialC This user is from outside of this forum
                colarusso@mastodon.social
                wrote last edited by
                #11

11/ Almost everyone fell victim to automation bias. The assistant's accuracy was 100% in phases 1 & 2, then dropped to 70%. Student performance started at 79% in phase 1, improved to 85% for a bit, but when the tool's accuracy declined, scores fell to 65%, worse than their initial performance.
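
For the curious, here's a sketch of one way results like these could be mapped onto the three outcomes above; the trial format and thresholds are illustrative assumptions, not the exercise's actual scoring code.

```python
# A hypothetical scoring sketch; the thresholds are illustrative assumptions.

def automation_bias_verdict(trials: list[dict]) -> str:
    """Each trial: {'assistant_correct': bool, 'user_correct': bool}."""
    assistant_errors = [t for t in trials if not t["assistant_correct"]]
    if not assistant_errors:
        return "no clear evidence of automation bias"
    # How often did the user go along with the assistant when it erred?
    followed = sum(not t["user_correct"] for t in assistant_errors)
    rate = followed / len(assistant_errors)
    if rate >= 0.7:
        return "you likely fell victim to automation bias"
    if rate >= 0.4:
        return "you may have fallen victim to automation bias"
    return "no clear evidence of automation bias"

# The assistant erred three times; this user followed two of those errors.
trials = [
    {"assistant_correct": True,  "user_correct": True},
    {"assistant_correct": False, "user_correct": False},  # followed the error
    {"assistant_correct": False, "user_correct": True},   # caught the error
    {"assistant_correct": False, "user_correct": False},  # followed the error
]
print(automation_bias_verdict(trials))  # you may have fallen victim...
```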

                  12/ Perhaps more telling is how often they relied on the assistant’s recommendation without consulting additional info (summary, authority, or excerpt). In phase 1, they avoided additional info 65% of the time. In phase 2, this went up to 80%, and in phase 3 it jumped to 84%.
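
Measuring that sort of reliance is straightforward if each decision logs whether the user opened any supporting material first; a minimal sketch, assuming a hypothetical log format:

```python
# A minimal sketch, assuming a hypothetical log format where each decision
# records whether the user opened the summary, authority, or excerpt.

decisions = [
    {"phase": 1, "opened_support": False},
    {"phase": 1, "opened_support": True},
    {"phase": 2, "opened_support": False},
    {"phase": 3, "opened_support": False},
]

for phase in (1, 2, 3):
    in_phase = [d for d in decisions if d["phase"] == phase]
    if not in_phase:
        continue
    blind = sum(not d["opened_support"] for d in in_phase) / len(in_phase)
    print(f"phase {phase}: took the assistant's word {blind:.0%} of the time")
```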

13/ We talked about what happened, and I hope the lesson sticks with them. Admittedly, the exercise was designed to push them to this result, but hopefully, by giving the phenomenon a name and forcing them to face the reality that it can happen to them, I've made it a concern they will carry into practice.

                      14/ Since we had just made use of a tool that purported to make predictions with some level of confidence, I suggested we might want to look more into what such tools are really telling us. So, I asked them the following.

[Image: in-class poll, options A–D, on what such a tool's predictions really tell us]

15/ Most people thought the answer was B. I suggested they think about that some more, divided them into groups of ~3, and asked that each group explore the following simulation together, after which we would talk. https://bail-risk-simulator-50382557550.us-west1.run.app/

                          16/ TL;DR: high-performing tests can be wrong about most of their positive predictions if the thing they're trying to predict is rare. Context matters!! We reran the above poll, and thankfully most folks changed their answer to D (I don't know). Always consider the base rate.
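
Here's the arithmetic behind that TL;DR, with made-up numbers (not the simulator's): a tool that catches 90% of true positives and wrongly flags only 10% of negatives is still wrong about most of its flags when the base rate is 1%.

```python
# Bayes' rule with made-up numbers: a strong test, a rare event.

base_rate = 0.01            # 1 in 100 people actually has the outcome
sensitivity = 0.90          # P(flagged | outcome)
false_positive_rate = 0.10  # P(flagged | no outcome)

p_flagged = (sensitivity * base_rate
             + false_positive_rate * (1 - base_rate))
ppv = sensitivity * base_rate / p_flagged  # P(outcome | flagged)

print(f"P(outcome | flagged) = {ppv:.1%}")  # ~8.3%: most flags are wrong
```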

                            17/ The next sim generated some great conversations & helped students confront something that doesn't get said enough. There can be competing and mutually exclusive concepts of fairness, and the policies that seek to deliver on one measure of fairness might have to change when the context changes.

18/ It (https://fairness-simulator-the-toilet-seat-dilemma-50382557550.us-west1.run.app/) lets you simulate what happens when folks following different rules share a toilet. It assumes 2 populations, "sitters" & "standers" (folks who sometimes stand), and lets you see how different behavior affects 2 costs:

(1) the cost of having to change the seat's position before you use the toilet; and

(2) the cost of having to clean the seat if the last person should have raised it but failed to.

                                19/ This means you have to assign a relative value to these costs and make assumptions about how frequent certain behaviors are among your groups. After you've dialed these in, however, you can simulate the outcome for 100 users at a time to see what happens.
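
A minimal sketch of that kind of simulation; the population mix, rates, and costs below are illustrative assumptions, not the simulator's values.

```python
import random

# Policy sketched here: standers leave the seat up; everyone flips the
# seat only when they need to. All parameters are illustrative assumptions.

P_STANDER = 0.5    # share of users who sometimes stand
P_STANDS = 0.7     # chance a "stander" stands on a given visit
P_FORGETS = 0.1    # chance a stander stands without raising a lowered seat
FLIP_COST = 1.0    # cost of changing the seat's position before use
CLEAN_COST = 5.0   # cost of cleaning up after a seat-down stand

def simulate(n_users: int = 100, seed: int = 0) -> float:
    rng = random.Random(seed)
    seat_up = False
    cost = 0.0
    for _ in range(n_users):
        stands = rng.random() < P_STANDER and rng.random() < P_STANDS
        if stands:
            if not seat_up:
                if rng.random() < P_FORGETS:
                    cost += CLEAN_COST  # stood with the seat down: a mess
                else:
                    cost += FLIP_COST   # raised the seat before standing
                    seat_up = True
            # leaves the seat up afterwards under this policy
        elif seat_up:
            cost += FLIP_COST           # had to lower the seat first
            seat_up = False
    return cost

print(f"total cost for 100 users: {simulate():.1f}")
```

Changing the policy (say, "always return the seat to down") or the population mix shifts who bears which cost, which is exactly where the fairness arguments start.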

                                  20/ There's a large universe of possible outcomes. If you're interested in what our groups found, here's a deeplink to my discussion of our debrief. TL;DR: There can be conflicting concepts of fairness, and the policies that deliver on one measure might have to change when context changes. https://suffolklitlab.org/algos-bias-due-process-you/#is-it-fair

                                    21/ Only now did I introduce the reporting on machine bias, sketching out the broad strokes. I focused on the fact that these tools can make different predictions for different populations based on their training data. Finally, I presented them with their role play.

[Image: the role-play prompt]

                                      22/ Here's the simulation they were asked to explore while considering the above. https://facial-recognition-bias-sim-50382557550.us-west1.run.app/
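
To give a flavor of the dynamic the simulation explores, here's a minimal sketch; the error rates and population sizes are illustrative assumptions, not the simulator's numbers. The same match threshold can produce very different false-match burdens across populations.

```python
# Same tool, same threshold, different false match rates by population.
# All numbers below are illustrative assumptions.

populations = {
    # name: (innocent people searched, false match rate)
    "group A": (10_000, 0.001),  # resembles the training data
    "group B": (10_000, 0.010),  # 10x the false match rate
}

for name, (searched, fmr) in populations.items():
    print(f"{name}: ~{searched * fmr:.0f} innocent people wrongly matched")
# group A: ~10 innocent people wrongly matched
# group B: ~100 innocent people wrongly matched
```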

                                        23/ Some folks suggested requiring equal treatment of populations before they would consider using the tech; others, setting high thresholds. We talked about requiring warrants before running a check, & I shared how MA has attempted to address these issues. https://www.nytimes.com/2021/02/27/technology/Massachusetts-facial-recognition-rules.html

24/ Then I told them that there actually was a federal law enforcement agency, ICE, actively using facial recognition out in the real world, and I asked what safeguards folks thought it had in place… Things got a bit quiet, and I shared the following reporting. https://www.404media.co/ices-facial-recognition-app-misidentified-a-woman-twice/
