I, For One...

What Little I Can Say About Image Moderation

(without violating my NDA).

As I’ve mentioned here before, until recently I ran the annotations department for a computer vision startup. Annotations is an odd space, because my single guiding objective in work life was to make the team as efficient (read: inexpensive) as possible. It’s also fascinating, because you have to try to mitigate things like bias (both statistical and social) in the datasets you’re building. And when you have three teams working in three different time zones, that’s a pretty hard thing to do.

On the one hand, you feel great because you’re creating jobs in underprivileged parts of the world, bringing digital literacy to communities of impoverished and/or oppressed groups. On the other hand, you might worry that you’re assigning work that could amount to, at minimum, a really bad working day.

We didn’t know, for instance, that if we sent a dataset of pornography to India for content moderation labeling, our all-female staff would be reassigned to other clients. Only males were “allowed” to look at sexually sensitive content, and they had to do it in a private room.

We did know that the market for annotations was absolutely exploding. I must have spoken to at least 30 different vendors in countries all around the world, with salesfolk reaching out to me constantly with new value props. Some vendors were social-good companies; some were reminiscent of modern-day slave auctioneers.

What you might not know is that annotations can get really expensive, really quick.

Let’s say you want to build a model that recognizes child pornography. Seems noble, right? If you could programmatically detect that a user upload is child pornography (or “KP,” as the industry calls it), you could immediately call the FBI and catch the predator, maybe even rescue the child.

What no one ever talks about is that, first, you’ll need a dataset of a million photos of known KP in order to train.

An example: you’re a platform and you have a database of 100M photos where maybe 1% of them are KP. With transfer learning and/or a knowledge graph, maybe you only need 10,000 photos of kids to train a model. So first, you’ll need your annotators to comb through approximately 1,000,000 photos, and tell you which ones are KP. You want to be sure your annotations are correct, so you want a 3x consensus rate, meaning every photo needs to be seen by at least 3 people where all 3 agree in order for that photo to count in your training set.
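
To make that consensus rule concrete, here’s a minimal sketch in Python (the function and label names are mine, purely for illustration) of how a photo only “counts” when every annotator returns the same label:

```python
from collections import Counter

def consensus_label(labels: list[str], required: int = 3) -> str | None:
    """Hypothetical helper: `labels` holds one judgment per annotator,
    e.g. ["kp", "kp", "not_kp"]. The photo only enters the training set
    if at least `required` annotators have looked at it AND they all agree;
    otherwise it is dropped (or queued for more review)."""
    if len(labels) < required:
        return None  # not enough independent judgments yet
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes == len(labels) else None
```

The important consequence is that every photo costs at least three human judgments, even when the first two already agree, which is where the 3x multiplier below comes from.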

So 3 million photos need to pass before human eyes before you can train your model, and it takes a lot of effort to train those eyes on what exactly you mean by “child pornography”.

If you think defining child pornography is straightforward, let me tell you, you are wrong! As I type this, there are flashes of images burned into my brain forever that I can tell you blur the line.

I’m just pulling a number out of the air here, but I’d say that a person can really only label maybe one image every 3 seconds on a binary classification task (e.g. porn/not porn).

So that’s 3 million photos at 3 seconds each, or 9 million seconds. That works out to 2,500 hours, which is about $20,000 at current market annotator rates ($8/hr in a per-hour model; there are other pricing models, but maybe I’ll talk about them some other time).
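
If you want to sanity-check that arithmetic, it fits in a few lines. Here’s a rough back-of-envelope sketch in Python; every number is the one quoted in this post, not a real vendor quote:

```python
# Back-of-envelope annotation cost estimate, using the numbers from this post.
PHOTOS_TO_REVIEW = 1_000_000   # photos annotators must comb through
CONSENSUS = 3                  # each photo judged by 3 people who must all agree
SECONDS_PER_IMAGE = 3          # rough throughput on a binary porn/not-porn task
HOURLY_RATE_USD = 8            # rough market rate in a per-hour pricing model

total_judgments = PHOTOS_TO_REVIEW * CONSENSUS            # 3,000,000 views
total_hours = total_judgments * SECONDS_PER_IMAGE / 3600  # 2,500 hours
total_cost = total_hours * HOURLY_RATE_USD                # $20,000

print(f"{total_judgments:,} views, {total_hours:,.0f} hours, ${total_cost:,.0f}")
```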

That may be a pittance for big companies like Google, but when you’re a startup or a researcher, it could represent a pretty large chunk of your budget. And all for one measly model.

If that weren’t already hard enough, consider what might happen if, after your 2,500 hours have passed, you come to find out that your overseas workforce inadvertently labeled all the photos of same-sex couples as “porn” due to cultural differences. You’d have to start the process all over again.

I don’t want to be controversial here. Content moderators NEED better working conditions. They need to be regarded more highly because of the importance of the work that they do. They deserve better working hours, more autonomy, and access to health services that will protect them from work that we aren’t sure is harmful, but that some think might be. (If it feels like I’m walking on some legal eggshells here, rest assured, I super am).

As usual, I don’t claim to have the answer. I know I tried to do my part to be as ethical as possible, working only with non-profits or companies that had social good in mind. Dignified work, etc. We had in-house annotators, and I spent a lot of time talking to them about how I could make their daily lives better. I built advancement ladders for every annotator on my team, making sure they had access to education and career planning that would help them branch out into any other department of the company that they wished.

In the end, they told me that the work itself is what motivated them to press on, even on the worst days. They were keeping kids safe; helping make the internet a better place. And that is what made it worthwhile.

At the very least, I propose a giant Thank You to content moderators everywhere.

What would we ever do without you?