Disinformation Twitchers to the Rescue...or not?

Twitter’s Birdwatch: A Community Driven Solution to Mis/Disinformation?


It’s been just over three weeks since a mob, which increasingly appears to have been inspired by online conspiracy theories alongside a judicious dose of incitement from pundits and politicians, broke into the United States Capitol building and ran amok. These events, which saw the deaths of five people on the day and the potentially related suicides of two police officers since, have led social media platforms such as Facebook, Twitter, Instagram, and even Discord (among others) to fall over themselves making changes to the way their platforms operate, to ensure that such an incident can never again be connected to their services.

In the same week that Facebook’s much vaunted and (apparently) independent Oversight Board is due to be discussing the long-term future of Donald Trump’s presence on the platform, Twitter has announced a new approach to content moderation on its platform: Birdwatch. I’ll let them make the announcement…

People come to Twitter to stay informed, and they want credible information to help them do so. We apply labels and add context to Tweets, but we don’t want to limit efforts to circumstances where something breaks our rules or receives widespread public attention
— https://blog.twitter.com/en_us/topics/product/2021/introducing-birdwatch-a-community-based-approach-to-misinformation.html

The quote above is taken from an official blog post announcing the pilot of this new feature and seeking to explain the rationale behind it and how exactly it will work. What this one line suggests is something in Twitter’s favour. They recognise that efforts so far, such as labelling tweets that contain potentially misleading or harmful information, or those which breach their usual terms of service but remain in place due to ‘newsworthiness’, have not been sufficient to deal with the wider issue of disinformation in general.

In the hour or so that it took Twitter to label one of Donald Trump’s tweets and apply restrictions on who could engage with it or how it could be spread, the initial message would already have spread to a vast network of followers, who would in turn have spread it to their interconnected groups. The very first segment of the announcement video for Birdwatch that I have included above alludes to this very issue. So, here’s what I have to ask: why will this form of labelling and contextualising be more effective, and why is Twitter not bringing in independent, professional fact checkers to do this work? In short, why the community approach?

Building Together

[Communities polled] valued notes being in the community’s voice (rather than that of Twitter or a central authority) and appreciated that notes provided useful context to help them better understand and evaluate a Tweet (rather than focusing on labeling content as “true” or “false”).

— https://blog.twitter.com/en_us/topics/product/2021/introducing-birdwatch-a-community-based-approach-to-misinformation.html

The question of who this platform is aimed at is central to answering all of these questions. Twitter emphasises two key terms within this press release: community and transparency. These two put together have seemingly led the company to conclude that broad community moderation tools, operated through transparent means, are the best way of dealing with an issue at such a scale that Twitter is unable to take an active hand itself.

Twitter, quite rightly, points out that its users are not only uninterested in simple truth statements regarding the questionable information they come across on the platform, but that they don’t want any such interventions to come from Twitter itself or a ‘central authority’. So, no moderation or added context from a central authority (which could be almost anything: a state, an oversight board, etc.) and none from Twitter. That leaves… the community? Correct me if I’m wrong, but isn’t the Twitter community, broadly understood, the same community that has organically produced incentives for the spread of mis- and disinformation? The same community which, until recently, included Donald Trump? I thought so.


Community to the rescue?

By allowing anyone (within the US) to attach notes to any tweet they see fit, the potential for this feature to become, intentionally or unintentionally, simply a parallel platform suffering from the same issues surrounding mis- and disinformation is, unfortunately, not zero.

Moderators, are you ready?

First up, who can sign up to be on Birdwatch? For now, the scheme is in a pilot phase and is limited to a small number of users in the United States. It would be wonderful for this, and any other moderation tool proven to be successful, to be rolled out globally. But the issue of universal policies, applications of standards, and moderation toolsets is for a whole other blog post.

According to the Birdwatch Guide, those seeking to sign up will have to:

  1. Have a verified phone number and email address attached to their Twitter account.

  2. Be using a trusted US phone carrier.

  3. Have enabled two-factor authentication.

  4. Have no recent violations of Twitter’s rules.
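Taken together, the four criteria read like a simple boolean eligibility gate. A minimal sketch in Python (all field names and the carrier list are invented for illustration; this is not Twitter’s actual API or schema):

```python
# Hypothetical illustration of the Birdwatch pilot's eligibility criteria.
# All field names and the carrier list are invented for this sketch.
TRUSTED_US_CARRIERS = {"Verizon", "AT&T", "T-Mobile"}

def is_eligible(account: dict) -> bool:
    """Return True only if an account meets all four published criteria."""
    return bool(
        account.get("phone_verified")                      # criterion 1
        and account.get("email_verified")                  # criterion 1
        and account.get("carrier") in TRUSTED_US_CARRIERS  # criterion 2
        and account.get("two_factor_enabled")              # criterion 3
        and not account.get("recent_rule_violations")      # criterion 4
    )
```

An account failing any single check (say, two-factor authentication disabled) is rejected outright; the criteria are conjunctive, not scored.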

So far so good. The second is even a good piece of forethought, intended to go hand-in-hand with the first in order to prevent low-cost, throwaway phone numbers from being used to create accounts. It’s clear that Twitter ultimately intends to give access to Birdwatch to anyone who can fulfil these criteria. However, a section discussing how they will cope with demand expected to exceed the number of pilot places is somewhat problematic…

We will admit all participants who meet the required criteria, but if we have more applicants than pilot slots, we will randomly admit accounts, prioritizing accounts that are likely to participate due to having been recently active on Twitter, and those that tend to follow and engage with different tweets than existing participants do — so as to reduce the likelihood that participants would be predominantly from one ideology, background, or interest space.
— https://twitter.github.io/birdwatch/contributing/signing-up/
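The quoted admission policy amounts to a weighted lottery: everyone eligible can win a slot, but recent activity and engagement diversity tilt the odds. A speculative sketch of how such a draw might work (the weighting scheme and all field names are my own invention, not Twitter’s published algorithm):

```python
import random

def admit_pilot(applicants, existing_topics, slots, seed=0):
    """Randomly admit up to `slots` applicants, weighting recent activity
    and engagement that differs from the existing participant pool.

    applicants: list of dicts with 'name', 'recently_active', 'topics'.
    existing_topics: set of topics current participants already engage with.
    """
    def weight(a):
        w = 2.0 if a["recently_active"] else 1.0  # prioritise active accounts
        # Reward applicants whose interests are not yet represented.
        novel = len(set(a["topics"]) - existing_topics) / max(len(a["topics"]), 1)
        return w * (1.0 + novel)

    rng = random.Random(seed)  # seeded only to make this sketch reproducible
    pool = list(applicants)
    admitted = []
    while pool and len(admitted) < slots:
        pick = rng.choices(pool, weights=[weight(a) for a in pool])[0]
        pool.remove(pick)
        admitted.append(pick)
    return admitted
```

Note that even the lowest-weighted applicant retains a non-zero chance of admission, which matches the “randomly admit” language in the quote.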

Fact vs. Opinion

We were off to a rather good start, but this statement suggests that alongside transparency and community there’s a third major concern for Twitter: neutrality.

In the search for neutrality, Twitter conflates facts with opinions. This selective measure for the scheme’s pilot hints at an attempt to craft a balanced moderation community for the testing phase out of a community whose imbalance and lack of neutrality play a major role in the need for the scheme in the first place.

The pilot scheme, through its prioritisation of membership in the name of creating a diverse and balanced testbed, risks creating an inaccurate simulacrum with which to assess the final product’s viability as a means of moderating the community in its entirety.

It’s also worth noting that there doesn’t appear to be any form of expertise verification built into Birdwatch. While it does allocate some weight to notes which provide original sources to back up their analysis, there doesn’t appear to be any need for those leaving notes to verify who they are (beyond the telephone number and email address attached to their accounts) or to offer any evidence of their expertise on the subject at hand. In taking note of their community’s wish not to have information moderated or contextualised by a central authority or by Twitter itself, the company has somehow failed to recognise that the historic defenders of facts (not opinions) are scientists. It has also, once again, failed to make any space for independent fact checking. Groups like Full Fact, Africa Check, Snopes et al. have been engaged in the difficult work of contextualising information and defending the facts for quite some time, and have been engaged by some of Twitter’s competitors to bring this service directly to their users. Why does Twitter still avoid this option?

At Last! A Disinformation Reporting Tool!

As we can see in the video above, users will also be able to essentially flag content which they feel is misleading. In effect, alongside the contextualisation of content that Twitter’s polled users are after, this will create a dedicated reporting tool for mis/disinformation. Users will be able to choose from a list of reasons why they view a tweet as misleading:

  • It contains a factual error.

  • It contains a digitally altered photo or video. - (Twitter already flags content of this type)

  • It contains outdated information that may be misleading.

  • It is a misrepresentation or missing important context.

  • It presents an unverified claim as fact.

  • It is a joke or satire that might be misinterpreted as fact.

  • Other

All of the above would provide important data points for assessing any community moderation activities that users of this platform undertake on behalf of the company. As an academic, it’s the kind of data that I would pay (beg my department) for. However, there is a problem. Within the study of disinformation there is significant research suggesting that those who have not been taught critical thinking and media literacy skills are more susceptible to misleading information.
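For what it’s worth, the reporting options above map neatly onto a closed set of categories plus an open-ended “Other”, the kind of structure that makes the resulting data easy to analyse. A hypothetical encoding (the enum names and values are mine, not an official Twitter schema):

```python
from enum import Enum

class MisleadingReason(Enum):
    """Hypothetical encoding of Birdwatch's reporting options."""
    FACTUAL_ERROR = "contains a factual error"
    MANIPULATED_MEDIA = "contains a digitally altered photo or video"
    OUTDATED = "contains outdated information that may be misleading"
    MISSING_CONTEXT = "misrepresentation or missing important context"
    UNVERIFIED_CLAIM = "presents an unverified claim as fact"
    SATIRE = "joke or satire that might be misinterpreted as fact"
    OTHER = "other"
```

Every report except OTHER is immediately machine-countable, which is exactly what makes this such a rich dataset for researchers.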


Who moderates the moderators…umm…the moderated?

Community Moderation Moderated by More Community Moderation

Participants can rate the helpfulness of others’ notes. Ratings help identify which notes are most helpful, and allow Birdwatch to raise the visibility of those found most helpful by a wide range of contributors. Ratings will also inform future reputation models that recognize those whose contributions are consistently found helpful by a diverse set of people.
— https://twitter.github.io/birdwatch/about/overview/

This bit is actually where Twitter demonstrates that it is not only aware of at least some of the limitations of community moderation approaches, but that it has also tried to pre-emptively curtail some of them. Birdwatch notes can be rated by other users. In the example below, the highly rated note is credited with:

  • Being concise.

  • Using easy to understand language.

  • Citing a reputable source relevant to the subject in question.

Whereas the example included as a poorly rated note is instead described as being:

  • Inflammatory in the use of language.

  • An attack against the author of the original tweet and not the content.

  • Absent of any further information with regards to the content of the original tweet.

More examples of the rating system in action can be found here: https://twitter.github.io/birdwatch/contributing/examples/


The apparent rating categories demonstrated by the above examples (and others listed at the attached link) reflect, firstly, the core values which all users of Birdwatch are asked to sign up to and adhere to. One of these deals with personal attacks and restrictions on using inflammatory language. Secondly, it appears to be a rating tool based upon very standard categories of critical thinking and debate analysis, which is actually pretty good. At least, it is good if the individuals you have recruited to be part of this community moderation project understand these critical thinking tools and are able to apply them in a non-biased way…
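To make the stakes of the ratings layer concrete, here is one speculative way a “diverse set of people” requirement could be operationalised: count each rater group’s endorsement of a note only once, then rank notes by how many distinct groups found them helpful. This is my own toy model, not the reputation system Twitter describes:

```python
from collections import defaultdict

def rank_notes(ratings):
    """Rank notes by the number of *distinct* rater groups that found
    them helpful, so rating volume within one group cannot dominate.

    ratings: iterable of (note_id, rater_group, helpful: bool) tuples.
    """
    helpful_groups = defaultdict(set)
    for note_id, group, helpful in ratings:
        if helpful:
            helpful_groups[note_id].add(group)  # dedupe within a group
    return sorted(helpful_groups,
                  key=lambda n: len(helpful_groups[n]),
                  reverse=True)
```

Under this scheme a note brigaded by a single like-minded group ranks below one endorsed across groups; it is, of course, only as good as the method used to assign raters to groups in the first place.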

Continuing Tweaks & Improvements

I know all of the above might appear somewhat critical. This is a good idea, at least at its heart. There are, however, a significant number of pitfalls to this approach, various second-order effects that could be triggered by poor implementation. The rating system does a lot of heavy lifting in limiting the potential misuse or co-option of this community moderating effort for the further spreading of mis- or disinformation. By applying significant levels of railroading and limiting the forms this rating system can take, further risks are potentially annulled.

In short, Birdwatch has promise, but it faces a significant number of hurdles before it can realise it. Not least of these is the reality of the toxic communities that have congregated on Twitter and other social media platforms, which are now, in effect, being asked to contribute to a self-policing effort, with Twitter once again relying on terms and conditions to restrain poor or harmful behaviour. Let’s hope this time it works.

p.s. Also, if you could make it so that academics could access and analyse this data outside of the US, that would be great. Thanks!

Alexi Drew