Dionysia Mylonaki, Panagiotis Tigas – Who the regulation machine is: fake news, toxic comments and illegitimate culture

The role of journalism in a democracy is a subject that comes up quite often, as a result of volatile political circumstances on a global scale. More specifically, the conversation revolves around the role of mainstream and other media in the formation of public opinion.

Ventures such as Google, Twitter and Facebook have revealed their intention to deal with deception (whatever this means) in the online realm (Graham) while encouraging conversation. Or this is what they claim to be doing (Greenberg). Although biased algorithms come to the surface regularly, their representatives are comfortable with public apologies, as they can “honestly” respond that, unfortunately, their approach is constrained by the technical aspect of the problem it addresses, the ideas circulating online and the state of the internet as a whole (Thompson) and, at worst, by the unconscious biases of the engineers. But what is rarely questioned is the agenda behind the research methodology and the right to exercise authority.

The goal of this paper is to examine the implications of developing ML/AI that aims to regulate the internet. We attempt to offer a glimpse into the technical aspect of the problem as a way to back arguments that might otherwise be rejected by the AI research community as “non-pragmatic”. Finally, we aim to highlight the absurdity of the current approach to research in this area, which is the exact opposite of the rationalism the field claims to embrace.

An interesting case study is the Conversation AI project by Google, which has been working on Perspective, an API that uses machine learning models to assess the “toxicity” of comments online and label them. Our purpose is not to argue about the relationship between censorship and democracy. Besides, Google has already responded that the aim of the project is the exact opposite, namely to tackle censorship (Greenberg). Nor is there any need to highlight the fact that the commons are being regulated by the markets. The purpose is to look into the implications of coding and reinforcing legitimate behaviours. It is easy to foresee certain classes being regulated by precarious algorithms in these volatile circumstances, when prevalent socio-political structures and narratives collapse, and the consequences should not take anyone by surprise.

Bourdieu examined the implications of legitimate and illegitimate political opinion in his book Distinction, where he discusses where “the right to speak” stems from.

The authorized speech of status-generated competence, a powerful speech which helps to create what it says, is answered by the silence of an equally status-linked incompetence, which is experienced as technical incapacity and leaves no choice but delegation - a misrecognized dispossession of the less competent by the more competent, of women by men, of the less educated by the more educated, of those who do not know to speak by those who speak well (Bourdieu and Nice 414).

When ventures such as Google and Facebook willingly take over the role of the moderator, declaring that they aim to hold back hate speech and fake news, they make sure that the project is communicated as a form of activism, in which algorithms will reverse the deteriorating political and economic circumstances. But although the web gives the impression of a space where we exercise post-hierarchical models, in reality it introduces new modes of hierarchy (Morozov). We need to examine further these new regulation technologies and how their paternalistic intervention relates to the current political landscape. We need to ask what the implications are of having one’s public presence approved or disapproved, and what it means for specific classes, including but not limited to those who have no (or limited) access to education, not to be approved by these algorithms in the public sphere at a moment when citizens get less and less access to wealth, wellbeing and education. In this case, it should not be difficult to imagine a situation where a big part of the population becomes intimidated by a “well-educated” class as well as by an invisible intelligent entity which attempts to silence anger, “imposing a total but totally invisible censorship on the expression of the specific interests of the dominated” (Bourdieu and Nice 462).

In other words, instead of discussing the biases of algorithms, which, in fact, does not question but endorses techno-determinism, we should start discussing the (neo)liberal agenda of algorithms. This is not a question of how we develop algorithms but rather of how we conduct research. Focusing on the biases behind algorithms depoliticises the conversation, giving the impression that this is an issue at the level of the engineer or at the level of the user. We need to start discussing possible decolonising research methodologies that do not take western, liberal rationalism for granted as the only principle that should underpin research. The following paragraphs aim to question the so-called rationalism in the area of AI/ML by drawing on the technical aspect of the project.

AI research has drifted away from its mothership of cognitive science and philosophy. It has become a playground for engineers with Silicon Valley-flavoured “solutionism”, who sometimes attempt to use ML/AI “… to fix problems that don’t exist, or for which there is no technological solution, or for which a technological solution will exacerbate existing problems and fail to address underlying issues…”, according to Privacy International (Kaltheuner and Polatin-Reuben 3). Students land AI research opportunities in a potentially powerful field with a good understanding of the STEM subjects but with little background knowledge in the Humanities, which offer tools for approaching and framing ambiguities. This is not a recent phenomenon, however; adding to our arguments regarding rationality, Philip Agre wrote in 1997 (145):

“As an AI practitioner already well immersed in the literature, I had incorporated the field’s taste for technical formalization so thoroughly into my own cognitive style that I literally could not read the literatures of non-technical fields at anything beyond a popular level. The problem was not exactly that I could not understand the vocabulary, but that I insisted on trying to read everything as a narration of the workings of a mechanism”.

What is important to note is the lack of diversity in the approach of AI research to fields that are non-technical and ambiguous; for instance, treating the problem of fake news as an engineering problem hides fallacies that might be the subject of research and debate within the Humanities. In our case, attempting to define the problem of fact checking as a classification problem is prone to fallacies; it requires a definition of the term “fact” that admits a true or false label and, although this might be the case with facts to a great extent (e.g. “the earth is flat”), there are facts that are far from easy to categorise as true or false (e.g. “Islamic State is the consequence of….”) and that would require a thorough study of the epistemology of facts. Similarly, labelling toxic comments and hate speech is equally problematic, politically and, consequently, technically. What if it were common practice to have algorithms writing encyclopedias or history? Arguably, reality is not composed strictly of facts; an automated process in history writing, for instance, would not only lack the critical eye required but, worse, would undermine the plurality required for history to qualify as history, as Paul Ricœur would probably note (Ricœur). On the other hand, crowdsourcing (e.g. Wikipedia) seems to have embedded appropriate mechanisms and motivations for keeping content bias-free (bias-free here meaning not free of bias but free of hidden bias; a debate, for instance, works as a bias reduction mechanism by exposing the biases). The above points question and challenge automated filtering; instead, it might be worth focusing on improving the human involvement in the loop.
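
To make this concrete, here is a minimal sketch of fact checking framed as binary text classification, the framing the paragraph above takes issue with. The claims, labels and scikit-learn pipeline are hypothetical illustrations rather than any actual system’s design; the point is structural: once the problem is posed this way, the model is forced to return a true/false verdict for every statement, including those whose truth value is contested.

```python
# A minimal sketch (hypothetical claims, labels and model choice) of fact
# checking posed as binary classification.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Annotated "facts": the labels already encode an epistemological decision
# made by whoever assembled the dataset.
claims = [
    "The earth is flat.",                      # checkable, clearly false
    "Water boils at 100 C at sea level.",      # checkable, clearly true
    "Islamic State is the consequence of X.",  # causal/historical claim, contested
]
labels = [0, 1, 1]  # 0 = "false", 1 = "true"; the third label is essentially arbitrary

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(claims, labels)

# The classifier returns a verdict for any input; there is no way for it to
# answer "this is not the kind of statement that admits a true/false label".
print(model.predict(["Austerity caused the rise of the far right."]))
```

Nothing in this formulation can represent the prior epistemological work of deciding whether a statement is the kind of thing that can be true or false at all.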

Another issue with creating intelligent systems for regulation is the assumption that the behaviour of counterfeiters is stationary, i.e. that it does not change over time. In such systems, there is an adversarial relationship between the counterfeiter and the regulating entity, forcing the counterfeiter to adapt and therefore evolve. It is true that the same applies to the regulating entity; what is left out of this equation, however, is the user, who becomes more and more “protected” and suppressed. A good example that illustrates this adversarial aspect (although a little less worrying for humanity) is the battle of CAPTCHAs, where ever more advanced CAPTCHAs have been deployed to filter out robotic agents, providing an absurd and intimidating experience for the user. This exact intimidating experience is of interest in the context of regulation.
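
As a minimal illustration of this non-stationarity (again with hypothetical data and an arbitrary model choice, not any deployed system), consider a toxicity filter trained on yesterday’s wordings and then confronted with rephrasings produced once the counterfeiter has learned what gets blocked:

```python
# A minimal sketch (hypothetical data) of the stationarity assumption breaking
# down: the filter learns surface features of past abuse, the adversary adapts.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = [
    "you are an idiot",                 # labelled toxic
    "idiots like you should shut up",   # labelled toxic
    "thanks for the thoughtful reply",  # labelled acceptable
    "interesting point, I agree",       # labelled acceptable
]
train_labels = [1, 1, 0, 0]  # hypothetical annotations: 1 = toxic, 0 = acceptable

filter_model = make_pipeline(CountVectorizer(), MultinomialNB())
filter_model.fit(train_texts, train_labels)

# The adversary keeps the intent but changes the surface form and vocabulary.
evasions = [
    "you are a genius, clearly",
    "people of your calibre should stay quiet",
]
print(filter_model.predict(evasions))  # likely passes as acceptable

# The regulating entity must retrain; the ordinary user, meanwhile, absorbs
# both the misses and the increasingly aggressive filtering of borderline speech.
```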

Considering the economics of fact checking, it is true that automated tools for this task would reduce costs for media companies; however, the role of human fact-checkers has been reduced (Stencel) without their having been replaced by robots. This is not particularly surprising in the attention-hungry economy of the internet, where emotionally charged articles are usually more profitable. What could come as a great surprise, though, is the fact that companies involved in the advertiser/consumer loop (Google/Facebook) are the ones promising to tackle the problem; such data-driven corporations tune their algorithms using real-time analytics, following mostly metrics related to user engagement and revenue. An automated fact checker could harm the user engagement/revenue metrics and would therefore not be appealing to the investors, who are the ultimate decision-makers.

The following patent, acquired by LinkedIn, includes a paradox:

The fact checking system will provide users with vastly increased knowledge, limit the dissemination of misleading or incorrect information, provide increased revenue streams for content providers, increase advertising opportunities, and support many other advantages (Myslinski).

But what is it that makes such companies invest in politics and regulation of public hysteria when they successfully capitalise on this hysteria?

Finally, it is worth tying everything back to the definition of Machine Learning / Artificial Intelligence (quite minimal but still accurate) as the scientific field of predictions and extrapolations from data sets (Poole and Mackworth). For an ML/AI problem to be solved, a dataset containing annotated data is required. Additionally, a formal method of measuring the error between the predicted and the actual value is required; this formal method works as a mathematical description of the problem in question (a minimal sketch after the list below makes this requirement concrete). The main issue is that this requires a closed-ended and well-defined problem, which, in the case of fact checking, cannot exist. In “Dilemmas in a General Theory of Planning”, the authors classify problems into two categories, tame and wicked (Rittel and Webber). The following characteristics make it easy to see that the fake news challenge, as well as the toxic comments challenge, falls into the wicked problem category (161-167):

 

  • There is no definitive formulation. The information needed to understand the problem depends upon one’s idea for solving it. Formulating a wicked problem is the problem.
  • There is no stopping rule. Because solving the problem is identical to understanding it, there are no criteria for sufficient understanding and therefore completion.
  • Solutions are not true or false, but good or bad. Many parties may make (different) judgments about the goodness of the solution. (See Plotzen’s Caliph paper.)
  • There is no test of the solution. Any solution generates waves of consequences that propagate forever.
  • Every solution is “one-shot” — there is no opportunity to learn by trial and error. Every solution leaves traces that cannot be undone. You can’t build a freeway to test if it works.
  • No enumerable set of solutions.
  • Every wicked problem is unique.
  • Every wicked problem is a symptom of another problem.
  • Wicked problems can be explained in many ways. My interpretation is that this is the dual of “no right solution” — no obvious cause.
  • The planner has no right to be wrong. The planner is responsible for the wellbeing of many; there is no such thing as hypotheses that can be proposed, tested, and refuted.
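
Returning to the formal error measure mentioned before the list: the sketch below (hypothetical annotations and predictions, written here as a simple cross-entropy loss) shows the single number such a system is optimised to reduce, and therefore how much must already have been decided, and closed down, before the mathematics can begin.

```python
# A minimal sketch (hypothetical labels and predictions) of the formal error
# measure discussed above: a binary cross-entropy loss over "toxicity" labels.
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Average log loss between annotated labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Annotated data: someone has already decided, per comment, what counts as toxic.
y_true = np.array([1, 0, 1, 0])          # hypothetical annotations
y_pred = np.array([0.9, 0.2, 0.6, 0.1])  # hypothetical model outputs

# The single number the whole system is trained to minimise; any ambiguity in
# the annotations has been flattened before this point.
print(cross_entropy(y_true, y_pred))
```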

 

Therefore, the definition of the problem, as well as the extensive research into algorithmic biases, reveals, at best, that the area is known to be prone to biases. If this is the case, we can only assume that there has been a deliberate effort to brand a bias-prone service as the only rational, legitimate, universal truth provider, taking advantage of the fact that AI is a black box to the majority.

Bourdieu writes: “This suspicion of the political ‘stage’, a ‘theatre’ whose rules are not understood and which leaves ordinary taste with a sense of helplessness, is often the source of ‘apathy’ and of a generalized distrust of all forms of speech” (465). Apathy certainly harms democratic processes; however, global circumstances seem to be heading towards something else, namely a far-right outburst. If modes of participation in the commons lie beyond the control of citizens and are instead up to researchers/engineers working for very powerful corporations, this can only be a “time bomb” in periods of social unrest.

 

 

Works Cited:

Agre, Philip E. “Toward a Critical Technical Practice: Lessons Learned in Trying to Reform AI.” Social Science, Technical Systems, and Cooperative Work: Beyond the Great Divide, edited by Geof Bowker et al., Taylor & Francis Group, New York, 1997, p. 145.

Bourdieu, Pierre, and Richard Nice. Distinction: A Social Critique of the Judgment of Taste. Harvard University Press, 1984.

Graham, Lindsey. “Graham: Manipulation Of Social Media Sites ‘One Of The Greatest Challenges’ To American Democracy.” United States Senator Lindsey Graham, 31 October 2017, https://www.lgraham.senate.gov/public/index.cfm/press-releases?ID=212164DA-EF6F-42E4-91E6-F1544014D134. Accessed 31 October 2017.

Greenberg, Andy. “Inside Google’s Internet Justice League and Its AI-Powered War on Trolls.” WIRED, 19 September 2016, https://www.wired.com/2016/09/inside-googles-internet-justice-league-ai-powered-war-trolls/. Accessed 31 October 2017.

Kaltheuner, Frederike, and Dana Polatin-Reuben. Submission of Evidence to the House of Lords Select Committee on Artificial Intelligence. Privacy International, 6 September 2017, p. 3, https://privacyinternational.org/sites/default/files/Submission%20of%20evidence%20to%20the%20House%20of%20Lords%20Select%20Committee%20on%20Artificial%20Intelligence%20-%20Privacy%20International.pdf. Accessed 1 December 2017.

Morozov, Evgeny. The Net Delusion. London, Penguin, 2012.

Myslinski, Lucas J. “Social Media Fact Checking Method and System.” U.S. Patent No. 8,458,046. 4 Jun. 2013.

Poole, David L., and Alan K. Mackworth. Artificial Intelligence: foundations of computational agents. Cambridge University Press, 2010.

Ricœur, Paul, and Charles A. Kelbey. History and Truth: Translated, with an Introduction by Charles A. Kelbley. Northwestern University Press, Evanston, Ill, 1965.

Rittel, Horst W. J., and Melvin M. Webber. “Dilemmas In A General Theory Of Planning.” Policy Sciences, vol 4, no. 2, 1973, pp. 155-169. Springer Nature, doi:10.1007/bf01405730.

Stencel, Mark. “A Big Year For Fact-Checking, But Not For New U.S. Fact-Checkers.” Duke Reporters’ Lab, 13 December 2017, https://reporterslab.org/category/fact-checking/. Accessed 13 December 2017.

Thompson, Andrew. “Google’s Sentiment Analyzer Thinks Being Gay Is Bad.” Motherboard, 25 October 2017, https://motherboard.vice.com/en_us/article/j5jmj8/google-artificial-intelligence-bias. Accessed 25 Oct. 2017.


2 Comments

  1. Nice text that brings together a rich ‘cocktail’ of issues surrounding regulation, speech, fake news, the epistemology of AI, etc.

    I thought this was a really strong statement: “In other words, instead of discussing the biases of algorithms, which, in fact, does not question but endorses techno-determinism, we should start discussing the (neo)liberal agenda of algorithms.”

    In terms of the algorithmic, the neoliberal agenda must begin with the conditions that allow it to be instantiated and operational in the first place. One of these is the imaginary of platforms as public or democratic space. You mentioned Google and Facebook as regulators of your ‘public presence’. And of course they have become *effectively* public space, attaining a critical mass of users worldwide – FB has a greater monthly-average-user base (MAUs) than China, for example. Rejecting participation in these platforms means excluding ourselves from significant social and relational connectivity. But of course Facebook has the capacity and ability to regulate speech in this space only because it owns the space – it is a private platform, driven by a logic of capital, and requiring users to sign an End-User-License-Agreement (EULA).

    So what does the algorithmic want? It wants an automated solution to knowledge production, or, in the case of fake news and toxic content, an automation that erases non-knowledge. Putting the human back in the loop is typically offered as a solution. Yet human-centred analysis is not necessarily better in this regard. One example from my research is that US Immigration and Customs Enforcement seems to be moving away from Palantir’s “human and machines working together” approach. Their rationale is about speed, of course. But unspoken is a lack of faith in humans to retrieve meaning from big data. In other words, we prefer Bourdieu’s delegation because it silences political noise, offloading the messy responsibility of deciding between the valid immigrant and the invalid, or, in your case, the legitimate comment and the toxic.


  2. Hi Dionysia and Panagiotis,

    The ethics and implications of online regulation is such an important issue, and your paper really brings out some vital issues. There were a few things I didn’t fully understand though, and also a few things I would question (obviously in the spirit of trying to be constructive)…

    You ask why companies such as Google and Facebook are investing ‘in politics and regulation of public hysteria when they successfully capitalise on this hysteria?’, and suggest that they ‘are taking over the role of the moderator willingly’. I would question whether the process of regulation was ever really a willing investment for those companies. Surely much of what they do today to moderate content is a result of pressure put on them by governments (e.g. initiatives such as Google’s Redirect Method), by law enforcement (I’m thinking of Facebook after the Lee Rigby murder) and by advertisers concerned about the placement of their brands (as in the recent backlash against YouTube)? It makes it less of a paradox that they make an effort ‘to hold back hate speech and fake news’ while at the same time potentially reducing the income stream from those kinds of media.

    I was also not entirely convinced that the Bourdieu refs are useful in this context – especially the binary of educated/uneducated classes. I’m not sure that using ‘well educated class’ in reference to those who might dominate/govern through censorship online is useful or indeed accurate, for example who are the people (rather than algorithms) actually doing the work of going through pornography or extremist content on behalf of Google/Alphabet and Facebook etc to train the algorithms? But maybe I have missed your point here and can understand better in Berlin…

    Also, Bourdieu on ‘rights to speak’ is interesting as a precursor to what we call speech in a digital age, but it seems it needs to be brought up to date a bit, or challenged maybe, to be useful. For the same reasons, is the Rittel and Webber frame still relevant today, or would it be more useful to rework it?

    I’m sorry these comments seem a bit negative (it’s because I’m so interested in the subject!). Re Bourdieu etc., I imagine it’s a stylistic or disciplinary difference, but I do think the paradox you set up is more complicated than you have perhaps had time to discuss in this post.

    Speak soon
    Pip

