Problem of Fake News

Information can be, and is, generated by anyone, and social media has joined news channels as one of the most popular and pervasive mediums for its exchange. Although these mediums have given each of us the power to share our views and opinions, some people have used this power to spread false and malicious information in order to affect other people's decisions, judgement and sentiments. False information is spread intentionally to support propaganda or to achieve financial gain.

In recent years, the problem of fake news creation and propagation has drawn a lot of attention. This literature survey aims at understanding the sources of fake news and the reasons for its quick dissemination. It also reviews existing datasets, methods for creating datasets, automatic detection of fake news through classifiers based on machine learning algorithms such as naive Bayes and decision trees, and the use of deep neural networks to predict possible dissemination paths.

The digital world has completely changed the way we consume news today. Easy access to the internet and the popularity of social media have resulted in fast information diffusion across the world. This has led many people to abandon traditional sources of news such as newspapers, news channels and magazines and to rely on social media as their main source of news. Surveys have found that a large share of the population in the United States depends on social media to get its news [1]. Social media platforms like Facebook, Twitter, Reddit and Tumblr are among the most influential sources of news today.

The possibility of fast dissemination of news to a huge pool of people has made social media platforms the most favorable agents for spreading fake news. Fake news is created and spread to serve many purposes, such as spreading panic among people during emergency situations, spreading hatred against a person, influencing people’s thinking and judgement to align with the author’s beliefs, and creating bias. In [2], Christina et al. discuss how fake news that spread during the Boston bombings affected innocent people who were falsely implicated as suspects, which resulted in loss of privacy and jobs. In [3], Adam et al. discuss fake news consumption and propagation during the 2016 US Presidential election and how it could have affected the outcome of the vote.

Fake news has a profound effect on people’s judgement. It can instigate hatred and create panic in emergency situations. With the growing strength of social media platforms, the need to detect fake news and stop it from spreading is of utmost importance. This problem has driven much research. Most existing efforts toward automatic detection and blocking of fake news rely on feature extraction from textual and image content, followed by Recurrent Neural Networks (RNNs) for learning and by classifiers such as naive Bayes and decision trees to label the content as fake or genuine.

This literature survey focuses on exploring the categorizations of fake news and its sources and answers these questions: how does fake news propagate so quickly? What kind of data is used for research? What are some of the gaps in the existing automated fake news detection and classification models? This survey uses references from published papers, statistics from surveys, conference proceedings, and some real-life events. Fig 1 shows the organization of the literature survey.

Most of the fake news that appears on social media is written to intentionally mislead people. This is done for various purposes, which may include achieving financial gain, creating panic, creating a biased opinion about an event such as an election, or tarnishing a person’s image in society. Hence fake news can be categorized based on the purpose it serves:

Fake news to incite fear and panic among people in crisis situations: In [5], Gupta et al. discuss how social media platforms played a key role in providing resources during events like mass shootings, earthquakes and floods. The downside of social networks during these emergency situations comes from the spread of fake images that can create panic and fear among people who are already affected. Gupta et al. analyze the effects of fake news during one such disaster, Hurricane Sandy in the United States.

Fake news to malign a person’s reputation: A statement from a person of influence can be taken out of context and presented in a way that can destroy the person’s reputation and credibility. This is usually done by hiding most of the meaning and context of the original information. In [4], E. Mustafaraj and P. T. Metaxas discuss one such event in 2010, in which a statement from Martha Coakley, then Attorney General of Massachusetts, was taken out of context and a false claim was spread in her name.

Fake news to spread political propaganda: Platforms like Twitter and Facebook have been instrumental in helping political parties connect with people during election campaigns [6]. Political parties may also misuse social media platforms to spread fake news against each other during elections to gain support for their propaganda. In [3], Adam et al. discuss how Twitter and Facebook generated the major share of the traffic directed towards fake websites that created and spread fake news about the candidates during the 2016 US Presidential election.

Due to the sheer size of social media, it is difficult to pinpoint the exact source of fake news. R. Benes et al. studied some of the top stories on Facebook during the 2016 US Presidential election and found that the number of shares, comments and reactions on fake stories surpassed that of the true stories [7]. A few individuals could not spread such huge volumes of fake news on their own. In [8], Shu et al. found that most fake news propagation happens through social bots, cyborgs and trolls.

As account creation on most social media platforms is free, it is very cheap to create fake accounts. Social bots and cyborgs are forms of fake accounts that can automatically generate and propagate fake news and interact with people on social media. Shu et al. reflect on the role of 19 million social bots and 1000 trolls, the latter being real people paid to create and spread fake news, which were involved in many discussions on social media during the 2016 US Presidential election.

The sources of fake news could be many, but the reasons for and modes of its quick dispersion are another area that needs attention. The authenticity of an article can often be judged from its appearance, title, author information and content. The literature points to some of the reasons why fake content spreads even though it would fail such a quick authenticity check.

Fake content is usually disguised in the form of clickbait, which is seen on platforms like Facebook. Clicking on clickbait can cause the content to be shared. Clickbait can look very innocent, showing thumbnails that a user would most likely click. Both intentional and unintentional clicks result in the hidden fake content being shared even though the user did not intend to share it.

Social homophily refers to people forming friendships with others who are similar to them. If a person reads a fake news column, believes it to be true and shares it, there is a high probability that his or her friends on social media will do the same. In [3], Adam et al. found social homophily to be one of the reasons for people visiting fake websites during the 2016 US Presidential election.

On platforms like Facebook and Twitter, a person’s news feed often tends to fill up with content that is favorable to his or her bias regarding a person or an event such as an election. In [8], Shu et al. observed that personal bias combined with homophily leads to a phenomenon called the echo chamber effect, in which people spread fake news that conforms to their bias. In contrast, in [5], Gupta et al. found that the majority of fake images circulated during Hurricane Sandy were spread through retweets on Twitter. They found that a person’s friendships, or homophily, had a negligible effect on the spread of fake images.

There are some manual methods that people can use to quickly check the authenticity of a piece of news before they share it. In [10], Shao et al. list several websites that constantly work towards debunking fake news. In [9], D. Saez-Trumper proposes the method of reverse image search. According to this method, one can use an image as the query for an image search on Google. If the same image appears with two different captions, the picture may be from an older event and is being circulated now to mislead people.

Facebook has introduced a mechanism that marks a post with a ‘disputed’ tag if some users have reported doubts about the authenticity of the article, thereby warning its users that the content may not be true. Social media platforms can use this feature to mark an article as fake before it starts spreading.

Due to the nature, size and volume of data on social media platforms, automatic detection of fake content is our best chance to combat this problem. Automatic fake news detection has three stages: data collection and dataset building, feature extraction, and classification.

As Facebook and Twitter are among the largest sources of fake news, they can be used to create large datasets. In [5], Gupta et al. give an account of the process they used to extract data on ongoing events through APIs like the Trend API and the Streaming API. In [2], Christina et al. worked on historical data from Twitter extracted through the Topsy API. Certain websites are also a major source of fake news that can be used to build datasets [10]. In [8], Shu et al. list some readily available datasets from Buzzfeed, LIAR, CREDBANK and BS Detector, which contain labelled event-specific data or all the data from platforms like Twitter within a specific time frame. The descriptions and labels provided in some of these datasets come from real people and are therefore regarded as accurate and trustworthy.

Feature extraction is one of the crucial steps in the process of automatic detection of fake news. Christina et al., Gupta et al. and Shu et al. advocate user-based, content-based and source-based feature extraction for the classification of fake news [2][5][8]. User-based features deal with identifying user accounts based on their social affiliations, while content-based features deal with identifying fake content based on the type, length and sentiment of the content. In [11], Castillo et al. propose using content-based features such as question marks and exclamation marks as indicators of fake news.
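As a rough illustration of content-based features, the sketch below counts question marks, exclamation marks, URLs and words in a post. The function name `content_features`, the feature names and the example post are hypothetical and chosen only for illustration; they are not taken from [11] or from any of the surveyed papers.

```python
import re

URL_PATTERN = r"https?://\S+"

def content_features(text: str) -> dict:
    """Compute a few simple content-based features for one post (illustrative only)."""
    # Remove URLs before tokenizing so that URL fragments are not counted as words.
    text_without_urls = re.sub(URL_PATTERN, " ", text)
    words = re.findall(r"[A-Za-z']+", text_without_urls)
    shouted = [w for w in words if len(w) > 1 and w.isupper()]
    return {
        "length_chars": len(text),
        "num_words": len(words),
        "num_question_marks": text.count("?"),
        "num_exclamation_marks": text.count("!"),
        "num_urls": len(re.findall(URL_PATTERN, text)),
        "uppercase_word_ratio": len(shouted) / len(words) if words else 0.0,
    }

# Hypothetical example post.
example = "BREAKING!!! Is the bridge REALLY flooded? See http://example.com/photo"
features = content_features(example)
print(features)
```

A feature dictionary of this kind would be computed for every post and passed on to the classification stage.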

Source-based features are close to the user-based features and give indicators about the origin of the message, the URLs used, and the popularity and credibility of the sources [8][11]. Shu et al., Castillo et al. and Wu et al. speak about the importance of building a tree or a path that describes the route taken by an article during its propagation to other people. This path through the network can reveal differences between the spreading patterns of genuine news and fake news [8][11][12].
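The idea of a propagation tree can be sketched as follows, assuming that each share is recorded as a (parent, child) pair meaning that the child reshared the article from the parent. The function names, the share list and the chosen statistics (depth, maximum breadth, size) are assumptions made for illustration, not the specific method of [8], [11] or [12].

```python
from collections import defaultdict

def build_propagation_tree(shares):
    """Map each user to the list of users who reshared the article from them."""
    children = defaultdict(list)
    for parent, child in shares:
        children[parent].append(child)
    return children

def tree_stats(children, root):
    """Walk the cascade level by level and return its depth, maximum breadth and size."""
    depth, max_breadth, size = 0, 0, 0
    level = [root]
    seen = {root}
    while level:
        size += len(level)
        max_breadth = max(max_breadth, len(level))
        next_level = []
        for user in level:
            for child in children.get(user, []):
                if child not in seen:  # guard against repeated shares
                    seen.add(child)
                    next_level.append(child)
        if next_level:
            depth += 1
        level = next_level
    return {"depth": depth, "max_breadth": max_breadth, "size": size}

# Hypothetical cascade: "origin" posts the article and others reshare it.
shares = [("origin", "a"), ("origin", "b"), ("a", "c"),
          ("a", "d"), ("a", "e"), ("c", "f")]
tree = build_propagation_tree(shares)
stats = tree_stats(tree, "origin")
print(stats)
```

Statistics of this kind could then be used as features when comparing how genuine and fake articles spread.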

In [8], two other types of features called Linguistic based and Visual based are discussed by Shu et al. Linguistic features deal with writing styles, lexical features, quotes used, links to other articles, words used to make an eye-catching headline and graphs. Visual features include clarity of an image, similarity to another picture and image ratio that can be extracted from the images and videos associated with the fake content.

Based on the features extracted, classifiers can automatically detect fake news. In [2], Christina et al. compare the performance of the J48 decision tree, the KStar classifier and the random forest classifier for the categorization of fake tweets during Hurricane Sandy and the Boston Marathon bombing. In [8], Shu et al. propose applying classifiers like naive Bayes, Support Vector Machines (SVM) and regression to the extracted features in order to choose the best classifier.
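To make the classification stage concrete, here is a minimal sketch of a multinomial naive Bayes text classifier with Laplace smoothing, written in plain Python. The class name `NaiveBayesText`, the whitespace tokenization and the four training headlines are invented for illustration; the surveyed papers use their own features, datasets and toolkits.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Multinomial naive Bayes over whitespace-separated words (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in doc.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical toy training data, not drawn from any real dataset.
docs = [
    "shocking secret cure doctors hate",
    "you will not believe this shocking photo",
    "council approves new budget for schools",
    "city council schedules budget vote",
]
labels = ["fake", "fake", "real", "real"]
model = NaiveBayesText().fit(docs, labels)
print(model.predict("shocking photo of secret cure"))
print(model.predict("council budget vote"))
```

In practice the word counts would be replaced or supplemented by the user-, content- and source-based features described earlier.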

Shu et al. predict that using probabilistic methods, aggregation models and ensemble methods can result in better classifier models. In [5], Gupta et al. achieved an accuracy of 97% with decision tree classifiers when separating fake news from real news related to Hurricane Sandy. In [12], Wu et al. found that Long Short-Term Memory networks (LSTMs) and Recurrent Neural Networks (RNNs) perform well in predicting the path of a news article on social media. Apart from path prediction, deep neural networks have performed very well at text sequence prediction and classification.
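One simple ensemble method is majority voting, in which several classifiers each label an item and the most common label wins. The sketch below assumes three hypothetical rule-based classifiers and four made-up feature records; the thresholds and data are illustrative only and do not reproduce any result from the surveyed papers.

```python
from collections import Counter

def majority_vote(classifiers, item):
    """Return the label chosen by most classifiers for a single item."""
    votes = [classify(item) for classify in classifiers]
    return Counter(votes).most_common(1)[0][0]

def accuracy(predictions, truths):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)

# Three hypothetical weak classifiers, each looking at one feature.
def by_punctuation(item):
    return "fake" if item["exclamations"] >= 3 else "real"

def by_source(item):
    return "fake" if item["source_credibility"] < 0.5 else "real"

def by_depth(item):
    return "fake" if item["cascade_depth"] >= 4 else "real"

classifiers = [by_punctuation, by_source, by_depth]

# Made-up feature records with their true labels.
items = [
    {"exclamations": 5, "source_credibility": 0.2, "cascade_depth": 2},
    {"exclamations": 0, "source_credibility": 0.9, "cascade_depth": 5},
    {"exclamations": 4, "source_credibility": 0.8, "cascade_depth": 1},
    {"exclamations": 1, "source_credibility": 0.3, "cascade_depth": 6},
]
truths = ["fake", "real", "real", "fake"]

ensemble_predictions = [majority_vote(classifiers, item) for item in items]
single_predictions = [by_punctuation(item) for item in items]
print(accuracy(ensemble_predictions, truths), accuracy(single_predictions, truths))
```

Using an odd number of voters for a two-class problem avoids ties; with an even number, a tie-breaking rule would be needed.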

Cite this page

Problem of Fake News. (2021, Sep 30). Retrieved from
