Fake news identification: a comparison of parts-of-speech and N-grams with neural networks

Brandon Stoick; Nicholas Snell; Jeremy Straub

doi:10.1117/12.2521250

28 May 2019

Proceedings Volume 10989, Big Data: Learning, Analytics, and Applications; 109890D (2019) https://doi.org/10.1117/12.2521250
Event: SPIE Defense + Commercial Sensing, 2019, Baltimore, MD, United States

Abstract

The rise of the internet has enabled fake news to reach larger audiences more quickly. As more people turn to social media for news, the accuracy of information on these platforms is especially important. To help classify the accuracy of news articles at scale, machine learning models have been developed and trained to recognize fake articles. Previous linguistic work suggests that part-of-speech and N-gram frequencies often differ between fake and real articles. To compare how these frequencies relate to the accuracy of an article, a dataset of 260 news articles, 130 fake and 130 real, was collected for training neural network classifiers. The first model relies solely on part-of-speech frequencies within the body of the text and consistently achieved 82% accuracy. As the proportion of the dataset used for training grew smaller, accuracy decreased, as expected. The true negative rate, however, remained high. Thus, some aspect of the fake articles was readily identifiable, even when the classifier was trained on a limited number of examples. The second model relies on the frequencies of the most commonly occurring N-grams. The neural networks were trained on N-grams of different lengths. Interestingly, the accuracy was near 61% for each N-gram size. This suggests that some of the same information may be ascertainable across N-grams of different sizes.
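To make the two quantities named in the abstract concrete, the following is a minimal Python sketch of N-gram frequency features and of the true negative rate. It is not the authors' implementation: the abstract does not say whether N-grams are built over words or characters, how text is tokenized, or how many N-grams are kept, so word-level N-grams, whitespace tokenization, and the parameter `k` are assumptions made here for illustration. All function names are hypothetical, and treating "fake" as the negative class is an assumption based only on the abstract's wording.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return word-level N-grams (assumption: lowercase, whitespace tokens)."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(corpus, n, k):
    """Return the k most common N-grams across all documents in the corpus."""
    counts = Counter()
    for doc in corpus:
        counts.update(word_ngrams(doc, n))
    return [gram for gram, _ in counts.most_common(k)]


def ngram_frequencies(text, vocab, n):
    """Relative frequency of each vocabulary N-gram within one document."""
    grams = word_ngrams(text, n)
    total = len(grams)
    counts = Counter(grams)
    return [counts[g] / total if total else 0.0 for g in vocab]


def true_negative_rate(y_true, y_pred, negative="fake"):
    """Fraction of negative-class items that were predicted as negative.

    Assumption: "fake" is the negative class.
    """
    predictions_for_negatives = [p for t, p in zip(y_true, y_pred) if t == negative]
    if not predictions_for_negatives:
        return 0.0
    correct = sum(1 for p in predictions_for_negatives if p == negative)
    return correct / len(predictions_for_negatives)
```

A feature vector produced by `ngram_frequencies` could serve as the input to a classifier; part-of-speech frequencies would follow the same counting pattern once each token is replaced by its tag from a tagger of the reader's choice.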

Citation

Brandon Stoick, Nicholas Snell, and Jeremy Straub "Fake news identification: a comparison of parts-of-speech and N-grams with neural networks", Proc. SPIE 10989, Big Data: Learning, Analytics, and Applications, 109890D (28 May 2019); https://doi.org/10.1117/12.2521250
