Fake news identification: a comparison of parts-of-speech and N-grams with neural networks
28 May 2019
Abstract
The rise of the internet has enabled fake news to reach larger audiences more quickly. As more people turn to social media for news, the accuracy of information on these platforms is especially important. To help classify the accuracy of news articles at scale, machine learning models have been developed and trained to recognize fake articles. Previous linguistic work suggests that part-of-speech and N-gram frequencies often differ between fake and real articles. To compare how these frequencies relate to the accuracy of an article, a dataset of 260 news articles, 130 fake and 130 real, was collected for training neural network classifiers. The first model relies solely on part-of-speech frequencies within the body of the text and consistently achieved 82% accuracy. As the proportion of the dataset used for training grew smaller, accuracy decreased, as expected. The true negative rate, however, remained high. Thus, some aspect of the fake articles was readily identifiable, even when the classifier was trained on a limited number of examples. The second model relies on the frequencies of the most commonly occurring N-grams. The neural nets were trained on N-grams of different lengths. Interestingly, the accuracy was near 61% for each N-gram size. This suggests some of the same information may be ascertainable across N-grams of different sizes.
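As an illustration of the quantities named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes word-level N-grams split on whitespace (the abstract does not say whether N-grams are word-based or character-based) and treats "fake" as the negative class, which the abstract's wording suggests but does not state outright. The function names `ngram_frequencies` and `accuracy_and_tnr` and the `top_k` parameter are hypothetical.

```python
from collections import Counter


def ngram_frequencies(text, n, top_k=5):
    """Relative frequencies of the top_k most common word N-grams.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return {}
    counts = Counter(ngrams)
    total = len(ngrams)
    return {gram: count / total for gram, count in counts.most_common(top_k)}


def accuracy_and_tnr(y_true, y_pred, negative_label="fake"):
    """Accuracy and true negative rate, TN / (TN + FP).

    Assumption: negative_label marks the class counted as "negative".
    """
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for truth, pred in pairs if truth == pred)
    # Predictions made for articles whose true label is the negative class.
    negatives = [pred for truth, pred in pairs if truth == negative_label]
    true_negatives = sum(1 for pred in negatives if pred == negative_label)
    accuracy = correct / len(pairs) if pairs else 0.0
    tnr = true_negatives / len(negatives) if negatives else 0.0
    return accuracy, tnr
```

In a pipeline of the kind described, vectors like those returned by `ngram_frequencies` (or analogous part-of-speech frequencies) would be the classifier's input, and `accuracy_and_tnr` would be applied to its predictions on held-out articles.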
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Brandon Stoick, Nicholas Snell, and Jeremy Straub "Fake news identification: a comparison of parts-of-speech and N-grams with neural networks", Proc. SPIE 10989, Big Data: Learning, Analytics, and Applications, 109890D (28 May 2019); https://doi.org/10.1117/12.2521250
CITATIONS
Cited by 3 scholarly publications.
KEYWORDS
Neural networks
Web 2.0 technologies
Feature extraction
Binary data
Machine learning
Performance modeling
Data modeling
