A group of Brazilian researchers has created a web platform that automatically identifies false information online. Developed by academics at the Center for Mathematical Sciences Applied to Industry (CeMEAI), the system combines statistical models and machine learning techniques to estimate whether a piece of content written in Brazilian Portuguese is likely to be false. Initial tests suggest the platform detects fake news with 96% accuracy.

CeMEAI is a research center based in the mathematics and computer science department of the University of Sao Paulo, in the city of Sao Carlos, Sao Paulo state. The center is supported by grants from the Sao Paulo Research Foundation (FAPESP). In an interview with FAPESP's news agency, project coordinator and technology transfer director Francisco Louzada Neto said the goal of the project is "to offer society an additional tool to identify, not only subjectively, whether a news item is false or not."
The system uses statistical methods to analyze writing characteristics, such as the words used and the most frequent grammatical classes. These features are then fed into a machine learning classifier, which learns the patterns of language, vocabulary and semantics that distinguish fake news from real news and automatically infers whether content submitted to the platform is false. The models were trained on a large database of real and false news, covering the vocabulary of more than 100,000 articles published over the last five years. The researchers aim to use false news related to the upcoming presidential elections, as well as content related to the Covid-19 pandemic, to further calibrate the models.

In the interview, the researchers also discussed the system's risks, including the possibility that fake news creators could use it to check whether false content would pass for real before publishing it. "That's a risk we're going to have to deal with," Louzada noted.
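For readers curious about the general approach, the idea of turning word usage into features and feeding them to a classifier can be illustrated with a minimal Python sketch. It is not the CeMEAI implementation: the report does not say which features or which classifier the team used, so a small multinomial Naive Bayes model over word counts is swapped in here purely for illustration. The class name `NaiveBayesTextClassifier`, the helper `tokenize`, and the four English training headlines are all invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class NaiveBayesTextClassifier:
    """Hypothetical stand-in classifier: multinomial Naive Bayes over
    word counts with add-one smoothing (not the researchers' model)."""

    def fit(self, texts, labels):
        self.word_counts = {}               # label -> Counter of word frequencies
        self.doc_counts = Counter(labels)   # label -> number of training texts
        self.vocab = set()
        for text, label in zip(texts, labels):
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def log_scores(self, text):
        """Return the unnormalized log-probability of each label."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[tok] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy data, for illustration only.
TRAIN = [
    ("shocking secret cure they do not want you to know share now", "fake"),
    ("urgent share before it is deleted miracle cure revealed", "fake"),
    ("the ministry of health published vaccination figures on tuesday", "real"),
    ("the electoral court announced the official schedule for the vote", "real"),
]

texts, labels = zip(*TRAIN)
clf = NaiveBayesTextClassifier().fit(texts, labels)
prediction = clf.predict("miracle cure revealed share now")
print(prediction)
```

With only four training texts the result says nothing about real-world accuracy. A production system of the kind described would need a large labeled corpus in Brazilian Portuguese and richer features, such as the grammatical classes mentioned by the researchers.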