Researchers at the University of São Paulo (USP) are developing a project that uses Artificial Intelligence (AI) and Twitter data to try to predict depression and anxiety. With prediction models, the scientists hope to be able, in the future, to anticipate these disorders before diagnosis.
Research steps
The first step of the project was the construction of the SetembroBR database, named in honor of the Yellow September suicide prevention movement and of the month in which the collection began.
This stage is described in an article published in the scientific journal Language Resources and Evaluation.
The database includes information related to texts published by users on Twitter (in Portuguese) and the network of connections of the 3,900 analyzed profiles that, after the survey, reported some mental disorder.
The collection covers individual users’ public tweets, excluding retweets, for a total of around 47 million text posts.
In the second part, drawing on the collected data, the scientists found that it is possible to estimate whether a person is at greater risk of developing depression based only on their social network of friends or followers, without considering the individual’s own posts.
This is because, according to Professor Ivandre Paraboni of the School of Arts, Sciences and Humanities (EACH-USP), people with mental disorders tend to share interests, such as following celebrities who have revealed they have the same condition or discussion forums on the subject.
The second stage is still under development, and these results are preliminary.
Initially, we collected the timelines by hand, analyzing texts from around 19,000 Twitter users, which is almost the population of a small town. We then used two sets: a portion of users actually diagnosed with mental disorders and a random portion, which served as a control. We wanted to differentiate between people with depression and the general population.
AI to predict depression
Research has already shown that mental disorders are often reflected in the language of individuals. Most studies involving Natural Language Processing (NLP), however, have been restricted to the English language and do not always reflect the Brazilian profile.
To carry out the research, the USP scientists put the collected texts through a preprocessing and data-cleaning procedure that removed hashtags, hyperlinks, emojis and non-standard characters but kept the original writing.
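As an illustration only, a cleaning step of this kind could be sketched in Python as below. The regular expressions, the emoji ranges and the name `clean_tweet` are assumptions made for this example; the article does not describe the researchers' actual procedure in that detail.

```python
import re

# All patterns below are illustrative assumptions, not the researchers' rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
# Rough emoji and pictograph ranges; not an exhaustive list.
EMOJI_RE = re.compile("[\U0001F000-\U0001FAFF\u2600-\u27BF\uFE0F\u200D]")
# Anything that is not a letter, digit, underscore, whitespace or common
# punctuation. \w is Unicode-aware, so accented Portuguese letters survive.
NON_STANDARD_RE = re.compile(r"[^\w\s.,;:!?'\"()@/-]")


def clean_tweet(text: str) -> str:
    """Strip links, hashtags, emojis and odd symbols from one tweet.

    Spelling and letter case are left untouched, in line with the article's
    note that the original writing was kept.
    """
    text = URL_RE.sub(" ", text)        # links first, so URL fragments go too
    text = HASHTAG_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    text = NON_STANDARD_RE.sub(" ", text)
    return " ".join(text.split())       # collapse leftover whitespace


print(clean_tweet("Hoje estou cansado \U0001F614 #triste https://t.co/abc não sei..."))
```

Links are removed before hashtags so that a `#` inside a URL does not leave half a link behind.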
Afterwards, the group used deep learning methods to create four word classifiers. The models are based on the BERT algorithm and learn the context and meaning of words by tracking relationships in sequential data, such as the components of a sentence.
For each user, 200 randomly chosen tweets were analyzed and kept anonymous. The BERT-based method performed significantly better than the mechanism used as a second option, logistic regression (LogReg).
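The per-user sampling described above can be sketched as follows. This is a minimal example under stated assumptions: the function name, the fixed seed, the `user_N` labels used for anonymization, and the handling of users with fewer than 200 tweets are all hypothetical, since the article does not give these details.

```python
import random

TWEETS_PER_USER = 200  # figure reported in the article


def sample_user_tweets(tweets_by_user, k=TWEETS_PER_USER, seed=0):
    """Pick up to k random tweets per user and drop the real user ids.

    tweets_by_user maps a user id (assumed to be a string) to a list of
    tweets. Users with fewer than k tweets keep all of them, which is an
    assumption of this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = {}
    for index, user_id in enumerate(sorted(tweets_by_user)):
        tweets = tweets_by_user[user_id]
        # Sampling without replacement; the key is an opaque label.
        sampled[f"user_{index}"] = rng.sample(tweets, min(k, len(tweets)))
    return sampled


timelines = {
    "alice": [f"tweet {i}" for i in range(500)],
    "bob": [f"tweet {i}" for i in range(50)],
}
sample = sample_user_tweets(timelines)
print({label: len(tweets) for label, tweets in sample.items()})
```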
Indications of depression that appear in the doctor’s office are not necessarily the same as those on the social network. For example, we noticed very heavy use on the network of first-person pronouns, such as “I” and “me”, which in psychology is a classic indicator of depression. But we also found, among depressive users, a high incidence of the little heart symbol, the emoji of affection, which perhaps is not yet characterized in psychology.
Ivandre Paraboni
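To make the two indicators in the quote concrete, the sketch below counts first-person pronouns and heart symbols in a set of raw tweets. It is a toy illustration, not the researchers' method: the Portuguese pronoun list, the heart code points and the function name `indicator_rates` are assumptions chosen for this example.

```python
import re

# Assumed lists for illustration; the article gives only "I" and "me".
FIRST_PERSON = {"eu", "me", "mim", "comigo"}
HEARTS = {"\u2764", "\u2665"}  # heavy black heart, black heart suit

TOKEN_RE = re.compile(r"\w+")


def indicator_rates(tweets):
    """Return the share of first-person tokens and hearts per tweet."""
    n_tokens = 0
    n_first_person = 0
    n_hearts = 0
    for tweet in tweets:
        tokens = TOKEN_RE.findall(tweet.lower())
        n_tokens += len(tokens)
        n_first_person += sum(token in FIRST_PERSON for token in tokens)
        n_hearts += sum(char in HEARTS for char in tweet)
    return {
        "first_person_rate": n_first_person / n_tokens if n_tokens else 0.0,
        "hearts_per_tweet": n_hearts / len(tweets) if tweets else 0.0,
    }


print(indicator_rates(["Eu não sei o que fazer", "Ninguém me entende \u2764"]))
```

Counts like these only describe a group of texts; on their own they say nothing about any individual user.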
Future work by scientists
Now, the researchers are working to refine the computational technique and further improve the models. In the future, they hope to have a tool that can be applied in practice. Among other uses, the research could, for example, support a possible screening of people with indications of mental disorders or provide help to parents and relatives of individuals.
More information
According to the World Health Organization, depression and anxiety, among other mental health disorders, are a growing concern worldwide.
The agency estimates that about 3.8% of the planet’s population is affected by depression, according to data from 2021.
During the Covid-19 pandemic, the same period in which the scientists collected the texts on Twitter, there was a 25% increase in the global prevalence of anxiety and depression.
A recent study by the Ministry of Health with 784,000 participants showed that 11.3% of Brazilians have already been diagnosed with depression.
According to a Comscore survey from early March, Brazil ranks third in the world in social network consumption, and Twitter is among the most accessed networks.