(Press-News.org) In the age of social media, people's inner lives are increasingly recorded through the language they use online. With this in mind, an interdisciplinary group of University of Pennsylvania researchers is interested in whether a computational analysis of this language can provide as much insight into people's personalities as, or more insight than, the traditional methods used by psychologists, such as self-reported surveys and questionnaires.
In a recent study, published in the journal PLOS ONE, 75,000 people voluntarily completed a common personality questionnaire through a Facebook application and made their Facebook status updates available for research purposes. The researchers then looked for overall linguistic patterns in the volunteers' language.
Their analysis allowed them to generate computer models that were able to predict the individuals' age, gender and responses on the personality questionnaires they took. These prediction models were surprisingly accurate. For example, the researchers were correct 92 percent of the time when predicting users' gender based only on the language of their status updates.
The success of this "open" approach suggests new ways of researching connections between personality traits and behaviors and measuring the effectiveness of psychological interventions.
The study is part of the World Well-Being Project, an interdisciplinary effort with members of the Computer and Information Science Department in Penn's School of Engineering and Applied Science and the Department of Psychology and its Positive Psychology Center in the School of Arts and Sciences.
It was led by H. Andrew Schwartz, a postdoctoral fellow in computer and information science and the Positive Psychology Center, and included graduate student Johannes Eichstaedt, postdoctoral fellow Margaret Kern and director Martin Seligman, all of the Positive Psychology Center, as well as professor Lyle Ungar of Computer and Information Science.
The Penn team collaborated with Michal Kosinski and David Stillwell of The Psychometrics Centre at the University of Cambridge, who originally collected the data from Facebook users.
The researchers' study draws on a long history of studying the words people use as a way of understanding their feelings and mental states, but takes an "open" rather than "closed" approach to analyzing the data at its core.
"In a 'closed vocabulary' approach," Kern said, "psychologists might pick a list of words they think signal positive emotion, like 'contented,' 'enthusiastic' or 'wonderful' and then look at the frequency of a person's use of these words as a way to measure how happy that person is. However, closed vocabulary approaches have several limitations, including that they do not always measure what they intend to measure."
"For example," Ungar said, "one might find the energy sector uses more negative emotion words, simply because they use the word 'crude' more. But this points to the need to use multi-word expressions to understand the intended meaning. 'Crude oil' is different than 'crude,' and, likewise, being 'sick of' is different from merely being 'sick.'"
Another inherent limitation of the closed vocabulary approach is that it relies upon a preconceived, fixed set of words. Such a study might be able to confirm that depressed people do indeed use expected words (like "sad") more frequently but cannot generate new insights (for example, that they talk less about sports or social activities than happy people do).
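As a rough illustration of the closed-vocabulary idea, the sketch below scores a piece of text by the share of its words that appear in a fixed, pre-chosen list. The three-word lexicon comes from the example words quoted above; the function names and the tokenization rule are assumptions made for illustration, not the tools any of the researchers used.

```python
import re

# Illustrative lexicon only: the three example words quoted in the text.
# Real closed-vocabulary studies rely on much larger curated lists.
POSITIVE_EMOTION = {"contented", "enthusiastic", "wonderful"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def closed_vocab_score(text, lexicon=POSITIVE_EMOTION):
    """Return the fraction of tokens that belong to the fixed lexicon."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in lexicon)
    return hits / len(tokens)


print(closed_vocab_score("What a wonderful day"))
```

Because the lexicon is fixed in advance, a score like this can only confirm or refute what its authors already expected to find; words outside the list are invisible to it.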
Past psychological language studies have necessarily relied on closed vocabulary approaches as their small sample sizes made open approaches impractical. The emergence of massive language datasets afforded by social media now allows for qualitatively different analyses.
"Most words occur rarely — any sample of writing, including Facebook status updates, only contains a small portion of the average vocabulary," Schwartz said. "This means that, for all but the most common words, you need writing samples from many people in order to make connections with psychological traits. Traditional studies have found interesting connections with pre-chosen categories of words such as 'positive emotion' or 'function words.' However, the billions of word instances available in social media allow us to find patterns at a much richer level."
The open-vocabulary approach, by contrast, derives important words and phrases from the sample itself. With more than 700 million words, phrases and topics drilled out of this study's sample of Facebook status messages, there was enough data to dig past the hundreds of common words and phrases and to find open-ended language that more meaningfully correlates with specific characteristics.
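To make "deriving words and phrases from the sample itself" concrete, here is a minimal sketch that counts every single word and every two-word phrase in a text, so that "crude oil" and "sick of" are tallied separately from "crude" and "sick." The function name, the tokenizer and the two-word limit are assumptions for illustration; the release does not describe how the team actually extracted phrases or topics.

```python
import re
from collections import Counter


def ngram_counts(text, max_n=2):
    """Count all word sequences of length 1..max_n found in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the text.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


counts = ngram_counts("Crude oil prices rose; I am sick of crude jokes")
print(counts["crude"], counts["crude oil"], counts["sick of"])
```

Nothing here is chosen ahead of time: the vocabulary is whatever the sample contains, which is why the approach needs very large samples before rare words and phrases show up often enough to analyze.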
This large data size was critical to the specific technique the team used, known as differential language analysis, or DLA. The researchers used DLA to isolate the words and phrases that clustered around the various characteristics self-reported in the volunteers' questionnaires: age, gender and scores for the "Big Five" personality traits, which are extraversion, agreeableness, conscientiousness, neuroticism and openness. The Big Five model was chosen as it is a common and well-studied way of quantifying personality traits, but the researchers' method could be applied to models that measure other characteristics, including depression or happiness.
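The release does not spell out the statistics behind DLA, so the following is only a simplified sketch of the general idea: for each word, compare how often each volunteer uses it with that volunteer's trait score, and keep the words whose usage tracks the trait. It assumes a plain Pearson correlation as a stand-in for whatever procedure the team used, and the function names and toy data are hypothetical.

```python
import math
import re
from collections import Counter


def relative_frequencies(text):
    """Map each word to its share of all words in one person's text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    return {word: n / total for word, n in Counter(tokens).items()}


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def word_trait_correlations(texts, trait_scores):
    """Correlate each word's per-person frequency with a trait score."""
    freqs = [relative_frequencies(t) for t in texts]
    vocab = set().union(*freqs)
    return {
        word: pearson([f.get(word, 0.0) for f in freqs], trait_scores)
        for word in vocab
    }


# Toy data: four made-up status histories and made-up extraversion scores.
texts = ["party party tonight", "party at home",
         "reading at home", "reading reading quietly"]
scores = [4.0, 3.0, 2.0, 1.0]
r = word_trait_correlations(texts, scores)
print(sorted(r, key=r.get, reverse=True)[:3])
```

With only four people the numbers are meaningless; the point is the shape of the computation, which is repeated for every word and phrase in the sample and for every trait.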
To visualize their results, the researchers created word clouds that summarized the language that statistically predicted a given trait, with the correlation strength of a word in a given cluster being represented by its size. For example, a word cloud that shows language used by extraverts prominently features words and phrases like "party," "great night" and "hit me up," while a word cloud for introverts features many references to Japanese media and emoticons.
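One simple way to turn correlation strength into word size is a linear scale between a smallest and a largest font size. The sketch below assumes that scheme; the size range, the function name and the example correlation values are all invented for illustration, and the researchers' actual rendering rules are not described in the release.

```python
def font_sizes(correlations, min_size=10, max_size=48):
    """Scale each word's font size linearly with its correlation strength."""
    if not correlations:
        return {}
    strengths = {word: abs(r) for word, r in correlations.items()}
    lo, hi = min(strengths.values()), max(strengths.values())
    span = hi - lo
    sizes = {}
    for word, strength in strengths.items():
        # If every word is equally strong, draw them all at full size.
        scale = (strength - lo) / span if span else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


# Made-up correlation values for three phrases from the extravert cloud.
print(font_sizes({"party": 0.3, "great night": 0.2, "hit me up": 0.1}))
```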
"It may seem obvious that a super extraverted person would talk a lot about parties," Eichstaedt said, "but taken all together, these word clouds provide an unprecedented window into the psychological world of people with a given trait. Many things seem obvious after the fact and each item makes sense, but would you have thought of them all, or even most of them?"
"When I ask myself," Seligman said, "'What's it like to be an extrovert?' 'What's it like to be a teenage girl?' 'What's it like to be schizophrenic or neurotic?' or 'What's it like to be 70 years old?' these word clouds come much closer to the heart of the matter than do all the questionnaires in existence."
To test how accurately they were capturing people's traits through their open-vocabulary approach, the researchers split the volunteers into two groups and saw if a statistical model gleaned from one group could be used to infer the traits of the other. For three-quarters of the volunteers, the researchers used machine-learning techniques to build a model of the words and phrases that predict questionnaire responses. They then used this model to predict the age, gender and personalities for the remaining quarter based on their Facebook posts.
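The hold-out procedure described above can be sketched in a few lines: shuffle the volunteers, keep three-quarters for building the model, and score predictions on the remaining quarter. The helper names, the fixed random seed and the three-year tolerance parameter are assumptions for illustration; the model itself is left out, since the release does not say which machine-learning techniques were used.

```python
import random


def split_volunteers(volunteers, train_fraction=0.75, seed=0):
    """Shuffle the volunteers and split them into training and test groups."""
    shuffled = list(volunteers)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, actual):
    """Fraction of predictions that exactly match the true labels."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)


def fraction_within(predicted, actual, tolerance=3):
    """Fraction of numeric predictions within `tolerance` of the truth."""
    close = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return close / len(actual)


train, test = split_volunteers(range(100))
print(len(train), len(test))
```

Scoring only on people the model never saw is what makes the reported figures a test of prediction rather than of memorization: `accuracy` corresponds to a measure like the gender result, and `fraction_within` to a measure like "age within three years."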
"The model was 92 percent accurate in predicting a volunteer's gender from their language usage," Schwartz said, "and we could predict a person's age within three years more than half the time. Our personality predictions are inherently less accurate but are nearly as good as using a person's questionnaire results from one day to predict their answers to the same questionnaire on another day."
With the open-vocabulary approach shown to be as predictive as, or more predictive than, closed approaches, the researchers used the word clouds to generate new insights into relationships between words and traits. For example, participants who scored low on the neuroticism scale (i.e., those with the most emotional stability) used a greater number of words that referred to active, social pursuits, such as "snowboarding," "meeting" or "basketball."
"This doesn't guarantee that doing sports will make you less neurotic; it could be that neuroticism causes people to avoid sports," Ungar said. "But it does suggest that we should explore the possibility that neurotic individuals would become more emotionally stable if they played more sports."
By building a predictive model of personality based on the language of social media, researchers can now more easily approach such questions. Instead of asking millions of people to fill out surveys, future studies may be conducted by having volunteers submit their Facebook or Twitter feeds for anonymized study.
"Researchers have studied these personality traits for many decades theoretically," Eichstaedt said, "but now they have a simple window into how they shape modern lives in the age of Facebook."
Support for this research was provided by the Robert Wood Johnson Foundation's Pioneer Portfolio.
Research programmer Lukasz Dziurzynski and research assistant Stephanie M. Ramones, both of Psychology, and graduate students Megha Agrawal and Achal Shah, both of Computer and Information Science, also contributed to this study.
University of Pennsylvania
Penn researchers use Facebook data to predict users' age, gender and personality traits