PRESS-NEWS.org - Press Release Distribution

Essays in English yield information about other languages

Grammatical habits in written English reveal linguistic features of non-native speakers' languages

2014-07-22
(Press-News.org) Computer scientists at MIT and Israel's Technion have discovered an unexpected source of information about the world's languages: the habits of native speakers of those languages when writing in English.

The work could enable computers chewing through relatively accessible documents to approximate data that might take trained linguists months in the field to collect. That data could in turn lead to better computational tools.

"These [linguistic] features that our system is learning are of course, on one hand, of nice theoretical interest for linguists," says Boris Katz, a principal research scientist at MIT's Computer Science and Artificial Intelligence Laboratory and one of the leaders of the new work. "But on the other, they're beginning to be used more and more often in applications. Everybody's very interested in building computational tools for world languages, but in order to build them, you need these features. So we may be able to do much more than just learn linguistic features. … These features could be extremely valuable for creating better parsers, better speech-recognizers, better natural-language translators, and so forth."

In fact, Katz explains, the researchers' theoretical discovery resulted from their work on a practical application: About a year ago, Katz proposed to one of his students, Yevgeni Berzak, that he try to write an algorithm that could automatically determine the native language of someone writing in English. The hope was to develop grammar-correcting software that could be tailored to a user's specific linguistic background.

Family resemblance

With help from Katz and from Roi Reichart, an engineering professor at the Technion who was a postdoc at MIT, Berzak built a system that combed through more than 1,000 English-language essays written by native speakers of 14 different languages. First, it analyzed the parts of speech of the words in every sentence of every essay and the relationships between them. Then it looked for patterns in those relationships that correlated with the writers' native languages.
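As a rough illustration of the first step, the sketch below counts part-of-speech patterns in an essay. It is not the researchers' code: it assumes the essay has already been tagged, it covers only adjacent tag pairs rather than the richer relationships between words that the system analyzed, and the function name `pos_ngram_counts`, the tag labels, and the sample sentences are all hypothetical.

```python
from collections import Counter

def pos_ngram_counts(tagged_sentences, n=2):
    """Count runs of n consecutive part-of-speech tags across an essay."""
    counts = Counter()
    for sentence in tagged_sentences:
        tags = [tag for _word, tag in sentence]
        for i in range(len(tags) - n + 1):
            counts[tuple(tags[i:i + n])] += 1
    return counts

# Invented, pre-tagged toy essay (word, tag); the first sentence drops an article.
essay = [
    [("I", "PRON"), ("like", "VERB"), ("book", "NOUN")],
    [("She", "PRON"), ("reads", "VERB"), ("the", "DET"), ("book", "NOUN")],
]

features = pos_ngram_counts(essay)
for pattern, count in sorted(features.items()):
    print(pattern, count)
```

Counts like these, gathered per essay, are the kind of input a classifier could correlate with the writer's native language.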

Like most machine-learning classification algorithms, Berzak's assigned probabilities to its inferences. It might conclude, for instance, that a particular essay had a 51 percent chance of having been written by a native Russian speaker, a 33 percent chance of having been written by a native Polish speaker, and only a 16 percent chance of having been written by a native Japanese speaker.

In analyzing the results of their experiments, Berzak, Katz, and Reichart noticed a remarkable thing: The algorithm's probability estimates provided a quantitative measure of how closely related any two languages were; Russian speakers' syntactic patterns, for instance, were more similar to those of Polish speakers than to those of Japanese speakers.
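One way to turn such probability estimates into a similarity score is sketched below. The averaging and symmetrizing steps are assumptions made for illustration, not the published method. Only the first probability row (51, 33, and 16 percent) comes from the example above; the other two rows are invented, and the name `language_similarity` is hypothetical.

```python
from collections import defaultdict

def language_similarity(predictions):
    """predictions: list of (true_language, {language: probability}) pairs.

    Averages the classifier's probabilities over all essays by speakers of
    each language, then symmetrizes: sim(a, b) is the mean of "probability
    mass a's essays give to b" and "probability mass b's essays give to a".
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for true_lang, probs in predictions:
        counts[true_lang] += 1
        for lang, p in probs.items():
            sums[true_lang][lang] += p
    avg = {a: {b: sums[a][b] / counts[a] for b in sums[a]} for a in sums}
    sim = {}
    for a in avg:
        for b in avg:
            if a != b:
                sim[(a, b)] = 0.5 * (avg[a].get(b, 0.0) + avg[b].get(a, 0.0))
    return sim

# First row: the article's example. Remaining rows: invented toy values.
essays = [
    ("Russian", {"Russian": 0.51, "Polish": 0.33, "Japanese": 0.16}),
    ("Polish", {"Russian": 0.30, "Polish": 0.60, "Japanese": 0.10}),
    ("Japanese", {"Russian": 0.10, "Polish": 0.10, "Japanese": 0.80}),
]

sim = language_similarity(essays)
print(sim[("Russian", "Polish")] > sim[("Russian", "Japanese")])
```

With these toy numbers, Russian comes out closer to Polish than to Japanese, matching the pattern the researchers describe.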

When they used that measure to create a family tree of the 14 languages in their data set, it was almost identical to a family tree generated from data amassed by linguists. The nine languages that are in the Indo-European family, for instance, were clearly distinct from the five that aren't, and the Romance languages and the Slavic languages were more similar to each other than they were to the other Indo-European languages.
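The article does not say how the tree was built. A minimal sketch, assuming greedy average-linkage clustering over pairwise similarity scores, might look like this. The function `build_tree` is hypothetical, and the similarity values are invented toy numbers rather than the study's results.

```python
from itertools import combinations

def build_tree(languages, similarity):
    """Repeatedly merge the two most similar clusters; returns nested tuples."""
    # Each cluster is a pair: (subtree, list of member languages).
    clusters = [(lang, [lang]) for lang in languages]

    def sim(a, b):
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    def linkage(c1, c2):
        # Average similarity over all cross-cluster language pairs.
        pairs = [(x, y) for x in c1[1] for y in c2[1]]
        return sum(sim(x, y) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Invented toy similarity scores for four of the languages.
languages = ["Russian", "Polish", "Spanish", "Japanese"]
similarity = {
    ("Russian", "Polish"): 0.32,
    ("Russian", "Spanish"): 0.15,
    ("Polish", "Spanish"): 0.14,
    ("Russian", "Japanese"): 0.05,
    ("Polish", "Japanese"): 0.04,
    ("Spanish", "Japanese"): 0.06,
}

tree = build_tree(languages, similarity)
print(tree)
```

In this toy run the two Slavic languages merge first, Spanish joins them next, and Japanese attaches last, which mirrors the Indo-European versus non-Indo-European split described above.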

What's your type?

"The striking thing about this tree is that our system inferred it without having seen a single word in any of these languages," Berzak says. "We essentially get the similarity structure for free. Now we can take it one step further and use this tree to predict typological features of a language for which we have no linguistic knowledge."

By "typological features," Berzak means the types of syntactic patterns that linguists use to characterize languages — things like the typical order of subject, object, and verb; how negations are formed; or whether nouns take articles. A widely used online linguistic database called the World Atlas of Language Structures (WALS) identifies nearly 200 such features and includes data on more than 2,000 languages.

But, Berzak says, for some of those languages, WALS includes only a handful of typological features; the others just haven't been determined yet. Even widely studied European languages may have dozens of missing entries in the WALS database. At the time of his study, Berzak points out, only 14 percent of the entries in WALS had been filled in.

The new system could help fill in the gaps. In work presented last month at the Conference on Computational Natural Language Learning, Berzak, Katz, and Reichart ran a series of experiments that examined each of the 14 languages of the essays they'd analyzed, trying to predict its typological features from those of the other 13 languages, based solely on the similarity scores produced by the system. On average, those predictions were about 72 percent accurate.
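The leave-one-out setup can be sketched as follows. The similarity-weighted vote is an assumption standing in for whatever prediction rule the researchers used, and the language labels, feature values, and similarity scores are placeholder toy data, not WALS entries. The names `predict_feature` and `leave_one_out_accuracy` are hypothetical.

```python
from collections import defaultdict

def predict_feature(target, feature, similarity, known):
    """Similarity-weighted vote over every language except the target."""
    votes = defaultdict(float)
    for lang, feats in known.items():
        if lang == target or feature not in feats:
            continue
        weight = similarity.get((target, lang), similarity.get((lang, target), 0.0))
        votes[feats[feature]] += weight
    return max(votes, key=votes.get) if votes else None

def leave_one_out_accuracy(similarity, known):
    """Hide each language's features in turn and predict them from the rest."""
    correct = total = 0
    for target, feats in known.items():
        for feature, gold in feats.items():
            guess = predict_feature(target, feature, similarity, known)
            if guess is not None:
                total += 1
                correct += guess == gold
    return correct / total if total else 0.0

# Placeholder languages and invented feature values, for illustration only.
known = {
    "lang_a": {"word_order": "SVO", "articles": "no"},
    "lang_b": {"word_order": "SVO", "articles": "no"},
    "lang_c": {"word_order": "SOV", "articles": "no"},
    "lang_d": {"word_order": "SOV", "articles": "yes"},
}
similarity = {
    ("lang_a", "lang_b"): 0.30,
    ("lang_a", "lang_c"): 0.05,
    ("lang_a", "lang_d"): 0.04,
    ("lang_b", "lang_c"): 0.06,
    ("lang_b", "lang_d"): 0.05,
    ("lang_c", "lang_d"): 0.20,
}

guess = predict_feature("lang_a", "word_order", similarity, known)
accuracy = leave_one_out_accuracy(similarity, known)
print(guess, accuracy)
```

Because the vote uses only similarity scores and the other languages' known features, the same routine could in principle be pointed at a language with few recorded features, provided essays by its native speakers are available.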

Branching out

The 14 languages of the researchers' initial experiments were the ones for which an adequate number of essays — an average of 88 each — were publicly available. But Katz is confident that given enough training data, the system would perform just as well on other languages. Berzak points out that the African language Tswana, which has only five entries in WALS, nonetheless has 6 million speakers worldwide. It shouldn't be too difficult, Berzak argues, to track down more English-language essays by native Tswana speakers.

INFORMATION: Written by Larry Hardesty, MIT News Office

