Essays in English yield information about other languages

Grammatical habits in written English reveal linguistic features of non-native speakers' languages

2014-07-22
(Press-News.org) Computer scientists at MIT and Israel's Technion have discovered an unexpected source of information about the world's languages: the habits of native speakers of those languages when writing in English.

The work could enable computers chewing through relatively accessible documents to approximate data that might take trained linguists months in the field to collect. That data could, in turn, lead to better computational tools.

"These [linguistic] features that our system is learning are of course, on one hand, of nice theoretical interest for linguists," says Boris Katz, a principal research scientist at MIT's Computer Science and Artificial Intelligence Laboratory and one of the leaders of the new work. "But on the other, they're beginning to be used more and more often in applications. Everybody's very interested in building computational tools for world languages, but in order to build them, you need these features. So we may be able to do much more than just learn linguistic features. … These features could be extremely valuable for creating better parsers, better speech-recognizers, better natural-language translators, and so forth."

In fact, Katz explains, the researchers' theoretical discovery resulted from their work on a practical application: About a year ago, Katz proposed to one of his students, Yevgeni Berzak, that he try to write an algorithm that could automatically determine the native language of someone writing in English. The hope was to develop grammar-correcting software that could be tailored to a user's specific linguistic background.

Family resemblance

With help from Katz and from Roi Reichart, an engineering professor at the Technion who was a postdoc at MIT, Berzak built a system that combed through more than 1,000 English-language essays written by native speakers of 14 different languages. First, it analyzed the parts of speech of the words in every sentence of every essay and the relationships between them. Then it looked for patterns in those relationships that correlated with the writers' native languages.
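As a rough illustration of the first step, the sketch below turns an essay that has already been part-of-speech tagged into relative frequencies of tag sequences. It is a simplified stand-in, not the researchers' feature set: the function name `pos_ngram_features`, the tag labels, and the use of plain tag bigrams (rather than the relationships between words that the system also analyzed) are all assumptions made for this example.

```python
from collections import Counter


def pos_ngram_features(tagged_sentences, n=2):
    """Return relative frequencies of part-of-speech n-grams in one essay.

    tagged_sentences is a list of sentences, each a list of (word, tag)
    pairs. The tagging itself is assumed to have been done elsewhere.
    """
    counts = Counter()
    for sentence in tagged_sentences:
        tags = [tag for _word, tag in sentence]
        # Slide a window of length n over the tag sequence.
        for i in range(len(tags) - n + 1):
            counts[tuple(tags[i:i + n])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ngram: count / total for ngram, count in counts.items()}


# Hypothetical two-sentence essay with made-up tags.
essay = [
    [("She", "PRON"), ("reads", "VERB"), ("books", "NOUN")],
    [("He", "PRON"), ("reads", "VERB")],
]
features = pos_ngram_features(essay)
```

Feature vectors of this kind, one per essay, are what a classifier could then correlate with the writers' native languages.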

Like most machine-learning classification algorithms, Berzak's assigned probabilities to its inferences. It might conclude, for instance, that a particular essay had a 51 percent chance of having been written by a native Russian speaker, a 33 percent chance of having been written by a native Polish speaker, and only a 16 percent chance of having been written by a native Japanese speaker.

In analyzing the results of their experiments, Berzak, Katz, and Reichart noticed a remarkable thing: The algorithm's probability estimates provided a quantitative measure of how closely related any two languages were; Russian speakers' syntactic patterns, for instance, were more similar to those of Polish speakers than to those of Japanese speakers.
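One way such a measure could be derived is sketched below. The release does not give the researchers' formula, so this is an assumption: for each pair of languages, average the probability that the classifier assigns to one language on essays written by speakers of the other, then symmetrize. Only the first row of the sample data reuses the percentages quoted above; the other two rows are invented for illustration, and `language_similarity` is a hypothetical name.

```python
from collections import defaultdict


def language_similarity(essays):
    """Estimate pairwise language similarity from classifier probabilities.

    essays is a list of (native_language, {language: probability}) pairs.
    Returns {(lang_a, lang_b): score} with lang_a < lang_b alphabetically.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for native, probs in essays:
        counts[native] += 1
        for lang, p in probs.items():
            totals[native][lang] += p

    # Mean probability given to each language on essays by speakers of `a`.
    mean = {a: {b: totals[a][b] / counts[a] for b in totals[a]} for a in totals}

    similarity = {}
    langs = sorted(mean)
    for a in langs:
        for b in langs:
            if a < b:
                # Symmetrize: average the two directions of confusion.
                similarity[(a, b)] = 0.5 * (mean[a].get(b, 0.0) + mean[b].get(a, 0.0))
    return similarity


essays = [
    # Probabilities from the example in the text.
    ("Russian", {"Russian": 0.51, "Polish": 0.33, "Japanese": 0.16}),
    # Invented rows, for illustration only.
    ("Polish", {"Russian": 0.30, "Polish": 0.60, "Japanese": 0.10}),
    ("Japanese", {"Russian": 0.10, "Polish": 0.10, "Japanese": 0.80}),
]
sim = language_similarity(essays)
```

With this toy input, the Polish-Russian score comes out higher than the Japanese-Russian score, matching the pattern described above.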

When they used that measure to create a family tree of the 14 languages in their data set, it was almost identical to a family tree generated from data amassed by linguists. The nine languages that are in the Indo-European family, for instance, were clearly distinct from the five that aren't, and within that family, the Romance languages were more similar to one another, as were the Slavic languages, than either group was to the other Indo-European languages.
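A tree can be built from pairwise similarity scores by repeatedly merging the two most similar groups. The sketch below uses greedy average-linkage clustering, which is an assumption: the release does not say which tree-building method the researchers used. The scores in `toy_similarity` are invented for illustration, and `build_tree` is a hypothetical name.

```python
def build_tree(languages, similarity):
    """Greedy average-linkage clustering; returns the tree as nested tuples."""

    def sim(a, b):
        # Scores are stored once per unordered pair, so try both key orders.
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    # Each cluster is (subtree, list of member languages).
    clusters = [(lang, [lang]) for lang in languages]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                members_i, members_j = clusters[i][1], clusters[j][1]
                # Average similarity over all cross-cluster language pairs.
                score = sum(sim(a, b) for a in members_i for b in members_j)
                score /= len(members_i) * len(members_j)
                if best is None or score > best[0]:
                    best = (score, i, j)
        _, i, j = best
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Invented similarity scores for three languages.
toy_similarity = {
    ("Polish", "Russian"): 0.315,
    ("Japanese", "Russian"): 0.13,
    ("Japanese", "Polish"): 0.10,
}
tree = build_tree(["Russian", "Polish", "Japanese"], toy_similarity)
```

Here Russian and Polish are merged first, and Japanese joins only at the root, so the two Slavic languages end up on their own branch.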

What's your type?

"The striking thing about this tree is that our system inferred it without having seen a single word in any of these languages," Berzak says. "We essentially get the similarity structure for free. Now we can take it one step further and use this tree to predict typological features of a language for which we have no linguistic knowledge."

By "typological features," Berzak means the types of syntactic patterns that linguists use to characterize languages — things like the typical order of subject, object, and verb; how negations are formed; or whether nouns take articles. A widely used online linguistic database called the World Atlas of Language Structures (WALS) identifies nearly 200 such features and includes data on more than 2,000 languages.

But, Berzak says, for some of those languages, WALS includes only a handful of typological features; the others just haven't been determined yet. Even widely studied European languages may have dozens of missing entries in the WALS database. At the time of his study, Berzak points out, only 14 percent of the entries in WALS had been filled in.

The new system could help fill in the gaps. In work presented last month at the Conference on Computational Natural Language Learning, Berzak, Katz, and Reichart ran a series of experiments that examined each of the 14 languages of the essays they'd analyzed, trying to predict its typological features from those of the other 13 languages, based solely on the similarity scores produced by the system. On average, those predictions were about 72 percent accurate.
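The leave-one-out experiment can be sketched as follows. The release says only that predictions were based on the similarity scores, so copying each feature from the single most similar language is an assumption, as are the function names. The similarity scores and the one-feature table are toy values, not taken from WALS or from the study.

```python
def predict_features(target, similarity, known_features):
    """Copy the features of the language most similar to `target`."""

    def sim(a, b):
        # Scores are stored once per unordered pair, so try both key orders.
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    others = [lang for lang in known_features if lang != target]
    nearest = max(others, key=lambda lang: sim(target, lang))
    return dict(known_features[nearest])


def leave_one_out_accuracy(similarity, known_features):
    """Hold out each language in turn and score the predicted features."""
    correct = 0
    total = 0
    for target, truth in known_features.items():
        predicted = predict_features(target, similarity, known_features)
        for feature, value in truth.items():
            total += 1
            if predicted.get(feature) == value:
                correct += 1
    return correct / total


# Toy inputs for illustration only.
toy_similarity = {
    ("Polish", "Russian"): 0.315,
    ("Japanese", "Russian"): 0.13,
    ("Japanese", "Polish"): 0.10,
}
toy_features = {
    "Russian": {"word_order": "SVO"},
    "Polish": {"word_order": "SVO"},
    "Japanese": {"word_order": "SOV"},
}
accuracy = leave_one_out_accuracy(toy_similarity, toy_features)
```

In this toy setup the predictions for Russian and Polish are correct and the one for Japanese is not, because its nearest neighbor has a different word order. That is the kind of error that becomes more likely when a language has no close relative in the data set.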

Branching out

The 14 languages of the researchers' initial experiments were the ones for which an adequate number of essays — an average of 88 each — were publicly available. But Katz is confident that given enough training data, the system would perform just as well on other languages. Berzak points out that the African language Tswana, which has only five entries in WALS, nonetheless has 6 million speakers worldwide. It shouldn't be too difficult, Berzak argues, to track down more English-language essays by native Tswana speakers.

INFORMATION: Written by Larry Hardesty, MIT News Office

