PRESS-NEWS.org - Press Release Distribution

Essays in English yield information about other languages

Grammatical habits in written English reveal linguistic features of non-native speakers' languages

2014-07-22
(Press-News.org) Computer scientists at MIT and Israel's Technion have discovered an unexpected source of information about the world's languages: the habits of native speakers of those languages when writing in English.

The work could enable computers chewing through relatively accessible documents to approximate data that might take trained linguists months in the field to collect. That data could in turn lead to better computational tools.

"These [linguistic] features that our system is learning are of course, on one hand, of nice theoretical interest for linguists," says Boris Katz, a principal research scientist at MIT's Computer Science and Artificial Intelligence Laboratory and one of the leaders of the new work. "But on the other, they're beginning to be used more and more often in applications. Everybody's very interested in building computational tools for world languages, but in order to build them, you need these features. So we may be able to do much more than just learn linguistic features. … These features could be extremely valuable for creating better parsers, better speech-recognizers, better natural-language translators, and so forth."

In fact, Katz explains, the researchers' theoretical discovery resulted from their work on a practical application: About a year ago, Katz proposed to one of his students, Yevgeni Berzak, that he try to write an algorithm that could automatically determine the native language of someone writing in English. The hope was to develop grammar-correcting software that could be tailored to a user's specific linguistic background.

Family resemblance

With help from Katz and from Roi Reichart, an engineering professor at the Technion who was a postdoc at MIT, Berzak built a system that combed through more than 1,000 English-language essays written by native speakers of 14 different languages. First, it analyzed the parts of speech of the words in every sentence of every essay and the relationships between them. Then it looked for patterns in those relationships that correlated with the writers' native languages.
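
As a rough illustration of the kind of pattern such a system might count, the sketch below tallies adjacent part-of-speech pairs in two pre-tagged sentences. The sentences, the tag names, and the `pos_bigrams` helper are assumptions made for this example; the release does not specify the features Berzak's system actually used.

```python
from collections import Counter

# Hypothetical pre-tagged sentences as (word, part-of-speech) pairs.
# Tags and sentences are invented for illustration, not taken from the study.
with_article = [("She", "PRON"), ("reads", "VERB"), ("the", "DET"), ("book", "NOUN")]
without_article = [("She", "PRON"), ("reads", "VERB"), ("book", "NOUN")]


def pos_bigrams(tagged_sentence):
    """Count adjacent part-of-speech pairs in one tagged sentence."""
    tags = [tag for _, tag in tagged_sentence]
    return Counter(zip(tags, tags[1:]))


counts_with = pos_bigrams(with_article)
counts_without = pos_bigrams(without_article)

# A dropped article shows up as a VERB -> NOUN pair that the first sentence lacks.
print(counts_with[("VERB", "NOUN")], counts_without[("VERB", "NOUN")])
```

A writer whose native language has no articles might produce the second pattern more often, which is the sort of regularity a classifier could pick up.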

Like most machine-learning classification algorithms, Berzak's assigned probabilities to its inferences. It might conclude, for instance, that a particular essay had a 51 percent chance of having been written by a native Russian speaker, a 33 percent chance of having been written by a native Polish speaker, and only a 16 percent chance of having been written by a native Japanese speaker.
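
One common way for a classifier to turn raw per-language scores into probabilities of this kind is a softmax normalization. The sketch below assumes that approach and uses invented scores; it is not a description of the researchers' model.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {lang: math.exp(s - top) for lang, s in scores.items()}
    total = sum(exps.values())
    return {lang: e / total for lang, e in exps.items()}


# Hypothetical raw scores for one essay; the values are invented for illustration.
scores = {"Russian": 1.2, "Polish": 0.8, "Japanese": 0.0}
probs = softmax(scores)
best = max(probs, key=probs.get)

for lang, p in probs.items():
    print(f"{lang}: {p:.2f}")
```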

In analyzing the results of their experiments, Berzak, Katz, and Reichart noticed a remarkable thing: The algorithm's probability estimates provided a quantitative measure of how closely related any two languages were; Russian speakers' syntactic patterns, for instance, were more similar to those of Polish speakers than to those of Japanese speakers.
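
To see how probability estimates can double as a similarity measure, consider the sketch below. It averages the probability that essays by speakers of one language are attributed to another, then symmetrizes the result. The essay data, the `mean_confusion` and `similarity` helpers, and the averaging scheme are all assumptions for illustration; the release does not give the researchers' exact formula.

```python
from collections import defaultdict

# Hypothetical classifier outputs: (true native language, predicted probabilities).
# The numbers are invented for illustration; they are not data from the study.
essays = [
    ("Russian", {"Russian": 0.51, "Polish": 0.33, "Japanese": 0.16}),
    ("Russian", {"Russian": 0.60, "Polish": 0.30, "Japanese": 0.10}),
    ("Polish", {"Russian": 0.35, "Polish": 0.55, "Japanese": 0.10}),
    ("Japanese", {"Russian": 0.10, "Polish": 0.10, "Japanese": 0.80}),
]


def mean_confusion(essays):
    """Average probability assigned to each language, grouped by true language."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for true_lang, probs in essays:
        counts[true_lang] += 1
        for lang, p in probs.items():
            sums[true_lang][lang] += p
    return {t: {l: s / counts[t] for l, s in row.items()} for t, row in sums.items()}


def similarity(confusion, a, b):
    """Symmetric similarity: mean of the two cross-language probabilities."""
    return (confusion[a][b] + confusion[b][a]) / 2


confusion = mean_confusion(essays)
ru_pl = similarity(confusion, "Russian", "Polish")
ru_ja = similarity(confusion, "Russian", "Japanese")
print(ru_pl > ru_ja)
```

With these made-up numbers, Russian comes out closer to Polish than to Japanese, mirroring the relationship described above.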

When they used that measure to create a family tree of the 14 languages in their data set, it was almost identical to a family tree generated from data amassed by linguists. The nine languages that are in the Indo-European family, for instance, were clearly distinct from the five that aren't, and the Romance languages and the Slavic languages were more similar to each other than they were to the other Indo-European languages.
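
A family tree can be grown from pairwise scores by repeatedly merging the two closest groups. The sketch below uses average-linkage agglomerative clustering on five languages with invented distances; both the clustering method and the numbers are assumptions, since the release does not say how the researchers built their tree.

```python
# Hypothetical pairwise distances (1 - similarity); invented for illustration.
langs = ["Russian", "Polish", "French", "Spanish", "Japanese"]
dist = {
    frozenset(("Russian", "Polish")): 0.20,
    frozenset(("French", "Spanish")): 0.25,
    frozenset(("Russian", "French")): 0.60,
    frozenset(("Russian", "Spanish")): 0.60,
    frozenset(("Polish", "French")): 0.60,
    frozenset(("Polish", "Spanish")): 0.60,
    frozenset(("Russian", "Japanese")): 0.90,
    frozenset(("Polish", "Japanese")): 0.90,
    frozenset(("French", "Japanese")): 0.90,
    frozenset(("Spanish", "Japanese")): 0.90,
}


def leaves(tree):
    """Return the language names under a subtree (a name or a pair of subtrees)."""
    return [tree] if isinstance(tree, str) else leaves(tree[0]) + leaves(tree[1])


def linkage(a, b):
    """Average-linkage distance between two subtrees."""
    pairs = [(x, y) for x in leaves(a) for y in leaves(b)]
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def build_tree(items):
    """Merge the two closest clusters until a single tree remains."""
    clusters = list(items)
    while len(clusters) > 1:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


tree = build_tree(langs)
print(tree)
```

In this toy input the two Slavic languages merge first, then the two Romance languages, then the two Indo-European groups, with Japanese joining last.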

What's your type?

"The striking thing about this tree is that our system inferred it without having seen a single word in any of these languages," Berzak says. "We essentially get the similarity structure for free. Now we can take it one step further and use this tree to predict typological features of a language for which we have no linguistic knowledge."

By "typological features," Berzak means the types of syntactic patterns that linguists use to characterize languages — things like the typical order of subject, object, and verb; how negations are formed; or whether nouns take articles. A widely used online linguistic database called the World Atlas of Language Structures (WALS) identifies nearly 200 such features and includes data on more than 2,000 languages.

But, Berzak says, for some of those languages, WALS includes only a handful of typological features; the others just haven't been determined yet. Even widely studied European languages may have dozens of missing entries in the WALS database. At the time of his study, Berzak points out, only 14 percent of the entries in WALS had been filled in.

The new system could help fill in the gaps. In work presented last month at the Conference on Computational Natural Language Learning, Berzak, Katz, and Reichart ran a series of experiments that examined each of the 14 languages of the essays they'd analyzed, trying to predict its typological features from those of the other 13 languages, based solely on the similarity scores produced by the system. On average, those predictions were about 72 percent accurate.
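
The leave-one-out experiment can be sketched as follows: hide one language's features, copy them from its most similar neighbor, and score the guesses. The similarity scores, the tiny feature table, and the single-nearest-neighbor rule are assumptions for illustration, not the researchers' published setup.

```python
# Hypothetical similarity scores and WALS-style features; invented for illustration.
similarity = {
    frozenset(("Russian", "Polish")): 0.80,
    frozenset(("Russian", "Japanese")): 0.10,
    frozenset(("Russian", "Korean")): 0.10,
    frozenset(("Polish", "Japanese")): 0.10,
    frozenset(("Polish", "Korean")): 0.15,
    frozenset(("Japanese", "Korean")): 0.70,
}
features = {
    "Russian": {"word_order": "SVO", "has_articles": False},
    "Polish": {"word_order": "SVO", "has_articles": False},
    "Japanese": {"word_order": "SOV", "has_articles": False},
    "Korean": {"word_order": "SOV", "has_articles": False},
}


def nearest(target, candidates):
    """Pick the candidate language most similar to the target."""
    return max(candidates, key=lambda c: similarity[frozenset((target, c))])


def leave_one_out_accuracy():
    """Predict each language's features from its nearest neighbor and score them."""
    correct = total = 0
    for target, truth in features.items():
        others = [lang for lang in features if lang != target]
        neighbor = nearest(target, others)
        for name, value in truth.items():
            total += 1
            correct += features[neighbor][name] == value
    return correct / total


accuracy = leave_one_out_accuracy()
print(f"accuracy: {accuracy:.2f}")
```

A four-language toy set like this one is far too small to say anything about real accuracy; it only shows the mechanics of the evaluation.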

Branching out

The 14 languages of the researchers' initial experiments were the ones for which an adequate number of essays — an average of 88 each — were publicly available. But Katz is confident that given enough training data, the system would perform just as well on other languages. Berzak points out that the African language Tswana, which has only five entries in WALS, nonetheless has 6 million speakers worldwide. It shouldn't be too difficult, Berzak argues, to track down more English-language essays by native Tswana speakers.

INFORMATION: Written by Larry Hardesty, MIT News Office

