PRESS-NEWS.org - Press Release Distribution

Alternate framework for distributed computing tames Big Data’s ever growing costs

2023-02-23
(Press-News.org) The sheer volume of ‘Big Data’ produced today by various sectors is beginning to overwhelm even the extremely efficient computational techniques developed to sift through all that information. But a new computational framework based on random sampling looks set to finally tame Big Data’s ever-growing communication, memory and energy costs into something more manageable.

 

A paper describing the framework was published in the journal Big Data Mining and Analytics on Jan. 26.

 

The amount of data being produced from social networks, business transactions, the ‘Internet of Things’, finance, healthcare and beyond has exploded in recent years. This era of so-called Big Data has offered incredible statistical power to discover patterns and deliver insights previously unimaginable. But the volume of Big Data being produced is beginning to hit computational limits.

 

The scalability of complex algorithms starts to flounder at around a terabyte of data (one trillion bytes) in a computer cluster or in cloud computing. The New York Stock Exchange, for example, produces about a terabyte’s worth of trade data every day, while Facebook users generate 500 terabytes over that same time.

 

Distributed computing plays a vital role in the storing, processing and analysis of such big data. It deploys a ‘divide and conquer’ strategy to sort through the data efficiently and speedily. This involves partitioning a big data file into a number of smaller files called ‘data block files’. These data blocks are stored in a distributed fashion across the many nodes of a cluster of computers. The blocks are then processed in parallel rather than sequentially, radically speeding up processing time. The results from these local nodes are then fed back to a central location and reintegrated, producing a global result.
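
The partition, parallel processing and reintegration steps described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: threads stand in for cluster nodes, a list in memory stands in for a big data file, and the function names (partition, local_sum, global_mean) are hypothetical and do not come from the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_blocks):
    """Split a list of records into num_blocks roughly equal data blocks."""
    size, extra = divmod(len(records), num_blocks)
    blocks, start = [], 0
    for i in range(num_blocks):
        # The first `extra` blocks take one additional record each.
        end = start + size + (1 if i < extra else 0)
        blocks.append(records[start:end])
        start = end
    return blocks


def local_sum(block):
    """Work done on one 'node': a partial result for its own block."""
    return sum(block), len(block)


def global_mean(records, num_blocks=4):
    """Compute the mean by combining per-block partial results."""
    blocks = partition(records, num_blocks)
    # Assumption: threads play the role of cluster nodes, one block each.
    with ThreadPoolExecutor(max_workers=num_blocks) as pool:
        partials = list(pool.map(local_sum, blocks))
    # Reintegration at the 'central location'.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count
```

Calling global_mean(list(range(1, 101))) splits the hundred values into four blocks, sums each block separately and combines the partial sums into a single mean.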

 

This divide-and-conquer operation is managed by a distributed file system, which is in turn governed by a programming model. The file system divides the big data files, and the programming model divides an algorithm into pieces, which can then run on the data blocks in a distributed fashion.

 

MapReduce, developed by Google, is the most widely used programming model for distributed computing that runs on clusters and across the cloud. The name comes from its two basic operations. The Map operation is performed on the data block in a node to generate a local result. This is executed on multiple nodes in parallel to achieve the huge speed-up in processing time. The Reduce operation then collates all these local results into a global result.
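
As a rough illustration of the two operations, the following Python sketch counts words across several data blocks. It is a single-machine toy, not Google's MapReduce or any real framework's API: the names map_block, reduce_counts and word_count are hypothetical, and the Map calls run one after another here, whereas on a cluster they would run in parallel on separate nodes.

```python
from collections import Counter
from functools import reduce


def map_block(block):
    """Map: count the words in one data block, giving a local result."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce: merge two local results into one."""
    return left + right


def word_count(blocks):
    """Run Map on every block, then Reduce the local results to a global one."""
    # Assumption: a plain loop stands in for parallel execution on nodes.
    local_results = [map_block(block) for block in blocks]
    return reduce(reduce_counts, local_results, Counter())
```

Each local result has to be handed to the Reduce step, and in a real cluster that hand-off is where the data transfer between nodes takes place.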

 

This latter stage involves transferring local results to master nodes or a central node that performs the Reduce operation, and all this data shuffling is extremely costly in terms of communication traffic and memory.

 

“This enormous communication cost is manageable up to a point,” said Xudong Sun, the lead author of the paper and a computer scientist with the College of Computer Science and Software Engineering at Shenzhen University. “If the desired task involves only a single pair of Map and Reduce operations, such as counting the frequency of a word across a large number of web pages, then MapReduce can run extremely efficiently across thousands of nodes over even a gargantuan big-data file.”

 

“But if the desired task involves a series of iterations of the Map and Reduce pairs, then MapReduce becomes very sluggish due to the large communication costs and consequent memory and computing costs,” he added.

 

So the researchers developed a new distributed computing framework they call Non-MapReduce to improve the scalability of cluster computing on big data by reducing these communication and memory costs.

 

To do so, they depend upon a novel data representation model called random sample partition, or RSP. This involves randomly sampling a big data file’s distributed data blocks instead of processing all of them. When a big data file is analyzed, a set of RSP data blocks is randomly selected and processed, and the results are then integrated at the global level to produce an approximation of what the result would have been had the entire data file been processed.
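
The idea of processing only a random selection of blocks can be sketched in a few lines of Python. This is a hedged illustration, not the researchers' implementation: it assumes that blocks are built by shuffling the records before cutting them up, so that each block is itself a random sample of the whole file, and the names make_rsp_blocks and approximate_mean are hypothetical.

```python
import random
import statistics


def make_rsp_blocks(records, num_blocks, seed=0):
    """Shuffle, then cut into blocks, so each block is a random sample.

    Assumption: shuffling before partitioning is how the blocks are made
    representative of the whole data set in this sketch.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    return [shuffled[i::num_blocks] for i in range(num_blocks)]


def approximate_mean(blocks, num_selected, seed=0):
    """Estimate the global mean from a randomly selected subset of blocks."""
    rng = random.Random(seed)
    chosen = rng.sample(blocks, num_selected)
    # Each block is summarized locally; only (mean, size) pairs are combined.
    local = [(statistics.fmean(block), len(block)) for block in chosen]
    total = sum(size for _, size in local)
    return sum(mean * size for mean, size in local) / total
```

With 10,000 records split into 20 blocks, approximate_mean(blocks, 5) reads only a quarter of the data yet should land close to the true mean, and selecting all 20 blocks recovers the exact value.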

 

In this way, the technique works in much the same way that random sampling is used in statistical analysis to describe the attributes of a population. Non-MapReduce’s RSP approach is thus a species of what is called ‘approximate computing’, an emerging paradigm that achieves greater energy efficiency by delivering an approximate rather than exact result. Approximate computing is useful in situations where a roughly accurate result that is computationally cheap to obtain is sufficient for the task at hand, and preferable to a computationally costly effort to deliver a perfectly accurate result.

 

The Non-MapReduce computing framework will be of considerable benefit for a range of tasks, such as quickly drawing multiple random samples for ensemble machine learning; directly executing a sequence of algorithms on local random samples without requiring data communication among the nodes; and easing the exploration and cleaning of big data. In addition, the framework saves a significant amount of energy in cloud computing.

 

The team now hope to apply their Non-MapReduce framework to some major big data platforms and use it for real-world applications. Ultimately, they hope to use it to tackle the problem of analyzing extremely big data distributed across several data centers.

 

##

 

About Big Data Mining and Analytics 

 

Big Data Mining and Analytics (Published by Tsinghua University Press) discovers hidden patterns, correlations, insights and knowledge through mining and analyzing large amounts of data obtained from various applications. It addresses the most innovative developments, research issues and solutions in big data research and their applications. Big Data Mining and Analytics is indexed and abstracted in ESCI, EI, Scopus, DBLP Computer Science, Google Scholar, INSPEC, CSCD, DOAJ, CNKI, etc.

 

About Tsinghua University Press

 

Established in 1980, belonging to Tsinghua University, Tsinghua University Press (TUP) is a leading comprehensive higher education and professional publisher in China. Committed to building a top-level global cultural brand, after 41 years of development, TUP has established an outstanding managerial system and enterprise structure, and delivered multimedia and multi-dimensional publications covering books, audio, video, electronic products, journals and digital publications. In addition, TUP actively carries out its strategic transformation from educational publishing to content development and service for teaching & learning and was named First-class National Publisher for achieving remarkable results.

 

END



