Complexity 2020:1-25 (2020)
Abstract
Multiway join queries incur high-cost I/O operations over large-scale data. Exploiting join-sharing opportunities among multiple multiway joins can reduce both query execution time and the volume of shuffled intermediate data. Although multiway join optimization has been studied for MapReduce, the different design principles of other Big Data platforms have not been considered. To bridge this gap, an end-to-end multiway join system over Flink, called Join-MOTH (J-MOTH), is proposed to exploit sharing data granularity, sharing join granularity, and sharing implicit sorts within multiple join queries. For sharing data, our previous work, the Multiquery Optimization using Tuple Size and Histogram system, considers the granularity of data-sharing opportunities among multiple queries. For sharing sorts, our previous work, the Sort-Based Optimizer for Big Data Multiquery, considers the implicit sorts among join queries. For sharing joins, additional modules have been added to the J-MOTH optimizer to exploit shared pipelined multiway joins among multiple multiway join queries. The experimental evaluation demonstrates that the J-MOTH system outperforms naive and state-of-the-art techniques by 44% in query execution time on TPC-H queries. The J-MOTH system also achieves a maximal intermediate data size reduction of 30% on average over Hadoop-like infrastructures.