Results 1 - 2 of 2
1.
Big Data ; 8(4): 323-331, 2020 08.
Article in English | MEDLINE | ID: mdl-32820950

ABSTRACT

This article proposes the MapReduce scheduler with deadlines and priorities (MRS-DP), a scheduler capable of handling jobs with both deadlines and priorities. Big data has emerged as a key concept and has revolutionized data analytics in the present era. Big data are characterized by multiple dimensions, or Vs: volume, velocity, variety, veracity, and valence. Recently, a new and important dimension (another V), known as value, was added. Value has emerged as an important characteristic: delay in acquiring information leads to late decisions, which may result in missed opportunities. To gain optimal benefit, this article introduces a scheduler based on jobs with deadlines and priorities that aims to improve resource utilization through efficient job progress monitoring and a backup launching mechanism. The proposed scheduler can accommodate multiple jobs so as to maximize the number of jobs processed successfully and avoid starvation of lower-priority jobs, while improving resource utilization and ensuring the assured quality of service (QoS). To evaluate the proposed scheduler, we ran multiple workloads consisting of WordCount jobs and DataSort jobs. The performance of the proposed MRS-DP scheduler is compared with the minimal earliest deadline first-work conserving scheduler and the MapReduce Constraint Programming based Resource Management algorithm in terms of the percentage of successful jobs, priority-wise jobs, and resource utilization of the cluster. The results show that the proposed scheduler improves the percentage of successful jobs by around 10%-20% and effective resource utilization by 20%-25%, while ensuring the offered QoS.
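The abstract's core idea of ordering jobs by priority while honoring deadlines within each priority class can be illustrated with a small sketch. This is not the paper's MRS-DP algorithm (whose internals are not given here); it is a minimal, hypothetical earliest-deadline-first ordering within priority levels, with all names chosen for illustration.

```python
import heapq

# Hypothetical sketch: order jobs by (priority, deadline), i.e.
# earliest-deadline-first within each priority class. Job names,
# deadlines, and priority values are illustrative, not from the paper.
class Job:
    def __init__(self, name, deadline, priority):
        self.name = name          # job identifier
        self.deadline = deadline  # relative deadline in seconds
        self.priority = priority  # lower value = higher priority

def schedule(jobs):
    """Return job names in dispatch order: by priority first,
    then earliest deadline first inside each priority class."""
    heap = [(j.priority, j.deadline, j.name) for j in jobs]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order

jobs = [Job("analytics", 120, 2), Job("billing", 60, 1), Job("report", 90, 1)]
print(schedule(jobs))  # ['billing', 'report', 'analytics']
```

A real scheduler such as MRS-DP would additionally age waiting jobs or reserve slots so that low-priority jobs are not starved, and would monitor progress to launch backup tasks; this sketch shows only the dispatch ordering.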


Subject(s)
Big Data , Resource Allocation/standards , Workload , Algorithms , Efficiency, Organizational , Software
2.
Big Data ; 8(1): 62-69, 2020 02.
Article in English | MEDLINE | ID: mdl-31995397

ABSTRACT

The MapReduce programming model was designed and developed for the Google File System to efficiently process large-scale distributed data sets. The open-source implementation of this Google project is Apache Hadoop. The Hadoop architecture includes Hadoop MapReduce and the Hadoop Distributed File System (HDFS). HDFS supports Hadoop in effectively managing data sets over the cluster, and the MapReduce programming paradigm enables efficient processing of large data sets. MapReduce strategically re-executes a speculative task on some other node to finish the computation quickly, enhancing the overall Quality of Service (QoS). Several mechanisms have been suggested over Hadoop's default scheduler to improve speculative task execution on a Hadoop cluster, and a large number of strategies have also been suggested for scheduling jobs with deadlines. However, existing mechanisms for speculative task execution were not developed for, or were not well integrated with, deadline schedulers. This article presents an improved speculative task detection algorithm designed specifically for deadline schedulers. Our studies suggest the importance of keeping a regular track of each node's performance to re-execute speculative tasks more efficiently. We successfully improved the QoS offered by Hadoop clusters for jobs arriving with deadlines in terms of the percentage of successfully completed jobs, the detection time of speculative tasks, the accuracy of correct speculative task detection, and the percentage of incorrectly flagged speculative tasks.
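The detection idea described above can be sketched in a few lines: project each running task's finish time from its observed progress rate and flag tasks that would miss the job deadline. This is a simplified, hypothetical illustration, not the paper's algorithm; the slack factor and all names are assumptions.

```python
# Hypothetical sketch of deadline-aware speculative task detection:
# a task is flagged when its projected finish time, extrapolated from
# its progress so far, would overrun the job deadline. The slack factor
# is an illustrative guard against false positives, not a value from
# the paper.

def projected_finish(elapsed, progress):
    """Extrapolate total runtime from elapsed seconds and fractional
    progress (0..1). A task with no progress projects to infinity."""
    if progress <= 0:
        return float("inf")
    return elapsed / progress

def detect_speculative(tasks, deadline, slack=1.1):
    """Return ids of tasks whose projected finish exceeds the
    deadline, allowing a small slack margin.

    tasks: iterable of (task_id, elapsed_seconds, progress_fraction).
    """
    flagged = []
    for task_id, elapsed, progress in tasks:
        if projected_finish(elapsed, progress) > deadline * slack:
            flagged.append(task_id)
    return flagged

tasks = [("t1", 30, 0.5), ("t2", 30, 0.2), ("t3", 10, 0.4)]
print(detect_speculative(tasks, deadline=100))
# t2 projects 150 s against a 110 s budget, so only t2 is flagged
```

Tracking a per-node history of progress rates, as the abstract suggests, would let the detector distinguish a genuinely slow node from a momentarily stalled task before re-executing the work elsewhere.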


Subject(s)
Algorithms , Cloud Computing , Computer Simulation , Appointments and Schedules , Datasets as Topic , Software