Spark Driver

Introduction

This article covers configuring the Spark Driver data source and the metrics the plugin collects.

Requirements

 
Spark Driver Data Sources
 
Open the configuration modal for Spark and click the Configure Data Sources tab. In the Add a data source drop-down menu [1], select Spark Driver. Click Edit [2] on the right to expose the fields that need to be configured [3]:
 
SparkDriverConfig.png
 
The plugin requires the following information:
 
Field Name        Default Value         Description
Host              _HOST:4040_           Hostname of the Spark driver
Port              4040                  Port the Spark driver's REST API listens on
UrlPath           /api/v1/applications  Path of the Spark monitoring REST API
Run as Unix User  nobody                Unix user the plugin runs as
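To confirm the values above point at a live driver, you can query the endpoint the plugin polls. The sketch below builds the URL from the configured fields; the host value is an assumption (substitute your driver's hostname), and the fetch helper is only illustrative:

```python
# Minimal sketch of reaching the Spark driver's monitoring REST endpoint.
# "localhost" is an assumed host; port and path match the defaults above.
import json
from urllib.request import urlopen

host = "localhost"                    # assumption: your Spark driver host
port = 4040                           # default Spark driver port
url_path = "/api/v1/applications"     # default UrlPath

url = f"http://{host}:{port}{url_path}"

def fetch_applications(endpoint):
    """Fetch the list of applications known to the driver's REST API."""
    with urlopen(endpoint) as resp:
        return json.load(resp)

# Each returned application carries an "id"; the executors, jobs, and
# storage/rdd endpoints used for the metric tables below are keyed on it.
```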

 

Spark Driver Metrics

 

Metric Name Units Metric Description
active_tasks Count Number of active tasks
completed_tasks Count Number of completed tasks
disk_used Bytes Disk space used
failed_tasks Count Number of failed tasks
memory_max Bytes Maximum memory set for this process
memory_used Bytes Memory used
rdd_blocks Blocks Number of RDD blocks
total_duration Seconds Total duration for which this process has run
total_input_bytes Bytes Total input bytes
total_shuffle_read Bytes Shuffle read size
total_shuffle_write Bytes Shuffle write size
total_tasks Count Total number of tasks

Spark Executor Metrics

Metric Name Units Metric Description
active_tasks Count Number of active tasks
completed_tasks Count Number of completed tasks
disk_used Bytes Disk space used
failed_tasks Count Number of failed tasks
memory_max Bytes Maximum memory set for this process
memory_used Bytes Memory used
rdd_blocks Blocks Number of RDD blocks
total_duration Seconds Total duration for which this process has run
total_input_bytes Bytes Total input bytes
total_shuffle_read Bytes Shuffle read size
total_shuffle_write Bytes Shuffle write size
total_tasks Count Total number of tasks
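The driver and executor tables share the same metric set because Spark's /api/v1/applications/[app-id]/executors endpoint reports one record per executor, with the driver appearing in the same list under the id "driver". A hedged sketch of mapping one such record onto the metric names above (field names follow Spark's REST API; the sample record is illustrative, not real output):

```python
def executor_metrics(record):
    """Map one ExecutorSummary record from the Spark REST API onto the
    metric names used in the tables above."""
    return {
        "active_tasks": record["activeTasks"],
        "completed_tasks": record["completedTasks"],
        "disk_used": record["diskUsed"],
        "failed_tasks": record["failedTasks"],
        "memory_max": record["maxMemory"],
        "memory_used": record["memoryUsed"],
        "rdd_blocks": record["rddBlocks"],
        # The API reports duration in milliseconds; the table uses seconds.
        "total_duration": record["totalDuration"] / 1000.0,
        "total_input_bytes": record["totalInputBytes"],
        "total_shuffle_read": record["totalShuffleRead"],
        "total_shuffle_write": record["totalShuffleWrite"],
        "total_tasks": record["totalTasks"],
    }

# Illustrative sample; the driver shows up in the same endpoint as id "driver".
sample = {
    "id": "driver",
    "activeTasks": 2, "completedTasks": 40, "failedTasks": 1,
    "diskUsed": 0, "maxMemory": 434031820, "memoryUsed": 52428800,
    "rddBlocks": 8, "totalDuration": 120000, "totalInputBytes": 1048576,
    "totalShuffleRead": 4096, "totalShuffleWrite": 2048, "totalTasks": 43,
}
metrics = executor_metrics(sample)
```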

 

Spark Jobs Metrics

Metric Name Units Metric Description
jobs_count Count Number of jobs
active_stages Count Number of active stages for the given jobId
active_tasks Count Number of active tasks for the given jobId
completed_stages Count Number of completed stages for the given jobId
complete_tasks Count Number of completed tasks for the given jobId
failed_stages Count Number of failed stages for the given jobId
failed_tasks Count Number of failed tasks for the given jobId
skipped_stages Count Number of skipped stages for the given jobId
skipped_tasks Count Number of skipped tasks for the given jobId
num_tasks Count Number of tasks for the given jobId
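These per-job counters come from Spark's /api/v1/applications/[app-id]/jobs endpoint, which returns one record per job; jobs_count is simply the length of that list. A hedged sketch of the mapping (field names follow Spark's JobData; sample values are illustrative):

```python
def job_metrics(job):
    """Map one JobData record from the Spark REST API onto the
    per-job metric names in the table above."""
    return {
        "active_stages": job["numActiveStages"],
        "active_tasks": job["numActiveTasks"],
        "completed_stages": job["numCompletedStages"],
        "complete_tasks": job["numCompletedTasks"],
        "failed_stages": job["numFailedStages"],
        "failed_tasks": job["numFailedTasks"],
        "skipped_stages": job["numSkippedStages"],
        "skipped_tasks": job["numSkippedTasks"],
        "num_tasks": job["numTasks"],
    }

# Illustrative sample job record.
sample_job = {
    "jobId": 0,
    "numTasks": 200, "numActiveTasks": 10, "numCompletedTasks": 150,
    "numSkippedTasks": 20, "numFailedTasks": 20,
    "numActiveStages": 1, "numCompletedStages": 3,
    "numSkippedStages": 1, "numFailedStages": 0,
}
jobs = [sample_job]            # the endpoint returns a list of jobs
jobs_count = len(jobs)         # maps to the jobs_count metric
job = job_metrics(sample_job)
```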

 

Spark RDD Metrics

Metric Name Units Metric Description
rdd_disk_used Bytes Disk used by the RDD
rdd_mem_used Bytes Memory used by the RDD
cached_partitions Count Number of cached partitions in a given RDD
num_partitions Count Number of partitions in a given RDD
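The RDD storage figures come from Spark's /api/v1/applications/[app-id]/storage/rdd endpoint, which lists each cached RDD. A hedged sketch of the mapping (field names follow Spark's RDDStorageInfo; sample values are illustrative):

```python
def rdd_metrics(rdd):
    """Map one RDDStorageInfo record from the Spark REST API onto the
    per-RDD metric names in the table above."""
    return {
        "rdd_disk_used": rdd["diskUsed"],
        "rdd_mem_used": rdd["memoryUsed"],
        "cached_partitions": rdd["numCachedPartitions"],
        "num_partitions": rdd["numPartitions"],
    }

# Illustrative sample RDD storage record.
sample_rdd = {
    "numPartitions": 16, "numCachedPartitions": 12,
    "memoryUsed": 8388608, "diskUsed": 0,
}
rdd = rdd_metrics(sample_rdd)
```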

 


 
