What configuration will meet the business reporting team’s requirement for hourly dashboard updates with the lowest cost?


Scheduling a job to execute the pipeline once an hour on a new job cluster meets the requirement for hourly dashboard updates at the lowest cost. Job clusters are created on demand and terminated as soon as their task completes, so you never pay for a persistent cluster running around the clock.

Using a new job cluster for each execution ensures that compute resources are consumed only while the pipeline is actually running, which suits a workload that needs hourly, not continuous, updates. The cluster spins up for the run, executes the task, and terminates automatically, so no costs accrue between runs.
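The setup above can be sketched as a Databricks Jobs API 2.1 payload. The job name, notebook path, node type, and worker count below are illustrative assumptions; the key elements are the Quartz cron schedule (hourly, at minute 0) and the `new_cluster` block, which makes Databricks create an ephemeral job cluster for each run and terminate it when the task finishes.

```json
{
  "name": "hourly-dashboard-refresh",
  "schedule": {
    "quartz_cron_expression": "0 0 * * * ?",
    "timezone_id": "UTC",
    "pause_status": "UNPAUSED"
  },
  "tasks": [
    {
      "task_key": "refresh_pipeline",
      "notebook_task": {
        "notebook_path": "/Pipelines/dashboard_refresh"
      },
      "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 2
      }
    }
  ]
}
```

Replacing `new_cluster` with an `existing_cluster_id` pointing at an interactive cluster would make the job run, but the cluster would keep billing between runs, which is exactly the cost the job-cluster approach avoids.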

This option stands out against the alternatives. Triggering the job manually would require constant human intervention and provides no automation. A Structured Streaming job with a 60-minute trigger interval would keep its cluster running continuously, incurring costs even between updates. Finally, scheduling the job on a dedicated interactive cluster would also be more expensive, since interactive clusters are billed for as long as they stay up and are not optimized for short, scheduled tasks.
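The cost argument can be made concrete with a back-of-the-envelope comparison. The DBU rate and run duration below are illustrative assumptions, not real Databricks prices; the point is the ratio between an always-on cluster and one that runs only a few minutes per hour.

```python
# Hypothetical daily cost comparison: always-on cluster vs. hourly job cluster.
# dbu_per_hour and minutes_per_run are assumed values for illustration only.
dbu_per_hour = 2.0      # assumed DBU consumption rate of the cluster
runs_per_day = 24       # hourly schedule
minutes_per_run = 10    # assumed duration of one pipeline run

# An always-on cluster (streaming job or interactive cluster) bills all 24 hours.
always_on_dbus = dbu_per_hour * 24

# A job cluster bills only while each scheduled run is executing.
job_cluster_dbus = dbu_per_hour * runs_per_day * (minutes_per_run / 60)

print(f"Always-on cluster:  {always_on_dbus:.0f} DBUs/day")
print(f"Hourly job cluster: {job_cluster_dbus:.0f} DBUs/day")
```

Under these assumptions the job-cluster approach consumes 8 DBUs per day versus 48 for an always-on cluster, a sixfold saving, and the gap widens the shorter each run is.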

In summary, scheduling a job to execute the pipeline once an hour on a new job cluster is the most cost-effective way to keep the dashboard current: it automates the hourly refresh while paying for compute only during each run.
