Databricks introduction

Rajnish

Databricks

Databricks is a unified, open analytics platform for building, deploying, sharing, and maintaining data, analytics, and AI solutions at scale.

Databricks Architecture and Services

Clusters

  • A cluster is a collection of virtual machine (VM) instances.

  • Computational workloads are distributed across the cluster's worker nodes.

There are two types of clusters:

| All-Purpose Clusters | Job Clusters |
|----------------------|--------------|
| Analyse data collaboratively using interactive notebooks | Run automated jobs |
| Create a cluster from the workspace or the API | The Databricks job scheduler creates job clusters when running jobs |
| Configuration information is retained for up to 70 clusters for up to 30 days | Configuration information is retained for up to the 30 most recently terminated clusters |
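
To make the difference between the two cluster types more concrete, here is a minimal Python sketch of how each might be requested over the REST API. It only builds the requests and does not send anything. The workspace URL, token, runtime version, node type, notebook path, and all function names are assumptions made for illustration, not values from this article, so check them against your own workspace before use.

```python
import json
import urllib.request

# All values below are illustrative placeholders, not taken from the article.
WORKSPACE_URL = "https://example.cloud.databricks.com"  # hypothetical workspace
TOKEN = "dapi-REDACTED"  # hypothetical personal access token


def build_cluster_spec(num_workers: int) -> dict:
    """Describe the VMs in a cluster: one driver plus `num_workers` workers."""
    return {
        "spark_version": "13.3.x-scala2.12",  # assumed runtime version
        "node_type_id": "i3.xlarge",          # assumed VM type (cloud-specific)
        "num_workers": num_workers,
    }


def build_all_purpose_request(name: str, num_workers: int) -> urllib.request.Request:
    """Prepare (but do not send) a Clusters API call for an all-purpose cluster."""
    body = {
        "cluster_name": name,
        "autotermination_minutes": 60,  # assumed idle timeout
        **build_cluster_spec(num_workers),
    }
    return urllib.request.Request(
        url=f"{WORKSPACE_URL}/api/2.0/clusters/create",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {TOKEN}"},
        method="POST",
    )


def build_job_payload(job_name: str, notebook_path: str, num_workers: int) -> dict:
    """Prepare a Jobs API payload (for /api/2.1/jobs/create).

    The job carries its own `new_cluster` spec, so the job scheduler
    creates the job cluster when the job runs.
    """
    return {
        "name": job_name,
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": notebook_path},
                "new_cluster": build_cluster_spec(num_workers),
            }
        ],
    }


if __name__ == "__main__":
    request = build_all_purpose_request("interactive-analysis", num_workers=2)
    print(request.get_method(), request.full_url)
    print(json.dumps(build_job_payload("nightly-report", "/Shared/report", 2), indent=2))
    # To actually create the cluster: urllib.request.urlopen(request)
```

In this sketch the all-purpose cluster is named and created explicitly, while the job cluster exists only as a specification embedded in the job definition, which mirrors the second row of the table above.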