Databricks Install & Auth
Overview
Set up the Databricks CLI v2, the Python SDK, and authentication. Covers personal access tokens (PATs, legacy), OAuth U2M (interactive), and OAuth M2M (service principal, for CI/CD). Databricks strongly recommends OAuth over PATs for production.
Prerequisites
- Python 3.8+ with pip
- Databricks workspace URL (e.g., https://adb-1234567890123456.7.azuredatabricks.net)
- For PAT: User Settings > Developer > Access Tokens in workspace UI
- For OAuth M2M: Service principal with client ID and secret
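Before configuring auth, it can help to sanity-check the workspace URL. The sketch below is a hypothetical stdlib helper (not part of the Databricks SDK) that validates the Azure workspace host format shown above; AWS and GCP workspace hosts use different domains.

```python
import re

# Hypothetical helper: sanity-check an Azure Databricks workspace URL.
# Pattern covers the adb-<id>.<n>.azuredatabricks.net format only.
AZURE_HOST = re.compile(r"^https://adb-\d+\.\d+\.azuredatabricks\.net$")

def looks_like_azure_workspace(url: str) -> bool:
    """Return True if url matches the Azure Databricks workspace host format."""
    return bool(AZURE_HOST.match(url.rstrip("/")))

print(looks_like_azure_workspace("https://adb-1234567890123456.7.azuredatabricks.net"))  # True
print(looks_like_azure_workspace("https://example.com"))  # False
```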
Instructions
Step 1: Install Databricks CLI and Python SDK
set -euo pipefail
# Install CLI v2 (standalone binary — recommended)
curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
# Verify CLI
databricks --version
# Install Python SDK
pip install databricks-sdk
# Optional: Databricks Connect for local Spark development
# (pin the major.minor to match your cluster's Databricks Runtime version)
pip install databricks-connect==14.3.*
Step 2: Configure Authentication
Option A: Personal Access Token (Quick Start)
Generate a PAT in workspace UI: User Settings > Developer > Access Tokens.
# Interactive setup — prompts for host and token
databricks configure --token
# Or set environment variables directly
export DATABRICKS_HOST="https://adb-1234567890123456.7.azuredatabricks.net"
export DATABRICKS_TOKEN="dapi_your_token_here"
Option B: OAuth U2M (User-to-Machine — Interactive)
Opens a browser for OAuth consent. The access token has a one-hour lifetime; the CLI caches and refreshes it automatically.
# Interactive OAuth login
databricks auth login --host https://adb-1234567890123456.7.azuredatabricks.net
# Verify — prints current user
databricks current-user me
Option C: OAuth M2M (Service Principal — CI/CD)
Uses client credentials flow. No browser required. Create a service principal in Account Console > Service Principals, then generate an OAuth secret.
export DATABRICKS_HOST="https://adb-1234567890123456.7.azuredatabricks.net"
export DATABRICKS_CLIENT_ID="00000000-0000-0000-0000-000000000000"
export DATABRICKS_CLIENT_SECRET="dose00000000000000000000000000000000"
# Verify
databricks current-user me
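The CLI and SDK detect which auth mode to use from the environment variables that are set. The sketch below is a simplified, hypothetical version of that detection (the SDK's actual unified-auth resolution order is more involved; consult its documentation for the exact precedence):

```python
def detect_auth_type(env: dict) -> str:
    """Simplified sketch of credential detection from environment variables.

    Assumption: a PAT takes precedence over OAuth M2M credentials when both
    are present; the real SDK's resolution order is more involved.
    """
    if env.get("DATABRICKS_TOKEN"):
        return "pat"
    if env.get("DATABRICKS_CLIENT_ID") and env.get("DATABRICKS_CLIENT_SECRET"):
        return "oauth-m2m"
    return "none"

print(detect_auth_type({"DATABRICKS_TOKEN": "dapi..."}))   # pat
print(detect_auth_type({"DATABRICKS_CLIENT_ID": "id",
                        "DATABRICKS_CLIENT_SECRET": "s"}))  # oauth-m2m
```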
Step 3: Configure Profiles for Multi-Workspace
# ~/.databrickscfg — one section per workspace
[DEFAULT]
host = https://adb-dev-workspace.7.azuredatabricks.net
token = dapi_dev_token_here
[staging]
host = https://adb-staging-workspace.7.azuredatabricks.net
token = dapi_staging_token_here
[production]
host = https://adb-prod-workspace.7.azuredatabricks.net
client_id = 00000000-0000-0000-0000-000000000000
client_secret = dose_prod_secret_here
# Use a specific profile
databricks workspace list / --profile staging
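~/.databrickscfg is INI-format, so Python's stdlib configparser can read it directly — useful for scripting profile checks without the SDK. The sketch below parses an inline sample (mirroring the file above) rather than the real file; note that [DEFAULT] is special in INI files and does not appear in sections().

```python
import configparser

# Inline sample mirroring ~/.databrickscfg; parse the real file with
# cfg.read(os.path.expanduser("~/.databrickscfg")) instead.
SAMPLE = """\
[DEFAULT]
host = https://adb-dev-workspace.7.azuredatabricks.net
token = dapi_dev_token_here

[staging]
host = https://adb-staging-workspace.7.azuredatabricks.net
token = dapi_staging_token_here
"""

cfg = configparser.ConfigParser()
cfg.read_string(SAMPLE)

# [DEFAULT] values act as fallbacks and are not listed as a section.
print(cfg.sections())            # ['staging']
print(cfg["staging"]["host"])    # the staging workspace URL
```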
Step 4: Verify SDK Connection
from databricks.sdk import WorkspaceClient
# Auto-detects from env vars or ~/.databrickscfg
w = WorkspaceClient()
me = w.current_user.me()
print(f"Authenticated as: {me.user_name}")
print(f"Workspace: {w.config.host}")
print(f"Auth type: {w.config.auth_type}")
# Quick smoke test — list clusters
clusters = list(w.clusters.list())
print(f"Clusters found: {len(clusters)}")
Step 5: Service Principal Authentication (Python SDK)
from databricks.sdk import WorkspaceClient
from databricks.sdk.config import Config
# Explicit M2M config for CI/CD scripts
config = Config(
host="https://adb-1234567890123456.7.azuredatabricks.net",
client_id="00000000-0000-0000-0000-000000000000",
client_secret="dose00000000000000000000000000000000",
)
w = WorkspaceClient(config=config)
# Or use a named profile
w = WorkspaceClient(profile="production")
Output
- Databricks CLI v2 installed and on PATH
- Python SDK (databricks-sdk) installed
- Authentication credentials stored in env vars or ~/.databrickscfg
- Connection verified with databricks current-user me
Error Handling
| Error | Cause | Solution |
|-------|-------|----------|
| INVALID_TOKEN | Token expired or revoked | Generate a new PAT or re-run databricks auth login |
| Could not resolve host | Wrong workspace URL | Verify URL format: https://adb-<id>.<region>.azuredatabricks.net |
| PERMISSION_DENIED | Token lacks required entitlements | Ensure user/SP has workspace access in Account Console |
| SSL: CERTIFICATE_VERIFY_FAILED | Corporate proxy intercepts TLS | Set REQUESTS_CA_BUNDLE=/path/to/cert.pem |
| Connection refused | VPN or firewall blocking | Check corporate firewall rules for workspace domain |
| No matching profile | Profile name typo in ~/.databrickscfg | Run databricks auth profiles to list available profiles |
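For scripts that should fail with actionable messages, the table above can be turned into a small triage helper. This is a hypothetical sketch (the names and messages are illustrative, not part of any Databricks API):

```python
# Hypothetical triage helper mirroring the error table: map known error
# codes found in an error message to a suggested fix.
REMEDIES = {
    "INVALID_TOKEN": "Generate a new PAT or re-run `databricks auth login`.",
    "PERMISSION_DENIED": "Check workspace entitlements in the Account Console.",
    "CERTIFICATE_VERIFY_FAILED": "Set REQUESTS_CA_BUNDLE=/path/to/cert.pem.",
}

def triage(error_message: str) -> str:
    """Return a remediation hint for a known error code, else a fallback."""
    for code, remedy in REMEDIES.items():
        if code in error_message:
            return remedy
    return "Unrecognized error; check host, network, and profile configuration."

print(triage("Error: INVALID_TOKEN: token is expired"))
```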
Examples
Account-Level Client (Multi-Workspace Management)
from databricks.sdk import AccountClient
# Account-level operations (manage workspaces, users, billing)
a = AccountClient(
host="https://accounts.cloud.databricks.com",
account_id="00000000-0000-0000-0000-000000000000",
client_id="sp-client-id",
client_secret="sp-secret",
)
for ws in a.workspaces.list():
print(f"{ws.workspace_name}: {ws.deployment_name}")
Azure AD Managed Identity
from databricks.sdk import WorkspaceClient
# Uses Azure Default Credential chain (works in Azure VMs, AKS, Functions)
w = WorkspaceClient(
host="https://adb-1234567890123456.7.azuredatabricks.net",
azure_workspace_resource_id="/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Databricks/workspaces/<ws>",
)
Next Steps
After successful auth, proceed to databricks-hello-world for your first cluster and notebook.