Azure Cloud is a popular work environment for many data scientists, yet many of its features remain poorly documented. This note shows how to monitor Azure experiments more conveniently, and in more detail, than through the web or CLI interface.
The trick is to build a dashboard of experiments and their respective runs, up to a desired level of detail, from Python. The workhorse is the following utility function:
from collections import namedtuple

def get_runs_summary(ws):
    """Summarise all runs under a given workspace, with experiment name, run id and run status.

    Args:
        ws (azureml.core.Workspace): Azure workspace to look into
    """
    # NOTE: extend the scope of run details if needed
    record = namedtuple('Run_Description', ['job_name', 'run_id', 'run_status'])
    # ws.experiments maps experiment names to Experiment objects
    for exp_name, exp_obj in ws.experiments.items():
        for run_obj in exp_obj.get_runs():
            yield record(exp_name, run_obj.id, run_obj.status)
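The NOTE in the function hints at widening the record. Here is a minimal sketch of one way to do that, assuming `run_obj.get_details()` returns a dictionary with `startTimeUtc` and `endTimeUtc` keys. The names `RunDetails` and `get_runs_details` are made up for this illustration, and `.get` is used so that a missing key yields `None` instead of an error.

```python
from collections import namedtuple

# Hypothetical wider record: the three original fields plus two timestamps
RunDetails = namedtuple(
    "RunDetails",
    ["job_name", "run_id", "run_status", "start_time", "end_time"],
)

def get_runs_details(ws):
    """Like get_runs_summary, but also reports start and end timestamps.

    Assumes each run exposes get_details() returning a dict; the keys
    'startTimeUtc' and 'endTimeUtc' are read with .get, so absent keys
    simply become None.
    """
    for exp_name, exp_obj in ws.experiments.items():
        for run_obj in exp_obj.get_runs():
            details = run_obj.get_details()  # likely one service call per run
            yield RunDetails(
                exp_name,
                run_obj.id,
                run_obj.status,
                details.get("startTimeUtc"),
                details.get("endTimeUtc"),
            )
```

Fetching details for every run is probably slower than reading `id` and `status` alone, so the lighter function above remains the better default for a quick overview.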
Now it’s time to see it in action 😎
# get the default workspace
from azureml.core import Workspace
import pandas as pd
ws = Workspace.from_config()
# generate the job dashboard and inspect
runs = get_runs_summary(ws)
summary_df = pd.DataFrame(runs)
summary_df.head()

# count jobs by status
summary_df.groupby('run_status').size()
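If a finer breakdown is useful, the same records can be counted per experiment and status. Below is a small sketch using only the standard library, so it also works directly on the generator output without pandas. The helper name `count_by_experiment_and_status` is hypothetical; it only assumes records with `job_name` and `run_status` fields, as produced by `get_runs_summary`.

```python
from collections import Counter

def count_by_experiment_and_status(records):
    """Count runs per (experiment name, run status) pair.

    `records` is any iterable of objects with `job_name` and `run_status`
    attributes, e.g. the namedtuples yielded by get_runs_summary.
    """
    return Counter((r.job_name, r.run_status) for r in records)
```

Keep in mind that `get_runs_summary` returns a generator, which can be consumed only once: after `pd.DataFrame(runs)` has read it, call `get_runs_summary(ws)` again or iterate over `summary_df.itertuples()` instead.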

The dashboard can also be used to manage experiments automatically. For example, to cancel all running jobs:
from azureml.core import Experiment, Run

# select experiment name and run id for every run that is still running
running = summary_df.loc[summary_df.run_status == 'Running', ['job_name', 'run_id']]
for exp_name, run_id in running.values:
    exp = Experiment(ws, exp_name)
    run = Run(exp, run_id)
    run.cancel()
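Cancelling in bulk is easy to get wrong, so it may be worth wrapping the loop in a helper that can do a dry run first and that keeps going when a single cancellation fails. The sketch below is one possible shape, not part of the Azure SDK: `cancel_runs` and its `dry_run` flag are names invented here, and the only assumption about the run objects is that they expose `.id` and `.cancel()`, as `azureml.core.Run` does.

```python
def cancel_runs(runs, dry_run=True):
    """Cancel each run and report what happened, keyed by run id.

    runs: iterable of objects exposing `.id` and `.cancel()`.
    dry_run: when True (the default), nothing is cancelled; the function
        only reports which runs would be affected.
    """
    outcome = {}
    for run in runs:
        if dry_run:
            outcome[run.id] = "would cancel"
            continue
        try:
            run.cancel()
            outcome[run.id] = "cancel requested"
        except Exception as exc:  # keep going if one run cannot be cancelled
            outcome[run.id] = f"failed: {exc}"
    return outcome
```

A typical call would build the run objects as in the loop above, inspect the result of `cancel_runs(runs)`, and only then repeat the call with `dry_run=False`.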
Check the Jupyter notebook in my repository for a one-click demo.