Azure Cloud is a popular work environment for many data scientists, yet many of its features remain poorly documented. This note shows how to monitor Azure experiments more conveniently and in more detail than the web or command-line interface allows.
The trick is to build a dashboard of experiments and their respective runs, at the desired level of detail, from Python. The workhorse is the following utility function:
from collections import namedtuple


def get_runs_summary(ws):
    """Summarise all runs under a given workspace, with experiment name, run id and run status.

    Args:
        ws (azureml.core.Workspace): Azure workspace to look into
    """
    # NOTE: extend the scope of run details if needed
    record = namedtuple('Run_Description', ['job_name', 'run_id', 'run_status'])
    for exp_name, exp_obj in ws.experiments.items():
        for run_obj in exp_obj.get_runs():
            yield record(exp_name, run_obj.id, run_obj.status)
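The NOTE in the function suggests extending the run details. One possible way to do that, sketched here under assumptions, is to pass the attribute names as a parameter and read them with `getattr`. The name `get_runs_details`, the `fields` parameter and the stand-in workspace built from `SimpleNamespace` are all hypothetical; the stand-ins exist only so the sketch runs without an Azure connection.

```python
from collections import namedtuple
from types import SimpleNamespace


def get_runs_details(ws, fields=("id", "status")):
    """Yield one record per run with a configurable set of run attributes.

    Hypothetical variant of get_runs_summary: `fields` lists attribute
    names to read from each run object.
    """
    record = namedtuple("RunDetails", ("job_name",) + tuple(fields))
    for exp_name, exp_obj in ws.experiments.items():
        for run_obj in exp_obj.get_runs():
            # A missing attribute becomes None instead of raising
            yield record(exp_name, *(getattr(run_obj, f, None) for f in fields))


# Stand-in objects (NOT Azure classes) that mimic only what the function needs
def _fake_experiment(runs):
    return SimpleNamespace(get_runs=lambda: iter(runs))


fake_ws = SimpleNamespace(experiments={
    "train": _fake_experiment([
        SimpleNamespace(id="r1", status="Completed", number=1),
        SimpleNamespace(id="r2", status="Running", number=2),
    ]),
    "eval": _fake_experiment([SimpleNamespace(id="r3", status="Failed")]),
})

rows = list(get_runs_details(fake_ws, fields=("id", "status", "number")))
for row in rows:
    print(row)
```

With a real workspace, the same call would take `ws` in place of `fake_ws`; which attribute names are worth listing depends on what the run objects in your SDK version expose.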
Now it’s time to see it in action 😎
# get the default workspace
from azureml.core import Workspace
import pandas as pd

ws = Workspace.from_config()

# generate the job dashboard and inspect
runs = get_runs_summary(ws)
summary_df = pd.DataFrame(runs)
summary_df.head()
# count jobs by status
summary_df.groupby('run_status').size()
Use the dashboard to manage experiments automatically. For example, to kill running jobs:
from azureml.core import Experiment, Run

# cancel every run that is still marked as running
running = summary_df.loc[summary_df.run_status == 'Running', ['job_name', 'run_id']]
for exp_name, run_id in running.values:
    exp = Experiment(ws, exp_name)
    run = Run(exp, run_id)
    run.cancel()
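Cancelling jobs in bulk is hard to undo, so it may help to preview the selection first and to keep going when a single cancellation fails. The sketch below is one way to wrap the loop, not the method from the notebook: `cancel_running`, its `get_run` callback and the `FakeRun` class are hypothetical names introduced for illustration, and the fake runs replace Azure objects so the example runs offline.

```python
from collections import namedtuple

Record = namedtuple("Run_Description", ["job_name", "run_id", "run_status"])


def cancel_running(records, get_run, dry_run=True):
    """Cancel every record whose status is 'Running'.

    get_run is a callable (job_name, run_id) -> object with a .cancel() method.
    Returns a list of (job_name, run_id, outcome) tuples.
    """
    outcomes = []
    for rec in records:
        if rec.run_status != "Running":
            continue
        if dry_run:
            # Only report what would happen, touch nothing
            outcomes.append((rec.job_name, rec.run_id, "would cancel"))
            continue
        try:
            get_run(rec.job_name, rec.run_id).cancel()
            outcomes.append((rec.job_name, rec.run_id, "cancelled"))
        except Exception as err:
            # One failure should not stop the remaining cancellations
            outcomes.append((rec.job_name, rec.run_id, f"failed: {err}"))
    return outcomes


# Offline demo with stand-in runs (NOT Azure classes)
cancelled_ids = []


class FakeRun:
    def __init__(self, run_id):
        self.run_id = run_id

    def cancel(self):
        cancelled_ids.append(self.run_id)


records = [
    Record("train", "r1", "Completed"),
    Record("train", "r2", "Running"),
    Record("eval", "r3", "Running"),
]

preview = cancel_running(records, lambda name, rid: FakeRun(rid))
result = cancel_running(records, lambda name, rid: FakeRun(rid), dry_run=False)
print(preview)
print(result)
```

Against a real workspace, the callback could be `lambda name, rid: Run(Experiment(ws, name), rid)`, and the records could come straight from `get_runs_summary(ws)`.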
Check the Jupyter notebook in my repository for a one-click demo.