Azure Data Scientist Associate (DP-100)

I recently renewed my Azure Data Scientist Associate certification, and it seemed like a good time to review my thoughts on the certification and the platform.

The subtitle of the exam is Designing and Implementing a Data Science Solution on Azure.

Azure Data Scientist Associate Badge

Since I originally took the exam, the content has been updated to cover the new Azure AI Foundry, and the learning material now includes some great guides on working with generative AI solutions.

The exam and materials don’t focus on the algorithms, libraries, or decision-making frameworks you might use while performing data science. Instead, they emphasize how the platform supports these activities within a team and across the broader project lifecycle. It’s a subtle but important distinction, and one that aligns with Microsoft’s intent: to teach you the tools and processes of the Azure ML platform, rather than assess your knowledge of frameworks like scikit-learn or PyTorch. While this approach might initially feel less accessible to early-career or recently graduated data scientists, I believe it provides immense value to them. It introduces the workflows and tooling that enable data science to thrive within businesses and organizations—an aspect that may have been missing from their previous experiences.

Preparation tips

As usual, the learning material on the Microsoft Learn site was the basis of my learning approach. I really like the MS learning paths: they are well put together, text based (I prefer this to video), and now include hands-on lab sessions. You can find the full learning path here.

Microsoft Learn exam readiness logo

Since I first took the exam, Microsoft has added practice exams to the learning site. These are very useful for preparation: you can track your progress, get familiar with the question style, and find areas of weakness. The DP-100 practice exam is available here. You can either check the correctness of your answers as you go or run through the questions like a mock exam and get your score at the end; it’s really helpful to have this choice to suit your preferences at different stages of preparation.

The preparation videos provide a good overview of the content and can be used to understand the big picture as well as to aid revision. There are shortish videos for each of the exam sections; see here.

If you can, I’d strongly recommend using your own workspace to practice what you are learning.

The Azure Machine Learning Workspace

The learning path is broken into sections that take you through the lifecycle of designing and implementing a machine learning solution using the Azure Machine Learning Workspace.

The workspace brings together all the components for the lifecycle of a solution: prototyping and exploring with AutoML, writing Python code in notebooks, making data available in the workspace, and deploying a solution. Interaction is via the browser or the Python SDK.
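As a rough illustration, here’s a minimal sketch of connecting to a workspace with the Python SDK v2 (the azure-ai-ml package); the subscription, resource group, and workspace names are placeholders you’d replace with your own.

```python
# Minimal sketch: connect to an Azure ML workspace with the SDK v2.
# Subscription, resource group, and workspace names are placeholders.
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)

# List registered models to confirm the connection works
for model in ml_client.models.list():
    print(model.name)
```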

Azure Machine Learning Workspace Logo

The Machine Learning workspace is brimming with capabilities. Some that are relevant for the exam are AutoML, sweep jobs for hyperparameter tuning, pipelines and jobs, the responsible AI dashboard, and model deployment (via MLflow) to both online and batch endpoints.
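To give a flavour of the hyperparameter tuning side, below is a rough sketch of a sweep job with the SDK v2. The training script, environment, compute cluster, and metric name are placeholder assumptions; the pattern is to define a command job and then convert it into a sweep over a search space.

```python
# Rough sketch of a hyperparameter sweep job (SDK v2).
# Script, environment, compute, and metric names are placeholders.
from azure.ai.ml import command
from azure.ai.ml.sweep import Choice, Uniform

# Base training job whose inputs become the search space
job = command(
    code="./src",
    command="python train.py --learning_rate ${{inputs.learning_rate}} --n_estimators ${{inputs.n_estimators}}",
    inputs={"learning_rate": 0.01, "n_estimators": 100},
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="cpu-cluster",
)

# Replace the fixed inputs with distributions and convert to a sweep job
job_for_sweep = job(
    learning_rate=Uniform(min_value=0.001, max_value=0.1),
    n_estimators=Choice(values=[50, 100, 200]),
)
sweep_job = job_for_sweep.sweep(
    sampling_algorithm="random",
    primary_metric="accuracy",  # metric the training script logs via MLflow
    goal="Maximize",
)
sweep_job.set_limits(max_total_trials=20, max_concurrent_trials=4)

# Submit using the ml_client from the earlier sketch
returned_job = ml_client.jobs.create_or_update(sweep_job)
```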

MLflow

MLflow logo

MLflow deserves a whole post, but in summary it includes capabilities for experiment tracking, a model registry, and model serving. It’s an open-source project from the people who created Apache Spark and Databricks. It is available as a Python library and as a managed service within Databricks, Azure ML, and Amazon SageMaker. The experiment tracking alone is a valuable addition to any workflow. I remember manually tracking hyperparameters as an undergraduate student, and let’s just say it’s nice to have this automated! Having this capability tied into the next stages of the process gives it even more value.
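As a taste of the tracking workflow, here’s a minimal sketch using plain MLflow with scikit-learn. Run locally it writes to a local mlruns folder; inside an Azure ML job the tracking URI is already pointed at the workspace, so the same code logs to your experiment there.

```python
# Minimal MLflow experiment tracking sketch with scikit-learn.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), random_state=42
)

with mlflow.start_run():
    n_estimators = 100
    mlflow.log_param("n_estimators", n_estimators)

    model = RandomForestClassifier(n_estimators=n_estimators).fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_metric("accuracy", accuracy)

    # Log the model artifact so it can be registered and deployed later
    mlflow.sklearn.log_model(model, artifact_path="model")
```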

For the exam you’ll need to know about experiment tracking, reviewing metrics to evaluate models, tracking training jobs, registering a model, and deploying a model.
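As an illustration of the last two of those steps, here’s a sketch of registering an MLflow model from a completed job and deploying it to a managed online endpoint; the job name, model name, endpoint name, and instance type are all placeholders.

```python
# Sketch: register an MLflow model from a job's output and deploy it online.
# Job name, model name, endpoint name, and instance type are placeholders.
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Model, ManagedOnlineEndpoint, ManagedOnlineDeployment

# Register the model logged by the training job (ml_client as in the earlier sketch)
model = ml_client.models.create_or_update(
    Model(
        path="azureml://jobs/<job-name>/outputs/artifacts/paths/model/",
        type=AssetTypes.MLFLOW_MODEL,
        name="my-classifier",
    )
)

# Create an endpoint and a deployment; MLflow models don't need a scoring script
endpoint = ManagedOnlineEndpoint(name="my-endpoint", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="my-endpoint",
    model=model,
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()
```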

Azure AI Foundry

AI Foundry was not a product when I first took this exam. It aims to support AI workflows rather than machine learning and data science. The distinction is that AI work builds on existing models: you can search for and select different models and try them in the model playground. AI Foundry enables you to build on and customise models, and then supports you through deployment with evaluation and monitoring.

Azure AI Foundry graphic of top-level capabilities

The model catalog contains a range of models (versions of Phi-4, GPT, DeepSeek, Mistral, Stable Diffusion, and others) covering different skills, including chat completions, image to image, text to image, audio generation, image feature extraction, and more. It serves as an easy entry point to play around and get hands-on with the possibilities of using LLMs.
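As a hint of what that looks like in code, here’s a sketch of calling a chat model deployed from the catalog using the azure-ai-inference package; the endpoint URL, key, and model name are placeholders for whatever you deploy.

```python
# Sketch: call a chat model deployed from the model catalog.
# Endpoint URL, key, and model name are placeholders.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<your-resource>.services.ai.azure.com/models",
    credential=AzureKeyCredential("<your-key>"),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarise what a sweep job does in Azure ML."),
    ],
    model="Phi-4",  # one of the catalog models mentioned above
)
print(response.choices[0].message.content)
```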

Although many capabilities seem to be duplicated from ML Studio, I think having a separate application to manage LLM workflows is justified. The options are all specific to LLM workflows, removing the clutter of unused options if you are only interested in LLMs. You can access it at https://ai.azure.com/, although you’ll need to set things up in the Azure portal before you can access any resources.

Some thoughts

MLflow is a great addition and brings a huge amount of interesting capability to the Azure ML platform. If you work in this space, it is well worth learning. MLflow is available as a Python library you can use locally, and it is also integrated into many other platforms, so if you work on SageMaker or Databricks it will take no time at all to transfer the knowledge across.

Local compute targets are a nice touch, enabling the benefits of the platform from a governance and teamwork perspective without forcing you to use a cloud VM or cluster to run your code. In fact, there are many compute options available that you can choose to suit your team, project, or workflow; see the sketch below.
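As an example of one of those options, here’s a sketch of creating a small CPU cluster that scales down to zero nodes when idle; the cluster name and VM size are illustrative.

```python
# Sketch: create a CPU cluster compute target that scales to zero when idle.
# Cluster name and VM size are placeholders.
from azure.ai.ml.entities import AmlCompute

cluster = AmlCompute(
    name="cpu-cluster",
    size="STANDARD_DS3_V2",
    min_instances=0,  # scale to zero when no jobs are running
    max_instances=2,
    idle_time_before_scale_down=120,
)
# ml_client as in the earlier sketch
ml_client.compute.begin_create_or_update(cluster).result()
```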

The exam has been updated to include AI Foundry, and I suppose that data scientists will increasingly be required to understand working with LLMs (although AI engineers may not be required to know the full data science workflow).

Obviously, the knowledge required to pass the exam is useful for data scientists. Recently graduated or newly qualified data scientists can benefit particularly, as it introduces working as a team and taking data science projects through to products. Data architects will also benefit from understanding the current capabilities and the workflows that can be supported.

Collected interesting links

A few links that I found helpful along the way.

The hands-on exercises for the training are all collected here: mslearn-azure-ml (microsoftlearning.github.io)

The Cloud Adoption Framework now includes best practices for data science projects in Azure: Best practices for data science projects with cloud-scale analytics in Azure – Cloud Adoption Framework | Microsoft Learn

Good luck!

If you are reading this and are thinking of taking the exam, best of luck to you! If this overview has helped you decide to take the exam or prepare for it, drop me a line on LinkedIn or leave a comment here.
