How to prepare for the Exam DP-200: Implementing an Azure Data Solution

This article shows how to prepare for one of the important Microsoft Azure certification exams, DP-200:
Implementing an Azure Data Solution.

Exam Overview

The Implementing an Azure Data Solution exam measures your intermediate-level knowledge in three main areas:

  • How to implement data storage solutions, which accounts for up to 45% of the exam questions
  • How to manage and develop data processing solutions, which accounts for up to 30% of the exam questions
  • How to monitor and optimize data solutions, which accounts for up to 35% of the exam questions

There are no official prerequisites for this exam. It is recommended, but not mandatory, to take the Microsoft Azure Fundamentals (AZ-900) exam if you are very new to the Microsoft Azure world, and
the Microsoft Azure Data Fundamentals (DP-900) exam if you are new to the Microsoft Azure data platform.

You can easily schedule the exam from the Implementing an Azure Data Solution certificate page. You need
to pass both the Implementing an Azure Data Solution (DP-200) and the Designing an Azure Data Solution (DP-201)
certificate exams in order to be certified as an Azure Data Engineer Associate. For more information about Microsoft Azure certificates, check
"It is time to specify your Microsoft Certifications path".

Certificate Candidate

Microsoft Azure data engineers are responsible for all data-related design and implementation tasks. These include
provisioning the proper data storage service, ingesting streaming and batch data using a suitable mechanism,
transforming data between different sources and storage types, implementing security requirements and data retention
policies that meet the business requirements, and identifying and fixing performance bottlenecks during the
implementation and running phases.

The Implementing an Azure Data Solution certificate exam is designed for Microsoft Azure data engineers, data
professionals, data architects, and business intelligence professionals who will participate in the implementation
phase of the data-related tasks for any solution that is built using relational and non-relational Azure
data services. These Microsoft Azure data services include Azure Cosmos DB, Azure SQL Database, Azure Synapse
Analytics, Azure Data Lake Storage, Azure Data Factory, Azure Stream Analytics, Azure Databricks, and Azure Blob
storage.

Study Guideline

To prepare for the Implementing an Azure Data Solution exam, you can go through the 7-module Implementing an Azure Data Solution learning path, a self-study
course provided by Microsoft that helps you gain the basic knowledge required to pass the exam.

If you prefer listening to reading, you can subscribe to an online course on a platform such as
Udemy or Pluralsight, or take any other training offered by training sites and centers.

Keep in mind that this exam covers a large number of subjects. In order to pass the exam, you need to
have enough knowledge of each subject, without going very deep into any of them. Personally, I prefer to be fully
prepared for certificate exams and to gain all the required knowledge, so that I can provide training in
the courses I am certified in and apply these skills at my customers’ sites. So, I will list all the skills measured in
this exam along with the official resources for studying each subject.

Implement Data Storage Solutions

  • Implement non-relational data stores

  • Implement relational data stores

  • Manage data security

Manage and Develop Data Processing Solutions

  • Develop batch processing solutions

  • Develop streaming solutions

Monitor and Optimize Data Solutions

  • Monitor data storage

  • Monitor data processing

  • Optimize Azure data solutions

Practicing

As with any exam, after completing the study material, you need to make sure that you are well prepared. You
can search the internet for free practice tests, such as those on the ExamTopics site,
but only after making sure that you have finished studying the official course outline. To become
familiar with the format of Microsoft exams, check the Microsoft certificates Exam Formats and Questions Types.

In this article, I will provide some review questions that I usually use to measure my trainees' general skills and to make sure that they are ready for the Implementing an Azure Data Solution exam. Keep in mind that most of the exam questions are scenario-based, in which you are asked to apply what you have learned in these areas.

  1. The type of data that can have its own schema defined at query time:

    Unstructured data
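To make the idea of a schema defined at query time more concrete, here is a minimal sketch in plain Python. The sample lines and the `query` helper are hypothetical and only illustrate the principle: the raw text is stored untouched, and each reader decides which columns and types to apply.

```python
# Raw lines are stored as-is; nothing validates them when they are written.
RAW_LINES = [
    "2020-06-01 store-17 19.99",
    "2020-06-02 store-03 5.00",
]

def query(raw_lines, schema):
    """Apply a schema, given as (column name, converter) pairs, while reading."""
    rows = []
    for line in raw_lines:
        parts = line.split()
        # zip stops at the shorter sequence, so a reader may ask for fewer columns.
        rows.append({name: convert(value)
                     for (name, convert), value in zip(schema, parts)})
    return rows

# Two readers interpret the same raw data with different schemas.
sales = query(RAW_LINES, [("day", str), ("store", str), ("amount", float)])
days_only = query(RAW_LINES, [("day", str)])
```

The same stored lines serve both readers, which is the property the question refers to.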

  2. The process of duplicating content for redundancy in order to meet the customer's SLA in Microsoft Azure:

    High Availability

  3. The Microsoft Azure data platform technology that is a globally distributed, multi-model database that can offer sub-second query performance and low latency:

    Microsoft Azure Cosmos DB

  4. The cheapest data store that can be used when you want to store your data without the need to query it directly:

    Azure Storage Account

  5. The Microsoft Azure Service that can be used to store documentation about a data source:

    Azure Data Catalog

  6. The Microsoft Azure Data Platform technology that is used to process data in an ELT framework:

    Azure Data Factory

  7. Working as a data engineer in a startup with limited funding, why would you prefer to use the Microsoft Azure
    data storage instead of purchasing on-premises storage?

    The Microsoft Azure pay-as-you-go billing model provides you with the ability to avoid buying expensive hardware that you may not use continuously

  8. Assume that you are requested to store two video files as blobs. The first video file is business-critical and
    requires a replication policy that creates multiple copies across geographically diverse datacenters. The second
    video file is non-critical, and a local replication policy is sufficient. How should you store these two video
    files?

    The two video files should be stored in separate storage accounts

  9. When creating a new storage account, the name of a storage account should be:

    Globally unique
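Besides being globally unique, a storage account name must be 3 to 24 characters long and contain only lowercase letters and digits. The format part can be checked locally, as in the sketch below; the function name is hypothetical, and global uniqueness can only be confirmed by Azure itself.

```python
import re

# 3 to 24 characters, lowercase letters and digits only.
_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")

def is_valid_storage_account_name(name: str) -> bool:
    """Check the naming format only; uniqueness must be verified against Azure."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, `contosodata01` passes this check, while `Contoso`, `my-account`, and `ab` do not.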

  10. When creating an Azure Data Lake Storage Gen2 account, you need to configure it to process
    analytical data workloads with the best performance. To achieve that, you should enable a specific option when
    creating that account:

    From the Advanced tab, set the Hierarchical Namespace to enabled

  11. The tool that can be used to upload a single file to a Data Lake Storage Account (Gen 2) without the need for any installation or configuration:

    Microsoft Azure Portal

  12. The tool that can be used to perform a movement of hundreds of files from Amazon S3 to Azure Data Lake Storage:

    Azure Data Factory

  13. The Apache technology that is encapsulated in Microsoft Azure Databricks:

    Apache Spark

  14. The Notebook format that is used in Databricks:

    DBC

  15. The browsers recommended for best use with Databricks Notebook:

    Chrome and Firefox

  16. In order to connect the Spark cluster to the Azure Blob, we should:

    Mount it
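As a rough sketch of what mounting involves, the helper below builds the arguments that a Databricks notebook would pass to `dbutils.fs.mount`. The helper name, the account and container names, and the mount point are all placeholders chosen for illustration; `dbutils` exists only inside a Databricks notebook, so the actual call is left as a comment.

```python
def blob_mount_args(account: str, container: str, account_key: str) -> dict:
    """Build keyword arguments for mounting a Blob container (hypothetical helper)."""
    host = f"{account}.blob.core.windows.net"
    return {
        "source": f"wasbs://{container}@{host}",
        "mount_point": f"/mnt/{container}",
        # Spark configuration key that carries the storage account key.
        "extra_configs": {f"fs.azure.account.key.{host}": account_key},
    }

args = blob_mount_args("contosostore", "images", "<storage-account-key>")

# Inside a Databricks notebook, where `dbutils` is predefined:
# dbutils.fs.mount(**args)
# In practice the key would come from a secret scope, for example
# dbutils.secrets.get(scope="...", key="..."), rather than a literal string.
```

Once mounted, the container can be read through the `/mnt/...` path like any other folder in the Databricks file system.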

  17. Apache Spark can connect to databases like MySQL, Hive and other data stores using:

    JDBC driver

  18. The recommended storage format to use with Spark is:

    Apache Parquet

  19. In order to ensure that there is 99.999% availability for the reading and writing of all your data that is stored in a Cosmos DB database, you should:

    Configure reads and writes of data for multi-region accounts with multi-region writes
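To get a feeling for what 99.999% availability means, the following small calculation converts an availability percentage into the downtime it allows per year. The function name is made up for this illustration.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 365) -> float:
    """Minutes of downtime permitted over the given number of days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)

five_nines = allowed_downtime_minutes(99.999)  # roughly 5.3 minutes per year
four_nines = allowed_downtime_minutes(99.99)   # roughly 52.6 minutes per year
```

Going from four nines to five nines cuts the yearly downtime budget by a factor of ten, which is why the multi-region configuration matters.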

  20. You are requested to make the data that is stored in a Table Storage account located in the West US region available globally, so you should migrate it to:

    Azure Cosmos DB Table API

  21. The Cosmos DB API that provides a traversal language that enables connections and traversals across connected data:

    Gremlin API

  22. In order to maximize the integrity of data that is stored in Cosmos DB, you should use the _____ consistency level:

    Strong
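For context, Cosmos DB offers five consistency levels. The sketch below simply lists them from strongest to weakest and adds a hypothetical helper for comparing two levels; it does not call any Azure SDK.

```python
# Cosmos DB consistency levels, ordered from strongest to weakest.
CONSISTENCY_LEVELS = (
    "Strong",
    "Bounded staleness",
    "Session",            # the default level for a new account
    "Consistent prefix",
    "Eventual",
)

def is_stronger(level_a: str, level_b: str) -> bool:
    """Return True if level_a gives stronger guarantees than level_b."""
    return CONSISTENCY_LEVELS.index(level_a) < CONSISTENCY_LEVELS.index(level_b)
```

Stronger levels favor data integrity, while weaker levels trade some of it for lower latency and higher availability.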

  23. You just created a new Azure SQL Database. Who will be responsible for performing operating system and database software updates?

    The cloud provider: Microsoft Azure. Azure manages the hardware, software updates, and OS patches for you

  24. A few days after provisioning your Azure SQL database, you find that you need additional IO throughput. The
    performance model that should be used is:

    vCore

  25. The unit of compute scale that is used in Azure Synapse Analytics:

    DWU

  26. Assume that you have an Azure Synapse Analytics database that contains a dimension table named Stores,
    which holds store information. There is a total of 263 stores nationwide. Store information is retrieved in
    more than half of the queries that are issued against this database. These queries include staff information per
    store, sales information per store, and finance information. You want to improve the performance of these
    queries by configuring the table geometry of the Stores table. The best table geometry to select for the Stores
    table is:

    Replicated table

  27. The default port for connecting to an enterprise data warehouse in Azure Synapse Analytics is:

    TCP port 1433

  28. You have a data warehouse with a database named Contoso. Within the database is a table named
    DimSuppliers. The suppliers’ information is stored in a single text file named Suppliers.txt that is 1200 MB in
    size. It is currently stored in a container in Azure Blob storage. Your Azure Synapse Analytics is configured
    as Gen2 DW30000c. In order to maximize the performance of the data load, you should:

    Split the text file into 60 files of 20MB each.
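The number 60 matches the 60 distributions that a Synapse dedicated SQL pool spreads its data across, so 60 input files let the load run fully in parallel, and 1200 MB divided by 60 gives 20 MB per file. A minimal sketch of such a split is shown below; it works on lines held in memory, and the function name is hypothetical.

```python
def split_lines(lines, parts=60):
    """Distribute lines round-robin into `parts` chunks of nearly equal size."""
    chunks = [[] for _ in range(parts)]
    for index, line in enumerate(lines):
        chunks[index % parts].append(line)
    return chunks

# The arithmetic behind the answer: 1200 MB spread over 60 files.
piece_size_mb = 1200 / 60

# Example with 600 dummy rows: every chunk receives 10 rows.
chunks = split_lines([f"row{i}" for i in range(600)])
```

In a real load, each chunk would be written to its own file in the blob container before running the PolyBase load.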

  29. You have a data warehouse with a database named Contoso. You have created a master key, followed by a
    database scoped credential. After that, in order to copy data using PolyBase, you should create:

    An external data source

  30. The Microsoft Azure technology that provides an ingestion point for data streaming in an event processing
    solution that uses static data as a source, is:

    Azure Blob storage

  31. True or false: an application that publishes messages to Azure Event Hubs very frequently will get the best performance using the
    Advanced Message Queuing Protocol (AMQP), as it establishes a persistent socket.

    True

  32. By default, the number of partitions that a new Event Hub will have is:

    4
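Partitions matter because events that share a partition key are routed to the same partition, which preserves their order. The sketch below illustrates that idea with a generic hash. It is only an assumption-laden illustration: Event Hubs uses its own internal hashing, and the function name is made up.

```python
import hashlib

def pick_partition(partition_key: str, partition_count: int = 4) -> int:
    """Map a partition key to a partition number in a stable way.

    Illustrative only: this is not the hash function Event Hubs uses.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

# The same key always maps to the same partition.
first = pick_partition("device-42")
second = pick_partition("device-42")
```

With more partitions, more consumers can read in parallel, at the cost of ordering being guaranteed only within each partition.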

  33. True or false: if an Event Hub goes offline before a consumer group can process the events it holds, those events will be lost.

    False

  34. The job input that consumes data streams from applications at low latencies and high throughput:

    Event Hubs

  35. The tool that can be used to view the key health metrics of your Stream Analytics jobs is:

    Dashboards

  36. The Microsoft Azure Data Factory component that contains the transformation logic or the analysis commands of
    the Azure Data Factory’s work, is called:

    Activities

  37. In order to move data from an Azure Data Lake Gen2 store to Azure Synapse Analytics, the Azure Data Factory
    integration runtime that should be used in a data copy activity is:

    Azure IR

  38. The Mapping Data Flow transformation that is used to route data rows to different streams based on matching
    conditions is called:

    Conditional Split
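The behavior of a conditional split can be sketched in a few lines of plain Python. This version sends each row to the first stream whose condition matches and collects unmatched rows in a default stream. The function, stream names, and sample rows are hypothetical, and the code does not use Azure Data Factory itself.

```python
def conditional_split(rows, conditions, default="default"):
    """Route each row to the first stream whose predicate matches."""
    streams = {name: [] for name, _ in conditions}
    streams[default] = []
    for row in rows:
        for name, predicate in conditions:
            if predicate(row):
                streams[name].append(row)
                break
        else:
            # No condition matched, so the row goes to the default stream.
            streams[default].append(row)
    return streams

rows = [{"store": 1, "amount": 500}, {"store": 2, "amount": 50}, {"store": 3, "amount": -5}]
streams = conditional_split(rows, [
    ("large", lambda r: r["amount"] >= 100),
    ("small", lambda r: r["amount"] >= 0),
])
```

Here the first row lands in `large`, the second in `small`, and the negative amount falls through to `default`.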

  39. The transformation that is used to load data into a data store or compute resource, is called:

    Sink

  40. The cloud service category that requires the greatest security effort on your part, is:

    Infrastructure as a service (IaaS)

  41. The best way to protect sensitive customer data is to:

    Encrypt data both as it sits in your database and as it travels over the network

  42. The Microsoft Azure service that helps in storing certificates to centrally manage them for your services:

    Azure Key Vault

  43. Your company is storing thousands of images in an Azure Blob storage account. A third-party web application
    needs to have access to these images. The best way to provide secure access for the third-party web
    application is to:

    Use a Shared Access Signature to give the web application access.
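The general idea behind a shared access signature is a token that is signed with the account key and carries its own permissions and expiry time, so the key itself never has to be shared. The sketch below shows that idea with a generic HMAC. It is a simplified illustration and deliberately does not reproduce the real Azure string-to-sign format or token layout; all names are hypothetical.

```python
import base64
import hashlib
import hmac

def sign(account_key: bytes, resource: str, permissions: str, expiry: str) -> str:
    """Sign the grant with the secret key (simplified, not Azure's real format)."""
    message = "\n".join([resource, permissions, expiry]).encode("utf-8")
    digest = hmac.new(account_key, message, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")

def is_valid(account_key, resource, permissions, expiry, signature, now):
    """Accept the grant only if the signature matches and it has not expired.

    `expiry` and `now` are UTC timestamps in the same ISO 8601 format,
    so comparing them as strings gives the chronological order.
    """
    expected = sign(account_key, resource, permissions, expiry)
    return hmac.compare_digest(expected, signature) and now <= expiry

key = b"demo-account-key"
token = sign(key, "/images/logo.png", "r", "2030-01-01T00:00:00Z")
```

Changing the permissions or the expiry after signing invalidates the token, which is what makes this kind of delegated access safe to hand to a third party.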

  44. The best method to gain insight into any unusual activity occurring in your storage account with minimal
    configuration is:

    Automatic Threat Detection

  45. The most efficient way to secure a database to allow only access from a VNet while restricting access from the
    internet is creating:

    A server-level virtual network rule

  46. If a mask is applied to a column in your database that holds a user’s email address, JohnCal@contoso.com, then the database administrator will see the email address as:

    JohnCal@contoso.com with no change
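The reason is that dynamic data masking only changes what non-privileged users see, while administrators always read the stored value. The sketch below imitates the shape of the built-in email mask, which exposes the first letter and replaces the rest with `XXX@XXXX.com`. The function names are hypothetical, and the code is plain Python rather than a database feature.

```python
def mask_email(email: str) -> str:
    """Imitate the email mask: keep the first letter, hide everything else."""
    return f"{email[0]}XXX@XXXX.com"

def read_email(email: str, can_unmask: bool) -> str:
    """Privileged readers get the stored value; everyone else gets the mask."""
    return email if can_unmask else mask_email(email)

as_admin = read_email("JohnCal@contoso.com", can_unmask=True)
as_regular_user = read_email("JohnCal@contoso.com", can_unmask=False)
```

The stored data is never modified; only the query result differs depending on who is asking.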

  47. True or false: the “Encrypted communication” option is turned on automatically when connecting to an Azure SQL Database or
    Azure Synapse Analytics.

    True

  48. What are the steps that you should follow to set the encryption for the data stored in Stream Analytics?

    It cannot be done as Stream Analytics does not store data

  49. In order to respond to critical conditions and take automated corrective actions using Azure Monitor,
    you should use:

    Microsoft Azure Monitor Alerts

  50. You are receiving an error message in Azure Synapse Analytics and want to view information about the service
    and get help solving the problem. What can you use to quickly check the availability of the service?

    Diagnose and solve problems

  51. While you are performing a daily data load to SQL Data Warehouse using PolyBase with CTAS statements, users
    complain that the reports are running slowly. In order to improve the performance of the report queries, you
    should:

    Create table statistics and keep them up to date

  52. The maximum number of activities per pipeline in Azure Data Factory is:

    40

  53. While monitoring the output of a Stream Analytics job, the monitor reports “Runtime
    Errors > 0”. The issue is mainly related to:

    The job can receive the data but is generating errors while processing the query.

  54. The Recovery Point Objective for Azure Synapse Analytics is:

    8 hours

  55. Azure Cosmos DB automatically takes a backup of your data every:

    4 hours

Good Luck.

Ahmad Yaseen