Microsoft DP-900 Azure Data Fundamentals Exam Dumps and Practice Test Questions, Set 7 (Questions 121-140)


Question 121:

A company is designing a modern analytics solution in Azure where massive volumes of raw semi-structured and structured data must be ingested into Azure Data Lake Storage. They need a transformation engine that supports both batch and streaming pipelines, allows Python, SQL, and Scala development, supports Delta Lake ACID transactions, and enables data teams to collaborate with interactive notebooks and automated job scheduling. Which Azure service best fits this scenario?

Options:

A) Azure Data Factory
B) Azure Stream Analytics
C) Azure Databricks
D) Azure SQL Managed Instance

Answer: C

Explanation:

Azure Databricks is the best answer because the scenario describes a combined need for batch and streaming processing, multi-language development, Delta Lake support, and collaborative notebook environments. This combination is unique to Databricks, making it the only service capable of satisfying all requirements effectively. The company wants to ingest semi-structured and structured data into Azure Data Lake Storage, which is typically done at scale. Databricks integrates natively with Azure Data Lake, allowing optimized read and write operations on large datasets.

Azure Data Factory enables orchestration but not the type of development environment described in the scenario. It does not offer notebooks, multi-language Spark programming, or Delta Lake ACID transactions. Although Data Factory has mapping data flows, these provide a no-code transformation interface rather than an advanced notebook-based development ecosystem. For large-scale engineering and machine learning tasks, Data Factory alone is not sufficient.

Azure Stream Analytics is strictly a real-time event processing platform. It lacks the ability to run batch processing, does not provide notebooks, and does not integrate directly with Delta Lake in a way that supports ACID transactions. It is excellent for streaming analytics but entirely unsuitable as a unified compute engine for batch and ML workloads.

Azure SQL Managed Instance is a relational database service that provides compatibility with SQL Server. It does not support real-time ingestion or lakehouse architecture requirements such as scalable transformations, notebooks, or multi-language capability. It is not designed to operate on raw data stored in Azure Data Lake.

Databricks, built on optimized Spark clusters, supports Python, Scala, SQL, and R, making it ideal for mixed-skill teams. It also includes MLflow for experiment tracking, Delta Live Tables for reliable pipeline development, and cluster auto-scaling to handle both large and small workloads. Delta Lake’s ACID transactions ensure reliability in data pipelines, preventing corruption across bronze, silver, and gold layers. Notebooks allow collaboration between engineers and data scientists, while automated jobs enable scheduled ETL and machine learning workflow execution.
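The unified batch-plus-streaming pattern described here can be pictured with a short PySpark sketch. This is illustrative only, assuming a Databricks cluster with Delta Lake available; the storage paths and column names are placeholders, not part of the scenario:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

# Batch ETL: land raw JSON from the lake into a bronze Delta table.
raw = spark.read.json("abfss://raw@<account>.dfs.core.windows.net/events/")
raw.write.format("delta").mode("append").save("/mnt/lake/bronze/events")

# Streaming ETL over the same storage: a Delta table is also a streaming
# source, so batch and streaming pipelines can share one copy of the data.
(
    spark.readStream.format("delta").load("/mnt/lake/bronze/events")
    .filter(col("status") == "valid")
    .writeStream.format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/silver_events")
    .start("/mnt/lake/silver/events")
)
```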

Because the scenario requires a powerful, unified analytics engine with broad functionality, the only suitable service is Azure Databricks. Thus, option C is correct.

Question 122:

An enterprise is building a global operational system that needs a highly available, multi-region database with guaranteed low-latency reads and writes. They want automatic failover, multi-master capabilities, and the ability to choose between five consistency models to balance performance and correctness depending on the workload. Which Azure database service should they select?

Options:

A) Azure SQL Database
B) Azure Cosmos DB
C) Azure PostgreSQL Flexible Server
D) Azure SQL Managed Instance

Answer: B

Explanation:

Azure Cosmos DB is the only database service in Azure designed specifically for global distribution with multi-master writes and tunable consistency levels. The scenario requires multi-region availability, guaranteed low-latency operations, automatic failover, and a choice between different consistency models. Cosmos DB uniquely offers all these capabilities. It is engineered to serve globally distributed applications with sub-10-millisecond latencies at the 99th percentile for both reads and writes.

Azure SQL Database can replicate data globally through geo-replication, but only one region accepts writes; it does not support multi-master writes. It also does not offer five selectable consistency models, providing instead the strong, transactional consistency of a traditional relational engine. Because all writes funnel through a single primary region, it cannot guarantee low-latency writes across multiple continents.

Azure PostgreSQL Flexible Server does not offer global multi-master replication. While it supports high availability, it does not provide multiple consistency levels or automatic failover across regions.

Azure SQL Managed Instance is similar to SQL Database in that it provides geo-replication but does not support worldwide multi-master writes or tunable consistency. It also cannot match Cosmos DB’s low-latency global operations.

Cosmos DB supports five consistency models: strong, bounded staleness, session, consistent prefix, and eventual. These help enterprises fine-tune performance and correctness guarantees. For example, strong consistency ensures linearizability but reduces write performance across distant regions, whereas session consistency balances correctness with high performance for user-specific workloads.
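As a concrete illustration, the consistency level can be requested per client connection with the azure-cosmos Python SDK. This is a minimal sketch; the endpoint, key, database, and container names are placeholders:

```python
from azure.cosmos import CosmosClient

# The account's default consistency is configured on the account itself; a
# client may request an equal or weaker level, as done here with Session.
client = CosmosClient(
    url="https://<account>.documents.azure.com:443/",
    credential="<primary-key>",
    consistency_level="Session",  # Strong | BoundedStaleness | Session | ConsistentPrefix | Eventual
)
container = client.get_database_client("ops").get_container_client("orders")
for item in container.query_items(
    query="SELECT TOP 5 * FROM c WHERE c.region = 'EU'",
    enable_cross_partition_query=True,
):
    print(item["id"])
```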

Cosmos DB’s global distribution and automatic failover ensure operational resilience. When a region becomes unavailable, another region automatically assumes responsibility without downtime. Multi-master capability allows users to write to any region, reducing write latency dramatically. Partitioning ensures horizontal scalability across billions of documents or records.

Because no other Azure service meets the requirement for global distribution, multi-master writes, low-latency operations, and tunable consistency levels, Cosmos DB is the only correct answer. Thus, option B is correct.

Question 123:

A business intelligence team wants to query files stored in Azure Data Lake Storage using standard T-SQL without provisioning any dedicated clusters. They need an on-demand pay-per-query model and the ability to query CSV, JSON, and Parquet files directly. Which Azure service fulfills these requirements?

Options:

A) Azure Databricks
B) Azure SQL Managed Instance
C) Azure Synapse Serverless SQL Pool
D) Azure Stream Analytics

Answer: C

Explanation:

Azure Synapse Serverless SQL Pool is the correct service because it allows users to run T-SQL queries directly over data lake files in a pay-per-query model. This service enables analysts to perform ad hoc analysis on CSV, JSON, and Parquet files without needing to provision, configure, or manage dedicated compute clusters. It is explicitly designed to give analysts the ability to explore and analyze data stored in Azure Data Lake Storage.

Azure Databricks provides powerful Spark-based querying capabilities but requires provisioning of compute clusters and does not offer pay-per-query pricing. Its Databricks SQL warehouses are provisioned compute resources and do not function like Synapse Serverless SQL Pool's on-demand model over lake files.

Azure SQL Managed Instance cannot read raw lake files directly. Data must be ingested into relational tables first, making it unsuitable for this scenario.

Azure Stream Analytics is designed for streaming event processing and cannot query files stored in a data lake.

Synapse Serverless SQL Pool supports T-SQL constructs such as OPENROWSET, enabling direct querying of structured and semi-structured files. External tables can also be created to reuse query definitions. Because the company wants SQL-based analysis without cluster provisioning, Serverless SQL Pool is the exact match.
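For illustration, here is one way the team might issue such a query from Python via pyodbc; the serverless endpoint, credentials, and lake path are placeholders, while the embedded T-SQL shows the OPENROWSET pattern itself:

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<workspace>-ondemand.sql.azuresynapse.net;"
    "DATABASE=master;UID=<user>;PWD=<password>"
)
# Pay-per-query: no cluster to manage; Synapse scans the Parquet files directly.
sql = """
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://<account>.dfs.core.windows.net/data/sales/*.parquet',
    FORMAT = 'PARQUET'
) AS rows;
"""
for row in conn.cursor().execute(sql):
    print(row)
```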

Thus, option C is correct.

Question 124:

A transportation analytics team needs to run large-scale time-series queries, complex pattern matching, full-text search, and anomaly detection on billions of telemetry events. They also require extremely fast ingestion and a query language optimized for log analysis. Which Azure service should they choose?

Options:

A) Azure Synapse Dedicated SQL Pool
B) Azure SQL Database
C) Azure Data Explorer
D) Azure PostgreSQL Flexible Server

Answer: C

Explanation:

Azure Data Explorer is the best service for handling large-scale telemetry and log analytics workloads. The transportation analytics team needs capabilities such as time-series analysis, anomaly detection, pattern matching, and full-text search. These features align perfectly with Azure Data Explorer, which is optimized for log ingestion, time-series modeling, and analytical queries over billions of events. It uses a columnar storage format with aggressive compression and indexing, enabling extremely fast query results.

Synapse Dedicated SQL Pool is a powerful data warehouse but not optimized for time-series or log analytics at the scale described. Querying billions of fine-grained events is not its intended workload. It is designed for structured relational analytics rather than telemetry.

Azure SQL Database cannot ingest streaming telemetry data at massive scale or perform millisecond-level analytics on logs. While full-text search exists, it is not optimized for the scenario described.

Azure PostgreSQL Flexible Server is a general-purpose relational database but cannot match the ingestion rate or time-series indexing capabilities needed for high-speed log analytics.

Azure Data Explorer’s Kusto Query Language is built specifically for log exploration and anomaly detection. It supports time windows, joins, patterns, predictive functions, and extremely fast aggregations. It can ingest data from Event Hub, IoT Hub, Log Analytics, or pipelines, making it perfect for transportation telemetry analysis.
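A sketch of what such a query could look like, submitted through the azure-kusto-data Python package; the cluster URL, database, table, and column names are assumptions for illustration:

```python
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
    "https://<cluster>.<region>.kusto.windows.net"
)
client = KustoClient(kcsb)

# A typical KQL time-series pattern: bucket telemetry per tower, flag anomalies.
query = """
Telemetry
| where Timestamp > ago(1h)
| make-series avgSpeed = avg(Speed) on Timestamp step 1m by TowerId
| extend anomalies = series_decompose_anomalies(avgSpeed, 2.5)
"""
for row in client.execute("TelemetryDB", query).primary_results[0]:
    print(row)
```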

Thus, option C is correct.

Question 125:

A company wants to implement centralized data governance across its entire data estate. They need automated scanning of Azure SQL, Synapse, Azure Data Lake, on-premises SQL Server, Power BI, and external sources. They also require lineage tracking, sensitivity classification, business glossary terms, and a searchable data catalog for analysts. Which Azure service provides these capabilities?

Options:

A) Azure Monitor
B) Microsoft Purview
C) Azure Security Center
D) Azure Key Vault

Answer: B

Explanation:

Microsoft Purview is the correct answer because it is Azure’s unified data governance solution designed to manage metadata, classification, lineage, and cataloging across hybrid and multicloud environments. The scenario describes a complex data estate consisting of Azure SQL, Synapse, Data Lake, on-prem SQL Server, Power BI, and external sources. Purview is the only Azure service capable of scanning all these systems automatically.

Azure Monitor tracks performance metrics and diagnostics for Azure resources but does not provide metadata scanning, lineage tracking, or cataloging.

Azure Security Center focuses on security posture management, not cataloging or data governance.

Azure Key Vault stores secrets, keys, and certificates but does not analyze data or manage metadata.

Purview’s automated scanning extracts schemas and metadata from supported sources. Its classification engine identifies sensitive fields such as credit card numbers, names, and emails. Lineage visualization shows how data flows from ingestion through transformation to reporting systems. The business glossary enables standardized definitions across the organization, helping analysts interpret datasets consistently. The data catalog offers powerful search capabilities so users can quickly find datasets relevant to their work.
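As a hedged sketch of that catalog search experience, the preview azure-purview-catalog package exposes a discovery query; the account endpoint and search keyword below are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.purview.catalog import PurviewCatalogClient

client = PurviewCatalogClient(
    endpoint="https://<purview-account>.purview.azure.com",
    credential=DefaultAzureCredential(),
)
# Search the catalog the way an analyst would from the portal's search box.
results = client.discovery.query(search_request={"keywords": "customer", "limit": 10})
for entity in results.get("value", []):
    print(entity.get("name"), entity.get("qualifiedName"))
```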

Purview integrates with Synapse pipelines, Data Factory, SQL databases, Power BI, and more, providing end-to-end governance. For enterprises with compliance requirements and large-scale data environments, Purview is essential.

Thus, option B is correct.

Question 126:

A company wants to design an end-to-end data lakehouse that processes raw JSON and CSV data into curated analytical tables. They need ACID transactions, schema enforcement, time travel, performance optimization, and the ability to run both streaming and batch ETL jobs on the same tables. Which Azure technology should they choose to manage these lakehouse tables?

Options:

A) Parquet files
B) CSV files
C) Delta Lake
D) JSON files

Answer: C

Explanation:

Delta Lake is the correct choice because it provides the essential capabilities required to build a modern data lakehouse architecture. The question describes a scenario where a company wants ACID transactions, schema enforcement, time travel, and unified batch and streaming operations. Delta Lake was designed specifically to solve the limitations of traditional data lakes while enabling scalable data engineering and analytical workflows. These core requirements match perfectly with what Delta Lake provides.

Parquet files offer efficient columnar storage and high-performance analytics, but they lack transactional integrity. Without ACID transactions, updates and deletes are unsafe, and concurrent writes can corrupt the data. Parquet files alone also do not support time travel or schema evolution.

CSV files are inefficient and lack schema enforcement, compression, and metadata optimization. They cannot support the level of reliability needed for a large-scale production data lakehouse.

JSON files can store semi-structured data, but they are not optimized for queries and do not include ACID guarantees. JSON files also create significant processing overhead because of their verbose structure, making them unsuitable for large-scale analytics beyond raw ingestion.

Delta Lake, by contrast, adds a transactional log that keeps track of all file operations. This enables atomic writes, consistent reads, and safe concurrent operations across large distributed systems. It also supports schema validation and schema evolution, ensuring that changes to incoming data do not silently break downstream pipelines. Time travel allows engineers to query historical versions of tables, making debugging and audit operations straightforward.
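Time travel is easiest to see in a short PySpark sketch; the table path and version number are illustrative:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Query the table exactly as it looked at an earlier commit
# (or use "timestampAsOf" for a point in time).
v3 = spark.read.format("delta").option("versionAsOf", 3).load("/mnt/lake/silver/events")

# The transaction log doubles as an audit trail of every write, update, and merge.
spark.sql("DESCRIBE HISTORY delta.`/mnt/lake/silver/events`").show(truncate=False)
```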

Delta Lake supports both streaming and batch workloads through optimized Spark integration. This means that the same table can be incrementally updated via streaming jobs while also being processed through batch transformations. This convergence is a key principle of the lakehouse paradigm.

Because the scenario explicitly requires ACID, schema enforcement, time travel, unified batch/streaming, and curated analytical outputs, Delta Lake is the only correct technology. Thus, option C is correct.

Question 127:

A healthcare analytics team needs to store patient telemetry data that arrives from thousands of IoT medical devices every second. They require low-latency writes, global distribution, auto-indexing, flexible schema, and multi-region failover without downtime. Which Azure database service best fulfills these operational requirements?

Options:

A) Azure SQL Database
B) Azure Synapse Dedicated SQL Pool
C) Azure Cosmos DB
D) Azure PostgreSQL Flexible Server

Answer: C

Explanation:

Azure Cosmos DB is the correct answer because it is Microsoft’s globally distributed, low-latency NoSQL database engineered for large-scale operational telemetry. Healthcare telemetry involves high-volume writes, unpredictable schema variations, and the need for consistent availability. Cosmos DB provides automatic indexing, horizontal partitioning, and global multi-region replication, making it ideal for this type of workload.

Azure SQL Database is a powerful relational engine but cannot scale write throughput to the levels needed for thousands of medical devices generating telemetry every second. It also does not support multi-master writes across regions.

Azure Synapse Dedicated SQL Pool is built for analytical workloads, not for heavy real-time operational ingest. It is optimized for OLAP queries, not continuous device telemetry streams.

Azure PostgreSQL Flexible Server is a relational database and does not provide automatic global distribution, elastic scalability for massive ingestion, or extremely low-latency multi-region operations.

Cosmos DB offers several key features that align with the healthcare team’s needs:

It supports multi-master writes, meaning devices in different regions can send telemetry without routing everything back to a single region. This reduces latency for globally distributed devices.

Cosmos DB auto-indexes all data without requiring schema or index definitions, which is essential when telemetry formats can vary over time.

With five consistency models, Cosmos DB gives developers the ability to choose between strong consistency for critical data and eventual consistency for high-throughput scenarios.

Because Cosmos DB is HIPAA eligible, it fits healthcare regulatory requirements.
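To make the flexible-schema point concrete, the sketch below writes two differently shaped telemetry documents into one container with the azure-cosmos Python SDK; the account, database, container, and field names are placeholders:

```python
from azure.cosmos import CosmosClient

container = (
    CosmosClient("https://<account>.documents.azure.com:443/", "<primary-key>")
    .get_database_client("telemetry")
    .get_container_client("readings")  # assume partition key /deviceId
)
# No schema or index definitions needed; every field is indexed automatically.
container.create_item({"id": "r1", "deviceId": "monitor-17", "heartRate": 72})
container.create_item({"id": "r2", "deviceId": "infusion-03",
                       "flowRateMlPerHour": 125, "alarms": ["occlusion"]})
```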

Given the need for global distribution, massive ingestion, and multi-region high availability with flexible schema, Cosmos DB is the only suitable option. Therefore, option C is correct.

Question 128:

A retail company wants to build scheduled ETL pipelines that extract data from SaaS systems such as Salesforce, Dynamics, and Google Analytics. They need a no-code visual transformation interface, integration runtime for on-prem data, monitoring, and the ability to orchestrate complex workflows. Which Azure service should they use?

Options:

A) Azure Data Factory
B) Azure Databricks
C) Azure Logic Apps
D) Azure SQL Managed Instance

Answer: A

Explanation:

Azure Data Factory is the correct choice because it is Microsoft’s primary ETL and data integration service capable of orchestrating data pipelines across cloud and on-prem environments. The scenario describes a need for extracting data from SaaS sources, performing no-code transformations, scheduling, monitoring, and orchestration capabilities. ADF provides all of these through connectors, mapping data flows, integration runtime, and graphic pipeline orchestration.

Azure Databricks is excellent for big data processing, Spark workloads, and advanced analytics, but it does not offer built-in no-code interfaces or broad SaaS connectors needed for enterprise ETL. Databricks jobs complement ADF but do not replace its orchestration role.

Azure Logic Apps can orchestrate workflows but is designed primarily for event-driven business automation, not large-scale ETL. It cannot efficiently process or transform large datasets or integrate across structured data systems at scale.

Azure SQL Managed Instance cannot orchestrate ETL workflows or connect to SaaS systems.

ADF’s mapping data flows allow drag-and-drop transformations on Spark clusters without writing code. It supports hundreds of connectors, making it ideal for pulling data from platforms like Salesforce, Dynamics, Marketo, NetSuite, and more. Its monitoring dashboard provides detailed execution logs, retries, alerting, and dependency tracking. Integration runtime enables secure connectivity to on-prem databases.
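Pipelines are normally triggered and watched from the ADF studio, but the same operations are scriptable. Here is a minimal sketch with the azure-mgmt-datafactory package, where the subscription, resource group, factory, and pipeline names are assumptions:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Kick off a pipeline run, then poll its status from the monitoring API.
run = adf.pipelines.create_run("rg-data", "adf-retail", "pl_salesforce_daily")
status = adf.pipeline_runs.get("rg-data", "adf-retail", run.run_id)
print(status.status)  # Queued | InProgress | Succeeded | Failed
```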

Thus, option A is correct.

Question 129:

A global security operations center needs to analyze billions of log entries from firewalls, servers, and identity systems. They require a fast, specialized engine for time-series analysis, large-scale log ingestion, pattern detection, anomaly detection, and full-text search. Which Azure service is the best fit?

Options:

A) Azure SQL Database
B) Azure Synapse Serverless SQL Pool
C) Azure Data Explorer
D) Azure Cosmos DB

Answer: C

Explanation:

Azure Data Explorer is the correct service because it is purpose-built for telemetry, log ingestion, time-series queries, and large-scale investigative analytics. Security logs require extremely fast ingestion and optimized indexing to support real-time threat detection and historical investigations. ADX provides the Kusto Query Language designed specifically for log and telemetry analytics, enabling complex pattern matching, anomaly detection, time windows, predictive analysis, and full-text search.

Azure SQL Database cannot ingest billions of logs efficiently or perform analytical log queries at scale. It lacks the columnar engine and optimized indexing needed for log analytics.

Azure Synapse Serverless SQL Pool can query data lake files but does not support continuous ingestion, nor can it deliver real-time log analytics with sub-second response times.

Azure Cosmos DB excels at operational workloads but is not optimized for log analytics queries such as cross-event correlation or time-series pattern scanning.

Azure Data Explorer automatically indexes ingested data, supports compression, provides high ingestion throughput, and allows rich analytical functions. It integrates with Azure Monitor, Defender, and Log Analytics, making it ideal for security operations.

Thus, option C is the correct answer.

Question 130:

A data governance team wants a centralized service that can scan, classify, catalog, and track lineage from Azure SQL, Synapse, Data Lake, Power BI, and even on-premises SQL Server. They need a searchable catalog, business glossary, sensitivity labels, and end-to-end lineage diagrams. Which Azure service should they implement?

Options:

A) Azure Monitor
B) Azure Key Vault
C) Microsoft Purview
D) Azure Security Center

Answer: C

Explanation:

Microsoft Purview is the correct answer because it is Azure’s unified data governance and cataloging platform designed to help organizations control, document, classify, and understand their entire data estate. The scenario explicitly lists requirements such as scanning Azure SQL, Synapse, Data Lake, Power BI, and on-prem SQL Server. Purview supports all these sources and more, including AWS S3, SAP, and Oracle.

Azure Monitor tracks performance metrics and logs but does not provide metadata catalogs or data classification.

Azure Key Vault stores secrets, certificates, and encryption keys, but it does not scan or classify datasets.

Azure Security Center focuses on cloud security posture and threat protection, not data governance.

Purview’s automated scanning extracts metadata, schema, ownership information, sensitivity labels, and data classifications. Its business glossary standardizes terms and definitions across the organization. The lineage diagrams provide end-to-end visibility from ingestion to transformation to use in reports or dashboards. This is critical for compliance, auditing, and engineering reliability.

Because the scenario explicitly requires cataloging, lineage, classification, glossary management, and hybrid data scanning, Microsoft Purview is the only correct solution. Therefore, option C is correct.

Question 131:

A telecommunications company needs to process both historical call-detail records and real-time streaming events from cell towers. They want a unified environment where engineers can run Python, SQL, and Scala, create notebooks, perform large-scale ETL, integrate machine learning, and use Delta Lake for ACID transactions. They also require autoscaling compute clusters and workflow scheduling. Which Azure service is best suited for these requirements?

Options:

A) Azure Stream Analytics
B) Azure Databricks
C) Azure Data Factory
D) Azure SQL Database

Answer: B

Explanation:

Azure Databricks is the most appropriate choice because the telecommunications company needs both real-time and batch processing, along with large-scale compute, notebook collaboration, and the ability to use multiple languages such as Python, SQL, and Scala. Only Databricks provides a unified analytics platform built on Spark, allowing teams to develop in a collaborative environment while leveraging optimized Spark clusters with autoscaling. The scenario also explicitly requires Delta Lake ACID transactions, which is a core component of Databricks.

Azure Stream Analytics is designed only for real-time processing. While it can ingest streaming cell tower events and run SQL-like streaming queries, it lacks batch processing, notebook functionality, Delta Lake support, Python or Scala environments, and the ability to train machine learning models. It cannot act as the unified environment described.

Azure Data Factory is primarily an orchestration tool and does not serve as a compute platform. Although ADF can schedule and run pipelines, it does not support interactive notebooks, distributed ML workloads, or Delta Lake ACID transactions. Mapping Data Flows provide a graphical transformation interface but are not suitable for the broad analytical environment needed.

Azure SQL Database is a traditional relational database that cannot ingest or process streaming events and cannot run Spark jobs, notebooks, or distributed ML training. It cannot serve as a big data processing platform for call-detail records or real-time streams.

Databricks supports both structured and unstructured data and integrates deeply with Azure Data Lake Storage, which is likely where the company stores call-detail records. With Delta Lake, Databricks ensures reliability of ETL pipelines that move data from raw bronze layers to refined gold layers. Time travel and schema enforcement are essential for debugging telecommunications pipelines that evolve frequently.

Databricks job clusters allow workflow scheduling and provide high reliability for ETL automation. With MLflow integration, data scientists can perform experimentation and track model versions for functions like churn prediction, dropped-call prediction, or fraud detection. Its ability to run real-time streaming through Structured Streaming also meets the requirement for processing cell tower events as they arrive.
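The MLflow side of that workflow is only a few lines; the run name, parameter, and metric below are invented purely for illustration:

```python
import mlflow

with mlflow.start_run(run_name="churn-gbt-v2"):
    mlflow.log_param("max_depth", 6)          # hyperparameter under test
    mlflow.log_metric("auc", 0.91)            # evaluation result to compare runs
    # mlflow.spark.log_model(model, "model")  # optionally version the trained model
```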

Thus, Azure Databricks is the only service that satisfies all the requirements of the scenario. Therefore, option B is correct.

Question 132:

A company needs to store fast-moving IoT telemetry with flexible schemas and extremely low read/write latency. They require global distribution, elastic scalability, automatic indexing, and multi-region writes to support IoT devices deployed worldwide. Which Azure database should they use?

Options:

A) Azure SQL Managed Instance
B) Azure Cosmos DB
C) Azure Synapse Dedicated SQL Pool
D) Azure MySQL Flexible Server

Answer: B

Explanation:

Azure Cosmos DB is ideal for this scenario because it is engineered to support global-scale applications that require extremely low latency and flexible schemas. IoT telemetry workloads generate large volumes of semi-structured data, often requiring ingestion at high velocity from devices deployed around the world. Cosmos DB provides multi-master writes, meaning devices in different regions can write data without routing everything to a single geographic location. This reduces latency and provides higher availability.

Azure SQL Managed Instance cannot handle large volumes of schema-flexible telemetry. It is optimized for OLTP relational workloads and does not offer global distribution with multi-region writes.

Azure Synapse Dedicated SQL Pool is designed for large-scale analytical workloads (OLAP), not real-time ingestion of IoT telemetry. It cannot support elastic ingestion across multiple regions or perform schema-flexible writes at low latency.

Azure MySQL Flexible Server is a relational database with strong consistency but lacks global distribution, multi-master writes, and autoscaling needed for massive IoT scenarios.

Cosmos DB’s automatic indexing eliminates the need for developers to manage index definitions, which is crucial for constantly evolving IoT data. Its partitioning model ensures high throughput across billions of documents. With five tunable consistency levels, organizations can choose between strong consistency for critical scenarios or session/eventual consistency for performance-driven telemetry.

Thus, Cosmos DB is the correct choice for storing IoT telemetry at scale. Option B is correct.

Question 133:

A global enterprise wants to use Azure to orchestrate thousands of scheduled ETL pipelines pulling data from cloud SaaS platforms and on-premises SQL databases. They require data lineage (at the orchestration level), error handling, event triggers, integration runtime for hybrid connectivity, and visual transformations without code. Which Azure service fits this requirement?

Options:

A) Azure Databricks
B) Azure Data Factory
C) Azure Monitor
D) Azure Kubernetes Service

Answer: B

Explanation:

Azure Data Factory is the correct answer because it is Azure’s primary ETL orchestration service designed to move and transform data across hybrid environments. The scenario describes a need for extracting data from SaaS sources, scheduling pipelines, enabling visual transformation interfaces, managing hybrid connectivity through integration runtime, and tracking lineage. Data Factory supports all these capabilities.

Azure Databricks is a powerful Spark compute platform, not an ETL orchestrator. Although its jobs can be scheduled, it is not designed to coordinate thousands of heterogeneous pipelines, and it does not include visual no-code data flows or an integration runtime for on-prem connectivity.

Azure Monitor is a performance monitoring and alerting system. It cannot orchestrate ETL workflows or connect to SaaS platforms for data extraction.

Azure Kubernetes Service is an orchestration platform for containerized applications. It has no built-in ETL or data integration features.

ADF provides connectors to SaaS platforms such as Salesforce, Dynamics 365, Google Analytics, and more. With mapping data flows, ADF allows no-code transformations executed on managed Spark clusters. Integration runtime makes hybrid connectivity easy, allowing pipelines to pull data from on-premises SQL Servers securely. ADF also supports triggers (schedule, tumbling, event-based), data lineage, pipeline monitoring, and retries, which are all essential for enterprise ETL.

Thus, option B is correct.

Question 134:

A cybersecurity team needs a powerful engine for analyzing massive log datasets collected from firewalls, intrusion detection systems, and identity services. They need extremely fast ingestion, log-oriented indexing, a specialized query language, anomaly detection functions, and the ability to perform large-scale pattern analysis. Which Azure service best fulfills these requirements?

Options:

A) Azure SQL Database
B) Azure Synapse Serverless SQL Pool
C) Azure Data Explorer
D) Azure Cache for Redis

Answer: C

Explanation:

Azure Data Explorer is purpose-built for log analytics, making it the correct choice for cybersecurity use cases. Security logs require fast ingestion, advanced indexing, and analytical capabilities for detecting threats, anomalies, and patterns. Azure Data Explorer supports a columnar engine and compressed storage optimized for scanning billions of log entries quickly. Kusto Query Language provides operators for parsing logs, applying filters, running statistical functions, performing anomaly detection, and correlating multiple sources.

Azure SQL Database cannot ingest logs at this scale or perform time-series or pattern detection queries efficiently.

Azure Synapse Serverless SQL Pool queries data lake files but does not support continuous ingestion or log-optimized indexing.

Azure Cache for Redis is an in-memory key-value store and cannot perform log analytics.

Because cybersecurity requires high-volume log ingestion, ADX is ideal. Thus, option C is correct.

Question 135:

A corporation wants to create a unified data governance solution that scans Azure SQL, Synapse, Azure Data Lake, Power BI, and even on-prem SQL Server. They need automated classification, sensitivity labeling, lineage tracking, metadata scanning, and a searchable catalog. Which Azure service is designed to meet these requirements?

Options:

A) Azure Key Vault
B) Microsoft Purview
C) Azure Policy
D) Azure Backup

Answer: B

Explanation:

Microsoft Purview is the correct answer because it is Azure’s enterprise-grade data governance and cataloging solution. It scans hybrid and multi-cloud data sources, classifies sensitive data, and creates lineage diagrams. Businesses use Purview to build searchable catalogs, apply sensitivity labels, define business glossary terms, and analyze data movement across pipelines.

Key Vault stores secrets, not metadata.
Azure Policy governs resource configurations, not data catalogs.
Azure Backup protects data but does not classify or scan it.

Purview is the only service that satisfies all governance needs. Therefore, option B is correct.

Question 136:

A global manufacturing company wants to build an analytics platform that processes sensor data coming from thousands of machines. They want a service that supports collaborative notebooks, Delta Lake for ACID transactions, Spark-based compute, machine learning lifecycle management, and the ability to run both batch and streaming ETL. Which Azure service should they use?

Options:

A) Azure Data Lake Storage
B) Azure Data Factory
C) Azure Databricks
D) Azure SQL Database

Answer: C

Explanation:

Azure Databricks is the correct service because it provides the complete set of capabilities required by the manufacturing company to process sensor data, support collaborative notebooks, handle streaming and batch ETL, and enable Delta Lake ACID transactions. This combination of features is essential for a modern industrial analytics platform, especially when dealing with IoT data at large scale.

Azure Data Lake Storage is only a storage solution. While it can store massive amounts of data, it does not provide compute capabilities, notebooks, machine learning, or scheduled ETL jobs. It is a foundational storage layer but not the compute engine required by the scenario.

Azure Data Factory is a powerful ETL orchestration tool, but it is not designed for heavy Spark-based data processing, machine learning model development, or collaborative analytics. While Data Factory can move and transform data using mapping data flows, it lacks the interactive notebook environment and flexible data science tools needed.

Azure SQL Database is a relational database engine. It cannot process massive IoT streams or perform distributed machine learning. It also does not support notebooks, Delta Lake, or Spark.

Azure Databricks, however, provides auto-optimized Spark clusters, notebooks that support Python, SQL, Scala, and R, and MLflow integration for experiment tracking and model management. The ability to run both batch and streaming ETL through Structured Streaming means sensor data can be ingested in real time and processed into curated Delta tables. Delta Lake ensures reliability of pipelines by providing schema enforcement, concurrency control, time travel, and ACID transactions. These features are critical in industrial workloads, where sensors produce high-volume data that must remain consistent and trustworthy across the data lifecycle.
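A hedged sketch of that streaming ingestion path using Databricks Auto Loader; the storage paths are placeholders, and on Databricks the `spark` session is provided by the notebook:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` already exists

# Auto Loader ("cloudFiles") incrementally discovers new sensor files and
# streams them into a bronze Delta table with schema tracking.
(
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/lake/_schemas/sensors")
    .load("abfss://raw@<account>.dfs.core.windows.net/sensors/")
    .writeStream.format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/bronze_sensors")
    .trigger(availableNow=True)  # drain new files, then stop: batch-style runs of a stream
    .start("/mnt/lake/bronze/sensors")
)
```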

Because the scenario requires both advanced data engineering and machine learning capabilities, Databricks stands out as the only platform able to meet all needs simultaneously. Therefore, the correct answer is option C.

Question 137:

An organization needs a fully managed NoSQL database that supports flexible JSON document structures, globally distributed writes, automatic indexing, low-latency operations, and multiple consistency models. The system must serve millions of users worldwide with predictable performance. Which Azure service is the best fit?

Options:

A) Azure Cosmos DB
B) Azure SQL Managed Instance
C) Azure Database for MySQL
D) Azure Synapse Dedicated SQL Pool

Answer: A

Explanation:

Azure Cosmos DB is the correct answer because it is Azure’s globally distributed NoSQL database designed for massive-scale, low-latency workloads. The scenario describes a requirement for flexible JSON document modeling, multi-region writes, automatic indexing, and support for millions of global users. Cosmos DB is the only Azure service that meets all of these attributes.

Azure SQL Managed Instance is a relational database service that does not support schema-flexible JSON documents at scale or automatic indexing of all fields. It also cannot provide global distribution with multi-master writes.

Azure Database for MySQL is similarly relational and cannot meet the global low-latency and flexible schema requirements described.

Azure Synapse Dedicated SQL Pool is a large-scale analytical warehouse and cannot serve as a globally distributed operational database.

Cosmos DB provides elastic scalability through partitioning and offers predictable performance through request units. Its automatic indexing ensures queries are fast without requiring index management. Global distribution allows data to be replicated across multiple Azure regions for resilience and low-latency access. Multi-master writes ensure users can update data from any location in the world. Developers can select from five consistency models to balance accuracy and speed, depending on the workload.

Because Cosmos DB matches every requirement in the scenario, option A is correct.

Question 138:

A retail chain wants to use Azure to orchestrate hundreds of daily ETL pipelines that integrate data from on-prem SQL servers, SaaS platforms, FTP sources, and cloud APIs. They require hybrid integration runtime, data lineage, monitoring, event triggers, and visual transformations. Which Azure service should they use?

Options:

A) Azure Logic Apps
B) Azure Data Factory
C) Azure Databricks
D) Azure Virtual Machines

Answer: B

Explanation:

Azure Data Factory is the correct choice because it provides enterprise-grade ETL orchestration across hybrid environments. The scenario describes extracting data from on-prem SQL Servers, cloud SaaS sources, FTP endpoints, and REST APIs, all of which are core Data Factory capabilities. Integration Runtime (IR) allows secure communication with on-prem sources. Visual mapping data flows offer drag-and-drop transformation processes without writing code.

Azure Logic Apps can connect to SaaS sources but is not designed for large-scale data ingestion or transformations. It focuses on event-driven automation rather than data pipelines.

Azure Databricks excels at big data compute, Spark processing, and ML workflows, but it does not provide hybrid integration runtime, no-code ETL, or SaaS connectors out of the box. It is typically paired with ADF, not used as a replacement.

Azure Virtual Machines require organizations to build and manage their own ETL tools manually, making them unsuitable for large-scale orchestrated pipelines.

ADF also supports event triggers, pipeline monitoring, error handling, and, when integrated with Microsoft Purview, data lineage visualization. Because the scenario explicitly requires orchestration across hybrid systems with visual transformations, Data Factory is the only platform capable of meeting all needs. Thus, option B is correct.

Question 139:

A cybersecurity analytics team needs to visually query massive amounts of firewall logs, identity logs, and audit data in near real time. They need a platform optimized for time-series analysis, anomaly detection, full-text search, pattern recognition, and high-speed ingestion. Which Azure service should they choose?

Options:

A) Azure SQL Database
B) Azure Data Explorer
C) Azure Synapse Analytics
D) Azure Data Lake Storage

Answer: B

Explanation:

Azure Data Explorer is the correct service because it is designed specifically for log analytics and time-series workloads. The scenario requires near real-time analysis of security data, including firewall and identity logs. Data Explorer uses a columnar engine with advanced indexing and compression, allowing it to process billions of events rapidly. Its Kusto Query Language supports text search, anomaly detection functions, time windows, and pattern recognition, all essential for cybersecurity analytics.

Azure SQL Database is not optimized for ingesting security logs at scale and would struggle with time-series indexing and high ingestion rates.

Azure Synapse Analytics performs analytical queries on structured data but is not specialized for log ingestion, high-speed text search, or anomaly detection required for security teams.

Azure Data Lake Storage is a storage platform, not a query or analytics engine.

Because Azure Data Explorer is tailor-made for telemetry, logs, and rapid investigations, option B is correct.

Question 140:

A corporation needs a centralized governance platform that automatically scans Azure SQL, Synapse, Data Lake, Power BI, and on-prem SQL Servers. They need sensitivity classification, metadata scanning, lineage tracking, and a searchable data catalog for business users. Which Azure service fulfills these requirements?

Options:

A) Azure Policy
B) Azure Key Vault
C) Microsoft Purview
D) Azure Monitor

Answer: C

Explanation:

Microsoft Purview is the correct service because it provides end-to-end data governance capabilities across hybrid and multicloud environments. The scenario requires automated scanning, classification, lineage visualization, and a business-friendly data catalog. Purview supports Azure SQL, Synapse, Data Lake, Power BI, and on-prem SQL Server through Purview Data Map and scanning capabilities.

Azure Policy enforces resource configuration rules but does not handle data governance.

Azure Key Vault manages secrets and certificates but does not scan data or provide catalogs.

Azure Monitor tracks resource metrics and logs but does not manage metadata, lineage, or cataloging.

Purview also provides sensitivity labeling, glossary terms for business definitions, and powerful metadata search functions. It integrates with ADF and Synapse pipelines to display lineage from ingestion to consumption. Because no other Azure service offers such broad governance features, Purview is the only correct choice.

 
