Question 81:
A company wants to implement a unified analytics platform where data engineers can build ETL pipelines, analysts can run SQL queries, data scientists can train machine learning models, and all teams can collaborate in one workspace. They need Spark-based compute, SQL capabilities, Delta Lake support, and integration with Azure Active Directory. Which Azure service best satisfies this multi-team analytics requirement?
Options:
A) Azure Stream Analytics
B) Azure Databricks
C) Azure Synapse Dedicated SQL Pool
D) Azure Cosmos DB
Answer: B
Explanation:
Azure Databricks is the best solution for this scenario because it is a unified analytics platform that integrates data engineering, data science, and analytics into one collaborative workspace. The scenario explicitly requires ETL pipeline development, SQL querying capabilities, machine learning model training, and shared workspaces for multiple teams. Azure Databricks provides all these features through its collaborative notebooks, Spark-based compute clusters, Delta Lake integration, and seamless Azure Active Directory authentication.
Option A, Azure Stream Analytics, is designed for real-time streaming analytics and cannot support machine learning model training, large ETL workflows, or collaborative notebook environments. It uses SQL-like queries for event streams but does not provide a unified analytics environment suitable for multiple teams.
Option C, Azure Synapse Dedicated SQL Pool, is a powerful MPP data warehouse but does not integrate Spark machine learning notebooks in the same way Databricks does. While Synapse does have some Spark functionality, it is not as tightly optimized or collaborative as Databricks. Dedicated SQL Pool is focused on structured analytics rather than unified notebooks for multiple roles.
Option D, Azure Cosmos DB, is a globally distributed NoSQL database designed for application workloads. It cannot provide an analytics workspace or Spark-based collaborative development environment.
Databricks stands out because it supports multiple user personas in one place. Data engineers can build ETL or ELT pipelines using Python, SQL, Scala, R, or Spark SQL. Delta Lake technology brings ACID transactions and schema enforcement to data lake storage, making ETL more reliable. Analysts can query data via SQL notebooks or Databricks SQL endpoints. Data scientists can build and train ML models using libraries such as scikit-learn, TensorFlow, PyTorch, and the native MLlib framework.
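As a rough illustration, the following Spark SQL sketch (table, view, and column names are hypothetical) shows how an engineering pipeline and an analyst query can work against the same Delta table inside a Databricks workspace:

-- Data engineering: upsert newly staged records into a Delta table (hypothetical names)
MERGE INTO sales.orders AS target
USING staging_orders AS source
ON target.order_id = source.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- Analytics: the same table is immediately queryable with SQL
SELECT region, SUM(order_total) AS revenue
FROM sales.orders
GROUP BY region
ORDER BY revenue DESC;

Data scientists can read the same table into a Spark DataFrame for feature engineering, so all roles share a single copy of the data.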
Autoscaling clusters allow compute to scale up during heavy processing and down during idle periods, keeping costs efficient. Integration with Azure Active Directory ensures role-based access control and secure identity management.
Databricks also supports MLflow, a framework for managing the machine learning lifecycle, including tracking experiments, packaging models, and deploying them. This addresses the scenario’s requirement for collaboration across teams, as model metadata, metrics, and artifacts can be shared and managed centrally.
Because the scenario requires a platform for ETL, SQL analysis, ML training, and collaboration in one workspace backed by Spark compute, Databricks satisfies every aspect of the requirement. Thus, option B is correct.
Question 82:
A healthcare organization needs extremely strict control over sensitive patient data stored in a relational database. They require that the database engine itself cannot decrypt certain protected columns, and only authorized client applications should have access to decryption keys. Which Azure SQL feature provides this level of security?
Options:
A) Transparent Data Encryption
B) Dynamic Data Masking
C) Always Encrypted
D) Row-Level Security
Answer: C
Explanation:
Always Encrypted is the correct solution because it ensures that sensitive data is encrypted both at rest and in use, and crucially, the database engine cannot decrypt the data. Decryption requires client-side keys stored outside the database engine, meaning only approved client applications with the necessary keys can read sensitive data. This is essential for healthcare organizations dealing with regulated patient information such as medical history, addresses, prescriptions, insurance identifiers, or personal health identifiers.
Option A, Transparent Data Encryption, encrypts the entire database at rest but decrypts data automatically for any authenticated user or application. TDE protects against physical theft of storage but does not protect sensitive data from users with database access.
Option B, Dynamic Data Masking, hides sensitive values in query results based on user permissions but does not encrypt data or prevent administrators from accessing raw values. It obfuscates query output rather than providing cryptographic protection.
Option D, Row-Level Security, controls which rows a user can access but does not encrypt sensitive column-level data. It is an access-control feature, not a cryptographic security mechanism.
Always Encrypted is specifically designed for scenarios requiring extremely high levels of data confidentiality. In this model, encryption keys never leave the client application. The SQL engine only stores and processes ciphertext for protected columns. This means:
DBAs cannot read sensitive patient data
Attackers compromising the database server cannot decrypt data
Regulatory compliance requirements for data isolation are easier to meet
Always Encrypted supports deterministic and randomized encryption. Deterministic encryption allows equality comparisons (useful for lookups), while randomized encryption provides stronger security but cannot be used in joins or filters.
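For illustration only, a column definition of this kind might look like the following T-SQL. The table and the column encryption key CEK_Patient are hypothetical, and the example assumes a column master key has already been provisioned in a key store accessible only to client applications:

CREATE TABLE dbo.Patients
(
    PatientId INT PRIMARY KEY,
    -- Deterministic encryption: equality lookups on SSN remain possible
    SSN CHAR(11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK_Patient,
                        ENCRYPTION_TYPE = DETERMINISTIC,
                        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'),
    -- Randomized encryption: stronger protection, but no filtering or joining on this column
    Diagnosis NVARCHAR(200)
        ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK_Patient,
                        ENCRYPTION_TYPE = RANDOMIZED,
                        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256')
);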
Because healthcare regulations often require systems where even administrators cannot access patient-sensitive data, Always Encrypted is the only SQL Server feature that meets this security level. Thus, option C is correct.
Question 83:
An enterprise wants to process petabyte-scale data with heavy transformations including joins, aggregates, machine learning preprocessing, and iterative computations. They need a distributed analytics engine with support for Python, SQL, and scalable processing across clusters. Which Azure compute engine is best suited for these big-data transformations?
Options:
A) Azure Stream Analytics
B) Azure Databricks Spark Engine
C) Azure SQL Database
D) Azure Event Hubs
Answer: B
Explanation:
Azure Databricks Spark Engine is the correct choice because it offers distributed computing designed specifically for large-scale data processing. Petabyte-scale transformations require a compute engine capable of parallelizing work across many nodes, which is what Spark does best. Databricks enhances Spark performance through optimized runtimes, autoscaling, caching, and integration with Azure Data Lake.
Option A, Azure Stream Analytics, is tailored for real-time processing of continuous event streams. It cannot handle large-scale historical transformations or machine learning preprocessing.
Option C, Azure SQL Database, is a relational database optimized for OLTP workloads. While it can perform certain analytical queries, it cannot handle petabytes of data or distributed machine learning operations.
Option D, Azure Event Hubs, is an ingestion service for real-time data streams and not a compute engine.
Databricks Spark Engine supports Python, SQL, Scala, and R, providing flexible interfaces for data engineering and machine learning tasks. When organizations need to process enormous datasets with transformations like joins, windowing, aggregation, and iterative ML preprocessing, Spark engines outperform traditional systems by distributing operations across many worker nodes.
Spark’s in-memory processing allows iterative computations required by machine learning algorithms to run efficiently. Databricks adds further optimization through Delta Lake, enabling ACID transactions, schema evolution, and high-performance updates on data lake files.
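As a hedged sketch (the storage path, tables, and columns are made up), a heavy join and aggregation over raw parquet files can be expressed directly in Spark SQL inside a Databricks notebook; Spark parallelizes the scan, shuffle, and aggregation across the cluster's worker nodes:

-- Expose raw parquet files in the data lake as a temporary view (hypothetical path)
CREATE OR REPLACE TEMP VIEW telemetry AS
SELECT * FROM parquet.`abfss://raw@examplelake.dfs.core.windows.net/telemetry/`;

-- Heavy transformation: join against a dimension table and aggregate per device type and day
SELECT d.device_type,
       DATE_TRUNC('DAY', t.event_time) AS event_day,
       COUNT(*)                        AS readings,
       AVG(t.reading_value)            AS avg_reading
FROM telemetry t
JOIN devices d ON t.device_id = d.device_id
GROUP BY d.device_type, DATE_TRUNC('DAY', t.event_time);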
For these reasons, the Databricks Spark Engine is the best fit. Thus, option B is correct.
Question 84:
A company wants to build a real-time monitoring dashboard using Power BI that updates every few seconds based on streaming data from IoT devices. They need a service that can process incoming telemetry streams and output the results directly to Power BI push datasets. Which Azure service should they use?
Options:
A) Azure Data Factory
B) Azure Stream Analytics
C) Azure SQL Managed Instance
D) Azure Blob Storage
Answer: B
Explanation:
Azure Stream Analytics is the right service because it is built to process real-time telemetry streams and can output directly to Power BI push datasets, enabling live updating dashboards. Power BI’s real-time visualizations require data to be pushed frequently rather than refreshed on a scheduled cycle. Stream Analytics supports this via built-in output connectors.
Option A, Data Factory, is a batch-oriented ETL service and cannot push data to Power BI at the near real-time frequency required for live dashboards.
Option C, SQL Managed Instance, is a relational database not capable of streaming push updates directly to Power BI. It supports analytical queries but not live IoT streaming.
Option D, Blob Storage, is a storage solution and provides no streaming or push capabilities.
Stream Analytics can ingest data from IoT Hub, Event Hubs, or Kafka, process it through SQL-like queries, and push results to Power BI dashboards with extremely low latency. It supports windowing functions ideal for IoT metrics such as averages over time intervals, anomaly detection, and threshold alerts.
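A minimal Stream Analytics query of this shape (the input and output aliases and field names are hypothetical) computes a per-device average over ten-second tumbling windows and writes it to a Power BI output:

SELECT
    deviceId,
    AVG(temperature) AS avgTemperature,
    System.Timestamp() AS windowEnd
INTO [powerbi-dashboard]
FROM [iothub-telemetry] TIMESTAMP BY eventTime
GROUP BY deviceId, TumblingWindow(second, 10)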
Thus, option B is correct.
Question 85:
A company wants to classify, label, and govern data across Azure SQL, Azure Data Lake, Power BI, and external sources. They want automatic scanning, metadata extraction, sensitivity labeling, and lineage tracking. Which service provides comprehensive data governance across the entire data estate?
Options:
A) Azure Key Vault
B) Azure Purview (Microsoft Purview)
C) Azure Automation
D) Azure Firewall
Answer: B
Explanation:
Microsoft Purview is the right choice because it provides centralized data governance, cataloging, classification, sensitivity labeling, and lineage tracking across a company’s entire data landscape. The scenario requires scanning Azure SQL, Azure Data Lake, Power BI, and external sources, which Purview supports natively.
Option A, Key Vault, stores secrets and cryptographic keys but has no cataloging or governance capabilities.
Option C, Azure Automation, runs scripts and automates operational tasks but cannot scan or classify data.
Option D, Azure Firewall, provides network protection but not data governance.
Purview automatically extracts metadata, identifies sensitive data using built-in classifiers, and builds lineage graphs showing how datasets flow between sources and transformations. This enables compliance, discoverability, and governance across diverse systems.
Thus, option B is correct.
Question 86:
A company wants to analyze semi-structured JSON and CSV data stored in Azure Data Lake without performing ETL or provisioning dedicated compute resources. They want analysts to run T-SQL queries directly over the files for fast exploration. Which Azure service best suits this requirement?
Options:
A) Azure Synapse Serverless SQL Pool
B) Azure SQL Database
C) Azure Stream Analytics
D) Azure Databricks
Answer: A
Explanation:
Azure Synapse Serverless SQL Pool is the best option because it allows analysts to query semi-structured data such as JSON and CSV files directly from Azure Data Lake using T-SQL without the need to provision compute resources. This aligns perfectly with organizations seeking fast ad-hoc exploration and cost-effective analytics, as charges are incurred only for the data processed during the query.
Azure SQL Database, option B, requires structured tables and cannot directly query raw files stored in a data lake without data ingestion steps. It also requires provisioned compute, making it less flexible for ad-hoc exploration.
Azure Stream Analytics, option C, is optimized for real-time stream processing and cannot serve as a query engine for large static files stored in a data lake.
Azure Databricks, option D, is a powerful platform for big data analytics and data science, but it requires provisioning Spark clusters. While Databricks can query files, the company specifically wants a T-SQL-driven solution with no cluster provisioning, which Databricks does not provide natively.
Serverless SQL Pool enables analysts to issue standard SQL queries using OPENROWSET, external tables, and views on top of lake files, without copying or transforming data. This minimizes cost and boosts productivity for quick data discovery.
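For example, a serverless SQL pool query over CSV files in the lake might look like the following (the storage URL is a placeholder; a similar pattern with FORMAT = 'PARQUET', or with JSON functions layered on top, covers other file types):

SELECT TOP 100 *
FROM OPENROWSET(
        BULK 'https://examplelake.dfs.core.windows.net/raw/sales/*.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',
        HEADER_ROW = TRUE
     ) AS sales_rows;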
Thus, option A is correct.
Question 87:
Your organization wants to run large-scale analytics using a relational MPP engine that distributes data across many nodes to support complex joins, aggregations, and fact-dimension modeling. They want full control over indexing strategies and table distribution. Which Azure service should they select?
Options:
A) Azure SQL Managed Instance
B) Azure Synapse Dedicated SQL Pool
C) Azure Cosmos DB
D) Azure Data Factory
Answer: B
Explanation:
Azure Synapse Dedicated SQL Pool is the right choice because it is built on a Massively Parallel Processing (MPP) architecture designed for large-scale analytical workloads. It distributes data across multiple compute nodes, allowing complex analytical queries—such as large joins, aggregations, and fact-dimension queries—to execute efficiently.
Azure SQL Managed Instance, option A, is optimized for OLTP workloads and does not support MPP parallelization or data distribution strategies needed for enterprise-scale analytics.
Azure Cosmos DB, option C, is a NoSQL service that provides global distribution, not a relational analytical engine. It is not designed for star schemas, MPP workloads, or large-scale relational analytics.
Azure Data Factory, option D, provides pipeline orchestration and ETL but cannot execute analytical queries or host distributed SQL engines.
Dedicated SQL Pool supports distribution methods such as hash, round-robin, and replicated tables. It also provides indexing strategies and workload management features essential for maximizing analytical performance. It integrates tightly with BI tools and supports fact tables with billions of rows.
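To illustrate the level of control involved (table and column names are invented), a dedicated SQL pool lets you choose the distribution and index strategy per table in the CREATE TABLE statement:

-- Large fact table: hash-distributed on the join key, columnstore for analytics
CREATE TABLE dbo.FactSales
(
    SaleKey      BIGINT        NOT NULL,
    CustomerKey  INT           NOT NULL,
    SaleDate     DATE          NOT NULL,
    Amount       DECIMAL(18,2) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(CustomerKey),
    CLUSTERED COLUMNSTORE INDEX
);

-- Small dimension table: replicated to every compute node to avoid data movement in joins
CREATE TABLE dbo.DimCustomer
(
    CustomerKey  INT           NOT NULL,
    CustomerName NVARCHAR(100) NOT NULL,
    Region       NVARCHAR(50)  NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,
    CLUSTERED INDEX (CustomerKey)
);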
Thus, option B is correct.
Question 88:
A global ridesharing company needs a database solution that can replicate data across multiple regions with extremely low latency. They require multi-master writes, a choice of consistency levels, and the ability to serve millions of users concurrently around the world. Which Azure database service meets these requirements?
Options:
A) Azure SQL Database
B) Azure Database for MySQL
C) Azure Cosmos DB
D) Azure Synapse Analytics
Answer: C
Explanation:
Azure Cosmos DB is the correct answer because it is designed for globally distributed applications that require multi-region replication, multi-master writes, and tunable consistency levels. It guarantees sub-10ms latency at the 99th percentile and supports elastic scalability for millions of users concurrently.
Azure SQL Database, option A, offers geo-replication but cannot support multi-master writes or granular consistency models.
Azure Database for MySQL, option B, supports replicas but cannot deliver the global low-latency performance or multi-master writes needed.
Azure Synapse Analytics, option D, is a data warehouse platform designed for analytical queries, not real-time distributed application workloads.
Cosmos DB’s multi-master capability enables writes to occur in any region, reducing latency for globally distributed apps like ridesharing or delivery platforms. Its five consistency levels allow applications to balance performance with accuracy. Combined with automatic failover and minimal downtime, Cosmos DB is purpose-built for worldwide, always-on systems.
Thus, option C is correct.
Question 89:
A company wants to implement automated ETL pipelines that pull data from SaaS applications, perform transformations, and load processed results into Azure Synapse Analytics. They need scheduling, mapping data flows, monitoring, and native connectors. Which Azure service should they choose?
Options:
A) Azure Data Factory
B) Azure Stream Analytics
C) Azure App Service
D) Azure Data Lake Storage
Answer: A
Explanation:
Azure Data Factory is the correct service because it provides ETL and ELT orchestration, hundreds of native connectors (including SaaS sources), mapping data flows for visual transformations, scheduling, triggers, and monitoring. It is purpose-built for building automated ingestion and transformation workflows.
Azure Stream Analytics, option B, is optimized for real-time stream processing and not batch ETL pipelines.
Azure App Service, option C, hosts web applications and APIs but cannot orchestrate ETL workloads.
Azure Data Lake Storage, option D, is a storage solution, not an ETL engine.
Data Factory pipelines can integrate SaaS connectors such as Salesforce, Dynamics, Google Analytics, and more. Mapping Data Flows allow scalable, Spark-based transformations without coding. Pipelines support triggering, monitoring, email alerts, retries, and integration with Synapse Dedicated SQL Pool.
Thus, option A is correct.
Question 90:
A company wants to manage, classify, catalog, and track lineage for datasets stored in Azure SQL, Azure Synapse, Azure Data Lake, Power BI, and external sources. They need a unified governance platform with automatic scanning, metadata extraction, and sensitivity labeling capabilities. Which Azure service should they use?
Options:
A) Azure Firewall
B) Azure Key Vault
C) Microsoft Purview
D) Azure Monitor
Answer: C
Explanation:
Microsoft Purview is the best option because it provides centralized governance across the entire data estate. It automatically scans data sources, extracts metadata, identifies sensitive information, tracks lineage, and generates searchable catalogs for analysts and business users.
Azure Firewall, option A, controls network traffic but does not catalog or classify data.
Azure Key Vault, option B, stores secrets and certificates but cannot classify or track datasets.
Azure Monitor, option D, tracks performance metrics but provides no governance capabilities.
Purview integrates with most Azure analytic services including Synapse, Data Factory, SQL databases, and Power BI. It supports sensitivity labels, glossary terms, classification policies, and lineage visualizations. This ensures compliance, improves data discovery, and centralizes governance across hybrid environments.
Thus, option C is correct.
Question 91:
A retail corporation wants to implement a data architecture where raw data is stored in Azure Data Lake, structured and processed data can be queried using SQL, and machine learning teams can use the same data for feature engineering. They want a unified storage layer that supports ACID transactions, schema enforcement, and time travel on top of parquet. Which technology should they adopt within Azure?
Options:
A) Azure Blob Storage with CSV files
B) Delta Lake on Azure Databricks
C) Azure SQL Database
D) Azure Cosmos DB
Answer: B
Explanation:
Delta Lake on Azure Databricks is the best choice because it provides a unified storage layer built on top of parquet, offering features such as ACID transactions, schema enforcement, schema evolution, versioning, and time travel. The scenario describes a retail corporation wanting to store raw and processed data in Azure Data Lake while allowing SQL analytics and machine learning teams to use the same data. Delta Lake solves exactly this challenge by enabling a reliable data lake architecture that behaves like a transactional system.
Option A, Azure Blob Storage with CSV files, lacks transactional guarantees and schema enforcement. CSV files are prone to corruption, inconsistent structure, and incompatible formats for large-scale analytics. They also do not support time travel or ACID properties.
Option C, Azure SQL Database, is a relational database designed for OLTP workloads. It cannot unify raw, semi-structured, and machine learning features into one scalable system. SQL Database does not use open file formats or integrate with distributed compute engines like Spark.
Option D, Azure Cosmos DB, is a NoSQL platform designed for operational, globally distributed workloads. It is not intended for unified analytics, ETL pipelines, or machine learning feature engineering.
Delta Lake brings transactional reliability to data lakes using parquet files. It supports multiple personas:
Data engineers can build ETL pipelines and write structured parquet files safely.
Analysts can query data using SQL, Spark SQL, or Databricks SQL.
Data scientists can perform feature engineering using Python or Spark MLlib.
Delta Lake adds metadata layers, allowing easy rollback, historical queries, and time travel. ACID transactions ensure that no partial writes corrupt datasets, solving common problems in data lakes such as incomplete file writes, schema drift, and inconsistent directories. Schema enforcement prevents incompatible writes, while schema evolution allows dynamic adaptation to new fields.
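A brief Spark SQL sketch (paths and table names are hypothetical) shows how these features surface in practice: a Delta table is created over data lake storage, its change history can be inspected, and earlier versions remain queryable via time travel:

-- Create a curated Delta table over data lake storage (hypothetical locations)
CREATE TABLE IF NOT EXISTS retail.sales_curated
USING DELTA
LOCATION 'abfss://curated@examplelake.dfs.core.windows.net/sales_curated'
AS SELECT * FROM parquet.`abfss://raw@examplelake.dfs.core.windows.net/sales/`;

-- Every write is recorded in the Delta transaction log
DESCRIBE HISTORY retail.sales_curated;

-- Time travel: query the table as it existed at an earlier version
SELECT COUNT(*) FROM retail.sales_curated VERSION AS OF 3;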
The ability to unify batch processing, streaming ingestion, analytics, and machine learning within the same storage layer makes Delta Lake ideal for modern analytics. Thus, option B is correct.
Question 92:
A financial services company processes trillions of log records and requires an analytical engine optimized for telemetry, time-series queries, full-text search, and near real-time ingestion. Analysts need to run complex queries using a specialized query language for operational analytics. Which Azure service is most appropriate?
Options:
A) Azure Synapse Dedicated SQL Pool
B) Azure SQL Managed Instance
C) Azure Data Explorer (Kusto)
D) Azure Data Factory
Answer: C
Explanation:
Azure Data Explorer is the correct choice because it is built specifically for analyzing massive amounts of telemetry, log, and time-series data. The service excels at high-ingestion rates, often capable of handling gigabytes of data per second. Financial services companies, especially those dealing with fraud detection, trading logs, audit data, and compliance telemetry, rely heavily on fast, interactive analytics on large datasets.
Option A, Synapse Dedicated SQL Pool, is designed for structured large-scale analytics but is not optimized for time-series telemetry or log analytics. It cannot ingest trillions of log entries with near real-time performance.
Option B, SQL Managed Instance, provides a relational database environment but is not engineered for petabyte-scale log analytics or millisecond query responsiveness on billions of rows.
Option D, Data Factory, is an orchestration and ETL service and cannot act as an analytics engine.
Azure Data Explorer (ADX) uses a columnar storage format, compression, indexing, and caching to accelerate time-series analysis. Its Kusto Query Language (KQL) is purpose-built for analyzing logs, supporting:
Time window queries
Anomaly detection
Trend analysis
Pattern matching
Joins and aggregations across billions of records
ADX is widely used for security analytics, IoT telemetry, trading platform monitoring, and operational analytics dashboards. It also integrates seamlessly with Azure Monitor and Log Analytics.
Because the scenario requires specialized log analysis, full-text search, time-series intelligence, and extreme ingestion performance, Azure Data Explorer is the best choice. Thus, option C is correct.
Question 93:
A technology company wants to allow business analysts to build self-service BI reports using Power BI. They need a feature that automatically refreshes data from Azure SQL Database on a schedule, without requiring analysts to manually reconnect or reload data. Which Power BI capability enables this functionality?
Options:
A) Power BI Desktop local refresh
B) DirectQuery Mode
C) Scheduled Refresh in Power BI Service
D) Live Connection through Analysis Services
Answer: C
Explanation:
Scheduled Refresh in Power BI Service is the correct answer because it enables automated data refreshes for imported datasets stored in Power BI. Analysts can build reports in Power BI Desktop, publish them to the service, and rely on scheduled refreshes to automatically update datasets at configured intervals.
Option A, Power BI Desktop local refresh, is manual and applies only to the local environment. It cannot refresh datasets in the cloud or update shared dashboards.
Option B, DirectQuery Mode, enables real-time querying of the underlying Azure SQL Database but does not apply to imported datasets. It also increases dependency on database performance and may affect latency.
Option D, Live Connection, applies to connections to Analysis Services models, not Azure SQL datasets. While powerful, it does not meet the requirement for scheduled refresh for imported data.
Scheduled Refresh is critical for organizations that do not want analysts manually refreshing dashboards. It supports:
Multiple refresh times per day
Secure credential storage through gateways
Failure notifications
Dataset-level refresh rules
Power BI Gateway can also be configured if data is on-premises. For Azure SQL Database, refreshes typically connect directly through cloud-to-cloud communication without needing a gateway.
Thus, option C is correct.
Question 94:
A company wants to ensure that its Azure SQL Database is protected against accidental data loss. They want automated backups, point-in-time restore, and long-term retention for compliance purposes. Which built-in Azure SQL feature fulfills these requirements?
Options:
A) Geo-Replication
B) Automatic Backups
C) Read-Scale Replicas
D) Azure SQL Ledger
Answer: B
Explanation:
Automatic Backups is the correct answer because Azure SQL Database provides automated, continuous backups that support point-in-time restore and long-term retention. These backups ensure the database can be restored to any moment within the retention period, protecting against accidental deletion, corruption, or user mistakes.
Option A, Geo-Replication, supports high availability and disaster recovery between regions but does not provide point-in-time restore or long-term retention.
Option C, Read-Scale Replicas, improves read performance but does not protect against data loss.
Option D, Azure SQL Ledger, provides tamper-evidence for transactions but not backup and restore capabilities.
Automatic Backups are performed without user intervention and are stored in read-access geo-redundant (RA-GRS) storage by default. Retention can be extended for years using the Long-Term Retention (LTR) feature, which is essential for regulatory compliance.
Thus, option B is correct.
Question 95:
A logistics company needs a scalable database solution for storing large volumes of sensor data from delivery trucks. The data must be stored as JSON documents, support global distribution, and provide millisecond read and write performance. Which Azure service best meets these needs?
Options:
A) Azure SQL Database
B) Azure PostgreSQL
C) Azure Cosmos DB
D) Azure Data Explorer
Answer: C
Explanation:
Azure Cosmos DB is the right choice because it is designed to store JSON documents at massive scale with global distribution and low-latency reads and writes. Sensor data from delivery trucks is typically semi-structured and requires flexible schema support, which fits perfectly with Cosmos DB’s document model.
Option A, Azure SQL Database, does support JSON but is not optimized for massive, distributed document storage or global replication at low latency.
Option B, PostgreSQL, is a relational open-source database, not designed for large-scale globally distributed document workloads.
Option D, Azure Data Explorer, excels at log and telemetry analytics but is not meant for globally distributed operational workloads or storing JSON documents with millisecond write requirements.
Cosmos DB provides multi-region writes, tunable consistency, automatic indexing, and horizontal partitioning. These features ensure consistent performance while handling rapidly growing sensor data. It also offers five consistency levels that help balance correctness versus latency.
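Because items are stored as JSON, queries use Cosmos DB's SQL (Core) API directly over the documents. A sketch like the following (property names are invented) returns recent readings for one truck, with the partition key ideally chosen as the truck or device identifier:

SELECT c.truckId,
       c.sensor.temperature,
       c.sensor.batteryLevel,
       c.recordedAt
FROM c
WHERE c.truckId = "TRUCK-1042"
  AND c.recordedAt >= "2025-01-01T00:00:00Z"
ORDER BY c.recordedAt DESC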
Thus, option C is correct.
Question 96:
A company wants to modernize its on-premises SQL Server workloads by migrating them to a fully managed Azure relational service with minimal application changes. They require near 100 percent compatibility with SQL Server features such as SQL Agent, linked servers, CLR integration, and instance-level settings. Which Azure service best meets this requirement?
Options:
A) Azure SQL Database
B) Azure SQL Managed Instance
C) Azure Synapse Analytics
D) Azure Database for PostgreSQL
Answer: B
Explanation:
Azure SQL Managed Instance is the most appropriate choice because it provides near full compatibility with on-premises SQL Server, making it the ideal destination for organizations that want to lift and shift existing workloads with minimal or no application changes. The scenario specifically mentions features such as SQL Agent, linked servers, CLR integration, and instance-level configurations, all of which are not available in Azure SQL Database but are supported extensively in Azure SQL Managed Instance. This makes Managed Instance the closest cloud-based version of the SQL Server engine.
Azure SQL Database, option A, is a powerful fully managed relational database service but is designed for modernized cloud-native applications. It does not support all SQL Server features, including SQL Agent, cross-database queries, server-level settings, or linked servers. This means that applications relying on instance-level behavior or legacy SQL Server features cannot be migrated seamlessly to Azure SQL Database without code changes or redesign.
Azure Synapse Analytics, option C, is primarily an analytical data warehouse platform built for OLAP workloads. It uses a Massively Parallel Processing architecture to run complex analytical queries on large datasets and is not suitable for operational, transactional workloads such as those typically hosted on a SQL Server instance. Therefore, it cannot satisfy the requirement for compatibility with SQL Server’s transactional features.
Azure Database for PostgreSQL, option D, is a fully managed PostgreSQL engine. Although it is a relational database system, it is not compatible with SQL Server in any way. Migrating a SQL Server workload to PostgreSQL requires extensive code changes, different query syntax, new client drivers, and data type conversions.
Managed Instance supports features such as cross-database queries, database mail, SQL Agent jobs, server-level collation, CLR assemblies, linked servers, and many other SQL Server instance-level functionalities. It also provides automated backups, point-in-time restore capabilities, and high availability. Additionally, Managed Instance integrates seamlessly with Azure services such as Data Factory, Power BI, and Key Vault.
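As a small illustration of this compatibility (database and table names are hypothetical), a cross-database join that works on an on-premises SQL Server runs unchanged on a managed instance, whereas a single Azure SQL Database cannot reference another database this way:

-- Cross-database query on one managed instance, same syntax as on-premises SQL Server
SELECT o.OrderId,
       o.OrderDate,
       c.CustomerName
FROM SalesDb.dbo.Orders AS o
JOIN CrmDb.dbo.Customers AS c
    ON o.CustomerId = c.CustomerId;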
The ability to run in a private virtual network through VNet integration allows organizations to maintain strict security controls similar to their on-prem environments. This also supports hybrid connectivity via VPN or ExpressRoute, making it easier to implement phased migrations.
Because the scenario explicitly prioritizes minimal changes, high compatibility, and support for advanced SQL Server features, Azure SQL Managed Instance is the only option that satisfies all these conditions. Thus, option B is correct.
Question 97:
A company wants to ingest continuous streams of data from thousands of IoT sensors, buffer the data for processing, and route it to downstream systems such as Azure Databricks, Azure Stream Analytics, or Azure Blob Storage. They require extremely high throughput and low-latency ingestion. Which Azure service should act as the ingestion backbone for this architecture?
Options:
A) Azure Data Factory
B) Azure Event Hubs
C) Azure Blob Storage
D) Azure SQL Database
Answer: B
Explanation:
Azure Event Hubs is the correct answer because it is specifically designed to ingest massive amounts of streaming data from distributed sources such as IoT devices, applications, and telemetry systems. Event Hubs is capable of ingesting millions of events per second while maintaining low latency, making it ideal as the ingestion backbone for large-scale real-time analytics systems.
Azure Data Factory, option A, cannot be used as a streaming data ingestion tool. It is optimized for batch ETL and scheduled pipelines, not real-time high-throughput event ingestion. It cannot meet the latency or ingestion requirements for IoT scenarios.
Azure Blob Storage, option C, is a storage platform only. Although it can store large amounts of data, it cannot act as a streaming ingestion service. It cannot buffer events or support consumer groups, partitions, or real-time processing.
Azure SQL Database, option D, is designed for transactional workloads, not high-speed ingestion. It cannot handle streaming workloads at IoT scale without overwhelming the database or causing performance degradation.
Event Hubs supports consumer groups, allowing multiple independent processing applications to consume the same event stream without conflict. It buffers data reliably and integrates with systems like Stream Analytics, Databricks, Functions, and Logic Apps. Event Hubs also supports partitions for horizontal scaling, automatic throughput scaling through auto-inflate, Premium and Dedicated tiers for the largest workloads, and compatibility with the Apache Kafka protocol.
Because the scenario requires a scalable ingestion service that can buffer streaming sensor data and support the downstream analytics ecosystem, Event Hubs is the precise fit. Thus, option B is correct.
Question 98:
Your organization wants to ensure that sensitive data fields such as Social Security numbers, credit card numbers, and personal email addresses are hidden from unauthorized users when they query the database. They do not want to alter or encrypt the actual data stored on disk but need a mechanism to hide sensitive fields only in query results. Which Azure SQL feature should be used?
Options:
A) Always Encrypted
B) Dynamic Data Masking
C) Backup Encryption
D) Row-Level Security
Answer: B
Explanation:
Dynamic Data Masking is the correct solution because it allows sensitive fields to be masked dynamically when queried by unauthorized users, without modifying the underlying stored data. This meets the exact requirement described: masking at query time with no changes to stored values. Dynamic Data Masking is designed to prevent accidental exposure of sensitive data while preserving the ability for privileged users to access full values.
Always Encrypted, option A, encrypts sensitive data both at rest and during processing, ensuring that even the database engine cannot read the plaintext. While it provides stronger security, it requires application changes, client-side encryption keys, and does not align with the scenario where data should remain unencrypted at rest.
Backup Encryption, option C, encrypts database backups to protect them from unauthorized access but does not affect query results or mask data fields.
Row-Level Security, option D, restricts access to specific rows based on user identity. Although it controls data visibility, it does not mask column-level values.
Dynamic Data Masking supports multiple masking types such as partial masking, email masking, and numeric masking. For example, credit card numbers can be displayed as XXXX-XXXX-XXXX-1234 for unauthorized users. Because this process happens dynamically through SQL Server’s query engine, it requires no code changes to applications and preserves data integrity.
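A masking rule of this kind is declared in T-SQL. The example below (table, column, and role names are placeholders) applies a partial mask to card numbers and the built-in email mask, and grants one role the right to see unmasked values:

-- Show only the last four digits of the card number to non-privileged users
ALTER TABLE dbo.Customers
ALTER COLUMN CreditCardNumber ADD MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)');

-- Built-in email mask, e.g. aXXX@XXXX.com
ALTER TABLE dbo.Customers
ALTER COLUMN EmailAddress ADD MASKED WITH (FUNCTION = 'email()');

-- Privileged users can still read the original values
GRANT UNMASK TO ComplianceRole;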
Thus, option B is correct.
Question 99:
A company wants to build a real-time fraud detection pipeline. Incoming transactional data must be processed within milliseconds, aggregated in small time windows, checked for anomalies, and forwarded to alerting systems. They need a service specifically designed for real-time event processing with SQL-like query capabilities. Which Azure service should they use?
Options:
A) Azure Stream Analytics
B) Azure SQL Database
C) Azure Cosmos DB
D) Azure Data Factory
Answer: A
Explanation:
Azure Stream Analytics is the correct answer because it is built for real-time data stream processing with extremely low latency. Fraud detection requires immediate evaluation of incoming events, often within milliseconds. Stream Analytics supports SQL-like queries, windowing functions, pattern detection, anomaly detection, and integration with machine learning models.
Azure SQL Database, option B, cannot process streaming workloads or real-time event streams. It is designed for transactional data, not continuous ingestion.
Azure Cosmos DB, option C, is a NoSQL data store optimized for globally distributed applications and low-latency storage. It does not provide SQL-based stream processing capabilities or real-time event windowing.
Azure Data Factory, option D, is a batch ETL service incapable of real-time processing.
Stream Analytics integrates with Event Hubs and IoT Hub for ingestion and can output to Power BI, SQL, Cosmos DB, or alerting systems. It supports tumbling, hopping, and sliding windows, which are essential for fraud analysis where short-term behavior must be monitored continuously.
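As a hedged sketch (the stream aliases and fields are hypothetical), a sliding-window query of the following form flags any card that produces more than five transactions within any 30-second window and routes the result to an alerting output:

SELECT
    cardNumber,
    COUNT(*) AS txnCount,
    System.Timestamp() AS windowEnd
INTO [fraud-alerts]
FROM [transactions-stream] TIMESTAMP BY transactionTime
GROUP BY cardNumber, SlidingWindow(second, 30)
HAVING COUNT(*) > 5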
Thus, option A is correct.
Question 100:
A large enterprise with multiple departments wants a centralized system to govern its entire data estate. This includes tracking lineage between data sources and pipelines, automatically classifying sensitive information, cataloging datasets, scanning Azure SQL and Synapse sources, and enabling business users to search metadata. Which Azure service should they implement?
Options:
A) Azure Firewall
B) Microsoft Purview
C) Azure Monitor
D) Azure Key Vault
Answer: B
Explanation:
Microsoft Purview is the correct choice because it provides a comprehensive data governance platform capable of scanning, classifying, cataloging, and tracking data lineage across Azure and external data sources. The scenario describes a need for a centralized governance solution with metadata search, sensitivity labeling, automatic scanning, and lineage tracking. Purview is specifically designed to address these enterprise governance challenges.
Azure Firewall, option A, is a network security service and cannot govern data assets.
Azure Monitor, option C, tracks metrics and logs but does not perform data governance or classification.
Azure Key Vault, option D, stores keys, secrets, and certificates but does not provide metadata scanning or cataloging.
Purview supports a broad range of data sources including SQL Database, Synapse, Data Lake, Power BI, SAP, on-prem SQL Servers, and many more. It automatically discovers data assets, extracts schema information, identifies sensitive fields, and provides searchability for business users. Its lineage dashboard shows how data flows from ingestion to transformation to consumption, which helps maintain compliance and improve trust in enterprise data.
Thus, option B is correct.