Snowflake SnowPro Core Recertification (COF-R02) Exam Dumps and Practice Test Questions Set3 Q41-60

Visit here for our full Snowflake SnowPro Core exam dumps and practice test questions.

Q41. A Snowflake modernization team is redesigning its ingestion pipeline to support high-volume semi-structured data arriving from numerous IoT devices. The data includes variable schemas and sudden bursts in volume. The team wants cost-efficient ingestion with minimal operational overhead. Which strategy provides the most scalable and efficient Snowflake-native solution?

A Use Snowpipe with continuous event notifications to load files as they arrive
B Run a large warehouse 24/7 to perform scheduled COPY INTO operations
C Store the incoming data in external tables and query directly from cloud storage
D Use tasks to execute COPY INTO commands every few minutes regardless of file arrival

Answer: A

Explanation:

Option A is the correct solution because Snowpipe provides truly event-driven ingestion with minimal warehouse consumption and strong elasticity for unpredictable traffic patterns common in IoT use cases. Snowpipe automatically scales with inbound ingestion volume and processes incoming files without requiring scheduled compute resources. Using event notifications enables near-real-time processing measured in seconds, ensuring timely availability of data for analytics or downstream transformations. Its serverless nature offloads warehouse management and eliminates the burden associated with maintaining compute infrastructure. For semi-structured and variable-schema IoT data, Snowpipe handles continuous ingestion efficiently while maintaining low operational overhead.
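
As a minimal sketch (the stage, table, and pipe names are hypothetical), an auto-ingest pipe that loads JSON files as cloud event notifications arrive might look like this:

-- Assumes an external stage and a target table with a VARIANT column already exist.
CREATE PIPE raw.iot_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO raw.iot_events
  FROM @raw.iot_stage
  FILE_FORMAT = (TYPE = 'JSON');

The pipe's notification channel (visible via DESC PIPE) is then wired to the storage bucket's event notifications so each new file triggers a load.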

Option B is incorrect because running a large warehouse continuously is highly inefficient and expensive. COPY INTO operations triggered on a schedule consume compute regardless of whether new files are present. For workloads with unpredictable bursts, this results in wasted credits and unnecessary operational cost. It also contradicts Snowflake’s elasticity philosophy by forcing a static compute posture.

Option C can be useful for cost-effective cold or archival storage, but external tables queried directly from object storage suffer from higher latency, limited statistics, and reduced partition pruning efficiency. They rely on remote metadata and do not provide the same performance characteristics as Snowflake-managed micro-partitions. For high-volume semi-structured IoT data, this results in suboptimal performance and provides no ingestion governance layer.

Option D suffers from the same flaw as option B, though to a lesser extent. Tasks executing COPY INTO every few minutes introduce fixed-frequency polling, wasting compute cycles when file arrival is irregular. While tasks are useful for scheduled ETL, they are not optimal for highly variable event-driven workloads. They also impose latency based on schedule granularity, not real-time arrival.

Thus, option A provides the most scalable, efficient, and architecturally aligned ingestion mechanism for IoT-style data streams in Snowflake.

Q42. A data governance team requires the ability to track changes made to sensitive tables for auditing purposes. They must maintain a historical log of every mutation while avoiding any direct performance impact on transactional workloads. Which Snowflake feature best satisfies these strict auditing requirements?

A Time Travel queries executed periodically
B Streams combined with change capture tables
C Fail-safe recovery as an audit data source
D Zero-copy clones refreshed daily

Answer: B

Explanation:

Option B streams combined with change capture tables is the correct auditing strategy because Snowflake streams provide a consistent and efficient mechanism to capture all row-level changes without interfering with transactional workloads. Streams track inserts, updates, and deletes by referencing internal change history. When consumed by a downstream ETL or auditing process, they provide a complete view of changes while preventing reprocessing. Streams operate independently of user activity and maintain minimal overhead, making them ideal for continuous auditing.
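
As a minimal sketch (the table and stream names are hypothetical), a stream on the sensitive table can feed an append-only audit log:

-- The stream exposes changed rows plus METADATA$ACTION and METADATA$ISUPDATE columns.
CREATE STREAM finance.accounts_audit_stream ON TABLE finance.accounts;

-- Assumes accounts_change_log has columns matching the stream output plus captured_at.
INSERT INTO finance.accounts_change_log
SELECT *, CURRENT_TIMESTAMP() AS captured_at
FROM finance.accounts_audit_stream;

Consuming the stream inside a DML statement advances its offset, so each change is logged exactly once.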

Option A Time Travel is not optimized for continuous auditing. While Time Travel can reconstruct table states at previous points, it does not explicitly list row-level changes. Regularly comparing table versions using Time Travel would require large scans and heavy compute usage. It would also introduce expensive operational workflows prone to inconsistency.

Option C is unsuitable because fail-safe is a disaster recovery mechanism, not a change auditing tool. Fail-safe provides access only during catastrophic recovery windows and requires Snowflake assistance. It has multi-day delays and cannot provide routine change logs.

Option D zero-copy clones offer point-in-time snapshots but do not capture incremental changes. Daily refreshes would miss updates occurring between snapshots and require additional logic to compute deltas. This hybrid approach creates gaps in auditing and fails to meet compliance-grade requirements.

Thus B is the most complete, efficient, and audit-ready solution in Snowflake.

Q43. A data engineering team needs to perform complex joins on large historical datasets without repeatedly scanning the entire table. They want the ability to isolate and reuse expensive computations across multiple queries. Which Snowflake feature supports this requirement most effectively?

A Transient tables
B Materialized views
C Temporary tables
D Secure views

Answer: B

Explanation:

Option B materialized views is correct because they store precomputed results derived from underlying tables and automatically maintain those results incrementally. Materialized views significantly reduce query latency for recurring analytical patterns, particularly when dealing with large historical datasets. They are ideal for scenarios involving expensive joins, aggregations, or filters. By storing precomputed data, they mitigate repeated scanning of massive tables and enable rapid response times across multiple sessions and workloads.
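
As a minimal sketch (names are hypothetical), a materialized view can precompute an expensive aggregation over a large history table so repeated queries avoid full scans:

CREATE MATERIALIZED VIEW analytics.daily_sales_mv AS
SELECT sale_date, region,
       SUM(amount) AS total_amount,
       COUNT(*)    AS txn_count
FROM analytics.sales_history
GROUP BY sale_date, region;

Subsequent queries can select from analytics.daily_sales_mv directly, and Snowflake keeps its contents current as the base table changes.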

Option A transient tables can store computed results but lack automatic maintenance. Engineers would need to continually refresh or rebuild them, adding operational burden and eliminating the primary benefit sought — automated reuse of expensive computations.

Option C temporary tables provide session-scoped results suitable for ephemeral analysis but cannot serve as reusable assets across multiple queries or user sessions. They also require manual maintenance and refreshes.

Option D secure views do not store results; they simply define a query layer to be executed at runtime. While secure views enhance data protection, they do not reduce compute cost for recurring complex joins because every query triggers full execution of the underlying logic.

Thus B materialized views best meet the performance and reuse requirements.

Q44. A Snowflake architect is optimizing a multi-tenant environment. Tenants require strict data isolation, but compute resources must be shared. The architect needs a method that ensures data visibility is isolated per tenant while still using centralized compute. Which Snowflake approach provides the most secure and scalable model?


A Give each tenant a dedicated warehouse but store data in shared tables
B Use role-based row access policies referencing tenant identifiers
C Maintain independent databases for every tenant
D Rely on network policies to segment tenant traffic

Answer: B

Explanation:

Option B is correct because row access policies allow fine-grained filtering based on tenant identifiers tied to user roles or session attributes. This ensures complete data isolation while maintaining shared storage objects and shared warehouses. Row access policies integrate directly into Snowflake’s security model, ensuring that every query is automatically filtered regardless of user intent. They remain scalable as the number of tenants grows and minimize operational overhead by centralizing logic.
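
As a minimal sketch (the mapping table, column names, and objects are hypothetical), a row access policy can filter a shared table by tenant based on the querying role:

CREATE ROW ACCESS POLICY governance.tenant_isolation
  AS (tenant_id STRING) RETURNS BOOLEAN ->
  EXISTS (
    SELECT 1
    FROM governance.role_tenant_map m
    WHERE m.role_name = CURRENT_ROLE()
      AND m.tenant_id = tenant_id
  );

ALTER TABLE app.orders
  ADD ROW ACCESS POLICY governance.tenant_isolation ON (tenant_id);

Every query against app.orders is then filtered automatically, regardless of which shared warehouse executes it.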

Option A provides compute isolation but no data isolation. If tenants share tables, they can potentially access each other’s data without strong row-level filtering. Dedicated warehouses only isolate compute workloads, not data visibility.

Option C is secure but not scalable. Maintaining separate databases leads to fragmentation, duplicate objects, increased maintenance burden, and complex governance. It requires separate schemas, grants, and sometimes redundant pipelines.

Option D network policies manage access at the IP or network level. They do not filter data visibility and cannot enforce multi-tenant isolation beyond connection control.

Thus B satisfies strict isolation with shared compute and unified governance.

Q45. A Snowflake performance specialist discovers that a large table has highly skewed micro-partitions due to uneven ingestion patterns. The team wants Snowflake to automatically optimize clustering over time without manually running frequent reclustering. Which feature best meets this requirement?

A Use clustering keys with automatic clustering enabled
B Convert the table into an external table
C Reduce table retention to minimize partition count
D Use materialized views to offload pruning logic

Answer: A

Explanation:

Option A is correct because automatic clustering ensures that Snowflake continuously monitors and optimizes micro-partition distribution for tables with defined clustering keys. Once enabled, automatic clustering runs in the background without requiring engineer intervention. It maintains orderly physical structuring that improves pruning efficiency and query performance, especially for large tables suffering from ingestion skew.
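
As a minimal sketch (the table and column names are hypothetical), defining a clustering key lets Snowflake's Automatic Clustering service recluster the table in the background:

ALTER TABLE warehouse.sensor_readings CLUSTER BY (reading_date, device_type);

-- Background reclustering can be paused or resumed per table if desired:
ALTER TABLE warehouse.sensor_readings SUSPEND RECLUSTER;
ALTER TABLE warehouse.sensor_readings RESUME RECLUSTER;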

Option B is incorrect because external tables rely on external metadata and do not benefit from Snowflake’s micro-partitioning, clustering statistics, or pruning optimizations.

Option C retention settings affect Time Travel and Fail-safe windows but do not meaningfully impact micro-partition skew. They do not enforce physical reorganization or correct ingestion-driven disorder.

Option D materialized views accelerate specific queries but do not modify the underlying table’s micro-partitions. They are not a substitute for clustering operations.

Thus A is the feature that ensures ongoing, automated micro-partition optimization.

Q46. A data engineering workflow requires rebuilding and testing an entire downstream transformation pipeline using historical data from a specific point in the past. The rebuild must not impact current production tables and must allow repeated experiments. Which capability is best suited for this?

A Time Travel combined with zero-copy clones
B Materialized views refreshed daily
C Transient tables used as experimental replicas
D Streams on historical tables

Answer: A

Explanation:

Option A is correct because Time Travel allows access to data as it existed at a specific point in time, and zero-copy clones allow creating complete replicas of tables or databases without duplicating storage. Combining these features enables engineers to clone tables from a historical timestamp, run experiments, modify cloned objects freely, and repeat the process as needed. This approach isolates experiments from production and preserves exact historical state.
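
As a minimal sketch (the names and timestamp are hypothetical), a clone pinned to a historical point in time gives an isolated, repeatable playground:

CREATE TABLE sandbox.orders_replay
  CLONE prod.orders
  AT (TIMESTAMP => '2024-01-15 08:00:00'::TIMESTAMP_LTZ);

The clone can be modified or dropped freely and recreated from the same timestamp for each new experiment, provided the timestamp falls within the table's Time Travel retention window.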

Option B materialized views do not provide historical snapshots and cannot reconstruct historical datasets.

Option C transient tables may support experiments but do not inherently provide historical versioning. Engineers would have to populate them manually, which increases complexity.

Option D streams only capture ongoing changes and do not serve as mechanisms for reconstructing full historical datasets.

Thus A best supports accurate historical modeling with isolated experimentation.

Q47. A finance analytics team wants to improve query performance on a table frequently filtered by date ranges. The team already uses clustering keys but still observes scans over unnecessary partitions. They suspect the clustering depth is deteriorating over time. What is the best Snowflake-native solution?

A Use manual reclustering commands on a scheduled basis
B Enable automatic clustering to maintain ordering
C Reduce the number of columns in the table
D Store all date values in a single VARIANT column

Answer: B

Explanation:

Option B is correct because automatic clustering ensures that Snowflake continuously reorganizes micro-partitions as new data arrives or as clustering depth degrades. This feature requires no manual scheduling and guarantees that clustering remains effective for pruning long-term. It directly addresses the issue of decreasing clustering efficiency.
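
As a minimal sketch (names are hypothetical), the clustering health of the date column can be inspected, and automatic clustering takes over once a clustering key is defined:

-- Reports clustering depth and partition overlap statistics for the given key.
SELECT SYSTEM$CLUSTERING_INFORMATION('finance.transactions', '(txn_date)');

ALTER TABLE finance.transactions CLUSTER BY (txn_date);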

Option A works, but it requires ongoing operational scheduling, warehouse allocation, and manual oversight. It is more expensive and less reliable than automatic clustering.

Option C reducing columns does not solve clustering depth issues. Column count has no direct relationship with partition ordering.

Option D placing dates into VARIANT would worsen performance. Filtering on nested fields prevents Snowflake from using physical ordering efficiently.

Thus B provides the cleanest long-term performance guarantee.

Q48. A global enterprise requires regional data residency while still allowing unified query access for headquarters analysts. They want minimal data movement and must ensure analytics can run without violating regional laws. Which Snowflake feature provides the most compliant and efficient design?

A Reader accounts
B External tables pointing to regional storage
C Database-level replication using read-only replicas
D Secure data sharing without copying data

Answer: D

Explanation:

Option D is correct because secure data sharing enables cross-region data access without physically copying data across borders. The provider region retains full residency compliance while consumers in other regions query shared data through Snowflake’s secure metadata and compute layers. This meets legal residency requirements while enabling centralized analytics without duplication or data movement.
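
As a minimal provider-side sketch (the database, schema, table, and account identifiers are hypothetical), a secure share grants read access to selected objects without copying data:

CREATE SHARE hq_analytics_share;
GRANT USAGE ON DATABASE regional_sales TO SHARE hq_analytics_share;
GRANT USAGE ON SCHEMA regional_sales.curated TO SHARE hq_analytics_share;
GRANT SELECT ON TABLE regional_sales.curated.orders TO SHARE hq_analytics_share;
ALTER SHARE hq_analytics_share ADD ACCOUNTS = myorg.hq_account;

The consumer account then creates a database from the share and queries it with its own compute.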

Option A reader accounts are useful for read-only external parties but do not address residency constraints or cross-region compliance.

Option B external tables still require data movement or cross-region access to object storage, which may violate residency laws if analysts query foreign storage.

Option C replication copies data physically, which directly contradicts the requirement to avoid cross-border movement. It violates many residency regulations.

Thus D is the most compliant and efficient strategy.

Q49. A Snowflake engineer needs to capture and analyze anomalies in ingestion frequency. They want a lightweight method to monitor whether files have stopped arriving without adding heavy compute usage. Which approach best fits this requirement?

A Query the load history table frequently using a small warehouse
B Use event notifications to trigger an alerting mechanism without compute
C Create a stream on the stage to detect file arrival events
D Run a COPY INTO command in a task every minute to verify arrival

Answer: B

Explanation:

Option B is correct because event notifications allow file arrival to be monitored without using any Snowflake compute, making them the optimal choice for lightweight anomaly detection in ingestion pipelines. Snowflake’s event notification framework integrates with cloud-native messaging services such as Amazon SNS, Azure Event Grid, or Google Cloud Pub/Sub, which emit events automatically when new files land in designated storage locations. Those events can be forwarded to alerting systems, logging platforms, monitoring dashboards, or custom anomaly detection services that analyze arrival patterns, validate expected volumes, or detect timing deviations without ever spinning up a warehouse.

Because cloud storage platforms emit events immediately upon file creation or modification, this approach provides real-time signaling with minimal latency: delayed, missing, undersized, or off-schedule files can be detected within seconds instead of waiting for the next polling cycle. The event-driven architecture also eliminates the compute cost of repeatedly querying metadata tables, since the cloud platform handles event generation and routing through its native infrastructure without consuming Snowflake credits. Multiple consumers can subscribe to the same event stream, allowing simultaneous forwarding to operational alerting tools such as PagerDuty or Datadog, compliance logging systems, and custom validation services that check file naming conventions, size thresholds, or arrival schedules against expected patterns.

Option A wastes compute because polling the load history table frequently requires repeated warehouse activation to query Snowflake’s metadata layer, generating cost for a monitoring-only operation. Each polling query consumes compute proportional to warehouse size and query complexity, and frequent polling intervals multiply those costs throughout the day, potentially spending more credits on monitoring than on actual data processing. Polling also introduces detection latency, since anomalies remain undetected between intervals, and it adds operational complexity around warehouse scheduling, query tuning, and result tracking.

Option C does not work because streams are designed to track changes in Snowflake tables through change data capture, recording inserts, updates, and deletes at the row level. They cannot detect file arrival events in external stages, because those files exist outside Snowflake’s table storage until they are explicitly loaded via COPY commands or Snowpipe. Using streams for stage monitoring misunderstands their architectural purpose: they operate on committed table changes, not file system events.

Option D incurs repetitive compute usage because a scheduled task must run its check on every execution, activating a warehouse whether or not any files have arrived or any anomaly exists. Most executions simply confirm normal operation, making this an inefficient pattern for monitoring-only purposes. It also forces a trade-off between detection latency and compute cost when choosing the schedule frequency, and it lacks the immediate responsiveness of an event-driven design that reacts the moment a file lands.

Thus Option B satisfies the lightweight monitoring requirement by leveraging cloud-native event infrastructure to deliver real-time anomaly detection without consuming Snowflake compute, while scaling with file arrival rates and integrating with external monitoring and observability platforms.

Q50. A data engineering lead wants to build a reliable, automated ingestion pipeline that processes new files, transforms the data, and loads it into curated tables. The pipeline must run end-to-end without manual intervention. What Snowflake capability provides the best orchestration?

A Tasks with AFTER dependencies chaining transformation steps
B Materialized views refreshed on demand
C External tables with manual follow-up processing
D Transient tables scheduled with COPY operations

Answer: A

Explanation:

Option A is correct because tasks linked with AFTER dependencies can orchestrate an entire ingestion-to-transformation pipeline natively inside Snowflake. Tasks support scheduled execution via cron-style expressions as well as conditional execution driven by table streams, so pipelines can adapt to batch or near-real-time arrival patterns. The AFTER clause builds a directed acyclic graph in which each task declares its prerequisites, and Snowflake’s scheduler guarantees that downstream steps run only after upstream steps complete, so transformation logic never processes incomplete or inconsistent data. This supports linear chains, parallel branches that process independent streams simultaneously, and convergence points where several upstream tasks must finish before a final aggregation runs.

The framework also provides built-in reliability: automatic retries for transient failures, failure propagation that halts dependents when an upstream stage fails, and monitoring through task history views that record status, duration, and errors. Entire pipelines can be defined declaratively with SQL DDL, and Snowflake manages execution timing, resource allocation, and recovery, delivering end-to-end automation without external schedulers, manual coordination, or custom scripting.
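
As a minimal sketch (the warehouse, stage, table names, and JSON paths are hypothetical), a root task loads new files and a dependent task transforms them only after the load finishes:

CREATE TASK load_raw_events
  WAREHOUSE = etl_wh
  SCHEDULE = '15 MINUTE'
AS
  COPY INTO raw.events FROM @raw.events_stage FILE_FORMAT = (TYPE = 'JSON');

CREATE TASK transform_events
  WAREHOUSE = etl_wh
  AFTER load_raw_events
AS
  INSERT INTO curated.events_clean
  SELECT payload:device_id::STRING,
         payload:reading::FLOAT,
         payload:event_ts::TIMESTAMP_NTZ
  FROM raw.events;

-- Child tasks are resumed before the root so the chain is active end to end.
ALTER TASK transform_events RESUME;
ALTER TASK load_raw_events RESUME;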

Option B cannot orchestrate multi-step pipelines because materialized views are passive objects that only maintain derived results from a predefined query over underlying tables. They have no scheduling, dependency management, or workflow coordination capabilities; each view refreshes independently as its source changes, without awareness of broader pipeline context. They cannot trigger downstream processes, coordinate ingestion, or enforce execution order, so they are unsuitable as an orchestration framework despite their value for query acceleration.

Option C relies on external tables with manual follow-up processing, which directly contradicts the requirement for end-to-end automation. External tables expose files in cloud storage for querying but do not load or transform data on their own, so every downstream curation step would depend on human intervention or additional tooling outside Snowflake. This introduces operational overhead, delays, and the risk of missed or inconsistently applied processing rather than the hands-off pipeline the team needs.

Option D does not solve the orchestration problem. Transient tables reduce storage cost by eliminating Fail-safe and limiting Time Travel, and they work well as staging areas for intermediate results, but they provide no workflow or dependency management. A COPY command that loads them is a single operation with no knowledge of downstream steps, pipeline state, or execution order, so chaining multiple operations into a cohesive workflow still requires external coordination or manual processes.

Thus Option A provides the most robust and automated orchestration model, integrating scheduling, dependency management, error handling, and workflow coordination into Snowflake’s native task framework so teams can build complete pipelines without maintaining separate orchestration infrastructure.

Q51. In a Snowflake environment, a data engineering team notices fluctuating performance when querying a shared table accessed by multiple virtual warehouses. Which configuration change would most effectively stabilize performance without significantly increasing costs?

A Increase the warehouse auto-suspend time to the maximum allowed
B Switch from a standard warehouse to a multi-cluster warehouse with minimum clusters set to one
C Enable clustering keys on the frequently filtered columns
D Create a larger single warehouse and disable auto-scaling

Correct Answer: B

Explanation:

Option B is correct because switching from a standard warehouse to a multi-cluster warehouse with the minimum cluster count set to one provides a highly pragmatic equilibrium between cost containment and performance consistency. Multi-cluster warehouses introduce concurrent processing elasticity, meaning that if sudden surges in workload occur, Snowflake automatically spins up additional clusters within the allowed range. When the minimum is one, the organization pays only for a single running cluster under typical load. This mitigates ephemeral performance degradation caused by concurrency spikes while preventing constant overspending. The mechanism disperses simultaneous user queries efficiently and visibly reduces queuing delays without requiring perpetual high-capacity infrastructure.
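
As a minimal sketch (the warehouse name and limits are hypothetical), a multi-cluster warehouse with a minimum of one cluster scales out only when concurrency demands it:

CREATE WAREHOUSE reporting_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4
  SCALING_POLICY = 'STANDARD'
  AUTO_SUSPEND = 300
  AUTO_RESUME = TRUE;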

Option A is not correct because increasing the auto-suspend time does not stabilize performance. Auto-suspend controls only billing continuity and resource idling. Extending the suspend interval keeps the warehouse running unnecessarily when idle, ultimately increasing costs but not enhancing throughput or concurrency handling. Auto-suspend adjustments are financially oriented rather than performance-oriented, making this an inefficient and ineffective optimization in this context.

Option C is not correct because although clustering keys can improve query performance for heavily filtered, large, and semi-structured data sets, they do not address concurrency-related resource contention. Clustering optimizes micro-partition pruning but cannot alleviate the symptomatic queuing delays triggered when many users simultaneously interrogate the same table. Additionally, clustering introduces maintenance overhead and potential re-clustering costs, further distancing it from the goal of maintaining stable performance under load fluctuation.

Option D is not correct because creating a significantly larger single warehouse only scales vertically, not horizontally. Vertical scaling does not mitigate concurrency-based queuing sufficiently, because a single compute cluster—no matter its size—has fundamental limitations when facing surges of simultaneous workloads. Disabling auto-scaling also eliminates Snowflake’s intrinsic elasticity, negating one of its most essential performance advantages. In practical Snowflake usage, a large static warehouse is rarely the most cost-efficient solution for concurrency issues.

Thus, option B is the only option aligning with Snowflake’s architectural model of horizontally elastic compute designed to accommodate variable workload volumes while controlling overall expenditure.

Q52. A governance administrator wants to ensure that sensitive financial columns are masked dynamically depending on the querying user’s role. Which Snowflake feature should be implemented to achieve this requirement with minimal overhead?

A Stream objects with tagging on change events
B Row-level security with user-defined sequences
C Dynamic data masking with masking policies applied to columns
D External tokenization before ingestion

Correct Answer: C

Explanation:

Option C is correct because dynamic data masking with masking policies is the precise Snowflake-native mechanism designed for conditional, role-aware data obfuscation. Masking policies can reference session context, including current roles, enabling administrators to craft rules that show unmasked values only to authorized roles while returning redacted, randomized, or partial values to unauthorized users. Masking policies also integrate naturally into Snowflake’s metadata system, making their implementation exceptionally low-friction. This approach requires no additional ETL overhead, supports consistent governance enforcement, and scales cleanly across schemas and tables.
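
As a minimal sketch (the role, table, and column names are hypothetical), a masking policy reveals raw values only to an authorized role:

CREATE MASKING POLICY governance.mask_account_number AS (val STRING)
  RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('FINANCE_ADMIN') THEN val
    ELSE '***MASKED***'
  END;

ALTER TABLE finance.payments
  MODIFY COLUMN account_number
  SET MASKING POLICY governance.mask_account_number;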

Option A is not correct because streams are intended for change tracking (CDC) and not for user-based masking. Tagging can help with identifying sensitive data objects, yet it does not control visibility at query time. Streams provide delta records for ingestion workflows but do not dynamically transform output based on role context. Therefore, streams fail to meet the requirement of real-time role-conditional masking.

Option B is not correct because row-level security controls which records a user can query but not how column values appear. Even with user-defined sequences or improvisational constructs, row-level security does not dynamically transform data visibility at the column level. It is effective for filtration—not masking.

Option D is not correct because external tokenization is performed before ingestion and typically handled outside Snowflake, often through dedicated tokenization vendors or on-prem systems. While tokenization protects sensitive data, it is static and irreversible within Snowflake unless detokenization pipelines exist externally. This conflicts with the requirement for dynamic, role-aware visibility since once data is tokenized outside Snowflake, Snowflake cannot natively reveal the original values conditionally.

Thus, dynamic data masking with masking policies is the only Snowflake-native, low-overhead method offering the precise, dynamic control required.

Q53. A data architect observes rapid expansions in storage usage caused by multiple transient tables created during iterative ELT processing. Which adjustment most effectively minimizes long-term storage consumption while preserving processing speed?

A Convert transient tables to temporary tables where feasible
B Increase the table retention period for easier cleanup
C Apply clustering keys to reduce micro-partition storage inflation
D Improve virtual warehouse size to complete transformations faster

Correct Answer: A

Explanation:

Option A is correct because temporary tables consume the least long-term storage among Snowflake table types; they persist only for the duration of the session and incur no Fail-safe period. Transient tables reduce Fail-safe but still retain Time Travel data for the configured period, making them more expensive storage-wise in high-volume ELT pipelines. When ELT workflows generate numerous intermediate objects, the ephemeral nature of temporary tables becomes ideal. They reduce residual metadata, minimize storage persistence, and avoid unnecessary retention charges.
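
As a minimal sketch (names are hypothetical), an intermediate ELT result can be staged in a temporary table that disappears with the session and carries no Fail-safe storage:

CREATE TEMPORARY TABLE stg_orders_dedup AS
SELECT DISTINCT *
FROM raw.orders_landing;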

Option B is not correct because increasing table retention periods does not lower storage consumption. Retention determines how long historical versions are kept, and extending it actually increases storage footprint. Even for cleanup convenience, longer retention counteracts the goal of minimizing long-term storage usage.

Option C is not correct because clustering is unrelated to storage expansion caused by transient or temporary staging tables. Micro-partitions are immutable, and clustering affects pruning efficiency, not the number of partitions created. Re-clustering could even increase storage consumption temporarily due to background reorganization.

Option D is not correct because increasing virtual warehouse size only affects compute speed, not storage accumulation. Faster transformations may marginally speed cleanup operations, but the core issue—storage retention of staging tables—is unrelated to compute power. Adjusting warehouse size does not mitigate lingering table remnants or retention-based storage charges.

Therefore, converting transient tables to temporary tables where appropriate is the most effective and direct method to significantly reduce persistent storage use while ensuring the ELT workflow remains fast and efficient.

Q54. A security engineer wants to enforce strict control over which client applications may connect to Snowflake. Which feature should be used to implement this restriction most effectively?

A Snowflake masking policies
B Network policy rules
C Secure UDFs
D Zero-copy cloning

Correct Answer: B

Explanation:

Option B is correct because network policies allow Snowflake administrators to restrict client access based on IP addresses, ensuring that only machines or applications originating from approved networks can connect. By explicitly whitelisting or blacklisting IP ranges, organizations create a robust, perimeter-level constraint on how Snowflake accepts inbound sessions. This is the canonical Snowflake feature designed specifically to regulate which client endpoints can establish connectivity, making it the most appropriate mechanism for enforcing application-based access hygiene.
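
As a minimal sketch (the CIDR ranges are placeholders), a network policy restricts which client networks may connect at the account level:

CREATE NETWORK POLICY corp_only_policy
  ALLOWED_IP_LIST = ('203.0.113.0/24', '198.51.100.10')
  BLOCKED_IP_LIST = ('203.0.113.99');

ALTER ACCOUNT SET NETWORK_POLICY = corp_only_policy;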

Option A is not correct because masking policies regulate data visibility at the column level and do not provide any restrictions on which client applications may initiate a connection. Masking protects sensitive fields but does not prevent unauthorized applications or networks from reaching the authentication stage.

Option C is not correct because secure UDFs restrict logic visibility and execution provenance, not client connectivity. They ensure that code definitions stay protected and operate securely, yet they neither control incoming IPs nor enforce client identity constraints.

Option D is not correct because zero-copy cloning pertains to data replication and environment management. Clones are efficient versions of datasets, but they offer no access-filtering capabilities. Cloning solves data provisioning problems, not application-level access control.

Thus, network policies remain the direct, purpose-built feature for enforcing strict connection constraints.

Q55. A Snowflake developer must implement an architecture allowing downstream tools to receive near-real-time change data from a table while ensuring minimal latency. Which choice best fulfills this requirement?

A Use streams in combination with tasks for continuous processing
B Use external tables synced via periodic refresh
C Use zero-copy cloning at frequent intervals
D Use stored procedures scheduled weekly

Correct Answer: A

Explanation:

Option A is correct because combining streams with tasks is Snowflake’s recommended design for near-real-time change data capture (CDC). Streams track row-level changes such as inserts, updates, and deletes, while tasks execute SQL on a schedule as frequently as one minute. Together they form an automated, low-latency processing pipeline that propagates delta data to downstream systems. This design supports scalable ingestion, minimizes manual intervention, and delivers CDC outputs reliably and efficiently.
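
As a minimal sketch (names are hypothetical), a stream captures row-level changes and a task propagates them every minute, but only when changes actually exist:

CREATE STREAM sales.orders_delta ON TABLE sales.orders;

CREATE TASK sales.push_order_changes
  WAREHOUSE = cdc_wh
  SCHEDULE = '1 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('SALES.ORDERS_DELTA')
AS
  -- The stream output includes METADATA$ACTION and METADATA$ISUPDATE columns.
  INSERT INTO sales.orders_changes
  SELECT * FROM sales.orders_delta;

ALTER TASK sales.push_order_changes RESUME;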

Option B is not correct because external tables depend on refresh cycles, which introduce lag that is ill-suited for near-real-time requirements. External tables work well for lake ingestion patterns but not for rapid CDC propagation. Their refresh mechanics create periodic delays rather than continuous streaming.

Option C is not correct because zero-copy cloning is a bulk replication mechanism, not an incremental CDC workflow. Clones do not communicate deltas and would require constant recreation to simulate change events, making them too slow and compute-intensive.

Option D is not correct because weekly scheduled stored procedures cannot satisfy near-real-time latency demands. Even if scheduled more frequently, procedures alone cannot detect changes without streams. Relying solely on procedures creates brittle logic and significant processing overhead.

Therefore, streams plus tasks is the optimal setup for low-latency change delivery.

Q56. A data analyst complains that queries against a large semi-structured JSON column have become sluggish. What is the most appropriate optimization strategy to improve performance without altering source data?

A Create a materialized view flattening the required JSON paths
B Re-cluster the table on all variant fields
C Increase virtual warehouse size permanently
D Use external stages instead of internal tables

Correct Answer: A

Explanation:

Option A is correct because materialized views precompute and physically store parsed JSON path elements, enabling Snowflake to serve common fields quickly without repeatedly scanning deeply nested variant structures. Materialized views are particularly effective when querying recurring paths, as they reduce repeated parsing overhead and micro-partition scans. This substantially accelerates semi-structured data queries while preserving the original raw JSON.
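
As a minimal sketch (the table, VARIANT column, and JSON paths are hypothetical), a materialized view can persist the commonly queried paths so they are parsed once rather than on every query:

CREATE MATERIALIZED VIEW analytics.events_flat_mv AS
SELECT
  raw:device_id::STRING           AS device_id,
  raw:metrics.temperature::FLOAT  AS temperature,
  raw:event_ts::TIMESTAMP_NTZ     AS event_ts
FROM analytics.raw_events;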

Option B is not correct because clustering semi-structured data on variant fields provides limited benefit. JSON fields often have high cardinality and volatile structures, making clustering unproductive. Additionally, clustering increases maintenance costs and might negatively impact write performance.

Option C is not correct because permanently increasing warehouse size raises costs without addressing the underlying inefficiency caused by complex JSON parsing. Larger compute improves speed but fails to optimize repeated path evaluation patterns.

Option D is not correct because external stages are storage locations, not query-performance optimizers. Moving raw JSON outside Snowflake would only degrade performance further since external data scanning is slower and lacks micro-partition pruning.

Thus, materialized views flattening JSON paths provide the most balanced and effective enhancement.

Q57. A newly onboarded team requires access to read-only production data while ensuring they cannot accidentally modify or delete records. What is the simplest Snowflake-native method to support this?

A Create a role with SELECT privilege only
B Apply dynamic masking policies on sensitive columns
C Use secure views to hide base tables
D Increase warehouse size to discourage modification attempts

Correct Answer: A

Explanation:

Option A is correct because creating a read-only role and granting SELECT privilege is the simplest and most appropriate solution. Snowflake’s privilege model ensures that without INSERT, UPDATE, DELETE, TRUNCATE, or MERGE permissions, users cannot modify or remove data. This method is minimalistic, clean, and aligns with least-privilege principles.
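
As a minimal sketch (the database, role, and user names are hypothetical), a read-only role is simply one that receives USAGE and SELECT but no DML privileges:

CREATE ROLE analyst_ro;
GRANT USAGE ON DATABASE prod TO ROLE analyst_ro;
GRANT USAGE ON ALL SCHEMAS IN DATABASE prod TO ROLE analyst_ro;
GRANT SELECT ON ALL TABLES IN DATABASE prod TO ROLE analyst_ro;
GRANT ROLE analyst_ro TO USER new_team_member;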

Option B is not correct because masking policies restrict data visibility, not modification privileges. Masking cannot prevent DML operations.

Option C is not correct because secure views hide underlying structures but still do not inherently prevent data manipulation unless the role lacks DML on the underlying objects. Secure views focus on data obfuscation and controlled logic exposure, not modification control.

Option D is not correct because warehouse size is irrelevant to permissions. Access control is enforced at the privilege level, not compute configuration.

Thus, granting only SELECT privileges is the simplest solution.

Q58. A Snowflake engineer wants to accelerate the ingestion of large CSV files stored in cloud object storage. Which configuration adjustment yields the most direct performance improvement?

A Using a larger virtual warehouse during COPY
B Using dynamic masking during ingestion
C Using row-level security on the staging table
D Compressing files using inefficient codecs

Correct Answer: A

Explanation:

Option A is correct because COPY INTO operations scale with warehouse size. Larger warehouses parallelize ingestion across more compute resources, thus reducing load time. Snowflake automatically distributes file processing among nodes, so increasing warehouse size directly accelerates ingestion throughput.
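
As a minimal sketch (names and sizes are hypothetical), the load warehouse can be sized up just for the bulk COPY and sized back down afterward:

ALTER WAREHOUSE load_wh SET WAREHOUSE_SIZE = 'XLARGE';

COPY INTO staging.web_logs
FROM @ext_stage/logs/
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"');

ALTER WAREHOUSE load_wh SET WAREHOUSE_SIZE = 'MEDIUM';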

Option B is not correct because masking policies apply during data access, not ingestion. They add no benefits and may even add overhead.

Option C is not correct because row-level security is irrelevant during ingestion; it controls user visibility, not ingestion speed.

Option D is not correct because inefficient codecs hinder performance by increasing decompression overhead. Efficient codecs such as gzip or snappy improve ingestion speed, whereas poor compression slows it.

Thus, larger compute is the most direct performance booster.

Q59. A Snowflake data team wants to rapidly provision a QA environment identical to production while minimizing storage consumption. Which approach best fulfills this?

A Zero-copy cloning of the production database
B Re-ingesting data manually
C Creating transient tables from scratch
D Exporting data and reloading it into another database

Correct Answer: A

Explanation:

Option A is correct because zero-copy cloning creates a new logical copy of an existing database without physically duplicating micro-partitions. Cloned objects reference the same underlying storage until modifications occur, making this the fastest and most storage-efficient method to replicate environments. It preserves table structures, schemas, and historical states when combined with Time Travel.
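
As a minimal sketch (the database names are hypothetical), the QA environment is provisioned in a single statement, sharing storage with production until data diverges:

CREATE DATABASE qa_env CLONE prod_db;

-- Optionally pin the clone to an earlier state with Time Travel (one hour ago here):
CREATE DATABASE qa_env_snapshot CLONE prod_db AT (OFFSET => -3600);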

Option B is not correct because manual re-ingestion is labor-intensive, time-consuming, and expensive. It also wastes storage because full duplicates are created.

Option C is not correct because transient tables do not copy production data by default. They require manual loading and do not provide identical state replication.

Option D is not correct because exporting and reloading creates full data copies, increasing storage use and complexity.

Zero-copy cloning is purpose-built for this scenario.

Q60. A performance engineer wants consistent query speed across a reporting workload that varies mildly throughout the day. Which setting stabilizes performance while keeping costs predictable?

A Multi-cluster warehouse with max clusters equal to one
B Multi-cluster warehouse with min and max clusters set to the same value
C Single warehouse with auto-suspend disabled
D Frequent zero-copy cloning of reporting tables

Correct Answer: B

Explanation:

Option B is correct because setting both minimum and maximum clusters to the same value creates a multi-cluster warehouse running a fixed number of clusters. This ensures stable performance by maintaining parallel compute capacity while keeping costs predictable because the number of active clusters never fluctuates. The workload receives consistent horizontal scaling, and Snowflake does not auto-add or auto-remove clusters throughout the day.
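
As a minimal sketch (the warehouse name and cluster count are hypothetical), pinning the minimum and maximum cluster counts to the same value keeps a constant amount of parallel compute online:

ALTER WAREHOUSE reporting_wh SET
  MIN_CLUSTER_COUNT = 2
  MAX_CLUSTER_COUNT = 2;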

Option A is not correct because setting max clusters to one essentially reverts to a single-cluster warehouse. This eliminates horizontal scaling entirely and fails to stabilize performance under mild workload variation.

Option C is not correct because disabling auto-suspend increases costs and has no inherent relation to performance stability. Warehouse suspension does not cause performance degradation; query concurrency does.

Option D is not correct because cloning does not improve query performance. It is a data-provisioning technique, not a performance stabilization mechanism.

Thus, a fixed multi-cluster configuration best satisfies this requirement.
