Q61. A Snowflake engineering group is building a multi-zone ingestion pipeline that ingests JSON payloads from numerous mobile applications. These payloads shift frequently as developers introduce new fields. The team needs an ingestion layer that can gracefully handle evolving structures while still enabling downstream consumers to extract standardized fields without causing ingestion failures. Which Snowflake-native design pattern best satisfies this requirement?
A Store all incoming JSON directly into a single table with strict typed columns
B Use a VARIANT column in a raw landing table and transform into structured tables downstream
C Require developers to freeze schemas and reject any new fields at ingestion time
D Pre-flatten JSON into dozens of rigid columns before loading into Snowflake
Answer: B
Explanation:
Option B is correct because placing JSON payloads into a VARIANT column within a raw landing table allows Snowflake to ingest all semi-structured data without enforcing rigid schema constraints. Payloads that include new or unexpected fields will not cause ingestion failures. This strategy preserves the fidelity of all received information while supporting late-binding schema evolution in curated layers. Downstream transformations can then standardize and extract specific fields that analytical consumers need, while still maintaining full access to the original structure. This aligns with best practices by allowing the raw layer to remain flexible and schema-on-read friendly.
Option A is incorrect because defining strict typed columns for every potential JSON attribute creates fragility. As payload fields evolve, ingestion operations may fail when encountering unexpected or missing attributes. This approach also inflates maintenance overhead because schema adjustments would be required constantly. It contradicts Snowflake’s natural handling of semi-structured data.
Option C fails conceptually and organizationally. Forcing developers to freeze schemas restricts agility and limits the value of modern mobile telemetry, where new fields and metrics appear frequently. Rejecting new fields results in data loss and contradicts the design principle of flexible ingestion pipelines.
Option D is suboptimal because pre-flattening JSON introduces rigidity and immediate schema coupling. It also results in extremely wide tables, many of whose columns may be sparsely populated or obsolete. This inflates storage and complicates future evolution. Additionally, initial flattening forces early assumptions about data structure before downstream analytics define their needs.
Therefore, B is the optimal pattern, offering robust adaptability, downstream flexibility, and strong resilience for evolving mobile-generated JSON data.
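As an illustration, a minimal Snowflake SQL sketch of this pattern, using hypothetical object names (raw_events, curated_events, payload):

    -- Raw landing table: accepts any JSON shape without ingestion failures
    CREATE TABLE raw_events (
        payload   VARIANT,
        loaded_at TIMESTAMP_LTZ DEFAULT CURRENT_TIMESTAMP()
    );

    -- Curated layer: extract standardized fields on read, keep the original payload
    CREATE TABLE curated_events AS
    SELECT
        payload:event_id::STRING  AS event_id,
        payload:device:os::STRING AS device_os,
        payload:ts::TIMESTAMP_NTZ AS event_ts,
        payload                   AS original_payload
    FROM raw_events;

New fields simply appear as additional keys inside payload and can be promoted to typed columns in the curated layer whenever consumers need them.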
Q62. A financial institution requires immutable historical snapshots of critical datasets for regulatory audits. Analysts must be able to query datasets exactly as they existed at specific points in time, even years later. However, ongoing workloads must remain unaffected. Which Snowflake-based strategy ensures compliant, queryable historical preservation with minimal storage overhead?
A Create full table copies for each monthly snapshot
B Use zero-copy clones combined with Time Travel to preserve point-in-time datasets
C Store historical data in external storage and re-import when needed
D Maintain a history table populated by DELETE triggers
Answer: B
Explanation:
Option B is correct because zero-copy clones combined with Time Travel allow creation of immutable, point-in-time snapshots without physically duplicating data. Clones reference the same underlying micro-partitions, so storage overhead is minimal. Regulatory audits often require exact historical views, and clones ensure the dataset remains frozen and queryable. When combined with Time Travel, users can extend the temporal reach of snapshotting so that even if changes occur, the older partitions remain preserved. This approach satisfies compliance needs by guaranteeing reproducibility without unnecessary storage consumption.
Option A is incorrect because full copies duplicate all micro-partitions, leading to enormous storage overhead over months or years. This approach is costly, inefficient, and operationally burdensome.
Option C offloads data externally but destroys queryability and Snowflake-native optimization. Re-importing for audits introduces delays, risks inconsistencies, and lacks the compliance-grade reliability required for regulated data.
Option D using DELETE triggers to capture historical states is problematic. Snowflake does not support triggers in the traditional relational sense, and even if implemented via external orchestration, the approach would miss UPDATE operations, multi-statement transactions, and temporal accuracy. This method also introduces unnecessary complexity and lacks regulatory robustness.
Thus B offers the cleanest, most compliant, storage-efficient solution for immutable historical snapshots.
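A minimal sketch of the snapshot pattern, assuming a hypothetical positions table and a timestamp that falls within the table's Time Travel retention window:

    -- Monthly snapshot via zero-copy clone; the timestamp must be inside the
    -- source table's Time Travel retention period
    CREATE TABLE positions_snapshot_2024_01
        CLONE positions
        AT (TIMESTAMP => '2024-01-31 23:59:59'::TIMESTAMP_LTZ);

    -- Auditors query the snapshot directly
    GRANT SELECT ON TABLE positions_snapshot_2024_01 TO ROLE auditor;

The clone shares micro-partitions with the source, so additional storage accrues only as the source table diverges from the snapshot.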
Q63. A Snowflake operations team wants to ensure that analysts do not accidentally consume excessive compute while running exploratory queries on massive datasets. They need a strategy that limits compute usage but still permits flexible ad hoc analysis. Which configuration best achieves this?
A Assign analysts to a dedicated X-Small warehouse with auto-suspend
B Restrict analysts to read-only access on small curated tables
C Disable auto-resume to prevent accidental activation
D Require analysts to use secure views only
Answer: A
Explanation:
Option A is correct because assigning analysts to a dedicated X-Small warehouse provides a controlled compute environment while still supporting broad analytical exploration. The small warehouse size naturally constrains the maximum compute cost per hour. Combined with auto-suspend, the warehouse shuts down quickly when idle, ensuring cost control. Analysts can run any query permitted by their role without threatening organization-wide compute budgets.
Option B is overly restrictive. Limiting analysts to small tables defeats the purpose of exploratory analytics and undermines data discovery. Analysts need access to full datasets or curated aggregates, not just a truncated subset.
Option C is counterproductive because disabling auto-resume prevents analysts from executing queries unless a warehouse is already running. This results in workflow friction and manual overhead without actually controlling compute costs efficiently.
Option D secure views enhance security but do not inherently limit compute consumption. Analysts can still cause heavy scans if the view definition involves large tables or complex joins.
Thus A provides the perfect balance between compute governance and analytical freedom.
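A sketch of such a warehouse, using hypothetical names and settings:

    -- Dedicated, cost-bounded compute for ad hoc analysis
    CREATE WAREHOUSE analyst_wh
        WAREHOUSE_SIZE = 'XSMALL'
        AUTO_SUSPEND   = 60      -- suspend after 60 seconds of inactivity
        AUTO_RESUME    = TRUE;

    GRANT USAGE ON WAREHOUSE analyst_wh TO ROLE analyst;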
Q64. A Snowflake architect is asked to design a workload-isolation strategy ensuring that BI dashboards, ELT pipelines, and data science experiments do not interfere with one another. BI workloads must remain fast even during heavy ELT processing. Which Snowflake configuration best supports this goal?
A Use a single large multi-cluster warehouse for all workloads
B Assign each workload a separate warehouse sized to its needs
C Increase the maximum cluster count on the main warehouse
D Restrict ELT to off-peak hours only
Answer: B
Explanation:
Option B is correct because assigning each workload a separate warehouse ensures complete compute isolation. BI dashboards can run smoothly on their own warehouse regardless of ELT workloads, preventing resource contention. Data science experiments—often compute-intensive—can run independently without degrading other performance. This approach leverages Snowflake’s elastic warehouse model to isolate workloads logically and operationally.
Option A is flawed because a single multi-cluster warehouse can still experience contention if clusters max out or if large ELT jobs overwhelm the resource pool. It also risks unpredictable performance fluctuations.
Option C increases concurrency but does not guarantee isolation. ELT workloads may still dominate scan operations and metadata pressure, affecting BI responsiveness.
Option D restricts operational flexibility. Snowflake is designed for 24/7 elasticity, and forcing ELT into off-peak windows contradicts that. BI users often need consistent performance regardless of time.
Thus B is the most reliable and scalable solution.
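For example, a sketch with hypothetical warehouse names and sizes, one warehouse per workload:

    -- Compute isolation by design: each workload gets its own warehouse
    CREATE WAREHOUSE bi_wh  WAREHOUSE_SIZE = 'SMALL'  AUTO_SUSPEND = 60  AUTO_RESUME = TRUE;
    CREATE WAREHOUSE elt_wh WAREHOUSE_SIZE = 'LARGE'  AUTO_SUSPEND = 120 AUTO_RESUME = TRUE;
    CREATE WAREHOUSE ds_wh  WAREHOUSE_SIZE = 'MEDIUM' AUTO_SUSPEND = 300 AUTO_RESUME = TRUE;

    GRANT USAGE ON WAREHOUSE bi_wh  TO ROLE bi_team;
    GRANT USAGE ON WAREHOUSE elt_wh TO ROLE elt_team;
    GRANT USAGE ON WAREHOUSE ds_wh  TO ROLE data_science;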
Q65. A Snowflake data engineering team notices that a table used for time-range filtering has become poorly organized after months of ingestion. Queries filtering by date now scan far more partitions than before. The team wants Snowflake to restore efficient ordering without manual execution overhead. What should they enable?
A Automatic clustering on the date column
B Fail-safe recovery to restore older partitions
C Time Travel to revert to an earlier physical arrangement
D Materialized views that pre-aggregate date ranges
Answer: A
Explanation:
Option A is correct because automatic clustering ensures Snowflake continually optimizes micro-partitions according to the defined clustering key, which in this case should be the date column. Over time, ingestion patterns lead to fragmentation and reduced pruning efficiency. Automatic clustering restores and maintains partition alignment, ensuring faster queries and lower compute usage.
Option B is unrelated to clustering. Fail-safe exists only for data recovery and cannot restore prior micro-partition organization.
Option C cannot restore physical order. Time Travel allows querying historical data but does not alter physical partition structure or improve pruning.
Option D materialized views help for specific aggregated queries but do not correct the underlying table’s physical ordering. They serve different analytical goals.
Thus A ensures sustained pruning efficiency with minimal operational overhead.
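A sketch assuming the table and column are named events and event_date:

    -- Define the clustering key; Snowflake's automatic clustering service then
    -- reclusters micro-partitions in the background as DML degrades ordering
    ALTER TABLE events CLUSTER BY (event_date);

    -- Optional: inspect clustering health for the key
    SELECT SYSTEM$CLUSTERING_INFORMATION('events', '(event_date)');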
Q66. A healthcare analytics platform must enforce strict HIPAA-compliant security. Analysts may query aggregated metrics, but they must not see detailed patient-level data unless specifically authorized. The solution must ensure row-level and column-level controls simultaneously. Which Snowflake design pattern best enforces these layered constraints?
A Secure views combined with row access policies and dynamic masking
B Materialized views containing only aggregated data
C Assign analysts read-only access to patient tables
D Store PHI in an external system and only load anonymized data into Snowflake
Answer: A
Explanation:
Option A is correct because combining secure views with row access policies and dynamic data masking enforces multi-layer governance: row-level filtering restricts which patient records are visible, masking policies obscure sensitive fields, and secure views hide underlying logic and metadata. This layered approach ensures that analysts only see aggregated or de-identified data unless explicitly authorized. It aligns well with HIPAA’s principle of minimum necessary access.
Option B materialized views alone cannot enforce granular restrictions. They may contain aggregated results, but analysts could still circumvent them by querying underlying tables unless access is separately restricted.
Option C read-only access does not protect sensitive fields or rows. Analysts would still see unmasked patient data, violating privacy rules.
Option D may reduce compliance risks but restricts Snowflake's analytical capabilities and forces unnecessary dependencies on external systems. It also fails to satisfy the requirement that some authorized users still need full detail.
Thus A delivers comprehensive PHI protection while enabling analytics.
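A compressed sketch of the layered pattern, with hypothetical table, column, role, and policy names; the actual policy logic would depend on how authorization is modeled:

    -- Row-level control: expose only rows the current role is entitled to see
    CREATE ROW ACCESS POLICY patient_rows AS (care_team STRING)
    RETURNS BOOLEAN ->
        CURRENT_ROLE() = 'PHI_AUTHORIZED' OR care_team = CURRENT_ROLE();

    ALTER TABLE patients ADD ROW ACCESS POLICY patient_rows ON (care_team);

    -- Column-level control: mask identifiers for unauthorized roles
    CREATE MASKING POLICY mask_mrn AS (val STRING)
    RETURNS STRING ->
        CASE WHEN CURRENT_ROLE() = 'PHI_AUTHORIZED' THEN val ELSE '***MASKED***' END;

    ALTER TABLE patients MODIFY COLUMN mrn SET MASKING POLICY mask_mrn;

    -- Secure view: hides underlying logic and exposes only aggregates
    CREATE SECURE VIEW agg_patient_metrics AS
    SELECT region, COUNT(*) AS patient_count
    FROM patients
    GROUP BY region;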
Q67. A Snowflake team wants to build an incremental transformation pipeline in which each run only processes newly arrived data since the previous run. They require consistent, reliable change tracking across multi-statement transactions. Which Snowflake-native feature provides this?
A Streams on the source table
B Time Travel on the staging schema
C Secure views applied to staging tables
D Manual auditing tables populated during ETL
Answer: A
Explanation:
Option A streams on the source table is correct because streams track inserts, updates, and deletes in a transactionally consistent manner. They allow incremental processing without scanning full datasets, and they manage offsets automatically so each change is consumed exactly once unless reset. Streams remain the canonical approach for incremental Snowflake transformations.
Option B Time Travel allows querying historical states but does not track incremental deltas cleanly. It requires computationally expensive comparisons between historical and current data.
Option C secure views provide restricted access but offer no mechanism for incremental capture.
Option D manual auditing tables are error-prone and cannot guarantee transactional consistency across multi-statement operations.
Thus A is the only Snowflake-native mechanism explicitly designed for incremental pipelines.
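A sketch assuming hypothetical orders and orders_curated tables:

    -- Change tracking on the source table
    CREATE STREAM orders_stream ON TABLE orders;

    -- Each run processes only the delta; the offset advances when the stream
    -- is read inside a successfully committed DML statement
    MERGE INTO orders_curated c
    USING orders_stream s
        ON c.order_id = s.order_id
    WHEN MATCHED AND s.METADATA$ACTION = 'INSERT' THEN
        UPDATE SET c.status = s.status
    WHEN NOT MATCHED AND s.METADATA$ACTION = 'INSERT' THEN
        INSERT (order_id, status) VALUES (s.order_id, s.status);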
Q68. A retail analytics team reports slow aggregation queries against massive fact tables. These queries filter by region, product category, and date. The team wants Snowflake to automatically reorganize physical storage to maximize pruning. What approach yields the strongest improvement?
A Define a compound clustering key on region, category, and date
B Partition the table manually into multiple smaller tables
C Convert the table into an external table
D Flatten all attributes into a single VARCHAR column
Answer: A
Explanation:
Option A is correct because a compound clustering key aligns Snowflake's micro-partition ordering with the queries' filter patterns, allowing the optimizer to use metadata-based partition pruning that dramatically reduces the volume of data scanned. Clustering on region, category, and date co-locates rows with similar values for those dimensions in the same or adjacent micro-partitions, which matches common access patterns such as regional sales analysis, category performance reporting, and time-series trend evaluation. When a query filters on the clustered columns, the optimizer checks each micro-partition's minimum and maximum values for those columns and eliminates partitions that cannot contain matching rows, often cutting scanned data by orders of magnitude without reading any data blocks. For fact tables containing billions of rows, this converts full scans into targeted partition access, reducing both query execution time and compute cost in proportion to the pruning achieved. Automatic clustering then keeps the layout healthy: Snowflake monitors clustering quality and transparently reclusters micro-partitions in the background using Snowflake-managed compute as inserts, updates, and deletes introduce disorder, with no manual intervention, maintenance windows, or administrative overhead, making it a self-healing optimization that adapts to evolving data and workload patterns.
Option B is operationally inefficient and creates significant maintenance burdens. Splitting one logical fact table into many physical partition tables turns simple single-table queries into multi-table operations requiring UNION ALL constructs or view abstractions, and it forces developers and analysts to understand the partitioning scheme and pick the right tables, creating opportunities for queries that omit necessary partitions or scan irrelevant ones. Every schema change, column addition, or constraint modification must be replicated across all partition tables, so administrative overhead grows with the partition count. The scheme is also rigid: it cannot easily adapt to changing query patterns, and cross-partition queries lose the performance benefit entirely by scanning every partition table. This contradicts Snowflake's architecture, which is designed to manage physical data organization automatically.
Option C sacrifices pruning efficiency because external tables reference files in cloud object storage outside Snowflake's native format. They lack the micro-partition metadata, columnar storage, and clustering optimizations that power Snowflake's partition pruning, so queries must read and parse raw files rather than skipping irrelevant data through metadata analysis.
Option D is an anti-pattern. Concatenating all filter attributes into a single VARCHAR column destroys column-level filtering: the optimizer can no longer see individual dimension values, partition pruning becomes impossible because the metadata holds concatenated strings rather than discrete values, and every query must resort to expensive string parsing and pattern matching instead of efficient equality or range comparisons on native data types.
Thus Option A provides the strongest pruning and performance gain. It aligns physical data organization with the analytical query patterns, enables metadata-driven partition elimination that minimizes scanning, lowers latency and compute cost, and keeps the optimization healthy over time through Snowflake's built-in clustering maintenance.
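A sketch assuming a hypothetical sales_fact table with region, product_category, and sale_date columns:

    -- Align micro-partition ordering with the dominant filter columns
    ALTER TABLE sales_fact CLUSTER BY (region, product_category, sale_date);

    -- Queries filtering on the clustered columns can now prune aggressively
    SELECT SUM(amount)
    FROM sales_fact
    WHERE region = 'EMEA'
      AND product_category = 'Footwear'
      AND sale_date BETWEEN '2024-01-01' AND '2024-03-31';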
Q69. A Snowflake administrator discovers that the data science team frequently executes compute-intensive models on shared warehouses, slowing down other workflows. The company wants guaranteed compute separation but must avoid unnecessary cost. What is the best solution?
A Give data science users their own separate warehouse sized appropriately
B Use multi-cluster warehouses in auto mode
C Increase the cluster count for the shared warehouse
D Restrict data science access to read-only operations
Answer: A
Explanation:
Option A is correct because giving data science users a dedicated warehouse guarantees compute isolation, preventing their resource-heavy queries from affecting other teams. The warehouse can be sized based on typical model workloads. This approach avoids the unnecessary cost of multi-cluster warehouses while still protecting performance for other users.
Option B multi-cluster warehouses increase cost significantly and are mainly designed for high concurrency, not workload isolation.
Option C increasing cluster count does not isolate workloads; it only offers temporary relief and increases compute usage.
Option D restricting data science users to read-only operations defeats their purpose entirely and is not feasible.
Thus A ensures optimal isolation and cost efficiency.
Q70. A Snowflake pipeline processes thousands of daily micro-batches. Each batch must load raw data, transform it, verify quality, and finally publish curated results. The team wants a dependable, automated orchestration flow that prevents downstream execution if upstream steps fail. Which Snowflake capability should they use?
A Tasks chained with AFTER dependencies
B External tables combined with secure views
C Snowpipe followed by manual scripts
D Repeated zero-copy cloning at each stage
Answer: A
Explanation:
Option A tasks chained with AFTER dependencies is correct because it provides robust orchestration with guaranteed sequencing. Each task waits for its predecessor to succeed before running, preventing bad data from propagating downstream. Tasks can trigger transformations, validations, and load steps automatically, creating a fully managed end-to-end pipeline inside Snowflake.
Option B external tables and secure views handle storage and access but cannot orchestrate multi-step processes.
Option C Snowpipe helps with ingestion but cannot orchestrate transformations or enforce sequential dependencies without additional external systems.
Option D cloning is useful for testing but cannot coordinate ongoing pipeline execution.
Thus A delivers the required reliability and automation.
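A sketch of such a chain, with hypothetical task, warehouse, stage, table, and procedure names:

    -- Root task runs on a schedule; each child runs only after its predecessor succeeds
    CREATE TASK load_raw
        WAREHOUSE = pipeline_wh
        SCHEDULE  = '5 MINUTE'
    AS
        COPY INTO raw_batches FROM @landing_stage;

    CREATE TASK transform_batch
        WAREHOUSE = pipeline_wh
        AFTER load_raw
    AS
        INSERT INTO staged_batches SELECT * FROM raw_batches;

    CREATE TASK publish_curated
        WAREHOUSE = pipeline_wh
        AFTER transform_batch
    AS
        CALL publish_results();  -- hypothetical validation/publish procedure

    -- Tasks are created suspended; resume the children first, the root last
    ALTER TASK publish_curated RESUME;
    ALTER TASK transform_batch RESUME;
    ALTER TASK load_raw RESUME;

Because child tasks are declared with AFTER, they have no schedule of their own and never run if their predecessor fails.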
Q71. A data engineer wants to ensure that a Snowflake task executing incremental transformations never overlaps with a previous run, even if the prior run experiences delays. Which configuration best guarantees serialized task execution?
A Configure tasks using a dedicated warehouse with auto-suspend disabled
B Use after-dependency chaining so each task waits for the previous one to finish
C Increase the task schedule frequency to avoid overlap
D Place the task inside a stored procedure that checks timestamps manually
Correct Answer: B
Explanation:
Option B is correct because after-dependency chaining is the intended Snowflake-native mechanism for sequential task execution. When a task is configured to run after another task, Snowflake ensures the downstream task only begins once the upstream task has completed successfully. This serialization pattern prevents overlaps altogether because Snowflake enforces dependency order at the engine level, eliminating timing-based interference between runs. It also avoids unnecessary complexity and does not require manual checks or inflated scheduling windows.
Option A is not correct because assigning the task a dedicated warehouse or disabling auto-suspend does not regulate task concurrency. Warehouses influence compute allocation, not execution order. Two tasks using the same warehouse can run concurrently regardless of warehouse size or suspend settings. This means computational adjustments cannot prevent overlapping operations and thus do not satisfy the requirement for serialized execution.
Option C is not correct because increasing the schedule frequency actually increases the likelihood of overlap rather than preventing it. Even if one attempts to manipulate frequency intervals, time-based scheduling can never guarantee non-overlapping execution, especially when workloads experience variable durations. A time-driven strategy is inherently unreliable for serialized workflows in data pipelines.
Option D is not correct because using a stored procedure to manually check timestamps is cumbersome, error-prone, and unnecessary when Snowflake already offers dependency-based execution. Manual timestamp logic introduces maintenance burden and brittleness, whereas Snowflake’s dependency mechanism provides deterministic, system-level enforcement.
Thus, after-dependency chaining is the only option that guarantees serialization without additional overhead.
Q72. A Snowflake architect must guarantee that staging data required only for a single user session never persists or becomes visible to other users. Which Snowflake table type is most appropriate?
A Permanent table
B Transient table
C Temporary table
D External table
Correct Answer: C
Explanation:
Option C is correct because temporary tables are specifically designed to persist only for the duration of the user session that created them. They are invisible to other users by default, consume no Fail-safe storage, and automatically disappear when the session ends. This ensures that sensitive or intermediate staging data never persists beyond the user’s active workflow. The ephemeral characteristic of temporary tables precisely fits the requirement for session-bound isolation.
Option A is not correct because permanent tables persist indefinitely until explicitly dropped, include Fail-safe retention, and are inherently visible to other authorized roles. This contradicts the requirement for data that must not be accessible beyond a single user session or visible to others.
Option B is not correct because transient tables eliminate the Fail-safe period but still persist across sessions and are visible to other users granted access. They are not intended for session-level isolation and therefore fail the privacy and longevity requirements stated.
Option D is not correct because external tables reference data stored in cloud storage and do not provide session-level visibility guarantees. They are shared objects that remain accessible as long as permissions allow, completely unsuitable for ephemeral session-long staging.
Thus, temporary tables fully satisfy the requirement for per-session isolation and non-persistence.
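For example, with hypothetical names:

    -- Exists only for the current session, is invisible to other users,
    -- and carries no Fail-safe; dropped automatically when the session ends
    CREATE TEMPORARY TABLE session_staging AS
    SELECT *
    FROM raw_events
    WHERE loaded_at >= DATEADD('day', -1, CURRENT_TIMESTAMP());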
Q73. A Snowflake administrator wants to block queries from unapproved geographic regions. Which solution best enforces this restriction within Snowflake?
A Create network policies using allowed IP ranges
B Define masking policies that hide data for unapproved users
C Configure custom sequences that invalidate sessions
D Use zero-copy clones restricted to approved users
Correct Answer: A
Explanation:
Option A is correct because Snowflake network policies enable administrators to allow or block connections based on specific IP ranges. Geographic restrictions are most effectively enforced by IP filtering, as regions can be mapped to known CIDR blocks or corporate VPN endpoints. Snowflake will deny session establishment attempts originating from non-whitelisted locations, thus preventing unauthorized geographic access even before authentication proceeds.
Option B is not correct because masking policies control column-level data visibility, not connection origin. They offer no protection against login attempts from unapproved regions, and users could still connect even if their data visibility is restricted. The requirement concerns connection-level blocking, not data masking.
Option C is not correct because sequences have no relation to access control. They generate numeric values and cannot enforce session invalidation or geographic filtering. Leveraging sequences for security controls is conceptually misaligned and technically impossible.
Option D is not correct because zero-copy clones replicate data environments but do not control geographic connections. Restricting access to clones does nothing to prevent unauthorized network-origin sessions from reaching Snowflake. Access to clones is permission-based, not geography-based.
Thus, network policies deliver the precise, Snowflake-native, connection-level filtering needed for the requirement.
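A sketch using placeholder CIDR ranges that would stand in for the approved regions' egress addresses:

    -- Only connections from the listed ranges are allowed; all others are
    -- rejected before authentication completes
    CREATE NETWORK POLICY corp_regions_only
        ALLOWED_IP_LIST = ('192.0.2.0/24', '198.51.100.0/24');

    -- Apply account-wide (policies can also be assigned to individual users)
    ALTER ACCOUNT SET NETWORK_POLICY = corp_regions_only;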
Q74. A company wants to maintain complete historical versions of tables for analytical comparison while ensuring that cloning operations remain efficient. Which Snowflake feature is essential for supporting this strategy?
A Using multi-cluster warehouses
B Using Time Travel with appropriate retention
C Using secure views
D Using user-managed encryption keys
Correct Answer: B
Explanation:
Option B is correct because Time Travel is Snowflake’s feature for maintaining historical versions of tables. It preserves past snapshots based on retention settings, enabling point-in-time queries and supporting zero-copy cloning across historical states. Cloning becomes exceptionally efficient because Snowflake stores only micro-partition deltas rather than full replicas. Organizations leveraging Time Travel can compare historical datasets, track changes over time, and produce lineage-compliant analytical artifacts without heavy storage burdens.
Option A is not correct because multi-cluster warehouses deal only with compute scaling for concurrency. They do not preserve historical data nor influence clone efficiency. Concurrency enhancements are unrelated to storing prior table versions.
Option C is not correct because secure views govern logic exposure and protect data by hiding view definitions from unauthorized users. However, they do not maintain historical records of table states or facilitate retrospective comparisons. Secure views do not interact with retention mechanisms.
Option D is not correct because user-managed encryption keys govern cryptographic control but do not influence historical versioning or zero-copy cloning. Encryption relates to compliance and data security, not temporal storage or cloning efficiency.
Therefore, Time Travel is the core mechanism underpinning historical version management and efficient cloning.
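A sketch using a hypothetical customer_balances table; 90-day retention requires Enterprise edition or higher:

    -- Extend Time Travel retention for the table
    ALTER TABLE customer_balances SET DATA_RETENTION_TIME_IN_DAYS = 90;

    -- Query the table as of a point in time within the retention window
    SELECT *
    FROM customer_balances AT (TIMESTAMP => '2024-06-30 23:59:59'::TIMESTAMP_LTZ);

    -- Clone that historical state for side-by-side comparison
    CREATE TABLE customer_balances_q2 CLONE customer_balances
        AT (TIMESTAMP => '2024-06-30 23:59:59'::TIMESTAMP_LTZ);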
Q75. A Snowflake developer needs to ingest hundreds of small files from cloud storage every hour. Processing them individually increases overhead. What is the best practice to optimize ingestion performance?
A Combine small files into fewer large files before COPY
B Enable search optimization on the staging table
C Increase the maximum clustering depth
D Execute COPY commands in a loop using a stored procedure
Correct Answer: A
Explanation:
Option A is correct because combining small files into larger files is a well-established Snowflake best practice that significantly improves ingestion efficiency. Small file fragmentation leads to excessive metadata operations and increased overhead because COPY INTO parallelizes work based on file count. Larger files better utilize warehouse compute, reduce metadata chatter, and accelerate throughput. Snowflake explicitly recommends fewer, larger files over numerous small ones to maximize COPY performance.
Option B is not correct because search optimization affects query performance, not ingestion throughput. Enabling it on staging tables would actually increase cost and provide no ingestion-related advantage. It is designed for selective queries against large datasets, not file loading.
Option C is not correct because clustering depth applies to micro-partition organization and has no bearing on ingestion behavior for small files. Snowflake does not re-cluster during COPY, and clustering improvements cannot reduce ingestion overhead caused by file count fragmentation.
Option D is not correct because executing COPY inside a loop still processes each small file individually. This does not fix the underlying issue of inefficiently small file size. The loop introduces additional complexity and does not align with Snowflake ingestion guidance.
Thus, combining small files into larger, optimized batches is the most effective way to accelerate ingestion.
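A sketch assuming the small files have already been consolidated into larger objects (Snowflake's guidance is on the order of 100-250 MB compressed per file) under a hypothetical stage path, loading into a single-VARIANT-column table:

    COPY INTO raw_json
    FROM @landing_stage/consolidated/
    FILE_FORMAT = (TYPE = 'JSON')
    ON_ERROR = 'CONTINUE';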
Q76. A Snowflake analyst creates a materialized view to accelerate aggregation queries but notices sudden increases in warehouse consumption. What is likely the cause?
A Maintenance of materialized view freshness
B Automatic secure view encryption
C Increase in Time Travel retention
D Network policy enforcement overhead
Correct Answer: A
Explanation:
Option A is correct because materialized views require compute to maintain freshness whenever underlying base tables change. As data is inserted, updated, or deleted, Snowflake recalculates the affected micro-partitions of the materialized view. This incremental refresh consumes warehouse compute cycles, and heavy DML workloads can dramatically increase consumption. Since materialized views store precomputed results, the maintenance process inevitably uses compute, making it the most plausible cause of increased warehouse load.
Option B is not correct because there is no secure view encryption process that consumes warehouse compute during query execution or maintenance. Secure views merely hide their definitions from unauthorized users and have negligible runtime cost.
Option C is not correct because Time Travel retention affects storage usage, not warehouse computation. Increasing retention may increase the amount of historical micro-partitions kept, but it does not introduce additional compute overhead related to query acceleration or materialized views.
Option D is not correct because network policy enforcement is handled at the connection level, not the warehouse level. It does not impact compute consumption and therefore cannot be responsible for increased warehouse activity.
Thus, materialized view refresh maintenance is the direct driver of elevated warehouse resource usage.
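A sketch with hypothetical names; the refresh activity and its credit consumption can be reviewed with the MATERIALIZED_VIEW_REFRESH_HISTORY table function (materialized views require Enterprise edition or higher):

    CREATE MATERIALIZED VIEW daily_sales_mv AS
    SELECT sale_date, SUM(amount) AS total_amount
    FROM sales_fact
    GROUP BY sale_date;

    -- Inspect background refreshes triggered by DML on the base table
    SELECT *
    FROM TABLE(INFORMATION_SCHEMA.MATERIALIZED_VIEW_REFRESH_HISTORY(
        MATERIALIZED_VIEW_NAME => 'DAILY_SALES_MV'));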
Q77. A data ingestion specialist wants to ensure that COPY INTO commands automatically load newly arrived files in cloud storage without manually running SQL. Which approach best accomplishes this in Snowflake?
A Use Snowpipe with event notifications
B Use role-based access control
C Use secure UDFs to process staged files
D Use network policies to monitor storage buckets
Correct Answer: A
Explanation:
Option A is correct because Snowpipe is Snowflake’s automatic ingestion service that continuously loads files as they arrive in cloud storage. When used with event notifications from supported providers, Snowpipe activates immediately upon file arrival, triggering ingestion without human intervention. This automation reduces latency, supports continuous pipelines, and eliminates the need to manually run COPY commands.
Option B is not correct because role-based access control governs permissions, not automation. RBAC does not load files nor trigger ingestion events.
Option C is not correct because secure UDFs execute procedural logic but cannot autonomously monitor storage changes. Their purpose is secure code execution, not file event detection or ingestion orchestration.
Option D is not correct because network policies apply to user IP filtering, not cloud storage monitoring. Snowflake cannot use a network policy to track files arriving in external storage platforms.
Thus, Snowpipe coupled with event notifications is the correct automation mechanism.
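A sketch assuming a hypothetical stage and a single-VARIANT-column target table; the storage bucket's event notifications must be wired to the pipe's notification channel:

    -- AUTO_INGEST pipe: cloud storage events trigger the load automatically
    CREATE PIPE events_pipe
        AUTO_INGEST = TRUE
    AS
        COPY INTO raw_json
        FROM @landing_stage/events/
        FILE_FORMAT = (TYPE = 'JSON');

    -- The notification_channel column identifies where to point bucket events
    SHOW PIPES LIKE 'events_pipe';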
Q78. A Snowflake engineer needs to grant a partner organization access to a curated subset of data without physically copying the data. Which Snowflake capability is most appropriate?
A Data sharing using reader accounts or direct share
B Creating permanent tables in a separate database
C Using external tables and exporting metadata
D Creating transient tables populated from production
Correct Answer: A
Explanation:
Option A is correct because Snowflake’s data sharing framework enables instant, zero-copy access to datasets for external partners. Whether using reader accounts or direct shares, Snowflake exposes secure, real-time access without duplicating micro-partitions. This reduces maintenance overhead, improves governance, and ensures partners always see the most current shared data. Data sharing is explicitly designed for inter-organizational collaboration.
Option B is not correct because creating permanent tables in a separate database still requires data copying. This contradicts the requirement of avoiding physical duplication and increases storage use.
Option C is not correct because exporting metadata through external tables provides no actual access to live data. External tables point to external storage, not curated Snowflake datasets, and do not support zero-copy visibility.
Option D is not correct because transient tables require populating data from production, creating another copy. They also lack Fail-safe and are unsuitable for controlled external sharing.
Thus, data sharing is the correct capability.
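A sketch with hypothetical database, schema, view, and account names:

    -- Expose a curated secure view through a share; no data is copied
    CREATE SHARE partner_share;
    GRANT USAGE  ON DATABASE analytics                      TO SHARE partner_share;
    GRANT USAGE  ON SCHEMA   analytics.curated              TO SHARE partner_share;
    GRANT SELECT ON VIEW     analytics.curated.partner_kpis TO SHARE partner_share;

    -- Add the partner's own Snowflake account (or provision a reader account
    -- if the partner has no Snowflake account of their own)
    ALTER SHARE partner_share ADD ACCOUNTS = partner_org.partner_account;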
Q79. A Snowflake administrator wants to monitor changes made to tables, including inserts, updates, and deletes, for auditing purposes. Which feature best supports this requirement?
A Streams
B Secure views
C Materialized views
D Network policies
Correct Answer: A
Explanation:
Option A is correct because streams are Snowflake's built-in change tracking mechanism, designed specifically to capture row-level DML changes with metadata that supports auditing, ETL orchestration, and change data capture workflows. A stream maintains an offset against its source table and records committed inserts, updates, and deletes, so administrators can query exactly which modifications occurred since the stream was last consumed. Each change row carries metadata columns identifying the change type and whether it was part of an update, giving the granular delta visibility needed for compliance auditing, data lineage tracking, and forensic analysis of data modifications. Changes remain available in the stream until they are consumed and the offset advances, so nothing is lost if a monitoring process fails or falls behind, and multiple streams can be created on the same table so that auditing systems, ETL pipelines, and CDC replication processes can each track changes independently. This makes streams the canonical feature for tracking table changes in Snowflake: administrators can see which rows changed, what values were modified, and when the transactions committed, all without custom trigger logic, application instrumentation, or external change-tracking infrastructure. Streams integrate with Snowflake's transactional model, capturing changes atomically as part of committed transactions, and they complement Time Travel for point-in-time analysis of change patterns, which makes them well suited to regulatory audit trails, data quality monitoring, incremental processing pipelines, and replication architectures that synchronize data across regions.
Option B is not correct because secure views obfuscate underlying query logic and can mask sensitive data, but they do not capture historical modifications or provide any change tracking. A secure view is a passive virtual layer that executes its SQL at query time, filtering rows or masking columns to present a controlled perspective; it has no awareness of DML operations, transaction history, or how the data has evolved. Secure views control what users can see, not audit what users have done, so they are inappropriate for change tracking.
Option C is not correct because materialized views store precomputed results to accelerate analytical queries, not to record the DML changes that trigger their refreshes. Although a materialized view refreshes automatically when its base table changes, the refresh is opaque: it does not expose which rows were inserted, updated, or deleted, what values changed, or when the modifications occurred. Materialized views lack the change-tracking metadata, delta visibility, and historical preservation that auditing requires.
Option D is not correct because network policies restrict access to a Snowflake account based on IP address allowlists or blocklists. They control which network locations can establish connections, but they provide no visibility into database operations, DML activity, or data changes once a user is connected. Network policies serve perimeter security, not data governance or auditing, so they are unrelated to tracking table changes.
Thus, streams best fulfill the auditing requirement: they natively capture row-level DML operations with rich metadata, retain change history until it is consumed, allow independent tracking through multiple streams on the same table, and integrate with Snowflake's transactional architecture to deliver reliable audit trails without custom development or external infrastructure.
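A sketch with hypothetical names; note that merely selecting from a stream does not advance its offset, so pending changes can be reviewed repeatedly until they are consumed by a DML statement:

    -- A stream dedicated to audit review, separate from any ETL stream
    CREATE STREAM patients_audit_stream ON TABLE patients;

    -- Returns changed rows plus METADATA$ACTION, METADATA$ISUPDATE,
    -- and METADATA$ROW_ID columns describing each change
    SELECT * FROM patients_audit_stream;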
Q80. A performance engineer wants a Snowflake warehouse to automatically scale out during moderate peak times but avoid cost spikes caused by excessive scaling. Which configuration is optimal?
A Multi-cluster warehouse with min=1 and max=2
B Multi-cluster warehouse with min=3 and max=10
C Single warehouse with auto-suspend disabled
D Warehouse with disabled auto-scale
Correct Answer: A
Explanation:
Option A is correct because a multi-cluster warehouse with min=1 and max=2 provides limited horizontal scaling that handles mild peaks while preventing runaway cost, striking the right balance between elasticity and cost predictability for moderately variable workloads. When queries begin to queue or the first cluster saturates, Snowflake automatically starts a second cluster, so users see minimal wait times during peak periods. The maximum of two clusters is a hard ceiling on compute cost: capacity doubles relative to a single cluster, which is typically enough for predictable business cycles such as month-end reporting, morning login surges, or scheduled dashboard refreshes, and the worst-case hourly spend is simply the warehouse credit rate times two, which keeps budget forecasting straightforward. The min=1 setting lets the warehouse drop back to a single cluster during off-peak periods, so the organization pays only for baseline capacity when traffic normalizes. This bounded configuration also prevents extreme cluster growth caused by temporary load spikes, query storms, or inefficient SQL that could otherwise generate unexpectedly large bills, making it ideal when cost governance matters but some automatic scaling is still required.
Option B is not correct because allowing up to 10 clusters introduces significant cost unpredictability and potential budget overruns, which contradicts the requirement for controlled scaling. Ten clusters is heavy-duty concurrency infrastructure for enterprise workloads with hundreds of concurrent users, far beyond the mild peaks described. At full scale-out, compute cost could reach ten times the baseline rate, and a single hour at maximum scaling during an unexpected query storm or poorly optimized workload could consume an entire day's typical compute budget.
Option C is not correct because disabling auto-suspend increases cost without improving elasticity. Auto-suspend only controls whether an idle warehouse shuts down to conserve credits; disabling it forces the warehouse to run continuously, consuming credits around the clock even during nights and weekends, while doing nothing to add scale-out capacity or concurrency handling during peaks.
Option D is not correct because disabling auto-scale removes the warehouse's ability to respond to load fluctuations, contradicting the requirement for automated peak-time scaling. The warehouse stays at its minimum cluster count regardless of query queuing or concurrency pressure, so users experience slower response times and potential timeouts during peaks. Capacity changes would then require manual administrator intervention, introducing operational overhead, response delays, and service degradation whenever congestion emerges faster than administrators can react.
Thus, min=1 and max=2 is the ideal configuration: it provides automated elasticity for mild peaks, keeps compute expenses predictable through a bounded scaling limit, and avoids both the financial risk of aggressive scaling and the performance limitations of static capacity.
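A sketch of such a warehouse, with a hypothetical name and size; multi-cluster warehouses require Enterprise edition or higher:

    -- Bounded elasticity: scales out to at most two clusters under concurrency pressure
    CREATE WAREHOUSE reporting_wh
        WAREHOUSE_SIZE    = 'MEDIUM'
        MIN_CLUSTER_COUNT = 1
        MAX_CLUSTER_COUNT = 2
        SCALING_POLICY    = 'STANDARD'
        AUTO_SUSPEND      = 60
        AUTO_RESUME       = TRUE;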