Databricks Certified Generative AI Engineer Associate Exam Dumps and Practice Test Questions Set 2 Q 21-40

Visit here for our full Databricks Certified Generative AI Engineer Associate exam dumps and practice test questions.

Question 21

Which component in a Retrieval-Augmented Generation (RAG) system is responsible for converting text into numerical representations?

A) Query processor

B) Embedding model

C) Language model

D) Document parser

Answer: B

Explanation:

Embedding models convert text into dense numerical vector representations that capture semantic meaning, enabling mathematical similarity comparisons between text passages. These models transform both documents during indexing and user queries during retrieval into comparable vector representations in high-dimensional space.

Embedding models use neural networks trained to position semantically similar text close together in vector space. Popular embedding models include sentence transformers, OpenAI embeddings, and domain-specific models fine-tuned for particular content types. The quality of embeddings directly impacts retrieval accuracy since similarity searches depend on meaningful vector representations capturing semantic relationships.

Option A is incorrect because query processors parse and prepare user queries for processing but do not perform the text-to-vector transformation that enables semantic similarity search in vector databases.

Option C is incorrect because language models generate text responses based on retrieved context but do not create the vector embeddings used for document retrieval and similarity matching in RAG systems.

Option D is incorrect because document parsers extract and structure text from various formats like PDFs or HTML but do not convert text into numerical vector representations required for semantic search.

Selecting appropriate embedding models for specific domains and content types is critical for RAG system performance and retrieval accuracy.

Question 22

What is the primary purpose of implementing prompt templates in generative AI applications?

A) To reduce model training time

B) To create reusable prompt structures with variable placeholders

C) To compress model weights

D) To manage database connections

Answer: B

Explanation:

Prompt templates provide reusable prompt structures with variable placeholders that can be dynamically populated with specific content, enabling consistent prompt formatting across applications while allowing customization for individual requests. Templates separate prompt structure from variable content, improving maintainability and enabling systematic prompt engineering.

Templates typically include system instructions, context placeholders, user query insertion points, and output format specifications. This approach enables version control for prompts, A/B testing of different structures, and consistent behavior across application instances. Templates support various placeholder syntaxes and can include conditional logic for adapting prompts based on context.

Option A is incorrect because prompt templates affect inference-time behavior rather than model training, which involves adjusting model weights through learning algorithms on training datasets over extended periods.

Option C is incorrect because prompt templates structure input text for language models but do not compress model weights, which involves quantization, pruning, or knowledge distillation techniques reducing model size.

Option D is incorrect because prompt templates format inputs for language models rather than managing database connections, which are handled by connection pooling, drivers, and database management systems.

Well-designed prompt templates improve consistency, enable collaboration, and simplify maintenance of generative AI applications.

Question 23

Which technique helps reduce hallucinations in large language model responses?

A) Increasing model temperature

B) Retrieval-Augmented Generation with factual sources

C) Removing all constraints

D) Using only creative prompts

Answer: B

Explanation:

Retrieval-Augmented Generation grounds model responses in factual source documents by retrieving relevant information and including it in prompts, significantly reducing hallucinations where models generate plausible-sounding but incorrect information. RAG provides verifiable context that models use when formulating responses rather than relying solely on potentially outdated or incorrect training data.

RAG systems retrieve documents from trusted knowledge bases, extract relevant passages, and inject them into prompts with instructions to base responses on provided context. This approach enables citation of sources, allows content verification, and reduces fabricated information. Additional techniques include lower temperature settings for more deterministic outputs and explicit instructions to acknowledge uncertainty.

Option A is incorrect because increasing temperature makes model outputs more random and creative, actually increasing hallucination likelihood rather than reducing it by making the model more likely to generate unexpected token sequences.

Option C is incorrect because removing constraints typically increases hallucination risk by allowing models complete freedom in generation without guidance, verification requirements, or factual grounding mechanisms.

Option D is incorrect because creative prompts without factual grounding encourage imaginative responses that may prioritize novelty over accuracy, potentially increasing rather than decreasing hallucination rates.

Combining RAG with other techniques like temperature control and uncertainty acknowledgment provides robust hallucination mitigation.

Question 24

What is the primary function of a vector database in generative AI applications?

A) To store relational data only

B) To enable efficient similarity search over high-dimensional embeddings

C) To compile code

D) To manage user authentication

Answer: B

Explanation:

Vector databases specialize in storing, indexing, and querying high-dimensional vector embeddings enabling efficient similarity searches that identify semantically related content. These databases use specialized indexing algorithms like HNSW, IVF, or product quantization that optimize nearest neighbor searches in high-dimensional spaces where traditional indexing fails.

Vector databases support RAG systems by quickly finding relevant documents given query embeddings, typically returning top-k most similar vectors in milliseconds even with millions of indexed documents. They provide metadata filtering, hybrid search combining vector and keyword matching, and APIs for inserting, updating, and deleting vectors with associated metadata.

Option A is incorrect because vector databases are specifically designed for high-dimensional vector data rather than relational data with tables, rows, and foreign keys, though some solutions support hybrid storage.

Option C is incorrect because vector databases store and query embeddings for similarity search rather than compiling code, which is performed by compilers transforming source code into executable formats.

Option D is incorrect because vector databases focus on similarity search over embeddings rather than managing user authentication, which involves identity providers, credential validation, and access control systems.

Popular vector databases include Pinecone, Weaviate, Chroma, and FAISS for building production RAG systems.

Question 25

Which parameter controls the randomness of large language model outputs?

A) Learning rate

B) Temperature

C) Batch size

D) Embedding dimension

Answer: B

Explanation:

Temperature controls output randomness by scaling logits before applying softmax during token selection, with lower temperatures producing more deterministic outputs and higher temperatures increasing diversity and creativity. Temperature of 0 selects the highest probability token deterministically, while higher values flatten the probability distribution enabling less likely tokens.

Temperature values typically range from 0 to 2, with 0.7-1.0 providing balanced outputs suitable for most applications. Lower temperatures (0.1-0.5) suit factual tasks requiring consistency like information extraction or structured outputs, while higher temperatures (1.0-2.0) benefit creative tasks like story generation or brainstorming requiring diversity.

Option A is incorrect because learning rate controls parameter update step size during model training rather than output randomness during inference when generating responses.

Option C is incorrect because batch size determines how many training examples are processed together during training for gradient computation but does not affect output randomness during inference.

Option D is incorrect because embedding dimension defines vector representation size for tokens or documents but does not control output randomness, which is determined by token sampling strategies.

Temperature is one of several sampling parameters including top-p and top-k that collectively control generation characteristics.

Question 26

What is the primary purpose of fine-tuning a large language model?

A) To increase model size

B) To adapt model behavior for specific tasks or domains

C) To reduce inference latency only

D) To convert text to embeddings

Answer: B

Explanation:

Fine-tuning adapts pre-trained language models to specific tasks, domains, or organizational requirements by continuing training on specialized datasets, modifying model behavior while leveraging knowledge acquired during pre-training. This technique achieves better performance than prompting alone for specialized applications requiring consistent behavior or domain expertise.

Fine-tuning adjusts model weights through additional training iterations on task-specific or domain-specific data, teaching models specialized vocabulary, writing styles, reasoning patterns, or factual knowledge. Techniques include full fine-tuning updating all parameters, parameter-efficient methods like LoRA updating small adapter layers, and instruction tuning for improved prompt following.

Option A is incorrect because fine-tuning adapts existing model behavior rather than increasing model size, which would involve changing model architecture by adding layers, increasing hidden dimensions, or expanding vocabulary.

Option C is incorrect because fine-tuning primarily adapts model behavior and capabilities rather than reducing inference latency, which requires techniques like quantization, pruning, or knowledge distillation.

Option D is incorrect because fine-tuning adapts language model behavior rather than converting text to embeddings, which is performed by specialized embedding models or encoder components.

Fine-tuning enables organizations to create specialized models aligned with specific requirements without training from scratch.

Question 27

Which evaluation metric measures the relevance of retrieved documents in a RAG system?

A) Perplexity

B) Recall at K or Mean Reciprocal Rank

C) BLEU score

D) Token count

Answer: B

Explanation:

Recall at K measures the proportion of relevant documents appearing in the top K retrieval results, while Mean Reciprocal Rank evaluates the rank position of the first relevant document, both assessing retrieval component effectiveness. These metrics evaluate whether the retrieval system surfaces relevant information needed for accurate response generation.

Recall at K is calculated as the number of relevant documents in top K results divided by total relevant documents, with higher values indicating better retrieval coverage. MRR averages the reciprocal rank of first relevant documents across queries, emphasizing early relevant results. Additional retrieval metrics include Precision at K, NDCG, and Mean Average Precision.

Option A is incorrect because perplexity measures language model uncertainty or surprise at predicting token sequences during generation rather than evaluating document retrieval relevance or ranking quality.

Option C is incorrect because BLEU score compares generated text to reference translations measuring n-gram overlap, used for evaluating generation quality rather than retrieval relevance.

Option D is incorrect because token count measures text length but does not evaluate retrieval relevance, document ranking quality, or whether retrieved content addresses user information needs.

Comprehensive RAG evaluation requires both retrieval metrics and generation quality metrics to assess end-to-end system performance.

Question 28

What is the primary benefit of using chain-of-thought prompting with large language models?

A) To reduce model size

B) To improve reasoning by encouraging step-by-step problem decomposition

C) To increase training speed

D) To compress embeddings

Answer: B

Explanation:

Chain-of-thought prompting encourages language models to generate intermediate reasoning steps before final answers, improving performance on complex reasoning tasks by breaking problems into manageable sub-problems. This technique makes reasoning explicit and verifiable while helping models avoid logical errors through systematic problem decomposition.

CoT prompting includes examples demonstrating step-by-step reasoning or explicit instructions to think through problems systematically. The approach particularly benefits mathematical reasoning, logical puzzles, multi-step planning, and tasks requiring sequential decision-making. Variations include zero-shot CoT using prompts like “Let’s think step by step” and few-shot CoT providing reasoning examples.

Option A is incorrect because chain-of-thought prompting is an inference-time technique affecting how models generate responses rather than reducing model size, which requires architectural changes or compression techniques.

Option C is incorrect because CoT prompting improves inference quality rather than training speed, which depends on computational resources, batch sizes, optimization algorithms, and distributed training configurations.

Option D is incorrect because CoT prompting enhances reasoning during generation rather than compressing embeddings, which involves dimensionality reduction techniques like PCA or specialized compression algorithms.

Chain-of-thought prompting represents a significant advancement in eliciting reasoning capabilities from large language models.

Question 29

Which component in MLflow tracks experiments for generative AI model development?

A) MLflow Models only

B) MLflow Tracking with experiments and runs

C) MLflow Projects only

D) MLflow Registry only

Answer: B

Explanation:

MLflow Tracking provides APIs and UI for logging parameters, metrics, artifacts, and metadata during model training and evaluation, organizing work into experiments containing multiple runs. Each run captures a single execution with associated parameters, metrics, model versions, and artifacts enabling comparison and reproducibility.

Tracking logs hyperparameters like learning rate and temperature, metrics like accuracy or perplexity, artifacts like model checkpoints or evaluation results, and metadata like timestamps and user information. The tracking UI enables filtering, sorting, and comparing runs to identify best configurations. Integration with popular frameworks enables automatic logging.

Option A is incorrect because MLflow Models focuses on model packaging, deployment formats, and inference rather than tracking experiments, parameters, and metrics during development.

Option C is incorrect because MLflow Projects defines reproducible runs with dependencies and entry points rather than tracking experiment results, though projects can log to tracking.

Option D is incorrect because MLflow Registry manages model lifecycle and versioning for deployment rather than tracking training experiments, though registered models reference tracking runs.

MLflow Tracking is essential for managing iterative experimentation and optimization in generative AI development workflows.

Question 30

What is the primary purpose of implementing few-shot learning in prompt engineering?

A) To reduce model training time

B) To provide examples demonstrating desired behavior without fine-tuning

C) To compress model weights

D) To manage database schemas

Answer: B

Explanation:

Few-shot learning provides input-output examples in prompts demonstrating desired behavior, enabling models to perform tasks without fine-tuning by learning patterns from examples during inference. This approach leverages in-context learning capabilities of large language models to adapt behavior based on provided demonstrations.

Few-shot prompts typically include 2-10 examples showing input format, reasoning process, and expected output format. The technique works for classification, extraction, transformation, and generation tasks where examples clarify requirements. Few-shot learning balances performance and prompt length, using enough examples to establish patterns without exceeding context limits.

Option A is incorrect because few-shot learning is an inference-time prompting technique that avoids training entirely rather than reducing training time, which would require optimization algorithms or infrastructure improvements.

Option C is incorrect because few-shot learning demonstrates task behavior through examples rather than compressing model weights, which requires quantization, pruning, or distillation techniques.

Option D is incorrect because few-shot learning teaches models task behavior through examples rather than managing database schemas, which involves defining tables, relationships, and constraints.

Few-shot learning enables rapid task adaptation without costly fine-tuning while maintaining flexibility for changing requirements.

Question 31

Which technique helps prevent prompt injection attacks in generative AI applications?

A) Removing all input validation

B) Input sanitization and instruction-data separation

C) Increasing model temperature

D) Disabling all security measures

Answer: B

Explanation:

Input sanitization and instruction-data separation protect against prompt injection by validating user inputs, detecting malicious patterns, and clearly delineating system instructions from user-provided data using delimiters or formatting. This prevents attackers from manipulating model behavior by injecting instructions disguised as user content.

Defensive techniques include filtering suspicious patterns, using special tokens or XML tags to separate instructions from data, implementing adversarial prompt detection, validating outputs against expectations, and using instruction-following models trained to ignore embedded commands in data sections. Additional layers include output filtering and human review for sensitive applications.

Option A is incorrect because removing input validation increases vulnerability to injection attacks, SQL injection, cross-site scripting, and other exploits that depend on unsanitized user input.

Option C is incorrect because increasing temperature affects output randomness rather than preventing prompt injection attacks, which require input validation, structural separation, and security-aware prompt design.

Option D is incorrect because disabling security measures maximizes vulnerability to attacks rather than preventing them, leaving applications exposed to prompt injection, data exfiltration, and unauthorized actions.

Security considerations are essential for production generative AI applications handling untrusted user input.

Question 32

What is the primary function of attention mechanisms in transformer models?

A) To compress data

B) To weigh importance of different input tokens when processing sequences

C) To manage memory allocation

D) To compile code

Answer: B

Explanation:

Attention mechanisms enable models to dynamically weigh the importance of different input tokens when processing each token, allowing models to focus on relevant context regardless of position in the sequence. Self-attention computes relationships between all token pairs, creating rich contextual representations that capture long-range dependencies.

Attention computes query, key, and value representations for each token, then calculates attention scores between queries and keys to determine how much each token should attend to others. These scores weight value vectors to produce context-aware representations. Multi-head attention applies this process in parallel across different representation subspaces, capturing diverse relationships.

Option A is incorrect because attention mechanisms create rich contextual representations rather than compressing data, which involves reducing data size through encoding, quantization, or dimensionality reduction techniques.

Option C is incorrect because attention mechanisms compute token relationships rather than managing memory allocation, which involves allocating, deallocating, and organizing memory usage by operating systems or runtime environments.

Option D is incorrect because attention mechanisms process sequence data in neural networks rather than compiling code, which involves transforming source code into executable machine instructions through lexical analysis, parsing, and optimization.

Attention mechanisms are the fundamental innovation enabling transformer models’ superior performance on language tasks.

Question 33

Which approach is most effective for evaluating generative AI model outputs when no ground truth exists?

A) Ignoring evaluation entirely

B) Human evaluation or LLM-as-judge with rubrics

C) Random scoring

D) Only measuring latency

Answer: B

Explanation:

Human evaluation or using LLMs as judges with well-defined rubrics provides systematic assessment when ground truth is unavailable, evaluating dimensions like relevance, coherence, accuracy, safety, and helpfulness. This approach applies consistent criteria across outputs while accommodating subjective quality aspects.

Human evaluation involves domain experts rating outputs on defined scales, often comparing multiple model versions or approaches. LLM-as-judge uses powerful models to evaluate other model outputs based on rubrics, scaling evaluation while maintaining consistency. Rubrics specify evaluation criteria, scoring ranges, and example ratings at different quality levels providing standardized assessment frameworks.

Option A is incorrect because ignoring evaluation prevents quality assurance, performance tracking, model comparison, and identifying issues before production deployment, which is unacceptable for reliable applications.

Option C is incorrect because random scoring provides no meaningful quality signal, prevents comparing models or approaches, and offers no guidance for improving system performance or detecting issues.

Option D is incorrect because measuring only latency ignores output quality, relevance, accuracy, and safety, which are typically more important than response speed for generative AI applications.

Combining multiple evaluation approaches including automated metrics, LLM judges, and human review provides comprehensive quality assessment.

Question 34

What is the primary purpose of implementing context window management in RAG systems?

A) To reduce storage costs

B) To fit relevant information within model token limits while maximizing useful content

C) To increase model training speed

D) To manage database connections

Answer: B

Explanation:

Context window management ensures that retrieved documents, system instructions, user queries, and conversation history fit within model token limits while maximizing useful information density. This involves prioritizing most relevant content, chunking documents appropriately, and strategically allocating context budget across components.

Techniques include ranking retrieved chunks by relevance and including top results, summarizing or extracting key points from documents, removing redundant information, using sliding windows for long documents, and implementing hierarchical retrieval that first identifies relevant sections then retrieves details. Effective management balances comprehensiveness with token efficiency.

Option A is incorrect because context window management optimizes prompt composition for model limits rather than reducing storage costs, which involves compression, deduplication, and lifecycle policies.

Option C is incorrect because context window management occurs during inference rather than training, affecting what information is provided to models during generation rather than training speed.

Option D is incorrect because context window management handles prompt composition within token limits rather than managing database connections, which involves connection pooling, drivers, and connection lifecycle management.

Effective context management is critical for RAG systems to provide relevant information without exceeding model capabilities.

Question 35

Which technique helps reduce computational costs for large language model inference?

A) Increasing model size

B) Model quantization or distillation

C) Adding more parameters

D) Removing all optimizations

Answer: B

Explanation:

Model quantization reduces precision of model weights and activations from 32-bit floats to 8-bit integers or lower, dramatically reducing memory requirements and computational costs with minimal accuracy degradation. Knowledge distillation trains smaller student models to mimic larger teacher models, achieving comparable performance with fewer parameters.

Quantization techniques include post-training quantization applied to trained models, quantization-aware training that simulates quantization during training, and mixed-precision approaches using lower precision for most operations. Distillation transfers knowledge through training student models on teacher outputs, soft labels, or intermediate representations. Both approaches significantly reduce inference costs.

Option A is incorrect because increasing model size raises computational costs, memory requirements, and inference latency rather than reducing them, though larger models may provide better quality.

Option C is incorrect because adding parameters increases model capacity and computational requirements rather than reducing costs, though parameter-efficient fine-tuning adds minimal parameters for adaptation.

Option D is incorrect because removing optimizations increases computational costs by eliminating efficiency improvements from techniques like quantization, pruning, caching, and batching.

Organizations commonly deploy quantized models or distilled variants for production inference to balance quality with cost constraints.

Question 36

What is the primary benefit of using semantic caching in generative AI applications?

A) To increase model training time

B) To reuse responses for semantically similar queries reducing latency and costs

C) To corrupt model weights

D) To disable all features

Answer: B

Explanation:

Semantic caching stores responses indexed by query embeddings, enabling response reuse for semantically similar queries even when exact text differs. This reduces latency by serving cached responses instantly, decreases costs by avoiding redundant model invocations, and improves scalability by reducing backend load.

Semantic caching computes embeddings for incoming queries, performs similarity search against cached query embeddings, and returns cached responses when similarity exceeds thresholds. The approach handles query variations, paraphrasing, and minor differences that exact-match caching misses. Cache invalidation strategies manage freshness and capacity constraints.

Option A is incorrect because semantic caching is an inference optimization that reduces latency and costs rather than affecting model training time, which depends on data volume, model architecture, and computational resources.

Option C is incorrect because semantic caching stores and retrieves responses based on query similarity rather than corrupting model weights, which would degrade model quality and require retraining.

Option D is incorrect because semantic caching enhances application performance and reduces costs rather than disabling features, which would reduce functionality and limit application capabilities.

Semantic caching provides substantial benefits for applications with repeated or similar queries across users.

Question 37

Which component in a generative AI system handles orchestration of multiple model calls and tools?

A) Database only

B) Agent or orchestration framework

C) Embedding model only

D) Cache only

Answer: B

Explanation:

Agents or orchestration frameworks coordinate multi-step workflows involving multiple model calls, tool usage, retrieval operations, and decision-making based on intermediate results. These systems implement reasoning loops where models decide which actions to take, execute tools, process results, and determine next steps.

Orchestration frameworks like LangChain, LlamaIndex, or Semantic Kernel provide abstractions for chaining operations, managing tool access, handling errors, implementing memory across conversation turns, and coordinating complex workflows. Agents use models as reasoning engines to plan actions, select tools, interpret results, and adapt strategies based on feedback.

Option A is incorrect because databases store and retrieve data but do not orchestrate multi-step workflows, decide action sequences, or coordinate between models and tools for complex tasks.

Option C is incorrect because embedding models convert text to vectors for similarity search but do not orchestrate workflows, make decisions about tool usage, or coordinate multi-step processes.

Option D is incorrect because caches store and retrieve results for performance optimization but do not provide orchestration capabilities for coordinating multiple operations and decision-making.

Agent frameworks enable building sophisticated applications that autonomously solve complex tasks through tool use and reasoning.

Question 38

What is the primary purpose of implementing guardrails in generative AI applications?

A) To remove all constraints

B) To enforce safety, quality, and compliance requirements on inputs and outputs

C) To increase model size

D) To disable monitoring

Answer: B

Explanation:

Guardrails implement safety, quality, and compliance controls that validate inputs before processing and filter outputs before returning to users, preventing harmful, biased, inaccurate, or policy-violating content. Guardrails protect users, organizations, and brand reputation by enforcing acceptable use boundaries.

Input guardrails detect and block malicious prompts, inappropriate content, personal information, or jailbreak attempts. Output guardrails filter toxic language, harmful advice, biased statements, hallucinations, or sensitive information before delivery. Implementations use classifiers, keyword filters, safety models, fact-checking, and business rule engines. Guardrails balance safety with usability.

Option A is incorrect because removing constraints eliminates safety protections, allows harmful outputs, violates compliance requirements, and creates liability risks for organizations deploying generative AI applications.

Option C is incorrect because guardrails enforce content policies rather than increasing model size, which involves adding parameters, layers, or expanding architecture for increased capacity.

Option D is incorrect because guardrails require monitoring to detect violations, track effectiveness, identify evasion attempts, and continuously improve safety systems rather than disabling monitoring.

Production generative AI applications require comprehensive guardrail systems to ensure safe, appropriate, and compliant operation.

Question 39

Which metric is most appropriate for evaluating retrieval quality in a RAG system?

A) Training loss only

B) Normalized Discounted Cumulative Gain (NDCG)

C) Model size

D) Hardware utilization

Answer: B

Explanation:

Normalized Discounted Cumulative Gain evaluates ranking quality by considering both relevance and position, rewarding systems that place highly relevant documents at top positions while penalizing relevant documents buried in lower ranks. NDCG ranges from 0 to 1 with higher values indicating better ranking quality.

NDCG computes cumulative gain with position-based discounting, then normalizes by ideal ranking score, enabling comparison across queries with varying numbers of relevant documents. This metric captures that users primarily examine top results, making ranking order critical. NDCG complements recall and precision metrics for comprehensive retrieval evaluation.

Option A is incorrect because training loss measures model optimization during training rather than retrieval quality, which evaluates whether relevant documents are surfaced and properly ranked for user queries.

Option C is incorrect because model size indicates capacity and resource requirements rather than retrieval quality, which measures relevance and ranking of retrieved documents for information needs.

Option D is incorrect because hardware utilization shows resource consumption rather than retrieval effectiveness, which assesses whether systems return relevant information in useful rankings for user queries.

Comprehensive RAG evaluation uses multiple metrics covering retrieval quality, generation quality, and end-to-end system performance.

Question 40

What is the primary benefit of using parameter-efficient fine-tuning methods like LoRA?

A) To increase training data requirements

B) To adapt models with minimal parameter updates and computational costs

C) To remove all model capabilities

D) To maximize memory usage

Answer: B

Explanation:

Parameter-efficient fine-tuning methods like Low-Rank Adaptation update only small sets of parameters or add lightweight adapter layers, achieving effective model adaptation with minimal computational costs and memory requirements compared to full fine-tuning. LoRA adds trainable low-rank matrices to model layers while freezing base model weights.

LoRA decomposes weight updates into low-rank matrices with far fewer parameters than original layers, dramatically reducing training costs and enabling fine-tuning large models on consumer hardware. The approach maintains base model quality while adapting behavior for specific tasks or domains. Multiple LoRA adapters can be trained for different tasks sharing one base model.

Option A is incorrect because parameter-efficient methods reduce computational requirements rather than increasing data requirements, enabling effective fine-tuning with relatively small task-specific datasets.

Option C is incorrect because parameter-efficient fine-tuning adapts and enhances model capabilities for specific tasks rather than removing capabilities, building on pre-trained model knowledge.

Option D is incorrect because parameter-efficient methods minimize memory usage and computational costs rather than maximizing them, enabling fine-tuning on resource-constrained hardware.

LoRA and similar methods democratize fine-tuning by making it accessible without massive computational infrastructure.

Exam

Related posts:

Leave a Reply Cancel reply