CompTIA Data+ DA0-002 (V2) Practice Question

A data engineering team is ingesting two new data sources into a cloud data lake. The first source is a set of monthly customer-sales extracts that the ERP system exports as fixed-size CSV files. The second source comprises the company's firewall log files, which record every network connection as it happens. To minimize compute costs and avoid re-loading duplicate records, which consideration BEST explains why the ingestion process for the firewall logs should be designed as a streaming or incremental load rather than a full-replace load?

  • The log files follow a strict relational DDL schema, so a full reload is required each time to preserve referential integrity.

  • The log files store binary image data that must be base-64 decoded, so the entire file has to be processed on every run.

  • The log files are continuously appended, so the pipeline should ingest only the newly written lines via streaming or incremental processing.

  • Each log file is limited to exactly 10 MB, making it simplest to delete and reload the whole file whenever it reaches that limit.
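The scenario hinges on incremental ingestion of an append-only source: rather than reprocessing the whole firewall log on every run, the pipeline tracks how far it has already read and picks up only the newly written lines. Below is a minimal sketch of that pattern in Python. The file names, the JSON checkpoint format, and the offset-tracking approach are all hypothetical illustrations, not part of the exam content; a production pipeline would normally rely on a managed streaming service or log shipper with a durable checkpoint store.

```python
"""Minimal sketch of incremental (append-only) log ingestion.

File names and the checkpoint format are hypothetical; a real pipeline
would typically use a streaming ingest service and a durable checkpoint
store rather than local files.
"""
import json
import os

LOG_PATH = "firewall.log"            # hypothetical continuously appended log
CHECKPOINT_PATH = "checkpoint.json"  # stores the byte offset already ingested


def load_offset() -> int:
    """Return the last ingested byte offset, or 0 on the first run."""
    if os.path.exists(CHECKPOINT_PATH):
        with open(CHECKPOINT_PATH) as f:
            return json.load(f)["offset"]
    return 0


def save_offset(offset: int) -> None:
    """Persist the new high-water mark so the next run skips old lines."""
    with open(CHECKPOINT_PATH, "w") as f:
        json.dump({"offset": offset}, f)


def ingest_new_lines() -> list[str]:
    """Read only the lines appended since the last checkpoint."""
    if not os.path.exists(LOG_PATH):
        return []
    offset = load_offset()
    with open(LOG_PATH) as f:
        f.seek(offset)            # skip everything already processed
        new_lines = f.readlines()
        save_offset(f.tell())     # record how far we have read
    return new_lines


if __name__ == "__main__":
    for line in ingest_new_lines():
        # In a real pipeline these records would be written to the data lake.
        print(line.rstrip())
```

Each run ingests only the delta since the previous run, which is why a streaming or incremental design avoids the duplicate records and extra compute that a full-replace reload of a continuously appended log would incur.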

Domain: Data Concepts and Environments