
CompTIA Data+ Practice Test (DA0-002)

Use the form below to configure your CompTIA Data+ Practice Test (DA0-002). The practice test can be configured to include only certain exam objectives and domains. You can choose between 5 and 100 questions and set a time limit.

CompTIA Data+ DA0-002 (V2) Information

The CompTIA Data+ exam is for people who want to show they understand how to work with data. Passing it proves that you can collect, organize, and analyze information to help businesses make smart choices. It also checks whether you know how to create reports, use charts, and follow rules that keep data safe and accurate. CompTIA suggests having about 1 to 2 years of experience working with data, databases, or tools like Excel, SQL, or Power BI before taking the test.

The exam is organized into parts called domains: basic data concepts, data preparation, data analysis, and easy-to-read reports and visualizations. Another important domain is data governance, which covers keeping data secure, private, and high quality. Each domain has its own percentage of the questions, with data analysis being the largest at 24%.

Overall, the CompTIA Data+ exam is a good way to prove your skills if you want a career in data. It shows employers that you know how to handle data from start to finish, including collecting it, checking it for errors, and sharing results in clear ways. If you enjoy working with numbers and information, this certification can be a great step forward in your career.

  • Free CompTIA Data+ DA0-002 (V2) Practice Test
  • 20 Questions
  • Unlimited time
  • Exam objectives:
    Data Concepts and Environments
    Data Acquisition and Preparation
    Data Analysis
    Visualization and Reporting
    Data Governance
Question 1 of 20

During a pilot with smart shelves, sensors publish an inventory event every 5 seconds to an Event Hub. Regional merchandisers rely on a Power BI dashboard to reorder items as soon as stock is low. Today the dashboard reads from a table that an ETL job overwrites with an hourly timestamped snapshot. Executives say they still miss stock-outs by nearly an hour and do not want users to click Refresh. As the data analyst, which data-versioning technique will ensure the dashboard tiles update automatically within seconds of each new event?

  • Retain the hourly snapshot and instruct users to force a browser refresh when needed.

  • Reduce the ETL schedule to every 30 minutes but keep the snapshot history for analysis.

  • Switch to an overnight full batch load and email merchandisers a CSV extract of inventory levels.

  • Configure a real-time streaming dataset that pushes each inventory event directly to the dashboard.

Question 2 of 20

After last night's refresh, the ecommerce dashboard is showing every order three times, so the total revenue KPI has tripled. The visualization uses a live connection to the company's Snowflake data warehouse, and the ETL job's execution log reported no errors. No changes were published to the dashboard itself. Which action should the data analyst take first to determine whether the duplication originates in the visualization layer or in the warehouse?

  • Clear the dashboard's cache and republish the workbook.

  • Increase the BI server's memory allocation to handle larger result sets.

  • Post a question to the BI vendor's community forum to ask about duplicate-row rendering defects.

  • Run a SQL query on the warehouse table to count rows per unique order ID and check for duplicates.
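
For context, the warehouse-side check described in the final option could look like the following sketch; the schema and table names are assumptions:

-- Hypothetical warehouse names; adjust to the actual schema.
SELECT order_id,
       COUNT(*) AS row_count
FROM   analytics.orders
GROUP  BY order_id
HAVING COUNT(*) > 1;  -- rows returned here mean the duplicates exist in the warehouse itself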

Question 3 of 20

A data analyst at a healthcare organization is preparing a dataset for a university research study on patient outcomes. The dataset contains sensitive Protected Health Information (PHI). To comply with privacy regulations and protect patient identities while still providing valuable data for statistical analysis, which of the following data protection practices is the MOST appropriate to apply before sharing the dataset?

  • Encryption at rest

  • Role-based access control (RBAC)

  • Data masking

  • Anonymization
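
To illustrate the distinction being tested, here is a minimal SQL sketch of an anonymized extract; the table and column names are hypothetical:

-- Anonymization: direct identifiers are removed or generalized so records
-- cannot be traced back to individual patients.
SELECT LEFT(zip_code, 3)                        AS zip3,       -- generalized location
       DATEDIFF(year, date_of_birth, GETDATE()) AS age_years,  -- age instead of birth date
       diagnosis_code,
       outcome_score
FROM   dbo.PatientOutcomes;  -- name, SSN, and record-number columns deliberately excluded
-- Masking, by contrast, obscures values for display (e.g., XXX-XX-1234) while
-- the underlying record remains linked to the individual.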

Question 4 of 20

A data analyst joins a project and is tasked with creating a report from a customer database they have never used before. To properly interpret the data, the analyst needs to understand the definitions of each field, their data types (e.g., string, integer, date), and any constraints, such as 'cannot be null'.

Which of the following documents would be the MOST direct and comprehensive resource for this specific information?

  • Data flow diagram

  • Data dictionary

  • Data lineage report

  • Hierarchy structure diagram

Question 5 of 20

Brianna, a data analyst at an e-commerce company, must present the results of a new customer-churn model to two audiences on the same day. The first audience is the Vice President of Customer Success, who will decide whether to fund a retention campaign; the second audience is the data-engineering team that will operationalize the model. Based on user-persona type, which communication approach should Brianna use when briefing the Vice President?

  • Walk through each feature-engineering step, hyper-parameter tuning results and the full confusion matrix, emphasizing data-quality caveats.

  • Open with a one-page executive summary that highlights projected revenue at risk and the expected ROI of a retention campaign, show one clear visualization, and keep detailed model information in backup slides.

  • Share the raw training data and Python notebooks in a shared repository and schedule a hands-on working session to review the code line by line.

  • Email a CSV containing every customer's predicted churn probability and ask the Vice President to choose which customers to target.

Question 6 of 20

A data analyst is designing a new fact table in Microsoft SQL Server to store the latitude-longitude coordinates of every retail branch. The business will run queries such as

SELECT TOP (5) BranchID
FROM   dbo.Branch
WHERE  @customerLocation.STDistance(Location) <= 10000  -- within 10 km (STDistance returns meters)
ORDER  BY @customerLocation.STDistance(Location);

to find the five closest branches to a customer anywhere in the world. The analyst needs the chosen column data type to perform built-in, accurate distance calculations that account for the curvature of the Earth without requiring custom formulas. Which SQL Server data type best meets these requirements?

  • varbinary(MAX)

  • geography

  • varchar(50)

  • geometry
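
For reference, a minimal T-SQL sketch of how the table and the @customerLocation variable from the question could be defined; the sample coordinates are assumptions:

-- The geography type stores geodetic (round-earth) data, so built-in methods
-- such as STDistance account for the Earth's curvature.
CREATE TABLE dbo.Branch (
    BranchID int PRIMARY KEY,
    Location geography NOT NULL
);

DECLARE @customerLocation geography =
    geography::Point(47.6062, -122.3321, 4326);  -- latitude, longitude, SRID (WGS 84)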

Question 7 of 20

An e-commerce company needs to store the complete clickstream generated by its website. Each event arrives as a semi-structured JSON document whose set of fields can change often as new marketing experiments roll out. The data layer must automatically distribute writes across many inexpensive servers to absorb sudden traffic spikes, yet the analysts do not require complex joins or multi-row ACID transactions. Given these requirements, which database type is the most appropriate choice?

  • Spreadsheet files stored on a network share

  • In-memory OLAP cube

  • Non-relational document database

  • Relational database management system (RDBMS)

Question 8 of 20

A data analyst is responsible for a weekly sales report. After a scheduled data refresh, the analyst observes that the key metric for total sales is 50% lower than the weekly average for the past year. The SQL query used for the report has not been altered, and there are no connection error messages in the reporting tool. The analyst's immediate goal is to troubleshoot this discrepancy. Which of the following actions is the most direct and effective first step to validate the data source?

  • Consult with the sales team to confirm if their recent sales figures were unusually low.

  • Run a simple aggregate query, such as COUNT(*) or MAX(transaction_date), directly against the source database table.

  • Enable verbose logging on the reporting server to capture more detailed execution data.

  • Re-authenticate the database connection credentials in the business intelligence (BI) tool's data source settings.
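
A first-pass validation query of the kind the scenario describes might look like this; the table name is an assumption:

-- Quick sanity checks against the source table.
SELECT COUNT(*)              AS row_count,
       MAX(transaction_date) AS latest_transaction
FROM   sales.transactions;
-- A row count well below normal, or a stale latest_transaction, points to an
-- incomplete or late source load rather than a reporting-layer problem.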

Question 9 of 20

A data analyst is tasked with developing a new, interactive sales performance dashboard for the executive team. The requirements are high-level, with the primary request being a 'single-pane-of-glass' view of key performance indicators (KPIs). The analyst wants to gather feedback on the proposed layout, color scheme, and selection of charts before connecting to the live database and writing complex queries. Which communication approach would be most effective for this purpose?

  • Develop a fully functional prototype with sample data.

  • Schedule a presentation to verbally describe the proposed dashboard.

  • Create a static mock-up of the dashboard.

  • Write a detailed technical specification document.

Question 10 of 20

Your data team has quantified the ROI of a recent marketing campaign and must brief two different groups: the executive leadership team during the quarterly business review and the marketing-operations analysts who will refine future campaigns. The underlying metrics and conclusions are identical, but you will create two separate slide decks to fit each audience. Which adjustment is MOST appropriate when tailoring the presentation specifically for the C-suite?

  • Increase chart granularity to daily spend and click-level data to show underlying variability.

  • Embed the full SQL script and transformation logic so leaders can audit the calculation lineage.

  • Open with an executive summary that highlights strategic impact and recommendations while omitting detailed methodology.

  • Append a comprehensive data dictionary that defines every field and value used in the analysis.

Question 11 of 20

A team is redesigning a SQL Server customer table that will store names from more than 40 countries, including languages that use Chinese, Arabic, and Cyrillic characters. Names can be up to 200 characters long, but most are under 50 characters. Which SQL string data type BEST satisfies the business requirements for international character support and keeps storage overhead low when the value is shorter than the maximum length?

  • varchar(200)

  • nchar(200)

  • nvarchar(200)

  • char(200)
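
As a minimal sketch of the column definition under discussion, with assumed table and column names:

-- nvarchar stores Unicode (so Chinese, Arabic, and Cyrillic names all fit) and,
-- being variable-length, consumes storage only for the characters actually stored.
CREATE TABLE dbo.Customer (
    CustomerID   int IDENTITY(1,1) PRIMARY KEY,
    CustomerName nvarchar(200) NOT NULL
);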

Question 12 of 20

A data analyst has created an interactive sales report in Power BI Desktop. The report needs to be shared with regional managers so they can view and interact with it online through their web browsers. Which Power BI component should the analyst use to publish and distribute the report?

  • Power BI Report Builder

  • Power BI Service

  • Power BI Desktop

  • Power BI Gateway

Question 13 of 20

A data analyst needs to load a large CSV file containing customer information into Python. The primary goal is to perform data cleaning, manipulation, and analysis on this tabular data, including tasks like filtering rows, grouping data, and calculating summary statistics. Which of the following Python libraries is specifically designed for these purposes, providing the DataFrame as its core data structure?

  • NumPy

  • scikit-learn

  • pandas

  • Matplotlib

Question 14 of 20

A data analyst is defining a column to store customer email addresses in a SQL Server table. Values contain only ASCII characters and can be anywhere from 5 to 320 characters long. To minimize storage while supporting this range, which column definition is the most appropriate?

  • varchar(320)

  • nvarchar(320)

  • varchar(MAX)

  • char(320)
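
A minimal sketch of such a column, with hypothetical table and constraint names:

-- varchar suits ASCII-only data and stores only the characters actually used;
-- the CHECK constraint enforces the 5-character minimum from the requirements.
CREATE TABLE dbo.Contact (
    ContactID int IDENTITY(1,1) PRIMARY KEY,
    Email     varchar(320) NOT NULL,
    CONSTRAINT CK_Contact_EmailLength CHECK (LEN(Email) >= 5)
);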

Question 15 of 20

A manufacturing company relies on three legacy desktop applications that do not expose APIs or direct database connectivity. Each night a junior analyst manually logs in, exports the day's production data to CSV, renames the files according to a strict convention, and copies them to a network share so an overnight ETL job can load them into the data warehouse. Management wants to eliminate this repetitive task without rewriting the legacy software. Which approach best matches the capabilities of robotic process automation (RPA) for this scenario?

  • Record the analyst's UI actions in an unattended software bot and schedule it to export, rename, and move the files every night.

  • Replace the legacy applications with microservice-based web APIs and rebuild the workflow around REST calls.

  • Train a convolutional neural network to predict missing production values and write the results directly to the warehouse.

  • Configure the ETL tool to query the legacy applications' databases directly through JDBC connections.

Question 16 of 20

You are building a monthly revenue report in a PostgreSQL 15 database. The source table orders stores the precise purchase timestamp in the column order_created_at (TIMESTAMP). Before aggregating, you must transform every timestamp so that all rows from the same calendar month are grouped under the identical key (for example, 2025-03-15 and 2025-03-31 should both become 2025-03-01 00:00:00). Which single SQL expression will perform this normalization in one step so you can immediately use it in a GROUP BY clause?

  • EXTRACT(month FROM order_created_at)

  • DATE_TRUNC('month', order_created_at)

  • TO_CHAR(order_created_at, 'YYYY-MM')

  • DATEDIFF('month', order_created_at, CURRENT_DATE)
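
For reference, a monthly rollup built on this kind of truncation might look like the following; the revenue column is an assumption:

-- DATE_TRUNC('month', ...) maps every timestamp in a month to that month's
-- first instant, so it can serve directly as the GROUP BY key.
SELECT DATE_TRUNC('month', order_created_at) AS order_month,
       SUM(order_total)                      AS monthly_revenue
FROM   orders
GROUP  BY DATE_TRUNC('month', order_created_at)
ORDER  BY order_month;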

Question 17 of 20

A data analyst has created a line chart to display the quarterly sales performance of four different product lines over the last three years. Each product line is represented by a uniquely colored line. During a review, a manager notes that it is impossible to determine which product line corresponds to which colored line. To resolve this ambiguity and make the chart interpretable, which design element must the analyst add?

  • Data labels

  • A legend

  • Gridlines

  • A title

Question 18 of 20

An online retailer has two AI-related initiatives:

Project A uses an AI service to automatically detect product names, competitor brands and overall sentiment in thousands of free-text customer reviews so that marketing dashboards update in near real time.

Project B aims to remove a nightly manual task in which an analyst opens a legacy desktop application, exports a CSV file, renames it and uploads it to a cloud folder.

Which combination of AI concepts best matches Project A and Project B, respectively?

  • Generative AI for Project A and deep learning for Project B

  • Deep learning classification for Project A and generative AI for Project B

  • Robotic process automation (RPA) for Project A and large language model (LLM) fine-tuning for Project B

  • Natural language processing (NLP) for Project A and robotic process automation (RPA) for Project B

Question 19 of 20

Your team has finished analyzing customer-satisfaction metrics for fiscal Q2. According to a mandate from corporate communications, the results summary must be a static document (such as a PDF, DOC, or image) for posting on the public investor-relations site. The content must also comply with Section 508 and WCAG-AA standards to ensure board members who rely on screen-reader software can independently review it. Finally, stakeholders will download the report on both mobile and desktop devices. Given these requirements, which communication approach provides the most accessible experience?

  • Publish an interactive HTML5 dashboard in which insights appear only when users hover over elements or interpret color cues.

  • Email the underlying Excel workbook that uses red/green conditional formatting to indicate trends but contains no supporting narrative.

  • Export the visual summary to a tagged PDF that includes alt text for every chart, a logical reading order, and a high-contrast color theme.

  • Record a narrated screencast walk-through of the dashboard and embed the video on the site without closed captions or transcripts.

Question 20 of 20

A regional retail chain tracks point-of-sale data that is loaded into its data warehouse every night by 04:00. The sales director wants store managers to open an existing Power BI dashboard at 08:00 each Monday and immediately see a summary of the previous week's results without having to click a refresh button or run a query. Which delivery approach best meets this requirement while minimizing manual effort?

  • Export the dashboard as a static PDF every Friday afternoon and email it to all store managers.

  • Switch the dataset to DirectQuery so the dashboard streams live transactions whenever someone opens it.

  • Provide an ad-hoc report template that managers must run and filter themselves each Monday morning.

  • Configure a scheduled refresh that runs at 05:00 every Monday so the dashboard is updated before managers log in.