Data Concepts and Environment Fundamentals Flashcards
CompTIA DataX DY0-001 (V1) Flashcards
| Front | Back |
| --- | --- |
| Batch processing | Processing a large set of data together as a group, typically on a schedule, rather than record by record as it arrives |
| Big data | Large and complex datasets that require advanced tools for storage, processing, and analysis |
| Blockchain data storage | Decentralized, secure method to store data across a network |
| Cloud computing | Using remote servers hosted on the internet to store, manage, and process data |
| Columnar database | A database that stores data by columns, optimized for analytical workloads |
| Concurrent processing | Performing multiple tasks or operations simultaneously in a system |
| Data anonymization | Masking or removing identifying information to protect privacy |
| Data audit trails | Records showing the history and transformations of data |
| Data compression | Reducing the size of data to save storage and improve performance |
| Data encodings | Methods for formatting data into a standardized representation such as UTF-8 |
| Data federation | Integrating data from different sources into a virtual unified view |
| Data governance | Policies and practices to ensure data quality, security, and compliance |
| Data integrity | Ensuring data is accurate, consistent, and reliable |
| Data lake | A storage solution that holds raw data in its native format before processing |
| Data lineage | Tracking where data comes from, how it moves, and where it ends up |
| Data modeling | The process of defining the entities, attributes, and relationships of a data system, often as a visual diagram |
| Data partitioning | Dividing data into smaller chunks to optimize performance and scalability |
| Data pipeline | A series of processes that move and transform data from source to destination |
| Data processing | The act of converting raw data into meaningful information |
| Data redundancy | Storing the same data in multiple locations, which can waste storage and cause inconsistencies unless done deliberately for backup or availability |
| Data scalability | The ability to handle increasing amounts of data without performance issues |
| Data storage | The method or technology used to save data such as HDD, SSD, or cloud storage |
| Data type | Represents the kind of data such as integer, float, string, or boolean |
| Data visualization | Representing data graphically to better understand trends and patterns |
| Data warehouse | A centralized repository for storing large amounts of structured data for analysis |
| Distributed computing | Using multiple machines to process large volumes of data |
| ETL process | Extract, Transform, Load - steps to move and prepare data for analysis |
| Foreign key | A field in one table that links to the primary key in another table |
| Immutable data | Data that cannot be changed after it is stored |
| Index in database | A structure that improves the speed of data retrieval operations |
| Metadata | Data that provides information about other data such as its format or origin |
| Non-relational database | A database that stores unstructured or semi-structured data like key-value pairs or documents |
| Normalization | Process of organizing data to reduce redundancy and improve consistency |
| NoSQL | Non-relational database systems designed for scalability and flexibility |
| OLAP | Online Analytical Processing, designed for complex data analysis and decision-making |
| OLTP | Online Transaction Processing, focused on handling routine transaction data |
| Primary key | A unique identifier for a record in a relational database |
| Real-time processing | Processing data as it is generated to provide immediate results |
| Relational database | A type of database using tables with rows and columns to store structured data |
| Replication in databases | Creating duplicates of data for backup and high availability |
| Sharding | Dividing a database into smaller pieces to distribute the load across servers |
| Snapshot in databases | A point-in-time copy of the database for backup or analysis |
| SQL | Structured Query Language used to interact with relational databases |
| Structured data | Data organized in rows and columns, often seen in relational databases |
| Transactional data | Data generated from business transactions like orders or payments |
| Unstructured data | Data with no predefined format like images, videos, and text documents |
This deck covers foundational topics, such as data types, data storage, and data processing environments, that are important for understanding data management.
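Several of the relational cards (primary key, foreign key, index, SQL, transactional data) are easier to remember when seen together. The following is a minimal sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables, their columns, and the helper function names are hypothetical, chosen only for illustration and not taken from the deck or the exam objectives.

```python
import sqlite3

# In-memory database; the schema below is a made-up example, not part of the deck.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,          -- primary key: unique per record
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
                REFERENCES customers(customer_id),  -- foreign key to customers
    amount      REAL NOT NULL
);
-- Index: speeds up lookups of orders by customer.
CREATE INDEX idx_orders_customer ON orders(customer_id);
""")

# Transactional data: one customer with two orders.
conn.execute("INSERT INTO customers (customer_id, name) VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO orders (order_id, customer_id, amount) VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 1, 17.5)],
)
conn.commit()


def total_by_customer(connection):
    """Join the two tables on the key pair and sum order amounts per customer."""
    return connection.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        ORDER BY c.name
        """
    ).fetchall()


def try_orphan_order(connection):
    """Try to insert an order for a customer that does not exist.

    Returns False when the foreign key constraint rejects the row.
    """
    try:
        connection.execute(
            "INSERT INTO orders (order_id, customer_id, amount) VALUES (12, 999, 5.0)"
        )
    except sqlite3.IntegrityError:
        connection.rollback()
        return False
    connection.commit()
    return True


print(total_by_customer(conn))
print(try_orphan_order(conn))
```

The rejected insert shows how a foreign key supports data integrity: an order cannot point at a customer record that is not there.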