Data Processing in Azure Flashcards
Microsoft Azure Data Fundamentals DP-900 Flashcards
| Front | Back |
|-------|------|
| How does Azure Monitor assist in data processing workflows | By providing observability through logs, metrics, and alerts for pipelines and resources. |
| How does Azure Synapse enable real-time analytics | Through integration with Azure Stream Analytics for real-time data processing. |
| How does Azure Synapse handle big data processing | By integrating Spark processing within the same platform as SQL-based analytics. |
| How does Azure Synapse Link support operational analytics | By enabling hybrid transactional and analytical processing on operational databases in real time. |
| How does PolyBase assist in Azure Synapse Analytics | By allowing the direct query of external data sources using T-SQL, without moving the data. |
| What are Synapse Notebooks used for in Azure Synapse | To run and manage code written in Python, Spark SQL, Scala, or .NET for data analytics. |
| What feature of Azure Data Factory supports scheduling and monitoring workflows | Integration with Azure Monitor and time-based triggers. |
| What is a Data Lake | A centralized repository that stores large volumes of raw data in its native format. |
| What is a Delta Table in Azure Databricks | A storage format that enables efficient updates and ACID transactions on data lakes. |
| What is a Lakehouse in Azure Databricks | A combined approach using data lake storage and data warehouse features for unified analytics. |
| What is a Synapse Dedicated SQL Pool | A provisioned data warehouse service optimized for high-performance analytics. |
| What is a Synapse pipeline | A workflow in Azure Synapse that integrates data processes and analytics tasks. |
| What is an Event Hub used for in Azure data processing | To ingest large volumes of data streams in real time for analytics or storage. |
| What is an Integration Runtime in Azure Data Factory | The compute infrastructure used to provide data movement and transformation. |
| What is Apache Spark used for in Azure Databricks | A distributed data processing engine for big data analytics and machine learning. |
| What is AutoML in Azure Databricks | An automated machine learning feature to build predictive models with minimal coding. |
| What is Azure Data Factory | A data integration service for creating workflows to move and transform data at scale. |
| What is Azure Data Lake Storage Gen2 | A scalable data storage solution optimized for big data analytics with hierarchical namespace support. |
| What is Azure Databricks | A collaborative Apache Spark-based analytics platform for big data. |
| What is Azure Synapse Analytics | A cloud analytics service for big data and data warehousing. |
| What is Delta Lake in Azure Databricks | A storage layer that provides ACID transactions and reliable data lakes. |
| What is Incremental Processing in Azure Data Factory | A method to process only the new or updated data, optimizing pipeline performance. |
| What is Mapping Data Flow Debug in Azure Data Factory | A feature to preview and test your transformations before executing the pipeline. |
| What is Row-Level Security in Synapse Dedicated SQL Pool | A feature to restrict data access based on user roles for enhanced security. |
| What is Serverless SQL Pool in Azure Synapse | A distributed query engine that allows on-demand querying of data in a data lake without prior data preparation. |
| What is SQL On-demand analytics in Azure Synapse | A feature to run T-SQL queries on data in Azure Data Lake with no infrastructure setup. |
| What is the Data Explorer in Azure Synapse | A tool for interactive and fast analytics on log and telemetry data in storage. |
| What is the Data Flow Designer in Azure Data Factory | A design interface for visually building and configuring data transformation logic. |
| What is the difference between Copy Activity and Data Flow in Azure Data Factory | Copy Activity moves data between sources, while Data Flow transforms data at scale. |
| What is the difference between Linked Services and Datasets in Azure Data Factory | Linked Services define connectivity to external resources, while Datasets represent the data structure within those connections. |
| What is the purpose of a Data Flow in Azure Data Factory | To perform data transformations without writing code. |
| What is the purpose of a Workspace in Azure Databricks | To organize and collaborate on notebooks, jobs, and libraries. |
| What is the purpose of Integration Pipelines in Azure Synapse | To orchestrate and automate data movement and processing workflows across the platform. |
| What is the purpose of Job Clusters in Azure Databricks | Temporary clusters created for running automated tasks and terminated after execution to save costs. |
| What is the purpose of Triggered Pipelines in Azure Data Factory | To automate data workflows that run based on specific events or schedules. |
| What is the role of Cosmos DB in Azure data processing | A globally distributed, multi-model database for modern app development. |
| What is the Workspace Repository in Azure Databricks used for | To manage notebooks, jobs, and libraries in a centralized repository for collaboration. |
| What tool in Azure Synapse enables Power BI integration | A linked Power BI workspace in Synapse Studio, which lets you build and view reports without leaving the platform. |
| What type of cluster is used in Azure Databricks for machine learning | GPU-enabled clusters for faster processing of ML tasks. |
| Which programming languages are supported in Azure Databricks notebooks | Python, Scala, R, and SQL. |
This deck covers data processing services in Azure, including Azure Synapse Analytics, Azure Databricks, and Azure Data Factory.
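The Incremental Processing card describes loading only new or updated data. A minimal plain-Python sketch of the high-watermark pattern behind that idea, with hypothetical row and column names (Azure Data Factory itself typically stores the watermark in a control table and compares it against a modified-date column in the source):

```python
from datetime import datetime

# Hypothetical source rows; "modified" stands in for a last-modified
# column in the real source table.
rows = [
    {"id": 1, "modified": datetime(2024, 1, 1)},
    {"id": 2, "modified": datetime(2024, 1, 5)},
    {"id": 3, "modified": datetime(2024, 1, 9)},
]

def incremental_load(source_rows, last_watermark):
    """Return only rows changed since the last run, plus the new watermark."""
    new_rows = [r for r in source_rows if r["modified"] > last_watermark]
    new_watermark = max((r["modified"] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark

# First run: the watermark predates all rows, so everything is loaded.
batch, wm = incremental_load(rows, datetime(2023, 12, 31))
print(len(batch), wm.date())  # 3 2024-01-09

# Next run: only rows modified after the stored watermark are loaded.
rows.append({"id": 4, "modified": datetime(2024, 1, 12)})
batch, wm = incremental_load(rows, wm)
print(len(batch), wm.date())  # 1 2024-01-12
```

Each run reads the stored watermark, copies only rows beyond it, then persists the new watermark, so the pipeline's work scales with the change volume rather than the table size.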