DAS-C01 | AWS Certified Data Analytics - Specialty (DAS-C01) Practice Test Questions

All that matters here is passing the Amazon Web Services DAS-C01 exam, and all you need is a high score on the DAS-C01 AWS Certified Data Analytics - Specialty exam. The only thing you need to do is download the Examcollection DAS-C01 exam study guides now. We will not let you down, and we back that with our money-back guarantee.

Online DAS-C01 free questions and answers from the new version:

NEW QUESTION 1
A company that produces network devices has millions of users. Data is collected from the devices on an hourly basis and stored in an Amazon S3 data lake.
The company runs analyses on the last 24 hours of data flow logs for abnormality detection and to troubleshoot and resolve user issues. The company also analyzes historical logs dating back 2 years to discover patterns and look for improvement opportunities.
The data flow logs contain many metrics, such as date, timestamp, source IP, and target IP. There are about 10 billion events every day.
How should this data be stored for optimal performance?

  • A. In Apache ORC partitioned by date and sorted by source IP
  • B. In compressed .csv partitioned by date and sorted by source IP
  • C. In Apache Parquet partitioned by source IP and sorted by date
  • D. In compressed nested JSON partitioned by source IP and sorted by date

Answer: A
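
For study purposes, a minimal PySpark sketch of the chosen layout (Apache ORC, partitioned by date, sorted by source IP) might look like the following. Bucket paths and column names such as event_date and source_ip are assumptions, not part of the question.

```python
# Hypothetical sketch: write flow-log events as ORC, partitioned by date and
# sorted by source IP within each partition (paths and column names assumed).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("flow-logs-to-orc").getOrCreate()

events = spark.read.json("s3://example-bucket/raw/flow-logs/")  # raw hourly drops

(events
    .repartition("event_date")            # one write task per date partition
    .sortWithinPartitions("source_ip")    # cluster rows by source IP for pruning
    .write
    .mode("append")
    .partitionBy("event_date")            # Hive-style date partitions in S3
    .orc("s3://example-bucket/curated/flow-logs/"))
```

Partitioning by date keeps the last-24-hours and 2-year scans bounded, while sorting by source IP within each partition helps predicate pushdown in the columnar ORC files.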

NEW QUESTION 2
A data analytics specialist is setting up workload management in manual mode for an Amazon Redshift environment. The data analytics specialist is defining query monitoring rules to manage system performance and user experience of an Amazon Redshift cluster.
Which elements must each query monitoring rule include?

  • A. A unique rule name, a query runtime condition, and an AWS Lambda function to resubmit any failed queries in off hours
  • B. A queue name, a unique rule name, and a predicate-based stop condition
  • C. A unique rule name, one to three predicates, and an action
  • D. A workload name, a unique rule name, and a query runtime-based condition

Answer: C
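
As a study aid, here is a minimal sketch of what a manual WLM queue with one query monitoring rule (a unique rule name, one to three predicates, and an action) could look like when applied through the cluster parameter group. The queue layout, rule name, thresholds, and parameter group name are assumptions.

```python
# Hypothetical sketch: manual WLM with one query monitoring rule, applied via
# the wlm_json_configuration parameter of a Redshift parameter group.
import json
import boto3

redshift = boto3.client("redshift")

wlm_config = [
    {
        "query_group": ["analysts"],          # assumed queue assignment
        "query_concurrency": 5,
        "rules": [
            {
                "rule_name": "abort_long_scans",           # unique rule name
                "predicate": [                             # one to three predicates
                    {"metric_name": "query_execution_time", "operator": ">", "value": 300},
                    {"metric_name": "scan_row_count", "operator": ">", "value": 1000000000},
                ],
                "action": "abort",                         # log | hop | abort
            }
        ],
    },
    {"query_concurrency": 5},                 # default queue
]

redshift.modify_cluster_parameter_group(
    ParameterGroupName="example-wlm-parameter-group",      # assumed name
    Parameters=[{
        "ParameterName": "wlm_json_configuration",
        "ParameterValue": json.dumps(wlm_config),
    }],
)
```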

NEW QUESTION 3
A company developed a new elections reporting website that uses Amazon Kinesis Data Firehose to deliver full logs from AWS WAF to an Amazon S3 bucket. The company is now seeking a low-cost option to perform this infrequent data analysis with visualizations of logs in a way that requires minimal development effort.
Which solution meets these requirements?

  • A. Use an AWS Glue crawler to create and update a table in the AWS Glue Data Catalog from the logs. Use Athena to perform ad-hoc analyses and use Amazon QuickSight to develop data visualizations.
  • B. Create a second Kinesis Data Firehose delivery stream to deliver the log files to Amazon Elasticsearch Service (Amazon ES). Use Amazon ES to perform text-based searches of the logs for ad-hoc analyses and use Kibana for data visualizations.
  • C. Create an AWS Lambda function to convert the logs into .csv format. Then add the function to the Kinesis Data Firehose transformation configuration. Use Amazon Redshift to perform ad-hoc analyses of the logs using SQL queries and use Amazon QuickSight to develop data visualizations.
  • D. Create an Amazon EMR cluster and use Amazon S3 as the data source. Create an Apache Spark job to perform ad-hoc analyses and use Amazon QuickSight to develop data visualizations.

Answer: A

Explanation:
https://aws.amazon.com/blogs/big-data/analyzing-aws-waf-logs-with-amazon-es-amazon-athena-and-amazon-qu
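
For reference, an ad-hoc Athena query over the crawled WAF log table could be started like this. The database, table, result bucket, and column names are assumptions based on a typical WAF log schema.

```python
# Hypothetical sketch: run an ad-hoc Athena query against the Glue-cataloged
# WAF log table; the results can then feed a QuickSight visualization.
import boto3

athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="""
        SELECT httprequest.clientip AS client_ip, count(*) AS requests
        FROM waf_logs
        WHERE action = 'BLOCK'
        GROUP BY httprequest.clientip
        ORDER BY requests DESC
        LIMIT 20
    """,
    QueryExecutionContext={"Database": "waf_analysis"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print(response["QueryExecutionId"])
```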

NEW QUESTION 4
A company currently uses Amazon Athena to query its global datasets. The regional data is stored in Amazon S3 in the us-east-1 and us-west-2 Regions. The data is not encrypted. To simplify the query process and manage it centrally, the company wants to use Athena in us-west-2 to query data from Amazon S3 in both Regions. The solution should be as low-cost as possible.
What should the company do to achieve this goal?

  • A. Use AWS DMS to migrate the AWS Glue Data Catalog from us-east-1 to us-west-2. Run Athena queries in us-west-2.
  • B. Run the AWS Glue crawler in us-west-2 to catalog datasets in all Regions. Once the data is crawled, run Athena queries in us-west-2.
  • C. Enable cross-Region replication for the S3 buckets in us-east-1 to replicate data in us-west-2. Once the data is replicated in us-west-2, run the AWS Glue crawler there to update the AWS Glue Data Catalog in us-west-2 and run Athena queries.
  • D. Update AWS Glue resource policies to provide us-east-1 AWS Glue Data Catalog access to us-west-2. Once the catalog in us-west-2 has access to the catalog in us-east-1, run Athena queries in us-west-2.

Answer: B
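
A minimal sketch of the chosen approach, a single crawler in us-west-2 that catalogs the S3 datasets from both Regions, is shown below. Bucket names, the IAM role, the database name, and the schedule are assumptions.

```python
# Hypothetical sketch: one Glue crawler in us-west-2 covering S3 data in both Regions.
import boto3

glue = boto3.client("glue", region_name="us-west-2")

glue.create_crawler(
    Name="global-datasets-crawler",
    Role="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    DatabaseName="global_datasets",
    Targets={
        "S3Targets": [
            {"Path": "s3://example-data-us-east-1/datasets/"},   # us-east-1 bucket
            {"Path": "s3://example-data-us-west-2/datasets/"},   # us-west-2 bucket
        ]
    },
    Schedule="cron(0 2 * * ? *)",   # optional nightly refresh
)
glue.start_crawler(Name="global-datasets-crawler")
```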

NEW QUESTION 5
A marketing company has data in Salesforce, MySQL, and Amazon S3. The company wants to use data from these three locations and create mobile dashboards for its users. The company is unsure how it should create the dashboards and needs a solution with the least possible customization and coding.
Which solution meets these requirements?

  • A. Use Amazon Athena federated queries to join the data sources. Use Amazon QuickSight to generate the mobile dashboards.
  • B. Use AWS Lake Formation to migrate the data sources into Amazon S3. Use Amazon QuickSight to generate the mobile dashboards.
  • C. Use Amazon Redshift federated queries to join the data sources. Use Amazon QuickSight to generate the mobile dashboards.
  • D. Use Amazon QuickSight to connect to the data sources and generate the mobile dashboards.

Answer: C

NEW QUESTION 6
An insurance company has raw data in JSON format that is sent without a predefined schedule through an Amazon Kinesis Data Firehose delivery stream to an Amazon S3 bucket. An AWS Glue crawler is scheduled to run every 8 hours to update the schema in the data catalog of the tables stored in the S3 bucket. Data analysts analyze the data using Apache Spark SQL on Amazon EMR set up with AWS Glue Data Catalog as the metastore. Data analysts say that, occasionally, the data they receive is stale. A data engineer needs to provide access to the most up-to-date data.
Which solution meets these requirements?

  • A. Create an external schema based on the AWS Glue Data Catalog on the existing Amazon Redshift cluster to query new data in Amazon S3 with Amazon Redshift Spectrum.
  • B. Use Amazon CloudWatch Events with the rate (1 hour) expression to execute the AWS Glue crawler every hour.
  • C. Using the AWS CLI, modify the execution schedule of the AWS Glue crawler from 8 hours to 1 minute.
  • D. Run the AWS Glue crawler from an AWS Lambda function triggered by an S3:ObjectCreated:* event notification on the S3 bucket.

Answer: D

Explanation:
https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html "you can use a wildcard (for example, s3:ObjectCreated:*) to request notification when an object is created regardless of the API used" "AWS Lambda can run custom code in response to Amazon S3 bucket events. You upload your custom code to AWS Lambda and create what is called a Lambda function. When Amazon S3 detects an event of a specific type (for example, an object created event), it can publish the event to AWS Lambda and invoke your function in Lambda. In response, AWS Lambda runs your function."
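
A minimal sketch of the Lambda handler behind the s3:ObjectCreated:* notification might look like this. The crawler name is an assumption; the handler simply starts the Glue crawler so the Data Catalog reflects newly delivered objects quickly.

```python
# Hypothetical sketch: Lambda triggered by s3:ObjectCreated:* that starts the crawler.
import boto3

glue = boto3.client("glue")
CRAWLER_NAME = "raw-json-crawler"   # assumed crawler name

def lambda_handler(event, context):
    try:
        glue.start_crawler(Name=CRAWLER_NAME)
    except glue.exceptions.CrawlerRunningException:
        # A run is already in progress; new objects will be picked up by that run
        # or by the next invocation.
        pass
    return {"started": CRAWLER_NAME}
```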

NEW QUESTION 7
A medical company has a system with sensor devices that read metrics and send them in real time to an Amazon Kinesis data stream. The Kinesis data stream has multiple shards. The company needs to calculate the average value of a numeric metric every second and set an alarm for whenever the value is above one threshold or below another threshold. The alarm must be sent to Amazon Simple Notification Service (Amazon SNS) in less than 30 seconds.
Which architecture meets these requirements?

  • A. Use an Amazon Kinesis Data Firehose delivery stream to read the data from the Kinesis data stream with an AWS Lambda transformation function that calculates the average per second and sends the alarm to Amazon SNS.
  • B. Use an AWS Lambda function to read from the Kinesis data stream to calculate the average per second and send the alarm to Amazon SNS.
  • C. Use an Amazon Kinesis Data Firehose delivery stream to read the data from the Kinesis data stream and store it on Amazon S3. Have Amazon S3 trigger an AWS Lambda function that calculates the average per second and sends the alarm to Amazon SNS.
  • D. Use an Amazon Kinesis Data Analytics application to read from the Kinesis data stream and calculate the average per second. Send the results to an AWS Lambda function that sends the alarm to Amazon SNS.

Answer: D
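
For illustration, a sketch of the downstream Lambda that receives the aggregated output from the Kinesis Data Analytics application and publishes an SNS alarm when the per-second average crosses either threshold is shown below. The topic ARN, thresholds, and field names are assumptions, and the event shape is an assumed example of the KDA-to-Lambda output destination format.

```python
# Hypothetical sketch: Lambda destination for the KDA output stream that raises
# an SNS alarm when the per-second average is out of range.
import base64
import json
import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:metric-alarms"  # assumed topic
HIGH, LOW = 80.0, 20.0                                          # assumed thresholds

def lambda_handler(event, context):
    results = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        avg = float(payload["avg_metric_value"])                # assumed field name
        if avg > HIGH or avg < LOW:
            sns.publish(
                TopicArn=TOPIC_ARN,
                Subject="Metric average out of range",
                Message=json.dumps(payload),
            )
        results.append({"recordId": record["recordId"], "result": "Ok"})
    return {"records": results}
```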

NEW QUESTION 8
A large telecommunications company is planning to set up a data catalog and metadata management for multiple data sources running on AWS. The catalog will be used to maintain the metadata of all the objects stored in the data stores. The data stores are composed of structured sources like Amazon RDS and Amazon Redshift, and semistructured sources like JSON and XML files stored in Amazon S3. The catalog must be updated on a regular basis, be able to detect the changes to object metadata, and require the least possible administration.
Which solution meets these requirements?

  • A. Use Amazon Aurora as the data catalog. Create AWS Lambda functions that will connect and gather the metadata information from multiple sources and update the data catalog in Aurora. Schedule the Lambda functions periodically.
  • B. Use the AWS Glue Data Catalog as the central metadata repository. Use AWS Glue crawlers to connect to multiple data stores and update the Data Catalog with metadata changes. Schedule the crawlers periodically to update the metadata catalog.
  • C. Use Amazon DynamoDB as the data catalog. Create AWS Lambda functions that will connect and gather the metadata information from multiple sources and update the DynamoDB catalog. Schedule the Lambda functions periodically.
  • D. Use the AWS Glue Data Catalog as the central metadata repository. Extract the schema for RDS and Amazon Redshift sources and build the Data Catalog. Use AWS Glue crawlers for data stored in Amazon S3 to infer the schema and automatically update the Data Catalog.

Answer: D
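
As a reference point, a scheduled Glue crawler can cover both JDBC sources (through pre-created Glue connections) and the semistructured files in S3 in a single definition. The connection, role, database, and path names below are assumptions.

```python
# Hypothetical sketch: a scheduled crawler spanning JDBC and S3 targets that keeps
# the Glue Data Catalog updated as object metadata changes.
import boto3

glue = boto3.client("glue")

glue.create_crawler(
    Name="enterprise-catalog-crawler",
    Role="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    DatabaseName="enterprise_catalog",
    Targets={
        "JdbcTargets": [
            {"ConnectionName": "rds-mysql-conn", "Path": "salesdb/%"},
            {"ConnectionName": "redshift-conn", "Path": "dw/public/%"},
        ],
        "S3Targets": [{"Path": "s3://example-raw-data/json-and-xml/"}],
    },
    Schedule="cron(0 */6 * * ? *)",              # refresh metadata every 6 hours
    SchemaChangePolicy={
        "UpdateBehavior": "UPDATE_IN_DATABASE",  # pick up changed object metadata
        "DeleteBehavior": "DEPRECATE_IN_DATABASE",
    },
)
```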

NEW QUESTION 9
A mobile gaming company wants to capture data from its gaming app and make the data available for analysis immediately. The data record size will be approximately 20 KB. The company is concerned about achieving optimal throughput from each device. Additionally, the company wants to develop a data stream processing application with dedicated throughput for each consumer.
Which solution would achieve this goal?

  • A. Have the app call the PutRecords API to send data to Amazon Kinesis Data Streams. Use the enhanced fan-out feature while consuming the data.
  • B. Have the app call the PutRecordBatch API to send data to Amazon Kinesis Data Firehose. Submit a support case to enable dedicated throughput on the account.
  • C. Have the app use the Amazon Kinesis Producer Library (KPL) to send data to Kinesis Data Firehose. Use the enhanced fan-out feature while consuming the data.
  • D. Have the app call the PutRecords API to send data to Amazon Kinesis Data Streams. Host the stream-processing application on Amazon EC2 with Auto Scaling.

Answer: A

Explanation:
https://docs.aws.amazon.com/streams/latest/dev/enhanced-consumers.html
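
A short sketch of both halves of the chosen answer, batched PutRecords from the app and an enhanced fan-out consumer registration for dedicated per-consumer throughput, is shown below. The stream name, ARN, and record shape are assumptions.

```python
# Hypothetical sketch: producer-side PutRecords batching plus enhanced fan-out
# consumer registration for dedicated throughput per consumer.
import json
import boto3

kinesis = boto3.client("kinesis")

# Producer side: send records in batches for better per-call throughput.
records = [
    {"Data": json.dumps({"device_id": i, "score": 42}).encode(), "PartitionKey": str(i)}
    for i in range(100)
]
kinesis.put_records(StreamName="game-events", Records=records)

# Consumer side: register an enhanced fan-out consumer; it then receives a
# dedicated 2 MB/s pipe per shard via SubscribeToShard.
kinesis.register_stream_consumer(
    StreamARN="arn:aws:kinesis:us-east-1:123456789012:stream/game-events",
    ConsumerName="analytics-app",
)
```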

NEW QUESTION 10
A US-based sneaker retail company launched its global website. All the transaction data is stored in Amazon RDS and curated historic transaction data is stored in Amazon Redshift in the us-east-1 Region. The business intelligence (BI) team wants to enhance the user experience by providing a dashboard for sneaker trends.
The BI team decides to use Amazon QuickSight to render the website dashboards. During development, a team in Japan provisioned Amazon QuickSight in ap-northeast-1. The team is having difficulty connecting Amazon QuickSight from ap-northeast-1 to Amazon Redshift in us-east-1.
Which solution will solve this issue and meet the requirements?

  • A. In the Amazon Redshift console, choose to configure cross-Region snapshots and set the destination Region as ap-northeast-1. Restore the Amazon Redshift Cluster from the snapshot and connect to Amazon QuickSight launched in ap-northeast-1.
  • B. Create a VPC endpoint from the Amazon QuickSight VPC to the Amazon Redshift VPC so Amazon QuickSight can access data from Amazon Redshift.
  • C. Create an Amazon Redshift endpoint connection string with Region information in the string and use this connection string in Amazon QuickSight to connect to Amazon Redshift.
  • D. Create a new security group for Amazon Redshift in us-east-1 with an inbound rule authorizing access from the appropriate IP address range for the Amazon QuickSight servers in ap-northeast-1.

Answer: B

NEW QUESTION 11
A company has an application that ingests streaming data. The company needs to analyze this stream over a 5-minute timeframe to evaluate the stream for anomalies with Random Cut Forest (RCF) and summarize the current count of status codes. The source and summarized data should be persisted for future use.
Which approach would enable the desired outcome while keeping data persistence costs low?

  • A. Ingest the data stream with Amazon Kinesis Data Streams. Have an AWS Lambda consumer evaluate the stream, collect the number of status codes, and evaluate the data against a previously trained RCF model. Persist the source and results as a time series to Amazon DynamoDB.
  • B. Ingest the data stream with Amazon Kinesis Data Streams. Have a Kinesis Data Analytics application evaluate the stream over a 5-minute window using the RCF function and summarize the count of status codes. Persist the source and results to Amazon S3 through output delivery to Kinesis Data Firehose.
  • C. Ingest the data stream with Amazon Kinesis Data Firehose with a delivery frequency of 1 minute or 1 MB in Amazon S3. Ensure Amazon S3 triggers an event to invoke an AWS Lambda consumer that evaluates the batch data, collects the number of status codes, and evaluates the data against a previously trained RCF model. Persist the source and results as a time series to Amazon DynamoDB.
  • D. Ingest the data stream with Amazon Kinesis Data Firehose with a delivery frequency of 5 minutes or 1 MB into Amazon S3. Have a Kinesis Data Analytics application evaluate the stream over a 1-minute window using the RCF function and summarize the count of status codes. Persist the results to Amazon S3 through a Kinesis Data Analytics output to an AWS Lambda integration.

Answer: B

NEW QUESTION 12
A global company has different sub-organizations, and each sub-organization sells its products and services in various countries. The company's senior leadership wants to quickly identify which sub-organization is the strongest performer in each country. All sales data is stored in Amazon S3 in Parquet format.
Which approach can provide the visuals that senior leadership requested with the least amount of effort?

  • A. Use Amazon QuickSight with Amazon Athena as the data source. Use heat maps as the visual type.
  • B. Use Amazon QuickSight with Amazon S3 as the data source. Use heat maps as the visual type.
  • C. Use Amazon QuickSight with Amazon Athena as the data source. Use pivot tables as the visual type.
  • D. Use Amazon QuickSight with Amazon S3 as the data source. Use pivot tables as the visual type.

Answer: A
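
Registering Athena as a QuickSight data source (the data-source half of the chosen answer) can also be scripted. A minimal sketch is below; the account ID, data source ID, and workgroup are assumptions.

```python
# Hypothetical sketch: register Amazon Athena as a QuickSight data source so the
# Parquet sales data in S3 can feed the dashboard visuals.
import boto3

quicksight = boto3.client("quicksight")

quicksight.create_data_source(
    AwsAccountId="123456789012",
    DataSourceId="sales-athena-source",
    Name="Global sales via Athena",
    Type="ATHENA",
    DataSourceParameters={"AthenaParameters": {"WorkGroup": "primary"}},
)
```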

NEW QUESTION 13
A transport company wants to track vehicular movements by capturing geolocation records. The records are 10 B in size and up to 10,000 records are captured each second. Data transmission delays of a few minutes are acceptable, considering unreliable network conditions. The transport company decided to use Amazon Kinesis Data Streams to ingest the data. The company is looking for a reliable mechanism to send data to Kinesis Data Streams while maximizing the throughput efficiency of the Kinesis shards.
Which solution will meet the company’s requirements?

  • A. Kinesis Agent
  • B. Kinesis Producer Library (KPL)
  • C. Kinesis Data Firehose
  • D. Kinesis SDK

Answer: B
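
The KPL itself is a Java library backed by a native daemon, so there is no direct Python equivalent; for illustration only, a rough sketch of the idea it automates (aggregating many tiny records into each Kinesis record to maximize shard throughput) might look like this with the plain SDK. The stream name and record format are assumptions, and this is not the KPL's actual protobuf wire format.

```python
# Illustration only: manually packing many ~10 B readings into one Kinesis record,
# the aggregation the KPL performs transparently for its users.
import json
import boto3

kinesis = boto3.client("kinesis")

geolocation_records = [{"vehicle_id": i, "lat": 47.6, "lon": -122.3} for i in range(1000)]

# Pack many tiny readings into a single record so each 1 MB/s shard carries far
# more readings per PUT payload.
aggregated = "\n".join(json.dumps(r) for r in geolocation_records).encode()

kinesis.put_record(
    StreamName="vehicle-locations",    # assumed stream name
    Data=aggregated,
    PartitionKey="fleet-1",
)
```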

NEW QUESTION 14
A hospital is building a research data lake to ingest data from electronic health records (EHR) systems from multiple hospitals and clinics. The EHR systems are independent of each other and do not have a common patient identifier. The data engineering team is not experienced in machine learning (ML) and has been asked to generate a unique patient identifier for the ingested records.
Which solution will accomplish this task?

  • A. An AWS Glue ETL job with the FindMatches transform
  • B. Amazon Kendra
  • C. Amazon SageMaker Ground Truth
  • D. An AWS Glue ETL job with the ResolveChoice transform

Answer: A

Explanation:
Matching Records with AWS Lake Formation FindMatches
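
A minimal sketch of creating the FindMatches ML transform referenced in the answer is shown below. The database, table, primary-key column, role, and tradeoff values are assumptions; the transform would still need to be trained with labeled examples before use.

```python
# Hypothetical sketch: a Glue FindMatches ML transform for linking EHR records
# that lack a common patient identifier.
import boto3

glue = boto3.client("glue")

glue.create_ml_transform(
    Name="ehr-patient-matching",
    Role="arn:aws:iam::123456789012:role/ExampleGlueMLRole",
    InputRecordTables=[{"DatabaseName": "research_lake", "TableName": "ehr_records"}],
    Parameters={
        "TransformType": "FIND_MATCHES",
        "FindMatchesParameters": {
            "PrimaryKeyColumnName": "record_id",
            "PrecisionRecallTradeoff": 0.9,   # favor precision for patient matching
            "EnforceProvidedLabels": False,
        },
    },
)
```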

NEW QUESTION 15
A financial company uses Apache Hive on Amazon EMR for ad-hoc queries. Users are complaining of sluggish performance.
A data analyst notes the following:
  • Approximately 90% of queries are submitted 1 hour after the market opens.
  • Hadoop Distributed File System (HDFS) utilization never exceeds 10%.
Which solution would help address the performance issues?

  • A. Create instance fleet configurations for core and task nodes. Create an automatic scaling policy to scale out the instance fleet based on the Amazon CloudWatch CapacityRemainingGB metric. Create an automatic scaling policy to scale in the instance fleet based on the CloudWatch CapacityRemainingGB metric.
  • B. Create instance fleet configurations for core and task nodes. Create an automatic scaling policy to scale out the instance fleet based on the Amazon CloudWatch YARNMemoryAvailablePercentage metric. Create an automatic scaling policy to scale in the instance fleet based on the CloudWatch YARNMemoryAvailablePercentage metric.
  • C. Create instance group configurations for core and task nodes. Create an automatic scaling policy to scale out the instance groups based on the Amazon CloudWatch CapacityRemainingGB metric. Create an automatic scaling policy to scale in the instance groups based on the CloudWatch CapacityRemainingGB metric.
  • D. Create instance group configurations for core and task nodes. Create an automatic scaling policy to scale out the instance groups based on the Amazon CloudWatch YARNMemoryAvailablePercentage metric. Create an automatic scaling policy to scale in the instance groups based on the CloudWatch YARNMemoryAvailablePercentage metric.

Answer: D

Explanation:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-instances-guidelines.html
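
For reference, a sketch of an automatic scaling policy on a task instance group keyed to YARNMemoryAvailablePercentage (scale out when available memory drops, scale in when it recovers) is shown below. The cluster ID, instance group ID, capacities, and thresholds are assumptions.

```python
# Hypothetical sketch: EMR instance-group automatic scaling on YARNMemoryAvailablePercentage.
import boto3

emr = boto3.client("emr")

emr.put_auto_scaling_policy(
    ClusterId="j-EXAMPLE12345",
    InstanceGroupId="ig-EXAMPLETASK",
    AutoScalingPolicy={
        "Constraints": {"MinCapacity": 2, "MaxCapacity": 20},
        "Rules": [
            {
                "Name": "ScaleOutOnLowYarnMemory",
                "Action": {"SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "CHANGE_IN_CAPACITY",
                    "ScalingAdjustment": 2,
                    "CoolDown": 300,
                }},
                "Trigger": {"CloudWatchAlarmDefinition": {
                    "ComparisonOperator": "LESS_THAN",
                    "EvaluationPeriods": 1,
                    "MetricName": "YARNMemoryAvailablePercentage",
                    "Namespace": "AWS/ElasticMapReduce",
                    "Period": 300,
                    "Statistic": "AVERAGE",
                    "Threshold": 15.0,
                    "Unit": "PERCENT",
                }},
            },
            {
                "Name": "ScaleInOnHighYarnMemory",
                "Action": {"SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "CHANGE_IN_CAPACITY",
                    "ScalingAdjustment": -2,
                    "CoolDown": 300,
                }},
                "Trigger": {"CloudWatchAlarmDefinition": {
                    "ComparisonOperator": "GREATER_THAN",
                    "EvaluationPeriods": 1,
                    "MetricName": "YARNMemoryAvailablePercentage",
                    "Namespace": "AWS/ElasticMapReduce",
                    "Period": 300,
                    "Statistic": "AVERAGE",
                    "Threshold": 75.0,
                    "Unit": "PERCENT",
                }},
            },
        ],
    },
)
```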

NEW QUESTION 16
A mortgage company has a microservice for accepting payments. This microservice uses the Amazon DynamoDB encryption client with AWS KMS managed keys to encrypt the sensitive data before writing the data to DynamoDB. The finance team should be able to load this data into Amazon Redshift and aggregate the values within the sensitive fields. The Amazon Redshift cluster is shared with other data analysts from different business units.
Which steps should a data analyst take to accomplish this task efficiently and securely?

  • A. Create an AWS Lambda function to process the DynamoDB stream. Decrypt the sensitive data using the same KMS key. Save the output to a restricted S3 bucket for the finance team. Create a finance table in Amazon Redshift that is accessible to the finance team only. Use the COPY command to load the data from Amazon S3 to the finance table.
  • B. Create an AWS Lambda function to process the DynamoDB stream. Save the output to a restricted S3 bucket for the finance team. Create a finance table in Amazon Redshift that is accessible to the finance team only. Use the COPY command with the IAM role that has access to the KMS key to load the data from S3 to the finance table.
  • C. Create an Amazon EMR cluster with an EMR_EC2_DefaultRole role that has access to the KMS key. Create Apache Hive tables that reference the data stored in DynamoDB and the finance table in Amazon Redshift. In Hive, select the data from DynamoDB and then insert the output to the finance table in Amazon Redshift.
  • D. Create an Amazon EMR cluster. Create Apache Hive tables that reference the data stored in DynamoDB. Insert the output to the restricted Amazon S3 bucket for the finance team. Use the COPY command with the IAM role that has access to the KMS key to load the data from Amazon S3 to the finance table in Amazon Redshift.

Answer: B
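
A minimal sketch of the final COPY step from the chosen answer, run through the Redshift Data API with an IAM role that can read the restricted bucket and use the KMS key, is shown below. The cluster, database, schema, table, bucket, and role ARN are assumptions.

```python
# Hypothetical sketch: COPY the finance team's restricted S3 prefix into the
# finance-only Redshift table using an IAM role with S3 and KMS access.
import boto3

redshift_data = boto3.client("redshift-data")

copy_sql = """
    COPY finance.payments_sensitive
    FROM 's3://example-finance-restricted/payments/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftFinanceCopyRole'
    FORMAT AS JSON 'auto';
"""

redshift_data.execute_statement(
    ClusterIdentifier="example-shared-cluster",
    Database="analytics",
    DbUser="finance_loader",
    Sql=copy_sql,
)
```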

NEW QUESTION 17
A global pharmaceutical company receives test results for new drugs from various testing facilities worldwide. The results are sent in millions of 1 KB-sized JSON objects to an Amazon S3 bucket owned by the company. The data engineering team needs to process those files, convert them into Apache Parquet format, and load them into Amazon Redshift for data analysts to perform dashboard reporting. The engineering team uses AWS Glue to process the objects, AWS Step Functions for process orchestration, and Amazon CloudWatch for job scheduling.
More testing facilities were recently added, and the time to process files is increasing. What will MOST efficiently decrease the data processing time?

  • A. Use AWS Lambda to group the small files into larger files. Write the files back to Amazon S3. Process the files using AWS Glue and load them into Amazon Redshift tables.
  • B. Use the AWS Glue dynamic frame file grouping option while ingesting the raw input files. Process the files and load them into Amazon Redshift tables.
  • C. Use the Amazon Redshift COPY command to move the files from Amazon S3 into Amazon Redshift tables directly. Process the files in Amazon Redshift.
  • D. Use Amazon EMR instead of AWS Glue to group the small input files. Process the files in Amazon EMR and load them into Amazon Redshift tables.

Answer: A
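
For context on the file-grouping mechanism mentioned in option B, a minimal sketch of the Glue dynamic frame grouping options is shown below. The paths and group size are assumptions, and the snippet runs inside an AWS Glue job.

```python
# Hypothetical sketch: read many small JSON objects from S3 as larger groups
# using the Glue dynamic frame groupFiles / groupSize options.
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

test_results = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={
        "paths": ["s3://example-test-results/raw/"],
        "groupFiles": "inPartition",     # group small files while reading
        "groupSize": "134217728",        # target roughly 128 MB per group
    },
    format="json",
)
```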

NEW QUESTION 18
An IoT company wants to release a new device that will collect data to track sleep overnight on an intelligent mattress. Sensors will send data that will be uploaded to an Amazon S3 bucket. About 2 MB of data is generated each night for each bed. Data must be processed and summarized for each user, and the results need to be available as soon as possible. Part of the process consists of time windowing and other functions. Based on tests with a Python script, every run will require about 1 GB of memory and will complete within a couple of minutes.
Which solution will run the script in the MOST cost-effective way?

  • A. AWS Lambda with a Python script
  • B. AWS Glue with a Scala job
  • C. Amazon EMR with an Apache Spark script
  • D. AWS Glue with a PySpark job

Answer: A

NEW QUESTION 19
A data engineering team within a shared workspace company wants to build a centralized logging system for all weblogs generated by the space reservation system. The company has a fleet of Amazon EC2 instances that process requests for shared space reservations on its website. The data engineering team wants to ingest all weblogs into a service that will provide a near-real-time search engine. The team does not want to manage the maintenance and operation of the logging system.
Which solution allows the data engineering team to efficiently set up the web logging system within AWS?

  • A. Set up the Amazon CloudWatch agent to stream weblogs to CloudWatch Logs and subscribe an Amazon Kinesis data stream to CloudWatch Logs. Choose Amazon Elasticsearch Service as the end destination of the weblogs.
  • B. Set up the Amazon CloudWatch agent to stream weblogs to CloudWatch Logs and subscribe an Amazon Kinesis Data Firehose delivery stream to CloudWatch Logs. Choose Amazon Elasticsearch Service as the end destination of the weblogs.
  • C. Set up the Amazon CloudWatch agent to stream weblogs to CloudWatch Logs and subscribe an Amazon Kinesis data stream to CloudWatch Logs. Configure Splunk as the end destination of the weblogs.
  • D. Set up the Amazon CloudWatch agent to stream weblogs to CloudWatch Logs and subscribe an Amazon Kinesis Data Firehose delivery stream to CloudWatch Logs. Configure Amazon DynamoDB as the end destination of the weblogs.

Answer: B

Explanation:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_ES_Stream.html
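
A minimal sketch of the subscription step from the chosen answer, wiring the weblog log group to a Kinesis Data Firehose delivery stream whose destination is Amazon Elasticsearch Service, is shown below. The log group, delivery stream ARN, and IAM role are assumptions and are presumed to already exist.

```python
# Hypothetical sketch: subscribe a CloudWatch Logs log group to a Firehose
# delivery stream that forwards the weblogs to Amazon ES.
import boto3

logs = boto3.client("logs")

logs.put_subscription_filter(
    logGroupName="/ec2/reservation-weblogs",
    filterName="weblogs-to-firehose",
    filterPattern="",          # forward every log event
    destinationArn="arn:aws:firehose:us-east-1:123456789012:deliverystream/weblogs-to-es",
    roleArn="arn:aws:iam::123456789012:role/ExampleCWLtoFirehoseRole",
)
```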

NEW QUESTION 20
A company is migrating from an on-premises Apache Hadoop cluster to an Amazon EMR cluster. The cluster runs only during business hours. Due to a company requirement to avoid intraday cluster failures, the EMR cluster must be highly available. When the cluster is terminated at the end of each business day, the data must persist.
Which configurations would enable the EMR cluster to meet these requirements? (Choose three.)

  • A. EMR File System (EMRFS) for storage
  • B. Hadoop Distributed File System (HDFS) for storage
  • C. AWS Glue Data Catalog as the metastore for Apache Hive
  • D. MySQL database on the master node as the metastore for Apache Hive
  • E. Multiple master nodes in a single Availability Zone
  • F. Multiple master nodes in multiple Availability Zones

Answer: ACE

Explanation:
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-ha.html "Note : The cluster can reside only in one Availability Zone or subnet."
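
A sketch combining the three selected configurations, three master nodes in a single subnet, the AWS Glue Data Catalog as the Hive metastore, and data kept in S3 via EMRFS so it survives termination, might look like the following. The release label, instance types, subnet, bucket, and roles are assumptions.

```python
# Hypothetical sketch: HA EMR cluster (3 masters, one AZ) with the Glue Data
# Catalog as the Hive metastore and EMRFS (S3) for persistent storage.
import boto3

emr = boto3.client("emr")

emr.run_job_flow(
    Name="business-hours-cluster",
    ReleaseLabel="emr-5.36.0",
    Applications=[{"Name": "Hive"}, {"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {"Name": "Masters", "InstanceRole": "MASTER",
             "InstanceType": "m5.xlarge", "InstanceCount": 3},   # HA master quorum
            {"Name": "Core", "InstanceRole": "CORE",
             "InstanceType": "m5.xlarge", "InstanceCount": 3},
        ],
        "Ec2SubnetId": "subnet-0123456789abcdef0",   # single AZ, per EMR HA rules
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    Configurations=[{
        "Classification": "hive-site",
        "Properties": {
            "hive.metastore.client.factory.class":
                "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
        },
    }],
    LogUri="s3://example-emr-logs/",    # logs and data live in S3 via EMRFS
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```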

NEW QUESTION 21
......

P.S. 2passeasy is now offering DAS-C01 dumps with a 100% pass guarantee! All DAS-C01 exam questions have been updated with correct answers: https://www.2passeasy.com/dumps/DAS-C01/ (130 New Questions)