by Justin Cook
Hi All,
As we know, the new AWS Certified Data Engineer - Associate certification came out this month. Since it is new, this associate-level certification didn't have a lot of documentation out there, so here is my breakdown of what I focused on before the exam and what I would focus on after taking it.
Before I took it, I focused on this:
• Developing data ingestion and transformation techniques and coordinating data pipelines with programming concepts.
• Identifying the most effective data store, creating models, organizing schemas, and managing lifecycles.
• Maintaining, operating, and monitoring data pipelines. Evaluating data quality and analyzing data.
• Implementing proper authentication, authorization, data encryption, and governance. Enabling logging for security purposes.
• Competence in configuring and maintaining extract, transform, and load (ETL) pipelines, covering the entire data journey from ingestion to the destination.
• Application of high-level, language-agnostic programming concepts tailored to the requirements of the data pipeline.
• Proficiency in utilizing Git commands for effective source code control, ensuring versioning and collaboration within the development process.
• Utilizing data lakes to store data.
• Understanding general concepts related to networking, storage, and compute, providing a foundation for designing and implementing robust data engineering solutions.
But after taking the exam, I have to say these are the topics that I would focus on instead:
Start with Analytics:
• Amazon Athena
• Amazon EMR
• AWS Glue
• AWS Glue DataBrew
• AWS Lake Formation
• Amazon Kinesis Data Analytics
• Amazon Kinesis Data Firehose
• Amazon Kinesis Data Streams
• Amazon Managed Streaming for Apache Kafka (Amazon MSK)
• Amazon OpenSearch Service
• Amazon QuickSight
Then look into databases:
• Amazon DocumentDB (with MongoDB compatibility)
• Amazon DynamoDB
• Amazon Keyspaces (for Apache Cassandra)
• Amazon MemoryDB for Redis
• Amazon Neptune
• Amazon RDS
• Amazon Redshift
You also need a rough understanding of developer tools:
• AWS CLI
• AWS Cloud9
• AWS Cloud Development Kit (AWS CDK)
• AWS CodeBuild
• AWS CodeCommit
• AWS CodeDeploy
• AWS CodePipeline
A fundamental understanding of monitoring and governance tools is good as well:
• AWS CloudFormation
• AWS CloudTrail
• Amazon CloudWatch
• Amazon CloudWatch Logs
• AWS Config
• Amazon Managed Grafana
• AWS Systems Manager
• AWS Well-Architected Tool
There were quite a few questions about migration services, so get to know those:
• AWS Application Discovery Service
• AWS Application Migration Service
• AWS Database Migration Service (AWS DMS)
• AWS DataSync
• AWS Schema Conversion Tool (AWS SCT)
• AWS Snow Family
• AWS Transfer Family
Oddly, I had a few questions on backup solutions as well, so get to know these:
• AWS Backup
• Amazon Elastic Block Store (Amazon EBS)
• Amazon Elastic File System (Amazon EFS)
• Amazon S3
• Amazon S3 Glacier
Here is the outline: https://aws.amazon.com/certification/certified-data-engineer-associate/
Overall, the exam went fast, and if you know Glue and data pipelines, you will need much less than the 130 minutes allotted for the 65 questions.
Thanks, and contact us with any questions!