Main activities and responsibilities:
- Define the architecture and scope of various Big Data solutions, and deliver them.
- Build scalable ETL pipelines to ingest data from a variety of sources, identify critical data elements, and define data quality rules.
- Leverage Hadoop ecosystem knowledge to design and develop capabilities that deliver innovative and improved data solutions.
- Provide insights on areas of improvement, including Data Governance, best practices, large-scale processing, and anything else you are interested in.
- Support other teams by providing guidance on data usage and processing, and on how they can best leverage the platform you are building.
- Support bug fixing and performance analysis along the data pipeline.
Requirements:
- 3+ years of experience as a software engineer; strong skills in at least one programming language are mandatory, preferably Python, Java, or Scala
- 1+ year of experience as a Big Data Engineer or in a similar role
- 1+ year of experience with Hadoop, Spark, or other Big Data technologies
- Experience with distributed computing on large clusters
- Proven experience with AWS services such as S3, EMR, and CloudFormation
- Creative and innovative approach to problem-solving
Good to have:
- Experience with CI/CD using Jenkins, Terraform, or other related technologies
- Familiarity with Docker and Kubernetes
- Experience working with real-time data processing using Kafka, Spark Streaming, or similar technologies
- Experience working with Hive, Presto, or other querying frameworks
What do we offer you?