Automating Data Pipelines ★★★ Expert Level
This two-day course offers a way of thinking and a set of best practices for automating data pipelines from scratch. It starts with pipeline design, then moves on to implementing orchestration and making scripts automatable. It concludes with the components you need to monitor and manage an automated data pipeline, and the options for implementing them.
Upcoming courses
Currently there are no scheduled dates for this course. To be notified about upcoming dates, please choose "Reserve a seat".

*If you are a group of 5 or more, we are happy to arrange a training date that suits you best. To request one, please choose the "Reserve a seat" option.

About the course

Automating your data pipeline improves both its speed and its quality. This two-day module teaches you a pipeline design approach that you can apply to any work that involves processing data. We cover everything from the basics of creating automatable data pipelines to orchestrating scripts and monitoring an automated pipeline. You will learn how to manage data pipelines, including the why, when, and how of making code idempotent and improving scripts, and how to implement basic logging and validation checks.

The course follows a five-step structured approach that combines theory with case studies, so participants gain both a theoretical understanding and the experience and confidence to use their knowledge within the business. After completing this course, participants will be able to design and develop an automated data pipeline with appropriate quality control measures to monitor its performance over time.
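
To give a flavour of what "idempotent code with basic logging and validation checks" can look like, here is a minimal sketch in Python and SQL. It is not taken from the course material: the table name `daily_sales`, the function `load_daily_sales`, and the sample rows are hypothetical, and SQLite is used only so that the example is self-contained.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")


def load_daily_sales(conn, rows):
    """Idempotent load step: re-running it with the same rows changes nothing."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_sales ("
        "day TEXT PRIMARY KEY, amount REAL NOT NULL)"
    )
    # Validation check: refuse bad input before anything is written.
    bad = [r for r in rows if r[1] is None or r[1] < 0]
    if bad:
        raise ValueError(f"{len(bad)} row(s) failed validation: {bad}")
    # Keying the write on the primary key turns it into an upsert,
    # so a second run overwrites rows instead of duplicating them.
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO daily_sales (day, amount) VALUES (?, ?)",
            rows,
        )
    count = conn.execute("SELECT COUNT(*) FROM daily_sales").fetchone()[0]
    log.info("loaded %d row(s); table now holds %d", len(rows), count)
    return count


# Hypothetical sample data, loaded twice to show the step is safe to re-run.
conn = sqlite3.connect(":memory:")
rows = [("2024-01-01", 120.0), ("2024-01-02", 95.5)]
first = load_daily_sales(conn, rows)
second = load_daily_sales(conn, rows)
```

Both calls leave the table with the same two rows, which is the property that makes a step safe to retry or reschedule automatically.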

Why this is for you

Do you still manually run scripts or perform checks in a periodic data process? Do you feel like you could improve the speed of your data pipeline by redesigning it? Do you run into bugs with your data pipeline? This course will teach you an improved way of thinking about designing, developing, and monitoring data pipelines, and give you hands-on experience with widely used tools for doing so. Reduce the time you spend on manual tasks that could be automated, and learn how to manage the automated process well.

For whom

This course is designed for AI Engineers, Data Engineers, and Data Scientists who have experience with manipulating data and programming and are looking to automate the flow of their data from source to models and applications. Before signing up for this course, you must have completed both the Data Models and Manipulation (4204) and Programming Meta-Skills (4205) badges. Expert-level programming skills in SQL and Python are also a prerequisite, as both languages are used at an advanced level in the cases during this course.

What you’ll learn

  1. Designing an end-to-end (E2E) automated data pipeline for an E2E AI solution
  2. Defining and orchestrating components to create an automated data pipeline
  3. Ways of changing and improving scripts to make them automatable
  4. Managing quality of automated data pipelines
  5. Implementing quality control measures in your data pipelines

Learning Goals

  • Design data pipelines – Based on design and data flow requirements for an E2E AI solution
  • Orchestrate scripts – Automate data pipelines and define requirements from each component
  • Write automatable code – Change and improve scripts to make them automatable
  • Manage quality of automated data pipelines – Define components to monitor data pipeline quality and plan how to act on irregularities
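
As an illustration of the orchestration and quality-control goals above, the following is a minimal sketch in Python, not the course's own tooling. The task names (`extract`, `transform`, `validate`, `load`), the `DEPENDENCIES` mapping, and the shared `state` dictionary are all hypothetical. The standard-library `graphlib` module (Python 3.9+) stands in for a real orchestrator.

```python
import logging
from graphlib import TopologicalSorter

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("orchestrator")


def extract(state):
    state["raw"] = [3, 1, 2]  # placeholder for reading from a source


def transform(state):
    state["clean"] = sorted(state["raw"])


def validate(state):
    # Quality check: the transform must not drop or add rows.
    if len(state["clean"]) != len(state["raw"]):
        raise ValueError("row count changed during transform")


def load(state):
    state["loaded"] = list(state["clean"])  # placeholder for a real write


TASKS = {"extract": extract, "transform": transform,
         "validate": validate, "load": load}

# Each task maps to the set of tasks that must finish before it starts.
DEPENDENCIES = {
    "transform": {"extract"},
    "validate": {"transform"},
    "load": {"validate"},
}


def run_pipeline():
    """Run every task in dependency order, stopping at the first failure."""
    state = {}
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        log.info("running %s", name)
        try:
            TASKS[name](state)
        except Exception:
            log.exception("task %s failed; stopping pipeline", name)
            raise
    return order, state


order, state = run_pipeline()
```

Declaring dependencies separately from the task functions keeps each script independently runnable, and placing `validate` before `load` means irregular data is logged and stopped before it reaches downstream consumers.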

Theory and practical use

All trainings in the GAIn portfolio combine high-quality standardized training material with theory sessions from experts and hands-on experience where you directly apply the material to real-life cases. Each training is developed by top-of-the-field practitioners, so it is full of industry examples, practical challenges, and know-how that fuel the interactive discussions during training. We believe this multi-level approach creates the ideal learning environment for participants to thrive.