Deploying a reliable and scalable machine learning project requires substantial software engineering and devops work. Mathematical algorithms are typically a small portion of the overall deployment. The data engineer will assist data scientists and machine learning engineers with both coding and devops to build and deploy machine learning models.
You are not expected to have machine learning experience, but you will be in charge of many of the tasks involved in running machine learning code in production. These include implementing distributed systems such as microservices and task queues, as well as automating data ingestion and preprocessing steps.
You should be comfortable with day-to-day operations and DBA tasks like launching EC2 instances, installing packages, writing shell scripts, and running SQL migrations.
Design and implement microservices to run machine learning models
Help implement the next version of a distributed task queue for data processing
Support data access layers and data ingestion for PostgreSQL databases
Assist with devops and automation prior to full production releases
Work on any task and help solve problems when needed — be humble and scrappy!
WHAT YOU’LL NEED TO SUCCEED IN THE ROLE
3+ years of experience building backend architecture for distributed systems
Bachelor’s or equivalent degree in computer science or a related field
Experience with Python and SQL
Comfortable with Linux, Bash, and Git
Experience working with AWS and its services (e.g. EC2, RDS, VPC)
Analytical and problem-solving skills
Proven ability to work in a collaborative and fast-paced environment