Data Engineer focused on developing and sustaining Machine Learning solutions for ICBC's data needs. Collaborating with Data Scientists and Statistical Analysts on data-driven projects.
Responsibilities
Understanding Data Science, Machine Learning, Performance & Evaluative Analytics model requirements, working closely with Data Scientists & Statistical Analysts, supporting them with their data and Machine Learning operational needs.
Operationalizing data science models into Machine Learning pipelines, optimizing the model code, conducting model training and re-training, deploying the models and sustaining them in Production.
Responding to data requests, data discovery and data profiling to support various data science, evaluative and machine learning solutions and projects, reviewing and clarifying data requirements, ensuring the data artifacts are acceptable within policy and privacy protocols.
Providing subject matter & data expertise to the Strategic Analytics, Actuarial and Regulatory Affairs departments as well as ICBC divisional clients on data sources, reporting workflows, business processes, and the appropriate tools with which to analyze their data.
Participating with corporate data user teams, developing data science model validation and test plans, performing user acceptance testing, supporting data scientists and evaluative & performance metrics analysts, and sustaining their end products.
Conducting analysis for moderate to complex strategic solutions and POCs, defining data fields and determining data availability, developing information layout, format and interactivity. Presenting findings and providing clarification.
Requirements
Proven work experience coding in Python and the PySpark data framework is required.
Experience working with ML libraries & frameworks including Scikit-Learn for traditional ML, TensorFlow and PyTorch for deep learning.
Proficiency in the data science stack, such as NumPy, PySpark and Pandas, for data manipulation.
Technical knowledge in cleaning, transforming and preparing un-curated data, including handling of missing values and feature scaling.
Exposure to Machine Learning Operations (MLOps) supporting model development, including skills with Docker for containerization, API development and cloud platforms.
Knowledge & experience with Machine Learning algorithms and techniques.
Experience or exposure to working with pre-trained models such as Large Language Models (LLMs), using Retrieval-Augmented Generation (RAG) and working with Hugging Face pre-trained models.
Experience with processing structured and unstructured data.
Intermediate to advanced experience writing SQL queries & working with NoSQL databases.
Knowledge of experiment tracking & management using tools like MLflow and Data Version Control (DVC), managing model versions, parameters and results.
Pipeline orchestration using Apache Airflow to automate training, testing and deployment workflows.
Setting up automated pipelines for continuous integration and continuous deployment (CI/CD) using GitLab.
Excellent interpersonal, verbal and written communication skills to work with customers.
Strong understanding of data quality management processes, data analysis and data profiling.
Ability to apply critical thinking skills to troubleshoot and perform root cause analysis on technical problems and Machine Learning model deployments.
Understanding of Agile Methodologies.
Experience with reporting and visualization tools, such as Tableau, Jupyter or other reporting tools, would be an asset.