Data Scientist specializing in data architecture and ETL workflows at Xsolla. Collaborating with engineering and data teams to optimize data processes for game developers.
Responsibilities
Architecture & Development
Design, build, and optimize data pipelines and ETL workflows in Snowflake using Snowpark, Streams/Tasks, and Snowpipe.
Develop scalable data models and algorithms supporting user 360 views, churn prediction, and recommendation engine inputs.
Lead integration across data sources: MySQL, BigQuery, Redis, Kafka, GCP Storage, and API Gateway.
Implement CI/CD for data pipelines using Git, dbt, and automated testing.
Define data quality checks and auditing pipelines for ingestion and transformation layers.
Leadership & Collaboration
Mentor and guide junior data engineers on data modeling, performance tuning, and Snowflake best practices.
Partner with Data Science, ML, and Backend teams to productionize machine learning features in Snowflake.
Work closely with Legal, Security, and Infrastructure teams to ensure compliance, privacy, and governance of user data (PII).
Collaborate with the Director of Data Platforms and product stakeholders to translate business requirements into technical specifications.
Performance & Scalability
Tune algorithm performance.
Establish data partitioning, clustering, and materialized views for fast query execution.
Build dashboards and monitors for pipeline health, job success, and data latency metrics (e.g., via Looker, Tableau, or Snowsight).
Governance & Best Practices
Establish and enforce naming conventions, data lineage, and metadata standards across schemas.
Lead code reviews, enforce documentation standards, and manage schema versioning.
Contribute to the company’s evolving data mesh and streaming architecture vision.
Requirements
5+ years of experience as a Data Scientist, with **3+ years working with the Spark framework**.
Strong SQL and Python skills, with proven experience building **ETL/ELT** at scale.
Deep understanding of algorithm **performance tuning**, **query optimization**, and **warehouse orchestration**.
Experience with **data pipeline orchestration** (Airflow, Prefect, dbt, or similar).
Solid understanding of **data modeling** (Kimball, Data Vault, or hybrid).
Proficiency in **Kafka**, **GCP**, or **AWS** for real-time or batch ingestion.
Familiarity with **API-based data integration** and **microservice architectures**.
**Preferred**
Experience leading **machine learning teams** and/or deploying **ML feature pipelines**.
Background in **ad-tech, gaming, or e-commerce** recommendation systems.
Familiarity with **data contracts** and **feature stores** (Feast, Tecton, or custom-built).
Experience managing small data engineering teams and setting technical direction.
Strong ownership and ability to work autonomously in a fast-paced environment.
Excellent cross-functional communication, with the ability to translate between engineering and business.
Hands-on problem solver who balances velocity with reliability.
Collaborative mentor who raises the bar for team quality and discipline.