Software Data Engineer evolving data models and operationalizing data pipelines at BenchSci. Collaborating with a world-class team to shape the future of scientific discovery.
Responsibilities
Collaborate with Machine Learning engineers, Full-stack engineers, and Science teams to solve complex document mining challenges, helping us capture and model additional scientific experiments
Scale data pipelines to allow our data to go from research to platform quickly and reliably
Work with sources that contain both semi-structured and unstructured data
Use your experience to help define and apply best practices for a broad platform of technologies in a cloud-based environment
Architect and maintain robust data pipelines that ingest diverse sources and utilize LLMs for high-fidelity entity extraction into structured formats
Implement evaluation frameworks to monitor the accuracy, drift, and hallucination rates of extraction models within the production pipeline
Lead or consult on the authoring of engineering design proposals following the unified Platform Stream roadmap at BenchSci
Leverage a deep understanding of the business context and the team’s goals to unlock independent technical decisions in the face of open-ended requirements
Proactively identify new opportunities (from both internal and external sources) and advocate for and implement improvements to the current state of projects
Respond to operational issues with urgency and drive that urgency within your own team, owning resolution within your sphere of responsibility
Challenge the status quo and propose newer technologies or ways of working
Requirements
A degree in Computer Science/Engineering or a related field within science
3+ years of experience working as a software developer in industry
Proficient with Python
Proficient with SQL
Experience using LLMs for structured data extraction
Experience with event-driven architecture with Pub/Sub
A track record in building high-quality, maintainable code
Experience with cloud computing (for example: GCP, Azure, AWS)
Benefits
A great compensation package that includes BenchSci equity options
A robust vacation policy plus an additional vacation day every year
Company closures for 14 more days throughout the year
Flex time for sick days, personal days, and religious holidays
Comprehensive health and dental benefits
Annual learning & development budget
A one-time home office set-up budget to use upon joining BenchSci
An annual lifestyle spending account allowance
Generous parental leave benefits with a top-up plan or paid time off options
The ability to save for your retirement coupled with a company match!
Data Engineer helping to improve ETL processes for investment analyses at The Battle of Giants. Collaborating directly with leadership to shape strategies and insights.
Data Engineer at Tiger Analytics architecting scalable Generative AI solutions in the AWS ecosystem for Fortune 500 partners. Joining a team with deep expertise in Data Science and Machine Learning.
Senior Information Architect/Data Engineer working with a global software services provider. Leading the architecture of a new cloud data platform for innovative technology solutions.
Senior Software Developer modernizing Data Transfer Platform for Intrahealth, a healthcare EMR provider. Focusing on scalable and configurable backend systems in a complex environment.
Data Engineer Intern gaining hands-on experience in TD's big data platform. Collaborating on software development and system enhancements while learning about analytical tools and technologies.
Senior Data Engineer at Mozilla managing data lifecycle and quality. Building data pipelines and collaborating with product teams for data-driven decisions.
Principal Product Manager leading product strategy for health data platform at PointClickCare. Collaborating across teams to optimize health data for analytics and care delivery.
Data Engineer optimizing and maintaining data pipelines in Blackline Safety's IoT-enabled safety ecosystem. Collaborating with product, engineering, and analytics teams on impactful data-driven initiatives.
Azure Data Engineer contractor for Ontario Crown Corporation. Design/build data pipelines using Azure Data Factory, Databricks, Python, PySpark. 3 days/week onsite in Oshawa.