Software Data Engineer evolving data models and operationalizing data pipelines at BenchSci. Collaborating with a world-class team to shape the future of scientific discovery.
Responsibilities
Collaborate with Machine Learning and Full-stack engineers and the Science team to solve complex document mining challenges, helping us capture and model additional scientific experiments
Scale data pipelines to allow our data to go from research to platform quickly and reliably
Work with sources that contain both semi-structured and unstructured data
Use your experience to help define and apply best practices for a broad platform of technologies in a cloud-based environment
Architect and maintain robust data pipelines that ingest diverse sources and utilize LLMs for high-fidelity entity extraction into structured formats
Implement evaluation frameworks to monitor the accuracy, drift, and hallucination rates of extraction models within the production pipeline
Lead or consult on the authoring of engineering design proposals following the unified Platform Stream roadmap at BenchSci
Leverage a deep understanding of the business context and the team’s goals to make independent technical decisions in the face of open-ended requirements
Proactively identify new opportunities (from both internal and external sources) and advocate for and implement improvements to the current state of projects
Respond to operational issues with urgency, drive that urgency within your own team, and own resolution within your sphere of responsibility
Challenge the status quo and propose newer technologies or ways of working
Requirements
A degree in Computer Science/Engineering or a related field within science
3+ years of experience working as a software developer in industry
Proficient with Python
Proficient with SQL
Experience using LLMs for structured data extraction
Experience with event-driven architecture with Pub/Sub
A track record in building high-quality, maintainable code
Experience with cloud computing (for example: GCP, Azure, AWS)
Benefits
A great compensation package that includes BenchSci equity options
A robust vacation policy plus an additional vacation day every year
Company closures for 14 more days throughout the year
Flex time for sick days, personal days, and religious holidays
Comprehensive health and dental benefits
Annual learning & development budget
A one-time home office set-up budget to use upon joining BenchSci
An annual lifestyle spending account allowance
Generous parental leave benefits with a top-up plan or paid time off options
The ability to save for your retirement coupled with a company match!
Data Engineer building scalable databases and components for regulatory and business analytics at Bounteous. Collaborating with global teams and following agile methodologies.
AWS Data Engineer working on AWS applications in a data-intensive environment. Focusing on technical design, development, and maintenance of cloud applications with long-term growth opportunities.
Senior Data Engineer building scalable data infrastructure on Snowflake and AWS for a dynamic iGaming startup. Collaborating closely with analysts to turn raw data into trustworthy data products and insights.
Senior Data Engineer responsible for designing data pipelines at Samsara for IoT data processing. Collaborating with Data teams to ensure efficient data analysis and integration from IoT devices.
Software Engineering Intern building scalable data platforms at a venture-backed startup. Contributing to data processing for high-volume telematics data in Toronto.
Technology Designer at iA Financial Group, designing and evolving robust data platforms and DataOps practices. Collaborating with teams to implement cloud infrastructures and CI/CD pipelines.
Senior Data Engineer developing a trustworthy analytics data layer for commonsku's platform. Leading projects and mentoring team members in a remote-first work environment.
Data Engineer III developing scalable data infrastructure and systems for Ad Hoc, a tech company transforming public-sector digital services. Collaborating with teams to drive data engineering improvements.