About the role

Data Scientist specializing in LLMs at Tekever, developing algorithms and innovative solutions for human language processing.

Responsibilities

Develop, implement and optimize advanced algorithms, models and capabilities that help teams automate their workloads.
Work on a variety of projects that involve understanding, processing and generating human language to solve complex problems and create innovative solutions.
Design, develop and implement state-of-the-art algorithms and models, within the context of language models.
Realize new AI-based capabilities in areas such as decision support, mission planning and workflow automation.
Train and optimize large language models using vast amounts of textual data, ensuring high performance and accuracy.
Perform data preprocessing tasks such as tokenization, stemming, lemmatization and normalization to prepare datasets for training and evaluation.
Stay current with the latest advancements in LLMs and Natural Language Processing (NLP) and apply new techniques to improve existing models and develop new solutions.
Work closely with data engineers, software developers, product managers and other stakeholders to understand project requirements and deliver effective solutions.
Evaluate the performance of models using appropriate metrics and techniques and iteratively improve their accuracy and efficiency.
Collaborate with engineering teams to deploy models into production environments and ensure their robustness and scalability.
Maintain comprehensive documentation of models, algorithms and processes for future reference and reproducibility.

Requirements

Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field. A Ph.D. is a plus.
3+ years of experience in data science, with a focus on large language models and NLP.
Strong programming skills in Python, with experience using NLP and LLM libraries such as spaCy, Hugging Face (Transformers, Datasets, PEFT, TRL) and the major model families (e.g. GPT, Claude, Gemini, Llama, Mistral, Qwen, Gemma) via both API and open weights.
Proficiency in deep learning frameworks, primarily PyTorch (plus Keras/TensorFlow as needed), and familiarity with inference optimisation (quantisation, TensorRT-LLM).
Experience with data preprocessing, curation and tokenisation for LLM workloads, including building and cleaning datasets for fine-tuning and retrieval (chunking, embeddings, deduplication, synthetic data generation).
Solid understanding of transformer architectures and attention, with working knowledge of fine-tuning and alignment techniques (full fine-tuning, LoRA/QLoRA, instruction tuning, RLHF/DPO).
Exposure to RNNs and CNNs is a plus rather than a core requirement.
Experience training and fine-tuning LLMs and building RAG and agentic systems, including orchestration frameworks (LangChain, LlamaIndex, LangGraph), vector databases (e.g. Qdrant, Weaviate, pgvector) and tool/function calling.
Experience with experimentation and tracking tooling: Jupyter notebooks plus experiment and prompt tracking (MLflow, Weights & Biases) and LLM evaluation (e.g. Ragas, LangSmith/Langfuse, custom eval harnesses).
Familiarity with cloud platforms (AWS, Azure, Google Cloud) and their AI services, with a focus on Google Cloud (Vertex AI, Model Garden, managed endpoints).
Experience deploying self-hosted and open-weight LLMs in production, using serving frameworks such as vLLM, TGI, Ollama or llama.cpp, with awareness of GPU sizing, quantisation formats (GGUF, AWQ, GPTQ) and on-prem or air-gapped constraints.
Working knowledge of MLOps/LLMOps and DevOps practices: Git, CI/CD, containerisation (Docker, Kubernetes), plus telemetry, monitoring and observability for model and inference performance.
Excellent analytical and problem-solving skills with the ability to design innovative solutions to complex problems.
Experience with, or awareness of, AI ethics, fairness and bias mitigation strategies in the context of NLP and LLMs.
Strong verbal and written communication skills, with the ability to explain complex technical concepts to non-technical stakeholders.
Ability to work effectively in a collaborative, cross-functional team environment.
High attention to detail and a commitment to ensuring the accuracy and quality of work.
Ability to thrive in a fast-paced, dynamic environment and manage multiple projects simultaneously.

Benefits

An excellent work environment and an opportunity to create a real impact in the world
A truly high-tech, state-of-the-art engineering company with flat structure and no politics
Working with the very latest technologies in Data & AI, including Edge AI and swarming, both within our software platforms and within our embedded on-board systems
Flexible work arrangements
Professional development opportunities
Collaborative and inclusive work environment
Salary compatible with the level of proven experience

Data Scientist – LLM Engineer

at TEKEVER

