At Roche you can be fully yourself and are valued for your unique qualities. Our culture encourages personal expression, open dialogue, and genuine connections. Here you are appreciated, accepted, and respected for who you are. This creates an environment in which you can grow both personally and professionally. Together we aim to prevent, stop, and cure diseases and to ensure that everyone has access to healthcare, today and in the future. Join Roche, where every voice matters.
The Position
As a Data Transfer Engineer within the Accelerated Compute Engineering (ACE) team, you will be responsible for owning the high-speed data transfer services that feed our advanced compute environments. With the introduction of our industry-leading AI Factory, the ability to move massive, petabyte-scale datasets rapidly and securely is more critical than ever. You will own the deployment, configuration, and ongoing optimization of specialized Data Transfer Appliances that bridge our traditional HPC clusters with our next-generation AI infrastructure. By ensuring seamless, high-bandwidth data mobility and aggressively eliminating network and I/O bottlenecks, you will play a foundational role in ensuring that our researchers can train complex AI models and execute large-scale computational science workloads without friction.
Description of the area
Hosting and Infrastructure (HI) provides mission-critical on-premise infrastructure, cloud hosting, connectivity, and technology products that enable all functions at every Roche site to develop, innovate, connect, and deliver compliant digital products across the Roche Enterprise.
The Value Streams - Accelerated Compute Engineering (ACE) Team drives both customer success and platform success by acting as a center of excellence and delivery for the High Performance Compute and AI infrastructure that supports AI and HPC use cases across Roche. The team facilitates seamless onboarding and adoption for business vertical customers who need accelerated compute. It helps those infrastructure consumers meet requirements for high availability, seamless data transfer, flexibility, speed, and the rapidly changing demands of AI, so that they achieve rapid time-to-value.
Job Responsibilities
Pipeline Architecture & Management
- Own the end-to-end deployment and lifecycle management of specialized Data Transfer Appliances.
- Identify and resolve systemic network and I/O bottlenecks to maximize throughput between storage tiers and compute nodes.
Technical Operations & Optimization
- Manage integration with various storage architectures, including Object Storage, NAS (NFS), and parallel file systems such as GPFS.
- Implement automation for on-site deployment and configuration of our factory-built and tuned Data Transfer Appliances using Ansible, Kickstart, or Red Hat Satellite to ensure scalable and reproducible environments.
Performance & Monitoring
- Analyze system logs, kernel messages, and network statistics to proactively monitor the health and performance of the data transfer fabric.
- Define and track success metrics for data mobility, providing insights to leadership on infrastructure utilization and performance trends.
User Enablement & Support Experience
- Explore innovative approaches to drive more frictionless consumption of our Data Transfer Appliances, e.g., Agentic AI and MCP.
- Develop user-facing documentation, scripts, and tools that simplify complex data movement tasks for the broader scientific community.
- Provide high-level technical support, isolating complex issues across storage, network, and application layers in a collaborative, cross-functional manner.
Who You Are
Basic Qualifications:
- Bachelor’s or an advanced degree in Computer Science, Engineering, or a similar technical discipline.
- Proven experience in managing large-scale Linux environments (RHEL/rebuilds) with deep proficiency in CLI tools (grep, awk, sed, etc.).
- Demonstrated experience in high-performance computing (HPC) or AI infrastructure environments.
Technical & Business Skills:
- Scripting & Automation: Strong proficiency in Bash and Python scripting, along with hands-on experience using Ansible for infrastructure as code.
- Storage & Networking: Deep understanding of parallel file systems (Lustre, GPFS) and high-speed networking (InfiniBand, TCP/IP tuning).
- Security: Solid grasp of IAM configurations (LDAP, AD, OIDC), JWT tokens, and certificate management.
- Problem Solving: A diagnostic mindset with the ability to interpret complex logs (system, kernel, network) to isolate performance degradation.
- Technical Communication: Excellent ability in technical English (reading and writing) to document complex architectures and guide users.
Leadership & Mindset:
- Lean & Agile Mindset: You focus on automation and efficiency to scale support and operations.
- Enterprise Mindset: Ability to break down silos and collaborate across organizational boundaries to ensure end-to-end data mobility.
- Intellectual Curiosity: A passion for staying current with major IT market trends, specifically in AI hardware and high-speed data movement.
Who we are
A healthier future drives us to innovate. More than 100,000 employees worldwide work together to advance science and to ensure that everyone has access to healthcare, today and for generations to come. Through our efforts, more than 26 million people are treated with our medicines and more than 30 billion tests are conducted with our diagnostics products. We encourage one another to explore new possibilities, foster creativity, and set high goals in order to deliver life-changing healthcare solutions.
Together, we can build a healthier future.
Roche is an Equal Opportunity Employer.