HPC Senior System/Software Administrator, BioFrontiers Institute
Job Summary
The HPC Senior System/Software Administrator will be a senior member of the BioFrontiers IT team focused on providing technical assistance, software management, day-to-day operations, and problem-solving for BioFrontiers-owned computing and software resources. This role will serve as the BioFrontiers subject matter expert for all scientific computation and storage services.
This position carries with it a general expectation to respond, within a reasonable time frame set by the position's supervisor, to critical issues and incidents that arise outside of normal business hours. This expectation is consistent with commitments BioFrontiers has made to customers that many of its services will have 24/7 “best effort” coverage.
Who We Are
At the University of Colorado BioFrontiers Institute, researchers from the life sciences, physical sciences, computer science and engineering are working together to uncover new knowledge at the frontiers of science and partnering with industry to transform their discoveries into new tools.
The BioFrontiers Institute is uniquely defined both by our excellent faculty, research, and leadership and by the scientific and geographical ecosystem that empowers our work. At BioFrontiers, we have an outstanding "frontier" culture that enables researchers to explore new areas of bioscience by enhancing resources and talents across the Institute and the university system. The Institute integrates faculty members from eleven academic departments, allowing them to work across fields. BioFrontiers drives innovation without boundaries.
What Your Key Responsibilities Will Be
- Diagnose, solve, and implement solutions for the BioFrontiers storage resources and computational clusters.
- Provide primary management of BioFrontiers computational and storage services, including all subsequent offerings, and independently make decisions for these functions.
- Serve as the subject matter expert for BioFrontiers scientific computing resources including technical guidance of junior staff and student employees.
- Primary tasks may include hardware repairs, operating system installation and configuration, system software updates, and procedure automation.
- Assist with network hardware and network service maintenance and configuration. Respond to end-user queries.
- Install, update, test, document, and maintain open-source and commercial software packages used by supported researchers.
- Maintain environment modules.
- Acquire and maintain licenses for commercial software.
- Assist end-users with using centrally maintained software and libraries, and with configuring, compiling, installing, debugging, and optimizing their own software.
- Perform daily maintenance operations for all BioFrontiers HPC hardware and storage resources including, but not limited to, installation, disk replacements, parts replacement, and hardware troubleshooting.
- Advise BioFrontiers IT leadership on necessary expenditures and improvements of computing resources.
- Perform proactive daily monitoring and health checks of the BioFrontiers computing and storage infrastructure. Use and extend the existing Nagios/Zabbix/Icinga infrastructure, and develop additional monitoring scripts.
- Independently plan and design monitoring services, and continually improve them.
- Maintain and/or create documentation in support of the BioFrontiers Computing Infrastructure.
- Train storage and computational users on existing BioFrontiers systems and efforts. This will consist of individual training, teaching of training courses, and participation in workshop teaching efforts.
What You Should Know
- A hybrid work modality is available with 3 days in the office and 2 days remote per week.
- Work hours include standard business hours with occasional off-hours maintenance windows and emergency response times as needed.
What We Can Offer
The annual hiring range for this position is $97,775 - $117,500. Relocation is available for eligible candidates.
What We Require
- Bachelor's Degree in Computer Science, Computer Engineering, Natural Sciences, or related field.
- Three years of relevant experience as described below.
- An equivalent combination of education and experience as described below may substitute.
Knowledge of, and production experience with, a combination of the following:
- Compiling, installing, and/or developing scientific or engineering software in a Linux environment.
- Building, configuring, and administering Linux or Unix computer systems.
- Diagnosing system and application software problems.
- Experience programming in C or C++.
- Experience scripting in Bash, Python, or Go.
- Familiarity with version control systems (e.g., Git or Subversion) and GitLab.
- Familiarity with configuration management tools (e.g., Puppet, Salt).
- Experience with network filesystems, particularly NFS and CIFS.
- Knowledge of, or experience in, networking systems and software including DNS, LDAP, and TCP/IP.
What You Will Need
- Ability to exercise independent judgment and decision-making skills in guiding a complex service and all subsequent requirements.
- Ability to network with campus technical resources to assist in providing support for research operations.
- Proficiency with video-conferencing solutions to assist in everyday tasks including technical support.
- Ability to work effectively both within a team and independently.
- Ability to communicate with end-users via an electronic ticketing system, by phone, or in person.
- Ability to follow through with assignments and commitments in a timely and professional manner.
What We Would Like You to Have
- Experience in system and related network administration of complex computer systems, specifically Linux systems and preferably Linux clusters. Experience in end-user support.
- Experience with software building and configuration tools (e.g., makefiles, autoconf, or environment modules).
- Familiarity with parallel programming (e.g., OpenMP, MPI).
- Familiarity with ticket tracking systems.
- Experience working in an environment with constantly evolving job priorities.
- Administrative experience with Slurm workload manager.
- Administrative experience with ZFS and ZFS native replication tools.
- Administrative experience with NetApp storage systems.
- Administrative experience with Dell PowerScale storage systems.
- Experience with disaster recovery planning and execution for enterprise-grade storage operations.
- Experience with hands-on hardware troubleshooting and maintenance tasks.
- Administrative experience with parallel file systems, especially BeeGFS.
Special Instructions
- A current resume.
- A cover letter that specifically addresses how your background and experience align with the requirements, qualifications and responsibilities of the position.
Posting Contact Information
Posting Contact Name: Boulder Campus Human Resources
Posting Contact Email: Recruiting@colorado.edu