• Systems Administrator

    Job Locations US-MA-Boston
    Posted Date 4 days ago (5/17/2018 4:27 PM)
    Information Technology
  • Overview

    The Million Veteran Program receives and stores large volumes of genomic data for Veterans around the world. We are looking for a Linux, Windows Server, and storage array administrator who will be responsible for designing, implementing, and monitoring a high-performance computing (HPC) cluster. The HPC consists of 110 compute nodes and 50 virtual machines with 4 petabytes of storage, plus a smaller 16-node remote cluster. The administrator will also collaborate with other team members to develop automation strategies, performance reports, and deployment processes.

  • Responsibilities



      - Help tune performance and ensure high availability of the HPC

      - Design and develop HPC monitoring and reporting tools

      - Develop and maintain configuration management solutions

      - Balance load across multiple users on the same network

      - Develop test automation frameworks in collaboration with the rest of the team

      - Create tools to help teams make the most out of the available infrastructure

      - Develop performance reports with recommendations on how to improve poorly performing areas

  • Qualifications




      - Experience with Linux and Windows servers in physical and virtualized environments

      - Experience with the fundamentals of Linux and Windows scripting languages

      - Knowledge of genomic data is a plus but not required

      - Ability to work remotely with other administrators on the HPC

      - Experience with LDAP installation and troubleshooting

      - Experience installing, configuring, and maintaining services such as LSF, SSSD, Apache, MySQL, PostgreSQL, etc.

      - Strong grasp of configuration management tools, such as Ansible, Puppet or Chef

      - Familiarity with load balancing, firewalls, etc.

      - Proficiency with network tools such as iptables, Linux IFB, etc.

      - Experience with virtualization technologies, such as libvirt

      - Ability to build and monitor services on production servers using CLI tools and monitoring frameworks such as Cacti, Nagios, Grafana or collectd

      - Knowledge of network switching technologies and administration


