Senior HPC Systems Support Engineer - 0173471

We are no longer accepting applications for this position.


How to apply:

Applications for this position must be submitted via the Georgia Tech Human Resources website.  Please visit https://gatech.taleo.net/careersection/gatech_classified/jobdetail.ftl?job=0173471.

Job Purpose:

Responsible for advanced-level design, integration, implementation and modification of a wide variety of user information systems, usually within a highly complex environment. This includes coordination of the installation, testing, operation, troubleshooting and maintenance of those systems. Provide functional and technical expertise in support of user systems operation and maintenance. May include support of labs, training facilities and general IT office software and hardware. Provide technical advice and leadership to lower-level technical staff.

Major Responsibilities:
  • Conduct needs analysis, and plan and schedule the installation of a wide variety of new or modified hardware/software.
  • Develop functional and technical IT system requirements and specifications.
  • Provide technical support by allocating system resources, managing accounts, administering passwords, and maintaining documentation, security, recoverability and access, including deploying security and operating system patches.
  • Monitor and report on system performance.
  • Support internal departmental servers.
  • Maintain hardware and software inventories including developing and updating departmental distributed software license management policy.
  • Make technical presentations to other technical personnel and user department administrators.
  • May lead project technical team; provide technical advice to lower-level technical staff.
  • Perform other related duties as assigned.
Basic Qualifications:
  • Education: Bachelor's in Computer Science or related field or equivalent combination of education and experience
  • Work Experience: Four to six years of job-related experience.
  • Certifications: N/A
  • Skills: This job requires advanced knowledge of complex concepts, practices and procedures associated with IT systems design, installation and maintenance. Expert knowledge of a variety of applications and platforms is required.
Preferred Qualifications:
  • Preferred Education: Master’s degree in Computer Science or related field, or equivalent combination of education and experience
  • Preferred Work Experience: N/A
  • Preferred Skills:
    • Advanced knowledge of, and experience in, Red Hat Enterprise Linux system administration in large-scale (100+ servers) environments
    • Experience using system provisioning and configuration management tools to manage large server environments (e.g. Red Hat Network Satellite, Puppet, CFEngine, Ansible)
    • Experience with high-performance storage and clustered filesystems (e.g. GPFS, Lustre)
    • Experience in cloud implementations (e.g. OpenStack, VMware)
    • Expertise in HPC middleware (e.g. MPICH, Moab, PBS) installation, administration, security and patching
    • Experience with high-performance computer interconnects (e.g. 10 and 40 Gigabit Ethernet, InfiniBand)
    • Fluency in one or more programming and/or scripting languages (e.g. C/C++, Fortran, Perl, shell, Python)
Additional Information:

This position shares responsibility for the functionality of the foundational hardware/software of High Performance Computing (HPC) resources. Diagnose and resolve technical problems, often of a complex nature, associated with computer hardware and software interrelationships and dependencies. Utilize a wide variety of skills in system and network monitoring; large-scale systems administration; scripting and automation; security; network distributed services; storage and backups; and hardware and software problem diagnosis and resolution. Install, configure, maintain, and support computer software and hardware, which may include network environments or software upgrades in accordance with specifications, and perform testing as required. Develop and maintain documentation, including configuration changes, as needed. The work environment involves adherence to safety precautions relative to working near electrical circuits, plus handling of objects up to 25 pounds. Occasional after-hours or weekend work and travel of 1-10 nights per year are possible.

Impact & Influence:

This position will interact on a consistent basis with: systems staff and departmental users, including management. This position typically will advise and counsel: systems staff. This position will supervise: N/A (work direction only).

Department Description:

The Office of Information Technology (OIT) provides information technology leadership and support to the Georgia Institute of Technology, working in partnership with academic and business units to meet the unique needs of a leading research university. OIT serves as the primary source of enterprise-wide information technology and telecommunications services in support of students, faculty, staff, and researchers.

The Partnership for an Advanced Computing Environment (PACE) seeks to create and maintain research computing services in support of Georgia Tech's academic and research missions. Our goal is to facilitate Institute and external partnerships, architectures, technology pilots, and leading edge infrastructures that deliver unique and sustained competitive advantage to Georgia Tech faculty, students, and staff. PACE also strives to foster a work environment that is customer centered; values diversity of all forms; encourages, enables, and rewards innovation, service excellence, teamwork, and collaboration (both within OIT and the Georgia Tech community); and effectively balances leading edge and production activities. For additional information about PACE, visit: http://www.pace.gatech.edu.