Enjoy competitive salaries and exceptional benefits by joining the rapidly growing PACE team at Georgia Tech.
Located in the heart of vibrant Atlanta, Georgia, the Partnership for an Advanced Computing Environment (PACE) team defines and manages centralized research computing services at the Georgia Institute of Technology. We facilitate research efforts and foster strategic partnerships to provide Georgia Tech researchers with an unrivaled advantage, empowering them to lead their disciplines for the advancement of science and society across the globe. In addition to high-performance computing, we incorporate emerging cloud, network, data analytics, and storage technologies. Our fast-growing team directly supports faculty, students, and other researchers, resulting in a dynamic environment filled with intellectual stimulation and continuous innovation.
This position shares responsibility for the functionality of the foundational hardware and software of High Performance Computing (HPC) resources. Diagnose and resolve technical problems, often of a complex nature, associated with computer hardware and software interrelationships and dependencies. Utilize a wide variety of skills in system and network monitoring; large-scale systems administration; scripting and automation; security; network distributed services; storage and backups; and hardware and software problem diagnosis and resolution. Install, configure, maintain, and support computer software and hardware, which may include network environments or software upgrades in accordance with specifications, and perform testing as required. Develop and maintain documentation, including configuration changes, as needed. The work environment involves adherence to safety precautions relative to working near electrical circuits, plus handling of objects up to 25 pounds. Occasional after-hours or weekend work and travel of 1-10 nights per year are possible.
Responsible for advanced-level design, integration, implementation, and modification of a wide variety of user information systems, usually within a highly complex environment. This includes coordination of the installation, testing, operation, troubleshooting, and maintenance of those systems. Provide functional and technical expertise in support of user systems operation and maintenance. May include support of labs, training facilities, and general IT office software and hardware.
- Education: Bachelor's degree in Computer Science or a related field, or an equivalent combination of education and experience.
- Work Experience: Four to six years of job-related experience.
- Skills: This job requires advanced knowledge of complex concepts, practices, and procedures associated with IT systems design, installation, and maintenance. Expert knowledge in a variety of applications and platforms is required.
Additional knowledge, skills, and experience that are desirable, but NOT required:
- Education: Master’s degree in Computer Science or a related field, or an equivalent combination of education and experience
- Advanced knowledge of, and experience in, Red Hat Enterprise Linux system administration in large-scale (100+ servers) environments
- Experience using system provisioning and configuration management tools to manage large server environments (e.g. Red Hat Network Satellite, Puppet, CFEngine, Salt)
- Experience with high-performance storage and clustered filesystems (e.g. GPFS, Lustre)
- Experience in cloud implementations (e.g. OpenStack, VMware)
- Expertise in HPC middleware (e.g. MPICH, Moab, PBS) installation, administration, security, and patching
- Experience with high-performance computer interconnects (e.g. 10 and 40 Gigabit Ethernet, InfiniBand)
- Fluency in one or more programming and/or scripting languages (e.g. C/C++, Fortran, Perl, shell, Python)
For the formal job description and to apply, please visit the following website: https://gatech.taleo.net/careersection/gatech_classified/jobdetail.ftl?job=0177497