SLAC National Accelerator Laboratory is one of 17 Department of Energy (DOE) National Laboratories and is operated by Stanford University on behalf of the DOE. SLAC develops and operates some of the world’s premier science facilities, including the first hard X-ray free-electron laser. Research at SLAC explores the structure and function of matter and the properties of energy, space and time, at the smallest and largest scales, all with the goal of solving problems facing society and advancing human knowledge.

 

 


High Performance Computing Architect - UNIX Clusters

Job Requisition #: 3088
Classification Title: Information Systems Specialist
Grade: M
Location: Menlo Park, CA (HQ)
# of openings: 1

Description

POSITION OVERVIEW

The Computing Division at SLAC is dedicated to providing world-class technology and IT services to the lab. Our Computing Division's Unix High Performance Computing and Storage team supports the lab’s science mission, including systems for scientific analysis, numerically intensive computing and data management. As a member of the High Performance Computing (HPC) and Storage team, you will conceive, design, develop, optimize, integrate, and maintain HPC systems, and lead the technical operation and continued development of HPC and Storage services. Specifically, you will: architect and optimize highly complex UNIX clusters and related components for multiuser production analysis environments; develop and manage performance metrics and analytics; automate central management; provide end-user technical support and troubleshooting; research and develop tools and services to enable scientific workflows; integrate emerging technologies into production environments; and document procedures and recommendations.

JOB PURPOSE:

Apply unique skill combinations to create IT solutions for complex problems. Work may involve information theory, computing theory, mathematical simulations, or scientific computing.

CORE DUTIES*:

  • Act as the conceptual source for assignments involving more than one area of specialization and/or innovative system design.
  • Plan and coordinate IT efforts that depend heavily on your individual, unique technical contributions.
  • Conceive, design, develop, optimize, integrate, and maintain information technology at a complex level.
  • Troubleshoot highly complex problems for which the analysis and resolution require extensive knowledge of many diverse system components.
  • Develop long range technology plans.
  • Provide project management, coordination and programming for IT projects having significant impact.
  • Provide leadership and IT solutions for complex problems.   
  • Identify applicable new technologies through research, collaboration with peers, and participation in standards organizations, industry groups, panels, etc.
  • May work on University-wide task forces and committees related to strategic planning efforts for information technologies.

* Other duties may also be assigned.

MINIMUM REQUIREMENTS:

Education & Experience:

Bachelor's degree in computer science or a related field and ten years of increasingly technical work experience, including:

  • Building relationships in a cutting-edge research community and gathering requirements
  • Managing HPC services in a multiuser environment
  • Translating requirements into cluster designs and HPC service features
  • Troubleshooting distributed computing workflows
  • Change management and communications, including service interruptions/outages
  • Contributing to an HPC Community-of-Practice or technical working group as a Subject Matter Expert
  • Managing resource contention and utilization in multiuser environments

Knowledge, Skills and Abilities:

  • Expert knowledge of Unix systems for cluster-based distributed computing.
  • Skilled in large-scale automation using common programming languages and configuration management tools.
  • Extensive knowledge of benchmarking, performance monitoring and analytics reporting.
  • Skilled in tuning compute and storage systems to alleviate bottlenecks.
  • Knowledge of architecture and interrelationships (technical and functional).
  • Ability to combine information technologies to create solutions for complex problems.
  • Ability to work effectively in a team environment and lead cross-functional teams.

A background or experience in one or more of the following is strongly desired:

  • Unix configuration management: Chef, Puppet
  • Batch cluster resource management: LSF, SLURM, Condor, etc.
  • Virtualization, cloud and containerized workflows: OpenStack, Kubernetes, Jupyter
  • Networking with low-latency interconnects for parallel processing: InfiniBand, Omni-Path
  • Open-source monitoring tools and frameworks: Monit, Nagios, Ganglia, Elasticsearch

Certifications and Licenses:

None

SLAC Employee Competencies:

  • Effective Decisions:  Uses job knowledge and solid judgment to make quality decisions in a timely manner.
  • Self-Development:  Pursues a variety of venues and opportunities to continue learning and developing.
  • Dependability:  Can be counted on to deliver results with a sense of personal responsibility for expected outcomes.
  • Initiative:  Pursues work and interactions proactively with optimism, positive energy, and motivation to move things forward.
  • Adaptability:  Flexes as needed when change occurs and maintains an open outlook while adjusting to and accommodating changes.
  • Communication:  Ensures effective information flow to various audiences and creates and delivers clear, appropriate written, spoken, and presented messages.
  • Relationships:  Builds relationships to foster trust, collaboration, and a positive climate to achieve common goals.

PHYSICAL REQUIREMENTS*:

  • Constantly sit and perform desk-based computer tasks.
  • Occasionally stand/walk, use a telephone, write by hand, grasp lightly/fine manipulation.
  • Rarely lift/carry/push/pull objects that weigh up to 10 pounds.

* Consistent with its obligations under the law, the University will provide reasonable accommodation to any employee with a disability who requires accommodation to perform the essential functions of his or her job.

WORKING CONDITIONS:

May work extended hours, evenings and weekends.

WORK STANDARDS:

  • Interpersonal Skills: Demonstrates the ability to work well with Stanford colleagues and clients and with external organizations.
  • Promote Culture of Safety: Demonstrates commitment to personal responsibility and value for environment, safety and security; communicates related concerns; uses and promotes safe behaviors based on training and lessons learned. Meets the applicable roles and responsibilities as described in the ESH Manual, Chapter 1—General Policy and Responsibilities: http://www-group.slac.stanford.edu/esh/eshmanual/pdfs/ESHch01.pdf
  • Subject to and expected to comply with all applicable University policies and procedures, including but not limited to the personnel policies and other policies found in the University's Administrative Guide, http://adminguide.stanford.edu.

  


SLAC National Accelerator Laboratory is an Affirmative Action / Equal Opportunity Employer and supports diversity in the workplace. All employment decisions are made without regard to race, color, religion, sex, national origin, age, disability, veteran status, marital or family status, sexual orientation, gender identity, or genetic information. All staff at SLAC National Accelerator Laboratory must be able to demonstrate the legal right to work in the United States. SLAC is an E-Verify employer.

 

Final candidates are subject to background checks prior to commencement of employment at the SLAC National Accelerator Laboratory.

Internal candidates who are selected for hire may require degree verification and/or credit checks based on the requirements of the new position.

 

For Clery Act information, see: http://www.stanford.edu/group/SUDPS/safety-report/security-authorities.shtml



