Idaho High Performance Computing Workshop 2004

Friday October 22, 12:30pm - 7:00pm
Boise State University, Micron Engineering Center, Room 106
[ Schedule - Presentations - Speakers - Logistics - Thanks ]


Welcome to the first workshop focused on high performance computing in Idaho. This workshop presents grid and cluster computing architectures and applications in use in Idaho academia, industry and government.

What is high performance computing? What's the difference between a grid and a cluster? How are these tools improving the lives of Idahoans? Come find out on October 22. The workshop is free and open to the general public. Seating is limited to 100.

The workshop will be opened by BSU's College of Engineering Dean Dr. Cheryl Schrader. Todd Tannenbaum of the Condor Project will deliver the keynote on the present and future state of High Performance Computing. The majority of the workshop will consist of presentations by Grid and Cluster operators and application developers from Idaho universities, industry and government. The workshop will conclude with a panel of experts who will field questions from the participants.


Schedule of Presentations

Time | Presentation | Length | Speaker | Position | Organization | E-Mail
12:40pm | Opening of the Workshop | 00:05 | Dr. Cheryl Schrader | Dean | BSU College of Engineering | schrader@boisestate.edu
12:45pm | Keynote: The State and Future of HPC | 00:40 | Todd Tannenbaum | Associate Researcher | Condor Project | tannenba@cs.wisc.edu
1:30pm | Top500 Supercomputing at INL | 00:25 | L. Eric Greenwade | Science Fellow | INEL | leg@inel.gov
2:00pm | U. Idaho Bioinformatics Server Farm | 00:25 | Kenneth Blair | Assistant Computer Scientist | U. of Idaho | kblair@uidaho.edu
2:30pm | Bioinformatics Computing at U. Idaho | 00:25 | Celeste Brown | Bioinformatics Coordinator | U. of Idaho | celesteb@uidaho.edu
3:00pm | Enterprise Grid Best Practices | 00:25 | Brooklin Gore | Senior Fellow | Micron | bgore@micron.com
3:30pm | Break | 00:15 | Refreshments compliments of BSU and the Micron Foundation
3:45pm | Document Image Degradation Analysis | 00:20 | Elisa Barney Smith | Associate Professor | Boise State U. | ebarneysmith@boisestate.edu
4:10pm | Boise State Beowulf Cluster | 00:25 | Amit Jain | Associate Professor | Boise State U. | amit@cs.boisestate.edu
4:40pm | Airshed Modeling | 00:20 | Rick Hardy; Wei Zhang | Model Group Lead; Scientific Programmer | Idaho DEQ | rhardy1@deq.state.id.us; wzhang@deq.state.id.us
5:05pm | Biologically Inspired Computing | 00:20 | Tim Andersen | Assistant Professor | Boise State U. | tim@cs.boisestate.edu
5:30pm | ISU Bioinformatics Apple Cluster | 00:20 | Luobin Yang | Bioinformatics Specialist | Idaho State U. | yangluob@isu.edu
6:00pm | Panel Discussion | ? | Ken Blair, Brooklin Gore, Eric Greenwade, Kevin Hess [kevin.hess@hp.com]*, Amit Jain, Paul Michaels [pm@cgiss.boisestate.edu] (abstract)*, Todd Tannenbaum
* Having not yet presented, these speakers will briefly introduce themselves and their work
 


Presentation Abstracts

Top500 Supercomputing at INL

The Idaho National Laboratory (INL) has recently added a 250-node, dual-Opteron Linux compute cluster to its High Performance Computing (HPC) resources. This system forms a logical complement to the existing HPC resources of Sun and SGI symmetric multiprocessor and Cray vector supercomputing environments. An additional 15 TB of disk space has been added and configured into a high-performance, high-availability set of filesystems usable by all HPC systems. Finally, a dedicated network has been established to house these resources, providing for the first time the ability to share the HPC environments with collaborators outside the INL. This presentation will describe the current and near-term future supercomputing environments at the INL and the new capabilities they will provide to INL researchers and collaborators.

U. Idaho Bioinformatics Server Farm

The Initiative for Bioinformatics and Evolutionary Studies (IBEST) set out two and a half years ago to build the state's Bioinformatics Core Computing Center. This presentation focuses on the design and construction of the server farm facilities and gives an overview of our Sun SMP systems and three Beowulf clusters.

Bioinformatics Computing at the U. of Idaho

The High Performance Computing facilities at the University of Idaho are used both for developing bioinformatics software and for running a wide range of commercial and free bioinformatics software. The main emphasis of our bioinformatics program is in evolutionary studies of both living organisms and artificial systems. We have a wide array of software that facilitates our evolutionary studies of living organisms, and we are developing new software to help others in their evolutionary studies. We also develop simulations of evolution in artificial systems, such as genetic algorithms and evolutionary computation.

Enterprise Grid Best Practices

Like any major project, a successful enterprise Grid implementation requires a well-defined purpose, a solid foundation, and a clear vision for a successful outcome. While Grid computing hype abounds and commercial Grid offerings proliferate, a well-planned Grid deployment with noble yet achievable goals can succeed in delivering new capabilities at reduced costs. This presentation leverages more than three years of experience at a large, global manufacturing company in building a successful, general purpose, enterprise Grid. Several attainable goals for a Grid project are discussed in addition to infrastructure and philosophical best practices. Sample applications are recommended that have broad appeal and good Grid success track records. Tips for engaging and pleasing both the CFO and CTO are shared. Finally, pitfalls and challenges are highlighted to ease your adoption of this exciting technology.

Document Image Degradation Analysis

Parts of the research into Document Image Degradation Analysis require heavy computational power to complete. Fortunately, many of the processes are easily parallelizable. This talk will describe some of the problems in Document Image Degradation Analysis that have been parallelized and implemented to run under the CONDOR Grid Computing system at Boise State University. It will also describe issues encountered while setting up the CONDOR system at Boise State University and getting the programs to run under it.

Boise State Beowulf Cluster

An overview of the 122-processor Boise State Beowulf Cluster project. Discussion of the design, setup and configuration. Summary of various projects underway on the cluster. Live demos will be included in the talk.

Airshed Modeling

The air quality challenges facing the Treasure Valley today are increasingly complex because the pollutants of concern, ozone and fine particulates, are formed largely by photochemical reactions in the atmosphere. Large, complex, Eulerian grid models are required to address these problems. Simulations are fed by numerical weather models and comprehensive emission inventories and must treat detailed transport and photochemistry mechanisms. One 5-day simulation may take 5 days of processing on one CPU and generate 100 GB of output. The Idaho Department of Environmental Quality is building a small Linux cluster using existing machines to reduce simulation times and expedite the airshed modeling effort.

Experiences Porting Fortran Software to a Beowulf Cluster

There are many ways to adapt an existing serial program to take advantage of the parallel computational capabilities of a Beowulf Cluster. This presentation focuses on lessons learned during the conversion of a serial program under the Single Program Multiple Data (SPMD) paradigm. The serial program was written in Fortran 77 and compiled with the g77 compiler, which is available under the GNU open source license (gcc-3.2.2). A priority was placed on performing the conversion under the open source model. Thus, no special compilers or commercial software were required.

While the specific problem of computing body wave dispersion in a down-hole geophysical survey turned out to be embarrassingly parallel, a number of issues were revealed which would benefit any Fortran conversion to parallel computation. In particular, we found it advantageous to employ some mixed language coding. By mixing Fortran with C language functions, we were able to perform the conversion using existing libraries. This proved valuable in instrumenting the code and manipulating directories and files on different nodes.
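For readers new to the terms, the following is a minimal sketch of what "SPMD" and "embarrassingly parallel" mean in practice: every process runs the same program, works out which share of the items belongs to it from its own rank, and never needs to talk to the others. It is not the presenter's code. It is written in Python rather than Fortran or C, it uses no message-passing library (the ranks are simulated with a loop), and the names `block_range`, `process_item` and `run_rank`, along with the input data, are hypothetical.

```python
"""SPMD-style work division for an embarrassingly parallel job (illustrative only)."""


def block_range(n_items, n_ranks, rank):
    """Return the contiguous block of item indices owned by `rank`.

    Items are split as evenly as possible; the first `n_items % n_ranks`
    ranks each take one extra item.
    """
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


def process_item(value):
    """Hypothetical stand-in for the real per-item computation."""
    return value * value


def run_rank(rank, n_ranks, items):
    """What one copy of the program does: handle only its own block."""
    return {i: process_item(items[i]) for i in block_range(len(items), n_ranks, rank)}


if __name__ == "__main__":
    items = [float(i) for i in range(10)]  # made-up input data
    n_ranks = 4                            # made-up number of processes
    for rank in range(n_ranks):
        # On a cluster, each iteration would be a separate process on some node.
        print(rank, sorted(run_rank(rank, n_ranks, items)))
```

On a real cluster the rank and the number of processes would typically come from the job launcher or a message-passing library instead of a loop, and each process would write its own output for later collection.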

Biologically Inspired Computing

Tim Andersen will detail his collaboration with Crowley Davis Research, Inc. One of the goals of Crowley Davis Research is to be able to model tissue properties using biologically inspired computational approaches. Thus, our approach to tissue modeling incorporates biologically derived concepts into a computational framework. The general strategy is: 1) to construct a basic set of building blocks using biologically derived primitives that then give rise to higher order properties; and 2) to evolve a synthetic genome capable of producing a target object using the biologically derived primitives. As an initial step in developing this system we have focused on the ability to produce different 3-dimensional shapes, since shape is an important property of many different types of tissue. In simulation, in addition to producing various shapes with our system, by using biologically inspired principles we have captured one of the most basic emergent features of living organisms: the capacity for self-repair. A relatively simple encoded object, a 64-cell cube, can repair itself both during development (while the cube is being built) and after its phenotype has become fully formed and stabilized in an apparently static state. Once wounded, the cube reveals that the capacity for self-repair remains a latent property, and the cube can recover fully from substantial damage (loss of ~30% of its cells). Our self-repairing cube demonstrates that it is possible to produce simulated biological function embedded within the corresponding simulated form, even though the encoding scheme does not include any specific instructions for maintenance of form. Instead, this capacity derives from the encoding rule set used to generate the cube. We are extending these results to generate more complex self-repairing shapes and to explore other possible embedded properties such as mechanical deformability, excitability, and contraction.

ISU Bioinformatics Computing Using Apple Cluster

An introduction to the setup, configuration, and management of the Apple Workgroup Cluster at ISU, which consists of 10 dual-processor Apple Xserve nodes, each running Mac OS X Server. A description of the system software and bioinformatics applications installed on the cluster. An overview of how these bioinformatics applications are accessed and of the major users of the cluster. Our future plans for extending the current cluster to provide more computing power and to add more bioinformatics applications.

Panel discussion

Here's your chance as an attendee to find out from the experts how to build your own grid or cluster or develop your own high performance computing application. Each panelist will briefly introduce himself or herself and his or her HPC infrastructure and/or application. Some questions for discussion: Should I build or buy? Can I just use yours? How hard is it to modify an application to run on a grid or cluster? Where can I get help?


Speaker Biographies

Todd Tannenbaum

Todd Tannenbaum is an Associate Researcher in the Department of Computer Sciences at the University of Wisconsin-Madison (UW-Madison) and first became involved with the Condor Project in 1993. On the Condor team, Todd's responsibilities as Technical Lead include the day-to-day management of the full-time development staff, and he is also heavily involved in the design and implementation details of future Condor releases. His contributions include the low-level libraries which provide Condor's communication framework, process management, and portability layers. Prior to his involvement with the Condor Project, Todd served as the Director of the Model Advanced Facility, an advanced visualization and high-performance computing center housed in the UW-Madison College of Engineering. Todd has also served as Technology Editor for Network Computing magazine, and as an officer of Coffee Computing Corp., a software development consulting company. In addition to invited speaking engagements and over 10 research publications on various aspects of Condor technology, Todd is a contributing author on books relating to cluster and grid computing, and has published over 25 articles in several of the USA's mainstream software development and administration publications such as Dr. Dobb's Journal, Network Computing, and InformationWeek. He received a B.S. in Computer Science from UW-Madison.

L. Eric Greenwade

L. Eric Greenwade is the Chief Information Technology Architect and a Laboratory Fellow at the US Department of Energy's Idaho National Laboratory (INL). He is presently the INL's Group Leader for High Performance Computing and Visualization, with his group providing support to approximately 3000 scientists and engineers. Supported activities include nuclear engineering, national security, subsurface science, computational fluid dynamics, computational chemistry and physics, bioinformatics, structural engineering and a variety of other disciplines. His degrees are in mathematics (numerical algorithms) with an undergraduate emphasis in physics and chemistry. He currently holds eleven elected and appointed leadership positions in national and international technical societies and is a member of the US national delegation to ISO for the next generation of the MPEG and JPEG standards definition and development. He has designed and overseen the construction of several high performance computing and large-scale visualization environments. Current research interests include effective information processing via hierarchical data representations and systems that facilitate geographically distributed computation, visualization and analysis.

Ken D. Blair

Ken D. Blair is the Assistant Computer Scientist and Unix Systems Administrator for the University of Idaho's Bioinformatics Core Server Farm. Ken's job is to provide high reliability and availability for the Initiative for Bioinformatics and Evolutionary Studies (IBEST) computing hardware and software. This includes day-to-day operations, backups, systems and application software maintenance, account management, and security monitoring. Ken has over 24 years of experience in Management of Information Systems. He has spent the last 12 years dealing primarily in educational systems design for Geophysics, Engineering, and Computer Science disciplines at both BSU and UI.

Dr. Celeste Brown

Dr. Celeste Brown is the Bioinformatics Coordinator at the University of Idaho. Her task is to facilitate the use of bioinformatics tools, both software and hardware, at the University of Idaho and across the state of Idaho. Dr. Brown's early graduate career was in population genetics, and her PhD dissertation was on the molecular evolution of the amylase gene region of Drosophila pseudoobscura, a fruit fly native to western America. Her postdoctoral work was in gene expression (the transcriptional regulation of human alcohol dehydrogenase genes and the response of gene expression to selection in yeast genes) with forays into mouse and salmon mitochondrial DNA evolution. She was a research scientist at Washington State University studying the sequence and evolutionary properties of intrinsically disordered protein (i.e., protein that does not fold into a fixed 3D structure), and she is continuing this research at the University of Idaho.

Brooklin Gore

Brooklin Gore has been researching and implementing enterprise Grid technologies for over three years to create Micron's global Grid infrastructure, which runs over 15 production applications today. Brooklin has been with Micron for over 16 years. In that time he has served as a product engineer, Computer Aided Design group manager, network manager and general manager of Micron's Internet Services Division. Brooklin has been issued several United States patents and is a Senior Member of the IEEE. He holds Bachelor of Science degrees in computer science and electrical engineering from the University of Idaho and a Master of Science in computer science from the National Technological University.

Elisa Barney Smith

Elisa Barney Smith is an associate professor in the Electrical & Computer Engineering department at Boise State University. She received a B.S. in Computer Science and the M.S. and Ph.D. degrees in Electrical, Computer and Systems Engineering, all from Rensselaer Polytechnic Institute, Troy, NY. Elisa's main research focus is document imaging. Current projects center on developing models of the degradations produced by optical page scanners, printers, FAX machines and photocopiers. These models will be used to improve the recognition and processing of the images. Elisa is the principal investigator on an NSF grant "CAREER: Document Image Degradation Analysis." She was also the PI on the project to bring CONDOR Grid Computing to Boise State.

Amit Jain

Amit Jain is the co-principal investigator for the Boise State Beowulf Cluster, sponsored by a National Science Foundation Major Research Infrastructure grant. He has also designed, built and used several other clusters. He got his start in parallel computing at Fermilab working on the Advanced Computing Project, one of the first production clusters. He also has extensive experience with the BBN GP1000 (shared memory system) and the MasPar MP-1 (SIMD system). He is currently an Associate Professor of Computer Science in the College of Engineering at Boise State University. His research areas are parallel application software, parallel systems software and operating systems. He received a BTech in Computer Science and Engineering from the Indian Institute of Technology, New Delhi and a PhD in Computer Science from the School of Computer Science at the University of Central Florida.

Rick Hardy

Rick Hardy leads the Modeling Group in the Idaho Department of Environmental Quality's Technical Services Division. This group conducts numerical groundwater modeling, contaminant fate/transport and risk modeling and analysis, and atmospheric modeling. His primary focus is building a capability for photochemical modeling of the Treasure Valley airshed to address the problems of ozone and fine particulate formation. Mr. Hardy is a registered professional engineer with 25 years of atmospheric modeling and photochemistry experience. He received a B.S. in Chemical Engineering from Washington University in St. Louis, and an M.S. in Engineering from Washington State University in Pullman.

Wei Zhang

Wei Zhang is a Scientific Programmer/Modeler with the Idaho Department of Environmental Quality's Modeling Group. He received a B.S. in Environmental Science from Wuhan University and an M.S. in Computer Science from Boise State University. He administers DEQ's Linux network for scientific applications and is implementing a Beowulf cluster to support the airshed modeling project. He recently achieved an important milestone by successfully converting 40 GB of data files and Perl, Fortran and C programs from one airshed modeling system to another.

Tim Andersen

Dr. Andersen has significant industry experience in the field of document recognition. Dr. Andersen helped to develop an artificial neural network (ANN)-based OCR system as well as newspaper document segmentation algorithms, local adaptive thresholding (LAT) routines (a modification of Niblack's LAT algorithm that uses a secondary adaptive threshold to determine window size), and proprietary noise and scratch removal routines. In September 2001, Dr. Andersen joined the faculty in the Computer Science Department at Boise State University. Dr. Andersen also holds a Senior Scientist position with Crowley Davis Research, Inc. (CDR), which is currently engaged in research and development of advanced rule-based systems. Dr. Andersen's areas of expertise include Neural Networks, Machine Learning, Genetic Algorithms, Pattern Recognition, Artificial Intelligence, and Computational Complexity. Dr. Andersen is actively pursuing research topics in the areas of: 1) biologically inspired methods of computation; 2) the use of ANNs for image segmentation, image region identification and OCR; 3) ANN architecture selection, in particular optimal ANN architecture selection for voting techniques such as bagging and boosting, where his research studies the ability of ANNs with different architectures to learn and generalize using teacher networks of varying complexity and training set sizes; and 4) ANN training algorithms.

Paul Michaels

Paul Michaels is an associate professor teaching engineering geophysics at Boise State University. He is a registered professional engineer in Idaho, and a graduate of Michigan Technological University (BS) and University of Utah (MS and Ph.D.). He is the principal investigator on the NSF-funded Beowulf Cluster at Boise State, and is the author of Basic Seismic Utilities, an open source seismic processing software package written in both Fortran and the C language. His research interests include developing mathematical and computer models which link the dynamic properties of soils to soil permeability and the presence of ground water. He is also interested in the application of geophysical methods to problems in civil engineering.

Luobin Yang

Luobin Yang is a Bioinformatics Specialist at Idaho State University. He received his first M.S. in Computer Science from Marquette University, Wisconsin, and his second M.S. in Bioinformatics from Marquette University and the Medical College of Wisconsin. He writes bioinformatics applications in several programming languages. He is a Sun Certified Java Developer.

Kevin Hess

Kevin C. Hess is the Global Computing Environment Manager for HP's Embedded LaserJet Systems lab. Kevin provides computing systems and services for more than 300 design and development engineers in Boise, and for hundreds of developers at 12 other sites involved in product development around the world. Kevin has worked at HP for 21 years, having designed disk drives, LAN cards and printer subsystems. He accepted a management position six years ago to provide computer systems and support for other HP developers. With more than 20 people reporting to him with a variety of skill levels and disciplines, he is responsible for ensuring developers have systems to complete their work every day they report to work at HP. Kevin has an A.S. in Electronics and a B.S. in Computer Science, and has done postgraduate work in Computer Science.


Workshop Logistics and other Information

Getting to the Workshop

The workshop will be held in Room 106 of the Micron Engineering Center on the BSU campus at 1375 University Dr. (west from cross street Broadway).

Parking

You are on your own for parking, and some areas may require a fee. There is good parking behind the SUB, which is just across the street from the Engineering Center. Be careful where you park so you don't get a ticket.

Speakers: You should receive a parking pass. If you have not received one before the workshop, contact Amit.

Cost

There is none. The workshop is free.


Thank You!

Organizers

Many thanks to Amit Jain, John Griffin, Scott Jeide, Angus McDonald and Brooklin Gore for having the vision and inspiration to organize the workshop.

Sponsors

Special thanks to the Boise Section IEEE, Boise State University College of Engineering and Micron Foundation for their time and money in helping promote and conduct the workshop.
