Computational biology and bioinformatics make extensive use of high performance computers to study biological systems and processes, in particular the sequence, structure, function and evolution of biological molecules. Computer-based statistical and machine learning techniques are used to make sense of the huge amounts of data obtained from genome sequencing, DNA microarray chips, two-hybrid experiments and tandem mass spectrometry. Efficient computation is central to the study of the primary (sequence), secondary (folding) and tertiary (3-dimensional) structures of DNA, RNA and proteins. Functional and structural genomics initiatives generate huge quantities of sequence and expression data for plants, animals and microbes, which creates challenges in data storage, retrieval and analysis. Studies of genome evolution and of macromolecular structure and function use high performance computing to understand the patterns and processes of change over time and to discern the structural and functional meaning of DNA and protein sequences.
Massively parallel computations have been used in many areas:
The study of atomic and molecular dynamics. Even for simple diatomic molecules, the quantum mechanics of interactions and scattering involving vibrational and rotational states requires parallel computers. For complicated molecules such as proteins, large-scale computation is an indispensable tool. Such studies of proteins are central to the understanding of, e.g., drugs and antibiotics, which in turn plays an increasing role in biology, human health, agriculture and environmental sciences. Computations are also needed to apply semi-empirical theories, e.g., to determine parameters for the electronic structures of molecules or clusters of molecules in the gas phase.
Studies of reactions in the liquid phase. Such reactions can involve solvents whose behaviour is affected by temperature and pressure, as in, e.g., Car-Parrinello dynamics. Other computationally intensive studies in this area include liquid systems with molecular order, such as the liquid crystals used in display devices.
Studies of solid and fully ordered systems. The study of the function and design of catalysts is important for industrial applications. Simulations of the binding of pollutants on the surfaces of solids are important for environmental sciences. These calculations intrinsically involve a large number of atoms and call for parallel computation.
Computational chemistry makes extensive use of established software packages, both commercial and academic. Popular ones include VASP, AMBER, NAMD, CPMD, CRYSTAL, GAMESS, Jaguar, MOISS, TURBOMOLE, GULP, MPQC, CADPAC, WIEN and Gaussian.
The description of the physical world is by and large based on partial differential equations (PDEs). Many of them are intrinsically nonlinear and, for realistic physical systems, often beyond analytic treatment. Advances in high performance computing let us go beyond perturbation theory and other approximate schemes, and open new frontiers in physics research across a whole range of disciplines, from basic theoretical studies such as the dynamics of space and time to applied areas such as fluid dynamics. The former involves solving the Einstein equations, and one has to resort to massively parallel computation even for highly idealized situations in astrophysics, e.g., the collision of two black holes. The latter requires solving the Navier-Stokes and related equations, which again calls for numerical treatment in realistic situations involving three spatial dimensions.
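As a rough illustration of what numerical treatment of a PDE looks like, the sketch below advances the one-dimensional diffusion equation u_t = alpha * u_xx with an explicit finite-difference scheme. It is far simpler than the Einstein or Navier-Stokes equations and is only meant to show the structure of such a solver. The function name `diffuse` and the parameter `r` (standing for alpha*dt/dx**2) are assumptions made for this example.

```python
def diffuse(u, r, steps):
    """Advance u_t = alpha * u_xx on a uniform grid with an explicit scheme.

    r is assumed to equal alpha * dt / dx**2; the scheme is stable for r <= 0.5.
    The two end values are held fixed (Dirichlet boundary conditions).
    """
    u = list(u)
    for _ in range(steps):
        new = u[:]  # copy so every update reads values from the previous step
        for i in range(1, len(u) - 1):
            # Each interior point is updated from its two nearest neighbours only.
            new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
        u = new
    return u


if __name__ == "__main__":
    # A single spike of heat spreading out over a few time steps.
    print(diffuse([0.0, 0.0, 1.0, 0.0, 0.0], r=0.25, steps=3))
```

Because each grid point depends only on its immediate neighbours, the grid can be split into blocks that are updated on different processors, with only the block edges exchanged at each step. This is one common reason such solvers map well onto parallel machines.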
Arguably the second most important class of descriptions in physics is the statistical description of systems with randomness, fluctuations or many bodies. While such systems can also be described by differential or stochastic differential equations, the preferred method of treatment is often Monte Carlo simulation, which can readily take advantage of high performance computing. The range of applicability again spans from fundamental studies such as the lattice gauge theory of quarks and gluons to applied areas such as the calculation of band structure in condensed matter physics. Here too, the availability of massively parallel computers can extend the scope of research, bring calculations to a level of accuracy not previously possible, and produce results on a much shorter timescale.
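To suggest why Monte Carlo methods parallelize so readily, here is a minimal sketch that estimates pi by random sampling. It is a toy problem, not a lattice gauge or band structure calculation, and the names `count_hits` and `estimate_pi` are hypothetical. The "workers" are run one after another here; the assumption is that on a real cluster each one would run on its own processor.

```python
import random


def count_hits(n_samples, seed):
    """Count random points in the unit square that land inside the quarter circle."""
    rng = random.Random(seed)  # each worker gets its own independently seeded generator
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(n_workers, samples_per_worker):
    """Combine the hit counts of independent workers into one estimate of pi."""
    # The calls below share no state, so they could be farmed out to separate
    # nodes; only the final integer counts need to be gathered.
    total_hits = sum(count_hits(samples_per_worker, seed) for seed in range(n_workers))
    return 4.0 * total_hits / (n_workers * samples_per_worker)


if __name__ == "__main__":
    print(estimate_pi(n_workers=4, samples_per_worker=50_000))
```

The statistical error of such an estimate shrinks roughly as one over the square root of the total number of samples, so adding processors improves accuracy for the same wall-clock time.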
The study of geodynamics involves solving time-dependent, nonlinear partial differential equations, so high performance computing is an indispensable tool. Other aspects of earth science that can benefit from advanced computer technologies include, e.g., the study of the Earth's interior structure using digital waveforms of seismic waves from earthquakes, recorded by global networks of seismometers. Three-dimensional imaging of the Earth requires the collection, storage, retrieval and processing of large amounts of data, and solving the inverse problem of determining seismic structure from seismograms is itself computationally intensive.
Many areas of environmental science involve large data volumes. One example is the analysis of remote sensing data, which integrates the scene, atmosphere, sensor and detector with object databases to perform multi-component end-member analysis or principal component analysis on hyperspectral environmental data. Such work needs massively parallel computers with large RAM and fast network access.
Advances in laboratory processes have led to dramatic growth in genomics data in recent years. To accelerate drug discovery, pharmaceutical and biotechnology institutes face the challenge of effectively applying bioinformatics, computing and mathematical sciences to the analysis of biological data, which involves huge data volumes and consequently complex in silico simulations and testing.
To yield accurate results for time-critical operations, parallel and distributed computing have been used in a wide range of bioinformatics techniques, including biological sequence alignment, gene finding and prediction, phylogenetic analysis, protein folding, genetic networks, gene linkage analysis and DNA microarray design.
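As one concrete example from this list, biological sequence alignment is commonly built on dynamic programming. The sketch below computes a global alignment score in the style of the Needleman-Wunsch algorithm. The function name `alignment_score` and the scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions, not parameters taken from any particular tool.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score of sequences a and b.

    Needleman-Wunsch style dynamic programming with a linear gap penalty.
    Only two rows of the score table are kept in memory at a time.
    """
    # Row 0: aligning an empty prefix of a against each prefix of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(alignment_score("GATTACA", "GATTACA"))
    print(alignment_score("ACGT", "AGT"))
```

The cost grows with the product of the two sequence lengths. Two sources of parallelism are commonly exploited: cells on the same anti-diagonal of the score table do not depend on one another, and comparing one query against many database sequences consists of independent tasks.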
ClusterTech Products & Solutions
Our consulting services span the whole deployment cycle. We offer advice on the purchase of hardware and system-level software at the planning stage, code porting at the pre-deployment stage, and batch queue tuning at the production stage.