Folding@home

Introduction

The science of Folding@home

Proteins are long chains of amino acids, and they are essential to life. As enzymes, they drive all biochemical reactions. As structural components, they make up most of our bones, muscles, hair, skin and blood vessels. As antibodies, they recognize invading objects and trigger the immune system to eliminate them. This is why scientists sequenced the human genome, the blueprint for the body's proteins. But how can we understand what these proteins do, and how they work?

Relation to the Human Genome Project

Because proteins play such a fundamental role in biology, scientists set out to sequence the human genome. The genome is in effect a "blueprint" for proteins: it contains the genetic code (DNA code) that determines the sequence of amino acids strung together into long protein chains.

The reason for folding

However, knowing the genome sequence alone does not tell us what a protein does, let alone how it works. To perform its function (for example as an enzyme or an antibody), a protein must take on a very specific shape, known as its "fold". Proteins are like remarkable machines: before they go to work, they assemble themselves. This self-assembly is called "folding".

One of the goals of our project is to simulate protein folding, in order to understand how proteins fold so quickly and reliably, and to learn how to use these properties to make polymers.

Protein folding and related diseases: Mad Cow Disease, Alzheimer's Disease

What happens if a protein does not fold correctly? Alzheimer's disease, cystic fibrosis, mad cow disease (BSE), an inherited form of emphysema, and even many cancers are believed to result from abnormal protein folding.

When proteins fold abnormally, they may clump together ("aggregate"). These aggregates can accumulate in the brain, where they are now commonly thought to cause Alzheimer's disease and mad cow disease.

Protein folding and nanotechnology: building nanoscale instruments!

In addition to biomedical applications, understanding protein folding will also teach us how to design our own protein-sized "nano-instruments" to perform similar tasks. Of course, these nano-instruments must also be assembled before they can perform any task.

Protein folding

The most surprising thing is not only that proteins assemble themselves by folding, but that they do so remarkably fast: some proteins fold completely within a millionth of a second. Although this is very fast on a human timescale, it is a very long time to simulate on a computer. In fact, it takes a computer about a day to simulate 1 nanosecond (1/1,000,000,000 of a second). Unfortunately, proteins fold on a timescale of tens of microseconds (10,000 nanoseconds). Simulating a single folding event would therefore take about 10,000 days on one computer, or nearly 30 years. That is far too long to wait for one result.
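
The arithmetic behind this estimate can be checked with a short sketch. The two inputs are the rough figures quoted above (about one day of computing per simulated nanosecond, and a folding time of 10,000 nanoseconds); the function and constant names are illustrative only.

```python
# Back-of-envelope cost of simulating one folding event on one computer.
# Both inputs are the rough figures quoted in the text, not measurements.
DAYS_PER_SIMULATED_NS = 1.0   # about one day of computing per nanosecond
FOLDING_TIME_NS = 10_000      # folding time expressed in nanoseconds


def cpu_days_to_fold(folding_time_ns, days_per_ns=DAYS_PER_SIMULATED_NS):
    """Return the single-computer cost, in days, of simulating the time span."""
    return folding_time_ns * days_per_ns


cpu_days = cpu_days_to_fold(FOLDING_TIME_NS)
cpu_years = cpu_days / 365.25

print(f"{cpu_days:,.0f} CPU-days, about {cpu_years:.1f} CPU-years")
```

With these inputs the result is 10,000 days on one computer, a little over 27 years, which is the order of magnitude that makes a single machine impractical.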

A solution: distributed dynamics

To solve the protein folding problem, we need to break the microsecond barrier. Our group has developed a new way to simulate protein folding, "distributed dynamics", which splits the work into many parts and simulates them on many processors at once. With 1,000 processors, we can break the microsecond barrier and help unravel the mystery of how proteins fold.
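
As a rough illustration of why about 1,000 processors is the relevant scale, the sketch below estimates wall-clock time under the assumption that the work divides evenly across processors with no overhead. That perfect-scaling assumption is made here for illustration and is not a description of the project's actual method; the per-nanosecond cost is the rough figure of one day quoted earlier, and the function name is hypothetical.

```python
# Idealized wall-clock estimate. Assumes the simulation divides evenly across
# processors with no communication or scheduling overhead (an assumption made
# for illustration only; real distributed runs do not scale perfectly).
DAYS_PER_SIMULATED_NS = 1.0   # rough single-processor figure quoted in the text


def ideal_wall_clock_days(simulated_ns, processors,
                          days_per_ns=DAYS_PER_SIMULATED_NS):
    """Return elapsed days to simulate `simulated_ns` on `processors` machines."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return simulated_ns * days_per_ns / processors


ONE_MICROSECOND_NS = 1_000
for n in (1, 1_000):
    days = ideal_wall_clock_days(ONE_MICROSECOND_NS, n)
    print(f"{n:>5} processors: {days:g} days per simulated microsecond")
```

Under this idealization, one simulated microsecond costs 1,000 days on a single processor but about one day on 1,000 processors.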

What have we done so far? What will we do?

Folding@home 1.0 was a success. Between October 2000 and October 2001, we used our experimental method to fold several small, fast-folding proteins. We are now developing the method further and extending it to more complex and interesting proteins, and to the problem of normal versus abnormal protein folding. You can learn more from our results page.

How it works

Folding@home does not rely on powerful supercomputers for its calculations. Instead, the work is done mainly by thousands of personal computers. Each participating computer runs a client program in the background, which uses the central processing unit for simulation work when the system is not busy. Most personal computers rarely use their full computing power under normal circumstances, and Folding@home puts this otherwise wasted capacity to use.

The Folding@home client periodically connects to the server at Stanford University to obtain "work units", data packets containing the input for a simulation, and performs calculations on that data. Once a work unit has been computed, the result is sent back to the server.
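
The fetch, compute and return cycle can be sketched as follows. This is a minimal illustration, not the actual client or its protocol: the server is an in-memory stand-in with no networking, the "simulation" is a placeholder sum, and every name (WorkUnit, InMemoryServer, fetch_work_unit, submit_result, run_client) is hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkUnit:
    unit_id: int
    data: list  # stand-in for the simulation input carried by a real work unit


class InMemoryServer:
    """Hypothetical stand-in for the project server; no network is involved."""

    def __init__(self, units):
        self._pending = deque(units)
        self.results = {}

    def fetch_work_unit(self):
        # Hand out the next pending unit, or None when the queue is empty.
        return self._pending.popleft() if self._pending else None

    def submit_result(self, unit_id, result):
        self.results[unit_id] = result


def compute(unit):
    # Placeholder calculation; a real client would run a molecular simulation.
    return sum(unit.data)


def run_client(server):
    """Fetch, compute, and return work units until the server has none left."""
    completed = 0
    while (unit := server.fetch_work_unit()) is not None:
        server.submit_result(unit.unit_id, compute(unit))
        completed += 1
    return completed


server = InMemoryServer([WorkUnit(1, [1.0, 2.0]), WorkUnit(2, [3.0, 4.5])])
print(run_client(server), server.results)
```

Because each work unit is self-contained, many clients can run this loop independently, which is what lets ordinary personal computers contribute without coordinating with one another.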

Analysis software

The Folding@home client performs its calculations with modified versions of four molecular simulation programs: TINKER, GROMACS, AMBER, and CPMD. Where their licenses permit, these programs are optimized to speed up the calculations. Each of the four has also been adapted into several variants for use on different platforms. Each variant is identified by a serial number of the form "Corexx".

Active cores

GROMACS

Gromacs (Core78)

Available for all single-processor platforms.

DGromacs (Core79)

Double-precision version of Gromacs; uses SSE2 only.

Available for all single-processor platforms.

DGromacsB (Core7b)

Nominally an updated version of DGromacs, but actually a new core based on the source code of the SMP/GPU version. Both are in use.

Double-precision version of Gromacs; uses SSE2 only.

Available for all single-processor platforms.

DGromacsC (Core7c)

Double-precision version of Gromacs; uses SSE2 only.

Only available for single-processor Windows and Linux platforms.

GBGromacs (Core7a)

GromacsSREM (Core80)

GroSimT (Core81)

Gromacs33 (Corea0)

Gro-SMP (Corea1)

GroCVS (Corea2)

GroGPU2 (Core11 and Core13)

Gro-PS3 (no number; also called the SCEARD core)

AMBER

PMD (Core82)

Not optimized

Only available for single-processor Windows and Linux platforms.

Discontinued cores

TINKER

Tinker core (Core65)

Discontinued; replaced by the faster, similar GBGromacs core (Core7a).

Not optimized

Available for all single-processor platforms.

GROMACS

GroGPU (Core10)

CPMD

QMD (Core96)

Interface version

The Folding@home console version is the command-line interface version of the Folding@home client. Folding@home is hosted by the Pande Group of the Department of Chemistry at Stanford University and was officially launched on October 1, 2000. It can accurately simulate the processes of protein folding and misfolding, in order to better understand the causes and development of many diseases. Folding@home is currently the largest distributed computing project in the world.

Platform support

Graphics processor

Simulating protein folding quickly requires processors with high floating-point performance, which is exactly what GPUs offer. Folding@home therefore began developing programs for GPUs and delegating calculations to them. On October 2, 2006, Folding@home publicly released a beta GPU client for Windows. During the test, 450 ATI X1900 GPUs provided 31 TFLOPS of computing performance, an average per GPU of more than 70 times that of a traditional CPU. On April 10, 2008, the second-generation Windows GPU public beta was launched. The new version supports the ATI/AMD HD 2xxx and HD 3xxx series, no longer needs to communicate with the graphics core through the DirectX interface, and supports multiple GPU cores.

The version for NVIDIA GPUs uses CUDA technology to perform protein folding calculations on the GPU. NVIDIA has stated that if only 0.1% of the world's CUDA-capable graphics cards took part, performance could reach 7 PFLOPS, far exceeding the computing level of supercomputers [6]. A public beta for CUDA-enabled NVIDIA GPUs has been launched.

PlayStation 3

Sony has joined the Folding@home project. Starting with PS3 firmware version 1.6, the console supports the project's scientific calculations. Because the PS3 uses the Cell processor, it can provide powerful computing performance. When the PS3 is idle, it starts the client to simulate protein folding, and the results are used to study various intractable diseases. While the Cell processor computes, NVIDIA's RSX graphics chip renders a real-time three-dimensional view of the folding protein. The display supports 1080p output and HDR effects, and the user can change the viewing angle with the controller.

The PS3 once provided Folding@home with more computing power than any other platform. After software for NVIDIA GPUs was introduced, NVIDIA GPUs replaced the PS3 as the main contributor. As of early September 2008, participating PS3 consoles provided more than 1,200 TFLOPS of computing power for the project, nearly 35% of the total.

Multi-core processors

As multi-core processors become more common and more software supports them, the Pande Group has added symmetric multiprocessing (SMP) support to Folding@home in order to increase the software's computing power. By using MPI, the software can run calculations on multiple cores at the same time.

The SMP version of Folding@home was released as a beta for x86-64 Linux and x86 Mac OS X on November 13, 2006. A beta for Win32 has also been released; the version for 32-bit Linux is still under development.

Energy consumption

The nominal power rating of a PlayStation 3 console is 380 W. Because Folding@home is CPU-intensive, it drives processor utilization to 100%. However, Stanford's FAQ for PS3 consoles states that each console is estimated to draw about 200 W while running the program [7]. As of the end of May 2008, more than 51,000 PS3 consoles were providing more than 1,400 TFLOPS of computing power to the project, an average of nearly 30,000 MFLOPS per console. At Stanford's figure of 200 W per console (using a 90 nm processor), each watt yields roughly 150 MFLOPS [8]. As the PS3's Cell processor moves to finer 65 nm and 45 nm processes, its power consumption will fall further and its computing power per watt will increase.
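
The per-console and per-watt figures can be reproduced from the quoted totals with a short calculation. The inputs are the rounded lower bounds given above, so the results are approximate; the function names are illustrative only.

```python
# Figures quoted in the text (rounded lower bounds, so results are approximate).
CONSOLES = 51_000          # participating PS3 consoles, end of May 2008
TOTAL_TFLOPS = 1_400       # combined contribution in TFLOPS
WATTS_PER_CONSOLE = 200    # Stanford's estimate while running the client

MFLOPS_PER_TFLOPS = 1_000_000


def mflops_per_console(total_tflops, consoles):
    """Average contribution of one console, in MFLOPS."""
    return total_tflops * MFLOPS_PER_TFLOPS / consoles


def mflops_per_watt(mflops, watts):
    """Computing power delivered per watt drawn, in MFLOPS per watt."""
    return mflops / watts


per_console = mflops_per_console(TOTAL_TFLOPS, CONSOLES)
print(f"{per_console:,.0f} MFLOPS per console")
print(f"{mflops_per_watt(per_console, WATTS_PER_CONSOLE):.0f} MFLOPS per watt")

# The same ratio using the rounded per-console figure of 30,000 MFLOPS.
print(f"{mflops_per_watt(30_000, WATTS_PER_CONSOLE):.0f} MFLOPS per watt")
```

Dividing the quoted totals gives roughly 27,000 MFLOPS per console and about 137 MFLOPS per watt; the figure of 150 MFLOPS per watt follows from the rounded value of 30,000 MFLOPS per console.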