Information about Flops



In computing, FLOPS (or flops or flop/s) is an acronym for FLoating-point Operations Per Second. FLOPS is a measure of a computer's performance, especially in fields of scientific computing that make heavy use of floating-point calculations; it is analogous to instructions per second. Since the final S stands for "second", conservative speakers treat "FLOPS" as both the singular and the plural of the term, although the singular "FLOP" is frequently encountered. Alternatively, the singular FLOP (or flop) is used as an abbreviation for "FLoating-point OPeration", and a flop count is a count of these operations (e.g., those required by a given algorithm or computer program). In this context, "flops" is simply the plural rather than a rate.

Computing devices exhibit an enormous range of performance levels in floating-point applications, so it makes sense to introduce larger units than FLOPS.
The standard SI prefixes can be used for this purpose, resulting in such units as teraFLOPS (1×10^12 FLOPS).
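
As a rough illustration of how the SI prefixes scale a FLOPS figure, the following sketch picks the largest prefix that fits a given rate. The function name format_flops and the table SI_PREFIXES are hypothetical helpers introduced here; only the prefix values themselves are standard.

```python
# SI prefixes as powers of ten, largest first (standard SI values).
SI_PREFIXES = [
    (1e24, "yotta"),
    (1e21, "zetta"),
    (1e18, "exa"),
    (1e15, "peta"),
    (1e12, "tera"),
    (1e9, "giga"),
    (1e6, "mega"),
    (1e3, "kilo"),
]


def format_flops(rate):
    """Return a FLOPS rate as a string using the largest fitting SI prefix."""
    for factor, prefix in SI_PREFIXES:
        if rate >= factor:
            return f"{rate / factor:g} {prefix}FLOPS"
    return f"{rate:g} FLOPS"


print(format_flops(280.6e12))  # Blue Gene/L peak quoted in this article
print(format_flops(30e9))      # a fast 2007 PC processor, per this article
print(format_flops(10))        # the simple calculator discussed in this article
```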

According to Top500.org, the fastest computer in the world as of June 2007 was the IBM Blue Gene/L supercomputer, measuring a peak of 280.6 TFLOPS. Cray Inc., however, recently released its Cray XT4, which can peak at 318 TFLOPS.

A basic calculator performs relatively few FLOPS. Each calculation request to a typical calculator requires only a single operation, so there is rarely any need for it to respond faster than the operator can perceive. Any response time below 0.1 second is perceived as instantaneous by a human operator, so a simple calculator could be said to operate at about 10 FLOPS (one operation per 0.1 second).

FLOPS as a measure of performance

In order for FLOPS to be useful as a measure of floating-point performance, a standard benchmark must be available on all computers of interest. One example is the LINPACK benchmark.
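
The idea of a benchmark can be sketched by timing a workload whose operation count is known in advance. The example below is not LINPACK (which solves dense systems of linear equations); it is a deliberately simple stand-in that multiplies two square matrices in pure Python and divides the known flop count by the elapsed time. The names matmul and measure_flops and the matrix size are assumptions made for illustration.

```python
import time


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += a[i][k] * b[k][j]  # one multiply and one add
            result[i][j] = total
    return result


def measure_flops(n=60):
    """Time one n-by-n multiplication and return (flop_count, flops_rate)."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = time.perf_counter() - start
    flop_count = 2 * n ** 3  # n^3 multiplies plus n^3 adds
    return flop_count, flop_count / elapsed


count, rate = measure_flops()
print(f"{count} floating-point operations at about {rate:.3g} FLOPS")
```

The figure printed here mostly reflects interpreter overhead rather than the hardware's floating-point capability, which is one reason standardized, optimized benchmarks are used for comparisons between machines.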

FLOPS in isolation are arguably not very useful as a benchmark for modern computers. Many factors other than raw floating-point computation speed affect computer performance, such as I/O performance, interprocessor communication, cache coherence, and the memory hierarchy. This means that supercomputers are in general capable of only a small fraction of their "theoretical peak" FLOPS throughput (obtained by adding together the theoretical peak FLOPS performance of every element of the system). Even when operating on large, highly parallel problems, their performance will be bursty, mostly due to the residual effects of Amdahl's law. Real benchmarks therefore measure both peak actual FLOPS performance and sustained FLOPS performance.
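
The gap between theoretical peak and achievable throughput can be sketched numerically. The example below sums per-element peaks as described above and applies Amdahl's law in its usual form, speedup = 1 / ((1 - p) + p / N), where p is the parallelizable fraction and N the number of processors. The machine (1,024 processors at 4 GFLOPS each) and the 99% parallel fraction are hypothetical figures chosen only for illustration, and the model ignores I/O, communication, and memory effects.

```python
def theoretical_peak(element_peaks):
    """Sum the peak FLOPS of every element of the system."""
    return sum(element_peaks)


def amdahl_speedup(parallel_fraction, processors):
    """Amdahl's law: speedup when only part of the work is parallelizable."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Hypothetical machine: 1,024 identical processors at 4 GFLOPS each.
processors = 1024
per_processor = 4e9
peak = theoretical_peak([per_processor] * processors)

# Assume 99% of the workload parallelizes perfectly (an illustrative figure).
speedup = amdahl_speedup(0.99, processors)
sustained = per_processor * speedup  # one processor's rate scaled by the speedup

print(f"theoretical peak: {peak / 1e12:.3f} TFLOPS")
print(f"Amdahl speedup:   {speedup:.1f}x on {processors} processors")
print(f"sustained:        {sustained / 1e12:.3f} TFLOPS "
      f"({100 * sustained / peak:.1f}% of peak)")
```

Under these assumptions even a 1% serial portion keeps the machine far below its summed peak, which is the effect the paragraph above attributes to Amdahl's law.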

For ordinary (non-scientific) applications, integer operations (measured in MIPS) are far more common. Measuring floating-point operation speed, therefore, does not accurately predict how the processor will perform on an arbitrary problem. However, for many scientific jobs, such as data analysis, a FLOPS rating is effective.

Historically, the earliest reliably documented serious use of the floating-point operation as a metric appears to be the AEC's justification to Congress for purchasing a Control Data CDC 6600 in the mid-1960s.

The terminology can be confusing. Until April 24, 2006, U.S. export control was based upon measurement of "Composite Theoretical Performance" (CTP) in millions of "Theoretical Operations Per Second", or MTOPS. On that date, however, the U.S. Department of Commerce's Bureau of Industry and Security amended the Export Administration Regulations to base controls on Adjusted Peak Performance (APP) in Weighted TeraFLOPS (WT).

Records

Today Blue Gene is the world's fastest computer, at 360 TFLOPS. On June 26, 2007, IBM announced the second generation of its top supercomputer, dubbed Blue Gene/P and designed to operate continuously at speeds exceeding one petaFLOPS. When configured to do so, it can reach speeds in excess of three petaFLOPS.

In June 2006, the Japanese research institute RIKEN announced a new computer, the MDGRAPE-3. The computer's performance tops out at one petaFLOPS, over three times faster than the Blue Gene/L. MDGRAPE-3 is not a general-purpose computer, which is why it does not appear in the TOP500 list; it has special-purpose pipelines for simulating molecular dynamics. MDGRAPE-3 houses 4,808 custom processors, 64 servers each with 256 dual-core processors, and 37 servers each containing 74 processors, for a total of 40,314 processor cores, compared to the 131,072 needed for the Blue Gene/L. MDGRAPE-3 can perform many more computations with fewer chips because of its specialized architecture. The computer is a joint project between RIKEN, Hitachi, Intel, and NEC subsidiary SGI Japan.

Distributed computing uses the Internet to link personal computers to achieve a similar effect:
  • The BOINC project as a whole averages 663 TFLOPS as of September 8, 2007.[1]
  • SETI@home averages more than 265 TFLOPS.[2]
  • Folding@home has reached over 1 PFLOPS.[3] Since March 22, 2007, PlayStation 3 owners have been able to participate in the Folding@home (FAH) project. Because of this, FAH now sustains considerably more than 210 TFLOPS (1,267 TFLOPS as of September 23, 2007). See the current stats[4] for details.
  • Einstein@home is crunching more than 70 TFLOPS.[5]
  • As of June 2007, GIMPS is sustaining 23 TFLOPS.[6]
Intel Corporation has recently unveiled the experimental multi-core POLARIS chip, which achieves 1 TFLOPS at 3.2 GHz. The 80-core chip can increase this to 1.8 TFLOPS at 5.6 GHz, although the thermal dissipation at this frequency exceeds 260 watts.
As of 2007, the fastest PC processors perform over 30 GFLOPS.[7] GPUs in PCs are considerably more powerful in terms of pure FLOPS. For example, in the GeForce 8 Series, the Nvidia 8800 Ultra performs around 576 GFLOPS on 128 processing elements. This equates to around 4.5 GFLOPS per element, compared with 2.75 per core for the Blue Gene/L. The 8800 series performs only single-precision calculations, and while GPUs are highly efficient at calculations, they are not as flexible as a general-purpose CPU. Current top-end ATI GPU cards do perform double-precision operations.
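
The per-element figures quoted above follow from dividing total throughput by the number of processing elements. The short check below uses only numbers given in this article (576 GFLOPS on 128 elements for the 8800 Ultra; 360 TFLOPS on 131,072 processors for the Blue Gene/L); the helper name per_element_gflops is a hypothetical one introduced for the example.

```python
def per_element_gflops(total_flops, elements):
    """Average GFLOPS delivered by each processing element or core."""
    return total_flops / elements / 1e9


# Figures quoted in this article.
gpu = per_element_gflops(576e9, 128)             # GeForce 8800 Ultra
blue_gene = per_element_gflops(360e12, 131072)   # Blue Gene/L

print(f"8800 Ultra:  {gpu:.2f} GFLOPS per element")
print(f"Blue Gene/L: {blue_gene:.2f} GFLOPS per core")
```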

Cost of computing

  • 1997: about US$30,000 per GFLOPS, with two 16-processor Pentium Pro Beowulf cluster computers.[8]
  • 2000, April: $1,000 per GFLOPS, Bunyip, Australian National University. First sub-US$1/MFLOPS. Gordon Bell Prize 2000.
  • 2000, May: $640 per GFLOPS, KLAT2, University of Kentucky
  • 2003, August: $82 per GFLOPS, KASY0, University of Kentucky
  • 2006, February: about $1 per GFLOPS in an ATI PC add-in graphics card (X1900 architecture); these figures are disputed, as they refer to highly parallelized GPU power
  • 2007, March: about $0.42 per GFLOPS in the Ambric AM2045.[9]
This trend toward low cost follows Moore's law.
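
One way to put a number on this trend is to compute how long it took the cost per GFLOPS to halve between two entries in the list, assuming a steady exponential decline between them. This is a sketch under stated assumptions: the 1997 entry gives no month, so the start of that year is assumed, and the name halving_time_years is hypothetical.

```python
import math

# (fractional year, US$ per GFLOPS) taken from the list above.
# The 1997 entry has no month, so the start of the year is assumed.
COST_POINTS = [
    (1997.0, 30000.0),
    (2000 + 3 / 12, 1000.0),   # April 2000
    (2000 + 4 / 12, 640.0),    # May 2000
    (2003 + 7 / 12, 82.0),     # August 2003
    (2006 + 1 / 12, 1.0),      # February 2006
    (2007 + 2 / 12, 0.42),     # March 2007
]


def halving_time_years(start, end):
    """Years for the cost to halve, assuming steady exponential decline."""
    (t0, c0), (t1, c1) = start, end
    return (t1 - t0) * math.log(2) / math.log(c0 / c1)


overall = halving_time_years(COST_POINTS[0], COST_POINTS[-1])
print(f"cost per GFLOPS halved roughly every {overall * 12:.1f} months")
```

The result depends heavily on which two points are compared, since the entries mix CPU clusters, a GPU, and a specialized chip.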

Pop culture references

  • In the Star Trek fictional universe, circa 2364, the android Data was constructed with an initial linear computational speed rated at 60 trillion operations per second, or 60 TOPS (a figure that potentially "dates" the series in which he appears).
  • In the series, the main computer reported that it processed 575 trillion operations per nanosecond. This would be 575 zettaOPS.
  • In the movie Terminator III, Skynet is said to be operating at 60 teraFLOPS.
  • In the massively multiplayer online game EVE Online, the computer systems on starships are rated in teraFLOPS, ranging from 100 teraFLOPS on the smallest vessels to 1.25 petaFLOPS on the largest vessel.
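
The conversion behind the 575 zettaOPS figure above can be checked with a few lines of arithmetic: a rate per nanosecond is multiplied by the number of nanoseconds in a second, then divided by the zetta prefix (10^21). The constant names are introduced here for illustration only.

```python
NANOSECONDS_PER_SECOND = 1e9
ZETTA = 1e21

ops_per_nanosecond = 575e12  # 575 trillion, as quoted above
ops_per_second = ops_per_nanosecond * NANOSECONDS_PER_SECOND

print(f"{ops_per_second / ZETTA:.0f} zettaOPS")
```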


This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.