Information about Flat File

A flat file is a computer file that can only be read or written sequentially. It consists of one or more records. Each record contains one or more field instances. Each field instance can contain a data value, or be omitted. Some definitions state that all records must be of the same type. This restriction is usual when discussing a flat file database. However, most usages allow a flat file to have more than one record type.

Flat files date back to the earliest days of computer processing. Originally, flat files were stored on punch cards, paper tape, or magnetic tape, media that are inherently sequential. Flat files are still widely used, even for files stored on disk. One reason is that reading an entire file sequentially is generally faster than fetching the same records one at a time through indexed access (also known as random access or direct access). Flat files are often used to transmit data between batch processing systems, especially on mainframes.

Flat files are often described using a COBOL copybook, which defines the type, length, and other properties of the fields and records.

Often each field has a fixed width. In the common case where all the fields and all the records are fixed width, the flat file can be called a fixed-width file. In a fixed-width file there is typically no field delimiter and no record delimiter, and field instances are never omitted. An empty field is indicated by a special filler value, e.g. spaces or zeroes. Fixed-width records often contain 80 bytes, matching the 80 columns of a punch card.
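As a minimal sketch of how a fixed-width record can be read and written, the following Python example slices a record by position. The layout (field names, offsets, and widths) is hypothetical and does not come from any real copybook; spaces are assumed as the filler for text and zeroes for numbers.

```python
# Hypothetical layout: (field name, start offset, width).
LAYOUT = [
    ("part_no", 0, 6),
    ("description", 6, 20),
    ("quantity", 26, 5),
]
RECORD_WIDTH = 31  # sum of the widths above


def parse_fixed(record: str) -> dict:
    """Slice one fixed-width record into named fields."""
    if len(record) != RECORD_WIDTH:
        raise ValueError(f"expected {RECORD_WIDTH} characters, got {len(record)}")
    fields = {}
    for name, start, width in LAYOUT:
        raw = record[start:start + width]
        # Spaces are the filler here; an all-filler field is treated as empty.
        fields[name] = raw.strip() or None
    return fields


def format_fixed(part_no: str, description: str, quantity: int) -> str:
    """Build one record: text left-justified and space-padded, numbers zero-padded."""
    return f"{part_no[:6]:<6}{description[:20]:<20}{quantity:05d}"


record = format_fixed("A1001", "Widget", 42)
parsed = parse_fixed(record)
```

Because every field sits at a known offset, no delimiter is needed, and an empty field still occupies its full width.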

In a variable-width record the fields are separated by a special character such as the tab character, the comma, or the pipe character. Sometimes field values are enclosed in quotation marks, and any internal quotation marks are doubled. The most common record delimiter is the newline. See CSV file for a more detailed description of this kind of file.
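The quoting convention described above can be illustrated with Python's standard csv module. This is only a sketch; the sample field names and values are invented for illustration.

```python
import csv
import io

# Hypothetical sample data: one heading row and one data row whose
# description contains both a comma and a quotation mark.
rows = [
    ["part_no", "description", "quantity"],
    ["A1001", 'Widget, 3" round', "42"],
]

# Write: the csv module quotes fields that contain the delimiter or a quote
# and doubles any embedded quotation mark (doublequote=True is the default).
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=",", quotechar='"', lineterminator="\n")
writer.writerows(rows)
text = buffer.getvalue()

# Read it back: each newline-terminated record becomes a list of field values.
parsed = list(csv.reader(io.StringIO(text), delimiter=",", quotechar='"'))
```

Here the second record is written as `A1001,"Widget, 3"" round",42`, and reading it back restores the original field values.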

There can be records of many different types in the same flat file. A typical approach is for a file to have zero or more header records, one or more detail records, zero or more summary records, and zero or more trailer records.
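One common way to tell record types apart is a type code at the start of each record. The sketch below assumes a hypothetical convention in which the first pipe-delimited field is H (header), D (detail), or T (trailer), and the trailer carries a count of the detail records; none of these names come from a real file format.

```python
# Hypothetical convention: the first pipe-delimited field is a record-type code.
# H = header, D = detail, T = trailer.
SAMPLE = """\
H|2024-01-31|ORDERS
D|A1001|42
D|B2002|7
T|2
"""


def read_batch(text: str) -> dict:
    """Group records by type and check the trailer's detail count."""
    header, details, trailer = None, [], None
    for line in text.splitlines():
        if not line:
            continue
        code, *fields = line.split("|")
        if code == "H":
            header = {"date": fields[0], "feed": fields[1]}
        elif code == "D":
            details.append({"part_no": fields[0], "quantity": int(fields[1])})
        elif code == "T":
            trailer = {"detail_count": int(fields[0])}
        else:
            raise ValueError(f"unknown record type: {code!r}")
    if trailer is not None and trailer["detail_count"] != len(details):
        raise ValueError("trailer count does not match number of detail records")
    return {"header": header, "details": details, "trailer": trailer}


batch = read_batch(SAMPLE)
```

A trailer count of this kind lets the receiving system detect a truncated transmission before loading any data.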

A flat file does not have any indexes and does not have any internal pointers. ISAM and VSAM files are not flat files, because they support indexed access in addition to sequential access.

Flat files are still widely used for data transmission because they are compact and support high-performance operations. Transmitting the same data using a relational approach would require many tables, one for each different record type. Another difference between flat files and relational tables is that in a flat file the order of the records can matter. Yet another difference is that in a flat file a field can occur more than once in a record. An ETL system will generally sort the input file before submitting it to the database's bulk loader, in order to reduce total elapsed time. Long before there were any databases, a master file was "joined" to a detail file by sorting both on a common key, e.g. part number, and then merging them.
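The sort-and-merge "join" of a master file and a detail file can be sketched in Python as follows. The record shapes and sample data are hypothetical, and the sketch assumes that each key appears at most once in the master file.

```python
def merge_join(master, detail, key):
    """Join two record lists on `key` by sorting both and merging in one pass.

    Assumes keys are unique in `master`; `detail` may repeat a key.
    """
    master = sorted(master, key=key)
    detail = sorted(detail, key=key)
    joined = []
    i = 0
    for d in detail:
        # Advance the master side until its key catches up with the detail key.
        while i < len(master) and key(master[i]) < key(d):
            i += 1
        if i < len(master) and key(master[i]) == key(d):
            joined.append((master[i], d))
        # Detail records with no matching master are skipped in this sketch.
    return joined


# Hypothetical data: (part number, value) tuples.
parts = [("B2002", "Gasket"), ("A1001", "Widget")]
orders = [("A1001", 42), ("B2002", 7), ("A1001", 5), ("C3003", 1)]

result = merge_join(parts, orders, key=lambda rec: rec[0])
```

After sorting, each file is read only once from start to end, which is why this technique suited sequential media such as magnetic tape.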

Like a flat file, an XML file can contain many different types of data. There are many possible ways to represent the information in a flat file using XML. For example, each field and each record could be an XML element. One advantage of using XML would be that each field is named. A disadvantage is that the file would be larger. A file containing XML is not generally called a flat file, even though it satisfies the definition. It usually is called an XML file.
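To illustrate the trade-off, the following sketch writes the same record once as a pipe-delimited line and once as XML with one element per field, using Python's standard xml.etree.ElementTree module. The record and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical record and field names, for illustration only.
record = {"part_no": "A1001", "description": "Widget", "quantity": "42"}

# Flat representation: values only, in a fixed order, separated by pipes.
flat_line = "|".join(record.values()) + "\n"

# XML representation: one element per record and one named element per field.
root = ET.Element("records")
rec_el = ET.SubElement(root, "record")
for name, value in record.items():
    ET.SubElement(rec_el, name).text = value
xml_text = ET.tostring(root, encoding="unicode")

# Compare the encoded sizes of the two representations.
size_flat = len(flat_line.encode("utf-8"))
size_xml = len(xml_text.encode("utf-8"))
```

In this small case the XML form is several times larger than the delimited line, but every value carries its field name.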

Flat files are also called feed files or batch files. They are often transmitted over a network using FTP (File Transfer Protocol) or a newer secure alternative, e.g. SFTP. Flat files are also used in Electronic Data Interchange (EDI).

The Jargon File and The New Hacker's Dictionary (edited by Eric S. Raymond, 1991) contain this definition of flat-file:

A flattened representation of some database or tree or network structure, as a single file from which the structure could implicitly be rebuilt, esp. one in flat-ASCII form.



This article is copied from an article on Wikipedia.org, the free encyclopedia created and edited by its online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of Wikipedia articles provide accurate and timely information, please do not assume the accuracy of any particular article. This article is distributed under the terms of the GNU Free Documentation License.