Information about Filename
A filename is a special kind of string used to uniquely identify a file stored on the file system of a computer. Depending on the operating system, such a name may also identify a directory. Different operating systems impose different restrictions regarding length and allowed characters on filenames. A filename includes one or more of these components:
- protocol (or scheme) — access method (e.g., http, ftp, file etc.)
- host (or network-ID) — host name, IP address, domain name, or LAN network name (e.g., wikipedia.org, 207.142.131.206, \\MYCOMPUTER, SYS:, etc.)
- device (or node) — port, socket, drive, root mountpoint, disc, volume (e.g., C:, /, SYSLIB, etc.)
- directory (or path) — directory tree (e.g., /usr/bin, \TEMP, [USR.LIB.SRC], etc.)
- file — base name of the file
- type (format or extension) — indicates the content type of the file (e.g., .txt, .exe, .dir, etc.)
- version — revision number of the file
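As a rough illustration of the components listed above, the following sketch uses Python's standard `pathlib` module to pull the device, directory, base name, and extension out of a Unix-style and a Windows-style path. The example paths and the `components` helper are hypothetical, chosen only for illustration; protocol, host, and version parts are not covered.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical example paths, chosen only for illustration.
unix_path = PurePosixPath("/usr/bin/prog.c")
win_path = PureWindowsPath(r"C:\TEMP\report.txt")


def components(path):
    """Return the device, directory, base name, and extension of a pure path."""
    return {
        "device": path.drive or path.root,  # "C:" on Windows, "/" on Unix
        "directory": str(path.parent),      # the directory tree
        "file": path.stem,                  # base name without the extension
        "type": path.suffix,                # extension, including the period
    }


unix_parts = components(unix_path)
win_parts = components(win_path)
print(unix_parts)
print(win_parts)
```

The pure path classes only manipulate strings, so the sketch runs on any operating system without touching the file system.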
In some systems, if a filename does not contain a path part, the file is assumed to be in the current working directory.
Many operating systems, including MS-DOS, Microsoft Windows, and VMS, allow a filename extension that consists of one or more characters following the last period in the filename. This divides the filename into two parts: the basename (the primary filename) and the extension (usually indicating the file type associated with a certain file format). On these systems the extension is considered part of the filename. On systems which allow, for example, an eight-character basename followed by a three-character extension, a filename with an extension of "" or "   " (nothing, or three spaces) is still 11 characters long, since the "." is supplied by the OS but not counted as part of the name. On Unix-like systems, files often have an extension (for example prog.c, denoting the C-language source code of a program called "prog"), but the extension is not considered a separate part of the filename, and the "." serves as an "extension separator" or "delimiter" only by convention. A Unix system which allows 14-character filenames could therefore have a file named a.longxtension.
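The split at the last period can be sketched with Python's standard `os.path.splitext`. The `fat_directory_field` helper is a hypothetical illustration of the fixed 8 + 3 layout described above, in which the period itself is not stored; it is not an actual FAT implementation.

```python
import os.path


def split_extension(filename):
    """Split a filename at its last period into (basename, extension)."""
    return os.path.splitext(filename)


def fat_directory_field(base, ext=""):
    """Pad the base to 8 and the extension to 3 characters (hypothetical helper)."""
    return base.upper().ljust(8)[:8] + ext.upper().ljust(3)[:3]


examples = ["prog.c", "a.longxtension", "archive.tar.gz", "README"]
results = {name: split_extension(name) for name in examples}
for name, (base, ext) in results.items():
    print(f"{name!r}: basename={base!r}, extension={ext!r}")

# An 8.3 name always occupies 11 characters, with or without an extension.
print(len(fat_directory_field("prog", "c")), len(fat_directory_field("prog")))
```

Note that only the last period counts: `archive.tar.gz` splits into `archive.tar` and `.gz`.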
Within a single directory, filenames must be unique, although two files in different directories may have the same name. On Unix, upper-case and lower-case letters are considered different, so MyName and myname are valid names for two different files in the same directory. Historically, names beginning with upper-case characters have been listed before all-lower-case names in directory (folder) listings, because upper-case letters sort before lower-case letters in ASCII. Many Unix software vendors use this scheme to make important files, such as INSTALL or README, appear in listings before relatively less important files or directories (like lib).
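The listing order described above follows from plain character-code ordering, which the sketch below reproduces with Python's default string sort. The list of names is a made-up directory listing, and the case-folded sort is included only for contrast with a case-insensitive view.

```python
# Hypothetical directory contents, chosen only for illustration.
names = ["lib", "README", "myname", "INSTALL", "MyName"]

# Default string comparison uses code points, so "A"-"Z" sort before "a"-"z".
byte_order = sorted(names)

# A case-insensitive view of the same names, for contrast.
folded_order = sorted(names, key=str.lower)

# On a case-insensitive system, MyName and myname would collide.
distinct_case_sensitive = len(set(names))
distinct_case_insensitive = len({name.lower() for name in names})

print(byte_order)
print(folded_order)
print(distinct_case_sensitive, distinct_case_insensitive)
```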
Unix-like systems allow a file to have more than one name; in traditional Unix-style file systems, the names are hard links to the file's inode or equivalent. Hard links are different from Windows shortcuts, Mac OS aliases, or symbolic links.
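A minimal sketch of a file with two names, assuming a file system that supports hard links (as traditional Unix file systems do). It uses Python's standard `os.link` in a temporary directory; the file names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    first_name = os.path.join(tmp, "report.txt")
    second_name = os.path.join(tmp, "second-name.txt")

    with open(first_name, "w") as f:
        f.write("hello")

    # Create a second directory entry (hard link) for the same file.
    os.link(first_name, second_name)

    # Both names refer to the same inode, whose link count is now 2.
    same_inode = os.stat(first_name).st_ino == os.stat(second_name).st_ino
    link_count = os.stat(first_name).st_nlink

    # Removing one name leaves the data reachable through the other.
    os.remove(first_name)
    with open(second_name) as f:
        surviving_text = f.read()

print(same_inode, link_count, surviving_text)
```

Unlike a shortcut, alias, or symbolic link, neither name is the "real" one: each is an equal reference to the same file.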
Reserved characters and words
Many operating systems prohibit control characters from appearing in file names. Unix-like systems are an exception: the only control character forbidden in file names is the null character, because it is the end-of-string indicator in C. Unix also excludes the path separator / from filenames. Some operating systems prohibit particular characters from appearing in file names:
| Character | Name | Reason |
|---|---|---|
| / | slash | used as a path name component separator in Unix-like, MS-DOS and Windows, and Amiga systems. |
| \ | backslash | treated the same as slash in MS-DOS and Windows, and as the escape character in Unix systems, see Note 1 |
| ? | question mark | used as a wildcard in Unix, Windows and AmigaOS; marks a single character. |
| % | percent sign | used as a wildcard in RT-11; marks a single character. |
| * | asterisk | used as a wildcard in Unix, MS-DOS, RT-11, VMS and Windows. Marks any sequence of characters (Unix, Windows, later versions of MS-DOS) or any sequence of characters in either the basename or extension (thus "*.*" in early versions of MS-DOS means "all files"). See Note 1 |
| : | colon | used to determine the mount point / drive on Windows; used to determine the virtual device or physical device such as a drive on AmigaOS, RT-11 and VMS; used as a pathname separator in classic Mac OS. Doubled after a name on VMS, indicates the DECnet nodename (equivalent to a NetBIOS (Windows networking) hostname preceded by "\\".) |
| \| | vertical bar | designates a pipe between commands in Windows, see Note 1 |
| " | quotation mark | used to mark beginning and end of filenames containing spaces in Windows, see Note 1 |
| < | less than | used to redirect input, allowed in Unix filenames, see Note 1 |
| > | greater than | used to redirect output, allowed in Unix filenames, see Note 1 |
| . | period | allowed but the last occurrence will be interpreted to be the extension separator in VMS, MS-DOS and Windows. In other OSes, usually considered as part of the filename, and more than one full stop may be allowed. |
Note 1: Most Unix shells require certain characters such as spaces, <, >, |, \, and sometimes :, (, ), &, ;, to be quoted or escaped:
five\ and\ six\<seven (example of escaping)
'five and six<seven' or "five and six<seven" (examples of quoting)
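The quoting and escaping in Note 1 can be reproduced with Python's standard `shlex` module, which follows POSIX shell rules. This is a sketch of the idea rather than a description of any particular shell; the filename is the one used in the note.

```python
import shlex

name = "five and six<seven"

# Quote the name so that a POSIX shell treats it as a single word.
quoted = shlex.quote(name)

# The backslash-escaped form from Note 1, written as a raw string.
escaped = r"five\ and\ six\<seven"

# Both forms parse back to the same single filename.
parsed_quoted = shlex.split(quoted)
parsed_escaped = shlex.split(escaped)

print(quoted)
print(parsed_quoted, parsed_escaped)
```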
In Windows the space and the period are not allowed as the final character of a filename. The period is allowed as the first character, but certain Windows applications, such as Windows Explorer, forbid creating or renaming such files (despite this convention being used in Unix-like systems to describe hidden files and directories). Among workarounds are using different explorer applications or saving a file from an application with the desired name.[1]
Some file systems on a given operating system (especially file systems originally implemented on other operating systems), and particular applications on that operating system, may apply further restrictions and interpretations. See comparison of file systems for more details on restrictions imposed by particular file systems.
In Unix-like systems, MS-DOS, and Windows, the file names "." and ".." have special meanings (current and parent directory respectively).
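A minimal sketch of how the Windows rules from this section could be combined into one check: the reserved characters from the table, the ban on a trailing space or period, the special names "." and "..", and the reserved DOS device names such as CON or NUL. The function name and the exact rule set are assumptions made for illustration; real Windows behavior varies by API and file system.

```python
# Characters from the table that Windows does not allow in a filename.
RESERVED_CHARS = set('/\\?*:|"<>')

# DOS device names that are reserved regardless of extension.
DEVICE_NAMES = (
    {"CON", "PRN", "AUX", "CLOCK$", "NUL"}
    | {f"COM{i}" for i in range(10)}
    | {f"LPT{i}" for i in range(10)}
)


def windows_problems(name):
    """Return a list of reasons why `name` is a poor Windows filename."""
    problems = []
    if name in (".", ".."):
        problems.append("special directory name")
    if any(ch in RESERVED_CHARS or ord(ch) < 32 for ch in name):
        problems.append("reserved or control character")
    if name.endswith((" ", ".")):
        problems.append("trailing space or period")
    # Device names are matched case-insensitively, ignoring any extension.
    if name.split(".")[0].upper() in DEVICE_NAMES:
        problems.append("reserved device name")
    return problems


for candidate in ["prog.c", "aux.c", 'q"uote"s.txt', "NUL.txt", "notes. "]:
    print(repr(candidate), windows_problems(candidate))
```

Returning a list of reasons rather than a boolean makes it easier to report why a name that is legal on Unix was rejected.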
In addition, in Windows and DOS, some words are reserved and cannot be used as filenames.[1] Examples are the DOS device files: CON, PRN, AUX, CLOCK$, NUL, COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9. Operating systems with these restrictions are incompatible with some filenames from other file systems. For example, Windows will fail to handle, or raise error reports for, these legal UNIX filenames: aux.c, q"uote"s.txt, or NUL.txt.
Comparison of file name limitations
| System | Alphabetic Case Sensitivity | Allowed Character Set | Reserved Characters | Reserved Words | Maximum Length | Comments |
|---|---|---|---|---|---|---|
| MS-DOS FAT | case-insensitive, case-destruction | A–Z 0–9 - _ | all except allowed | | 8 + 3 | |
| Commodore 64 | case-sensitive, case-preservation | any | : , = shift-space | | 16 | Actual limit depends on the drive used, but most drives limit the length to 16 characters. |
| Win95 VFAT | case-insensitive | any | \| \ ? * < " : > + [ ] / control characters | | 255 | |
| WinXP NTFS | optional | any | \| \ ? * < " : > / control characters | aux, con, prn | 255 | |
| OS/2 HPFS | case-insensitive, case-preservation | any | \| \ ? * < " : > / | | 254 | |
| Mac OS HFS | case-insensitive, case-preservation | any | : | | 255 | Finder is limited to 31 characters |
| Mac OS HFS+ | case-insensitive, case-preservation | any | : on disk, in classic Mac OS, and at the Carbon layer in Mac OS X; / at the Unix layer in Mac OS X | | 255 | Mac OS 8.1 - Mac OS X |
| most UNIX file systems | case-sensitive, case-preservation | any | / null | | 255 | a leading . indicates that ls and file managers will not by default show the file |
| early UNIX (AT&T) | case-sensitive, case-preservation | any | / | | 14 | a leading . indicates a "hidden" file |
| POSIX "Fully portable filenames"[2] | case-sensitive, case-preservation | A–Z a–z 0–9 . _ - | / null | Filenames to avoid include: a.out, core, .profile, .history, .cshrc | 14 | hyphen must not be first character |
| AmigaOS | case-insensitive, case-preservation | any | : / " | | 107 | dos.library |
| Amiga OFS | case-insensitive, case-preservation | any | : / " | | 30 | Original File System 1985 |
| Amiga FFS | case-insensitive, case-preservation | any | : / " | | 30 | Fast File System 1988 |
| Amiga PFS | case-insensitive, case-preservation | any | : / " | | 255 | Professional File System 1993 |
| Amiga SFS | case-insensitive, case-preservation | any | : / " | | 32,000 | Smart File System 1998 |
| Amiga FFS2 | case-insensitive, case-preservation | any | : / " | | 107 | Fast File System 2 2002 |
| BeOS BFS | case-sensitive | UTF-8 | / | | 255 | |
| DEC PDP-11 RT-11 | case-insensitive | RADIX-50 | | | 6 + 3 | Flat filesystem with no subdirs. A full "file specification" includes device, filename and extension (file type) in the format: dev:filnam.ext. |
| DEC VAX VMS | case-insensitive | A–Z 0–9 _ | | | 32 per component; earlier 9 per component; latterly, 255 for a filename and 32 for an extension. | A full "file specification" includes nodename, diskname, directory/ies, filename, extension and version in the format: OURNODE::MYDISK:[THISDIR.THATDIR]FILENAME.EXTENSION;2 Directories can only go 8 levels deep. |
| ISO 9660 | case-insensitive | A–Z 0–9 _ . | | | 255 | 8 directory levels max (for Level 1 conformance) |
See also
- File system
- Long filename
- Path (computing)
- Uniform Resource Identifier (URI)
- Uniform Resource Locator (URL)
References
1. Naming a file. msdn.microsoft.com (MSDN). Filename restrictions on Windows.
2. Lewine, Donald. POSIX Programmer's Guide: Writing Portable UNIX Programs. O'Reilly & Associates, Inc., Sebastopol, CA, 1991, pp. 63-64.
External links
- Large list of filename extensions: FILExt
- Large list of filename extensions: File-extensions.org
- Metasearch engine for file extensions: File Extension Seeker
- Filename Strategy for Managing Image Assets: ControlledVocabulary.com
- Recommendations for Limitations on Image Filenaming: ControlledVocabulary.com
This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.