Naming conventions (programming)

In computer programming, a naming convention is a set of rules for choosing the character sequence to be used for identifiers in source code and documentation.

Reasons for using a naming convention (as opposed to allowing programmers to choose any character sequence) include the following:
  • to reduce the effort needed to read and understand source code; and
  • to enhance source code appearance (for example, by disallowing overly long names or abbreviations).

The choice of naming conventions can be an enormously controversial issue, with partisans of each convention holding theirs to be the best and all others to be inferior. Colloquially, this is said to be a matter of "religion" (see the Jargon File article on this matter).

Potential benefits

Some of the potential benefits that can be obtained by adopting a naming convention include the following:
  • to provide additional information (i.e., metadata) about the use to which an identifier is put;
  • to help formalize expectations and promote consistency within a development team;
  • to enable the use of automated refactoring or search and replace tools with minimal potential for error;
  • to enhance clarity in cases of potential ambiguity;
  • to enhance the aesthetic and professional appearance of work product (for example, by disallowing overly long names, comical or "cute" names, or abbreviations); and
  • to help avoid "naming collisions" that might occur when the work product of different organizations is combined (see also: namespaces).

Challenges

The choice of naming conventions (and the extent to which they are enforced) is often a contentious issue, with partisans holding their viewpoint to be the best and others to be inferior.

Moreover, even with known and well-defined naming conventions in place, some organizations may fail to consistently adhere to them, causing inconsistency and confusion.

These challenges may be exacerbated if the naming convention rules are internally inconsistent, arbitrary, difficult to remember, or otherwise perceived as more burdensome than beneficial.

Business value of naming conventions

Although largely hidden from the view of most business users, well-chosen identifiers make it significantly easier for subsequent generations of analysts and developers to understand what the system is doing and how to fix or extend the source code for new business needs.

For example, although the following:

a = b * c

is syntactically correct, it is entirely opaque as to intent or meaning. Contrast this with:

weekly_pay = hours_worked * pay_rate

which implies the intent and meaning of the source code, at least to those familiar with the underlying context of the application.
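
The same contrast can be sketched at the level of a whole function. The Python snippet below is only an illustration: the function names and the sample figures are hypothetical and do not come from any particular system.

```python
# Hypothetical example: both functions compute the same value,
# but only the second one reveals what that value means.

def f(b, c):
    # Opaque: the reader must guess what b, c and a stand for.
    a = b * c
    return a


def compute_weekly_pay(hours_worked, pay_rate):
    # Descriptive: the names state the intent of the calculation.
    weekly_pay = hours_worked * pay_rate
    return weekly_pay


print(f(40, 25))
print(compute_weekly_pay(40, 25))
```

Both calls return the same number; the difference lies entirely in how much the reader can infer without consulting other documentation.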

Common elements

The exact rules of a naming convention depend on the context in which they are employed. Nevertheless, there are several common elements that influence most if not all naming conventions in common use today.

Length of identifiers

A fundamental element of all naming conventions is the set of rules related to identifier length (i.e., the finite number of individual characters allowed in an identifier). Some rules dictate a fixed numerical bound, while others specify less precise heuristics or guidelines.

Identifier length rules are routinely contested in practice, and subject to much debate academically.

Some considerations:
  • shorter identifiers may be preferred as more expedient, because they are easier to type
  • extremely short identifiers (such as 'i' or 'j') are very difficult to uniquely distinguish using automated search and replace tools
  • longer identifiers may be preferred because short identifiers cannot encode enough information or appear too cryptic
  • longer identifiers may be disfavored because of visual clutter
It is an open research issue whether programmers prefer shorter identifiers because they are easier to type, or think up, than longer identifiers, or because in many situations a longer identifier simply clutters the visible code and provides no perceived additional benefit.
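
The search-and-replace consideration above can be illustrated with a small Python sketch. The sample source line and the helper names are assumptions made up for this example; the point is only that a plain text search for a one-letter name matches far more than the identifier itself.

```python
import re

# Hypothetical line of source code containing the short identifier "i".
source = "for i in range(limit): total = total + prices[i] * quantities[i]"


def count_substring(text, name):
    """Count every occurrence of name, as a naive text search would."""
    return text.count(name)


def count_whole_word(text, name):
    """Count occurrences of name that stand alone as a complete word."""
    pattern = r"\b" + re.escape(name) + r"\b"
    return len(re.findall(pattern, text))


print(count_substring(source, "i"))   # also matches inside "in", "limit", ...
print(count_whole_word(source, "i"))  # matches only the identifier itself
```

A whole-word search narrows the matches, but it still cannot tell apart two unrelated variables that happen to share the same short name in different scopes.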

Brevity in programming may be attributed in part to early linkers, which restricted variable names to 6 characters in order to save memory.

Letter case and numerals

Some naming conventions limit whether letters may appear in uppercase or lowercase. Other conventions do not restrict letter case, but attach a well-defined interpretation based on letter case. Some naming conventions specify whether alphabetic, numeric, or alphanumeric characters may be used, and if so, in what sequence.

Multiple-word identifiers

A common recommendation is "Use meaningful identifiers." A single word may not be as meaningful, or specific, as multiple words. Consequently, some naming conventions specify rules for the treatment of "compound" identifiers containing more than one word.

Word boundaries

As most programming languages do not allow whitespace in identifiers, a method of delimiting each word is needed (to make it easier for subsequent readers to interpret which characters belong to which word).

Delimiter-separated words: One approach is to delimit separate words with a nonalphanumeric character. The two characters commonly used for this purpose are the hyphen ('-') and the underscore ('_'); e.g., the two-word name "two words" would be represented as two-words or two_words. The hyphen is used by nearly all programmers writing COBOL and Lisp; it is also common for selector names in Cascading Style Sheets. Many other languages (e.g., languages in the C and Pascal families) reserve the hyphen for use as the subtraction operator, and so it is not available for use in identifiers. Also, delimiter-separated words may cause conflicts or unexpected behavior when source code is manipulated using a text editor, IDE, or other processing tool.
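
As a minimal sketch of the delimiter-separated approach, the Python helpers below split an identifier on hyphens or underscores and join the words back with a chosen delimiter. The function names are hypothetical and used only for illustration.

```python
import re


def split_delimited(identifier):
    """Split an identifier into words on hyphens or underscores."""
    return [word for word in re.split(r"[-_]", identifier) if word]


def join_words(words, delimiter):
    """Join lowercase words using the given delimiter character."""
    return delimiter.join(word.lower() for word in words)


words = split_delimited("two-words")
print(words)
print(join_words(words, "_"))
print(join_words(words, "-"))
```

Note that the hyphenated form produced here would not be a legal identifier in Python itself, for the reason given above: the hyphen is parsed as the subtraction operator.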

Letter-case separated words: An alternative approach is to indicate word boundaries using capitalization, thus rendering "two words" as either twoWords or TwoWords. The term CamelCase (or camelCase) is sometimes used to describe this technique.
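
The letter-case approach can be sketched in Python as follows. The helper names are hypothetical, and the splitting rule assumes that every word starts with a single capital letter (or is the lowercase first word); runs of capitals such as acronyms are not handled by this simple version.

```python
import re


def to_lower_camel(words):
    """Render words as lower camel case, e.g. twoWords."""
    first, *rest = words
    return first.lower() + "".join(word.capitalize() for word in rest)


def to_upper_camel(words):
    """Render words as upper camel case, e.g. TwoWords."""
    return "".join(word.capitalize() for word in words)


def split_camel(identifier):
    """Split a camel-case identifier at each capital letter.

    Assumption: no consecutive capitals (acronyms) appear in the name.
    """
    return re.findall(r"[A-Za-z][a-z0-9]*", identifier)


print(to_lower_camel(["two", "words"]))
print(to_upper_camel(["two", "words"]))
print(split_camel("twoWords"))
```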

Metadata and hybrid conventions

Some naming conventions represent rules or requirements that go beyond the requirements of a specific project or problem domain, and instead reflect a greater over-arching set of principles defined by the software architecture, underlying programming language or other kind of cross-project methodology.

Hungarian notation

Hungarian notation is a naming convention in which the name of a variable indicates its type or intended use. There are two types of Hungarian notation: Systems Hungarian notation and Apps Hungarian notation.

Positional notation

A style used for very short identifiers (8 characters or fewer) could be: LCCIIL01, where LC is the application (Letters of Credit), C stands for COBOL, IIL is the particular process subset, and 01 is a sequence number.

This sort of convention is still in active use in mainframes dependent upon JCL and is also seen in the 8.3 MS-DOS file-naming style (a maximum of 8 characters, then a period separator, then a 3-character file type).
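
A positional name such as LCCIIL01 can be decoded purely by character position. The Python sketch below assumes fixed field widths of 2, 1, 3 and 2 characters, inferred only from the single example above; the class and function names are hypothetical.

```python
from typing import NamedTuple


class PositionalName(NamedTuple):
    application: str  # e.g. "LC" for Letters of Credit
    language: str     # e.g. "C" for COBOL
    process: str      # e.g. "IIL", the process subset
    sequence: int     # e.g. 1, parsed from "01"


def parse_positional(name):
    """Decode an 8-character positional name by slicing fixed positions.

    Assumption: field widths are 2 + 1 + 3 + 2, as in LCCIIL01.
    """
    if len(name) != 8:
        raise ValueError("expected exactly 8 characters: " + repr(name))
    return PositionalName(
        application=name[0:2],
        language=name[2],
        process=name[3:6],
        sequence=int(name[6:8]),
    )


print(parse_positional("LCCIIL01"))
```

Because meaning depends entirely on position, such names are compact but unreadable without the decoding table, which is the usual trade-off of this style.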

Composite word scheme (OF Language)

One of the earliest published convention systems was IBM's "OF Language", documented in a 1980s IMS (Information Management System) manual.

It detailed the PRIME-MODIFIER-CLASS word scheme, which consisted of names like "CUST-ACT-NO" to indicate "customer account number".

PRIME words were meant to indicate major "entities" of interest to a system.

MODIFIER words were used for additional refinement, qualification and readability.

CLASS words ideally would be a very short list of data types relevant to a particular application. Common CLASS words might be: NO (number), ID (identifier), TXT (text), AMT (amount), QTY (quantity), FL (flag), CD (code), W (work) and so forth. In practice, the available CLASS words would be a list of fewer than two dozen terms.

CLASS words, typically positioned on the right (suffix), served much the same purpose as Hungarian notation prefixes.

The purpose of CLASS words, in addition to consistency, was to specify to the programmer the data type of a particular data field. Prior to the acceptance of BOOLEAN (two values only) fields, FL (flag) would indicate a field with only two possible values.
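
The PRIME-MODIFIER-CLASS scheme can be sketched as a small Python parser. This is an illustration only: it assumes the PRIME word comes first, the CLASS word comes last, and the words are separated by hyphens, as in CUST-ACT-NO. The CLASS word table contains just the examples listed above, and the function name is hypothetical.

```python
# CLASS words taken from the examples in the text above.
CLASS_WORDS = {
    "NO": "number",
    "ID": "identifier",
    "TXT": "text",
    "AMT": "amount",
    "QTY": "quantity",
    "FL": "flag",
    "CD": "code",
    "W": "work",
}


def parse_of_name(name):
    """Split a name into its PRIME, MODIFIER and CLASS parts.

    Assumption: PRIME is the first word and CLASS is the last word.
    """
    words = name.split("-")
    if len(words) < 2:
        raise ValueError("need at least a PRIME and a CLASS word")
    prime, *modifiers, class_word = words
    if class_word not in CLASS_WORDS:
        raise ValueError("unknown CLASS word: " + class_word)
    return {
        "prime": prime,
        "modifiers": modifiers,
        "class": CLASS_WORDS[class_word],
    }


print(parse_of_name("CUST-ACT-NO"))
```

Keeping the CLASS list short, as the scheme recommends, is what makes a lookup table like this practical to maintain and check.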

Language-specific conventions

C and C++ languages

  • In C and C++, keywords and standard library identifiers are mostly lowercase. Identifiers representing macros are, by convention, written using only uppercase letters (this is related to the convention in many programming languages of using all-uppercase identifiers for constants). Names beginning with a double underscore, or with an underscore and a capital letter, are reserved for the implementation (compiler, standard library) and should not be used (e.g. __reserved or _Reserved).
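
The two rules above lend themselves to a simple automated check. The Python sketch below tests a name against the reserved-prefix rule exactly as stated in this section and against the all-uppercase macro convention; the function names are hypothetical, and the check is a heuristic rather than a full statement of what the language standards reserve.

```python
import re

# Names beginning with "__" or with "_" followed by a capital letter.
_RESERVED_PREFIX = re.compile(r"^(__|_[A-Z])")

# Uppercase letters, digits and underscores, starting with a letter.
_MACRO_STYLE = re.compile(r"[A-Z][A-Z0-9_]*")


def is_reserved_for_implementation(name):
    """Return True if name starts with a reserved prefix."""
    return _RESERVED_PREFIX.match(name) is not None


def looks_like_macro(name):
    """Return True if name follows the all-uppercase macro convention."""
    return _MACRO_STYLE.fullmatch(name) is not None


for name in ["__reserved", "_Reserved", "_local", "MAX_SIZE", "maxSize"]:
    print(name, is_reserved_for_implementation(name), looks_like_macro(name))
```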

Java language

  • In Java, very strong conventions established from the beginning by the language's originators require classes and variables to be capitalised differently. Thus, to a Java programmer, widget.expand() and Widget.expand() imply significantly different behaviour, even without prior knowledge of the Widget class and despite the fact that the compiler enforces no such rules.

Visual Basic, VB.NET and BASIC languages

  • Traditionally, BASIC does not implement the mandatory case-sensitivity that the C-type languages do, and the IDE often provides on-the-spot variable identification. Hence Visual Basic naming conventions tend to rest on what is most human-readable, as opposed to providing information about the identifier itself. For instance, lpszMyString in C would just become MyString in Visual Basic, and widget.expand() would mean the same as Widget.expand().

This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.