A word is a unit of language that carries meaning and consists of one or more morphemes, which are linked more or less tightly together, and has a phonetic value. Typically a word consists of a root or stem and zero or more affixes. Words can be combined to create phrases, clauses, and sentences. A word consisting of two or more stems joined together is called a compound.
Difficulty in defining the term
Depending on the language, words can sometimes be difficult to identify or delimit. While word separators, most often spaces, are common in the writing systems of many languages, some languages, such as Chinese and Japanese, do not use them. Conversely, a word may contain spaces if it is a compound or a proper noun, such as "ice cream" or "the United States of America". Furthermore, synthetic languages often combine many pieces of lexical information into a single word, which makes such words hard to reduce to the traditional notion of the word that fits analytic languages more easily; this is especially problematic for polysynthetic languages such as Inuktitut and Ubykh, where an entire sentence may consist of a single such word. Especially confusing are languages such as Vietnamese, where spaces do not necessarily indicate word breaks and boundaries must be determined from context.
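As a rough illustration of why spaces alone are an unreliable word separator, the following Python sketch compares a plain whitespace split with one that consults a small list of multiword items. The function name `tokenize` and the `MULTIWORD` list are hypothetical and hand-written for this example; a real lexicon would be far larger, and the sketch does not address languages written without spaces.

```python
# Hypothetical, hand-written list of words that contain spaces.
MULTIWORD = {
    ("ice", "cream"),
    ("the", "united", "states", "of", "america"),
}


def tokenize(text, multiword=MULTIWORD):
    """Split on whitespace, then re-join any run of parts listed in `multiword`.

    At each position the longest listed run is preferred.
    """
    parts = text.lower().split()
    longest = max((len(m) for m in multiword), default=1)
    tokens = []
    i = 0
    while i < len(parts):
        # Try the longest candidate run first, down to runs of two parts.
        for n in range(min(longest, len(parts) - i), 1, -1):
            if tuple(parts[i:i + n]) in multiword:
                tokens.append(" ".join(parts[i:i + n]))
                i += n
                break
        else:
            # No listed multiword item starts here: keep the single part.
            tokens.append(parts[i])
            i += 1
    return tokens


sentence = "I ate ice cream in the United States of America"
print(len(sentence.split()))  # 10 space-separated pieces
print(tokenize(sentence))
# ['i', 'ate', 'ice cream', 'in', 'the united states of america']
```

The plain split reports ten pieces, while the lexicon-aware version reports five words, which shows how much the count depends on what is taken to be a word.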
The most difficult case of all, however, is that of oral languages, which may offer only phonolexical clues as to where word boundaries lie. Sign languages pose a similar problem, as does body language.
Established words, however, are generally documented in dictionaries of the language to which they belong.
Words in different classes of languages
In synthetic languages, a single word stem (for example, love) may have a number of different forms (for example, loves, loving, and loved). However, these are not usually considered to be different words, but different forms of the same word. In these languages, words may be considered to be constructed from a number of morphemes (such as love and -s).
Complexity of word boundaries in speech
In spoken language, the distinction of individual words is even more complex: short words are often run together, and long words are often broken up. Spoken French has some of the features of a polysynthetic language: il y est allé ("He went there") is pronounced /i.ljɛ.ta.le/. As the majority of the world's languages are not written, the scientific determination of word boundaries becomes important.
Determining word boundaries
There are five ways to determine where the word boundaries of spoken language should be placed:
- Potential pause
- A speaker is told to repeat a given sentence slowly, allowing for pauses. The speaker will tend to insert pauses at the word boundaries. However, this method is not foolproof: the speaker could easily break up polysyllabic words.
- Indivisibility
- A speaker is told to say a sentence out loud, and then is told to say the sentence again with extra words added to it. Thus, "I have lived in this village for ten years" might become "I and my family have lived in this little village for about ten or so years." These extra words will tend to be added at the word boundaries of the original sentence. However, some languages have infixes, which are put inside a word. Similarly, some have separable affixes; in the German sentence "Ich komme gut zu Hause an," the verb "ankommen" is separated.
- Minimal free forms
- This concept was proposed by Leonard Bloomfield. Words are thought of as the smallest meaningful units of speech that can stand on their own. This correlates phonemes (units of sound) to lexemes (units of meaning). However, some written words are not minimal free forms, as they make no sense by themselves (for example, "the" and "of").
- Phonetic boundaries
- Some languages have particular rules of pronunciation that make it easy to spot where a word boundary should be. For example, in a language that regularly stresses the last syllable of a word, a word boundary is likely to fall after each stressed syllable. Another example can be seen in a language that has vowel harmony (like Turkish): the vowels within a given word share the same quality, so a word boundary is likely to occur whenever the vowel quality changes. However, not all languages have such convenient phonetic rules, and even those that do present the occasional exceptions.
- Semantic units
- Much like the minimal free forms described above, this method breaks a sentence down into its smallest semantic units. However, language often contains words that have little semantic value (and often play a more grammatical role), or semantic units that are compound words.
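The two phonetic cues described in the list above can be sketched in Python as follows. This is only an illustration under strong assumptions: the input is already divided into lowercase syllables, stress is marked by hand, the front/back vowel sets are a simplified view of Turkish vowel harmony, and the function names and the stressed example syllables are invented for this sketch. As noted above, real languages present exceptions that such rules miss.

```python
# Simplified Turkish vowel classes (assumption: lowercase input only).
FRONT_VOWELS = set("eiöü")
BACK_VOWELS = set("aıou")


def vowel_class(syllable):
    """Return 'front' or 'back' for the first vowel found, else None."""
    for ch in syllable:
        if ch in FRONT_VOWELS:
            return "front"
        if ch in BACK_VOWELS:
            return "back"
    return None


def split_on_harmony_change(syllables):
    """Propose a word boundary wherever the vowel class changes."""
    words, current, current_class = [], [], None
    for syl in syllables:
        cls = vowel_class(syl)
        if current and cls and current_class and cls != current_class:
            words.append("".join(current))
            current, current_class = [], None
        current.append(syl)
        if cls:
            current_class = cls
    if current:
        words.append("".join(current))
    return words


def split_after_stress(syllables):
    """For a language with word-final stress: propose a boundary after
    each stressed syllable. `syllables` is a list of (text, stressed) pairs."""
    words, current = [], []
    for text, stressed in syllables:
        current.append(text)
        if stressed:
            words.append("".join(current))
            current = []
    if current:  # trailing unstressed syllables form a last candidate word
        words.append("".join(current))
    return words


# Turkish "evler" (houses) followed by "odalar" (rooms), pre-syllabified.
print(split_on_harmony_change(["ev", "ler", "o", "da", "lar"]))
# ['evler', 'odalar']

# Invented syllables with hand-marked stress.
print(split_after_stress([("ka", False), ("to", True), ("mi", True),
                          ("su", False), ("ne", False), ("la", True)]))
# ['kato', 'mi', 'sunela']
```

Both functions only propose candidate boundaries; a word that breaks vowel harmony or carries irregular stress would be split incorrectly, which is why these cues are combined with the other methods.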
In practice, linguists apply a mixture of all these methods to determine the word boundaries of any given sentence. Even with the careful application of these methods, the exact definition of a word is often still very elusive.
All in all, a word is a very powerful concept that permits us to communicate with others and interact with the rest of the world.