Zipf-Mandelbrot law
| Parameters | $N \in \{1, 2, 3, \ldots\}$ (integer), $q \in [0, \infty)$ (real), $s > 0$ (real) |
|---|---|
| Support | $k \in \{1, 2, \ldots, N\}$ |
| Probability mass function (pmf) | $\dfrac{1/(k+q)^s}{H_{N,q,s}}$ |
| Cumulative distribution function (cdf) | $\dfrac{H_{k,q,s}}{H_{N,q,s}}$ |
| Mean | $\dfrac{H_{N,q,s-1}}{H_{N,q,s}} - q$ |
| Median | |
| Mode | $1$ |
| Variance | |
| Skewness | |
| Excess kurtosis | |
| Entropy | |
| Moment-generating function (mgf) | |
| Characteristic function | |
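The probability mass function is given by:

$$f(k; N, q, s) = \frac{1/(k+q)^s}{H_{N,q,s}}$$

where $H_{N,q,s}$ is given by:

$$H_{N,q,s} = \sum_{i=1}^{N} \frac{1}{(i+q)^s}$$

which may be thought of as a generalization of a harmonic number. In the limit as $N$ approaches infinity, this becomes the Hurwitz zeta function $\zeta(s, q+1)$, using the convention $\zeta(s, a) = \sum_{n=0}^{\infty} (n+a)^{-s}$. For finite $N$ and $q = 0$ the Zipf-Mandelbrot law becomes Zipf's law. For infinite $N$ and $q = 0$ it becomes a Zeta distribution.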
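The definitions above translate directly into a few lines of code. The following is a minimal sketch in plain Python, not a reference implementation: the function names and the parameter values in the demonstration (`N = 10`, `q = 2.7`, `s = 1.1`) are illustrative assumptions. It evaluates the generalized harmonic number by direct summation, derives the pmf and cdf from it, and computes the mean as the probability-weighted sum over the support.

```python
def generalized_harmonic(n, q, s):
    """Return H_{n,q,s} = sum_{i=1}^{n} 1 / (i + q)**s by direct summation."""
    return sum(1.0 / (i + q) ** s for i in range(1, n + 1))


def pmf(k, n, q, s):
    """Probability that the rank equals k; zero outside the support 1..n."""
    if not 1 <= k <= n:
        return 0.0
    return (1.0 / (k + q) ** s) / generalized_harmonic(n, q, s)


def cdf(k, n, q, s):
    """Probability that the rank is at most k (k is an integer rank)."""
    if k < 1:
        return 0.0
    k = min(k, n)  # the cdf is 1 at and beyond the last rank
    return generalized_harmonic(k, q, s) / generalized_harmonic(n, q, s)


def mean(n, q, s):
    """Expected rank, computed as sum_k k * pmf(k)."""
    h = generalized_harmonic(n, q, s)  # normalize once, not per term
    return sum(k / (k + q) ** s for k in range(1, n + 1)) / h


# Illustrative parameter values (assumptions, not taken from the article).
N, q, s = 10, 2.7, 1.1
print("pmf over support:", [round(pmf(k, N, q, s), 4) for k in range(1, N + 1)])
print("cdf at k = 5:", cdf(5, N, q, s))
print("mean rank:", mean(N, q, s))
```

Setting `q = 0` in these functions gives Zipf's law on `N` ranks. Direct summation costs time proportional to `N`, which is fine for small supports; for very large `N` a library routine for the Hurwitz zeta function would be the more practical choice.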
Applications
The distribution of words ranked by their frequency in a corpus of writing is generally a power-law distribution, known as Zipf's law. If one plots the frequency rank of the words in a large corpus of text against their number of occurrences, one obtains a power-law distribution with exponent close to one (but see Gelbukh and Sidorov 2001).
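The rank-frequency relationship described above can be checked on any tokenized text. The sketch below is one rough way to do so, under stated assumptions: the helper names are hypothetical, the token list is synthetic (built so that the word at rank r occurs about 1200/r times), and the exponent is estimated with an ordinary least-squares line on the log-log values, which is a quick diagnostic and not a rigorous estimator for power laws.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, count) pairs, with rank 1 for the most frequent token."""
    counts = Counter(tokens)
    ordered = sorted(counts.values(), reverse=True)
    return list(enumerate(ordered, start=1))


def loglog_slope(pairs):
    """Least-squares slope of log(count) against log(rank)."""
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(count) for _, count in pairs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic corpus (an assumption for illustration): word r occurs 1200 // r times.
tokens = [f"w{r}" for r in range(1, 21) for _ in range(1200 // r)]
pairs = rank_frequency(tokens)
print("top ranks:", pairs[:5])
print("estimated exponent:", -loglog_slope(pairs))
```

On real text the fitted exponent depends on the corpus, the tokenization, and the range of ranks included in the fit, so a value near one should be treated as an empirical tendency.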
References and links
- B. Mandelbrot (1965). "Information Theory and Psycholinguistics", in B.B. Wolman and E. Nagel: Scientific psychology. Basic Books. Reprinted as
- B. Mandelbrot [1965] (1968). "Information Theory and Psycholinguistics", in R.C. Oldfield and J.C. Marshall: Language. Penguin Books.
- Z. K. Silagadze: Citations and the Zipf-Mandelbrot's law
- NIST: Zipf's law
- W. Li's References on Zipf's law
- Gelbukh and Sidorov 2001: Zipf and Heaps Laws’ Coefficients Depend on Language
This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.