*Last revision: Dec. 28, 2013*

**WALD TEST** in Statistics. This principle was introduced by
Abraham Wald in his "Tests of Statistical
Hypotheses Concerning Several Parameters When the Number of Observations is
Large," *Transactions of the American Mathematical Society*, **54**,
(1943), 426-482, although many long-established procedures can be interpreted
as Wald tests.

The term "Wald test" appears in S. D. Silvey "The Lagrangian
Multiplier Test," *Annals of Mathematical Statistics*, **30**, (1959),
pp. 389-407.

The scheme of three test principles (a *trinity*) based
on maximum likelihood estimates, viz. the Wald test, the Lagrange multiplier
test and the likelihood ratio test, was proposed in Silvey’s paper.

See the entries LAGRANGE MULTIPLIER TEST, LIKELIHOOD RATIO TEST and SCORE TEST.

**WALLIS’S FORMULA** for π. John
Wallis gave this in his *Arithmetica infinitorum* published in 1656.
See Encyclopaedia of Mathematics.

**WEIBULL DISTRIBUTION** appears
in the title of Julius Lieblein’s "On Moments of Order
Statistics from the Weibull Distribution," *Annals of Mathematical Statistics*,
**26**, (1955), 330-333. (David (2001))

The Swedish physicist Waloddi
Weibull used this distribution in his 1939 "A Statistical Theory of the Strength
of Materials" and went on to find new applications: see
"A Statistical Distribution Function of Wide Applicability," *Journal
of Applied Mechanics*, **18**, (1951), 293-297. The distribution
was already known as the third limiting form for extreme-value distributions:
see R. A. Fisher and L. H. C. Tippett,
"Limiting Forms of the Frequency Distribution of the
Largest or Smallest Member of a Sample," *Proceedings of the
Cambridge Philosophical Society*, **24**, (1928), 180-190.

See the entry EXTREME VALUE.

**WEIERSTRASS APPROXIMATION THEOREM.** Karl
Weierstrass published the theorem in his 1885
paper, “Über die analytische Darstellbarkeit sogenannter willkürlicher Funktionen
reeller Argumente,” *Sitzungsber. Akad.
Wiss. Berlin,* pp. 633–639; 789–805. It is reprinted in
*Werke* **3** pp. 1-37.
There is an important generalisation in the form of the STONE-WEIERSTRASS THEOREM.
See the *MathWorld* entry.

**WEIGHT** and **WEIGHTED.** The earliest
quotations for **weight** given by the OED are: "The arithmetical
mean of a set of observations .. is the particular case when the weights a,
a´, a´´ etc. are all equal, and the sum of the errors is equal to zero." (*Phil.
Mag.* **LXV**, (1825), p. 167) and "The method of finding an average is
this: multiply every observation by its weight and divide the sum of the products
by the sum of the weights." (A. De Morgan *Essay on Probabilities* (1838)
p. 138.)
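De Morgan's rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic he describes; the function name `weighted_mean` and the sample figures are assumptions made for the example and do not come from any of the sources quoted.

```python
def weighted_mean(observations, weights):
    """Multiply every observation by its weight and divide the sum
    of the products by the sum of the weights (De Morgan, 1838)."""
    if len(observations) != len(weights):
        raise ValueError("need exactly one weight per observation")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    # Sum of the products of each observation with its weight.
    sum_of_products = sum(x * w for x, w in zip(observations, weights))
    return sum_of_products / total_weight


# Hypothetical data: the third observation counts twice as much.
print(weighted_mean([1.0, 2.0, 4.0], [1, 1, 2]))
```

When all the weights are equal the result reduces to the ordinary arithmetical mean, which is the particular case mentioned in the 1825 quotation.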

The OED’s earliest quotation for **weighted mean** is "We may..call
the constant *c* the specific weight of the observations to which it applies,
and Σ*c*A ÷ Σ*c* the weighted mean." (*Encycl.
Metrop.* **II**, (**1845**) p. 443)

The term **weighted least squares** was surprisingly late
in arriving, given that Gauss had described the method in 1809 in his first
publication on least squares. David (2001) gives Karl Pearson’s
"Notes on the History of Correlation," *Biometrika*, **13**,
(1920), p. 26.

See MEAN and METHOD OF LEAST SQUARES.

**WELL-BEHAVED.** The expression has been in general English since the
time of Shakespeare but its use as a mathematical term of art appears to date
from the 1930s. A *JSTOR* search
found Adams and Clarkson writing in “Properties of Functions *f*(*x, y*)
of Bounded Variation,” *Transactions of the
American Mathematical Society*, **36**,
(1934), p. 712 of “functions which are to a certain extent well behaved,
perhaps to the extent of belonging to the Baire classification.” The *OED*’s earliest citation is from Boyer
*Concepts of Calculus* (1939): “Inasmuch
as Euler restricted himself to well-behaved functions, he did not become
involved in those subtle difficulties connected with the notions of infinity.” [John Aldrich]

See the entry PATHOLOGICAL.

**WELL-ORDERED.** The term *wohlgeordnet* was used by Cantor in an extensive paper,
"Über unendliche lineare Punctmannichfaltigkeiten," which appeared in
*Mathematische
Annalen* in six parts between 1879 and 1884. In part five, which appeared in
vol. 21 (1883), (page 548)
(or *Collected Papers* (p.168)) he wrote:

By a well-ordered set we understand any well-defined set whose elements are related by a well-determined given succession according to which there is a first element in the set and for any element (if it is not the last one) there is a certain next following element. Furthermore, for any finite or infinite set of elements there is a certain element which is the next following one for all these elements (except for the case that such an element which is the next following one to these elements does not exist).

This translation was taken from *Cantor’s Philosophical Views* by Walter Purkert.

When an English literature developed the term was
translated as *well-ordered*, an expression that had been in English since
the sixteenth century. A *JSTOR* search found *well-ordered* in E.
H. Moore “On the Theory of Improper Definite Integrals,” *Transactions of the
American Mathematical Society*, **2**, (1901), p. 473 and A. N. Whitehead
“On Cardinal Numbers,” *American Journal of Mathematics*, **24**,
(1902), p. 384.

**WEYL’S EQUIDISTRIBUTION THEOREM.** "Equidistribution" is in the title of the paper in which Hermann Weyl (1885-1955)
(NAS biographical memoir)
published the theorem: "Über die Gleichverteilung von Zahlen mod. Eins,"
*Mathematische
Annalen,* **77**, (1916) 313-352.
However, as G. H. Hardy and E. M. Wright *An Introduction to the Theory of Numbers*
(1938, p. 381) point out, "the theorem seems to have been found independently, at about the same time, by
Bohl,
Sierpinski
and Weyl."

**WHITE NOISE.** Originally the term referred to a form of sound or of electrical interference
but it now also refers to a type of random process. "Inside the plane ... we hear all frequencies
added together at once, producing a noise which is to sound what white light is to light." (L. D. Carson, W. R.
Miles & S. S. Stevens, "Vision, Hearing and Aeronautical Design," *Scientific Monthly,* **56,**
(1943), 446-451). S. Goldman’s book on radio engineering, *Frequency Analysis, Modulation and Noise*
(1948), has a mathematical treatment of white noise.

By 1953 white noise had entered the stochastic process literature, as in "On the Fourier Expansion
of Stationary Random Processes" by R. C. Davis (*Proceedings of the American Mathematical Society,* **4,**
564-569) [John Aldrich].

**WHOLE NUMBER.** See INTEGER.

**WIENER-HOPF equation, factorization, technique** are terms associated with the paper by
Norbert Wiener
and
Eberhard Hopf
"Über eine Klasse singulärer Integralgleichungen," Sitzber. Deutsch. Akad. Wiss. Berlin, Kl. Math Phys. Tech, 1931, pp. 696-706.

**WIENER PROCESS** appears in M. Kac’s "On Deviations Between Theoretical and Empirical
Distributions," *Proc. Nat. Acad. Sciences,* **35,** (1949), 252-257. The name recalls
N. Wiener’s
analysis of "the Brownian movement" in "Differential-space"
*J. Math. and Phys.* **2** (1923) 131-174. (See BROWNIAN MOTION.)
[John Aldrich]

The **WILCOXON RANK-SUM TEST** and **SIGNED RANK TEST** were proposed in
Frank Wilcoxon
(1892-1965) "Individual Comparisons by Ranking Methods," *Biometrics
Bulletin*, **1**, (1945), 80-83. The properties of these tests were
studied in a stream of papers beginning in the early 1950s, including J. Hemelrijk
"Note on Wilcoxon’s Two-Sample Test when Ties are Present," *Annals
of Mathematical Statistics*, **23**, (1952), 133-135. David (2001) writes
of the signed-rank test that this "clever and helpful term was coined by Tukey
(1949) in an unpublished but repeatedly cited technical report ["The simplest
signed-rank tests."]."

**WILSON’S THEOREM** was given its name by Edward Waring
(1734-1798) for his friend, John Wilson (1741-1793). The first
published statement of the theorem was by Waring in his
*Meditationes algebraicae* (1770), although manuscripts in the
Hanover Library show that the result had been found by Leibniz.

**WINDOW** in Statistics, particularly time series analysis. The term was introduced in R. B.
Blackman & J. W. Tukey’s “The Measurement of Power Spectra,” *Bell System Technical Journal,* **37**,
(1958). It appears in several forms, including *data window, lag window* and *spectral window*. An
alternative term in some of these uses is KERNEL. The first window
to be proposed for estimating the spectral density was the so-called DANIELL WINDOW. [John Aldrich]

**WINSORIZED** is found in 1960 in W. J. Dixon,
"Simplified Estimation from Censored Normal Samples," *The Annals
of Mathematical Statistics,* **31**, 385-391. Dixon explains the term,
"Winsor [4] and perhaps others have suggested using for the magnitude of an
extreme, poorly known, or unknown observation the value of the next largest
(or smallest) observation." The reference is to a personal communication from
Charles P. Winsor (1895-1951). (David, 1998).

J. W. Tukey writes that, when he first met Winsor in 1941,
"he had already developed a clear and individual philosophy about the proper
treatment of "wild shots" ... It seems only appropriate, then, to attach
his name to the process of replacing certain of the most extreme of the observations
in the sample by the nearest unaffected values, to speak of Winsorizing or Winsorization."
(from "The Future of Data Analysis," *Annals of Mathematical Statistics*,
**33**, (1962), p. 18.)
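The process Tukey describes can be illustrated with a short Python sketch. It assumes a symmetric scheme in which the `k` smallest and `k` largest observations are each replaced by the nearest unaffected value; the function name `winsorize`, the parameter `k` and the sample data are hypothetical and are not taken from Dixon or Tukey.

```python
def winsorize(sample, k):
    """Replace the k smallest and k largest observations by the
    nearest unaffected values, keeping the original order."""
    n = len(sample)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= k < n / 2")
    ordered = sorted(sample)
    # Nearest unaffected values at the lower and upper ends.
    low, high = ordered[k], ordered[n - k - 1]
    # Pull every observation outside [low, high] back to the boundary.
    return [min(max(x, low), high) for x in sample]


# Hypothetical sample with one "wild shot" at each end.
print(winsorize([1, 50, 3, 2, 4, -40], 1))
```

Unlike trimming, which discards the extreme observations, this keeps the sample size unchanged.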

See TRIMMING and DATA ANALYSIS.

**WISHART DISTRIBUTION.** The title of
John Wishart’s
"The Generalised Product Moment
Distribution in Samples from a Normal Multivariate Population," *Biometrika*,
**20A** (1928), 32-52, describes exactly what he was interested in. However
the distribution he derived, a multivariate generalisation of χ²,
is almost always called the Wishart distribution.

**WITCH OF AGNESI.** Luigi
Guido Grandi (1671-1742) studied this curve in 1703 and is believed to have
been the first to call it *versiera* or *versoria* in Latin, meaning
"turning in every direction." According to Boyer in *History of
Analytic Geometry,* Grandi coined the Italian word *la versiera* in
1718. The term appears in Father Guido Grandi’s commentary on the *Trattato
del Galileo del moto naturalmente accelerato* (*Opere di G. Galilei,*
III, Firenze, 1718, p. 393): "...sarebbe quella curva, che io descrivo
nel mio libro delle quadrature alla prop. 4, nata da seni versi, che da me suole
chiamarsi la versiera in latino però versoria..."

In 1748, Maria
Gaetana Agnesi (1718-1799), in *Istituzioni Analitiche,* the first
calculus book written by a woman, also called the curve *la versiera,*
using the name twice.

The British mathematician John Colson (1680-1760), translating
Agnesi’s work into English, translated the Italian word
*versiera* as "the Witch." He wrote, "...and therefore
[equation] or [equation] will be the equation of the curve to be
described, which is vulgarly called the Witch." He also wrote, "Let
the curve to be described be that of Prob. III. n. 238, called the
Witch, the equation of which is [equation]." Colson gave the name a
third time, in a marginal note, "Another example of the curve called
the Witch."

According to the translator’s preface to the 1801 English edition of
*Analytical Institutions,* Colson learned Italian for the sole
purpose of translating this work.

The expression *witch of Agnesi* is found in English in 1875 in *An elementary treatise on the integral calculus*
by Benjamin Williamson (1827-1916): “Find the area between the witch of Agnesi *xy*² = 4*a*²(2*a* – *x*)
and its asymptote” (*OED*).

See the entry CAUCHY DISTRIBUTION and
*MathWorld* Witch of Agnesi.

**WITHOUT LOSS OF GENERALITY.** *Without any loss of generality* is found in 1842
*Mathematical Tracts on the Lunar and Planetary Theories* by George Biddell Airy:
"But it is plain that, without any loss of generality, we may get rid of *A* by
altering the origin of time from which *t* is reckoned, or the origin of
linear measure from which *x* is reckoned." [Google print search]

*Without loss of generality* is found in 1843 in "On the theory of determinants" by Arthur Cayley
in *Trans. Camb. Phil. Soc.* 8:
"Hence, without loss of generality, the theorems which follow may be stated with reference to a single marked column only"
[University of Michigan Digital Library].

A Google print search found the abbreviation WLOG in 1964 in *Group Theory* by W. R. Scott. There the abbreviation has
an asterisk and the footnote explains what it stands for.

The **WOLD DECOMPOSITION (theorem)** by which a stationary series is expressed
as a sum of a deterministic component and a stochastic component which can itself
be expressed as an infinite moving average was given in
Herman Wold’s
*A Study in the Analysis of Stationary Time Series* (1938).
The name only became common in the 1960s. (*JSTOR* search.)

**WORKING HYPOTHESIS** occurs in 1868-1870 in *Essays, philosophical and theological*
by James Martineau:
"Mr. Mansel entreats us to hold, and to guide our footsteps; calling them
'regulative truths,' by which he means the best working hypothesis we are
able to attain of the character and purposes of God" [University of Michigan Digital Library].

**WORKING MATHEMATICIAN.** In an article "The Ignorance of
Bourbaki" (The Mathematical Intelligencer vol. 14, no 3, 1992), A.
R. D. Mathias suggests that this phrase is due to Bourbaki. However,
Carlos César de Araújo has found it in a paper by
Eliakim Hastings Moore, "On the foundations of mathematics" (Bull. A.
M. S., 1903, p. 406).

The term **WRONSKIAN** (for Hoëné Wronski)
was coined by Thomas Muir (1844-1934) in 1881 (Cajori 1919, page 310).