*Last revision: Oct. 17, 2016*

**ST. ANDREW’S CROSS** is the term used by Florian Cajori for the
multiplication symbol X. It appears in 1916 in his "William
Oughtred, A Great Seventeenth-Century Teacher of Mathematics."

*St. Andrew’s cross* is found in 1615, although not in a
mathematical context, in Crooke, *Body of Man*: "[They] doe
mutually intersect themselues in the manner of a Saint Andrewes
crosse, or this letter X" (OED2).

The **ST. PETERSBURG PARADOX** was formulated by Niklaus Bernoulli in 1713:
see problem 5 in the first letter of
Correspondence of Nicholas Bernoulli concerning the
St Petersburg game with Montmort, Daniel Bernoulli and Cramer (translation by
Richard J. Pulskamp.)
The association with St. Petersburg came about because the most prominent discussion was published
there: this was Daniel Bernoulli’s "Specimen Theoriae Novae de Mensura
Sortis," *Commentarii Academiae Scientiarum Imperialis Petropolitanae,*
**5**, 175-192 (1738). The paper has been translated as "Exposition
of a New Theory on the Measurement of Risk," *Econometrica*, **22**,
(1954), 23-36.

In 1768 D'Alembert (*Opuscules Mathématiques*, vol. IV, p. 78;
English translation by Richard J. Pulskamp)
used the phrase "le probléme de petersbourg." J. Bertrand’s
*Calcul des probabilités*
(1889, p. 62) has a section on the "Paradoxe de Saint-Pétersbourg" and the "paradox" appears
in English in J. M. Keynes’s *A Treatise on Probability* (1921).

[John Aldrich, based on Jacques Dutka, "On the St. Petersburg paradox,"
*Arch. Hist. Exact Sci.* 39, No.1, 1988 and David (2001)]

See MORAL EXPECTATION and UTILITY.

**SADDLE POINT** is found in 1922 in *A Treatise on the Theory
of Bessel Functions* by G. N. Watson (OED2).

**SAGITTA** was used in Latin by Fibonacci (1220) to mean the
versed sine (Smith, vol. 2). See VERSED SINE.

In 1726 *Alberti’s Archit.* has: "The .. Line .. from the middle
Point of the Chord up to the Arch, leaving equal Angles on each Side,
is call'd the Sagitta" (OED2).

*Webster’s New International Dictionary* (1909) has the
following definition for *sagitta*: "the distance from a point
in a curve to the chord; also, the versed sine of an arc; -- so
called (by Kepler) from its resemblance to an arrow resting on the
bow and string; also, *Obs.,* an abscissa."

The 1961 third edition of the same dictionary has the following definition: "the distance from the midpoint of an arc to the midpoint of its chord."
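The 1961 definition can be turned into a small calculation. The following is a minimal sketch, assuming a circular arc of radius `radius` cut off by a chord of length `chord`, and the modern convention versin θ = 1 − cos θ; the function names are illustrative and do not come from any of the dictionaries cited.

```python
import math


def sagitta(radius: float, chord: float) -> float:
    """Distance from the midpoint of a circular arc to the midpoint of its chord."""
    half = chord / 2.0
    if radius <= 0 or half > radius:
        raise ValueError("need radius > 0 and chord no longer than the diameter")
    # The chord's midpoint lies sqrt(r^2 - (c/2)^2) from the centre.
    return radius - math.sqrt(radius**2 - half**2)


def versed_sine(angle: float) -> float:
    """Modern versed sine: versin(angle) = 1 - cos(angle), angle in radians."""
    return 1.0 - math.cos(angle)


# A chord of length 8 in a circle of radius 5 forms a 3-4-5 right triangle.
r, c = 5.0, 8.0
half_angle = math.asin((c / 2.0) / r)  # half the angle subtended at the centre
print(sagitta(r, c))                   # sagitta computed from the chord
print(r * versed_sine(half_angle))     # same length via the versed sine
```

Under these assumptions the sagitta equals the radius times the versed sine of half the subtended angle, which is one way to read the link between the two senses given in the 1909 definition.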

**SALIENT ANGLE.** The OED2 has a 1687 citation for *Angle
Saliant.*

In 1781 Sir John T. Dillon wrote in *Travels Through Spain*: "He
could find nothing which seemed to confirm the opinion relating to
the salient and reentrant angles" (OED2).

*Mathematical Dictionary and Cyclopedia of Mathematical Science*
(1857) has: "SALIENT ANGLE of a polygon, is an interior angle, less
than two right angles."

See also CONVEX POLYGON.

**SAMPLE.** See POPULATION.

**SAMPLE PATH.** This term seems to have originated in sequential analysis and
then was transferred to stochastic processes in general. JSTOR gives one pre-1950
reference, to Anscombe (1949) "Large-Sample Theory of Sequential Estimation,"
*Biometrika,* **36,** 455-458 [John Aldrich].

**SAMPLE SPACE** was introduced into statistical theory by J. Neyman and E. S. Pearson, "On the Problem
of the Most Efficient Tests of Statistical Hypotheses,"
*Philosophical
Transactions of the Royal Society, A,* **231**,
(1933), 289-337. It was associated with the representation of
a sample comprising n numbers as a point in n-dimensional space, a representation
R. A. Fisher had exploited in articles going back to
1915.
W. Feller used this notion of sample space in his "Note on regions similar
to the sample space," *Statist. Res. Mem.,* Univ. London 2, 117-125
(1938) but in the *Introduction to Probability Theory and its Applications,
volume one* (1950) Feller used the term quite abstractly for the set of outcomes
of an experiment. He attributed the general concept to Richard von Mises (1883-1953)
who had referred to the *Merkmalraum* (label space) in writings on the
foundations of probability from 1919 onwards: see his "Grundlagen der Wahrscheinlichkeitsrechnung,"
*Math.
Zeit.* **5**, (1919), 52-99.

This entry was contributed by John Aldrich. See also EXPERIMENT.

**SAMPLING DISTRIBUTION.** R. A. Fisher seems to have introduced this term. It appears incidentally
in a 1922 paper
(*JRSS*, **85,** 598)
and then in the title of his 1928 paper
"The
General Sampling Distribution of the Multiple Correlation Coefficient",
*Proc. Roy. Soc. A,* **121,** p. 654.

**SCALAR.** See VECTOR.

**SCALAR PRODUCT.** See VECTOR PRODUCT.

**SCALAR QUANTITIES.** In 1646 Viète used *magnitudines scalares* to refer to a set of quantities in continual geometrical proportion. The term was not adopted by others. [James A. Landau]

**SCALENE.** In Sir Henry Billingsley’s 1570 translation of
Euclid’s *Elements* *scalenum* is used as a noun: "Scalenum
is a triangle, whose three sides are all unequall."

In 1642 *scalene* is found in a rare use as a noun, referring to
scalene triangle in *Song of Soul* by Henry More: "But if 't
consist of points: then a Scalene I'll prove all one with an
Isosceles."

*Scalenous* is found in 1656 in Stanley, *Hist. Philos.*
(1687): "A Pyramid consisteth of four triangles,..each whereof is
divided..into six scalenous triangles."

*Scalene* occurs as an adjective in 1684 in *Angular
Sections* by John Wallis: "The Scalene Cone and Cylinder."

The earliest use of *scalene* as an adjective to describe a
triangle is in 1734 in *The Builder’s Dictionary.* (All
citations are from the OED2.)

**SCATTER DIAGRAM.** According to H. L. Moore, *Laws of Wages* (1911), the term "scatter diagram"
was due to Karl Pearson. A *JSTOR* search finds the term first appearing in a 1906 article in
*Biometrika* (which Pearson edited), "On the Relation Between the Symmetry of the Egg and
the Symmetry of the Embryo in the Frog (Rana Temporaria)" by J. W. Jenkinson. However the term only
came into wide use in the 1920s when it began to appear in textbooks, e.g. F. C. Mills,
*Statistical Methods* of 1925. OED2 gives the following quotation from Mills:
"The equation to a straight line, fitted by the method of least squares to the points on the scatter
diagram, will express mathematically the average relationship between these two variables" (X. 366)
[John Aldrich].

*Scattergram* is found in 1938 in A. E. Waugh, *Elem.
Statistical Method*: "This is the method of plotting the data on a
scatter diagram, or scattergram, in order that one may see the
relationship" (OED2).

*Scatterplot* is found in 1939 in *Statistical Dictionary of
Terms and Symbols* by Kurtz and Edgerton (David, 1998).

For the history of this form of graphical representation of data see
Michael Friendly, Daniel Denis "The early origins and development of the scatterplot,"
*Journal of the History of the Behavioral Sciences,* **41,** Issue 2 (2005), pp. 103-130.

**SCHLICHT** is a loan-word from the German literature on *Funktionentheorie* (complex analysis). It entered the English literature
in the 1920s and is still used. A *schlicht* function “takes no value more than once” explains J. E. Littlewood in his 1925
article “On inequalities in the theory of functions,” *Proc. London Math. Soc*., **23**, p. 481. Authors of English textbooks
have usually preferred other terms. “Biuniform” is used by P. Dienes *The Taylor Series* (1931). “Simple” is used
by E. T. Copson *An Introduction to the Theory
of Functions of a Complex Variable* (1935) presumably because in
ordinary non-mathematical German *schlicht* means “simple” or “plain.”
Copson reports that the French term is “univalent.”
L. V. Ahlfors *Complex Analysis* (1953, p. 172) remarks that *schlicht* “lacks an adequate translation.”
Ahlfors prefers *univalent* and that is now the commonest term in English.

See UNIVALENT.

**SCHMIDT ORTHOGONALIZATION.** See the entry GRAM-SCHMIDT ORTHOGONALIZATION.

The term **SCHUR COMPLEMENT** was introduced in 1968 by Emilie V. Haynsworth
(1916-1985) and named for the German mathematician
Issai Schur (1875-1941) and his lemma of 1917 in
“Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind,”
*Journal für die reine und angewandte Mathematik*,
**147**, (1917), 205-232.
However the historical notes in chapter 0 of Fuzhen Zhang (ed.) *The Schur Complement and Its Applications* (2005)
identify “implicit manifestations” in the work of Sylvester in 1851 and even Laplace in 1812.

**SCHUR PRODUCT**. See HADAMARD or SCHUR PRODUCT.

**SCHWARZ INEQUALITY.** See CAUCHY-SCHWARZ INEQUALITY.

**SCHWARZ’ THEOREM** on the equality of mixed partial derivatives.
See CLAIRAUT’S THEOREM, SCHWARZ’ THEOREM, and YOUNG’S THEOREM.

**SCIENTIFIC NOTATION.** In 1895 in *Computation Rules and
Logarithms* Silas W. Holman referred to the notation as "the
notation by powers of ten." In the preface, which is dated August
1895, he wrote: "The following pages contain ... an explanation of
the use of the notation by powers of ten ... the notation by powers
of 10, as in the explanation here given. It seems unfortunate that
this simple notation, so useful in computation and so great an aid in
the explanation of numerical relations, is not universally
incorporated into arithmetical instruction." [James A. Landau]

In *A Scrap-Book of Elementary Mathematics* (1908) by William F.
White, the notation is called the *index notation.*

*Scientific notation* is found in 1921 in *An Introduction to
Mathematical Analysis* by Frank Loxley Griffin: "*To write out
in the ordinary way* any number given in this 'Scientific
Notation,' we simply perform the indicated multiplication -- i.e.,
move the *decimal point* a number of places equal to the
exponent, supplying as many zeros as may be needed."

According to *Webster’s Second New International Dictionary*
(1934), numbers in this format are sometimes called *condensed
numbers.*

Other terms are *exponential notation* and *standard
notation.*

**SCORE, METHOD OF SCORING** and **SCORE TEST** in Statistics.
The derivative of the log-likelihood
function played an important part in R. A. Fisher’s theory of maximum likelihood
from its beginnings in the 1920s but the name score is more recent. The "score"
was originally associated with a particular genetic application; a family is
assigned a score based on the number of children of each category and there
were different ways of scoring associated with different ways of estimating linkage.
In a 1935 paper ("The Detection of Linkage with Dominant Abnormalities,"
*Annals of Eugenics,* 6, 193) Fisher wrote that, because of the efficiency
of maximum likelihood, the "ideal score" is provided by the derivative
of the log-likelihood function. In 1948 C. R. Rao used the phrase *efficient
score* (*Proc. Cambr. Philos. Soc.* 44, 50-57) and *score* by itself
(*J. Roy. Statist. Soc.,* B, 10: 159-203) when writing about maximum likelihood
*in general,* i.e. without reference to the linkage application. Today
"score" is so established in this derivative of the log-likelihood
sense that the phrases "non-ideal score" or "inefficient score"
would convey nothing.

In 1946 - still in the genetic context - Fisher ("A System
of Scoring Linkage Data, with Special Reference to the Pied Factors in Mice,"
*Amer. Nat.,* 80: 568-578) described an iterative method for obtaining
the maximum likelihood value. Rao’s 1948 *J. Roy. Statist. Soc. B* paper
treats the method in a more general framework and the phrase "Fisher’s
method of scoring" appears in a comment by Hartley. Fisher had already
used the method in a general context in his 1925
"Theory
of Statistical Estimation" paper
(*Proc. Cambr. Philos. Soc.* 22: 700-725) but it attracted neither attention
nor name.

In 1948 Rao introduced a test which he called the **efficient
score test** in his "Large sample tests of statistical hypotheses concerning
several parameters with applications to problems of estimation," *Proc. Cambr.
Philos. Soc*, **44**, 50-57. The one-parameter version had already been
discussed by Wald in "Some Examples of Asymptotically Most Powerful Tests," *Annals of Mathematical
Statistics*, **12**, (1941), 396-408. While "Rao’s efficient score
test" is sometimes seen, the terms "score test" and "Lagrange multiplier test"
are more common.

This entry was contributed by John Aldrich, with some information taken from David (1995). See the entries LAGRANGE MULTIPLIER TEST and WALD TEST.

**SECANT** (in trigonometry) was introduced by Thomas Fincke
(1561-1656) in his *Thomae Finkii Flenspurgensis Geometriae rotundi
libri XIIII,* Basileae: Per Sebastianum Henricpetri, 1583. (His
name is also spelled Finke, Finck, Fink, and Finchius.) Fincke wrote
*secans* in Latin.

Vieta (1593) did not approve of the term *secant,* believing it
could be confused with the geometry term. He used
*Transsinuosa* instead (Smith vol. 2, page 622).

**SECOND DIFFERENCE** is found in 1777 in "A Method of finding the Value of an
infinite Series of decreasing Quantities of a certain Form," by Francis Maseres
in the *Philosophical Transactions of the Royal Society* vol. 67:
"And 2dly, let these numbers be so related to
each other, that they not only shall form a decreasing progression
themselves, but that their differences, *a-b, b-c,
c-d, d-e, e-f, f-g, g-h,* &c. shall also form a decreasing
progression, so that *b-c* shall be less than *a-b,* and *c-d*
than *b-c,* and *d-e* than *c-d,* and so on of the following differences;
and likewise, that the differences of these differences (which may be
called *the second differences* of the original numbers *a, b, c, d, e,
f, g, h,* &c.) shall form a decreasing progression; and that the differences
of those second differences, or *the third differences* of the original numbers
*a, b, c, d, e, f, g, h,* &c. shall also form a decreasing progression;
and in like manner, that the differences of the said third differences, or
*the fourth differences,* of the original numbers *a, b, c, d, e, f, g, h,*
&c. and the fifth and sixth differences, and all higher differences, of the
same numbers, shall also form decreasing progressions."
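Maseres's construction is easy to reproduce numerically. The sketch below assumes his sign convention (each term minus the one that follows it, as in *a-b, b-c*) and uses the reciprocals 1, 1/2, 1/3, ... purely as an illustrative decreasing sequence; the function names are hypothetical.

```python
from fractions import Fraction


def differences(seq):
    """First differences in Maseres's convention: a-b, b-c, c-d, ..."""
    return [x - y for x, y in zip(seq, seq[1:])]


def nth_differences(seq, n):
    """Apply the differencing step n times (n=2 gives the second differences)."""
    for _ in range(n):
        seq = differences(seq)
    return seq


def is_decreasing(seq):
    """True when every term is strictly smaller than the one before it."""
    return all(y < x for x, y in zip(seq, seq[1:]))


# Illustrative sequence only: exact reciprocals 1, 1/2, ..., 1/8.
terms = [Fraction(1, k) for k in range(1, 9)]
for order in range(1, 4):
    d = nth_differences(terms, order)
    print(order, [str(x) for x in d], is_decreasing(d))
```

For this particular sequence the first, second and third differences each form a decreasing progression, which is the condition the passage lays down.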

**SECULAR EQUATION.** See EIGENVALUE.

**SELF-CONJUGATE.** Kramer (p. 388) says Galois used this
term, referring to a normal subgroup.

The term **SEMI-CUBICAL PARABOLA** was coined by John Wallis
(Cajori 1919, page 181).

The term **SEMIGROUP** apparently was introduced in French as
*semi-groupe* by J.-A. de Séguier in *Élem. de la
Théorie des Groupes Abstraits* (1904).

**SEMI-INVARIANT** or **HALF-INVARIANT.**
T. N. Thiele
(1838-1910) introduced the concept
in a Danish work, called in English, *The General Theory of Observations*
(1889). His "half-invariants" did not fare well in Britain. They were noticed
by Karl Pearson in his
"Contributions
to the Mathematical Theory of Evolution. II. Skew Variation in Homogeneous Material,"
*Philosophical Transactions of the Royal Society A*, **186**, (1895),
p. 412 but only to be dismissed. Arne Fisher, a Danish emigrant to the United
States, gave an account of semi-invariants in his *Mathematical Theory of
Probabilities* (2^{nd} edition, 1922); "semi-invariant" was the more
common English rendering. In 1929 Ronald Fisher began publishing on *cumulative
moment functions* (see CUMULANT) and several correspondents told him that these were identical to
Thiele’s semi-invariants. See the index entry *cumulant* in J. H. Bennett
*Statistical
Inference and Analysis: Selected Correspondence of R. A. Fisher* (1990). Fisher did not agree and did not change
his terminology, which became the accepted one.

This entry was contributed by John Aldrich based on S L Lauritzen (2002) *Thiele: Pioneer in Statistics*.

**SENTENTIAL CALCULUS** is found in English in 1937 in
a translation by Amethe Smeaton of *The Logical Syntax of Language*
by Rudolf Carnap: "Primitive sentences of the sentential calculus" (OED2).

**SEPARABLE** is found in 1803 in *A General History of Mathematics* translated from the French of
John Bossut: “No one could completely accomplish the object: but a great number of cases were
pointed out, in which the indeterminates are separable, and in which the equations may consequently
be resolved by the quadratures of curves.” [Google print search]

**SEQUENCE.** The OED2 shows a use by Sylvester in 1882
in the *American Journal of Mathematics* with the "rare"
definition of a succession of natural numbers in order.

*Sequence* is found in 1891 in a translation by George Lambert Cathcart
of the German *An introduction to the study of the elements of the differential and integral calculus*
by Axel Harnack:
"What conditions must be fulfilled in order that for continually
diminishing values of Δ*x*, the quotient ... may present a continuous sequence of numbers
tending to a determinate limiting value: zero, finite or infinitely great?" [University of
Michigan Historical Math Collection; the term may be considerably older.]

**SEQUENTIAL ANALYSIS** was developed
in the Second World War by statisticians in the USA and in Britain. In 1945
Abraham Wald published his "Sequential Tests of Statistical Hypotheses,"
*Annals of Mathematical Statistics*, **16**, (1945), 117-186 and followed
it up with a book *Sequential Analysis* in 1947. The terms **sequential
analysis**, **sequential test** and **sequential probability ratio test**
all appear in the 1945 article. (David (2001))

**SERIAL CORRELATION.** The term was introduced
by G. U. Yule in his 1926 paper "Why Do We Sometimes Get Nonsense Correlations
between Time-series? A Study in Sampling and the Nature of Time-series,"
*Journal of the Royal Statistical Society,* **89,** 1-69: "I propose
to term such correlations, *r*_{1} between *u*_{s}
and *u*_{s+1}, *r*_{2} between *u*_{s}
and *u*_{s+2}, etc., where *u*_{s} is the value of
the variable in year s, the *serial correlations* for the given series."
(p. 14) (David 2001).
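Yule's definition translates directly into a calculation: pair each value with the value *k* steps later and correlate the two columns. The following is a minimal sketch under that reading, using the ordinary product-moment correlation of the lagged pairs; the function names and the sample data are illustrative, and later authors often use slightly different estimators.

```python
import math


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def serial_correlation(u, k):
    """r_k: correlation between u_s and u_{s+k} over all available pairs."""
    if k < 1 or len(u) - k < 2:
        raise ValueError("need k >= 1 and at least two lagged pairs")
    return pearson(u[:-k], u[k:])


# Hypothetical yearly values, for illustration only.
u = [3.1, 3.4, 3.9, 4.1, 3.8, 3.5, 3.6, 4.0, 4.4, 4.3]
print([round(serial_correlation(u, k), 3) for k in (1, 2, 3)])
```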

**SERIES.** According to Smith (vol. 2, page 481), "The early
writers often used *proportio* to designate a series, and this
usage is found as late as the 18th century."

John Collins (1624-1683) wrote to James Gregory on Feb. 2, 1668/1669, "...the Lord Brouncker asserts he can turne the square roote into an infinite Series" (DSB, article: "Newton").

James Gregory wrote to John Collins on Feb. 16, 1671 [apparently O. S.]: "I do not question that all equations may be formed by tables, but I doubt exceedingly if all equations can be solved by the help only of the tables of logarithms and sines without serieses."

According to Smith (vol. 2, page 497), "The change to the name 'series' seems to have been due to writers of the 17th century. ... Even as late as the 1693 edition of his algebra, however, Wallis used the expression 'infinite progression' for infinite series."

In the English translation of Wallis' algebra (translated by him and published in 1685), Wallis wrote:

Now (to return where we left off:) Those Approximations (in the Arithmetick of Infinites) above mentioned, (for the Circle or Ellipse, and the Hyperbola;) have given occasion to others (as is before intimated,) to make further inquiry into that subject; and seek out other the like Approximations, (or continual approaches) in other cases. Which are now wont to be called by the name of The Infinite Series, or Converging Series, or other names of like import.

**SET** and **SET THEORY** are the modern English equivalents
for the German terms *Menge* and *Mengenlehre* adopted by
Georg Cantor
(1845-1918) at the end of the 19^{th} century and used by subsequent
German writers. Their French counterparts are *ensemble* and *théorie
des ensembles*. Before they acquired the new meaning the words *Menge,*
*ensemble* and *set* were established non-technical
words in their respective languages. Each word had several meanings and the
meanings by no means coincided: see dictionary
entries. To add to the complexity, the word *Menge* was not Cantor’s original choice and in English a
number of words have been used.

In 1796, William Frend used the phrase “set of numbers” in
*The Principles of Algebra*. This use of the word was found by Stanley Burris, who writes, “This
was certainly not an influential book since Frend did not
accept negative numbers, but it suggests the use of the word *set* in math texts
may have been common.”

In English the *OED* records the use of *set* for a
collection of things (musical instruments, say) from the 17^{th} century.
In the 19^{th} century the word is found in mathematical contexts. Thus in the
*Lectures
on Quaternions* (1853) Hamilton used the word “set” and even once
the term “theory of sets.” Hamilton used “set” to mean what we would call an
“*n*-tuple,” that is, a set of numbers which
could be used as a coordinate in n-dimensional analytic geometry. However this
usage did not become established and *set* only arrived as a specialised technical term in the 20^{th} century.

In German the old word *Menge* also began to be used in technical
contexts in the 19^{th} century. E.g. in von Staudt’s *Geometrie der Lage* (2nd ed., 1856):
"Wenn man die Menge aller in einem und demselben reellen einfoermigen
Gebilde enthaltenen reellen Elemente durch n + 1 bezeichnet und mit diesem
Ausdrucke, welcher dieselbe Bedeutung auch in den acht folgenden Nummern hat,
wie mit einer endlichen Zahl verfaehrt, so ...".

The modern importance of the term *Menge/set* is due to Cantor. His work is
described both in general works like Kline (ch. 41) and in monographs like J.
W. Dauben *Georg Cantor: His Mathematics and
Philosophy of the Infinite* (1979).

Dauben describes the emergence of the *Menge* terminology on p. 170. In his early
work Cantor used the term *Mannichfaltigkeit*,
Riemann’s term often translated as MANIFOLD.
The terms *Menge* and *Mengenlehre* appear in a note to the article
“Über unendliche, lineare Punktmannichfaltigkeiten” (Part 5),
*Mathematische
Annalen,* **21** (1883), 545-591, issued separately
as a pamphlet *Grundlagen einer allgemeinen
Mannichfaltigkeitslehre*. The meaning of the term *Mannichfaltigkeitslehre* is discussed in a note on p. 587 of the
article. Cantor distinguishes this general theory from geometrical *Mengenlehre* and goes on to explain, “By an
‘aggregate’ [Mannichfaltigkeit] or ‘set’ [Menge] I mean generally any multitude
which can be thought of as a whole, i.e., any collection of definite elements
which can be united by a law into a whole.” Dauben’s translation.

However Cantor did not adopt *Menge* and *Mengenlehre* as *the* terms until later. Both
are used in his “Beiträge zur Begründung der Transfiniten Mengenlehre,”
*Mathematische
Annalen,* **46** (1895), 481-512. The opening words are:

By a “set” [„Menge”] we mean any collection *M* into a whole of definite, distinct objects *m* (which are called the “elements” of *M*) of our perception or of our thought.

Translation from Dauben (p. 170).

Beginning in 1883 Cantor’s papers were translated
into French for publication in Mittag-Leffler’s journal *Acta Mathematica*.
There the terms *ensemble* and *théorie des ensembles* were used. The first of the series was
“Une contribution à la théorie des ensembles,” *Acta Mathematica*, **2**, 311-328 (1883)
from “Ein Beitrag zur Mannigfaltigkeitslehre,”
*Journal für die reine und angewandte Mathematik*
**84** (1878) 242-258. The French terms
did not change when Cantor adopted *Menge*.

While the French terms were approved by Cantor
the English terms, which came somewhat later, were at the writer’s discretion
and there was more variation. In 1903 the mathematical logician Bertrand
Russell treated Cantor’s *Menge* as
equivalent to the logical term *class* (see CLASS). In
analysis the terms *aggregate* and *set* were used and co-existed for some
decades.

In 1901 E. H. Moore declared, “It is
convenient to use *set* as the
equivalent of *Menge* and *ensemble.*” “Concerning Harnack’s Theory of
Improper Definite Integrals,” *Transactions
of the American Mathematical Society*, **2**, p. 297.

However, the practice of E. W. Hobson was more typical for the time. In
*The
Theory of Functions of a Real Variable and the Theory of Fourier’s Series* (1907, p. v)
he wrote of “the Theory of Sets of Points, also known
in its more general aspect, as the Theory of Aggregates.”

*Theory of point sets* is found in 1912 in volume II of
*Lectures
on the Theory of Functions of Real Variables* by James Pierpont:
“After the epoch-making discoveries inaugurated in 1874 by G. Cantor in the
theory of point sets...” (Preface p. 5)

In the well-known volume of translations of Cantor’s
1895 papers, *Contributions to the Founding
of the Theory of Transfinite Numbers* (1915, re-issued by Dover 1955),
the translator, P. E. B. Jourdain, renders *Menge* and *Punktmenge* as *aggregate*
and *point-aggregate.* In the preface Jourdain states that the
broad field is usually described as “the theory of aggregates” or “the theory
of sets.”

*Set theory* is found in 1926 in Orrin Frink “The Operations of Boolean Algebras,”
*Annals of Mathematics* (2d ser.) XXVII. p. 487.
(*JSTOR*).

The term *axiomatic set theory* came into circulation in the 1930s, while *naïve set theory* was used occasionally in
the 1940s, becoming an established term in the 1950s. It appears in Hermann
Weyl’s review of P. A. Schilpp (ed) *The Philosophy of Bertrand Russell*
in the *American Mathematical Monthly*, **53**, No. 4. (1946), p. 210
and Laszlo Kalmar’s review of *The Paradox of Kleene and Rosser* in
*Journal of Symbolic Logic*, **11**, No. 4. (1946), p. 136. (*JSTOR*)

[John Aldrich, James A. Landau, and Ken Pledger contributed to this entry.]

For set theory symbols see *Earliest Use of Symbols*.

**SEXAGESIMAL** appears in *A Proposal About Printing A treatise of
Algebra* by John Wallis, which was circulated in 1683: "The *Sexagesimal*
Fractions (introduced it seems by Ptolemy) did but imperfectly supply
the want of such a Method of Numerical Figures."

**SHEAF** has been used
for a family of rays or planes that pass through a given point. Among the quotations given by the *OED* are these:
“A sheaf of calorific rays” from Tyndall *Heat* 1863 and
“A *sheaf* (*sheaf of planes, sheaf of lines*)
is a figure made up of planes or straight lines, all of which pass through a
given point (the *centre* of the
sheaf)” from an 1885 translation of Cremona’s *Elements of Projective Geometry*.

More recently *sheaf* has been used in algebraic topology as a translation of the French word “faisceau.”
The term was introduced by Jean Leray in
his “L'anneau d'homologie d'une representation,”
*Comptes rendus de l'Académie des sciences,* **222**, (1946),
1366-1368. The earliest quotation in the *OED* is:

The French word ‘faisceau’ has been translated into English as ‘sheaf’ or ‘stack’. In this paper we use the word ‘stack’, since ‘sheaf’ has been used before in mathematics.

*Ann. Math.* **62**, (1955), p. 56. Alas, “stack” did not take off.

This entry was contributed by John Aldrich. For sheaf theory see the entry in the Encyclopedia of Mathematics.

**SHEPPARD’S CORRECTIONS** are adjustments to moments calculated from grouped data, proposed
by W. F. Sheppard
(1863-1936). The phrase *Sheppard’s corrections* appears
in Karl Pearson’s "Mathematical Contributions to the Theory of
Evolution. X. Supplement to a Memoir on Skew Variation,"
*Philosophical Transactions
of the Royal Society A*, **197**, (1901), p. 451. Pearson refers to Sheppard’s "On
the Calculation of the Most Probable Values of Frequency Constants for data
arranged according to Equidistant Divisions of a Scale,"

(David (2001))

**SHORT DIVISION** is found in 1777 in *The Man of Business and Gentleman’s Assistant* by William Perry:
“Short Division is when the Divisor does not exceed 12.”
[Google print search]

**SHRINKAGE and SHRINKING** in statistical estimation theory. The terms were introduced by J. R.
Thompson in “Some Shrinkage Techniques for Estimating the Mean,” *Journal of
the American Statistical Association*, **63**, (1968), 113-122.
Particular shrinkage estimators had been investigated by earlier writers including
Karlin and Bartlett but they did not introduce any terms. [John Aldrich]

**SIBLING.** The OED2 shows two citations for *sibling* from
the Middle Ages. In both cases, the word had the obsolete meaning of
"one who is of kin to another; a relative."

*Sibling* does not appear in the 1890 Funk & Wagnalls
unabridged dictionary.

The OED2 shows a use of *sib* to mean "brother or sister" in
1901.

After the two citations from the Middle Ages, the next citation in
the OED2 for *sibling* is by Karl Pearson in 1903 in
*Biometrika,* where the word is used in its modern sense: "These
[calculations] will enable us .. to predict the probable character in
any individual from a knowledge of one or more parents or brethren
('siblings', = brothers or sisters)."

In 1931, a translation by E. & C. Paul of *Human Heredity*
by E. Baur et al. has: "The word ‘sib’ or ‘sibling’ is coming into
use in genetics in the English-speaking world, as an equivalent of
the convenient German term 'Geschwister'" (OED2).

**SIEVE OF ERATOSTHENES** is attributed to
Eratosthenes of Cyrene.
The expression is found in English in 1772 in
“ΚΟΣΚΙΝΟΝ ΕΡΑΤΟΣΘΕΝΟΥΣ .
or, The Sieve of Eratosthenes. Being an Account of His Method of Finding All the
Prime Numbers,” by the Rev. Samuel Horsley, F. R. S.
*Philosophical
Transactions (1683-1775)*, **62**, (1772), pp. 327-347. (*JSTOR*
search.) See the Wikipedia
entry. [John Aldrich]
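The method itself is short enough to state as code. The following is a minimal modern sketch of the sieve, not a transcription of Horsley's account; starting the crossing-out at the square of each prime is a standard optimisation assumed here, and the function name is illustrative.

```python
def sieve_of_eratosthenes(limit):
    """Return all primes <= limit by repeatedly striking out multiples."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Smaller multiples of p were already struck out by smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


print(sieve_of_eratosthenes(30))
# → [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```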

**SIGN OF AGGREGATION** is found in 1863 in *The Normal: or,
Methods of Teaching the Common Branches, Orthoepy, Orthography,
Grammar, Geography, Arithmetic and Elocution* by Alfred Holbrook:
"The signs of aggregation are the bar ___, which signifies that the
numbers over which it is placed are to be taken together as one
number; also, the parenthesis, (); the brackets, []; and the braces,
{}, which signify that the quantities enclosed by them respectively
are to be taken together, as one quantity."

In 1900 in *Teaching of Elementary Mathematics,* David Eugene
Smith wrote: "Signs of aggregation often trouble a pupil more than
the value of the subject warrants. The fact is, in mathematics we
never find any such complicated concatenations as often meet the
student almost on the threshold of algebra."

The **SIGN TEST** seems to be the *oldest* formal significance test,
for it was used by Dr. John Arbuthnott, “Physitian in Ordinary to Her Majesty,” in
“An Argument for Divine Providence, taken from the constant
Regularity observed in the Births of both Sexes,” *Philosophical
Transactions of the Royal Society of London* **27**, (1710-1712), 186-190.

F. R. Helmert used the test and named it (in German) in 1905.

The sign test appears in R. A. Fisher’s
*Statistical
Methods for Research Workers* (1925, ch. V, Example 19) where
it is compared with a *t*-test. [This entry was contributed by John Aldrich,
using David (2001).]
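In its modern form the test reduces to a binomial tail probability. The sketch below assumes the usual textbook formulation (signs equally likely under the null hypothesis, ties discarded beforehand, two-sided exact *p*-value); it is not taken from Arbuthnott, Helmert or Fisher, and the function name is hypothetical.

```python
from math import comb


def sign_test_p_value(n_plus, n_minus):
    """Exact two-sided sign test under H0: each sign has probability 1/2."""
    n = n_plus + n_minus
    if n == 0:
        raise ValueError("need at least one non-tied observation")
    k = min(n_plus, n_minus)
    # Probability of seeing k or fewer of the rarer sign in n fair trials.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


print(sign_test_p_value(10, 0))  # all ten signs agree
print(sign_test_p_value(7, 3))   # a milder imbalance
```

When every observation has the same sign, as in an unbroken run of years with more births of one sex, the one-sided probability is simply (1/2) raised to the number of observations.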

**SIGNATURE** of a quadratic form. C. C. MacDuffee *The Theory of
Matrices* (1933, p. 57) attributes the term to Frobenius. *Die Signatur* appears in his “Ueber das
Trägheitsgesetz der quadratischen Formen,”
*Journal
für die reine und angewandte Mathematik*, **114**, (1895), p. 187. See
the entry LAW OF INERTIA.

**SIGNED NUMBER.** *Signed magnitude* appears in 1873 in
*Proc. Lond. Math. Soc.*: "A signed magnitude" (OED2).

*Signed number* appears in the title "The [Arithmetic]
Operations on Signed Numbers" by Wilson L. Miser in *Mathematics
Magazine* (1932).

**SIGNIFICANCE.** Significance testing is almost as old as the theory of probability.
Three hundred years ago Dr. John Arbuthnott, "Physitian in Ordinary to
Her Majesty," tested the hypothesis that the probability of a male birth is equal to that of a female birth in
"An Argument for
Divine Providence, taken from the constant
Regularity observed in the Births of both Sexes," *Philosophical
Transactions of the Royal Society of London* **27**, (1710-1712), 186-190.
However the terminology of *significance* is more recent.

*Significant* is found in 1885 in F. Y. Edgeworth, "Methods of Statistics,"
*Jubilee Volume, Royal Statistical Society,* pp. 181-217: "In order
to determine whether the observed difference between the mean stature of 2,315
criminals and the mean stature of 8,585 British adult males belonging to the
general population is significant [etc.]" (OED2).

*Significance* is found in 1888 in
*Logic of Chance* by John Venn:
"As before, common sense would feel little doubt that such a difference was significant,
but it could give no numerical estimate of the significance" (p. 486) (OED2).

The terms *test of significance* and *significance
test* were used before the 1920s but only rarely. A JSTOR search finds *significance
test* in Oswald H. Latter "The Egg of Cuculus Canorus. An Enquiry into
the Dimensions of the Cuckoo’s Egg and the Relation of the Variations to the
Size of the Eggs of the Foster-Parent, with Notes on Coloration, &c," *Biometrika*,
**1**, (1902), p. 168.

The expression *test of significance* was very prominent
in R. A. Fisher’s *Statistical
Methods for Research Workers* (1925). This book introduced the
related terms *level of significance* (p. 161), *5 per cent point*
(p. 198) and *statistical significance* (p. 218).

*Testing the significance*
is found in Student’s "New tables for testing the significance
of observations," *Metron* 5 (3) pp 105-108 (1925).

*Statistically significant* is
found in 1931 in L. H. C. Tippett, *Methods of Statistics*: "It is
conventional to regard all deviations greater than those with probabilities
of 0.05 as real, or statistically significant" (OED2).

Curiously no "significance" terms appear in the
famous Neyman and Pearson 1933 paper, although Neyman uses "level of significance"
in his textbook *First Course in Probability and Statistics* (1950), where
it is identified with the probability of committing an error of the first kind
(p. 265).

This entry was contributed by John Aldrich. See also HYPOTHESIS AND HYPOTHESIS TESTING.

**SIGNIFICANT DIGIT.** Smith (vol. 2, page 16) indicates
Licht used the term in 1500, and shows a use of "neun bedeutlich
figuren" by Grammateus in 1518.

In 1544, Michael Stifel wrote, "Et nouem quidem priores, significatiuae uocantur."

*Signifying figures* is found in 1542 in Robert Recorde, *Gr.
Artes* (1575): "Of those ten one doth signifie nothing... The
other nyne are called Signifying figures" (OED2).

*Significant figures* is found in 1660 in Milton, *Free
Commw.*: "Only like a great Cypher set to no purpose before a long
row of other significant Figures" (OED2).

*Significant figures* is found in the first edition of the
*Encyclopaedia Britannica* (1768-1771) in the article "Arithmetick":
"Of these, the first nine, in contradistinction to the cipher, are
called *significant figures.*"

*Mathematical Dictionary and Cyclopedia of Mathematical Science*
(1857) has this definition:

SIGNIFICANT. Figures standing for numbers are called significant figures. They are 1, 2, 3, 4, 5, 6, 7, 8, and 9.

*Non-significant digit* is found in January 1900 in Neal H.
Ewing, "The Shakespeare Name," *Catholic World*: "Naught is the
non-significant digit; though it means nothing, yet it counts for so
much."

An article in *The Mathematics Teacher* in October 1939
explains that zero is sometimes a "significant figure."

**SIMILAR.** In 1557 Robert Recorde used *like* in the
*Whetstone of Witte*: "When the sides of one plat forme, beareth
like proportion together as the sides of any other flatte forme of
the same kinde doeth, then are those formes called *like
flattes* .. and their numbers, that declare their quantities, in
like sorte are named *like flattes*" (OED2).

In the manuscript of his *Characteristica Geometrica* which was
not published by him, Leibniz wrote "similitudinem ita notabimus:
*a* ~ *b.*"

In 1660 Isaac Barrow used *like* in his *Euclid*: "If in a
triangle *FBE* there be drawn *AC* a parallel to one side
*FE,* the triangle *ABC* shall be like to the whole
*FBE*" (OED2).

In English, *similar triangles* is found in 1704 in *Lexicon
technicum*: "*Similar Triangles* are such as have all their
three Angles respectively equal to one another" (OED2).

**SIMILAR (applied to a matrix)** was introduced by Frobenius (as "ähnlich") in
"Ueber lineare Substitutionen und bilineare Formen,"
*J.
reine angew. Math.* 84 (1879) p. 21. This is according to C. C.
MacDuffee, *The Theory of Matrices,* Springer (1933).

**SIMILAR REGION** was introduced by J. Neyman and E. S. Pearson in
"On the Problem
of the Most Efficient Tests of Statistical
Hypotheses," *Philosophical Transactions of the Royal Society
of London. Series A,* **231**, (1933), 289-337. (David, 1995.)

See also HYPOTHESIS AND HYPOTHESIS TESTING.

**SIMPLE CLOSED CURVE** occurs in 1873 in
"On Listing’s Theorem" by Arthur Cayley in the *Messenger of Mathematics*
[University of Michigan Historical Math Collection].

**SIMPLEX.** William Kingdon Clifford (1845-1879) used the
term *prime confine* in "Problem in Probability,"
*Educational Times,* Jan. 1886
[There could be an error in this citation, as Clifford had died in 1879.]:

Now consider the analogous case in geometry of *n* dimensions. Corresponding to a closed area and a closed volume we have something which I shall call a *confine.* Corresponding to a triangle and to a tetrahedron there is a confine with *n* + 1 corners or vertices which I shall call a *prime confine* as being the simplest form of confine.

In a post to a math history mailing list in 2004,
John Stillwell wrote, "It is true that Poincare uses the idea of simplex in his pioneering
works of algebraic topology, but he does not seem to use the actual
word *simplex.* In his 1st Complement a l'analysis situs of 1900 he
calls it a *generalized tetrahedron.*
The first use of the word (in our sense) I can find is in Schoute’s
*Mehrdimensionale Geometrie* of 1902. On p. 9 of volume 1 he suggests
the name "Simplicissimum", because it is the simplest piece of
d-dimensional space. Then on p. 10 he decides to call it a "simplex"
for short."

**SIMPLEX METHOD** is found in Robert Dorfman, "Application of the
simplex method to a game theory problem," *Activity Analysis of
Production and Allocation,* Chap. XXII, 348-358 (1951).

*Simplex approach* is used in 1951 by George B. Dantzig (1914-2005)
in T. C. Koopmans’s *Activity Analysis of Production and
Allocation* xxi. 339: "The general nature of the 'simplex'
approach (as the method discussed here is known)" (OED2).

**SIMPLY ORDERED SET** was defined by Cantor in *Mathematische
Annalen,* vol. 46, page 496.

**SIMPSON’S PARADOX** is the name
given to a result in conditional probability by C. R. Blyth: "The paradox is
the possibility of *P*{*A* | *B*} < *P*{*A* | *B*'}
while *P*{*A* | *B*} ≥ *P*{*A* | *B*'}
both under the additional condition *C* and under the complement
*C*' of that condition." ("On Simpson’s Paradox and the Sure-Thing
Principle", *Journal of the American Statistical Association*, **67**,
(1972), p. 364.)
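
A small numerical case may make Blyth's inequalities concrete. The counts in the Python sketch below are invented purely for illustration and come from neither Blyth nor Simpson; the names `counts`, `rate` and `pooled` are likewise hypothetical.

```python
from fractions import Fraction

# Hypothetical counts: (number with A, group size) for B and B' in each stratum.
counts = {
    "C": {"B": (9, 10), "B'": (80, 100)},
    "C'": {"B": (30, 100), "B'": (2, 10)},
}


def rate(group):
    """Conditional relative frequency of A within one group."""
    hits, size = group
    return Fraction(hits, size)


def pooled(label):
    """Relative frequency of A for B or B' after amalgamating the strata."""
    hits = sum(counts[stratum][label][0] for stratum in counts)
    size = sum(counts[stratum][label][1] for stratum in counts)
    return Fraction(hits, size)


# Within each stratum the frequency of A given B is at least that given B'.
within = {s: (rate(g["B"]), rate(g["B'"])) for s, g in counts.items()}
# After pooling, the inequality is reversed.
overall = (pooled("B"), pooled("B'"))

print(within)
print(overall)
```

With these counts the frequency of *A* is higher under *B* than under *B*' in each stratum (9/10 against 4/5, and 3/10 against 1/5), yet lower once the strata are pooled (39/110 against 41/55). The reversal arises because *B* is concentrated in the stratum where *A* is rare.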

Blyth altered the details of the inequalities underlying
the "curious case" discussed in paragraphs 8-10 of E. H. Simpson’s "The
Interpretation of Interaction in Contingency Tables", *Journal of the
Royal Statistical Society, B*, **13**, (1951), pp. 238-241. More importantly,
however, the novelty and interest of the case were not in the inequalities and
the possible conflict between the unconditional (total) analysis and the conditional
(partial) analysis but rather in Simpson’s demonstration that some situations
require the one and some the other and there is nothing in the numbers to say
which is required.

The possibility of a conflict between total and partial analyses was first
noticed in 1899 by Pearson, Lee & Bramley-Moore
("Mathematical
Contributions to the Theory of Evolution. - VI. Genetic (Reproductive) Selection: Inheritance of Fertility in Man, and
of Fecundity in Thoroughbred Racehorses," *Philosophical Transactions
of the Royal Society A,* **192**, (1899), p. 278):
"We are thus forced to the conclusion that a mixture of heterogeneous groups, each
of which exhibits no organic correlation, will exhibit a greater or less amount
of correlation. This correlation may properly be called spurious . . ."
G. U. Yule gave more attention to the phenomenon and called the correlation
(or association) in the population formed from the mixing of records "fictitious"
in his 1903 *Biometrika* paper "Notes on the Theory
of Association of Attributes in Statistics" (p. 143) and "illusory"
in his 1911 book *Introduction to the Theory of Statistics* (pp. 49ff.).
In the case of conflict Pearson and Yule considered the partial analyses the
relevant ones and the total analysis suspect. [John Aldrich]

Given that Simpson referred to Yule and Yule to Pearson, Blyth’s
choice of term may seem surprising and unfortunate. I. J. Good and Y. Mittal
"The Amalgamation and Geometry of Two-by-Two Contingency Tables," *Annals
of Statistics*, **15**, (1987), p. 695 consider "Simpson’s paradox" an
instance of Stigler’s law of eponymy. Some authors prefer the names "Simpson-Yule
Paradox" or "Yule Paradox." Good and Mittal use the impersonal term "amalgamation
paradox." [John Aldrich]

See ASSOCIATION, CORRELATION, EPONYMY and SPURIOUS CORRELATION.

**SIMPSON’S RULE** for the numerical evaluation of an integral is named for
Thomas
Simpson. However, E. T. Whittaker and G. Robinson note, “This formula
[generally known as Simpson’s or the parabolic rule] was first given (in a
geometrical form) by Cavalieri [1639], and later by James Gregory [1668] and by
Thomas Simpson [1743].” (*Calculus of Observations* (1924, p. 156))
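
For readers who want to see the parabolic rule at work, here is a minimal Python sketch of the composite form found in modern textbooks. It is not the geometrical statement of Cavalieri, Gregory or Simpson; the function name `simpson` and the default of 100 subintervals are assumptions made for the example.

```python
import math


def simpson(f, a, b, n=100):
    """Approximate the integral of f over [a, b] using n (even) subintervals."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior ordinates carry weight 4 at odd indices and 2 at even ones.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


approx = simpson(math.sin, 0.0, math.pi)
print(approx)  # close to 2, the exact value of the integral
```

Because each pair of subintervals is fitted by a parabola, the rule is exact for polynomials up to the third degree; applied to *x*³ on [0, 1] with only two subintervals it returns 1/4.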

*Simpson’s rule* is found in
1856 in *A treatise on land-surveying* by William Mitchell Gillespie:
“When the line determined by the offsets is a curved line, ‘Simpson’s
rule’ gives the content more accurately” [University of Michigan Digital
Library].

In 1911, *Elements of the Differential and Integral
Calculus* by William Anthony Granville has “Simpson’s rule (parabolic
rule).” This may also appear in the earlier 1904 edition.

The expression *Simpson’s
rule* has been used in other ways reflecting Simpson’s broad interests. One
use was in the theory of annuities and another in algebra. In the latter sense
it appears in 1851 in *Bonnycastle’s introduction to algebra* by John
Bonnycastle [University of Michigan Digital Library].

**SIMSON LINE.** The theorem was attributed to Robert Simson
(1687-1768) by François Joseph Servois (1768-1847) in
Gergonne’s Journal, according to Jean-Victor Poncelet in
*Traité des propriétés projectives des
figures.* The line does not appear in Simson’s work and is
apparently due to William Wallace. [The University of St. Andrews
website]

**SIMULATE, SIMULATOR, SIMULATION.** Words of this family have been in English
since the fourteenth century (*OED*)
but the modern technical meaning dates from around 1950 and was a response to
the development of the modern COMPUTER. F. C. Williams & F. J. U. Ritson’s “Electronic Servo Simulators,”
*Jrnl. Inst. Electr. Engineers* XCIV. IIA. (1947),
p. 112 describes a form of analogue computer: “This paper
presents an outline of a method which will allow automatic
control systems to be studied experimentally by means of an electronic
device called a ‘simulator,’ which is constructed so as to have
the same characteristic equation as the control system.” (The *OED* has a different quotation from the same paper.) For
digital computers, a *JSTOR* search
found the term in use in “A Technique for Real Time
Simulation of a Rigid Body Problem,” by H. J. Gray, Jr.; M. Rubinoff; H. Sohon in *Mathematical Tables and Other Aids to Computation*,
**7**, (1953), pp. 73-77.

See also MONTE CARLO.

**SIMULTANEOUS EQUATIONS** is found in 1820 in *A Collection of Examples
of the Applications of the Differential and Integral Calculus*
by George Peacock:
"Let us take the three linear simultaneous equations of the second order...."
[Google print search]

**SIMULTANEOUS EQUATIONS MODEL** in ECONOMETRICS.
The model was formulated and an estimation method proposed by Trygve Haavelmo in
"The Statistical Implications of a System
of Simultaneous Equations," *Econometrica*, **11**,
(1943), pp. 1-12. His starting point was
"if one assumes that the economic variables considered satisfy, simultaneously,
several stochastic relations, it is usually *not* a
satisfactory method to try to determine each of the equations separately from
the data without regard to the restrictions which the *other*
equations might impose upon the same variables." (p. 2) In 1989 Haavelmo received the
Nobel Prize in Economics for this work. The term
"simultaneous equations model" entered currency in the early 1950s. For the
history of the development of the model see M. S.
Morgan *A History of Econometric Ideas*, Cambridge 1990.

Much of the standard terminology associated with the model was created by T. C. Koopmans. See the entries ENDOGENOUS/EXOGENOUS VARIABLE and IDENTIFIABILITY.

**SINE.** The word has come with some distortion from Sanskrit through Arabic
and Latin. Accounts differ on the details but the basic story is this: the
Sanskrit *jya* (“chord”) was taken into Arabic as *jiba* but
the word that was translated into Latin was not this word but *jaib* (“bay”) and this became *sinus* (“bay” or “curve”)
which was anglicized as *sine*.

The account of Indian trigonometry in Katz (6.6) begins
with a fragment dating from the early fifth century which contains a table of “half-chords.”
However, the first fully preserved work is the *Aryabhatiya* of
Aryabhata
the Elder. This small astronomical treatise, completed in 499, gave a summary of Hindu
mathematics up to that time.
Aryabhata used the terms *ardha-jya* (“half-chord”) and *jya-ardha* (“chord-half”), and
abbreviated them to *jya* (“chord”). The word *jya* derives from “bowstring.”

From *jya* the Arabs phonetically derived *jiba,* which, following the practice in Arabic of
omitting vowels, was written as *jb.* Accounts differ on who was responsible
for the subsequent confusion with *jaib* and who first translated this word into Latin.

In some accounts *sinus* first appears in Latin in a translation of the *Algebra* of
al-Khowarizmi by Gherard of Cremona (1114-1187). For example, Eves (page
177) writes:

Later writers, coming across *jb* as an abbreviation for the meaningless *jiba,* substituted *jaib* instead, which contains the same letters and is a good Arabic word meaning "cove" or "bay." Still later, Gherardo of Cremona (ca. 1150), when he made his translations from the Arabic, replaced the Arabian *jaib* by its Latin equivalent, *sinus,* whence came our present word *sine.*

Boyer (page 278) places the first appearance of *sinus* in a translation of 1145:

When Robert of Chester came to translate the technical word *jiba,* he seems to have confused this with the word *jaib* (perhaps because vowels were omitted); hence he used the word *sinus,* the Latin word for "bay" or "inlet." Sometimes the more specific phrase *sinus rectus,* or "vertical sine," was used; hence the phrase *sinus versus,* or our "versed sine," was applied to the "sagitta," or the "sine turned on its side."

Smith (vol. 1, page 202) writes that the Latin *sinus* "was probably first used in
Robert of Chester’s revision of the tables of al-Khowarizmi."

According to Cajori (1906), the Latin term *sinus* was introduced in a translation of
the astronomy of Al Battani by Plato of Tivoli (or Plato Tiburtinus).

The term *sinus* was adopted by European mathematicians in their own writings and it appeared in
various phrases. In his *Practica Geometriae* (1220) Fibonacci used the expressions *sinus rectus
arcus* and *sinus versus arcus.* Regiomontanus (1436-1476) used *sinus, sinus
rectus,* and *sinus versus* in *De triangulis omnimodis* (On
triangles of all kinds; Nuremberg, 1533) [James A. Landau]. Smith (vol. 2, page
617) points out that not everyone used the term and that Rheticus (c. 1560)
preferred *perpendiculum*.

The Latin word *sinus* went into English in two forms, *sinus* and *sine*. The *OED* has
citations for both *sinus* and *sine* in the
sense of a “gulf” or “bay” but it describes the latter usage as obsolete. The
word *sine* has survived only in
the mathematical sense and the earliest citation in the *OED* is to
Thomas Fale in 1593:

This Table of Sines may seem obscure and hard to those who are not acquainted with Sinicall computation.

From *Horologiographia. The art of dialling: teaching an easie and perfect way to make all kinds of
dials vpon any plaine plat howsoeuer placed: With the drawing of the twelue
signes, and houres vnequall in them all....*

Naturally when English mathematicians wrote in Latin
they used the word *sinus* but the *OED* does not report any use of the word in the mathematical
sense in English texts. In French mathematics the word is *sinus*.

See also the entry **COSINE.**

The term **SINGLE-VALUED FUNCTION** (meaning analytic function)
was used by Yulian-Karl Vasilievich Sokhotsky (1842-1927).

The term **SINGULAR INTEGRAL** (in the theory of differential equations) is due to Lagrange
*Oeuvres, 3,* pp. 549-575 (Kline, page 532).

The term is found in 1831 in *Elements of the Integral Calculus*
(1839) by J. R. Young:

We see, therefore, that it is possible for a differential equation to have other integrals besides the complete primitive, but derivable from it by substituting in it, for the arbitrary constant *c,* each of its values given in terms of *x* and *y* by the equation (5). Such integrals are called *singular integrals,* or *singular solutions* of the proposed differential equation.

**SINGULAR INTEGRAL** (in the theory of integration). A. L. Cauchy
introduced l’intégrale singulière in the "Mémoire sur les intégrales
définies," (presented in 1814 but published in 1827)
*Oeuvres Ser. 1*, **1**
p. 394. (F. Smithies *Cauchy and the Creation of Complex Function Theory*.)
See Mathworld.

**SINGULAR MATRIX.** *Singular matrix* and *non-singular
matrix* occur in 1907 in *Introduction to Higher Algebra* by
Maxime Bôcher: "Definition 2. A square matrix is said to be
singular if its determinant is zero."
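
Bôcher's definition translates directly into a test. The Python sketch below is illustrative only: the names `determinant` and `is_singular` are hypothetical, and cofactor expansion is chosen for clarity with small integer matrices, not for efficiency.

```python
def determinant(m):
    """Determinant of a square matrix (list of rows) by cofactor expansion."""
    n = len(m)
    if n == 0 or any(len(row) != n for row in m):
        raise ValueError("matrix must be square and non-empty")
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: delete the first row and the j-th column.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * determinant(minor)
    return total


def is_singular(m):
    """A square matrix is singular when its determinant is zero."""
    return determinant(m) == 0


print(is_singular([[1, 2], [2, 4]]))  # second row is twice the first
print(is_singular([[1, 2], [3, 4]]))
```

The comparison with zero is exact for integer or rational entries. With floating-point entries a tolerance would be needed, which is one reason numerical work usually judges singularity by other means.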

**SINGULAR POINT** appears in a paper by George Green published in
1828. The paper also contains the synonymous phrase "singular value"
[James A. Landau].

*Singular point* appears in 1836 in the second edition of
*Elements of the Differential Calculus* by John Radford Young.
According to James A. Landau, who supplied this citation, it is not
clear what the author meant by the term. Landau writes, "Judging by
the contents of Chapter IV, to the author 'singular point' was the
name of the category to which 'multiple points,' 'cusps,' and 'points
of inflexion' belong."

In *An Elementary Treatise on Curves, Functions and Forces*
(1846), Benjamin Peirce writes, "Those points of a curve, which
present any peculiarity as to curvature or discontinuity, are called
*singular points.*"

**SINGULAR VALUE** and **SINGULAR VALUE DECOMPOSITION.** The paper most often cited in connection
with the singular value decomposition of a matrix is C. Eckart & G. Young “The Approximation
of One Matrix by Another of Lower Rank,” *Psychometrika*, **1**, (1936), 211-218. The result, however,
is much older.

G. W. Stewart considers five mathematicians who were responsible for establishing the existence of the singular value decomposition and developing its theory: “Beltrami, Jordan, and Sylvester came to the decomposition through what we should now call linear algebra; Schmidt and Weyl approached it from integral equations.” It is interesting to compare the development of EIGENVALUE theory.

Apart from Beltrami (1873), all the contributions are available on the web: C. Jordan
“Sur la réduction des formes bilinéaires,”
*Comptes Rendus,* **78** (1874), 614-617;
J. J. Sylvester “On the reduction of a bilinear quantic of the *n*th order to the form of a sum of *n* products
by a double orthogonal substitution,” *Messenger of Mathematics*, **19**, (1889), 42-46
(*Papers
IV*, 655); E. Schmidt “Zur Theorie der linearen und nichtlinearen
Integralgleichungen. I Teil. Entwicklung willkürlichen Funktionen nach System vorgeschriebener,”
*Math.
Ann.,* **63**, (1907), 433-476; H. Weyl “Das asymptotische
Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen
(mit einer Anwendung auf die Theorie der Hohlraumstrahlung),”
*Math.
Ann.,* **71** (1912), 441-479.

These authors used a variety of terms, e.g. Sylvester
referred to “canonical multipliers.” The term “singular value” has been used in
this context since 1908 but not yet quite in the way it is used today. The term is
used with its modern meaning by F. Smithies “The Eigen-Values and Singular
Values of Integral Equations,” *Proc. London
Math. Soc.*, **43**, (1938), 255-279. Eckart and Young do not use any special
terminology although they refer to Sylvester’s paper.

[This entry was contributed by John Aldrich, based on G. W. Stewart “On the Early History of the
Singular Value Decomposition,” *SIAM Review*, **35**, (1993), 551-566.]

**SIZE** (of a critical region) is found in 1933 in J. Neyman and E. S. Pearson,
"On the Problem
of the Most Efficient Tests of Statistical Hypotheses," *Philosophical Transactions of the Royal Society
of London, Ser. A,* **231**, (1933), 289-337 (David (2001)).

See also HYPOTHESIS AND HYPOTHESIS TESTING.

**SKEW DISTRIBUTION** and **SKEW CURVE** appear
in 1895 in Karl Pearson’s
Contributions
to the Mathematical Theory of Evolution. II. Skew Variation in Homogeneous Material,
*Philosophical Transactions of the Royal Society A*, **186**, 343-414. [James A. Landau]

**SKEW SYMMETRIC MATRIX.** *Skew symmetric determinant*
appears in 1849 in Arthur Cayley, *Jrnl. für die reine und
angewandte Math.* XXXVIII. 93: "Ces déterminants peuvent
être nommés 'gauches et symmétriques'" (OED2).

*Skew symmetric determinant* appears in 1885 in *Modern Higher
Algebra* by George Salmon: "A *skew symmetric* determinant is
one in which each constituent is equal to its conjugate with its sign
changed."

*Skew symmetric matrix* appears in "Linear Algebras," Leonard
Eugene Dickson, *Transactions of the American Mathematical
Society,* Vol. 13, No. 1. (Jan., 1912).

**SKEWES NUMBER** appears in 1949 in Kasner & Newman,
*Mathematics and the Imagination*: "A veritable giant is
Skewes' number, even bigger than a googolplex" (OED2).

**SLIDE RULE.** Soon after the introduction of LOGARITHMS
devices for multiplication were developed that incorporated the principle
of multiplication by addition. Gunter’s scale was introduced in 1620; it had
no moving parts.

Cajori gives Edmund Wingate the credit for devising the first
slide rule in 1630, although William Oughtred’s device of 1632 is more often
cited. Cajori states that Oughtred was an independent discoverer of the rectilinear
slide rule and the first to propose a circular rule. The 1632 publication, *Circles
of Proportion*, uses the terms *horizontal instrument* and
*circles of proportion.*

*Slide rule* appears in the Diary of Samuel Pepys (1633-1703) in April 1663:
"I walked to Greenwich, studying the slide rule for measuring of timber."
However, the device referred
to may not have been a slide rule in the modern sense.

*Slide rule* appears in 1838 in *Civil Eng. & Arch. Jrnl.*:
"To assist in facilitating the use of the slide rule among working mechanics" (OED2).

Amédée Mannheim (1831-1906) designed (c. 1850) the Mannheim Slide Rule.

*Sliding-rule* and *sliding-scale* appear in 1857 in
*Mathematical Dictionary and Cyclopedia of Mathematical
Science,* defined in the modern sense.

*Slide rule* appears in 1876 in *Handbk. Scientif. Appar.:*
"The slide rule,--an apparatus for effecting multiplications and divisions by means
of a logarithmic scale" (OED2).

(Florian Cajori’s History of the Logarithmic Slide-Rule (1909) is the standard work but basic information can be found in Museum of HP Calculators: Slide-rules).

**SLOPE** is found in 1829 in *A treatise on practical surveying and topographical plan drawing*:
"When these lines differ but little from the horizonal lines, they may be taken
for them; but if the slope is very great it is easy to reduce them, because
we have always the hypothenuse and perpendicular of a right-angled triangle
given by our measurement to find the base."

*Slope* is found in 1835 in *Second report addressed to the directors
and proprietors of the London and Birmingham railway.*

*Slope* is found in 1854 in *A Manual of Topographical Drawing*
by Richard Somers Smith:
"If, for example, it is found that it coincides in length with No. 12 of the scale,
then the slope expressed by that interval is 1/12."
[Google print search]

*Mathematical Dictionary and Cyclopedia of Mathematical Science* (1857) has:

SLOPE. Oblique direction. The slope of a plane is its inclination to the horizon. This slope is generally given by its tangent. Thus, the slope, 1/2, is equal to an angle whose tangent is 1/2; or, we generally say, the slope is 1 upon 2; that is, we rise, in ascending such a plane, a vertical distance of 1, in passing over a horizontal distance of 2. The slope of a curved surface, at any point, is the slope of a plane, tangent to the surface at that point.
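
The relation between a slope given as "1 upon 2" and the corresponding angle of inclination can be checked numerically. The Python sketch below is an illustration and not part of the 1857 dictionary; the function name `slope_to_angle_degrees` is hypothetical.

```python
import math


def slope_to_angle_degrees(rise, run):
    """Inclination to the horizon, in degrees, of a plane rising `rise` over `run`."""
    # atan2 also handles run == 0 (a vertical line), where rise/run is undefined.
    return math.degrees(math.atan2(rise, run))


angle = slope_to_angle_degrees(1, 2)
print(round(angle, 3))  # the angle whose tangent is 1/2
```

Using `atan2` with the rise and run kept separate avoids the division that fails for a vertical line; there the function returns 90 degrees even though the slope itself has no finite value.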

In 1924 *Analytic Geometry* by Arthur M. Harding and George W.
Mullins has: "If the line is parallel to the *y* axis, the slope
is infinite." Modern textbooks say such a line has undefined slope.

For information on the use of *m* and other symbols for
slope, see
Earliest Uses of Symbols for Geometry.

**SLOPE FIELD** is found in 1955 in "Line Element Fields and Lorentz Structures on
Differentiable Manifolds" by L. Markus in *Annals of Mathematics*:
"Moreover, it is possible to construct continuous line element fields,
without singularities, on a compact manifold which cannot be oriented,
for example, the slope field *dy*/*dx* = tan(1 - 2π)π/2 on the
torus considered as *E*^{2} with the coordinates
modulo one." [*JSTOR* search]

An older term is DIRECTION FIELD. M.
Golomb and M. Shanks (*Elements of Ordinary Differential Equations*, 2^{nd} edition 1965)
attached this note to their definition of “direction field”: “‘Slope field’
would be a more appropriate name but direction field is a long established term.”

**SLOPE-INTERCEPT FORM** is found in 1904 in *Elements
of the Differential and Integral Calculus* by William
Anthony Granville [James A. Landau].

In *Webster’s New International Dictionary* (1909), the term is
*slope form.*

**SMALL SAMPLE PROBLEM, THEORY etc.** in Statistics. In the early 20^{th}
century Student argued that existing large sample methods had to be augmented
because they could give misleading results in small samples; his best known
contribution was the 1908 paper, “The
probable error of a mean”, *Biometrika*, **6**, 1-25. The expression “small sample” soon became
established. In a 1909 paper Student was writing, “It will be observed that
this is essentially a 'small sample' problem...” (“The
distribution of the means of samples which are not drawn at random,” *Biometrika*, **6**, p. 211.) The other
great figure in small sample statistics was R. A. Fisher. In his *Statistical Methods for Research Workers* (1925)
he writes, “Only by systematically tackling small
sample problems on their merits does it seem possible to apply accurate tests
to practical data.” (Author’s Preface.) In his
review of Fisher’s book Student explained
that “small samples” are “samples so small that the statistical constants of
the population cannot be replaced by those of the samples without appreciable
error.” More recently the qualifier *small sample* has been giving way to
*exact*, e.g. in *exact distribution* or *exact theory*. The large sample methods
were based on asymptotic approximations but the small sample methods were not
based on approximations at all.

See STUDENT’S *t*-DISTRIBUTION and ASYMPTOTIC.

**SMOOTHING.** The *OED*’s earliest quotation is from Francis Galton’s
*Natural Inheritance* chapter vii, p.
100: "These [curious and apparently very interesting relations] came out
distinctly after I had 'smoothed' the entries."

Mark Nelson has found an earlier instance in C. S. Peirce’s 1873 paper, "On the theory of errors of
observations." Peirce writes, "The curve has, however, not been plotted
directly from the observations, but after they have been smoothed off by the
addition of adjacent numbers in the table eight times over, so as to diminish
the irregularities of the curve." The paper is reprinted in Stephen M. Stigler
(ed.) (1980) vol. 2. In his article, "Mathematical statistics in the early States," *Annals of
Statistics*, **6**, (1978), 239-265, Stigler
relates Peirce’s smoothing method to modern KERNEL
density estimation. In other contexts smoothing may amount to fitting a
TREND, the GRADUATION of a mortality table or the
adjustment of geodetic measurements by the METHOD OF LEAST SQUARES.

The term **SOCIAL MATHEMATICS** was used by Condorcet (1743-1794)
and may have been coined by him.

**SOFTWARE.** According to a biography of John Wilder Tukey by Peter McCullagh to appear
in *Biographical Memoirs of Fellows of the Royal Society of London*,
the term was coined by Tukey. A *JSTOR* search found this from
1958, "Today the 'software' comprising the carefully
planned interpretive routines, compilers and other aspects of automative programming
are at least as important to the modern electronic calculator as its 'hardware'
of tubes, transistors, wires, tapes, tubes and the like." John W. Tukey "The Teaching of Concrete Mathematics,"
*American Mathematical Monthly*,
65, No. 1. (Jan., 1958), p. 2.

**SOLID GEOMETRY** appears in 1733 in the title *Elements
of Solid Geometry* by H. Gore (OED2).

**SOLID OF REVOLUTION** is found in English in 1816 in the
translation of Lacroix’s *Differential and Integral Calculus*:
"To find the differentials of the volumes and curve surfaces of
solids of revolution" (OED2).

**SOLIDUS** (the diagonal fraction bar). Arthur Cayley
(1821-1895) wrote to Stokes, "I think the 'solidus' looks very well
indeed...; it would give you a strong claim to be President of a
Society for the Prevention of Cruelty to Printers" (Cajori vol. 2,
page 313).

The word *solidus* appears in this sense in the *Century
Dictionary* of 1891.

**SOLUBLE (referring to groups).** Ferdinand Georg Frobenius
(1849-1917) wrote in a paper of 1893:

Jede Gruppe, deren Ordnung eine Potenz einer Primzahl ist, ist nach einem Satze von Sylow die Gruppe einer durch Wurzelausdrücke auflösbaren Gleichung oder, wie ich mich kurz ausdrücken will, einer auflösbare Gruppe. [Every group of prime-power order is, by a theorem of Sylow, the group of an equation which is soluble by radicals or, as I will allow myself to abbreviate, a soluble group.]

Peter Neumann believes this is likely to be the passage that introduced the term "auflösbar" ["soluble"] as an adjective applicable to groups into mathematical language.

**SOLUTION SET** appears in 1959 in *Fund. Math.* by
Allendoerfer and Oakley: "Given a universal set *X* and an
equation *F*(*x*) = *G*(*x*) involving *x,*
the set {*x*|*F*(*x*) = *G*(*x*)} is called
the solution set of the given equation" (OED2).

The term may occur in Imsik Hong, "On the null-set of a
solution for the equation $\Delta u+k^2u=0$," *Kodai Math. Semin.
Rep.* (1955).

**SOUSLIN SET** is defined in Nicolas Bourbaki, *Topologie
Generale* [Stacy Langton].

**SPACE.** The word came
into English—from Old French from Latin—around 1300. The *OED* entry distinguishes many meanings. In
one sense (under heading 6b) it has *room* as a synonym. This word derives from the Old English and is related to the
modern German *Raum*. Under heading
17 the *OED* defines “a space” as
“an instance of any of various mathematical concepts, usually regarded as a set
of points having some specified structure.” Among the quotations is a nice one
from 1932: “The word ‘space’ has gradually acquired a mathematical significance
so broad that it is virtually equivalent to the word ‘class’, as used in
logic.” (M. H. Stone *Linear Transformations
in Hilbert Space* p. 1.) The space age was well under way by 1914 when
Hausdorff’s *Grundzüge der Mengenlehre* (Fundamentals of Set Theory) gave axioms for a
METRIC SPACE (*metrischer Raum*) and
for a TOPOLOGICAL SPACE (topologischer
Raum).

See the entries BANACH SPACE, HAUSDORFF SPACE, HILBERT SPACE, METRIC SPACE, POINT, TOPOLOGICAL SPACE and VECTOR SPACE.

The term **SPECIALLY MULTIPLICATIVE FUNCTION** was coined by D. H.
Lehmer (McCarthy, page 65).

The term **SPECIAL FUNCTIONS** for the higher transcendental functions of mathematical physics has been in circulation from
at least the 1920s. A *JSTOR* search found it used as a heading, without
explanation, in the article “American Standard Mathematical Symbols,” *American Mathematical
Monthly*, 35, (1928), p. 303. The symbols given are for BESSEL FUNCTIONS and BERNOULLI
NUMBERS. See the *Encyclopedia of Mathematics* entry.

**SPECTRUM (in operator theory).** The *OED*’s earliest quotation illustrating the scientific (optical)
use of "spectrum" is from Newton *Phil. Trans.* VI. (1671) 3076:
"Comparing the length of this coloured Spectrum with its breadth, I found
it about five times greater." The *OED*’s earliest quotation illustrating
the mathematical use of "spectrum" is from P. R. Halmos *Finite
Dimensional Vector Spaces* (1948, ii. 79): "The set of *n* proper
values [eigenvalues] of *A,* with multiplicities properly counted, is the
spectrum of *A.*" This use of the term goes back to Hilbert’s work
on integral equations in 1904-10. Hilbert used the term "Spektrum"
when discussing quadratic forms in infinitely many variables ("Grundzüge einer
allgemeinen Theorie der linearen Integralgleichungen. Vierte Mitteilung"
Nachrichten
von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische
Klasse (1906), p. 157.) The English word appears in 1911 in Anna Johnson
Pell "Biorthogonal Systems of Functions," *Transactions of the American
Mathematical Society*, **12**, pp. 135-164. (*JSTOR* search)

There may be a link between Newton and Hilbert for, though
the latter cited no previous writer for "Spektrum," J. Dieudonné *History
of Functional Analysis* (1981, pp. 149-50) suggests he derived the term from
W. Wirtinger
"Beiträge zu Riemann’s Integrationsmethode für hyperbolische Differentialgleichungen,
und deren Anwendungen auf Schwingungsprobleme,"
Mathematische
Annalen, 48, (1897), 365-89. Wirtinger drew upon the similarity with the
optical spectra of molecules when he used the term "Bandenspectrum"
with reference to Hill’s (differential) equation.

The terms **spectral theory** and **spectral theorem**
came into use around 1930: see e.g. A. Wintner *Spektraltheorie der unendlichen Matrizen*
(1929) and B. A. Lengyel & M. H. Stone
"Elementary Proof of the Spectral Theorem," *Annals of Mathematics*,
**37**, (1936), pp. 853-864. "The spectral theorem" of the latter is the
"fundamental theorem on the spectral resolution of self-adjoint operators in
Hilbert space."

This entry was contributed by John Aldrich. See also EIGENVALUE, STATIONARY STOCHASTIC PROCESS.

**SPECTRUM and SPECTRAL DENSITY (in generalised harmonic analysis and stochastic processes).**
The "spectrum" of an irregular motion appears in N. Wiener’s "The Harmonic Analysis of Irregular Motion
(Second Paper)" *J. Math. and Phys.* **5** (1926) 158-189. One of Wiener’s objectives was a theory
which would include "an adequate mathematical account of such continuous spectra as that of white light."
(Wiener *Proc. London Math. Soc.* **27** (1928)) The term "power-spectrum" is also in the 1926
paper. The spectrum and spectral density function were important in the probabilistic theory of Khintchine
(1934) and Wold (1938) but the functions were not given names. The names appear in J. L. Doob’s "The
Elementary Gaussian Processes" *Annals of Mathematical Statistics,* **15,** (1944), 229-282.
Around 1940 it became evident that the spectral theory of time series analysis was related to the spectral
theory of operators. (See also the previous entry and STATIONARY STOCHASTIC PROCESS). [John Aldrich]

**SPERNER’S LEMMA** in algebraic topology appears in
Emanuel Sperner’s
“Neuer Beweis für die Invarianz der Dimensionszahl und des Gebietes,” *Abh. Math. Sem. Univ. Hamburg,*
**6**, (1928) 265–272. It was then used to give a new proof of BROUWER’S FIXED-POINT THEOREM
by B. Knaster, K. Kuratowski, S. Mazurkiewicz, “Ein Beweis des Fixpunktsatzes für *n*-dimensionale Simplexe”
*Fund. Math.*, **14** (1929) pp. 132–137.
The phrase “das Spernersche Lemma” appears in Alexandroff & Hopf’s *Topologie* (1935, p. 376).

**SPHERICAL GEOMETRY** appears in 1728 in Chambers'
*Cyclopedia* (OED2).

The words *spherical geometry* and *versed sine* were used
by Edgar Allan Poe in his short story *The Unparalleled Adventure
Of One Hans Pfaall.*

**SPHERICAL HARMONICS.** A. H. Resal used the term *fonctions
sphériques* (Todhunter, 1873) [Chris Linton].

*Spherical harmonics* was used in 1867 by William Thomson
(1824-1907) and Peter Guthrie Tait (1831-1901) in *Nat.
Philos.*: "General expressions for complete spherical harmonics of
all orders" (OED2).

**SPHERICAL TRIANGLE** Menelaus of Alexandria (fl. A. D. 100) used
the term *tripleuron* in his *Sphaerica,* according to
Pappus. According to the DSB, "this is the earliest known mention of
a spherical triangle."

The OED2 shows a use of *spherical triangle* in English in 1585.

In a letter to L. H. Girardin dated March 18, 1814, Thomas Jefferson (President of the United States) wrote, "According to your request of the other day, I send you my formula and explanation of Lord Napier’s theorem, for the solution of right-angled spherical triangles."

**SPHERICAL TRIGONOMETRY** is found in the title *Trigonometria
sphaericorum logarithmica* (1651) by Nicolaus Mercator
(1620-1687).

The term is found in English in a letter by John Collins to the Governors of Christ’s Hospital written on May 16, 1682, in the phrase "plaine & spherick Trigonometry, whereby Navigation is performed" [James A. Landau].

In a letter dated Oct. 8, 1809, Thomas Jefferson wrote, referring to Benjamin Banneker, "We know he had spherical trigonometry enough to make almanacs, but not without the suspicion of aid from Ellicot, who was his neighbor and friend, and never missed an opportunity of puffing him."

**SPINOR** appears in 1931 in *Physical Review.* The citation
refers to spinor analysis developed by B. Van der Waerden (OED2).

**SPIRAL OF ARCHIMEDES** appears in English in 1813
in *Pantologia. A new cabinet cyclopædia*
by John Mason Good, Olinthus Gilbert Gregory, and N. Bosworth.
[Google print search]

**SPLINE (CURVE).** The *Century Dictionary* of 1891 defines a *spline*
as "a flexible strip of wood or hard rubber used by draftsmen in laying out broad sweeping curves,
especially in railroad work." The word was introduced into mathematics in the form "spline curve" by
I.
J. Schoenberg "Contributions to the problem of approximation of
equidistant data by analytic functions. Part A--on the problem of smoothing or
graduation. A first class of analytic approximation formulae," *Quart. Appl. Math.*, **4**, (1946), 45-99.
The *OED* quotation explains the relation
between old "spline" and new "spline curve": "For *k* = 4 they represent approximately the
curves drawn by means of a spline and for this reason we propose to call them
spline curves of order *k*." Later "spline curve" became abbreviated to "spline."

This entry was contributed by John Aldrich. See INTERPOLATION.

The term **SPORADIC GROUP** was coined by William Burnside
(1852-1927) in the second edition of his *Theory of Groups of
Finite Order,* published in 1911 [John McKay].

**SPURIOUS CORRELATION.** The term
was introduced by Karl Pearson in
"Mathematical
Contributions to the Theory of Evolution
- On a Form of Spurious Correlation Which May Arise When Indices Are Used in
the Measurement of Organs," *Proc. Royal Society,* **60,** (1897), 489-498.
Pearson showed that correlation between indices *u* (= *x*/*z*)
and *v* (= *y*/*z*) was a misleading guide to correlation between *x* and *y.*
His illustration is

A quantity of bones are taken from an *ossuarium*, and are put together in groups which are asserted to be those of individual skeletons. To test this a biologist takes the triplet femur, tibia, humerus, and seeks the correlation between the indices *femur/humerus* and *tibia/humerus*. He might reasonably conclude that this correlation marked organic relationship, and believe that the bones had really been put together substantially in their individual grouping. As a matter of fact ... there would be ... a correlation of about 0.4 to 0.5 between these indices had the bones been sorted absolutely at random.

The term has been applied to other correlation scenarios with potential for misleading inferences, as in Student’s "The Elimination of Spurious Correlation due to Position in Time or Space."

See also VARIATE DIFFERENCE METHOD.
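
Pearson's ossuarium illustration can be imitated with a small simulation. The sketch below is not Pearson's calculation: the normal distributions, the means, the 10% coefficient of variation, the sample size, and the seed are all assumptions made for illustration. Three independent "lengths" *x*, *y*, *z* are drawn, and the correlation of *x* with *y* is compared with the correlation of the indices *x*/*z* and *y*/*z*.

```python
import math
import random


def pearson_r(a, b):
    """Sample product-moment correlation of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(1897)  # fixed seed so that the run is repeatable
n = 20000

# Hypothetical "bone lengths": mutually independent, each with mean 100 and sd 10.
x = [rng.gauss(100.0, 10.0) for _ in range(n)]
y = [rng.gauss(100.0, 10.0) for _ in range(n)]
z = [rng.gauss(100.0, 10.0) for _ in range(n)]

# Indices formed with the common divisor z.
u = [xi / zi for xi, zi in zip(x, z)]
v = [yi / zi for yi, zi in zip(y, z)]

r_xy = pearson_r(x, y)  # close to zero, since x and y are independent
r_uv = pearson_r(u, v)  # clearly positive, because both indices share z
print(f"corr(x, y)     = {r_xy:.3f}")
print(f"corr(x/z, y/z) = {r_uv:.3f}")
```

With equal coefficients of variation, as assumed here, the correlation between the indices comes out at roughly one half, even though the underlying measurements are unrelated.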

**SQUARE.** The English word comes via Old French from the Latin *ex*- out + *quadrāre*
make square and was first used for a tool for measuring right angles. The *OED’s* first citation in the sense of the
product of a number multiplied by itself is from 1557 in the *Whetstone of
Witte* of R. Record:
“Twoo multiplications doe make a Cubike nomber. Likewaies .3. multiplications
doe giue a square of squares.”

**SQUAREFREE** or **SQUARE-FREE** are
English translations of the German word *quadratfrei* for a number
which is not divisible by the square of any prime. That term is used without
explanation in Edmund Landau “Ueber die asymptotischen Werthe einiger
zahlentheoretischer Functionen,” *Mathematische Annalen*,
**54**, (1900), 570-581.
The term is not used in the paper by Leopold Gegenbauer which gave the
well-known asymptotic estimate 6*x*/π^{2}
for the number of quadratfrei numbers not exceeding *x*, viz. “Asymptotische Gesetze der Zahlentheorie,”
*Denkschriften der
Kaiserlichen Akademie der Wissenschaften Wien, 49 (1885)*, 37-80. See
the Wikipedia entry Square-free
integer.
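
The definition and the asymptotic estimate mentioned above can be checked numerically. The following is a minimal sketch; the function name `squarefree_flags` and the limit of 100,000 are choices made here for illustration, not anything taken from Landau or Gegenbauer.

```python
import math


def squarefree_flags(limit):
    """Return a list in which flags[n] is True when n (1 <= n <= limit) is squarefree."""
    flags = [True] * (limit + 1)
    flags[0] = False
    d = 2
    while d * d <= limit:
        # Strike out every multiple of d*d; doing this for composite d is
        # redundant but harmless.
        for multiple in range(d * d, limit + 1, d * d):
            flags[multiple] = False
        d += 1
    return flags


limit = 100_000
count = sum(squarefree_flags(limit))   # number of squarefree n <= limit
estimate = 6 * limit / math.pi ** 2    # the asymptotic estimate 6x/pi^2
print(count, round(estimate, 1))
```

The two printed numbers should agree closely, since the difference between the count and the estimate grows much more slowly than *x*.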

In their *Introduction
to the Theory of Numbers* (1938) Hardy and Wright use the German word
on the ground that “there is no convenient English word.” (p. 254). However an
English word was created by translating the German compound component by
component and a *JSTOR* search found *square-free*
in use from 1931 and *squarefree*
from 1939. *Quadratfrei* is still
found in English writing but is much less common than these equivalents. [John Aldrich]

**SQUARE MATRIX** was used by Arthur Cayley in 1858 in "A Memoir on the Theory of Matrices"
Coll Math Papers, I, 475-96: "The term
matrix might be used in a more general sense, but in the present memoir I consider
only square or rectangular matrices" p. 475. (OED2).

See MATRIX.

**SQUARE THE CIRCLE.** See QUADRATURE OF THE CIRCLE.

**STABLE LAW** (*loi stable*) appears in Paul Lévy
"Sur les lois stables en calcul des probabilités,"
*Comptes Rendus de l'Académie des Sciences,* **176**, 1284-1286.
(David (2001))

See also CAUCHY DISTRIBUTION and NORMAL DISTRIBUTION.

The term **STANDARD DEVIATION** was introduced by Karl
Pearson (1857-1936) in 1893, "although the idea was by then nearly a century
old" (Abbott; Stigler, page 328). According to the DSB:

The term "standard deviation" was introduced in a lecture of 31 January 1893, as a convenient substitute for the cumbersome "root mean square error" and the older expressions "error of mean square" and "mean error."

The OED2 shows a use of

See MEAN ERROR, MODULUS, PROBABLE ERROR and VARIANCE.

**STANDARD ERROR** is found in 1897 in G. U. Yule, "On the Theory
of Correlation," *Journal of the Royal Statistical Society*, **60**,
812-854: "We see that $\sigma_1\sqrt{1 - r^2}$ is the standard
error made in estimating *x*" (OED2).
There the quantity *x* was being estimated by a regression residual but Yule
applied the term generally in his *Introduction to the Theory of Statistics*
(1911), covering such cases as the standard error of a proportion. [John Aldrich]
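
Yule's expression can be illustrated numerically. The sketch below assumes the usual reading of the quotation, with σ₁ taken as the standard deviation of *x* and *r* as the correlation between *x* and the predictor *y*. The data are hypothetical. For a least-squares regression of *x* on *y*, the root-mean-square residual then equals σ₁√(1 − *r*²) exactly, provided both are computed with divisor *n*.

```python
import math

# Hypothetical paired observations, chosen only for illustration.
x = [2.0, 4.0, 5.0, 4.0, 5.0, 7.0, 8.0, 9.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

sigma_x = math.sqrt(s_xx / n)      # standard deviation of x (divisor n)
r = s_xy / math.sqrt(s_xx * s_yy)  # correlation coefficient

# Least-squares regression of x on y, and its root-mean-square residual.
slope = s_xy / s_yy
residuals = [a - (mean_x + slope * (b - mean_y)) for a, b in zip(x, y)]
rms_residual = math.sqrt(sum(e * e for e in residuals) / n)

formula_value = sigma_x * math.sqrt(1 - r * r)
print(rms_residual, formula_value)
```

The two printed values agree up to floating-point rounding.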

See also PROBABLE ERROR.

**STANDARD NORMAL CURVE.** In the biometric era W. F. Sheppard used the expression
“standard normal curve” for “a normal curve whose area and standard deviation are unity”
in “On the Application of the Theory of Error
to Cases of Normal Distribution and Normal Correlation”, *Phil. Trans A,* **192**, (1899), p. 105.
However the term did not catch on and Sheppard did *not*
use it when he presented (*Biometrika*, **5**, (1907), p. 404)
tables of the standard normal: he spoke of "the value of the deviation, the
standard deviation being taken as unit." See the similar caption to the normal tables
(Tables I. and II.)
in Fisher’s *Statistical Methods for Research Workers* (1925). The term
“unit normal” had some currency but most authors
used *no* term.

The term “standard normal” came into general use around 1950,
appearing in the popular textbooks by P. G. Hoel *Introduction to Mathematical
Statistics* (1947) and A. M. Mood *Introduction to the Theory of Statistics*
(1950).

See also NORMAL.

**STANDARD POSITION** is found in 1873 in
*An elementary course in free-hand geometrical drawing*
by Samuel Edward Warren:
"a right angle is in its simplest, most natural, or *standard position*, when its sides are
in the *fundamental directions of vertical* and *horizontal*" [University of Michigan Digital Library].

*Standard position* is dated 1950 in MWCD10.

**STANDARD SCORE.** In 1913
*Elementary school standards : instruction, course of study, supervision, applied to New York City schools*
by Frank Morton McMurry has:
"The book does not attempt to illustrate accurate measurement of educational results. It is scientific only in
so far as it brings to bear organized knowledge and insight on an educational problem. Scientific measurement
in education is, indeed, as yet too little developed to be applied to more than a very limited portion of the work
of the elementary schools. Except for arithmetic and penmanship, 'standard scores'
or standard achievements are not available for measuring the quality of the results actually attained by the
schools; and even for penmanship and arithmetic, the standard measures for each grade are not yet firmly established"
[University of Michigan Digital Library].

In 1921 *Univ. Illin. Bur. Educ. Res.
Bull.* has: "Provision is made for comparing a pupil’s achievement
score..with the norm corresponding to his mental age by dividing his
achievement age by the standard score for his mental age. This
quotient is called the Achievement Quotient" (OED2).

*Standard score* is dated 1928 in MWCD10.

**STANINE** is a term first used to describe an examinee’s performance on a
battery of tests constructed for the U. S. Army Air Force during World War II.

In a letter dated July 30, 1946, Laurance F. Shaffer, who had been a colonel in charge of Psychological Research Unit No. 1 (PRU #1) at Maxwell Field, Alabama, wrote:

The origin of the word is somewhat hazy. I have complete certainty only with regard to two facts: that the word was originated at PRU #1 at Maxwell Field, and that the date was in the month of February, 1942. According to PRU #1 tradition, the word first appeared in the form *stand-nine* as a shortening of the phrase *standard nine-point scale* that occurred in area directives. This was soon shortened to *stannine* (with *a* as in *stand*). Local tradition ascribed the origin of this term to Sol M. Roshal, who was noncom in charge of computations at that time. Fred Wickers has told me that he is very certain that I changed *stannine* to *stanine* (with *a* as in *stay*) when I returned from my expedition to California, which took place in the middle of February, 1942. I do not remember this myself.

In a letter dated February 23, 1946, Frank A. Geldard, formerly a colonel in charge of the whole program and stationed in Texas, wrote:

*Stanine* is a portmanteau word deriving from "standard score on a *nine* point scale." It was a sheer "shorthand" invention on the part of an enlisted man in Psychological Research Unit No. 1, AAF Classification Center, Maxwell Field, Ala. The term came to have wide usage in the AAF, not only by psychologists, but by all who had occasion to refer to aptitude ratings for pilot, bombardier, and navigator training assignments. At first the word was resisted by psychologists, who felt that the term had little intrinsic logical meaning to recommend it. For a year or so after its invention official reports might not employ the word; it was regarded as inferior slang. Generality of usage within the AAF eventually forced its acceptance, however, and by the end of the war both technical and nontechnical papers on aircrew aptitude, standards of qualification, training programs, aircraft accidents, and a host of other topics, employed it as a "good" word. It avoided considerable circumlocution, and its meaning seems rarely to have been misunderstood.

Both of these letters were written to Atcheson L. Hench and appear in an article by him, "The Coining of 'Stanine'", in *American
Speech,* February 1951.

The term **STAR PRIME** was coined in 1988 by Richard L. Francis (Schwartzman, p. 206).

**STATIONARY STOCHASTIC PROCESS** appears in the title of A Khintchine’s
"Korrelationstheorie der Stationären Stochastischen Prozesse",
Math. Ann. 109, (1934), p. 604.

H. Wold translated it as "stationary random process"
(*A Study in the Analysis of Stationary Time Series* (1938)).

The phrase "stationary stochastic process" appears
in J. L. Doob’s "What is a Stochastic Process?" *American Mathematical
Monthly,* **49,** (1942), 648-653.

An older term was "fonction éventuelle homogène,"
which appears in E. Slutsky’s
"Sur les Fonctions
Éventuelles Continues, Intégrables
et Dérivables dans le Sens Stochastique", *Comptes Rendus,*
**187,** (1928), 878 [John Aldrich].

See STOCHASTIC PROCESS.

**STATISTIC, STATISTICAL** and **STATISTICS.** In the course
of the 19^{th} century *statistics*
acquired its modern meaning(s). **It is** “the department of study that has
for its object the collection and arrangement of numerical facts or data, whether
relating to human affairs or to natural phenomena” OR **they are**
“numerical facts or data collected and classified.”
The OED1 of the early 20^{th} century also has *statistical*
in the modern sense but its meanings for *statistic*
are archaic. The recasting of *statistic* came later.

These words all come indirectly from the mediaeval Latin *status*
for a political *state*. More directly *statistics* entered English
from the German *Statistik,* as a term comparable to mathematics or ethics.
The first citation in OED2 is W. Hooper’s translation of *Bielfield’s Elementary
Universal Education:* "The science, that is called statistics, teaches us
what is the political arrangement of all the modern states of the known world."
(1770) However the work that did most to "naturalise" the term in the English
language was Sir John Sinclair’s
*Statistical Account of Scotland* (1791-9).

*Webster’s* dictionary of 1828
defined statistics as: "A collection of facts respecting the state of society,
the condition of the people in a nation or country, their health, longevity,
domestic economy, arts, property and political strength, the state of the country,
&c." Statistical societies, like the *Statistical Society of London*
(later Royal Statistical Society)
founded in 1834, were established to discover such facts. Note
that these facts were not necessarily, or even typically, numerical facts.

In the course of the 19^{th} century *statistics* came to be confined to
numerical facts but the facts did not have to pertain to public administration.
The latter development is illustrated by a quotation from J. C. Maxwell
*Theory of Heat* (1871) xxii. 288: “If however, we adopt a statistical view of the system, and
distribute the molecules into groups . . .”
(OED2) This point of view became fixed in the phrase *statistical mechanics.*
For this the OED2 cites J. W. Gibbs in *Proc. Amer. Assoc. Adv.
Sci.* XXXIII, 1885, 57 (*heading*) “On the fundamental formula of
statistical mechanics, with applications to astronomy and thermodynamics.”

The phrase "lies, damned lies and statistics," a back-handed tribute to the importance of statistics in political life, dates from the end of the 19th century. See Peter Lee’s Lies, Damned Lies and Statistics for a history of the phrase.

*Statistic,* signifying an individual fact, was rare before the 20^{th} century.
There is an example from 1853 in *The United States illustrated*
edited by Charles Anderson Dana: “An old teamster with a dislodged wheel to his
'lumbery' vehicle, claimed a moment of our strength, and in return for that generosity,
*a la Jupiter,* indulged our statistical curiosity with a few minutes of his local knowledge. The
significant placing of his hand upon his pocket, as he
proclaimed the fact that the bridge cost almost a quarter of a million dollars,
plainly showed his appreciation of so vast a sum. Nor was the statistic of the
bridge, being a mile in length, handed over to the fund of general information,
without a look which plainly hinted of the many laggard walks it had cost him
by the side of his sturdy team.” [University of Michigan Digital Library].

In the 20^{th} century the singular form came to be
accepted both in this sense and in another sense. In statistical theory R. A.
Fisher used *statistic* to refer to a quantity derived from the observations--before
settling on it he had used "statistical derivative"
(1915),
"derivate" (1920)
and "statistical derivate" (1921).
Fisher presented the new term in his
"On the
Mathematical Foundations of Theoretical Statistics",
*Philosophical Transactions of the Royal Society of London,* Ser. A., 222,
(1922), 309-368: "These involve the choice of methods of calculating from a
sample statistical derivates, or as we shall call them statistics, which are
designed to estimate the values of the parameters of the hypothetical population."
(p. 318) The term *parameter* was also new and with *statistic* the
two made a pair. (See the entry on parameter for Fisher’s reasoning.) Fisher
called the statistics arising in estimation problems *estimates*. He had
no name for statistics arising in testing but since the 1950s they have been
called "test statistics."

Fisher’s term was *not* well-received initially. Arne
Fisher (no relation) asked him, "Where ... did you get that atrocity, *a statistic*?"
(letter (p. 312) in J. H. Bennett
*Statistical
Inference and Analysis: Selected Correspondence of R. A. Fisher* (1990).) Karl Pearson objected,
"Are we also to introduce the words a mathematic, a physic, an electric etc., for parameters
or constants of other branches of science?" (p. 49n of *Biometrika,* 28,
34-59 1936).

This entry was contributed by John Aldrich, based on G. U.
Yule *Introduction to the Theory of Statistics* (1911) and David (2001).
A complete list of the probability and statistics terms on this web site is
here.

**STATISTICAL TABLE.** The *OED* notes an appearance in 1808 in Zebulon M. Pike’s
*An Account of Expeditions to the Sources of the Mississippi
1805-07* "A statistical table, on which he had in a regular manner taken
the whole province of New Mexico,..giving latitude, longitude, and population."
There is a livelier passage in Melville’s *Moby Dick* (1851) chapter ci,
*The Decanter*:

Most statistical tables are parchingly dry in the reading; not so in the present case, however, where the reader is flooded with whole pipes, barrels, quarts, and gills of good gin and good cheer.

(Quotation provided by John W. McDonald III.)

**Statistical tables**, giving integrals and other values used in statistical inference, are
now used mainly by students but for most of the 20^{th} century statisticians
had to make constant reference to volumes of such tables. Karl Pearson created
the specialised volume of statistical tables with his *Tables for Statisticians
and Biometricians* (1914). According to Pearson, "What the true statistician,
the true physicist demands" is "the conversion of algebraical results
into tables." The next great work of tabling was R. A. Fisher and F. Yates’s
*Statistical Tables for Biological Agricultural and Medical Research* (1938);
the sixth edition of 1963 is available
on the
web. The last great set, the two-volume *Biometrika Tables for Statisticians*,
was produced by Pearson’s son E. S. Pearson (with H. O. Hartley) and appeared
in 1954/72.

**STEM-AND-LEAF DISPLAYS** were introduced by J. W. Tukey in
"Some Graphic and Semigraphic Displays"
in *Statistical Papers in Honor of George W. Snedecor* edited by T. A. Bancroft (1972) (David (2001)).

**STEP FUNCTION** is dated ca. 1929 in MWCD10.

**STEREOGRAPHIC.** According to Schwartzman (p. 207), "the term
seems to have been used first by the Belgian Jesuit François
Aguillon (1566-1617), although the concept was already known to the
ancient Greeks."

In *Flattening the Earth: Two Thousand Years of Map
Projections,* John P. Snyder attributes the term to d'Aguillon in
1613 [John W. Dawson, Jr.].

**STIELTJES INTEGRAL.** T. J. Stieltjes
introduced the integral in his “Recherches sur les fractions continues,”
*Annales de la faculté des sciences de Toulouse*
Sér. 1, 8 no. 4 (1894)
1-122. The work was outside the mainstream of the theory
of integration until M. Riesz’s “Sur les opérations fonctionnelles linéaires,”
*Comptes Rendus de l'Académie des Sciences*, **149** (1909), 974–977.
The relationship between l'intégrale de Stieltjes and the LEBESGUE INTEGRAL is considered
in Lebesgue’s “Sur l'intégrale de Stieltjes et sur les opérations linéaires,”
*Comptes Rendus de l'Académie des Sciences* **150** (1910), 86-88. See the epilogue
on the Lebesgue-Stieltjes integral in T. Hawkins *Lebesgue’s Theory of Integration, its Origins and Development* (1975). [John Aldrich]

**STIGLER’S LAW OF EPONYMY.** See EPONYMY.

The terms **STIRLING NUMBERS OF THE FIRST** and **SECOND KIND**
were coined by Niels Nielsen (1865-1931), who wrote in German
"Stirlingschen Zahlen erster Art" [Stirling numbers of the first kind]
and "Stirlingschen Zahlen zweiter Art" [Stirling numbers of the second
kind]. Nielsen’s masterpiece, "Handbuch der Theorie der
Gammafunktion" [B. G. Teubner, Leipzig, 1906], had a great influence,
and the terms progressively found their acceptance (Julio
González Cabillón).

John Conway believes the newer terms *Stirling cycle* and
*Stirling (sub)set* numbers were introduced by R. L. Graham, D.
E. Knuth, and O. Patashnik in *Concrete Mathematics* (Addison
Wesley, 1989 & often reprinted).

**STIRLING’S FORMULA.** The asymptotic
formula for n! appears in Example 2 to Proposition 28 of
James Stirling’s
*Methodus
Differentialis: sive Tractatus de Summatione
et Interpolatione Serierum Infinitarum* (1730) p. 136.
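
For orientation, the formula is nowadays usually quoted in the leading-term form *n*! ≈ √(2π*n*)(*n*/*e*)^*n*. That modern form is assumed in the sketch below and is not a transcription of Stirling's own statement. The helper name `stirling` and the sample values of *n* are likewise chosen only for illustration.

```python
import math


def stirling(n):
    """Leading-term approximation sqrt(2*pi*n) * (n/e)**n to n!."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


# Ratio of the approximation to the exact factorial for a few values of n.
ratios = {n: stirling(n) / math.factorial(n) for n in (1, 5, 10, 20, 50)}
for n, ratio in ratios.items():
    print(n, ratio)
```

The ratios stay below 1 and approach it as *n* grows, with a relative error of about 1/(12*n*).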

Lacroix used *Théorème de Stirling* in *Traité élémentaire de calcul différentiel et de
calcul intégral* (1797-1800).

Stirling’s theorem is found in English in 1863 in *The Mathematical and
Other Writings of Robert Leslie Ellis.* [Google print search]

*Stirling’s formula* is found in English in 1880 in *A Treatise on the Calculus
of Finite Differences* by George Boole and John Fletcher Moulton.
[Google print search]

*Stirling’s approximation* appears in 1938 in *Biometrika* (OED2).

**STOCHASTIC** is found in English in 1662 with the meaning "pertaining to conjecture."
(OED) However, despite an early link with probability (see the second quotation
below), the term only entered the vocabulary of probability in the 20^{th}
century.

The modern re-birth of the term can be seen in the OED quotations. In 1917
Ladislaus Josephowitsch Bortkiewicz
(1868-1931) used it in
*Die Iterationen*
p. 3: "Die an der Wahrscheinlichkeitstheorie orientierte, somit auf 'das Gesetz der Grossen Zahlen'
sich gründende Betrachtung empirischer Vielheiten möge als Stochastik ... bezeichnet
werden." A. A. Tschuprow (Chuprov) put the term into English and explained
Bortkiewicz’s choice of term: "I use the word ‘stochastical’ as synonymous to
‘based on the theory of probability’--cf. J. Bernoulli, *Ars Conjectandi*,
Basileae, 1713, p. 213 ‘Ars Conjectandi sive Stochastice nobis definitur ars
metiendi quam fieri potest exactissimi probabilitates rerum’ and L. v. Bortkiewicz,
*Die Iterationen*." (*Metron*, **2** (1923), p. 461) See the translation of
Part IV,
Chapter II of *Ars Conjectandi* for the phrase "stochastic art." [John Aldrich]

**STOCHASTIC PROCESS** is found in A. N. Kolmogorov, "Sulla forma generale di un processo
stocastico omogeneo," *Rend. Accad. Lincei Cl. Sci. Fis. Mat.* 15
(1) page 805 (1932) [James A. Landau].

*Stochastic process* is also found in A. Khintchine "Korrelationstheorie der Stationären Stochastischen Prozesse",
Math. Ann. 109, (1934), p. 604 [James A. Landau].

*Stochastic process* occurs in English in J. L. Doob, "Stochastic processes and statistics,"
*Proc. Natl. Acad. Sci. USA* 20 (1934).

See AUTOCORRELATION, AUTOREGRESSION, BRANCHING PROCESS, ERGODIC, MARKOV, MARTINGALE, MOVING AVERAGE PROCESS, SAMPLE PATH, SPECTRUM, STATIONARY STOCHASTIC PROCESS, WHITE NOISE, WIENER PROCESS. See also the full list of Probability and Statistics entries.

**STOKES’S THEOREM** was attributed to
George Gabriel Stokes
(1819-1903) by J. C. Maxwell in his
*A Treatise on Electricity and Magnetism*
(1873, p. 27), "This theorem was given by Professor Stokes, *Smith’s Prize
Examination*, 1854, question 8." Maxwell had been a candidate that year!
The question is reprinted in *Mathematical and physical papers by George Gabriel Stokes*,
volume 5, p. 320. In a footnote, however, the editor of that volume reports that Lord Kelvin (William
Thomson) had stated the theorem in a letter to Stokes in July 1850.

(Based on M. J. Crowe *A History of Vector Analysis* (2^{nd} edition, p. 147).)

**STONE-WEIERSTRASS THEOREM** is a generalisation of the WEIERSTRASS APPROXIMATION THEOREM given by
M. H.
Stone in his “Applications of the theory of Boolean rings to
general topology,” *Trans. Amer. Math.
Soc.*, **41**, (1937) pp. 375–481.
See *Encyclopedia of Mathematics*.

**STRAIGHT ANGLE** appears in English in
1876 in *Syllabus of Plane Geometry*
by the Association for the improvement of geometrical teaching:
"When the arms of an angle are in the same straight line, the conjugate
angles are equal, and each is then said to be a *straight angle.*"
[Google print search]

There are earlier citations in the OED2 for the term with the obsolete meaning of "a right angle."

The term **STRANGE ATTRACTOR** was coined by David Ruelle
and Floris Takens in their classic paper "On the Nature of Turbulence"
[*Communications in Mathematical Physics,* vol. 20, pp. 167-192,
1971], in which they describe the complex geometric structure of an
attractor during a study of models for turbulence in fluid flow.

**STRATIFIED SAMPLING** occurs in J. Neyman, “On the two different aspects of the representative
method; the method of stratified sampling and the method of purposive
selection,” *Journal of the Royal Statistical Society*, **97**,
(1934), 558-625. The term “stratum” seems to have been first used in this
connection by A. L. Bowley *Elements of Statistics* (4^{th} edition) 1920: “It may happen … that the
universe [population] consists of different regions or strata ..” (p. 332). [James A. Landau]

See NEYMAN ALLOCATION.

**STRONG LAW OF LARGE NUMBERS.** See LAW OF LARGE NUMBERS.

**STRONG PSEUDOPRIME.** According to *Prime Numbers: A
Computational Perspective* by Carl Pomerance and Richard Crandall (page 124),
"J. Selfridge proposed using Theorem 3.4.1 as a pseudoprime test in the early
1970s, and it was he who coined the term 'strong pseudoprime'" [Paul Pollack].

*Strong pseudoprime* is found in Pomerance, Carl; Selfridge,
J.L.; Wagstaff, Samuel S. Jr. "The pseudoprimes to 25 x
10^{9}," *Math. Comput.* 35, 1003-1026 (1980).
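
Theorem 3.4.1 is not reproduced in this entry, so the sketch below assumes the standard definition instead. Write *n* − 1 = 2^*s*·*t* with *t* odd. An odd composite *n* is a strong pseudoprime to base *a* if *a*^*t* ≡ 1 (mod *n*), or if *a*^(2^*i*·*t*) ≡ −1 (mod *n*) for some 0 ≤ *i* < *s*. The function names are hypothetical.

```python
def passes_strong_test(n, a):
    """Return True if odd n > 2 passes the strong probable-prime test to base a."""
    # Write n - 1 = 2**s * t with t odd.
    s, t = 0, n - 1
    while t % 2 == 0:
        s += 1
        t //= 2
    b = pow(a, t, n)
    if b == 1 or b == n - 1:
        return True
    # Look for -1 among a**(2*t), a**(4*t), ..., a**(2**(s-1) * t) modulo n.
    for _ in range(s - 1):
        b = (b * b) % n
        if b == n - 1:
            return True
    return False


def is_prime(n):
    """Trial division, adequate for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def is_strong_pseudoprime(n, a):
    """True for an odd composite n that nevertheless passes the strong test to base a."""
    return n > 2 and n % 2 == 1 and not is_prime(n) and passes_strong_test(n, a)


print(is_strong_pseudoprime(2047, 2))  # 2047 = 23 * 89
print(is_strong_pseudoprime(341, 2))   # 341 = 11 * 31
```

The second example shows why the test is called "strong": 341 satisfies 2^340 ≡ 1 (mod 341), so it passes the ordinary Fermat test to base 2, but it fails the stronger condition.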

**STROPHOID** appears in 1837 in Enrico Montucci, "Delle
proprietà della strefoide, curva algebrica del terzo grado
recentemente scoperta ed esaminata" ("On the properties of the
strophoid, an algebraic curve of the third degree recently discovered
and examined"), Memoria letta nell'Accademia dei Fisiocratici ... con
una appendice del Venturoli, Siena, G. Mucci, 1837 [Dic Sonneveld].

*Strophoid* was coined by Montucci in 1846, according to Smith
(vol. 2, page 330).

The term **STRUCTURE** for isomorphic relations seems to have first
appeared in print in Bertrand Russell’s Introduction to Mathematical
Philosophy (1919).
Russell probably had the term from Ludwig Wittgenstein, whose
Tractatus logico-philosophicus (Logisch-philosophische
Abhandlung, Vienna 1918, 4.1211 ff) was first published in 1921, and
in 1922 in English.
The first Structure in the modern sense -- as a tuple composed of sorts or
carrier sets, relations, operations and distinguished elements -- was
used by David Hilbert in his Grundlagen der Geometrie
(Göttingen 1899), there called a „Fachwerk oder Schema von
Begriffen“ (p. 163, according to
F. Kambartel Erfahrung und Struktur, Münster 1966).
The concept of Structure developed via Rudolf Carnap’s Der logische
Aufbau der Welt (1928), the linguistic and French philosophical
Structuralism, the Éléments de
mathématique of the N. Bourbaki group (Paris, since 1939), to
Category Theory of Samuel Eilenberg and Saunders Mac Lane (1945).
[This entry was contributed by Wolfram Roisch.]

**STUDENT’S t-DISTRIBUTION.** “Student” was the pen-name of
William
Sealy Gosset. (The name was originally written in quotation marks but these
are now usually dispensed with.) Student once told
R. A. Fisher,
“I am sending you a copy of Student’s Tables as you are the
only man that’s ever likely to use them!”

In his 1908 paper, “The
Probable Error of a Mean”, *Biometrika*, **6**, 1-25, Student introduced the statistic *z* for testing hypotheses on the mean of
the normal distribution. Student’s *z* is proportional to the modern *t*
with *t* = *z* √(*n* – 1). Student was not concerned to have a statistic that is asymptotically
standard normal and when he estimated σ he used the divisor *n,* not the modern (*n* – 1). Fisher introduced the
*t* form because it fitted in with a larger
theory based on the notion of the number of DEGREES OF FREEDOM.
Student seems to have introduced the *t* symbol in correspondence with Fisher in 1923. Fisher described
Student’s distribution (and others based on the normal distribution) in
“On a Distribution Yielding the Error Functions of Several Well
Known Statistics”, *Proceedings
of the International Congress of Mathematics,* Toronto, **2**, 805-813. In that paper he used the
symbol *t.* A new symbol suited Fisher for he could use *z* for a
statistic of his own (see the entries for *z*
and for *F*). For further information see C. Eisenhart “On the Transition from
Student’s *z* to Student’s *t*,” *The American Statistician*, **33**, (1979), pp. 6-10.
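The relation *t* = *z* √(*n* – 1) stated above can be checked numerically. The following is only a minimal sketch: the function names and the sample values are invented for illustration, and *z* is taken, as described above, to be the deviation of the sample mean from the hypothesised mean divided by the standard deviation computed with divisor *n*.

```python
import math

def student_z(sample, mu0):
    """Student's 1908 form: (mean - mu0) / s, where s uses the divisor n."""
    n = len(sample)
    mean = sum(sample) / n
    s_n = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    return (mean - mu0) / s_n

def modern_t(sample, mu0):
    """Modern form: (mean - mu0) / (s / sqrt(n)), where s uses the divisor n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return (mean - mu0) / (s / math.sqrt(n))

# Made-up observations, used only to illustrate the identity.
sample = [1.2, 0.7, 1.9, 1.4, 0.3, 1.1]
z = student_z(sample, 0.0)
t = modern_t(sample, 0.0)

# t and z differ only by the factor sqrt(n - 1).
print(math.isclose(t, z * math.sqrt(len(sample) - 1)))
```

The two statistics carry the same information; the change of scale is what ties *t* to the number of degrees of freedom, *n* – 1.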

In 1925 Fisher’s version of Student’s distribution
took off. Fisher’s *Statistical Methods for Research Workers*
presented new uses for the tables and made the tables generally
available. His “Applications of ‘Student’s’ Distribution”,
*Metron*, **5**, 90-104 provided the theory of the
applications. The familiar terminology soon developed. In 1925 Fisher referred
to “Student’s distribution” (without the “*t*”).
The phrase “Student’s” *t*-distribution
appears in 1929 in *Nature* (*OED*). The phrase “*t* distribution” appears in A. T. McKay,
“Distribution of the coefficient of variation and the extended ‘*t*’ distribution,”
*J. Roy. Stat. Soc.,* **95**, (1932). The term “*t*-test” is found in 1932 in the fourth
edition of Fisher’s *Statistical Methods for
Research Workers*: “The validity of the *t*-test, as a test of this hypothesis, is therefore absolute”
(p. 116) (*OED*).

This entry was contributed by John Aldrich. See SMALL SAMPLE problem.

**STUDENTIZATION.** According to Hald (1997, p. 669), Student, i.e. William Sealy Gosset
(1876-1937), used the term *Studentization* in a letter to E. S. Pearson
of Jan. 29, 1932.

At a meeting in 1934 R. A. Fisher
described the relation between his work and Student’s: "It was "Student" himself
who took the really novel step, which had in fact revolutionized the theory
of errors.... All that he [Fisher] had added to it was to "studentize" a number
of analogous problems..." (*Journal of the Royal Statistical
Society*, **97**, p. 619)

*Studentized D^{2} statistic*
is found in R. C. Bose and S. N. Roy, "The exact distribution of the Studentized
D

**STURM’S THEOREM** appears in 1836 in the title *Du Theoreme de
M. Sturm, et de ses Applications Numeriques* by M. E. Midy [James
A. Landau].

*Sturm’s theorem* appears in English in 1841 in the title
*Mathematical Dissertations, for the use of students in the modern
analysis; with improvements in the practice of Sturm’s Theorem, in
the theory of curvature, and in the summation of infinite series*
by J. R. Young [James A. Landau].

**SUBFACTORIAL** was introduced in 1878 by W. Allen Whitworth in
*Messenger of Mathematics* (Cajori vol. 2, page 77).

**SUBFIELD** is found in "On the Base of a Relative Number-Field,
with an Application to the Composition of Fields," G. E. Wahlin,
*Transactions of the American Mathematical Society,* Vol. 11,
No. 4. (Oct., 1910).

**SUBGROUP.** Felix Klein used the term *Untergruppe.*

*Subgroup* appears in 1881 in Arthur Cayley,
"On the Schwarzian Derivative, and the Polyhedral Functions,"
*Transactions of the Cambridge Philosophical Society*:
"But there is no sub-group of an order divisible by 5; and hence, these
two transformations being identified with the two substitutions, the other
transformations correspond each of them to a determinate substitution"
[University of Michigan Historical Math Collection].

**SUBRING** is found in English in 1937 in the phrase *invariant
subring* in *Modern Higher Algebra* by A. A. Albert
(OED2).

**SUBSET.** Cantor used the word *subset* (in the sense that
"proper subset" is now used) in "Ein Beitrag zur
Mannigfaltigkeitslehre," *Journal für die reine und angewandte
Mathematik* 84 (1878).

*Subset* occurs in English in "A Simple Proof of the Fundamental
Cauchy-Goursat Theorem," Eliakim Hastings Moore, *Transactions of
the American Mathematical Society,* Vol. 1, No. 4. (Oct., 1900).

**SUBTANGENT.** Huygens coined “soutangente” in his “Traité de la lumière” (1690), p. 173,
and used it again in his letter to Leibniz, December 19, 1690 (see G. W. Leibniz, *Sämtliche Schriften und Briefe,*
series III, vol. 4, p. 684); Leibniz started to use “soutangente” in his letter to Huygens, February 20/ March 2, 1691 (vol. 5, p. 63). In October 1691 he sent a paper to Huygens: “Methodus qua innumerarum Linearum Constructio ex data proprietate Tangentium seu aequatio inter Abscissam et Ordinatam ex dato valore Subtangentialis, exhibetur” (vol. 5, p. 181).

Vol. 5 is available online.

In his article “Supplementum geometriae practicae” (*Acta Eruditorum,* April 1693, 178-180; GM 5, 285-288) Leibniz wrote: “subtangentialis (ut Hugeniano verbo utar) seu portio axis intercepta inter tangentem & ordinatam sit t” (GM 5, 287).

[This entry was contributed by Siegmund Probst.]

**SUBTRACT.** When Fibonacci (1202) wishes to say "I subtract," he
uses some of the various words meaning "I take": *tollo,
aufero,* or *accipio.* Instead of saying "to subtract" he
says "to extract."

In English, Chaucer used *abate* around 1391 in *Treatise on
the Astrolabe*: "Abate thanne thees degrees And minutes owt of 90"
(OED2).

In a manuscript written by Christian of Prag (c. 1400), the word
"subtraction" is at first limited to cases in which there is no
"borrowing." Cases in which "borrowing" occurs he puts under the
title *cautela* (caution), and gives this caption the same
prominence as *subtractio.*

In *Practica* (1539) Cardano used *detrahere* (to draw or
take from).

In 1542 in the *Ground of Artes* Robert Recorde used
*rebate*: "Than do I rebate 6 out of 8, & there resteth 2."

In 1551 in *Pathway to Knowledge* Recorde used *abate*:
"Introd., And if you abate euen portions from things that are equal,
those partes that remain shall be equall also" (OED2).

Digges (1572) writes "to subduce or substray any sume, is wittily to pull a lesse fro a bigger number."

Schoner, in his notes on Ramus (1586 ed., p. 8), uses both
*subduco* and *tollo* for "I subtract."

In his arithmetic, Boethius uses *subtrahere,* but in geometry
attributed to him he prefers *subducere.*

The first citation for *subtract* in the OED2 is in 1557 by
Robert Recorde in *The whetstone of witte:* "Wherfore I subtract
16. out of 18."

Hylles (1592) used "abate," "subtact," "deduct," and "take away" (Smith vol. 2, pages 94-95).

From Smith (vol. 2, page 95):

The word "subtract" has itself had an interesting history. The Latin *sub* appears in French as *sub, soub, sou,* and *sous, subtrahere* becoming *soustraire* and *subtractio* becoming *soustraction.* Partly because of this French usage, and partly no doubt for euphony, as in the case of "abstract," there crept into the Latin works of the Middle Ages, and particularly into the books printed in Paris early in the 16th century, the form *substractio.* From France the usage spread to Holland and England, and from each of these countries it came to America. Until the beginning of the 19th century "substract" was a common form in England and America, and among those brought up in somewhat illiterate surroundings it is still to be found. The incorrect form was never popular in Germany, probably because of the Teutonic exclusion of international terms.

Tonstall (1522) devoted 15 pages to *Subductio.* He wrote, "Hanc
autem eandem, uel deductionem uel subtractionem appellare Latine
licet" (1538 ed., p. 23; 1522 ed., fol. E 2, r).

Gemma Frisius (1540) has a chapter *De Subductione siue
Subtractione.*

Clavius (1585 ed., p. 26) says "Subtractio est ... subductio."

See also ADDITION.

**SUBTRAHEND** is an abbreviation of the Latin *numerus
subtrahendus* (number to be subtracted).

**SUCCESSIVE INDUCTION.** This term was suggested by Augustus De
Morgan in his article "Induction (Mathematics)" in the *Penny
Cyclopedia* of 1838. See also MATHEMATICAL INDUCTION,
INDUCTION, COMPLETE INDUCTION.

**SUFFICIENCY, SUFFICIENT STATISTIC** and **CRITERION OF SUFFICIENCY** all appear in 1922 in R. A. Fisher’s
"On the Mathematical Foundations of Theoretical Statistics",
*Philosophical Transactions of the Royal Society of London,* Ser. A, 222,
309-368:

The statistic chosen should summarise the whole of the relevant information supplied by the sample. This may be called the Criterion of Sufficiency. (p. 316)

In the case of the normal curve of distribution it is evident that the second moment is a sufficient statistic for estimating the standard deviation. (p. 359)

The concept of sufficiency was already emerging in 1920 (p. 769) when Fisher wrote of the sample variance that "The whole of the information respecting σ, which a sample provides is summed up [in its value]."

This entry was contributed by John Aldrich. See also PETERS’ METHOD.

**SUM.** Nicolas Chuquet used *some* in his *Triparty en la
Science des Nombres* in 1484.

The term **SUMMABLE** (referring to a function that is Lebesgue
integrable such that the value of the integral is finite) was
introduced by Lebesgue (Klein, page 1045).

**SUPPLEMENT.** "Supplement of a parallelogram" appears in English
in 1570 in Sir Henry Billingsley’s translation of Euclid’s
*Elements.*

In 1704 *Lexicon Technicum* by John Harris has "supplement of an
Ark."

In 1796 Hutton *Math. Dict.* has "The complement to 180° is
usually called the supplement."

In 1798 Hutton in *Course Math.* has "supplemental arc" (one of
two arcs which add to a semicircle) (OED2).

Supplement II to the 1801 *Encyclopaedia Britannica* has, "The
supplement of 50° is 130°; as the complement of it is 40°" (OED2).

In 1840, Lardner in *Geometry* vii writes, "If a quadrilateral
figure be inscribed in a circle, its opposite angles will be
supplemental" (OED2).

*Supplementary angle* is found in 1820 in the fourth edition of *Elements
of Geometry and Plane Trigonometry* by John Leslie. [Google print search]

**SURD.** According to Smith (vol. 2, page 252), al-Khowarizmi
(c. 825) referred to rational and irrational numbers as 'audible' and
'inaudible', respectively.

The Arabic translators in the ninth century translated the Greek
*rhetos* (rational) by the Arabic *muntaq* (made to speak)
and the Greek *alogos* (irrational) by the Arabic *asamm*
(deaf, dumb). See e. g. W. Thomson, G. Junge, *The Commentary of
Pappus on Book X of Euclid’s Elements,* Cambridge: Harvard
University Press, 1930 [Jan Hogendijk].

This was translated as *surdus* ("deaf" or "mute") in Latin.

As far as is known, the first European to adopt this terminology was Gherardo of Cremona (c. 1150).

Fibonacci (1202) adopted the same term to refer to a number that has no root, according to Smith.

*Surd* is found in English in Robert Recorde’s *The Pathwaie
to Knowledge* (1551): "Quantitees partly rationall, and partly
surde" (OED2).

According to Smith (vol. 2, page 252), there has never been a general
agreement on what constitutes a surd. It is admitted that a number
like √2 is a surd, but there have been prominent writers who have
not included √6, since it is equal to √2 × √3. Smith
also called the word *surd* "unnecessary and ill-defined" in his
*Teaching of Elementary Mathematics* (1900).

G. Chrystal in *Algebra,* 2nd ed. (1889) says that "...a surd
number is the incommensurable root of a commensurable number," and
says that √*e* is not a surd, nor is √(1 + √2).

The term **SURFACE INTEGRAL** was used in 1873 by James Clerk Maxwell in a
*Treatise on Electricity and Magnetism*,
p. 12 in the paragraph "Line-integration appropriate
to forces, surface-integration to fluxes." The *OED*’s earliest reference is to Arthur Cayley in 1875,
*Math. Papers* IX. p. 321, "On the Prepotential Surface-integral." The concept is much older: see the entry
DIVERGENCE THEOREM.

**SURJECTION** and **SURJECTIVE.** See the entry INJECTION,
SURJECTION and BIJECTION.

The term **SURREAL NUMBER** was introduced by Donald Ervin Knuth (1938- ) in his book *Surreal
numbers: How two ex-students turned on to pure mathematics and found total
happiness* (1974). John Horton Conway (1937- ), who introduced the
concept, later wrote, “I wish I'd invented that name.”
For further information see Surreal Numbers.

**SURVIVAL FUNCTION.** David (2001) gives E. L. Kaplan and Paul Meier "Nonparametric
Estimation from Incomplete Observations," *Journal of the American Statistical Association*,
**53**, (1958), 457-481. However a *JSTOR* search found earlier occurrences
in the writings of Alfred J. Lotka, e.g. *p*(*a*), the function denoting
the probability, at birth, of surviving to age *a*, is called the *life
table* (*survival function*) in his "Biometric Functions in a Population
Growing in Accordance with a Prescribed Law," *Proceedings of the National
Academy of Sciences*, **15**, (1929), 793-798 and the *survival function*
in his "A Contribution to the Theory of Self-Renewing Aggregates, With Special
Reference to Industrial Replacement," *Annals of Mathematical Statistics*,
**10**, (1939), 1-25. The mathematical treatment of "survivorship" is much
older. See chapter 25 of Hald (1990) on "The Insurance Mathematics of de Moivre
and Simpson, 1725-1756."

See LIFE TABLE and RENEWAL THEORY.

**SYLOW’S THEOREM** refers to a result in
Ludwig Sylow’s
"Théorèmes sur les groupes de substitutions,"
*Mathematische Annalen*, **5**, (1872), pp. 584-594. The expression
*Sylow’s Theorem* is found in English in 1893 in W. Burnside’s "Notes on the Theory of Groups of Finite
Order. I: On the Proof of Sylow’s Theorem. II: On the Possibility of Simple
Groups whose Orders are the Products of Four Primes," *Proceedings of the London Mathematical Society* XXV pp. 9-18 (OED2).

The term **SYMMEDIAN** was introduced in 1883 by Philbert Maurice
d'Ocagne (1862-1938) [Clark Kimberling].

**SYMMEDIAN POINT.** Emile Lemoine (1840-1912) used the term
*center of antiparallel medians.*

The proposal to name the point after Ernst Wilhelm Grebe (1804-1874)
came from E. Hain ("Ueber den Grebeschen Punkt," *Archiv der
Mathematik und Physik* 58 (1876), 84-89). Afterwards, the term
*Grebe’schen Punkt* appeared many times in the *Jahrbuch ueber
die Fortschritte der Mathematik* by reviewers such as Dr. Schemmel
(Berlin, 1875), Prof. Mansion (Gent, 1881), Prof. Lampe (Berlin,
1881), and Dr. Lange (Berlin, 1885) [Peter Schreiber, Julio
González Cabillón].

In 1884, Joseph Jean Baptiste Neuberg (1840-1926) gave it the name
*Lemoine point,* for Emile Michel Hyacinthe Lemoine (1840-1912).

The point was thus called the Lemoine point in France and the Grebe point in Germany [DSB].

*Symmedian point* was coined by Robert Tucker (1832-1905) in the
interest of uniformity and amity.

**SYMMETRIC** (Of a binary relation)
Bertrand Russell wrote in "On the Notion of Order," *Mind*, **10**, (1901), p. 32:
"When ARB implies BRA, I call R *symmetrical* relation."

**SYMMETRIC DIFFERENCE** (of sets). The *OED*
cites M. H. Stone "Theory of representations for Boolean Algebras," *Trans.
Amer. Math. Soc.* XL. (1936), p. 38: "The Union (modulo 2), or symmetric
difference, of two classes is the class of objects belonging to one or the other,
but not to both, of those classes."
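Stone's description ("one or the other, but not to both") and his alternative name "union (modulo 2)" can be illustrated with a small sketch. The sets, the universe and the function names below are made up for the example; the first function follows the verbal definition, the second adds membership indicators modulo 2.

```python
def symmetric_difference(a, b):
    """Elements belonging to one or the other of a and b, but not to both."""
    return (a | b) - (a & b)

def union_mod_2(a, b, universe):
    """Same set, obtained by adding membership indicators modulo 2."""
    return {x for x in universe if ((x in a) + (x in b)) % 2 == 1}

# Hypothetical classes chosen only for illustration.
a = {1, 2, 3, 4}
b = {3, 4, 5}
universe = set(range(10))

result = symmetric_difference(a, b)
print(result == {1, 2, 5})
print(result == union_mod_2(a, b, universe))
print(result == a ^ b)  # Python's built-in set operator gives the same set
```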

**SYMMETRIC FUNCTION** appears in S. F. Lacroix,
*Traité du calcul différentiel et du calcul intégral,* vol. 1 (1797), p. 277:
"les fonctions dont je parle, sont celles qui renferment toutes les racines
[d'une équation] combinées d'une manière semblable, soit entr'elles, soit
avec d'autres quantités, et que pour cela je nommerai *fonctions symétriques*"
[Joao Caramalho Domingues].

The term **SYMPLECTIC GROUP** was proposed in 1939 by
Hermann Weyl in *The Classical Groups.* He wrote on page 165:

The name "complex group" formerly advocated by me in allusion to line complexes, as these are defined by the vanishing of antisymmetric bilinear forms, has become more and more embarrassing through collision with the word "complex" in the connotation of complex number. I therefore propose to replace it by the corresponding Greek adjective "symplectic." Dickson calls the group the "Abelian linear group" in homage to Abel who first studied it.

[This information was provided by William C. Waterhouse.]

According to *Lectures on Symplectic Geometry* by Ana Cannas da
Silva, "the word symplectic in mathematics was coined by Weyl who
substituted the Latin root in complex by the corresponding Greek
root, in order to label the symplectic group. Weyl thus avoided that
this group connoted the complex numbers, and also spared us from much
confusion had the name remained the former one in honor of Abel:
abelian linear group."

**SYNTHETIC DIVISION** is found in 1850 in
*Theoretical and Practical Treatise on Algebra*
by Horatio Nelson Robinson:
"This last operation is called *synthetic division.*"
[Google print search]

**SYNTHETIC GEOMETRY** appears in Gigon, "Bericht über: Jacob
Steiner’s Vorlesungen über synthetische Geometrie, bearbeitet
von Geiser und Schröter," *Nouv. Ann.* (1868).

*Synthetic geometry* appears in English in 1870 in *Report on education* by
John Wesley Hoyt, published by the U. S. Government Printing Office:
"First year’s course in mathematical section. Theory of numbers;
differential and integral calculus; theory of functions, with
repetitions;
analytical geometry of the plane; experimental physics, with
repetitions; experimental chemistry, with repetitions; descriptive
geometry,
with exercises and repetitions; synthetic geometry; machine-drawing"
[University of Michigan Digital Library].

The term **SYSTEM OF EQUATIONS** is found in 1843 in
"Chapters in the Analytical Geometry of (n) Dimensions" by Arthur Cayley in the *Cambridge Mathematical Journal,*
vol. IV: "On the determination of linear equations in
*x*_{1}, *x*_{2},..., *x*_{n} which are satisfied by the
values of these quantities derived from given systems of linear
equations."
[This citation, from the University of Michigan Digital Library, is a
chapter title and thus appears in italics in the original.]

**SYZYGY** was coined as a mathematical term by James Joseph Sylvester. The word appears in 1850
in *Cambr. & Dubl. Math. Jrnl.* V. 276:
"The members of any group of functions, more than two in number, whose nullity is implied
in the relation of double contact, ... must be in syzygy. Thus *PQ, PQR, QR,* must form a syzygy." [OED2]