Earliest Known Uses of Some of the Words of Mathematics (S)

Last revision: April 14, 2020.

ST. ANDREW’S CROSS is the term used by Florian Cajori for the multiplication symbol X. It appears in 1916 in his "William Oughtred, A Great Seventeenth-Century Teacher of Mathematics."

St. Andrew’s cross is found in 1615, although not in a mathematical context, in Crooke, Body of Man: "[They] doe mutually intersect themselues in the manner of a Saint Andrewes crosse, or this letter X" [OED].

The ST. PETERSBURG PARADOX was formulated by Niklaus Bernoulli in 1713: see problem 5 in the first letter of Correspondence of Nicholas Bernoulli concerning the St Petersburg game with Montmort, Daniel Bernoulli and Cramer (translation by Richard J. Pulskamp). The association with St. Petersburg came about because the most prominent discussion was published there: this was Daniel Bernoulli’s "Specimen Theoriae Novae de Mensura Sortis," Commentarii Academiae Scientiarum Imperialis Petropolitanae, 5, 175-192 (1738). The paper has been translated as "Exposition of a New Theory on the Measurement of Risk," Econometrica, 22, (1954), 23-36.

In 1768 D'Alembert, in Opuscules Mathématiques vol. IV, p. 78 (English translation by Richard J. Pulskamp), used the phrase "le probléme de petersbourg." J. Bertrand’s Calcul des probabilités (1889, p. 62) has a section on the "Paradoxe de Saint-Pétersbourg" and the "paradox" appears in English in J. M. Keynes’s A Treatise on Probability (1921).

[John Aldrich, based on Jacques Dutka, "On the St. Petersburg paradox," Arch. Hist. Exact Sci. 39, No.1, 1988 and David (2001)]


SADDLE POINT is found in 1922 in A Treatise on the Theory of Bessel Functions by G. N. Watson [OED].

SAGITTA was used in Latin by Fibonacci (1220) to mean the versed sine (Smith, vol. 2). See VERSED SINE.

In 1726 Alberti’s Archit. has: "The .. Line .. from the middle Point of the Chord up to the Arch, leaving equal Angles on each Side, is call'd the Sagitta" [OED].

Webster’s New International Dictionary (1909) has the following definition for sagitta: "the distance from a point in a curve to the chord; also, the versed sine of an arc; -- so called (by Kepler) from its resemblance to an arrow resting on the bow and string; also, Obs., an abscissa."

The 1961 third edition of the same dictionary has the following definition: "the distance from the midpoint of an arc to the midpoint of its chord."
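
The 1961 definition and the older "versed sine" sense agree once a radius is fixed. As a minimal sketch (the function names are illustrative and not taken from any of the cited works), for a circle of radius r and an arc subtending an angle θ at the centre, the sagitta is r(1 - cos(θ/2)), that is, r times the versed sine of half the angle; it can also be computed from the chord length:

```python
import math


def sagitta_from_angle(radius, angle):
    """Sagitta of an arc subtending `angle` radians at the centre.

    Equals radius * versin(angle / 2), where versin(x) = 1 - cos(x).
    """
    return radius * (1.0 - math.cos(angle / 2.0))


def sagitta_from_chord(radius, chord):
    """Sagitta of the minor arc cut off by a chord of the given length."""
    half = chord / 2.0
    if half > radius:
        raise ValueError("chord cannot be longer than the diameter")
    # Distance from the centre to the chord's midpoint is sqrt(r^2 - (c/2)^2).
    return radius - math.sqrt(radius ** 2 - half ** 2)
```

For example, in a circle of radius 5 a chord of length 8 lies at distance 3 from the centre, so the sagitta is 2.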

SALIENT ANGLE. The OED has a 1687 citation for Angle Saliant.

In 1781 Sir John T. Dillon wrote in Travels Through Spain: "He could find nothing which seemed to confirm the opinion relating to the salient and reentrant angles" [OED].

Mathematical Dictionary and Cyclopedia of Mathematical Science (1857) has: "SALIENT ANGLE of a polygon, is an interior angle, less than two right angles."



SAMPLE PATH. This term seems to have originated in sequential analysis and then was transferred to stochastic processes in general. JSTOR gives one pre-1950 reference, to Anscombe (1949) "Large-Sample Theory of Sequential Estimation," Biometrika, 36, 455-458 [John Aldrich].

SAMPLE SPACE was introduced into statistical theory by J. Neyman and E. S. Pearson, "On the Problem of the Most Efficient Tests of Statistical Hypotheses," Philosophical Transactions of the Royal Society, A, 231. (1933), 289-337. It was associated with the representation of a sample comprising n numbers as a point in n-dimensional space, a representation R. A. Fisher had exploited in articles going back to 1915. W. Feller used this notion of sample space in his "Note on regions similar to the sample space," Statist. Res. Mem., Univ. London 2, 117-125 (1938) but in the Introduction to Probability Theory and its Applications, volume one (1950) Feller used the term quite abstractly for the set of outcomes of an experiment. He attributed the general concept to Richard von Mises (1883-1953) who had referred to the Merkmalraum (label space) in writings on the foundations of probability from 1919 onwards: see his "Grundlagen der Wahrscheinlichkeitsrechnung," Math. Zeit. 5, (1919), 52-99.

This entry was contributed by John Aldrich. See also EXPERIMENT.

SAMPLING DISTRIBUTION. R. A. Fisher seems to have introduced this term. It appears incidentally in a 1922 paper (JRSS, 85, 598) and then in the title of his 1928 paper "The General Sampling Distribution of the Multiple Correlation Coefficient", Proc. Roy. Soc. A, 213, p. 654.



SCALAR QUANTITIES. In 1646 Viète used magnitudines scalares to refer to a set of quantities in continual geometrical proportion. The term was not adopted by others. [James A. Landau]

SCALENE. In Sir Henry Billingsley’s 1570 translation of Euclid’s Elements scalenum is used as a noun: "Scalenum is a triangle, whose three sides are all unequall."

In 1642 scalene is found in a rare use as a noun, referring to scalene triangle in Song of Soul by Henry More: "But if 't consist of points: then a Scalene I'll prove all one with an Isosceles."

Scalenous is found in 1656 in Stanley, Hist. Philos. (1687): "A Pyramid consisteth of four triangles,..each whereof is divided..into six scalenous triangles."

Scalene occurs as an adjective in 1684 in Angular Sections by John Wallis: "The Scalene Cone and Cylinder."

The earliest use of scalene as an adjective to describe a triangle is in 1734 in The Builder’s Dictionary. (All citations are from the OED.)

SCATTER DIAGRAM. According to H. L. Moore, Laws of Wages (1911), the term "scatter diagram" was due to Karl Pearson. A JSTOR search finds the term first appearing in a 1906 article in Biometrika (which Pearson edited), "On the Relation Between the Symmetry of the Egg and the Symmetry of the Embryo in the Frog (Rana Temporaria)" by J. W. Jenkinson. However the term only came into wide use in the 1920s when it began to appear in textbooks, e.g. F. C. Mills, Statistical Methods of 1925. OED gives the following quotation from Mills: "The equation to a straight line, fitted by the method of least squares to the points on the scatter diagram, will express mathematically the average relationship between these two variables" (X. 366) [John Aldrich].

Scattergram is found in 1938 in A. E. Waugh, Elem. Statistical Method: "This is the method of plotting the data on a scatter diagram, or scattergram, in order that one may see the relationship" [OED].

Scatterplot is found in 1939 in Statistical Dictionary of Terms and Symbols by Kurtz and Edgerton (David, 1998).

For the history of this form of graphical representation of data see Michael Friendly, Daniel Denis "The early origins and development of the scatterplot," Journal of the History of the Behavioral Sciences, 41, Issue 2 (2005), pp. 103-130.

SCHLICHT is a loan-word from the German literature on Funktionentheorie (complex analysis). It entered the English literature in the 1920s and is still used. A schlicht function “takes no value more than once” explains J. E. Littlewood in his 1925 article “On inequalities in the theory of functions,” Proc. London Math. Soc., 23, p. 481. Authors of English textbooks have usually preferred other terms. “Biuniform” is used by P. Dienes in The Taylor Series (1931). “Simple” is used by E. T. Copson in An Introduction to the Theory of Functions of a Complex Variable (1935), presumably because in ordinary non-mathematical German schlicht means “simple” or “plain.” Copson reports that the French term is “univalent.” L. V. Ahlfors, Complex Analysis (1953, p. 172), remarks that schlicht “lacks an adequate translation.” Ahlfors prefers univalent and that is now the commonest term in English.



The term SCHUR COMPLEMENT was introduced in 1968 by Emilie V. Haynsworth (1916-1985) and named for the German mathematician Issai Schur (1875-1941) and his lemma of 1917 in “Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind,” Journal für die reine und angewandte Mathematik, 147, (1917), 205-232. However the historical notes in chapter 0 of Fuzhen Zhang (ed.) The Schur Complement and Its Applications (2005) identify “implicit manifestations” in the work of Sylvester in 1851 and even Laplace in 1812.



SCHWARZ’ THEOREM on the equality of mixed partial derivatives. See CLAIRAUT’S THEOREM, SCHWARZ’ THEOREM, and YOUNG’S THEOREM.

SCIENTIFIC NOTATION. In 1895 in Computation Rules and Logarithms Silas W. Holman referred to the notation as "the notation by powers of ten." In the preface, which is dated August 1895, he wrote: "The following pages contain ... an explanation of the use of the notation by powers of ten ... the notation by powers of 10, as in the explanation here given. It seems unfortunate that this simple notation, so useful in computation and so great an aid in the explanation of numerical relations, is not universally incorporated into arithmetical instruction." [James A. Landau]

In A Scrap-Book of Elementary Mathematics (1908) by William F. White, the notation is called the index notation.

Scientific notation is found in 1915 in American Mathematical Monthly 22/328: “The work on logarithms is prefaced by use of the notation 2.417 X 10^-8, etc., so common in scientific work. ... No rule for characteristics is needed if the student gets to thinking of every number as expressible in this ‘scientific notation.’” [OED]

According to Webster’s Second New International Dictionary (1934), numbers in this format are sometimes called condensed numbers.

Other terms are exponential notation and standard notation.

SCORE, METHOD OF SCORING and SCORE TEST in Statistics. The derivative of the log-likelihood function played an important part in R. A. Fisher’s theory of maximum likelihood from its beginnings in the 1920s but the name score is more recent. The "score" was originally associated with a particular genetic application; a family is assigned a score based on the number of children of each category and there were different ways of scoring associated with different ways of estimating linkage. In a 1935 paper ("The Detection of Linkage with Dominant Abnormalities," Annals of Eugenics, 6, 193) Fisher wrote that, because of the efficiency of maximum likelihood, the "ideal score" is provided by the derivative of the log-likelihood function. In 1948 C. R. Rao used the phrase efficient score (Proc. Cambr. Philos. Soc. 44, 50-57) and score by itself (J. Roy. Statist. Soc., B, 10: 159-203) when writing about maximum likelihood in general, i.e. without reference to the linkage application. Today "score" is so established in this derivative of the log-likelihood sense that the phrases "non-ideal score" or "inefficient score" would convey nothing.

In 1946 - still in the genetic context - Fisher ("A System of Scoring Linkage Data, with Special Reference to the Pied Factors in Mice," Amer. Nat., 80: 568-578) described an iterative method for obtaining the maximum likelihood value. Rao’s 1948 J. Roy. Statist. Soc. B paper treats the method in a more general framework and the phrase "Fisher’s method of scoring" appears in a comment by Hartley. Fisher had already used the method in a general context in his 1925 "Theory of Statistical Estimation" paper (Proc. Cambr. Philos. Soc. 22: 700-725) but it attracted neither attention nor name.

In 1948 Rao introduced a test which he called the efficient score test in his "Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation," Proc. Cambr. Philos. Soc., 44, 50-57. The one-parameter version had already been discussed by Wald in "Some Examples of Asymptotically Most Powerful Tests," Annals of Mathematical Statistics, 12, (1941), 396-408. While "Rao’s efficient score test" is sometimes seen, the terms "score test" and "Lagrange multiplier test" are more common.

This entry was contributed by John Aldrich, with some information taken from David (1995). See the entries LAGRANGE MULTIPLIER TEST and WALD TEST.

SECANT (in trigonometry) was introduced by Thomas Fincke (1561-1656) in his Thomae Finkii Flenspurgensis Geometriae rotundi libri XIIII, Basileae: Per Sebastianum Henricpetri, 1583. (His name is also spelled Finke, Finck, Fink, and Finchius.) Fincke wrote secans in Latin.

Vieta (1593) did not approve of the term secant, believing it could be confused with the geometry term. He used Transsinuosa instead (Smith vol. 2, page 622).

Secant is found in English with its meaning in trigonometry in 1594 in Exercises by Thomas Blundeville: “The Table of Secants.” [OED]

Secant is found in English with its modern geometric meaning in 1684 in Elementary Geometry: “From the Center D, draw the Secant DC.” [OED]

SECOND DIFFERENCE is found in 1777 in "A Method of finding the Value of an infinite Series of decreasing Quantities of a certain Form," by Francis Maseres in the Philosophical Transactions of the Royal Society vol. 67: "And 2dly, let these numbers be so related to each other, that they not only shall form a decreasing progression themselves, but that their differences, a-b, b-c, c-d, d-e, e-f, f-g, g-h, &c. shall also form a decreasing progression, so that b-c shall be less than a-b, and c-d than b-c, and d-e than c-d, and so on of the following differences; and likewise, that the differences of these differences (which may be called the second differences of the original numbers a, b, c, d, e, f, g, h, &c.) shall form a decreasing progression; and that the differences of those second differences, or the third differences of the original numbers a, b, c, d, e, f, g, h, &c. shall also form a decreasing progression; and in like manner, that the differences of the said third differences, or the fourth differences, of the original numbers a, b, c, d, e, f, g, h, &c. and the fifth and sixth differences, and all higher differences, of the same numbers, shall also form decreasing progressions."
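
The construction Maseres describes can be sketched in a few lines. This is only an illustration: the helper names and the sample progression 1, 1/2, 1/3, ... are assumptions chosen here, not taken from his paper. Following his convention, each difference is an earlier term minus the next one (a-b, b-c, ...), and the second differences are the differences of those differences.

```python
from fractions import Fraction


def differences(seq):
    """First differences in Maseres's sense: a-b, b-c, c-d, ..."""
    return [x - y for x, y in zip(seq, seq[1:])]


def nth_differences(seq, order):
    """Apply `differences` repeatedly; order=2 gives the second differences."""
    for _ in range(order):
        seq = differences(seq)
    return seq


def is_decreasing(seq):
    """True when every term is strictly smaller than the one before it."""
    return all(y < x for x, y in zip(seq, seq[1:]))


# Illustrative decreasing progression: 1, 1/2, 1/3, 1/4, 1/5, 1/6
terms = [Fraction(1, n) for n in range(1, 7)]
second = nth_differences(terms, 2)  # 1/3, 1/12, 1/30, 1/60
```

For this sample progression the terms, their first differences and their second differences are each decreasing, which is the kind of condition the quoted passage imposes.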


SELF-CONJUGATE. Kramer (p. 388) says Galois used this term, referring to a normal subgroup.

The term SEMI-CUBICAL PARABOLA was coined by John Wallis (Cajori 1919, page 181).

The term SEMIGROUP apparently was introduced in French as semi-groupe by J.-A. de Séguier in Élem. de la Théorie des Groupes Abstraits (1904).

SEMI-INVARIANT or HALF-INVARIANT. T. N. Thiele (1838-1910) introduced the concept in a Danish work, called in English, The General Theory of Observations (1889). His "half-invariants" did not fare well in Britain. They were noticed by Karl Pearson in his "Contributions to the Mathematical Theory of Evolution. II. Skew Variation in Homogeneous Material," Philosophical Transactions of the Royal Society A, 186, (1895), p. 412 but only to be dismissed. Arne Fisher, a Danish emigrant to the United States, gave an account of semi-invariants in his Mathematical Theory of Probabilities (2nd edition, 1922); "semi-invariant" was the more common English rendering. In 1929 Ronald Fisher began publishing on cumulative moment functions (see CUMULANT) and several correspondents told him that these were identical to Thiele’s semi-invariants. See the index entry cumulant in J. H. Bennett Statistical Inference and Analysis: Selected Correspondence of R. A. Fisher (1990). Fisher did not agree and did not change his terminology, which became the accepted one.

This entry was contributed by John Aldrich based on S L Lauritzen (2002) Thiele: Pioneer in Statistics.

SENTENTIAL CALCULUS is found in English in 1937 in a translation by Amethe Smeaton of The Logical Syntax of Language by Rudolf Carnap: "Primitive sentences of the sentential calculus" [OED].

SEPARABLE is found in 1803 in A General History of Mathematics translated from the French of John Bossut: “No one could completely accomplish the object: but a great number of cases were pointed out, in which the indeterminates are separable, and in which the equations may consequently be resolved by the quadratures of curves.” [Google print search]

SEQUENCE. The OED shows a use by Sylvester in 1882 in the American Journal of Mathematics with the "rare" definition of a succession of natural numbers in order.

Sequence is found in 1891 in a translation by George Lambert Cathcart of the German An introduction to the study of the elements of the differential and integral calculus by Axel Harnack: "What conditions must be fulfilled in order that for continually diminishing values of Δx, the quotient ... may present a continuous sequence of numbers tending to a determinate limiting value: zero, finite or infinitely great?" [University of Michigan Historical Math Collection; the term may be considerably older.]

SEQUENTIAL ANALYSIS was developed in the Second World War by statisticians in the USA and in Britain. In 1945 Abraham Wald published his "Sequential Tests of Statistical Hypotheses," Annals of Mathematical Statistics, 16, (1945), 117-186 and followed it up with a book Sequential Analysis in 1947. The terms sequential analysis, sequential test and sequential probability ratio test all appear in the 1945 article. (David (2001))

SERIAL CORRELATION. The term was introduced by G. U. Yule in his 1926 paper "Why Do We Sometimes Get Nonsense Correlations between Time-series? A Study in Sampling and the Nature of Time-series," Journal of the Royal Statistical Society, 89, 1-69: "I propose to term such correlations, r_1 between u_s and u_(s+1), r_2 between u_s and u_(s+2), etc., where u_s is the value of the variable in year s, the serial correlations for the given series." (p. 14) (David 2001).
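
Read literally, Yule's definition makes r_k the ordinary correlation coefficient between the pairs (u_s, u_(s+k)). A minimal sketch under that reading follows; the function names are illustrative, and modern texts often use a slightly different estimator based on a single overall mean and variance.

```python
from math import sqrt


def pearson(xs, ys):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def serial_correlation(u, k):
    """r_k: correlation between u_s and u_(s+k) over all available pairs."""
    if not 0 < k < len(u) - 1:
        raise ValueError("lag must leave at least two pairs")
    return pearson(u[:-k], u[k:])
```

A steadily rising series such as 1, 2, 3, 4, 5 gives r_1 = 1, while a strictly alternating series such as 1, -1, 1, -1, 1, -1 gives r_1 = -1 and r_2 = 1.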

SERIES. According to Smith (vol. 2, page 481), "The early writers often used proportio to designate a series, and this usage is found as late as the 18th century."

John Collins (1624-1683) wrote to James Gregory on Feb. 2, 1668/1669, "...the Lord Brouncker asserts he can turne the square roote into an infinite Series" (DSB, article: "Newton").

James Gregory wrote to John Collins on Feb. 16, 1671 [apparently O. S.]: "I do not question that all equations may be formed by tables, but I doubt exceedingly if all equations can be solved by the help only of the tables of logarithms and sines without serieses."

According to Smith (vol. 2, page 497), "The change to the name 'series' seems to have been due to writers of the 17th century. ... Even as late as the 1693 edition of his algebra, however, Wallis used the expression 'infinite progression' for infinite series."

In the English translation of Wallis' algebra (translated by him and published in 1685), Wallis wrote:

Now (to return where we left off:) Those Approximations (in the Arithmetick of Infinites) above mentioned, (for the Circle or Ellipse, and the Hyperbola;) have given occasion to others (as is before intimated,) to make further inquiry into that subject; and seek out other the like Approximations, (or continual approaches) in other cases. Which are now wont to be called by the name of Infinite Series, or Converging Series, or other names of like import.

The SERPENTINE curve was named by Isaac Newton (1642-1727) in 1701, according to the Encyclopaedia Britannica.

SET and SET THEORY are the modern English equivalents for the German terms Menge and Mengenlehre adopted by Georg Cantor (1845-1918) at the end of the 19th century and used by subsequent German writers. Their French counterparts are ensemble and théorie des ensembles. Before they acquired the new meaning the words Menge, ensemble and set were established non-technical words in their respective languages. Each word had several meanings and the meanings by no means coincided: see dictionary entries. To add to the complexity, the word Menge was not Cantor’s original choice and in English a number of words have been used.

In 1796, William Frend used the phrase “set of numbers” in The Principles of Algebra. This use of the word was found by Stanley Burris, who writes, “This was certainly not an influential book since Frend did not accept negative numbers, but it suggests the use of the word set in math texts may have been common.”

In English the OED records the use of set for a collection of things (musical instruments, say) from the 17th century. In the 19th century the word is found in mathematical contexts. Thus in the Lectures on Quaternions (1853) Hamilton used the word “set” and even once the term “theory of sets.” Hamilton used “set” to mean what we would call an “n-tuple,” that is, a set of numbers which could be used as a coordinate in n-dimensional analytic geometry. However this usage did not become established and set only arrived as a specialised technical term in the 20th century.

In German the old word Menge also began to be used in technical contexts in the 19th century. E.g. in von Staudt’s Geometrie der Lage (2nd ed., 1856): "Wenn man die Menge aller in einem und demselben reellen einfoermigen Gebilde enthaltenen reellen Elemente durch n + 1 bezeichnet und mit diesem Ausdrucke, welcher dieselbe Bedeutung auch in den acht folgenden Nummern hat, wie mit einer endlichen Zahl verfaehrt, so ...".

The modern importance of the term Menge/set is due to Cantor. His work is described both in general works like Kline (ch. 41) and in monographs like J. W. Dauben Georg Cantor : His Mathematics and Philosophy of the Infinite (1979).

Dauben describes the emergence of the Menge terminology on p. 170. In his early work Cantor used the term Mannichfaltigkeit, Riemann’s term often translated as MANIFOLD. The terms Menge and Mengenlehre appear in a note to the article “Über unendliche, lineare Punktmannichfaltigkeiten,” (Part 5) Mathematische Annalen, 21 (1883), 545-591, issued separately as a pamphlet Grundlagen einer allgemeinen Mannichfaltigkeitslehre. The meaning of the term Mannichfaltigkeitslehre is discussed in a note on p. 587 of the article. Cantor distinguishes this general theory from geometrical Mengenlehre and goes on to explain, “By an ‘aggregate’ [Mannichfaltigkeit] or ‘set’ [Menge] I mean generally any multitude which can be thought of as a whole, i.e., any collection of definite elements which can be united by a law into a whole.” Dauben’s translation.

However Cantor did not adopt Menge and Mengenlehre as the terms until later. Both are used in his “Beiträge zur Begründung der Transfiniten Mengenlehre,” Mathematische Annalen, 46 (1895), 481-512. The opening words are:

By a “set” [„Menge”] we mean any collection M into a whole of definite, distinct objects m (which are called the “elements” of M) of our perception or of our thought.

Translation from Dauben (p. 170).

Beginning in 1883 Cantor’s papers were translated into French for publication in Mittag-Leffler’s journal Acta Mathematica. There the terms ensemble and théorie des ensembles were used. The first of the series was “Une contribution à la théorie des ensembles,” Acta Mathematica, 2, 311-328 (1883) from “Ein Beitrag zur Mannigfaltigkeitslehre,” Journal für die reine und angewandte Mathematik 84 (1878) 242-258. The French terms did not change when Cantor adopted Menge.

While the French terms were approved by Cantor the English terms, which came somewhat later, were at the writer’s discretion and there was more variation. In 1903 the mathematical logician Bertrand Russell treated Cantor’s Menge as equivalent to the logical term class (see CLASS). In analysis the terms aggregate and set were used and co-existed for some decades.

In 1901 E. H. Moore declared, “It is convenient to use set as the equivalent of Menge and ensemble.” “Concerning Harnack’s Theory of Improper Definite Integrals,” Transactions of the American Mathematical Society, 2, p. 297.

However, the practice of E. W. Hobson was more typical for the time. In The Theory of Functions of a Real Variable and the Theory of Fourier’s Series (1907, p. v) he wrote of “the Theory of Sets of Points, also known in its more general aspect, as the Theory of Aggregates.”

Theory of point sets is found in 1912 in volume II of Lectures on the Theory of Functions of Real Variables by James Pierpont: “After the epoch-making discoveries inaugurated in 1874 by G. Cantor in the theory of point sets...” (Preface p. 5)

In the well-known volume of translations of Cantor’s 1895 papers, Contributions to the Founding of the Theory of Transfinite Numbers (1915, re-issued by Dover 1955), the translator, P. E. B. Jourdain, renders Menge and Punktmenge as aggregate and point-aggregate. In the preface Jourdain states that the broad field is usually described as “the theory of aggregates” or “the theory of sets.”

Set theory is found in 1926 in Orrin Frink “The Operations of Boolean Algebras,” Annals of Mathematics (2d ser.) XXVII. p. 487. (JSTOR).

The term axiomatic set theory came into circulation in the 1930s, while naïve set theory was used occasionally in the 1940s, becoming an established term in the 1950s. The latter appears in Hermann Weyl’s review of P. A. Schilpp (ed) The Philosophy of Bertrand Russell in the American Mathematical Monthly, 53, No. 4. (1946), p. 210 and Laszlo Kalmar’s review of The Paradox of Kleene and Rosser in Journal of Symbolic Logic, 11, No. 4. (1946), p. 136. (JSTOR)

[John Aldrich, James A. Landau, and Ken Pledger contributed to this entry.]

A complete list of the set theory and logic terms on this web site is here. For set theory symbols see Earliest Use of Symbols.

SEXAGESIMAL appears in English in 1647 in R. Wood’s translation of Oughtred’s Key Mathematics: “The conversion of Sexagesimal into Decimals, and contrariwise of Decimal parts into Sexagesimals.” [OED]

Sexagesimal appears in A Proposal About Printing A treatise of Algebra by John Wallis, which was circulated in 1683: “The Sexagesimal Fractions (introduced it seems by Ptolemy) did but imperfectly supply the want of such a Method of Numerical Figures.” [OED]

SHEAF has been used for a family of rays or planes that pass through a given point. Among the quotations given by the OED are these: “A sheaf of calorific rays” from Tyndall Heat 1863 and “A sheaf (sheaf of planes, sheaf of lines) is a figure made up of planes or straight lines, all of which pass through a given point (the centre of the sheaf)” from an 1885 translation of Cremona Elements of projective geometry.

More recently sheaf has been used in algebraic topology as a translation of the French word “faisceau.” The term was introduced by Jean Leray in his “L'anneau d'homologie d'une representation,” Comptes rendus de l'Académie des sciences, 222, (1946), 1366-1368. The earliest quotation in the OED is:

The French word ‘faisceau’ has been translated into English as ‘sheaf’ or ‘stack’. In this paper we use the word ‘stack’, since ‘sheaf’ has been used before in mathematics.

Ann. Math. 62, (1955), p. 56. Alas, “stack” did not take off.

This entry was contributed by John Aldrich. For sheaf theory see the entry in the Encyclopedia of Mathematics.

SHEPPARD’S CORRECTIONS are adjustments to moments calculated from grouped data, proposed by W. F. Sheppard (1863-1936). The phrase Sheppard’s corrections appears in Karl Pearson’s "Mathematical Contributions to the Theory of Evolution. X. Supplement to a Memoir on Skew Variation," Philosophical Transactions of the Royal Society A, 197, (1901), p. 451. Pearson refers to Sheppard’s "On the Calculation of the Most Probable Values of Frequency Constants for data arranged according to Equidistant Divisions of a Scale," Proceedings of the London Mathematical Society, 29, (1898), 353-80 but an earlier reference is "On the Calculation of the Average Square, Cube, of a Large Number of Magnitudes," Journal of the Royal Statistical Society, 60, (1897), 698-703.

(David (2001))

SHORT DIVISION is found in 1777 in The Man of Business and Gentleman’s Assistant by William Perry: “Short Division is when the Divisor does not exceed 12.” [Google print search]

SHRINKAGE and SHRINKING in statistical estimation theory. The terms were introduced by J. R. Thompson in “Some Shrinkage Techniques for Estimating the Mean,” Journal of the American Statistical Association, 63, (1968), 113-122. Particular shrinkage estimators had been investigated by earlier writers including Karlin and Bartlett but they did not introduce any terms. [John Aldrich]

SIBLING. The OED shows two citations for sibling from the Middle Ages. In both cases, the word had the obsolete meaning of "one who is of kin to another; a relative."

Sibling does not appear in the 1890 Funk & Wagnalls unabridged dictionary.

The OED shows a use of sib to mean "brother or sister" in 1901.

After the two citations from the Middle Ages, the next citation in the OED for sibling is by Karl Pearson in 1903 in Biometrika, where the word is used in its modern sense: "These [calculations] will enable us .. to predict the probable character in any individual from a knowledge of one or more parents or brethren ('siblings', = brothers or sisters)."

In 1931, a translation by E. & C. Paul of Human Heredity by E. Baur et al. has: "The word ‘sib’ or ‘sibling’ is coming into use in genetics in the English-speaking world, as an equivalent of the convenient German term 'Geschwister'" [OED].

SIEVE OF ERATOSTHENES is attributed to Eratosthenes of Cyrene. The expression is found in English in 1772 in “ΚΟΣΚΙΝΟΝ ΕΡΑΤΟΣΘΕΝΟΥΣ . or, The Sieve of Eratosthenes. Being an Account of His Method of Finding All the Prime Numbers,” by the Rev. Samuel Horsley, F. R. S. Philosophical Transactions (1683-1775), 62, (1772), pp. 327-347. (JSTOR search.) See the Wikipedia entry. [John Aldrich]

SIGN OF AGGREGATION is found in 1863 in The Normal: or, Methods of Teaching the Common Branches, Orthoepy, Orthography, Grammar, Geography, Arithmetic and Elocution by Alfred Holbrook: "The signs of aggregation are the bar ___, which signifies that the numbers over which it is placed are to be taken together as one number; also, the parenthesis, (); the brackets, []; and the braces, {}, which signify that the quantities enclosed by them respectively are to be taken together, as one quantity."

In 1900 in Teaching of Elementary Mathematics, David Eugene Smith wrote: "Signs of aggregation often trouble a pupil more than the value of the subject warrants. The fact is, in mathematics we never find any such complicated concatenations as often meet the student almost on the threshold of algebra."

The SIGN TEST seems to be the oldest formal significance test, for it was used by Dr. John Arbuthnott, “Physitian in Ordinary to Her Majesty,” in An Argument for Divine Providence, taken from the constant Regularity observed in the Births of both Sexes, Philosophical Transactions of the Royal Society of London 27, (1710-1712), 186-190.

F. R. Helmert used the test and named it (in German) in 1905.

The sign test appears in R. A. Fisher’s Statistical Methods for Research Workers (1925, ch. V, Example 19) where it is compared with a t-test. [This entry was contributed by John Aldrich, using David (2001).]
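
In its usual modern form the sign test counts how many observations (or paired differences) are positive and how many are negative, and refers the larger count to a binomial distribution with probability 1/2. The sketch below is a hedged illustration of that textbook version, not a reconstruction of the calculations of Arbuthnott, Helmert or Fisher; the function name and the counts used afterwards are hypothetical.

```python
from math import comb


def sign_test_p_value(n_plus, n_minus):
    """Two-sided sign test p-value under the hypothesis P(+) = P(-) = 1/2.

    Ties are assumed to have been discarded before counting.
    """
    n = n_plus + n_minus
    k = max(n_plus, n_minus)
    # Probability of a count at least as large as k out of n fair trials.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double the one-sided tail, capping the result at 1.
    return min(1.0, 2 * tail)
```

With hypothetical counts of 8 positive and 2 negative signs the function returns 112/1024, about 0.109; with 10 positive and none negative it returns 2/1024, about 0.002.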

SIGNATURE of a quadratic form. C. C. MacDuffee The Theory of Matrices (1933, p. 57) attributes the term to Frobenius. Die Signatur appears in his “Ueber das Trägheitsgesetz der quadratischen Formen,” Journal für die reine und angewandte Mathematik, 114, (1895), p. 187. See the entry LAW OF INERTIA.

SIGNED NUMBER. Signed magnitude appears in 1873 in Proc. Lond. Math. Soc.: "A signed magnitude" [OED].

Signed number appears in the title "The [Arithmetic] Operations on Signed Numbers" by Wilson L. Miser in Mathematics Magazine (1932).

SIGNIFICANCE. Significance testing is almost as old as the theory of probability. Three hundred years ago Dr. John Arbuthnott, "Physitian in Ordinary to Her Majesty," tested the hypothesis that the probability of a male birth is equal to that of a female birth in An Argument for Divine Providence, taken from the constant Regularity observed in the Births of both Sexes, Philosophical Transactions of the Royal Society of London 27, (1710-1712), 186-190. However the terminology of significance is more recent.

Significant is found in 1885 in F. Y. Edgeworth, "Methods of Statistics," Jubilee Volume, Royal Statistical Society, pp. 181-217: "In order to determine whether the observed difference between the mean stature of 2,315 criminals and the mean stature of 8,585 British adult males belonging to the general population is significant [etc.]" [OED].

Significance is found in 1888 in Logic of Chance by John Venn: "As before, common sense would feel little doubt that such a difference was significant, but it could give no numerical estimate of the significance" (p. 486) [OED].

The terms test of significance and significance test were used before the 1920s but only rarely. A JSTOR search finds significance test in Oswald H. Latter "The Egg of Cuculus Canorus. An Enquiry into the Dimensions of the Cuckoo’s Egg and the Relation of the Variations to the Size of the Eggs of the Foster-Parent, with Notes on Coloration, &c," Biometrika, 1, (1902), p. 168.

The expression test of significance was very prominent in R. A. Fisher’s Statistical Methods for Research Workers (1925). This book introduced the related terms level of significance (p. 161), 5 per cent point (p. 198) and statistical significance (p. 218).

Testing the significance is found in Student’s "New tables for testing the significance of observations," Metron 5 (3) pp 105-108 (1925).

Statistically significant is found in 1931 in L. H. C. Tippett, Methods of Statistics: "It is conventional to regard all deviations greater than those with probabilities of 0.05 as real, or statistically significant" [OED].

Curiously no "significance" terms appear in the famous Neyman and Pearson 1933 paper, although Neyman uses "level of significance" in his textbook First Course in Probability and Statistics (1950), where it is identified with the probability of committing an error of the first kind (p. 265).

This entry was contributed by John Aldrich. See also HYPOTHESIS AND HYPOTHESIS TESTING.

SIGNIFICANT DIGIT. Smith (vol. 2, page 16) indicates Licht used the term in 1500, and shows a use of "neun bedeutlich figuren" by Grammateus in 1518.

In 1544, Michael Stifel wrote, "Et nouem quidem priores, significatiuae uocantur."

Signifying figures is found in 1542 in Robert Recorde, Gr. Artes (1575): "Of those ten one doth signifie nothing... The other nyne are called Signifying figures" [OED].

Significant figures is found in 1660 in Milton, Free Commw.: "Only like a great Cypher set to no purpose before a long row of other significant Figures" [OED].

Significant figures is found in the first edition of the Encyclopaedia Britannica (1768-1771) in the article "Arithmetick": "Of these, the first nine, in contradistinction to the cipher, are called significant figures."

Mathematical Dictionary and Cyclopedia of Mathematical Science (1857) has this definition:

SIGNIFICANT. Figures standing for numbers are called significant figures. They are 1, 2, 3, 4, 5, 6, 7, 8, and 9.
Significant digit is found in 1871 in Elements of trigonometry, plane and spherical by Lefebure de Fourcy, translated from the last French ed. by Francis H. Smith: "Thus, the record .5386617, which in reality expresses the logarithm of 3.4567, can be made to express the logarithms of 345670, 34567, 3456.7, 345.67, 34.567, 3.4567, .34567, .034567, or any number formed by adding ciphers to the end of the former, or to beginning of the latter immediately after the decimal point; so that every logarithm taken out of the Tables for a particular number, becomes, by simply altering its characteristic, the logarithm of an infinite variety of other numbers, that is, of all that are expressed by the same succession of significant digits" [University of Michigan Digital Library].
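The property described in the 1871 quotation can be checked numerically. The sketch below computes the fractional part of the common logarithm for each number listed there; the helper name mantissa is an assumption made for this illustration.

```python
import math


def mantissa(x):
    """Fractional part of the common (base-10) logarithm of a positive x."""
    log = math.log10(x)
    return log - math.floor(log)


# The numbers listed in the quotation share the significant digits 34567.
numbers = [345670, 34567, 3456.7, 345.67, 34.567, 3.4567, 0.34567, 0.034567]
mantissas = [mantissa(x) for x in numbers]

for x, m in zip(numbers, mantissas):
    print(x, round(m, 7))
```

Moving the decimal point changes only the integer part of the logarithm (the characteristic), so every number in the list gives the same fractional part up to floating-point rounding.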

Non-significant digit is found in January 1900 in Neal H. Ewing, "The Shakespeare Name," Catholic World: "Naught is the non-significant digit; though it means nothing, yet it counts for so much."

An article in The Mathematics Teacher in October 1939 explains that zero is sometimes a "significant figure."

SIMILAR. In 1557 Robert Recorde used like in the Whetstone of Witte: "When the sides of one plat forme, beareth like proportion together as the sides of any other flatte forme of the same kinde doeth, then are those formes called like flattes .. and their numbers, that declare their quantities, in like sorte are named like flattes" [OED].

In the manuscript of his Characteristica Geometrica which was not published by him, Leibniz wrote "similitudinem ita notabimus: a ~ b."

In 1660 Isaac Barrow used like in his Euclid: "If in a triangle FBE there be drawn AC a parallel to one side FE, the triangle ABC shall be like to the whole FBE" [OED].

In English, similar triangles is found in 1704 in Lexicon technicum: "Similar Triangles are such as have all their three Angles respectively equal to one another" [OED].

SIMILAR (applied to a matrix) was introduced by Frobenius (as "ähnlich") in "Ueber lineare Substitutionen und bilineare Formen," J. reine angew. Math. 84 (1879) p. 21.  This is according to C. C. MacDuffee, The Theory of Matrices, Springer (1933).

SIMILAR REGION was introduced by J. Neyman and E. S. Pearson in "On the Problem of the Most Efficient Tests of Statistical Hypotheses," Philosophical Transactions of the Royal Society of London. Series A, 231, (1933), 289-337. (David, 1995.)


SIMPLE CLOSED CURVE occurs in 1873 in "On Listing’s Theorem" by Arthur Cayley in the Messenger of Mathematics [University of Michigan Historical Math Collection].

SIMPLEX. William Kingdon Clifford (1845-1879) used the term prime confine in "Problem in Probability," Educational Times, Jan. 1886:

Now consider the analogous case in geometry of n dimensions. Corresponding to a closed area and a closed volume we have something which I shall call a confine. Corresponding to a triangle and to a tetrahedron there is a confine with n + 1 corners or vertices which I shall call a prime confine as being the simplest form of confine.
[There could be an error in this citation, as Clifford had died in 1879.]

In a post to a math history mailing list in 2004, John Stillwell wrote, "It is true that Poincare uses the idea of simplex in his pioneering works of algebraic topology, but he does not seem to use the actual word simplex. In his 1st Complement a l'analysis situs of 1900 he calls it a generalized tetrahedron. The first use of the word (in our sense) I can find is in Schoute’s Mehrdimensionale Geometrie of 1902. On p. 9 of volume 1 he suggests the name "Simplicissimum", because it is the simplest piece of d-dimensional space. Then on p. 10 he decides to call it a "simplex" for short."

SIMPLEX METHOD is found in Robert Dorfman, "Application of the simplex method to a game theory problem," Activity Analysis of Production and Allocation, Chap. XXII, 348-358 (1951).

Simplex approach is found in 1951 by George B. Dantzig (1914-2005) in T. C. Koopmans’s Activity Analysis of Production and Allocation xxi. 339: "The general nature of the 'simplex' approach (as the method discussed here is known)" [OED].

SIMPLY ORDERED SET was defined by Cantor in Mathematische Annalen, vol. 46, page 496.

SIMPSON’S PARADOX is the name given to a result in conditional probability by C. R. Blyth: "The paradox is the possibility of P{A | B} < P{A | B'} while P{A | B} ≥ P{A | B'} both under the additional condition C and under the complement C' of that condition." ("On Simpson’s Paradox and the Sure-Thing Principle", Journal of the American Statistical Association, 67, (1972), p. 364.)

Blyth altered the details of the inequalities underlying the "curious case" discussed in paragraphs 8-10 of E. H. Simpson’s "The Interpretation of Interaction in Contingency Tables", Journal of the Royal Statistical Society, B, 13, (1951), pp. 238-241. More importantly, however, the novelty and interest of the case were not in the inequalities and the possible conflict between the unconditional (total) analysis and the conditional (partial) analysis but rather in Simpson’s demonstration that some situations require the one and some the other and there is nothing in the numbers to say which is required.

The possibility of a conflict between total and partial analyses was first noticed in 1899 by Pearson, Lee & Bramley-Moore ("Mathematical Contributions to the Theory of Evolution. - VI. Genetic (Reproductive) Selection: Inheritance of Fertility in Man, and of Fecundity in Thoroughbred Racehorses," Philosophical Transactions of the Royal Society A, 192, (1899), p. 278): "We are thus forced to the conclusion that a mixture of heterogeneous groups, each of which exhibits no organic correlation, will exhibit a greater or less amount of correlation. This correlation may properly be called spurious . . ." G. U. Yule gave more attention to the phenomenon and called the correlation (or association) in the population formed from the mixing of records "fictitious" in his 1903 Biometrika paper "Notes on the Theory of Association of Attributes in Statistics" (p. 143) and "illusory" in his 1911 book Introduction to the Theory of Statistics (pp. 49ff.). In the case of conflict Pearson and Yule considered the partial analyses the relevant ones and the total analysis suspect. [John Aldrich]

Given that Simpson referred to Yule and Yule to Pearson, Blyth’s choice of term may seem surprising and unfortunate. I. J. Good and Y. Mittal "The Amalgamation and Geometry of Two-by-Two Contingency Tables," Annals of Statistics, 15, (1987), p. 695 consider "Simpson’s paradox" an instance of Stigler’s law of eponymy. Some authors prefer the names "Simpson-Yule Paradox" or "Yule Paradox." Good and Mittal use the impersonal term "amalgamation paradox." [John Aldrich]
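The inequalities in Blyth's statement of the paradox can be illustrated with a small numerical sketch. The counts below are invented for illustration and do not come from Simpson, Blyth, Pearson or Yule; the names counts and conditional are likewise assumptions.

```python
from fractions import Fraction

# Hypothetical (successes, trials) for event A, keyed by (condition, group).
counts = {
    ("C", "B"): (8, 10),
    ("C", "B'"): (70, 100),
    ("C'", "B"): (30, 100),
    ("C'", "B'"): (2, 10),
}


def conditional(group, condition=None):
    """P{A | group}, optionally also conditioned on C or C'."""
    cells = [
        cell
        for (cond, grp), cell in counts.items()
        if grp == group and (condition is None or cond == condition)
    ]
    successes = sum(s for s, _ in cells)
    trials = sum(n for _, n in cells)
    return Fraction(successes, trials)


# Total analysis: P{A | B} < P{A | B'}.
total_favours_b_prime = conditional("B") < conditional("B'")
# Partial analyses: P{A | B} >= P{A | B'} under both C and C'.
partial_favours_b = all(
    conditional("B", c) >= conditional("B'", c) for c in ("C", "C'")
)
reversal = total_favours_b_prime and partial_favours_b
print(conditional("B"), conditional("B'"), reversal)
```

The reversal arises here because group B is concentrated in the condition with the low success rate, which is the kind of mixing of heterogeneous records that Pearson and Yule described.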


SIMPSON’S RULE for the numerical evaluation of an integral is named for Thomas Simpson. However, E. T. Whittaker and G. Robinson note, “This formula [generally known as Simpson’s or the parabolic rule] was first given (in a geometrical form) by Cavalieri [1639], and later by James Gregory [1668] and by Thomas Simpson [1743].” (Calculus of Observations (1924, p. 156))
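For readers unfamiliar with the rule in its numerical-integration sense, the following is a minimal sketch of the composite parabolic rule as it is usually stated today. The function name and the default number of subintervals are assumptions for this illustration, not a reconstruction of Cavalieri's, Gregory's or Simpson's own formulations.

```python
def simpson(f, a, b, n=100):
    """Approximate the integral of f over [a, b] using n equal subintervals."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    # End points carry weight 1, odd interior points 4, even interior points 2.
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# The rule is exact for cubics, so two subintervals already suffice here.
print(simpson(lambda x: x ** 3, 0.0, 1.0, n=2))
```

Each pair of subintervals is treated as the area under a parabola through three points, which is the origin of the alternative name "parabolic rule."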

Simpson’s rule is found in 1856 in A treatise on land-surveying by William Mitchell Gillespie: “When the line determined by the offsets is a curved line, ‘Simpson’s rule’ gives the content more accurately” [University of Michigan Digital Library].

In 1911, Elements of the Differential and Integral Calculus by William Anthony Granville has “Simpson’s rule (parabolic rule).” This may also appear in the earlier 1904 edition.

The expression Simpson’s rule has been used in other ways reflecting Simpson’s broad interests. One use was in the theory of annuities and another in algebra. In the latter sense it appears in 1851 in Bonnycastle’s introduction to algebra by John Bonnycastle [University of Michigan Digital Library].

SIMSON LINE. The theorem was attributed to Robert Simson (1687-1768) by François Joseph Servois (1768-1847) in Gergonne’s Journal, according to Jean-Victor Poncelet in Traité des propriétés projectives des figures. The line does not appear in Simson’s work and is apparently due to William Wallace. [The University of St. Andrews website]

SIMULATE, SIMULATOR, SIMULATION. Words of this family have been in English since the fourteenth century (OED) but the modern technical meaning dates from around 1950 and was a response to the development of the modern COMPUTER. F. C. Williams & F. J. U. Ritson’s “Electronic Servo Simulators,” Jrnl. Inst. Electr. Engineers XCIV. IIA. (1947), p. 112 describes a form of analogue computer: “This paper presents an outline of a method which will allow automatic control systems to be studied experimentally by means of an electronic device called a ‘simulator,’ which is constructed so as to have the same characteristic equation as the control system.” (The OED has a different quotation from the same paper.) For digital computers, a JSTOR search found the term in use in “A Technique for Real Time Simulation of a Rigid Body Problem,” by H. J. Gray, Jr.; M. Rubinoff; H. Sohon in Mathematical Tables and Other Aids to Computation, 7, (1953), pp. 73-77.


SIMULTANEOUS EQUATIONS is found in 1820 in A Collection of Examples of the Applications of the Differential and Integral Calculus by George Peacock: "Let us take the three linear simultaneous equations of the second order...." [Google print search]

SIMULTANEOUS EQUATIONS MODEL in ECONOMETRICS. The model was formulated and an estimation method proposed by Trygve Haavelmo in "The Statistical Implications of a System of Simultaneous Equations," Econometrica, 11, (1943), pp. 1-12.  His starting point was "if one assumes that the economic variables considered satisfy, simultaneously, several stochastic relations, it is usually not a satisfactory method to try to determine each of the equations separately from the data without regard to the restrictions which the other equations might impose upon the same variables." (p. 2) In 1989 Haavelmo received the Nobel Prize in Economics for this work.  The term "simultaneous equations model" entered currency in the early 1950s. For the history of the development of the model see M. S. Morgan A History of Econometric Ideas, Cambridge 1990.

Much of the standard terminology associated with the model was created by T. C. Koopmans. See the entries ENDOGENOUS/EXOGENOUS VARIABLE and IDENTIFIABILITY.

SINC FUNCTION. sinc (x) appears in 1952 in “Information theory and inverse probability in telecommunication,” by P. M. Woodward and I. L. Davies, Proceedings of the IEE - Part III: Radio and Communication Engineering: “....where sinc x is an abbreviation for the function (sin πx)/πx. This function occurs so often in Fourier analysis and its applications that it does seem to merit some notation of its own.”

(Some web pages say that this term is a short form of sinus cardinalis. However, M. Farooq Wahab points out that Woodward and Davies never use sinus cardinalis. Others have suggested that E. T. Whittaker coined the term in 1915 in “On the Functions which are represented by the Expansions of the Interpolation-Theory,” Proceedings of the Royal Society of Edinburgh, 35, 181-194. However Wahab reports that Whittaker never uses the term cardinal sine or sinus cardinalis, although he does introduce the term cardinal.)
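The definition quoted from Woodward and Davies, sinc x = (sin πx)/πx, can be sketched as follows. Returning 1 at x = 0 is the usual limiting value and is an addition made here; the quoted passage gives only the formula.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi * x) / (pi * x), with the limit 1 at x = 0."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


for x in (0, 0.5, 1, 2):
    print(x, sinc(x))
```

With this normalization the function vanishes (up to rounding) at every nonzero integer.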

SINE. The word has come with some distortion from Sanskrit through Arabic and Latin. Accounts differ on the details but the basic story is this: the Sanskrit jya (“chord”) was taken into Arabic as jiba but the word that was translated into Latin was not this word but jaib (“bay”) and this became sinus (“bay” or “curve”) which was anglicized as sine.

The account of Indian trigonometry in Katz (6.6) begins with a fragment dating from the early fifth century which contains a table of “half-chords.” However, the first fully preserved work is the Aryabhatiya of Aryabhata the Elder. This small astronomical treatise completed in 499 gave a summary of Hindu mathematics up to that time; see here. Aryabhata used the terms ardha-jya (“half-chord”) and jya-ardha (“chord-half”), and abbreviated them to jya (“chord”). The word jya derives from “bowstring.”

From jya the Arabs phonetically derived jiba, which, following the practice in Arabic of omitting vowels, was written as jb. Accounts differ on who was responsible for the subsequent confusion with jaib and who first translated this word into Latin.

In some accounts sinus first appears in Latin in a translation of the Algebra of al-Khowarizmi by Gherard of Cremona (1114-1187). For example, Eves (page 177) writes:

Later writers, coming across jb as an abbreviation for the meaningless jiba, substituted jaib instead, which contains the same letters and is a good Arabic word meaning "cove" or "bay." Still later, Gherardo of Cremona (ca. 1150), when he made his translations from the Arabic, replaced the Arabian jaib by its Latin equivalent, sinus, whence came our present word sine.

Boyer (page 278) places the first appearance of sinus in a translation of 1145:

When Robert of Chester came to translate the technical word jiba, he seems to have confused this with the word jaib (perhaps because vowels were omitted); hence he used the word sinus, the Latin word for "bay" or "inlet." Sometimes the more specific phrase sinus rectus, or "vertical sine," was used; hence the phrase sinus versus, or our "versed sine," was applied to the "sagitta," or the "sine turned on its side."

Smith (vol. 1, page 202) writes that the Latin sinus "was probably first used in Robert of Chester’s revision of the tables of al-Khowarizmi."

According to Cajori (1906), the Latin term sinus was introduced in a translation of the astronomy of Al Battani by Plato of Tivoli (or Plato Tiburtinus).

The term sinus was adopted by European mathematicians in their own writings and it appeared in various phrases. In his Practica Geometriae (1220) Fibonacci used the expressions sinus rectus arcus and sinus versus arcus. Regiomontanus (1436-1476) used sinus, sinus rectus, and sinus versus in De triangulis omnimodis (On triangles of all kinds; Nuremberg, 1533) [James A. Landau]. Smith (vol. 2, page 617) points out that not everyone used the term and that Rheticus (c. 1560) preferred perpendiculum.

The Latin word sinus went into English in two forms, sinus and sine. The OED has citations for both sinus and sine in the sense of a “gulf” or “bay” but it describes the latter usage as obsolete. The word sine has survived only in the mathematical sense and the earliest citation in the OED is to Thomas Fale in 1593:

This Table of Sines may seem obscure and hard to those who are not acquainted with Sinicall computation.

From Horologiographia. The art of dialling: teaching an easie and perfect way to make all kinds of dials vpon any plaine plat howsoeuer placed: With the drawing of the twelue signes, and houres vnequall in them all....

Naturally when English mathematicians wrote in Latin they used the word sinus but the OED does not report any use of the word in the mathematical sense in English texts. In French mathematics the word is sinus.

See also the entry COSINE.

The term SINGLE-VALUED FUNCTION (meaning analytic function) was used by Yulian-Karl Vasilievich Sokhotsky (1842-1927).

The term SINGULAR INTEGRAL (in the theory of differential equations) is due to Lagrange Oeuvres, 3, pp. 549-575 (Kline, page 532).

The term is found in 1831 in Elements of the Integral Calculus (1839) by J. R. Young:

We see, therefore, that it is possible for a differential equation to have other integrals besides the complete primitive, but derivable from it by substituting in it, for the arbitrary constant c, each of its values given in terms of x and y by the equation (5). Such integrals are called singular integrals, or singular solutions of the proposed differential equation.

SINGULAR INTEGRAL (in the theory of integration). A. L. Cauchy introduced l’intégrale singulière in the "Mémoire sur les intégrales définies," (presented in 1814 but published in 1827) Oeuvres Ser. 1, 1 p. 394. (F. Smithies Cauchy and the Creation of Complex Function Theory.) See Mathworld.

SINGULAR MATRIX. Singular matrix and non-singular matrix occur in 1907 in Introduction to Higher Algebra by Maxime Bôcher: "Definition 2. A square matrix is said to be singular if its determinant is zero."
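Bôcher's definition can be illustrated with a short sketch that evaluates the determinant by cofactor expansion along the first row. The function names are assumptions for this illustration, and the expansion is chosen for clarity rather than efficiency; integer entries are used so that the comparison with zero is exact.

```python
def det(m):
    """Determinant of a square matrix (list of rows) by cofactor expansion."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor obtained by deleting the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def is_singular(m):
    """A square matrix is singular if its determinant is zero."""
    return det(m) == 0


print(is_singular([[1, 2], [2, 4]]))  # second row is twice the first
print(is_singular([[1, 0], [0, 1]]))  # identity matrix
```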

SINGULAR POINT appears in a paper by George Green published in 1828. The paper also contains the synonymous phrase "singular value" [James A. Landau].

Singular point appears in 1836 in the second edition of Elements of the Differential Calculus by John Radford Young. According to James A. Landau, who supplied this citation, it is not clear what the author meant by the term. Landau writes, "Judging by the contents of Chapter IV, to the author 'singular point' was the name of the category to which 'multiple points,' 'cusps,' and 'points of inflexion' belong."

In An Elementary Treatise on Curves, Functions and Forces (1846), Benjamin Peirce writes, "Those points of a curve, which present any peculiarity as to curvature or discontinuity, are called singular points."

SINGULAR VALUE and SINGULAR VALUE DECOMPOSITION. The paper most often cited in connection with the singular value decomposition of a matrix is C. Eckart & G. Young “The Approximation of One Matrix by Another of Lower Rank,” Psychometrika, 1, (1936), 211-218. The result, however, is much older.

G. W. Stewart considers five mathematicians who were responsible for establishing the existence of the singular value decomposition and developing its theory: “Beltrami, Jordan, and Sylvester came to the decomposition through what we should now call linear algebra; Schmidt and Weyl approached it from integral equations.” It is interesting to compare the development of EIGENVALUE theory.

Apart from Beltrami (1873), all the contributions are available on the web: C. Jordan “Sur la réduction des formes bilinéaires,” Comptes Rendus, 78 (1874), 614-617; J. J. Sylvester “On the reduction of a bilinear quantic of the nth order to the form of a sum of n products by a double orthogonal substitution,” Messenger of Mathematics, 19, (1889), 42-46 (Papers IV, 655); E. Schmidt “Zur Theorie der linearen und nichtlinearen Integralgleichungen. I Teil. Entwicklung willkürlicher Funktionen nach Systemen vorgeschriebener,” Math. Ann., 63, (1907), 433-476; H. Weyl “Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen (mit einer Anwendung auf die Theorie der Hohlraumstrahlung),” Math. Ann., 71 (1912), 441-479.

These authors used a variety of terms, e.g. Sylvester referred to “canonical multipliers.” The term “singular value” has been used in this context since 1908 but not yet quite in the way it is used today. The term is used with its modern meaning by F. Smithies “The Eigen-Values and Singular Values of Integral Equations,” Proc. London Math. Soc., 43, (1938), 255-279. Eckart and Young do not use any special terminology although they refer to Sylvester’s paper.

[This entry was contributed by John Aldrich, based on G. W. Stewart “On the Early History of the Singular Value Decomposition,” SIAM Review, 35, (1993), 551-566.]

SIZE (of a critical region) is found in 1933 in J. Neyman and E. S. Pearson, "On the Problem of the Most Efficient Tests of Statistical Hypotheses," Philosophical Transactions of the Royal Society of London, Ser. A, 231, (1933), 289-337 (David (2001)).


SKEW DISTRIBUTION and SKEW CURVE appear in 1895 in Karl Pearson’s Contributions to the Mathematical Theory of Evolution. II. Skew Variation in Homogeneous Material, Philosophical Transactions of the Royal Society A, 186, 343-414. [James A. Landau]

SKEW SYMMETRIC MATRIX. Skew symmetric determinant appears in 1849 in Arthur Cayley, Jrnl. für die reine und angewandte Math. XXXVIII. 93: "Ces déterminants peuvent être nommés 'gauches et symmétriques'" [OED].

Skew symmetric determinant appears in 1885 in Modern Higher Algebra by George Salmon: "A skew symmetric determinant is one in which each constituent is equal to its conjugate with its sign changed."

Skew symmetric matrix appears in "Linear Algebras," Leonard Eugene Dickson, Transactions of the American Mathematical Society, Vol. 13, No. 1. (Jan., 1912).
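Salmon's wording, that each constituent equals its conjugate with its sign changed, amounts to the condition a[i][j] = -a[j][i] for every pair of indices. A minimal sketch of that check follows; the function name is an assumption for this illustration.

```python
def is_skew_symmetric(m):
    """True if m is square and m[i][j] == -m[j][i] for all i and j."""
    n = len(m)
    if any(len(row) != n for row in m):
        return False
    return all(m[i][j] == -m[j][i] for i in range(n) for j in range(n))


print(is_skew_symmetric([[0, 2, -1], [-2, 0, 4], [1, -4, 0]]))
print(is_skew_symmetric([[1, 2], [-2, 0]]))
```

Taking i = j in the condition forces every diagonal entry to equal its own negative, so the diagonal must be zero; the second matrix fails for that reason.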

SKEWES NUMBER appears in 1949 in Kasner & Newman, Mathematics and the Imagination: "A veritable giant is Skewes' number, even bigger than a googolplex" [OED].

SLIDE RULE. Soon after the introduction of LOGARITHMS, devices for multiplication were developed that incorporated the principle of multiplication by addition. Gunter’s scale was introduced in 1620; it had no moving parts.

Cajori gives Edmund Wingate the credit for devising the first slide rule in 1630, although William Oughtred’s device of 1632 is more often cited. Cajori states that Oughtred was an independent discoverer of the rectilinear slide rule and the first to propose a circular rule. The 1632 publication, Circles of Proportion, uses the terms horizontal instrument and circles of proportion.

Slide rule appears in the Diary of Samuel Pepys (1633-1703) in April 1663: "I walked to Greenwich, studying the slide rule for measuring of timber." However, the device referred to may not have been a slide rule in the modern sense.

Slide rule appears in 1838 in Civil Eng. & Arch. Jrnl.: "To assist in facilitating the use of the slide rule among working mechanics" [OED].

Amédée Mannheim (1831-1906) designed (c. 1850) the Mannheim Slide Rule.

Sliding-rule and sliding-scale appear in 1857 in Mathematical Dictionary and Cyclopaedia of Mathematical Science, defined in the modern sense.

Slide rule appears in 1876 in Handbk. Scientif. Appar.: "The slide rule,--an apparatus for effecting multiplications and divisions by means of a logarithmic scale" [OED].

(Florian Cajori’s History of the Logarithmic Slide-Rule (1909) is the standard work but basic information can be found in Museum of HP Calculators: Slide-rules).

SLOPE is found in 1829 in A treatise on practical surveying and topographical plan drawing: "When these lines differ but little from the horizonal lines, they may be taken for them; but if the slope is very great it is easy to reduce them, because we have always the hypothenuse and perpendicular of a right-angled triangle given by our measurement to find the base."

Slope is found in 1835 in Second report addressed to the directors and proprietors of the London and Birmingham railway.

Slope is found in 1854 in A Manual of Topographical Drawing by Richard Somers Smith: "If, for example, it is found that it coincides in length with No. 12 of the scale, then the slope expressed by that interval is 1/12." [Google print search]

Mathematical Dictionary and Cyclopedia of Mathematical Science (1857) has:

SLOPE. Oblique direction. The slope of a plane is its inclination to the horizon. This slope is generally given by its tangent. Thus, the slope, 1/2, is equal to an angle whose tangent is 1/2; or, we generally say, the slope is 1 upon 2; that is, we rise, in ascending such a plane, a vertical distance of 1, in passing over a horizontal distance of 2. The slope of a curved surface, at any point, is the slope of a plane, tangent to the surface at that point.

In 1924 Analytic Geometry by Arthur M. Harding and George W. Mullins has: "If the line is parallel to the y axis, the slope is infinite." Modern textbooks say such a line has undefined slope.
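The 1857 definition (rise over horizontal distance, equal to the tangent of the inclination) and the treatment of vertical lines can be sketched together. The function names, and the choice to return None for a vertical line, are assumptions made for this illustration.

```python
import math


def slope(p, q):
    """Slope of the line through points p and q, or None if it is vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        # Parallel to the y axis: "infinite" in the 1924 wording,
        # "undefined" in modern textbooks.
        return None
    return (y2 - y1) / (x2 - x1)


def inclination_degrees(m):
    """Angle to the horizontal, in degrees, whose tangent is the slope m."""
    return math.degrees(math.atan(m))


m = slope((0, 0), (2, 1))  # rise of 1 over a horizontal distance of 2
print(m, inclination_degrees(m))
print(slope((1, 0), (1, 5)))
```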

For information on the use of m and other symbols for slope, see Earliest Uses of Symbols for Geometry.

SLOPE FIELD is found in 1955 in "Line Element Fields and Lorentz Structures on Differentiable Manifolds" by L. Markus in Annals of Mathematics: "Moreover, it is possible to construct continuous line element fields, without singularities, on a compact manifold which cannot be oriented, for example, the slope field dy/dx = tan(1 - 2π)π/2 on the torus considered as E2 with the coordinates modulo one." [JSTOR search]

An older term is DIRECTION FIELD. M. Golomb and M. Shanks (Elements of Ordinary Differential Equations, 2nd edition 1965) attached this note to their definition of “direction field”: “‘Slope field’ would be a more appropriate name but direction field is a long established term.”

SLOPE-INTERCEPT FORM is found in 1904 in Elements of the Differential and Integral Calculus by William Anthony Granville [James A. Landau].

In Webster’s New International Dictionary (1909), the term is slope form.

SMALL SAMPLE PROBLEM, THEORY etc. in Statistics. In the early 20th century Student argued that existing large sample methods had to be augmented because they could give misleading results in small samples; his best known contribution was the 1908 paper, “The probable error of a mean”, Biometrika, 6, 1-25. The expression “small sample” soon became established. In a 1909 paper Student was writing, “It will be observed that this is essentially a 'small sample' problem...” (“The distribution of the means of samples which are not drawn at random,” Biometrika, 6, p. 211.) The other great figure in small sample statistics was R. A. Fisher. In his Statistical Methods for Research Workers (1925) he writes, “Only by systematically tackling small sample problems on their merits does it seem possible to apply accurate tests to practical data.” (Author’s Preface.) In his review of Fisher’s book Student explained that “small samples” are “samples so small that the statistical constants of the population cannot be replaced by those of the samples without appreciable error.” More recently the qualifier small sample has been giving way to exact, e.g. in exact distribution or exact theory. The large sample methods were based on asymptotic approximations but the small sample methods were not based on approximations at all.


SMOOTHING. The OED’s earliest quotation is from Francis Galton’s Natural Inheritance chapter vii, p. 100: "These [curious and apparently very interesting relations] came out distinctly after I had 'smoothed' the entries."

Mark Nelson has found an earlier instance in C. S. Peirce’s 1873 paper, "On the theory of errors of observations." Peirce writes, "The curve has, however, not been plotted directly from the observations, but after they have been smoothed off by the addition of adjacent numbers in the table eight times over, so as to diminish the irregularities of the curve." The paper is reprinted in Stephen M. Stigler (ed.) (1980) vol. 2. In his article, "Mathematical statistics in the early States," Annals of Statistics, 6, (1978), 239-265, Stigler relates Peirce’s smoothing method to modern KERNEL density estimation. In other contexts smoothing may amount to fitting a TREND, the GRADUATION of a mortality table or the adjustment of geodetic measurements by the METHOD OF LEAST SQUARES.

The term SOCIAL MATHEMATICS was used by Condorcet (1743-1794) and may have been coined by him.

SOFTWARE. According to a biography of John Wilder Tukey by Peter McCullagh to appear in Biographical Memoirs of Fellows of the Royal Society of London, the term was coined by Tukey. A JSTOR search found this from 1958, "Today the 'software' comprising the carefully planned interpretive routines, compilers and other aspects of automative programming are at least as important to the modern electronic calculator as its 'hardware' of tubes, transistors, wires, tapes, tubes and the like." John W. Tukey "The Teaching of Concrete Mathematics," American Mathematical Monthly, 65, No. 1. (Jan., 1958), p. 2.

SOLID GEOMETRY appears in 1733 in the title Elements of Solid Geometry by H. Gore [OED].

SOLID OF REVOLUTION is found in English in 1816 in the translation of Lacroix’s Differential and Integral Calculus: "To find the differentials of the volumes and curve surfaces of solids of revolution" [OED].

SOLIDUS (the diagonal fraction bar). Arthur Cayley (1821-1895) wrote to Stokes, "I think the 'solidus' looks very well indeed...; it would give you a strong claim to be President of a Society for the Prevention of Cruelty to Printers" (Cajori vol. 2, page 313).

The word solidus appears in this sense in the Century Dictionary of 1891.

SOLUBLE (referring to groups). Ferdinand Georg Frobenius (1849-1917) wrote in a paper of 1893:

Jede Gruppe, deren Ordnung eine Potenz einer Primzahl ist, ist nach einem Satze von Sylow die Gruppe einer durch Wurzelausdrücke auflösbaren Gleichung oder, wie ich mich kurz ausdrücken will, einer auflösbare Gruppe. [Every group of prime-power order is, by a theorem of Sylow, the group of an equation which is soluble by radicals or, as I will allow myself to abbreviate, a soluble group.]
Peter Neumann believes this is likely to be the passage that introduced the term "auflösbar" ["soluble"] as an adjective applicable to groups into mathematical language.

SOLUTION SET appears in 1959 in Fund. Math. by Allendoerfer and Oakley: "Given a universal set X and an equation F(x) = G(x) involving x, the set {x|F(x) = G(x)} is called the solution set of the given equation" [OED].
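The set-builder expression in this definition translates almost directly into a set comprehension. In the sketch below the universal set and the two functions are hypothetical examples chosen for illustration.

```python
def solution_set(universe, F, G):
    """The set {x | F(x) = G(x)} taken over a finite universal set."""
    return {x for x in universe if F(x) == G(x)}


# Hypothetical example: X is the integers from -5 to 5 and the
# equation is x^2 = x + 2.
X = range(-5, 6)
roots = solution_set(X, lambda x: x * x, lambda x: x + 2)
print(sorted(roots))
```

The result depends on the universal set as well as on the equation: the same equation over the positive integers alone would have a smaller solution set.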

The term may occur in Imsik Hong, "On the null-set of a solution for the equation $\Delta u+k^2u=0$," Kodai Math. Semin. Rep. (1955).

SOUSLIN SET is defined in Nicolas Bourbaki, Topologie Generale [Stacy Langton].

SPACE. The word came into English—from Old French from Latin—around 1300. The OED entry distinguishes many meanings. In one sense (under heading 6b) it has room as a synonym. This word derives from the Old English and is related to the modern German Raum. Under heading 17 the OED defines “a space” as “an instance of any of various mathematical concepts, usually regarded as a set of points having some specified structure.” Among the quotations is a nice one from 1932: “The word ‘space’ has gradually acquired a mathematical significance so broad that it is virtually equivalent to the word ‘class’, as used in logic.” (M. H. Stone Linear Transformations in Hilbert Space p. 1.) The space age was well under way by 1914 when Hausdorff’s Grundzüge der Mengenlehre (Fundamentals of Set Theory) gave axioms for a METRIC SPACE (metrischer Raum) and for a TOPOLOGICAL SPACE (topologischer Raum).


The term SPECIALLY MULTIPLICATIVE FUNCTION was coined by D. H. Lehmer (McCarthy, page 65).

The term SPECIAL FUNCTIONS for the higher transcendental functions of mathematical physics has been in circulation from at least the 1920s. A JSTOR search found it used as a heading, without explanation, in the article “American Standard Mathematical Symbols,” American Mathematical Monthly, 35, (1928), p. 303. The symbols given are for BESSEL FUNCTIONS and BERNOULLI NUMBERS. See the Encyclopedia of Mathematics entry.

SPECTRUM (in operator theory). The OED’s earliest quotation illustrating the scientific (optical) use of "spectrum" is from Newton Phil. Trans. VI. (1671) 3076: "Comparing the length of this coloured Spectrum with its breadth, I found it about five times greater." The OED’s earliest quotation illustrating the mathematical use of "spectrum" is from P. R. Halmos Finite Dimensional Vector Spaces (1948, ii. 79): "The set of n proper values [eigenvalues] of A, with multiplicities properly counted, is the spectrum of A." This use of the term goes back to Hilbert’s work on integral equations in 1904-10. Hilbert used the term "Spektrum" when discussing quadratic forms in infinitely many variables ("Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen. Vierte Mitteilung" Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse (1906),  p. 157.) The English word appears in 1911 in Anna Johnson Pell "Biorthogonal Systems of Functions," Transactions of the American Mathematical Society, 12, pp. 135-164. (JSTOR search)

There may be a link between Newton and Hilbert for, though the latter cited no previous writer for "Spektrum," J. Dieudonné History of Functional Analysis (1981, pp. 149-50) suggests he derived the term from W. Wirtinger "Beiträge zu Riemann’s Integrationsmethode für hyperbolische Differentialgleichungen, und deren Anwendungen auf Schwingungsprobleme," Mathematische Annalen, 48, (1897), 365-89. Wirtinger drew upon the similarity with the optical spectra of molecules when he used the term "Bandenspectrum" with reference to Hill’s (differential) equation.

The terms spectral theory and spectral theorem came into use around 1930: see e.g. A. Wintner Spektraltheorie der unendlichen Matrizen (1929) and B. A. Lengyel & M. H. Stone "Elementary Proof of the Spectral Theorem," Annals of Mathematics, 37, (1936), pp. 853-864. "The spectral theorem" of the latter is the "fundamental theorem on the spectral resolution of self-adjoint operators in Hilbert space."

This entry was contributed by John Aldrich. See also EIGENVALUE, STATIONARY STOCHASTIC PROCESS.

SPECTRUM and SPECTRAL DENSITY (in generalised harmonic analysis and stochastic processes). The "spectrum" of an irregular motion appears in N. Wiener’s "The Harmonic Analysis of Irregular Motion (Second Paper)" J. Math. and Phys. 5 (1926) 158-189. One of Wiener’s objectives was a theory which would include "an adequate mathematical account of such continuous spectra as that of white light." (Wiener Proc. London Math. Soc. 27 (1928)) The term "power-spectrum" is also in the 1926 paper. The spectrum and spectral density function were important in the probabilistic theory of Khintchine (1934) and Wold (1938) but the functions were not given names. The names appear in J. L. Doob’s "The Elementary Gaussian Processes" Annals of Mathematical Statistics, 15, (1944), 229-282. Around 1940 it became evident that the spectral theory of time series analysis was related to the spectral theory of operators. (See also the previous entry and STATIONARY STOCHASTIC PROCESS). [John Aldrich]

SPERNER’S LEMMA in algebraic topology appears in Emanuel Sperner’s “Neuer Beweis für die Invarianz der Dimensionszahl und des Gebietes,” Abh. Math. Sem. Univ. Hamburg, 6, (1928) 265—272. It was then used to give a new proof of BROUWER’S FIXED-POINT THEOREM by B. Knaster,  K. Kuratowski, S. Mazurkiewicz, “Ein Beweis des Fixpunktsatzes für n-dimensionale Simplexe” Fund. Math., 14 (1929) pp. 132–137. The phrase “das Spernersche Lemma” appears in Alexandroff & Hopf’s Topologie (1935, p. 376).

SPHERICAL CONCHOID was coined by Herschel.

SPHERICAL GEOMETRY appears in 1728 in Chambers' Cyclopedia [OED].

The words spherical geometry and versed sine were used by Edgar Allan Poe in his short story The Unparalleled Adventure Of One Hans Pfaall.

SPHERICAL HARMONICS. A. H. Resal used the term fonctions spheriques (Todhunter, 1873) [Chris Linton].

Spherical harmonics was used in 1867 by William Thomson (1824-1907) and Peter Guthrie Tait (1831-1901) in Nat. Philos.: "General expressions for complete spherical harmonics of all orders" [OED].

SPHERICAL TRIANGLE Menelaus of Alexandria (fl. A. D. 100) used the term tripleuron in his Sphaerica, according to Pappus. According to the DSB, "this is the earliest known mention of a spherical triangle."

The OED shows a use of spherical triangle in English in 1585.

In a letter to L. H. Girardin dated March 18, 1814, Thomas Jefferson (President of the United States) wrote, "According to your request of the other day, I send you my formula and explanation of Lord Napier’s theorem, for the solution of right-angled spherical triangles."

SPHERICAL TRIGONOMETRY is found in the title Trigonometria sphaericorum logarithmica (1651) by Nicolaus Mercator (1620-1687).

The term is found in English in a letter by John Collins to the Governors of Christ’s Hospital written on May 16, 1682, in the phrase "plaine & spherick Trigonometry, whereby Navigation is performed" [James A. Landau].

In a letter dated Oct. 8, 1809, Thomas Jefferson wrote, referring to Benjamin Banneker, "We know he had spherical trigonometry enough to make almanacs, but not without the suspicion of aid from Ellicot, who was his neighbor and friend, and never missed an opportunity of puffing him."

SPINOR appears in 1931 in Physical Review. The citation refers to spinor analysis developed by B. Van der Waerden [OED].

SPIRAL OF ARCHIMEDES appears in English in 1813 in Pantologia. A new cabinet cyclopædia by John Mason Good, Olinthus Gilbert Gregory, and N. Bosworth. [Google print search]

SPLINE (CURVE). The Century Dictionary of 1891 defines a spline as "a flexible strip of wood or hard rubber used by draftsmen in laying out broad sweeping curves, especially in railroad work." The word was introduced into mathematics in the form "spline curve" by I. J. Schoenberg "Contributions to the problem of approximation of equidistant data by analytic functions. Part A--on the problem of smoothing or graduation. A first class of analytic approximation formulae," Quart. Appl. Math., 4, (1946), 45-99. The OED quotation explains the relation between old "spline" and new "spline curve," "For k = 4 they represent approximately the curves drawn by means of a spline and for this reason we propose to call them spline curves of order k." Later "spline curve" became abbreviated to "spline."

This entry was contributed by John Aldrich. See INTERPOLATION.

The term SPORADIC GROUP was coined by William Burnside (1852-1927) in the second edition of his Theory of Groups of Finite Order, published in 1911 [John McKay].

SPURIOUS CORRELATION. The term was introduced by Karl Pearson in "Mathematical Contributions to the Theory of Evolution - On a Form of Spurious Correlation Which May Arise When Indices Are Used in the Measurement of Organs," Proc. Royal Society, 60, (1897), 489-498. Pearson showed that correlation between indices u (= x/z) and v (= y/z) was a misleading guide to correlation between x and y. His illustration is

A quantity of bones are taken from an ossuarium, and are put together in groups which are asserted to be those of individual skeletons. To test this a biologist takes the triplet femur, tibia, humerus, and seeks the correlation between the indices femur/humerus and tibia/humerus. He might reasonably conclude that this correlation marked organic relationship, and believe that the bones had really been put together substantially in their individual grouping. As a matter of fact ... there would be ... a correlation of about 0.4 to 0.5 between these indices had the bones been sorted absolutely at random.
The term has been applied to other correlation scenarios with potential for misleading inferences. In Student’s "The Elimination of Spurious Correlation due to Position in Time or Space" (Biometrika, 10, (1914), 179-180) the source of the spurious correlation is the common trends in the series. In H. A. Simon’s "Spurious Correlation: A Causal Interpretation," Journal of the American Statistical Association, 49, (1954), pp. 467-479 the source of the spurious correlation is a common cause acting on the variables. In the recent spurious regression literature in time series econometrics (Granger & Newbold, Journal of Econometrics, 1974) the misleading inference comes about through applying the regression theory for stationary series to non-stationary series. The dangers of doing this were pointed out by G. U. Yule in his 1926 "Why Do We Sometimes Get Nonsense Correlations between Time-series? A Study in Sampling and the Nature of Time-series," Journal of the Royal Statistical Society, 89, 1-69. For another popular scenario see the entry on Simpson’s paradox. (Based on Aldrich 1995)
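
Pearson's point about indices can be checked numerically. The following is a minimal sketch, not Pearson's own calculation: it assumes three independent, normally distributed measurements x, y and z with equal coefficients of variation (the mean of 100 and standard deviation of 10 are hypothetical values chosen for illustration) and compares the correlation of x with y against the correlation of the indices x/z and y/z. Under these assumptions Pearson's first-order approximation gives a correlation of one half between the indices, even though x and y are unrelated.

```python
import math
import random


def pearson_r(a, b):
    """Sample product-moment correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def simulate_indices(n=20000, mean=100.0, sd=10.0, seed=0):
    """Return (corr(x, y), corr(x/z, y/z)) for independent x, y, z.

    The mean, sd and seed are illustrative assumptions, not values
    taken from Pearson's paper.
    """
    rng = random.Random(seed)
    x = [rng.gauss(mean, sd) for _ in range(n)]
    y = [rng.gauss(mean, sd) for _ in range(n)]
    z = [rng.gauss(mean, sd) for _ in range(n)]
    u = [xi / zi for xi, zi in zip(x, z)]  # index x/z
    v = [yi / zi for yi, zi in zip(y, z)]  # index y/z
    return pearson_r(x, y), pearson_r(u, v)


r_xy, r_uv = simulate_indices()
print("correlation of x and y:      %.3f" % r_xy)
print("correlation of x/z and y/z:  %.3f" % r_uv)
```

The shared divisor z is the only link between the two indices, which is why the second correlation is large while the first stays near zero.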


SQUARE. The English word comes via Old French from the Latin ex- out + quadrāre make square and was first used for a tool for measuring right angles. The OED’s first citation in the sense of the product of a number multiplied by itself is from 1557 in the Whetstone of Witte of R. Record: “Twoo multiplications doe make a Cubike nomber. Likewaies .3. multiplications doe giue a square of squares.”

SQUAREFREE or SQUARE-FREE are English translations of the German word quadratfrei for a number which is not divisible by the square of any prime. That term is used without explanation in Edmund Landau “Ueber die asymptotische Werthe einiger zahlentheoretischer Functionen,” Mathematische Annalen, 54, (1900), 570-581. The term is not used in the paper by Leopold Gegenbauer which gave the well-known asymptotic estimate $6x/\pi^2$ for the number of quadratfrei numbers not exceeding x, viz. “Asymptotische Gesetze der Zahlentheorie,” Denkschriften der Kaiserlichen Akademie der Wissenschaften Wien, 49 (1885), 37-80. See the Wikipedia entry Square-free integer.

In their Introduction to the Theory of Numbers (1938) Hardy and Wright use the German word on the ground that “there is no convenient English word.” (p. 254). However an English word was created by translating the German compound component by component and a JSTOR search found square-free in use from 1931 and squarefree from 1939. Quadratfrei is still found in English writing but is much less common than these equivalents. [John Aldrich]
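
The definition and the asymptotic estimate $6x/\pi^2$ can be illustrated with a short count. This is a minimal sketch using trial division; the function names are hypothetical and the method is chosen for clarity, not efficiency.

```python
import math


def is_squarefree(n):
    """True if no square d*d with d >= 2 divides n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def count_squarefree(x):
    """Number of squarefree integers n with 1 <= n <= x."""
    return sum(1 for n in range(1, x + 1) if is_squarefree(n))


for x in (100, 1000):
    estimate = 6 * x / math.pi ** 2
    print(x, count_squarefree(x), round(estimate, 2))
```

Testing every d rather than only primes gives the same answer, since a number divisible by any square greater than 1 is divisible by the square of a prime.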

SQUARE MATRIX was used by Arthur Cayley in 1858 in "A Memoir on the Theory of Matrices" Coll Math Papers, I, 475-96: "The term matrix might be used in a more general sense, but in the present memoir I consider only square or rectangular matrices" p. 475. [OED].



STABLE LAW (loi stable) appears in Paul Lévy "Sur les lois stables en calcul des probabilités," Comptes Rendus de l'Académie des Sciences, 176, 1284-1286. (David (2001))


The term STANDARD DEVIATION was introduced by Karl Pearson (1857-1936) in 1893, "although the idea was by then nearly a century old" (Abbott; Stigler, page 328). According to the DSB:

The term "standard deviation" was introduced in a lecture of 31 January 1893, as a convenient substitute for the cumbersome "root mean square error" and the older expressions "error of mean square" and "mean error."
The OED shows a use of standard deviation in 1894 by Pearson in "Contributions to the Mathematical Theory of Evolution," (Philosophical Transactions of the Royal Society A, 185, (1894), 71-110.): "Then σ will be termed its standard-deviation (error of mean square)." (p. 80) He had "always found it more convenient to work with the standard-deviation than with the probable error or the modulus, in terms of which the error-function is usually tabulated." (p. 88n) On p. 70 he identified the standard deviation with Gauss’s mean error.


STANDARD ERROR is found in 1897 in G. U. Yule, "On the Theory of Correlation," Journal of the Royal Statistical Society, 60, 812-854: "We see that $\sigma_1\sqrt{1 - r^2}$ is the standard error made in estimating x" [OED]. There the quantity x was being estimated by a regression residual but Yule applied the term generally in his Introduction to the Theory of Statistics (1911), covering such cases as the standard error of a proportion. [John Aldrich]


STANDARD NORMAL CURVE. In the biometric era W. F. Sheppard used the expression “standard normal curve” for “a normal curve whose area and standard deviation are unity” in “On the Application of the Theory of Error to Cases of Normal Distribution and Normal Correlation,” Phil. Trans A, 192, (1899), p. 105. However the term did not catch on and Sheppard did not use it when he presented (Biometrika, 5, (1907), p. 404) tables of the standard normal: he spoke of "the value of the deviation, the standard deviation being taken as unit." See the similar caption to the normal tables (Tables I. and II.) in Fisher’s Statistical Methods for Research Workers (1925). The term “unit normal” had some currency but most authors used no term.

The term “standard normal” came into general use around 1950, appearing in the popular textbooks by P. G. Hoel Introduction to Mathematical Statistics (1947) and A. M. Mood Introduction to the Theory of Statistics (1950).

See also NORMAL.

STANDARD POSITION is found in 1873 in An elementary course in free-hand geometrical drawing by Samuel Edward Warren: "a right angle is in its simplest, most natural, or standard position, when its sides are in the fundamental directions of vertical and horizontal" [University of Michigan Digital Library].

Standard position is dated 1950 in MWCD10.

STANDARD SCORE. In 1913 Elementary school standards : instruction, course of study, supervision, applied to New York City schools by Frank Morton McMurry has: "The book does not attempt to illustrate accurate measurement of educational results. It is scientific only in so far as it brings to bear organized knowledge and insight on an educational problem. Scientific measurement in education is, indeed, as yet too little developed to be applied to more than a very limited portion of the work of the elementary schools. Except for arithmetic and penmanship, 'standard scores' or standard achievements are not available for measuring the quality of the results actually attained by the schools; and even for penmanship and arithmetic, the standard measures for each grade are not yet firmly established" [University of Michigan Digital Library].

In 1921 Univ. Illin. Bur. Educ. Res. Bull. has: "Provision is made for comparing a pupil’s achievement score..with the norm corresponding to his mental age by dividing his achievement age by the standard score for his mental age. This quotient is called the Achievement Quotient" [OED].

Standard score is dated 1928 in MWCD10.

STANINE is a term first used to describe an examinee’s performance on a battery of tests constructed for the U. S. Army Air Force during World War II.

In a letter dated July 30, 1946, Laurance F. Shaffer, who had been a colonel in charge of Psychological Research Unit No. 1 (PRU #1) at Maxwell Field, Alabama, wrote:

The origin of the word is somewhat hazy. I have complete certainty only with regard to two facts: that the word was originated at PRU #1 at Maxwell Field, and that the date was in the month of February, 1942. According to PRU #1 tradition, the word first appeared in the form stand-nine as a shortening of the phrase standard nine-point scale that occurred in area directives. This was soon shortened to stannine (with a as in stand). Local tradition ascribed the origin of this term to Sol M. Roshal, who was noncom in charge of computations at that time. Fred Wickers has told me that he is very certain that I changed stannine to stanine (with a as in stay) when I returned from my expedition to California, which took place in the middle of February, 1942. I do not remember this myself.
In a letter dated February 23, 1946, Frank A. Geldard, formerly a colonel in charge of the whole program and stationed in Texas, wrote:
Stanine is a portmanteau word deriving from "standard score on a nine point scale." It was a sheer "shorthand" invention on the part of an enlisted man in Psychological Research Unit No. 1, AAF Classification Center, Maxwell Field, Ala. The term came to have wide usage in the AAF, not only by psychologists, but by all who had occasion to refer to aptitude ratings for pilot, bombardier, and navigator training assignments. At first the word was resisted by psychologists, who felt that the term had little intrinsic logical meaning to recommend it. For a year or so after its invention official reports might not employ the word; it was regarded as inferior slang. Generality of usage within the AAF eventually forced its acceptance, however, and by the end of the war both technical and nontechnical papers on aircrew aptitude, standards of qualification, training programs, aircraft accidents, and a host of other topics, employed it as a "good" word. It avoided considerable circumlocution, and its meaning seems rarely to have been misunderstood.

Both of these letters were written to Atcheson L. Hench and appear in an article by him, "The Coining of 'Stanine'", in American Speech, February 1951.

The term STAR PRIME was coined in 1988 by Richard L. Francis (Schwartzman, p. 206).

STATIONARY STOCHASTIC PROCESS appears in the title of A. Khintchine’s "Korrelationstheorie der Stationären Stochastischen Prozesse", Math. Ann. 109, (1934), p. 604.

H. Wold translated it as "stationary random process" (A Study in the Analysis of Stationary Time Series (1938)).

The phrase "stationary stochastic process" appears in J. L. Doob’s "What is a Stochastic Process?" American Mathematical Monthly, 49, (1942), 648-653.

An older term was "fonction éventuelle homogène," which appears in E. Slutsky’s "Sur les Fonctions Éventuelles Continues, Intégrables et Dérivables dans le Sens Stochastique", Comptes Rendus, 187, (1928), 878 [John Aldrich].


STATISTIC, STATISTICAL and STATISTICS. In the course of the 19th century statistics acquired its modern meaning(s). It is “the department of study that has for its object the collection and arrangement of numerical facts or data, whether relating to human affairs or to natural phenomena” OR they are “numerical facts or data collected and classified.” The OED1 of the early 20th century also has statistical in the modern sense but its meanings for statistic are archaic. The recasting of statistic came later.

These words all come indirectly from the mediaeval Latin status for a political state. More directly statistics entered English from the German Statistik, as a term comparable to mathematics or ethics. The first citation in OED is W. Hooper’s translation of Bielfield’s Elementary Universal Education: "The science, that is called statistics, teaches us what is the political arrangement of all the modern states of the known world." (1770) However the work that did most to "naturalise" the term in the English language was Sir John Sinclair’s Statistical Account of Scotland (1791-9).

Webster’s dictionary of 1828 defined statistics as: "A collection of facts respecting the state of society, the condition of the people in a nation or country, their health, longevity, domestic economy, arts, property and political strength, the state of the country, &c." Statistical societies, like the Statistical Society of London (later Royal Statistical Society) founded in 1834, were established to discover such facts. Note that these facts were not necessarily, or even typically, numerical facts.

In the course of the 19th century statistics came to be confined to numerical facts but the facts did not have to pertain to public administration. The latter development is illustrated by a quotation from J. C. Maxwell Theory of Heat (1871) xxii. 288: “If however, we adopt a statistical view of the system, and distribute the molecules into groups . . .” [OED] This point of view became fixed in the phrase statistical mechanics. For this the OED cites J. W. Gibbs in Proc. Amer. Assoc. Adv. Sci. XXXIII, 1885, 57 (heading) “On the fundamental formula of statistical mechanics, with applications to astronomy and thermodynamics.”

The phrase "lies, damned lies and statistics," a back-handed tribute to the importance of statistics in political life, dates from the end of the 19th century. See Peter Lee’s Lies, Damned Lies and Statistics for a history of the phrase.

Statistic, signifying an individual fact, was rare before the 20th century. There is an example from 1853 in The United States illustrated edited by Charles Anderson Dana: “An old teamster with a dislodged wheel to his 'lumbery' vehicle, claimed a moment of our strength, and in return for that generosity, a la Jupiter, indulged our statistical curiosity with a few minutes of his local knowledge. The significant placing of his hand upon his pocket, as he proclaimed the fact that the bridge cost almost a quarter of a million dollars, plainly showed his appreciation of so vast a sum. Nor was the statistic of the bridge, being a mile in length, handed over to the fund of general information, without a look which plainly hinted of the many laggard walks it had cost him by the side of his sturdy team.” [University of Michigan Digital Library].

In the 20th century the singular form came to be accepted both in this sense and in another sense. In statistical theory R. A. Fisher used statistic to refer to a quantity derived from the observations--before settling on it he had used "statistical derivative" (1915), "derivate" (1920) and "statistical derivate" (1921). Fisher presented the new term in his "On the Mathematical Foundations of Theoretical Statistics", Philosophical Transactions of the Royal Society of London, Ser. A., 222, (1922), 309-368: "These involve the choice of methods of calculating from a sample statistical derivates, or as we shall call them statistics, which are designed to estimate the values of the parameters of the hypothetical population." (p. 318) The term parameter was also new and with statistic the two made a pair. (See the entry on parameter for Fisher’s reasoning.) Fisher called the statistics arising in estimation problems estimates. He had no name for statistics arising in testing but since the 1950s they have been called "test statistics."

Fisher’s term was not well-received initially. Arne Fisher (no relation) asked him, "Where ... did you get that atrocity, a statistic?" (letter (p. 312) in J. H. Bennett Statistical Inference and Analysis: Selected Correspondence of R. A. Fisher (1990).) Karl Pearson objected, "Are we also to introduce the words a mathematic, a physic, an electric etc., for parameters or constants of other branches of science?" (p. 49n of Biometrika, 28, 34-59 1936).

This entry was contributed by John Aldrich, based on G. U. Yule Introduction to the Theory of Statistics (1911) and David (2001). A complete list of the probability and statistics terms on this web site is here.

STATISTICAL TABLE. The OED notes an appearance in 1808 in Zebulon M. Pike’s An Account of Expeditions to the Sources of the Mississippi 1805-07 "A statistical table, on which he had in a regular manner taken the whole province of New Mexico,..giving latitude, longitude, and population." There is a livelier passage in Melville’s Moby Dick (1851) chapter ci, The Decanter:

Most statistical tables are parchingly dry in the reading; not so in the present case, however, where the reader is flooded with whole pipes, barrels, quarts, and gills of good gin and good cheer.

(Quotation provided by John W. McDonald III.)

Statistical tables giving integrals and other values used in statistical inference, are now used mainly by students but for most of the 20th century statisticians had to make constant reference to volumes of such tables. Karl Pearson created the specialised volume of statistical tables with his Tables for Statisticians and Biometricians (1914). According to Pearson, "What the true statistician, the true physicist demands" is "the conversion of algebraical results into tables." The next great work of tabling was R. A. Fisher and F. Yates’s Statistical Tables for Biological Agricultural and Medical Research (1938); the sixth edition of 1963 is available on the web. The last great set, the two-volume Biometrika Tables for Statisticians, was produced by Pearson’s son E. S. Pearson (with H. O. Hartley) and appeared in 1954/72.

STEM-AND-LEAF DISPLAYS were introduced by J. W. Tukey in "Some Graphic and Semigraphic Displays" in Statistical Papers in Honor of George W. Snedecor edited by T. A. Bancroft (1972). David (2001).

STEP FUNCTION is dated ca. 1929 in MWCD10.

STEREOGRAPHIC. According to Schwartzman (p. 207), "the term seems to have been used first by the Belgian Jesuit François Aguillon (1566-1617), although the concept was already known to the ancient Greeks."

In Flattening the Earth: Two Thousand Years of Map Projections, John P. Snyder attributes the term to d'Aguillon in 1613 [John W. Dawson, Jr.].

STIELTJES INTEGRAL. T. J. Stieltjes introduced the integral in his “Recherches sur les fractions continues,” Annales de la faculté des sciences de Toulouse Sér. 1, 8 no. 4 (1894) 1-122. The work was outside the mainstream of the theory of integration until M. Riesz’s “Sur les opérations fonctionnelles linéaires,” Comptes Rendus de l'Académie des Sciences, 149 (1909), 974—977. The relationship between l'intégrale de Stieltjes and the LEBESGUE INTEGRAL is considered in Lebesgue’s “Sur l'intégrale de Stieltjes et sur les opérations linéaires,” Comptes Rendus de l'Académie des Sciences 150 (1910), 86-88. See the epilogue on the Lebesgue-Stieltjes integral in T. Hawkins Lebesgue’s Theory of Integration, its Origins and Development (1975). [John Aldrich]


The terms STIRLING NUMBERS OF THE FIRST and SECOND KIND were coined by Niels Nielsen (1865-1931), who wrote in German "Stirlingschen Zahlen erster Art" [Stirling numbers of the first kind] and "Stirlingschen Zahlen zweiter Art" [Stirling numbers of the second kind]. Nielsen’s masterpiece, "Handbuch der Theorie der Gammafunktion" [B. G. Teubner, Leipzig, 1906], had a great influence, and the terms progressively found their acceptance (Julio González Cabillón).

John Conway believed the newer terms Stirling cycle and Stirling (sub)set numbers were introduced by R. L. Graham, D. E. Knuth, and O. Patashnik in Concrete Mathematics (Addison Wesley, 1989 & often reprinted).

STIRLING’S FORMULA. The asymptotic formula for n! appears in Example 2 to Proposition 28 of James Stirling’s Methodus Differentialis. sive Tractatus de Summatione et Interpolatione Serierum Infinitarum (1730) p. 136.

Lacroix used Théorème de Stirling in Traité élémentaire de calcul différentiel et de calcul intégral (1797-1800).

Stirling’s theorem is found in English in 1863 in The Mathematical and Other Writings of Robert Leslie Ellis. [Google print search]

Stirling’s formula is found in English in 1880 in A Treatise on the Calculus of Finite Differences by George Boole and John Fletcher Moulton. [Google print search]

Stirling’s approximation appears in 1938 in Biometrika [OED].
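
For readers who want to see the approximation at work, here is a minimal sketch. It assumes the form in which the result is usually quoted today, n! ≈ √(2πn)(n/e)^n, which is not necessarily the form in which Stirling himself stated it; the function names are hypothetical.

```python
import math


def stirling(n):
    """Modern statement of Stirling's approximation to n!."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


def ratio(n):
    """Exact n! divided by the approximation; tends to 1 as n grows."""
    return math.factorial(n) / stirling(n)


for n in (1, 5, 10, 50, 100):
    print(n, ratio(n))
```

The ratio exceeds 1 by roughly 1/(12n), so the relative error is already below one per cent at n = 10.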

STOCHASTIC is found in English in 1662 with the meaning "pertaining to conjecture." (OED) However, despite an early link with probability (see the second quotation below), the term only entered the vocabulary of probability in the 20th century.

The modern re-birth of the term can be seen in the OED quotations. In 1917 Ladislaus Josephowitsch Bortkiewicz (1868-1931) used it in Die Iterationen p. 3: "Die an der Wahrscheinlichkeitstheorie orientierte, somit auf 'das Gesetz der Grossen Zahlen' sich gründende Betrachtung empirischer Vielheiten möge als Stochastik ... bezeichnet werden." A. A. Tschuprow (Chuprov) put the term into English and explained Bortkiewicz’s choice of term: "I use the word ‘stochastical’ as synonymous to ‘based on the theory of probability’--cf. J. Bernoulli, Ars Conjectandi, Basileae, 1713, p. 213 ‘Ars Conjectandi sive Stochastice nobis definitur ars metiendi quam fieri potest exactissimi probabilitates rerum’ and L. v. Bortkiewicz, Die Iterationen." (Metron, 2 (1923), p. 461) See the translation of Part IV, Chapter II of Ars Conjectandi for the phrase "stochastic art." [John Aldrich]

STOCHASTIC PROCESS is found in A. N. Kolmogorov, "Sulla forma generale di un processo stocastico omogeneo," Rend. Accad. Lincei Cl. Sci. Fis. Mat. 15 (1) page 805 (1932) [James A. Landau].

Stochastic process is also found in A. Khintchine "Korrelationstheorie der Stationären Stochastischen Prozesse", Math. Ann. 109, (1934), p. 604 [James A. Landau].

Stochastic process occurs in English in J. L. Doob, "Stochastic processes and statistics," Proc. Natl. Acad. Sci. USA 20 (1934).


STOKES’S THEOREM was attributed to George Gabriel Stokes (1819-1903) by J. C. Maxwell in his A Treatise on Electricity and Magnetism (1873, p. 27), "This theorem was given by Professor Stokes, Smith’s Prize Examination, 1854, question 8." Maxwell had been a candidate that year! The question is reprinted in Mathematical and physical papers by George Gabriel Stokes, volume 5, p. 320. In a footnote, however, the editor of that volume reports that Lord Kelvin (William Thomson) had stated the theorem in a letter to Stokes in July 1850.

(Based on M. J. Crowe A History of Vector Analysis (2nd edition, p. 147).)

STONE-WEIERSTRASS THEOREM is a generalisation of the WEIERSTRASS APPROXIMATION THEOREM given by M. H. Stone in his “Applications of the theory of Boolean rings to general topology,” Trans. Amer. Math. Soc., 41, (1937) pp. 375–481. See Encyclopedia of Mathematics.

STRAIGHT ANGLE appears in English in 1876 in Syllabus of Plane Geometry by the Association for the improvement of geometrical teaching: "When the arms of an angle are in the same straight line, the conjugate angles are equal, and each is then said to be a straight angle." [Google print search]

There are earlier citations in the OED for the term with the obsolete meaning of "a right angle."

The term STRANGE ATTRACTOR was coined by David Ruelle and Floris Takens in their classic paper "On the Nature of Turbulence" [Communications in Mathematical Physics, vol. 20, pp. 167-192, 1971], in which they describe the complex geometric structure of an attractor during a study of models for turbulence in fluid flow.

STRATIFIED SAMPLING occurs in J. Neyman, “On the two different aspects of the representative method; the method of stratified sampling and the method of purposive selection,” Journal of the Royal Statistical Society, 97, (1934), 558-625. The term “stratum” seems to have been first used in this connection by A. L. Bowley Elements of Statistics (4th edition) 1920: “It may happen … that the universe [population] consists of different regions or strata ..” (p. 332). [James A. Landau]



STRONG PSEUDOPRIME. According to Prime Numbers: A Computational Perspective by Carl Pomerance and Richard Crandall (page 124), "J. Selfridge proposed using Theorem 3.4.1 as a pseudoprime test in the early 1970s, and it was he who coined the term 'strong pseudoprime'" [Paul Pollack].

Strong pseudoprime is found in Pomerance, Carl; Selfridge, J.L.; Wagstaff, Samuel S. Jr. "The pseudoprimes to $25 \cdot 10^9$," Math. Comput. 35, 1003-1026 (1980).

STROPHOID appears in 1837 in Enrico Montucci, "Delle proprietà della strefoide, curva algebrica del terzo grado recentemente scoperta ed esaminata" ("On the properties of the strophoid, an algebraic curve of the third degree recently discovered and examined"), Memoria letta nell'Accademia dei Fisiocratici ... con una appendice del Venturoli, Siena, G. Mucci, 1837 [Dic Sonneveld].

Strophoid was coined by Montucci in 1846, according to Smith (vol. 2, page 330).

The term STRUCTURE for isomorphic relations seems to have first appeared in print in Bertrand Russell’s Introduction to Mathematical Philosophy (1919). Russell probably had the term from Ludwig Wittgenstein, whose Tractatus logico-philosophicus (Logisch-philosophische Abhandlung, Vienna 1918, 4.1211 ff) was first published in 1921, and in 1922 in English. Structure in the modern sense -- as a tuple composed of sorts or carrier sets, relations, operations and distinguished elements -- was first used by David Hilbert in his Grundlagen der Geometrie (Göttingen 1899), there called a „Fachwerk oder Schema von Begriffen“ (p. 163, according to F. Kambartel Erfahrung und Struktur, Münster 1966). The concept of Structure developed via Rudolf Carnap’s Der logische Aufbau der Welt (1928), the linguistic and French philosophical Structuralism, the Éléments de mathématique of the N. Bourbaki group (Paris, since 1939), to Category Theory of Samuel Eilenberg and Saunders Mac Lane (1945). [This entry was contributed by Wolfram Roisch.]

STUDENT’S t-DISTRIBUTION. “Student” was the pen-name of William Sealy Gosset. (The name was originally written in quotation marks but these are now usually dispensed with.) Student once told R. A. Fisher, “I am sending you a copy of Student’s Tables as you are the only man that’s ever likely to use them!” (Letters from W. S. Gosset to R. A. Fisher, 1915-1936 (1970)). Student was quite wrong but his tables only came into wide use after Fisher reformulated Student’s statistic and developed the scheme of a family of distributions based on the normal distribution.

In his 1908 paper, “The Probable Error of a Mean”, Biometrika, 6, 1-25 Student introduced the statistic, z, for testing hypotheses on the mean of the normal distribution. Student’s z is proportional to the modern t with t = z √(n – 1). Student was not concerned to have a statistic that is asymptotically standard normal and when he estimated σ he used the divisor n, not the modern (n – 1). Fisher introduced the t form because it fitted in with a larger theory based on the notion of the number of DEGREES OF FREEDOM. Student seems to have introduced the t symbol in correspondence with Fisher in 1923. Fisher described Student’s distribution (and others based on the normal distribution) in “On a Distribution Yielding the Error Functions of Several well Known Statistics”, Proceedings of the International Congress of Mathematics, Toronto, 2, 805-813. In that paper he used the symbol t. A new symbol suited Fisher for he could use z for a statistic of his own (see the entries for z and for F). For further information see C. Eisenhart “On the Transition from Student’s z to Student’s t,” The American Statistician, 33, (1979), pp. 6-10.
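The relation t = z √(n – 1) follows directly from the two divisors: Student's estimate of σ divides the sum of squared deviations by n, the modern estimate divides by n – 1. The sketch below checks this numerically. The function names and the sample values are illustrative assumptions, not taken from Student's or Fisher's papers.

```python
import math


def student_z(sample, mu):
    """Student's 1908 statistic: (mean - mu) / s, with s using divisor n."""
    n = len(sample)
    mean = sum(sample) / n
    s_n = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    return (mean - mu) / s_n


def modern_t(sample, mu):
    """Modern t statistic: (mean - mu) / (s / sqrt(n)), with s using divisor n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return (mean - mu) / (s / math.sqrt(n))


# Hypothetical data, chosen only to illustrate the identity.
sample = [1.0, 2.0, 4.0, 7.0]
mu = 2.0
n = len(sample)

# The two statistics differ only by the factor sqrt(n - 1).
print(math.isclose(modern_t(sample, mu), student_z(sample, mu) * math.sqrt(n - 1)))
```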

In 1925 Fisher’s version of Student’s distribution took off. Fisher’s Statistical Methods for Research Workers presented new uses for the tables and made the tables generally available. His “Applications of ‘Student’s’ Distribution”, Metron, 5, 90-104 provided the theory of the applications. The familiar terminology soon developed. In 1925 Fisher referred to “Student’s distribution” (without the “t”). The phrase “Student’s” t-distribution appears in 1929 in Nature (OED). The phrase “t distribution” appears in A. T. McKay, “Distribution of the coefficient of variation and the extended ‘t’ distribution,” J. Roy. Stat. Soc., 95, (1932). The term “t-test” is found in 1932 in the fourth edition of Fisher’s Statistical Methods for Research Workers: “The validity of the t-test, as a test of this hypothesis, is therefore absolute” (p. 116) (OED).

This entry was contributed by John Aldrich. See SMALL SAMPLE problem.

STUDENTIZATION. According to Hald (1997, p. 669), Student, i.e. William Sealy Gosset (1876-1937), used the term Studentization in a letter to E. S. Pearson of Jan. 29, 1932.

At a meeting in 1934 R. A. Fisher described the relation between his work and Student’s: "It was 'Student' himself who took the really novel step, which had in fact revolutionized the theory of errors.... All that he [Fisher] had added to it was to 'studentize' a number of analogous problems..." (Journal of the Royal Statistical Society, 97, p. 619)

Studentized D² statistic is found in R. C. Bose and S. N. Roy, "The exact distribution of the Studentized D² statistic," Sankhya 3 pt. 4 (1935) [James A. Landau].

STURM’S THEOREM appears in 1836 in the title Du Theoreme de M. Sturm, et de ses Applications Numeriques by M. E. Midy [James A. Landau].

Sturm’s theorem appears in English in 1841 in the title Mathematical Dissertations, for the use of students in the modern analysis; with improvements in the practice of Sturm’s Theorem, in the theory of curvature, and in the summation of infinite series by J. R. Young [James A. Landau].

SUBFACTORIAL was introduced in 1878 by W. Allen Whitworth in Messenger of Mathematics (Cajori vol. 2, page 77).

SUBFIELD is found in "On the Base of a Relative Number-Field, with an Application to the Composition of Fields," G. E. Wahlin, Transactions of the American Mathematical Society, Vol. 11, No. 4. (Oct., 1910).

SUBGROUP. Felix Klein used the term Untergruppe.

Subgroup appears in 1881 in Arthur Cayley, "On the Schwarzian Derivative, and the Polyhedral Functions," Transactions of the Cambridge Philosophical Society: "But there is no sub-group of an order divisible by 5; and hence, these two transformations being identified with the two substitutions, the other transformations correspond each of them to a determinate substitution" [University of Michigan Historical Math Collection].

SUBRING is found in English in 1937 in the phrase invariant subring in Modern Higher Algebra by A. A. Albert [OED].

SUBSET. Cantor used the word subset (in the sense that "proper subset" is now used) in "Ein Beitrag zur Mannigfaltigkeitslehre," Journal für die reine und angewandte Mathematik 84 (1878).

Subset occurs in English in "A Simple Proof of the Fundamental Cauchy-Goursat Theorem," Eliakim Hastings Moore, Transactions of the American Mathematical Society, Vol. 1, No. 4. (Oct., 1900).

SUBTANGENT. Huygens coined “soutangente” in his “Traité de la lumière” (1690), p. 173, and used it again in his letter to Leibniz, December 19, 1690 (see G. W. Leibniz, Sämtliche Schriften und Briefe, series III, vol. 4, p. 684); Leibniz started to use “soutangente” in his letter to Huygens, February 20/ March 2, 1691 (vol. 5, p. 63). In October 1691 he sent a paper to Huygens: “Methodus qua innumerarum Linearum Constructio ex data proprietate Tangentium seu aequatio inter Abscissam et Ordinatam ex dato valore Subtangentialis, exhibetur” (vol. 5, p. 181).

In his article “Supplementum geometriae practicae” (Acta Eruditorum, April 1693, 178-180; GM 5, 285-288) Leibniz wrote: “subtangentialis (ut Hugeniano verbo utar) seu portio axis intercepta inter tangentem & ordinatam sit t” (GM 5, 287).

[This entry was contributed by Siegmund Probst.]

SUBTRACT. When Fibonacci (1202) wishes to say "I subtract," he uses some of the various words meaning "I take": tollo, aufero, or accipio. Instead of saying "to subtract" he says "to extract."

In English, Chaucer used abate around 1391 in Treatise on the Astrolabe: "Abate thanne thees degrees And minutes owt of 90" [OED].

In a manuscript written by Christian of Prag (c. 1400), the word "subtraction" is at first limited to cases in which there is no "borrowing." Cases in which "borrowing" occurs he puts under the title cautela (caution), and gives this caption the same prominence as subtractio.

In Practica (1539) Cardano used detrahere (to draw or take from).

In 1542 in the Ground of Artes Robert Recorde used rebate: "Than do I rebate 6 out of 8, & there resteth 2."

In 1551 in the introduction to Pathway to Knowledge Recorde used abate: "And if you abate euen portions from things that are equal, those partes that remain shall be equall also" [OED].

Digges (1572) writes "to subduce or substray any sume, is wittily to pull a lesse fro a bigger number."

Schoner, in his notes on Ramus (1586 ed., p. 8), uses both subduco and tollo for "I subtract."

In his arithmetic, Boethius uses subtrahere, but in geometry attributed to him he prefers subducere.

The first citation for subtract in the OED is in 1557 by Robert Recorde in The whetstone of witte: "Wherfore I subtract 16. out of 18."

Hylles (1592) used "abate," "subtract," "deduct," and "take away" (Smith vol. 2, pages 94-95).

From Smith (vol. 2, page 95):

The word "subtract" has itself had an interesting history. The Latin sub appears in French as sub, soub, sou, and sous, subtrahere becoming soustraire and subtractio becoming soustraction. Partly because of this French usage, and partly no doubt for euphony, as in the case of "abstract," there crept into the Latin works of the Middle Ages, and particularly into the books printed in Paris early in the 16th century, the form substractio. From France the usage spread to Holland and England, and from each of these countries it came to America. Until the beginning of the 19th century "substract" was a common form in England and America, and among those brought up in somewhat illiterate surroundings it is still to be found. The incorrect form was never popular in Germany, probably because of the Teutonic exclusion of international terms.

SUBTRACTION. Fibonacci (1202) used extractio.

Tonstall (1522) devoted 15 pages to Subductio. He wrote, "Hanc autem eandem, uel deductionem uel subtractionem appellare Latine licet" (1538 ed., p. 23; 1522 ed., fol. E 2, r).

Gemma Frisius (1540) has a chapter De Subductione siue Subtractione.

Clavius (1585 ed., p. 26) says "Subtractio est ... subductio."

See also ADDITION.

SUBTRAHEND is an abbreviation of the Latin numerus subtrahendus (number to be subtracted).

SUCCESSIVE INDUCTION. This term was suggested by Augustus De Morgan in his article "Induction (Mathematics)" in the Penny Cyclopedia of 1838. See also MATHEMATICAL INDUCTION, INDUCTION, COMPLETE INDUCTION.

SUFFICIENCY, SUFFICIENT STATISTIC and CRITERION OF SUFFICIENCY all appear in 1922 in R. A. Fisher’s "On the Mathematical Foundations of Theoretical Statistics", Philosophical Transactions of the Royal Society of London, Ser. A, 222, 309-368:

The statistic chosen should summarise the whole of the relevant information supplied by the sample. This may be called the Criterion of Sufficiency. (p. 316)

In the case of the normal curve of distribution it is evident that the second moment is a sufficient statistic for estimating the standard deviation. (p. 359)

The term sufficient statistic is more prominent in section 3 of Fisher’s Statistical Methods for Research Workers and section 9 of "Theory of Statistical Estimation" both published in 1925.

The concept of sufficiency was already emerging in 1920 (p. 769) when Fisher wrote of the sample variance that "The whole of the information respecting σ, which a sample provides is summed up [in its value]."

This entry was contributed by John Aldrich. See also PETERS’ METHOD.

SUM. Nicolas Chuquet used some in his Triparty en la Science des Nombres in 1484.

The term SUMMABLE (referring to a function that is Lebesgue integrable such that the value of the integral is finite) was introduced by Lebesgue (Klein, page 1045).

SUPPLEMENT. "Supplement of a parallelogram" appears in English in 1570 in Sir Henry Billingsley’s translation of Euclid’s Elements.

In 1704 Lexicon Technicum by John Harris has "supplement of an Ark."

In 1796 Hutton Math. Dict. has "The complement to 180° is usually called the supplement."

In 1798 Hutton in Course Math. has "supplemental arc" (one of two arcs which add to a semicircle) [OED].

Supplement II to the 1801 Encyclopaedia Britannica has, "The supplement of 50° is 130°; as the complement of it is 40°" [OED].

In 1840, Lardner in Geometry vii writes, "If a quadrilateral figure be inscribed in a circle, its opposite angles will be supplemental" [OED].

Supplementary angle is found in 1820 in the fourth edition of Elements of Geometry and Plane Trigonometry by John Leslie. [Google print search]

SURD. According to Smith (vol. 2, page 252), al-Khowarizmi (c. 825) referred to rational and irrational numbers as 'audible' and 'inaudible', respectively.

The Arabic translators in the ninth century translated the Greek rhetos (rational) by the Arabic muntaq (made to speak) and the Greek alogos (irrational) by the Arabic asamm (deaf, dumb). See e. g. W. Thomson, G. Junge, The Commentary of Pappus on Book X of Euclid’s Elements, Cambridge: Harvard University Press, 1930 [Jan Hogendijk].

This was translated as surdus ("deaf" or "mute") in Latin.

As far as is known, the first European to adopt this terminology was Gherardo of Cremona (c. 1150).

Fibonacci (1202) adopted the same term to refer to a number that has no root, according to Smith.

Surd is found in English in Robert Recorde’s The Pathwaie to Knowledge (1551): "Quantitees partly rationall, and partly surde" [OED].

According to Smith (vol. 2, page 252), there has never been a general agreement on what constitutes a surd. It is admitted that a number like √2 is a surd, but there have been prominent writers who have not included √6, since it is equal to √2 × √3. Smith also called the word surd "unnecessary and ill-defined" in his Teaching of Elementary Mathematics (1900).

G. Chrystal in Algebra, 2nd ed. (1889) says that "...a surd number is the incommensurable root of a commensurable number," and says that √e is not a surd, nor is √(1 + √2).

The term SURFACE INTEGRAL was used in 1873 by James Clerk Maxwell in a Treatise on Electricity and Magnetism, p. 12 in the paragraph "Line-integration appropriate to forces, surface-integration to fluxes." The OED’s earliest reference is to Arthur Cayley in 1875 Math. Papers IX. p. 321 "On the Prepotential Surface-integral." The concept is much older: see the entry DIVERGENCE THEOREM.


The term SURREAL NUMBER was introduced by Donald Ervin Knuth (1938- ) in his book Surreal numbers: How two ex-students turned on to pure mathematics and found total happiness (1974). John Horton Conway (1937-2020), who introduced the concept, later wrote, “I wish I'd invented that name.” For further information see Surreal Numbers.

SURVIVAL FUNCTION. David (2001) gives E. L. Kaplan and Paul Meier "Nonparametric Estimation from Incomplete Observations," Journal of the American Statistical Association, 53, (1958), 457-481. However a JSTOR search found earlier occurrences in the writings of Alfred J. Lotka, e.g. p(a), the function denoting the probability, at birth, of surviving to age a, is called the life table (survival function) in his "Biometric Functions in a Population Growing in Accordance with a Prescribed Law," Proceedings of the National Academy of Sciences, 15, (1929), 793-798 and the survival function in his "A Contribution to the Theory of Self-Renewing Aggregates, With Special Reference to Industrial Replacement," Annals of Mathematical Statistics, 10, (1939), 1-25. The mathematical treatment of "survivorship" is much older. See chapter 25 of Hald (1990) on "The Insurance Mathematics of de Moivre and Simpson, 1725-1756."


SYLOW’S THEOREM refers to a result in Ludwig Sylow’s "Théorèmes sur les groupes de substitutions," Mathematische Annalen, 5, (1872), pp. 584-594. The expression Sylow’s Theorem is found in German in G. Frobenius, "Neuer Beweis des Sylowschen Satzes," Journ. Crelle, 100, (1887), pp. 179-181 [Dirk Schlimm].

Sylow’s Theorem is found in English in 1893 in W. Burnside’s "Notes on the Theory of Groups of Finite Order. I: On the Proof of Sylow’s Theorem. II: On the Possibility of Simple Groups whose Orders are the Products of Four Primes," Proceedings of the London Mathematical Society XXV pp. 9-18. [OED].

The term SYMMEDIAN was introduced in 1883 by Philbert Maurice d'Ocagne (1862-1938) [Clark Kimberling].

SYMMEDIAN POINT. Emile Lemoine (1840-1912) used the term center of antiparallel medians.

The proposal to name the point after Ernst Wilhelm Grebe (1804-1874) came from E. Hain ("Ueber den Grebeschen Punkt," Archiv der Mathematik und Physik 58 (1876), 84-89). Afterwards, the term Grebe’schen Punkt appeared many times in the Jahrbuch ueber die Fortschritte der Mathematik by reviewers such as Dr. Schemmel (Berlin, 1875), Prof. Mansion (Gent, 1881), Prof. Lampe (Berlin, 1881), and Dr. Lange (Berlin, 1885) [Peter Schreiber, Julio González Cabillón].

In 1884, Joseph Jean Baptiste Neuberg (1840-1926) gave it the name Lemoine point, for Emile Michel Hyacinthe Lemoine (1840-1912).

The point was thus called the Lemoine point in France and the Grebe point in Germany [DSB].

Symmedian point was coined by Robert Tucker (1832-1905) in the interest of uniformity and amity.

SYMMETRIC (of a binary relation). Bertrand Russell wrote in "On the Notion of Order," Mind, 10, (1901), p. 32: "When ARB implies BRA, I call R symmetrical relation."

SYMMETRIC DIFFERENCE (of sets). The OED cites M. H. Stone "Theory of representations for Boolean Algebras," Trans. Amer. Math. Soc. XL. (1936), p. 38: "The Union (modulo 2), or symmetric difference, of two classes is the class of objects belonging to one or the other, but not to both, of those classes."
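Stone's two descriptions, "belonging to one or the other, but not to both" and "union (modulo 2)", can be compared in a short sketch. The helper names and example sets below are illustrative assumptions; only Python's built-in set operators are relied upon.

```python
def symmetric_difference(a: set, b: set) -> set:
    """Elements belonging to one or the other, but not to both."""
    return (a | b) - (a & b)


def union_mod_2(a: set, b: set) -> set:
    """Keep x when its membership counts in a and b sum to 1 modulo 2."""
    return {x for x in a | b if ((x in a) + (x in b)) % 2 == 1}


# Hypothetical example classes.
a = {1, 2, 3}
b = {3, 4}

print(symmetric_difference(a, b))  # the elements 1, 2 and 4
print(symmetric_difference(a, b) == union_mod_2(a, b) == a ^ b)
```

Python's built-in `^` operator on sets computes the same class, so the two helper functions serve only to spell out the definitions.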

SYMMETRIC FUNCTION appears in S. F. Lacroix, Traité de calcul différentiel et du calcul intégral, vol. 1 (1797), p. 277: "les fonctions dont je parle, sont celles qui renferment toutes les racines [d'une équation] combinées d'une manière semblable, soit entr'elles, soit avec d'autres quantités, et que pour cela je nommerai fonctions symétriques" [Joao Caramalho Domingues].

The term SYMPLECTIC GROUP was proposed in 1939 by Hermann Weyl in The Classical Groups. He wrote on page 165:

The name "complex group" formerly advocated by me in allusion to line complexes, as these are defined by the vanishing of antisymmetric bilinear forms, has become more and more embarrassing through collision with the word "complex" in the connotation of complex number. I therefore propose to replace it by the corresponding Greek adjective "symplectic." Dickson calls the group the "Abelian linear group" in homage to Abel who first studied it.
[This information was provided by William C. Waterhouse.]

According to Lectures on Symplectic Geometry by Ana Cannas da Silva, "the word symplectic in mathematics was coined by Weyl who substituted the Latin root in complex by the corresponding Greek root, in order to label the symplectic group. Weyl thus avoided that this group connoted the complex numbers, and also spared us from much confusion had the name remained the former one in honor of Abel: abelian linear group."

SYNTHETIC DIVISION is found in 1850 in Theoretical and Practical Treatise on Algebra by Horatio Nelson Robinson: "This last operation is called synthetic division." [Google print search]

SYNTHETIC GEOMETRY appears in Gigon, "Bericht über: Jacob Steiner’s Vorlesungen über synthetische Geometrie, bearbeitet von Geiser und Schröter," Nouv. Ann. (1868).

Synthetic geometry appears in English in 1870 in Report on education by John Wesley Hoyt, published by the U. S. Government Printing Office: "First year’s course in mathematical section. Theory of numbers; differential and integral calculus; theory of functions, with repetitions; analytical geometry of the plane; experimental physics, with repetitions; experimental chemistry, with repetitions; descriptive geometry, with exercises and repetitions; synthetic geometry; machine-drawing" [University of Michigan Digital Library].

The term SYSTEM OF EQUATIONS is found in 1843 in "Chapters in the Analytical Geometry of (n) Dimensions" by Arthur Cayley in the Cambridge Mathematical Journal, vol. IV: "On the determination of linear equations in x_1, x_2, ..., x_n which are satisfied by the values of these quantities derived from given systems of linear equations." [This citation, from the University of Michigan Digital Library, is a chapter title and thus appears in italics in the original.]

SYZYGY was coined as a mathematical term by James Joseph Sylvester. The word appears in 1850 in Cambr. & Dubl. Math. Jrnl. V. 276: "The members of any group of functions, more than two in number, whose nullity is implied in the relation of double contact, ... must be in syzygy. Thus PQ, PQR, QR, must form a syzygy." [OED]
