*Last revision: July 7, 2017*

**BAIRE CATEGORY.** The notions of sets of the first and second category were introduced by
René Baire in his
“Sur les fonctions de variables réelles,” *Annali di
Matematica Pura ed Applicata* (3) 3 (1899), p. 65 according to
Kuratowski *Topologie I* (1933, chapter I, §10, p. 43).
J. L. Kelley *General
Topology* 1955 uses “meager” instead of the first category. See the
entry in the *Encyclopedia of Mathematics*.

**BAIRE CLASSES (of functions).** These concepts were introduced by
René Baire
in his “Sur la théorie des fonctions discontinues,”
*Comptes rendus,* **129**, (1899), 1010-1013.
The phrase appears in the title of C. de la Vallée Poussin’s
*Intégrales de Lebesgue, Fonctions d'ensemble, Classes de Baire* (Paris, 1916). See the entry in the
*Encyclopedia of Mathematics*.

**BANACH SPACE.** In their treatise *Linear Operators Part I* (1957) N. Dunford
& J. T. Schwartz write, "Axioms closely related to those of a normed linear
space were introduced, in 1916, by Bennett ... In 1922, Banach, Hahn and Wiener
published papers using the same or similar sets of axioms. Though Banach did
not initiate the study of these spaces, his contributions were many and deep--for
that reason many authors use the term *Banach space* to refer to a complete
normed linear space." (p. 85)

Stefan Banach’s (1892-1945) paper of 1922,
“Sur les
opérations dans les ensembles abstraits et leur application aux équations
intégrales”, *Fundamenta Mathematicae*,
**3**, 133-181, was based on the thesis he submitted in 1920. In 1928 in
*Les Espaces Abstraits* Maurice Fréchet (1878-1973) wrote about
"les espaces de M. Banach." In his own
*Théorie des Opérations Linéaires* (1932,
ch. IV, p. 53) Banach used the term "espace du type (B)."
A *JSTOR* search
finds *Banach space* being used in 1934 in T. H. Hildebrandt’s
"On Bounded Linear Functional Operations," *Transactions
of the American Mathematical Society*, **36**, 868-875, and the expression
soon came into general use.

[This entry was contributed by John Aldrich.]

**BANACH-STEINHAUS THEOREM** is due to
Stefan Banach and
Hugo Steinhaus,
“Sur le principe de la condensation de singularités”
*Fund. Math.*, **9**, (1927), 50–61.
See Encyclopedia of
Mathematics: Banach-Steinhaus theorem.

**BANACH-TARSKI PARADOX.**

The tag *paradox* seems to have become attached to the
result in the 1940s. A JSTOR search found L. M. Blumenthal "A Paradox,
a Paradox, a Most Ingenious Paradox," *American Mathematical Monthly*,
**47**, (1940), pp. 346-353. (The title is taken from Gilbert and Sullivan’s
story of Frederic, who being born on February 29^{th} had celebrated
only 5 birthdays by the time he was 21:
A most ingenious paradox.) For
Blumenthal the theorem is a paradox, not because it embodies a contradiction,
but because it goes against common sense notions about congruence.
The word is also found in the title of Wacław Sierpiński’s
“Sur le paradoxe de
MM. Banach et Tarski”, *Fundamenta Mathematicae* **33**, 229-234 (1945).

See AXIOM OF CHOICE, HAUSDORFF PARADOX and PARADOX.

The term **BAR CHART** occurs in Nov. 1914 in W. C. Brinton "Graphic Methods for Presenting Data. IV. Time Charts,"
*Engineering Magazine,* 48, 229-241 (David, 1998).

The diagram itself is much older and seems to have been introduced by
William Playfair.
There is an example in his *Commercial and Political Atlas* of 1786.

**BAR GRAPH** is found in 1919 in *School Statistics and Publicity*
by Carter Alexander:
"The data shown in this circle graph for Rockford may be presented in a bar graph which permits of
placing the figures so they can be added." [Google print search]

**BARTLETT ADJUSTMENT or CORRECTION** is a correction factor applied to the likelihood ratio test statistic
to make its distribution under the null hypothesis conform better to the asymptotic
chi-squared form. It was proposed in 1937 by
M. S. Bartlett
"Properties of Sufficiency and Statistical Tests," *Proceedings
of the Royal Society of London. A*, **160**, 268-282 but it has received
most attention in the last 20 years.

The term **BARYCENTRIC CALCULUS** appears in 1827 in the title
*Der barycentrische Calcul* by August Ferdinand Möbius
(1790-1868).

**BASE (of a geometric figure)** appears in English in 1570 in Sir
Henry Billingsley’s translation of Euclid’s *Elements* (OED2).

**BASE (in an isosceles triangle)** is found in English in 1571 in
Digges, *Pantom.*: "Isoscheles is such a Triangle as hath onely
two sides like, the thirde being vnequall, and that is the Base"
(OED2).

**BASE (in logarithms)** appears in *Traité
élémentaire de calcul différentiel et de calcul
intégral* (1797-1800) by Lacroix: "Et si *a*
désigne la base du système, il en résulte
l'équation *y* = *a ^{x},* dans laquelle les
logarithmes sont les abscisses."

*Base* is found in the 1828 *Webster* dictionary, in the
definition of *radix*: "2. In logarithms, the base of any system
of logarithms, or that number whose logarithm is unity."

**BASE (of a number system).** *Radix* was used in the sense
of a base of a number system in 1811 in *An Elementary
Investigation of the Theory of Numbers* by Peter Barlow [James A.
Landau].

*Base* is found in the *Century Dictionary* (1889-1897):
"The base of a system of arithmetical notation is a number the multiples of whose powers
are added together to express any number; thus, 10 is the base of the decimal system of arithmetic."

**BASE ANGLE.** A Google print search finds a perhaps non-mathematical use in 1718 in *An Essay On The Ancient and Modern Use of Armories*:
“*Batton,* the Diminutive of the *Bendlet,* the Diminutive of the *Bend,* sometimes called a *Ribbon,* which passes from the right chief Angle of the Shield to the sinister base Angle, as that one, which bruses the Lion of Abernethy, in the Arms of the Duke of Douglass. If it proceed from the left chief Angle to the right base Angle, ‘tis then a *Batton* sinister, a Mark of Illegitimation;” [James A. Landau]

*Base angle* is found in a May 1804 article in *A Journal of Natural Philosophy, Chemistry, and the Arts,*
“On the Figure of the Earth” by Peregrinus Proteus: “...let the colatitudes of the two places, and their difference of longitude form the sides, and contained angle of a spherical triangle, of which find the base angle at the place whose latitude is λ,” [Google print search,
James A. Landau]

**BASIS (of a vector space).** The term *basis*
was used by Frobenius and Stickelberger in 1879 in
“Ueber Gruppen von vertauschbaren Elementen” in *Crelle’s Journal* **86**, page 219.
[Jan Peter Schäfermeyer]

**BAYES** and **BAYESIAN.**
Thomas Bayes
(1702-1761) and his single work on probability, the posthumously published
“An Essay towards solving a Problem in the Doctrine of
Chances” (*Philosophical Transactions of the Royal Society of London*
**53** (1763), 370-418), have inspired several terms. Some, including "Bayes’s
theorem," have only a tenuous connection to Bayes.

The *Essay* considers the problem: "*Given*
the number of times in which an unknown event has happened and failed: *Required*
the chance that the probability of its happening in a single trial lies somewhere
between any two degrees of probability that can be named." Or, in modern
terms: given the outcomes of a number of Bernoulli trials, find the posterior
distribution of the probability of a success. For the prior Bayes took a uniform
distribution on the unit interval.
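The modern reading of the problem can be sketched numerically. The following Python is an illustration only, not a reconstruction of the method in the *Essay*: it assumes the uniform prior described above, under which the posterior after *s* successes and *f* failures is a beta density, and it estimates the chance that the probability lies between two named values by a simple midpoint rule. The function names and the step count are hypothetical choices.

```python
from math import comb

def posterior_density(p, successes, failures):
    """Posterior density of p under a uniform prior: Beta(s + 1, f + 1)."""
    n = successes + failures
    # (n + 1) * C(n, s) equals 1 / B(s + 1, f + 1), the normalising constant.
    return (n + 1) * comb(n, successes) * p**successes * (1 - p)**failures

def chance_between(a, b, successes, failures, steps=10_000):
    """Midpoint-rule estimate of P(a < p < b | data)."""
    width = (b - a) / steps
    total = sum(
        posterior_density(a + (i + 0.5) * width, successes, failures)
        for i in range(steps)
    )
    return total * width
```

With no observations the posterior is the uniform prior itself, so `chance_between(0.2, 0.7, 0, 0)` is approximately 0.5, the length of the interval.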

The essay is difficult and little is known of its background.
Historians have asked, "Was Bayes a Bayesian?" (D. A. Gillies *Historia
Mathematica*, **14**, (1987), 325-346) and "Who discovered Bayes’s
Theorem?" (reprinted in Stigler (1999)). A. I. Dale’s *A History of Inverse
Probability from Thomas Bayes to Karl Pearson* (2^{nd} edition, 1999)
is useful for the changing interpretations. *Inverse Probability* was the
term used in the 19^{th} and early 20^{th} centuries for the
probability found when reasoning from effects to causes, *direct* probabilities
being used when reasoning from causes to effects. Throughout this period inference
based on inverse probability (now called Bayesian inference) coexisted with
inference based on procedures with 'good' repeated sampling properties (now
called classical inference).

Bayes’s effort was soon superseded by Laplace’s more general
and apparently more powerful work; the first instalment was the "Mémoire
sur la Probabilité des Causes par les événements," *Savants étrangers*
**6**, (1774), p. 621-656. *Oeuvres*
**8**, pp. 27-65. (English translation and commentary by S. M. Stigler in
*Statistical Science*, **1**, (1986),
359-378). In 1838 Augustus De Morgan was writing, "This [inverse] method
was first used by the Rev. T. Bayes ... [who], though almost forgotten, deserves
the most honourable remembrance from all who treat the history of this science."
(*An Essay on Probabilities*, p. vii.)

Today the terms **Bayes’s Formula, Rule** and **Theorem**
are associated with a basic theorem on conditional probability. "La règle
de Bayes" appears with this meaning in 1843 in
A. A. Cournot’s
Exposition
de la Théorie des Chances et des Probabilités
(pp. 158-9). Cournot says the rule is "attributed to Bayes." It is
*not* in the *Essay* but comes from Laplace: it is his VI^{th}
principle for the case where the causes are unequally probable: see *Théorie
Analytique des Probabilités* (1814, pp. xiv-xv in the edition on
Gallica.)
Laplace’s terminology, of "cause" or "hypothesis" for the event whose conditional
probability is to be found, survived well into the twentieth century. Thus in
J. L. Coolidge’s *An Introduction to Mathematical Probability* (1925) "Bayes'
Principle" sits in the chapter on "the probability of causes."
However "der Bayessche Satz" in A. N. Kolmogorov’s *Grundbegriffe
der Wahrscheinlichkeitsrechnung* (1933, p. 46) is just a theorem about events.
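As a theorem about events, the rule for unequally probable causes can be illustrated with a short Python sketch. The function name and the numbers are hypothetical and are not taken from Cournot, Laplace or Kolmogorov; the code simply multiplies each prior probability by the probability of the observed event under that cause and normalises.

```python
def bayes_rule(priors, likelihoods):
    """Posterior probabilities of the causes given the observed event."""
    joint = [prior * like for prior, like in zip(priors, likelihoods)]
    total = sum(joint)  # total probability of the observed event
    return [j / total for j in joint]

# Hypothetical example: three causes with unequal prior probabilities.
posterior = bayes_rule([0.5, 0.3, 0.2], [0.1, 0.4, 0.8])
```

When the priors are all equal they cancel in the ratio, and the posterior probabilities are proportional to the likelihoods alone.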

**Bayes’s Theorem** has also been used in a more historically accurate way. In J. W. Lubbock
& J. E. Drinkwater-Bethune’s *On Probability* (1830, p. 48) it refers
to Bayes’s original problem and solution, as it does in Isaac Todhunter’s authoritative
*A History
of the Mathematical Theory of Probability* (1865, p. 299). Karl Pearson
("On the Influence of Past Experience on Future
Expectation," *Philosophical Magazine*, **13**, (1907), 365-378)
and R. A. Fisher took this usage into the 20^{th} century: see e.g.
Fisher’s "On
the Mathematical Foundations of Theoretical Statistics" (*Phil. Trans. R. Soc.*
1922, p. 324). While Pearson gave qualified approval to Bayes’s theorem, Fisher rejected it outright
and was its most persistent critic. This use of "Bayes’s theorem"
(and "Bayes’s postulate" for the uniform prior) appears to have lapsed.

**Bayes Estimate, Bayes Risk** and **Bayes Solution** are terms used in
Abraham Wald’s
classical (= non-Bayesian) statistical decision theory. Wald ("Contributions to the Theory of Statistical
Estimation and Testing Hypotheses," *Annals of Mathematical Statistics,*
**10**, (1939), 299-326) found it "useful" to consider "hypothetical
a priori distributions" of the parameter (p. 306). Wald used the term "Bayes
solution" in his "An Essentially Complete Class of Admissible Decision
Functions" *Annals of Mathematical Statistics,* **18**, (1947),
549-555. In 1948 J. L. Hodges & E. L. Lehmann (*Annals of Mathematical
Statistics,* **19**, 396-407) used the term "Bayes risk" for
a concept Wald had treated in 1939 without naming it. In "Some Problems
in Minimax Point Estimation," *Annals of Mathematical Statistics,*
**21**, (1950), 182-197, they renamed the "minimum risk estimate"
of 1939 the "Bayes estimate."

The term **Bayesian** entered circulation around 1950.
R. A. Fisher used it in the notes
he wrote to accompany the papers in
his *Contributions to Mathematical Statistics* (1950). Fisher thought
Bayes’s argument was all but extinct for the only recent work to take it seriously
was Harold Jeffreys’s *Theory of Probability* (1939). In 1951
L. J. Savage,
reviewing Wald’s *Statistical
Decision Functions*, referred to "modern, or unBayesian, statistical
theory" ("The Theory of Statistical Decision," *Journal of
the American Statistical Association*, **46**, p. 58.). Soon after, however,
Savage changed from being an unBayesian to being a Bayesian. While the 1960s
would bring a new enthusiasm for inverse probability, this did not extend to
the name. For Jeffreys (p. 29) "the chief rule involved in the process
of learning from experience" was "the principle of inverse probability,
first given by Bayes." But the term "inverse probability" fell
out of use and "Bayesian" started to appear: new works had titles
like *Introduction to Probability and Statistics from a Bayesian Viewpoint*
(D. V. Lindley, 1965).

**Empirical Bayes.** The term and the method are due to H. Robbins, “An
Empirical Bayes Approach to Statistics,” *Proceedings
of the Third Berkeley Symposium on Mathematical Statistics, volume 1*,
(1956), 157-163. (David (1998).)

**Bayes Factor** appears in I. J. Good’s 1958 "Significance Tests in Parallel and in Series," *Journal
of the American Statistical Association*, **53**, 799-813. Previously
in his *Probability and the Weighing of Evidence* (1950) Good had used
the term "factor" explaining that "Dr. A. M. Turing suggested
in a conversation in 1940 that the word 'factor' should be regarded as a technical
term ... and that it could be more fully described as *the factor in favour
of the hypothesis H in virtue of the result of the experiment.*" Jeffreys
had introduced this factor, denoting it by *K* but not giving it a name.
(*Theory of Probability* (1939, chapter V.))

See also CLASSICAL STATISTICAL INFERENCE, DECISION THEORY, INVERSE PROBABILITY, LIKELIHOOD, POSTERIOR & PRIOR, PRINCIPLE OF INDIFFERENCE, PROBABILITY and RULE OF SUCCESSION.

[This entry was contributed by John Aldrich, based on
Dale (op. cit.), Hald (1998), David (1995) and David (2000). For further information
see Stephen E. Fienberg,
“When did
Bayesian Inference become "Bayesian"?” *Bayesian Analysis* (2006).]

**BEHRENS-FISHER DISTRIBUTION, PROBLEM** and **TEST.** In 1929 W. V. Behrens (1902-1962) published
a significance test for the difference between means of random samples from two
Normal populations with unequal variances: “Ein Beitrag zur Fehlerberechnung
bei wenigen Beobachtungen,” *Landwirtschaftliche
Jahrbücher*, **68**, 807-837. Behrens told
R.
A. Fisher about his work—see p. 53 of J. H. Bennett
*Statistical Inference and Analysis: Selected
Correspondence of R. A. Fisher* (1990)—but then played no further
part in its story. In 1935 Fisher began writing about Behrens’s test as an
application of his new theory of fiducial inference (see
The Fiducial
Argument in Statistical Inference.) Both the general theory and the Behrens
application aroused controversy. The label “Behrens-Fisher” entered circulation
around 1940: M. S. Bartlett referred to the *test* in
“Complete Simultaneous Fiducial Distributions,”
*Annals of Mathematical Statistics*, **10**, (1939), 129-138, Harold Jeffreys to the *formula* in “Note on the
Behrens-Fisher formula,” *Annals of Eugenics*, **10**, (1940), 48-51 and Henry
Scheffé to the *problem* in “On
Solutions of the Behrens-Fisher Problem, Based on the
*t*-Distribution,” *Annals of Mathematical Statistics*, **14**, (1943), 35-44.
This entry was contributed by John Aldrich. See also FIDUCIAL PROBABILITY.

**BELL-SHAPED.** *Bell-shaped* is found in 1785 in
*Planting and Ornamental Gardening: A Practical Treatise*
by William Marshall. He refers to bell-shaped flowers.

**BELL-SHAPED PARABOLA** appears in 1857 in *Mathematical Dictionary and Cyclopedia of Mathematical Science*.
The equation is ay^{2} - x^{3} + bx^{2} = 0.

**BELL-SHAPED** and **BELL CURVE** as descriptions of the
NORMAL or GAUSSIAN density.
It has become a cliché to describe the graph of the normal
density as "bell-shaped". This is a relatively recent phenomenon, given that
the distribution was first studied in the 1730s and was used throughout the
19^{th} and early 20^{th} centuries in the theory of errors,
the theory of gases and statistical theory.

"La surface S, en forme de *cloche*" appears in Esprit
Pascal Jouffret’s "Etude sur l’effet utile du tir" (1872) as a description
of the *bivariate* normal density with independent components (Kruskal
& Stigler’s "Normative Terminology" (1997), reprinted in Stigler
(1999)). However, Jouffret and the bell surface made no permanent impression
and neither is mentioned in the chapter on the bivariate normal distribution
(ch. IX "Erreurs de situation d’un point") in Bertrand’s
*Calcul
des Probabilités* (1889).

"Bell-shaped curve" is found in Francis Galton’s *Catalogue
of the Special Loan Collection of Scientific Apparatus at the South Kensington
Museum* (1876). (David (1998)) However, Galton did not use this description
in his books and articles which made so much of the normal distribution. Nor
did other principals in the English statistical tradition, Karl Pearson, Yule
and Fisher. F. Y. Edgeworth was the only statistician of that era to regularly
use *a* visual analogy. This was the *gendarme’s hat*, which he attributed
to "a lively French statistician"; see "The Statistics of Examinations,"
*Journal of the Royal Statistical Society*,
**51**, (1888), p. 600. The hat is flatter than a bell,
but, hat or bell, it is only a matter of scaling.

A *JSTOR* search of early 20^{th} century articles
finds many occurrences of "bell-shaped" as a description of the normal curve
but there is no identifiable authoritative source for the analogy. The analogy
acquired authority when it appeared in textbooks: in his *Introduction to
Mathematical Probability* (1937) J. V. Uspensky writes, "the probability
curve has a bell-shaped form" and in *An Introduction to Probability
Theory and its Applications* (1950, p. 129) W. Feller writes that the graph
of the density "is the symmetric, bell-shaped curve shown in figure 1."

**THE bell curve**, with its implication that there is only one bell-shaped curve, does
not come naturally to statisticians or probabilists and I could find *no*
use of this term in any of the statistics or mathematics journals on *JSTOR*.
However the term has become very common outside the professional literature
of probability.

*Bell curve* is found in 1920 in an article “A New Point of View in the Interpretation of Threshold Measurements in Psychophysics” by Godfrey H. Thomson, *Psychological Review.*

The suggestion has been made by many that the extreme curves, the ogives, are integrals of the normal curve of error. The bell-curve in the center is deduced by these writers by first forming the outer curves and then subtracting their sum from unity at each point. All the usual arguments, however, which support the view that the outer curves are integral normal curves would lead one to expect, when applied to the central curve, that it is a normal curve as it stands. But this is impossible. Two such normal ogives added together and subtracted from unity, ordinate by ordinate, do not give a normal bell curve.

[Google print search, James A. Landau]

The phrase "an almost
perfect Bell Curve" appears in D. T. Sisto "Aural Comprehension in Spanish,"
*Modern Language Journal*, **41** (1957), p. 30. *The bell curve*
became more common in the following decades but it really took off in the 90s
with such widely discussed works as Richard J. Herrnstein and Charles Murray’s
*The Bell Curve: Intelligence and Class Structure in American Life* (1994).

This entry was contributed by John Aldrich. See the entries CENTRAL LIMIT THEOREM, ERROR, GAUSSIAN and NORMAL and also Symbols associated with the Normal distribution.

The **BERNOULLI** family was one of the wonders of European
mathematics in the 17^{th} and 18^{th} centuries. Eight Bernoullis have
MacTutor
biographies and there are many Bernoulli
references on this site: to see them, use the Search on the front page.
Over the centuries *Bernoulli’s
law,* *Bernoulli’s theorem*, *Bernoulli’s principle*, etc. have
been used in many different ways. The eponymous terms in use today seem to refer mainly to the work of
Jakob
(Jacques, James), in particular to his *Ars Conjectandi* (published 1713), and to the work of his nephew
Daniel.

**BERNOULLI DISTRIBUTION** and **BERNOULLI RANDOM VARIABLE.**
In the past the *Bernoulli distribution* often referred
to what is now generally called the BINOMIAL DISTRIBUTION.
Thus H. Cramér *Random Variables and Probability Distributions* (1937, p. 43)
refers to "a Binomial or Bernoulli distribution". Aurel Wintner had
in mind a different random variable and possibly a different Bernoulli--Daniel
not Jakob--when he discussed the "symmetric Bernoulli distribution" in "On
Analytic Convolutions of Bernoulli Distributions," *American Journal
of Mathematics*, **56**, (1934), p. 662. This distribution has values
±*a* with equal probability.

A Google print search gives a snippet view which suggests that *Bernoulli distribution* appears in 1932 in
*Handbook of Statistical Nomographs, Tables, and Formulas* by Jack Wilbur Dunlap and Albert Kenneth Kurtz. [James A. Landau]

Since the 1960s the random variable, 1 with probability *p*
and 0 with probability (1-*p*), has been prominent in the literature. This has been called the
INDICATOR RANDOM VARIABLE and also the
*Bernoulli random variable.* The second term appears in Allan Birnbaum "On the Foundations of Statistical Inference:
Binary Experiments," *Annals of Mathematical Statistics*, **32**, (1961),
414-435. It was a natural choice given the established term BERNOULLI TRIAL.
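The random variable described above is simple enough to write down directly. The following Python sketch is illustrative only; the function names are hypothetical, and the mean *p* and variance *p*(1-*p*) are the standard values for a variable taking 1 with probability *p* and 0 otherwise.

```python
import random

def bernoulli(p, rng=random):
    """Return 1 with probability p and 0 with probability 1 - p."""
    return 1 if rng.random() < p else 0

def bernoulli_mean_variance(p):
    """Mean and variance of a Bernoulli random variable with parameter p."""
    return p, p * (1 - p)
```

Summing *n* independent draws of `bernoulli(p)` gives a count of successes in *n* Bernoulli trials, which is the link to the binomial distribution.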

**BERNOULLI’S EQUATION** in **ordinary differential
equations.** Kline (p. 474) gives the history of this equation as follows:
Jakob Bernoulli
proposed the problem of solving the equation in the *Acta Eruditorum* of 1695; in 1697 Leibniz showed it
could be reduced to a linear equation by a change of variable;
John Bernoulli
gave another method. In the
*Acta* of 1696 Jakob solved it essentially by separation of variables.

**BERNOULLI’S EQUATION** (or *Bernoulli’s theorem*) in **hydrodynamics** was introduced by
Daniel Bernoulli
in his work *Hydrodynamica* (1738). It is one of Michael Guillen’s
*Five Equations that Changed the World* (1995).

A Google print search shows *Bernoulli’s equation* in 1920 in *Applied Aerodynamics* by
Leonard Bairstow: “The only external forces acting on the fluid occur at the actuator disc, and the simple form of Bernoulli’s equation developed in the chapter on fluid motion may be applied separately to the two parts of streamlines which are separated by the actuator disc.” [James
A. Landau]

A Google print search shows *Bernoulli’s equation* in 1928 in *Analytical Principles of the Production of Oil, Gas, and Water from Wells*
by Stanley C. Herold: “Whereas the science of mechanics deals with [page 78] energy only in the mechanical forms which are appropriately represented by the three terms in Bernoulli’s equation....” [James A. Landau]

**BERNOULLI NUMBERS.** These were first discussed by
Jakob Bernoulli
in the *Ars Conjectandi* (published 1713). See Hald (1990, section 15.4).

In *The Doctrine of Chances* (3^{rd} edition 1756)
Abraham de Moivre referred to them as "the numbers of Mr. *James Bernoulli*
in his excellent Theorem for the Summing of Powers," quoted in I. Todhunter
*A History
of the Mathematical Theory of Probability* (1865, p. 152).

According to Cajori (vol. 2, page 42), Leonhard Euler introduced the name "Bernoullian numbers" in 1769 in the title of his "De summis serierum numeros Bernoullianos involventium."

**BERNOULLI’S THEOREM** was once the usual name for the first version of the LAW OF LARGE NUMBERS,
proved by Jakob Bernoulli in
*Ars Conjectandi* (1713). See e.g. Todhunter’s
*A History of the Mathematical Theory of Probability*
(1865, p. 71).

**BERNOULLI TRIAL** is dated 1951 in MWCD10, although James A. Landau has found the phrases
"Bernoullian trials" and "Bernoullian series of trials"
in 1937 in *Introduction to Mathematical Probability* by J. V. Uspensky. The reference is to
Jakob Bernoulli’s
*Ars Conjectandi*.

See BINOMIAL DISTRIBUTION.

**BERTRAND’S PARADOX** in probability theory. Four
situations are presented in the first chapter of
Joseph Bertrand’s
Calcul
des probabilités (1889) and any one could be
referred to as “Bertrand’s paradox.” The most discussed, a paradox in geometric
probability, appears on pp. 4-5; see the entry in the
Encyclopedia of Mathematics.
This problem was treated by J. H. Poincaré in the section “Paradoxe de J. Bertrand” of his
Calcul
des probabilités ch. VII p. 118.
The problem Bertrand treats on pp. 2-3 of the
Probabilités is sometimes referred to as
**Bertrand’s box
paradox**. It has been re-invented several times and is best known
today as the MONTY HALL PROBLEM.

**BERTRAND’S POSTULATE** in number theory. In his “Mémoire sur le nombre de valeurs que
peut prendre une fonction quand on y permute les lettres qu’elle renferme,”
*Journal de l’Ecole Polytechnique*, **18**, (1845), 123-140
Joseph Bertrand
made a claim about the distribution of prime numbers based on the numbers
he had examined. Chebyshev published a proof of “le *postulatum* de M. Bertrand” in 1854 in his
“Mémoire sur les nombres premiers,” reprinted in
*Oeuvres I,* p. 50. See the entry in MathWorld.

**BESSEL EQUATION** and **BESSEL FUNCTION** are named for
Wilhelm
Bessel who made the first systematic study of them
in his “Untersuchung des Theils der planetarischen Störungen, welcher aus der
Bewegung der Sonne entsteht,” *Abh. d. K.
Akad. Wiss. Berlin* 1824 (published 1826) 1–52. *Abhandlungen* 1, p. 84.
See the *Encyclopedia of Mathematics* entries
Bessel equation
and Bessel functions and Peter Colwell “Bessel Functions and Kepler’s Equation,”
*American Mathematical Monthly*, **99**,
(1992), 45-48.

Earlier, Euler and Daniel Bernoulli found series solutions to differential equations that amount to Bessel functions, as a paper by Bôcher points out.

Franceschetti (p. 56) implies that the term *Bessel’sche Funktion* was introduced by
Oskar Xavier Schlömilch in 1854. The term is found in 1857 in his
“Ueber die Bessel’sche Funktion”.

*Bessel’s function* is found in English in 1862 in
“Discussion
of the Magnetic and Meteorological Observations Made at the Girard College Observatory”
by A. D. Bache: “...the preceding monthly normals were united into annual means
and the results put into an analytical form, using Bessel’s function applicable to periodical phenomena.” [Google print search]

In 1867 the term *Bessel’sche Differentialgleichung*
was used by Carl Neumann in his
book
on Bessel functions.

The term *Bessel’s equation* was used in English by Cayley in an 1870
paper
“On the Geometrical Theory of Solar Eclipses” in the *Monthly
Notices of the Royal Astronomical Society.*

A Google print search finds *Bessel function* in English in 1888 in three different articles.

Information for this entry was contributed by Jan Peter Schäfermeyer, John Aldrich, and James A. Landau.

**BETA DISTRIBUTION.** *Distribuzione β* is found in 1911 in C. Gini, "Considerazioni
Sulle Probabilità Posteriori e Applicazioni al Rapporto dei Sessi Nelle Nascite
Umane," *Studi Economico-Giuridici della Università di Cagliari*, Anno III,
5-41 (David, 1998).

The distribution has a very long history. The "problem in the doctrine of chances"
that Bayes treated produced a beta distribution for the posterior density of
the probability of a success in Bernoulli trials. In the
early 20^{th} century English literature it was usual to refer to the
distribution by its designation in the Pearson family of curves (see the *Pearson
curves* entry). However, the new text-books of the 1940s did not favour
the Pearson classification and the beta designation has become standard: see
e.g. C. E. Weatherburn’s *A First Course in Mathematical Statistics,* (1946).

A Google print search finds *Beta distribution of the first kind* in
C. E. Weatherburn, *A First Course in Mathematical Statistics.*
According to Google Books, this book was published in 1949, but the title page is not displayed so this date cannot be confirmed. [James A. Landau]

See BAYES and GAMMA DISTRIBUTION.

**BETA** and **GAMMA FUNCTIONS.** These terms derive from the symbols *B*
and Γ used to denote the functions
that Adrien Marie Legendre (1752-1833) called the *Eulerian integral of the
first kind* and *second kind*. Legendre introduced the symbol
Γ and Binet introduced
the symbol *B*. See EULERIAN INTEGRAL and
Earliest use of function symbols.

According to Kline (p. 423), Euler’s
research on the functions, published in 1731 and 1771, grew out of earlier work by
Wallis
published in his *Arithmetica infinitorum*
of 1656. A separate development led to the incomplete *B*-function. This
was Bayes’s (1763) solution of a "problem in the doctrine of chances."
See the BETA DISTRIBUTION and BAYES.

The term **BETTI NUMBER** was coined by Henri Poincaré (1854-1912) and named for
Enrico
Betti (1823-1892), according to a history note by Victor Katz in *A First
Course in Abstract Algebra* by John B. Fraleigh. "Les nombres de Betti" appear
in Poincaré’s "Analysis Situs," *Journal de l’École Polytechnique*,
**1**, (1895), 1-121 and in a short communication, which is available on-line, "Sur L’Analysis Situs,"
*Comptes
Rendus*, **115**, (1892), 633-636.

**BETWEENNESS.** The term had
been used earlier in philosophy and psychology (see the *OED*) but it was
first used in connection with geometry by G. B. Halsted in his paper, “The
Betweenness Assumptions,” *American Mathematical Monthly*, **9**,
(1902), 98-101.

“The betweenness assumptions” was Halsted’s term for what Hilbert
had called the Axiome der Anordnung (Axioms of Arrangement) in his *Grundlagen
der Geometrie* (1899). The explanation of the names is given in a passage from
Hilbert that Halsted (p. 99) translates as follows: “The axioms of this group
define the idea of ‘between,’ and make possible on the basis of this idea the *arrangement*
of the points on a straight, in a plane and in space.”

The term **BEZOUTIANT** was coined by Sylvester. It is found in
J. J. Sylvester, "On a Theory of the Syzygetic Relations of Two Rational Integral Functions,
Comprising an Application to the Theory of Sturm’s Functions, and That of the Greatest Algebraical Common Measure,"
*Philosophical Transactions of the Royal Society of London,* **143,** (1853), 407-548:
"This quadratic function, which plays a great part in the last section and in the theory of real roots,
I term the Bezoutiant; it may be regarded as a species of generating function." [*JSTOR* search]

**BIASED** and **UNBIASED.** *Biased errors* and
*unbiased errors* (meaning "errors with zero expectation") are
found in 1897 in A. L. Bowley, "Relations Between the Accuracy of an
Average and That of Its Constituent Parts," *Journal of the Royal
Statistical Society,* 60, 855-866 (David, 1995).

*Biased sample* is found in 1911 in *An Introduction to the
Theory of Statistics* by G. U. Yule: "Any sample, taken in the
way supposed, is likely to be definitely *biassed,* in the sense
that it will not tend to include, even in the long run, equal
proportions of the A’s and [alpha]’s in the original material"
(OED2).

*Biased sampling* is found in F. Yates, "Some examples of
biassed sampling," *Ann. Eugen.* 6 (1935) [James A. Landau].

See also ESTIMATION.

The term **BICURSAL** was introduced by Cayley (Kline, page 938).

In 1873 Cayley wrote, "A curve of deficiency 1 may be termed bicursal."

**BIJECTION**. See the entry INJECTION, SURJECTION and BIJECTION.

**BILLION.** See MILLION.

**BIMODAL** is found in April 1901 in "A Quantitative Study of Variation in the Smaller North-American Shrikes" by
R. M. Strong in *The American Naturalist.*

**BINARY ARITHMETIC** is found in 1736 in *The Method of Fluxions and Infinite Series; with its Application to the Geometry of Curve-lines.
By the Inventor Sir Isaac Newton, Kt,
Translated from the Author’s Latin Original,*
by John Colson: “Some have consider'd the Binary Arithmetick, or that Scale in which Two is the Root, and have pretended to make Computations by it, and to find considerable advantages in it. But this can never be a convenient Scale to manage and express large Numbers by, because the Root, and consequently its Powers, are so very small, that they make no dispatch in Computations, or converge exceeding slowly. The only Coefficients that are here necessary are 0 and 1.” [Google print search, James A. Landau]

**BINOMIAL.** According to the OED2, the Latin word
*binomius* was in use in algebra in the 16th century.

*Binomial* first appears as a noun in English in its modern
mathematical sense in 1557 in *The Whetstone of Witte* by Robert
Recorde: "The nombers that be compound with + be called
Bimedialles... If their partes be of 2 denominations, then thei named
Binomialles properly. Howbeit many vse to call Binomialles all
compounde nombers that have +" (OED2).

**BINOMIAL COEFFICIENT.** According to Kline (page 272), this term
was introduced by Michael Stifel (1487-1567) about 1544. However,
Julio González Cabillón believes this information is
incorrect. He says Stifel could not have used the word
*coefficient,* which is due to Vieta (1540-1603).

*The Doctrine of Chances: or, A Method of Calculating the Probability of events in Play* by
Abraham De Moivre (1718) has:

...any term of it is related to the Four next preceding ones, according to the following Index, viz. 4 - 6 + 4 - 1, whose parts are the Coefficients of the Binomial *a* - *b* raised to the fourth Power, the first Coefficient being omitted. And generally, if there be any Series of Terms whose last Differences are = 0. Let the number denoting the rank of that difference be *n*; then the Index of the Relation of each Term to as many of the preceding ones as there are Units in *n*, will be expressed by the Coefficients of the Binomial *a* - *b* raised to the Power *n*, omitting the first.

*Binomial coefficient* is found in English in 1733 in
*The Philosophical Transactions (From the Year 1720, to the Year 1732)
Vol. VI part I.* [Google print search, James A. Landau]

**BINOMIAL DISTRIBUTION** is found in Karl Pearson, “Contributions to the Mathematical Theory of Evolution---II. Skew Variation in Homogeneous Material, Received December 19, 1894, --- Read January 24, 1895,”
*Philosophical Transactions of the Royal Society of London for the year MDCCCXCV.* A footnote has:
“This result seems of considerable importance, and I do not believe it has yet been noticed. It gives the mean square error for any binomial distribution, and we see that for most practical purposes it is identical with the value , hitherto deduced as an approximate result, by assuming the binomial to be approximately a normal curve.” [Google print search, James A. Landau]

Fisher adopted it in section 18 of his
*Statistical Methods for Research Workers* (1925).

The name is relatively new but the distribution has been studied
since it was obtained by Jakob (Jacques, James) Bernoulli (1654-1705) in
*Ars Conjectandi* (1713) Part 1. Earlier
names included *binomial law*.

See BERNOULLI TRIAL.

**BINOMIAL THEOREM** is found in 1723 in *Lexicon Technicum: Or, An Universal English Dictionary of ARTS and SCIENCES:
The Second Edition.* [Google print search, James A. Landau]

In Gilbert and Sullivan’s *The Pirates of Penzance* (1879), the
song "I Am The Very Model of a Modern Major-General" includes the
lines:

I'm very well acquainted, too, with matters mathematical,

I understand equations, both the simple and quadratical,

About binomial theorem I'm teeming with a lot o' news,

With many cheerful facts about the square of the hypotenuse. [...]

I'm very good at integral and differential calculus;

I know the scientific names of beings animalculous:

**BINORMAL.** *Binormale* was used by Barré de Saint-Venant in
a paper "Mémoire sur les lignes courbes non planes" which was
presented to l'Académie des Sciences on 16 September 1844 and published in
*Journal de L'école Royale Polytechnique* in 1845. Barré de Saint-Venant also
used *binormale* in *Tableau de formules de la Théorie des Courbes
dans l'Espace,* which also appeared in 1845.

In the former work he defines *binormale* as follows, in English translation:
“Binormal, those of the normals which are perpendicular
to the oscillating plane. This line,
which has not been given a name, is, in effect, normal to two consecutive
elements at the same time, whereas the other normals to the curve are but
single elements.”

In the above, “oscillating” apparently should be “osculating.”

The French original: “Binormale, celle des normales qui est perpendiculaire au plan oscillateur. Cette ligne, que l'on est obligé de considérer très-souvent aussi, et à laquelle il n'a pas encore été donné de nom, est, en effet, normale à deux éléments consécutifs à la fois, tandis que les autres normales à la courbe ne le sont qu'à un seul de ses éléments.”

According to Howard Eves in *A Survey of
Geometry,* vol II (1965), “The name *binormal* was introduced
by B. de Saint-Venant in 1845.”

This entry was contributed by James A. Landau.

The term **BIOMATHEMATICS** was coined by William Moses Feldman
(1880-1939), according to Garry J. Tee in "William Moses Feldman:
Historian of Rabbinical Mathematics and Astronomy." The term appears
in Feldman’s textbook *Biomathematics* published in 1923.

The word **BIOMETRY** had been used occasionally before 1901 but in that year a new journal,
*Biometrika*, appeared.
Francis Galton (1822-1911)
wrote the lead article, "Biometry": "The primary object of Biometry
is to afford material that shall be exact enough for the discovery of incipient
changes in evolution which are too small to be otherwise apparent." (**1**.
p. 9) (OED2) Galton’s associates in founding the journal and in establishing
biometry were the mathematician
Karl Pearson
(1857-1936) and the zoologist
W. F. R. Weldon
(1860-1906). Pearson and Weldon had been doing biometric research for about
ten years while Galton’s efforts went back more than thirty.
For further information see Stephen M. Stigler "The Problematic
Unity of Biometrics," *Biometrics* **56**, (2000), p. 653.
[John Aldrich]

See POPULATION and REGRESSION.

**BIOSTATISTICS** has been a popular title for books and courses in the last few decades but the term
appeared in the 19^{th} century: there is an entry in 1868 in *A Dictionary of Medical Science*
[Google print search] and another in the 1890 edition of Webster. In
the early decades of the twentieth century the term was most associated
with the activities of the Department of Biostatistics, School of
Hygiene and Public Health, Johns Hopkins University. The term overlaps
with medical statistics, VITAL STATISTICS and BIOMETRY.
See also STATISTICS.

**BIPARTITE.** In 1858, Cayley referred to "bipartite binary
quantics."

**BIPARTITE CURVE** appears in 1879 in George Salmon (1819-1904),
*Higher Plane Curves* (ed. 3): "We shall then call the curve we
have been considering a bipartite curve, as consisting of two
distinct continuous series of points" (OED2).

**BIQUATERNION.** Hamilton used the term *biquaternion*
in the sense of a quaternion with complex coefficients.

In the more recent sense, William Kingdon Clifford (1845-1879) coined
the term. It appears in 1873 in *Proc. London Math. Soc.* IV.
386.

**BISECT.** According to the OED2, *bisect* is apparently of
English formation. The word is dated ca. 1645 in MWCD10.

*Bisection* appears in 1656 in a translation of *Hobbes’s
Elem. Philos.* (1839) 307: "By perpetual bisection of an angle"
(OED2).

In 1660, Barrow’s translation of Euclid’s *Elements* has "To
bisect a right line."

*Bisector* appears in English in 1864 in *The Reader* 5
Oct. 483/2: "The internal and external bisectors of the angle"
(OED2).

**BIT** was coined by John W. Tukey (1915-2000).

According to Niels Ole Finnemann in Thought, Sign and Machine, Chapter 6, "After some more informal contacts during the first war years, on the initiative of mathematician Norbert Wiener, a number of scientists gathered in the winter of 1943-44 at a seminar, where Wiener himself tried out his ideas for describing intentional systems as based on feedback mechanisms. On the same occasion J. W. Tukey introduced the term a 'bit' (binary digit) for the smallest informational unit, corresponding to the idea of a quantity of information as a quantity of yes-or-no answers."

Several Internet web pages say Tukey coined the term in 1946. Another web page says, "Tukey records that it evolved over a lunch table as a handier alternative to 'bigit' or 'binit.'"

*Bit* first appeared in print in July 1948 in
"A Mathematical Theory of Communication" by Claude Elwood Shannon (1916-2001) in the *Bell System Technical Journal.*
In the article, Shannon credited Tukey with the coinage. [West Addison assisted with this entry.]

**BIVARIATE** (in Statistics) is found in 1920 in Karl Pearson “Notes on the History of
Correlation,” *Biometrika*, **13**, p. 37: “Thus in 1885 Galton had
completed the theory of bi-variate normal correlation” (*OED*). The word was soon being written
without a hyphen: thus James Henderson “On Expansions in
Tetrachoric Functions,” *Biometrika*, **14**,
(1922), p. 157 writes of the “normal bivariate frequency
surface.”

See MULTIVARIATE, N-VARIATE, TRIVARIATE and UNIVARIATE.

**BLACK-SCHOLES FORMULA** refers to a formula for the pricing of derivatives
on the assumption that stock prices follow a geometric random walk in
Fischer Black and Myron Scholes "The Pricing of Options and Corporate
Liabilities," *Journal of Political Economy*, **81**, (1973), 637-654.
The name "Black-Scholes formula" came into use more or less immediately. Scholes received the
Nobel Prize
for Economic Sciences in 1997; Black had died in 1995.

**BLOCK** and **RANDOMIZED BLOCK**
in experimental design. R.A. Fisher introduced the term *block* in chapter
VIII, section 48, Technique of Plot Experimentation, of his
*Statistical
Methods for Research Workers* (1925).
He also described the technique of randomized blocks, though he did not use
the term. The term *randomized block* appears in his
"The
Arrangement of Field Experiments",
*Journal of the Ministry of Agriculture of Great Britain,* **33**, (1926)
p. 509. (David 2001)

See RANDOMIZATION.

**BONFERRONI INEQUALITIES.**
According to the
St. Andrews website Carlo Emilio Bonferroni
(1892-1960) published some probability inequalities in 1935-6. They were referred
to as Bonferroni’s inequalities by W. Feller in his *An Introduction to Probability
Theory and its Applications volume 1* (p. 75, 1950). (David 2001)

See BOOLE’S INEQUALITY.

**BOOLEAN** is found in 1851 in the *Cambridge and Dublin
Mathematical Journal* vi. 192: "...the Hessian, or as it
ought to be termed, the first Boolian Determinant" (OED2).

**BOOLEAN ALGEBRA.** *Boolian algebra* appears in 1885 in C. S. Peirce, “On the Algebra of Logic,”
*American Journal of Mathematics Vol VII.* [Google print search, James A. Landau]

According to E. V. Huntington in "New Sets of Independent Postulates
for the Algebra of Logic with Special Reference to Whitehead and
Russell’s Principia Mathematica," *Trans. Amer. Math. Soc.*
(1933), the term *Boolean algebra* was introduced by H. M.
Sheffer in the paper "A Set of Five Independent Postulates for
Boolean Algebras with Application to Logical Constants", *Trans.
Amer. Math. Soc.,* 14 (1913).

In an illuminating passage of "Algebraic Logic", Halmos writes (p. 11):

Terminological purists sometimes object to the Boolean use of the word "algebra". The objection is not really cogent. In the first place, the theory of Boolean algebras has not yet collided, and it is not likely to collide, with the theory of linear algebras. In the second place, a collision would not be catastrophic; a Boolean algebra is, after all, a linear algebra over the field of integers modulo 2. (...) While, to be sure, a shorter and more suggestive term than "Boolean algebra" might be desirable, the nomenclature is so thoroughly established that to change now would do more harm than good. [Carlos César de Araújo]

**BOOLE’S INEQUALITY.** This probability inequality appears in
George
Boole’s *An Investigation into the Laws of Thought, on which are founded
the Mathematical Theories of Logic and Probabilities* (1854). In the 1930s
Bonferroni devised a system of inequalities in which the Boole inequality is
the simplest. As it is the most widely used of the Bonferroni inequalities,
it is often referred to as "the Bonferroni inequality" even though Bonferroni
clearly attributed it to Boole. (Based on "George Boole" in *Statisticians
of the Centuries* (ed. C. C. Heyde and E. Seneta) 2001.)
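
For reference, the inequality can be written in modern notation as below. This is a sketch in today's symbols, not Boole's own formulation; the events *A*_{1}, ..., *A*_{n} and the probability *P* are notation assumed here and do not come from the sources cited above.

```latex
P\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) \le \sum_{i=1}^{n} P(A_i)
```

In words, the probability that at least one of the events occurs is at most the sum of their individual probabilities.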

See BONFERRONI INEQUALITIES.

To **BOOT** a computer. Although the *OED*’s earliest
quotation is from 1980, the term has been in use since the early 1950s. The
term seems to derive from the phrase "to pull oneself up by one’s own bootstrap"
which had been in circulation since the 19th century.
Michael Quinion suggests
that the computer people picked up the phrase from Robert Heinlein’s *By
His Bootstraps*, a 1941 short story about time-travel. See the next entry.

**BOOTSTRAP** in Statistics. The term was introduced by Bradley Efron in
"Bootstrap methods: another look
at the jackknife," *Annals of Statistics,* **7**, (1979) 1-26.
Tukey’s "jackknife" had set a precedent for "colorful" terminology
and Efron reported some suggestions for his construct: "*Swiss Army Knife,
Meat Axe, Swan-Dive, Jack Rabbit* and my personal favorite, the *Shotgun,*
which to paraphrase Tukey, 'can blow the head off any problem if the statistician
can stand the resulting mess.'" In his book *An Introduction to the Bootstrap*
(with R. J. Tibshirani) (1993) Efron explained that "the use of the term
bootstrap derives from the phrase to pull oneself up by one’s own bootstrap,
widely thought to be based on one of the eighteenth century *Adventures of
Baron Munchausen*, by Rudolph Erich Raspe. (The Baron had fallen to the bottom
of a deep lake. Just when it looked like all was lost, he thought to pick himself
up by his own bootstraps.)" The words "widely thought" seem to be well
chosen, for Michael Quinion
argues that, while the phrase, which dates from the 19th century, was probably
inspired by Raspe’s story, the exact incident is not in the book! [John Aldrich]

**BOREL-CANTELLI LEMMAS.** These were given in a simple case by
E. Borel
in 1909 “Les probabilités dénombrables et
leurs applications arithmétiques,” *Rendiconti
del Circolo Matematico di Palermo*, 27, 247-271 and more generally in 1917 by
F. P. Cantelli in “Sulla probabilità
come limite di frequenza,” *Rendiconti della R. Accademia dei Lincei*, vol. XXVI,
serie V, gennaio, p. 39-45.

**BORROW** is found in English in 1594 in Blundevil,
*Exerc.*: "Take 6 out of nothing, which will not bee, wherefore
you must borrow 60" (OED2).

In October 1947, "Provision for Individual Differences in High School
Mathematics Courses" by William Lee in *The Mathematics Teacher*
has: "The Social Mathematics course stresses understanding of
arithmetic: 'carrying' in addition, 'regrouping' (*not*
'borrowing') in subtraction, 'indenting' in multiplication are
analyzed and understood rather than remaining mere rote operations to
be performed blindly."

**BORSUK-ULAM THEOREM.** The story of this result is told by Steinhaus in a note published in 1938. "Several years ago Mr.
Ulam
conjectured the following theorem: if a sphere is mapped continuously into a
plane set, there is at least one pair of antipodal points having the same image,
that is, they are mapped into the same point of the plane. This was proved by Mr.
Borsuk
in 1933 (Drei
Sätze über die n-dimensionale euklidische Sphäre *Fundamenta
Mathematicae*, **XX**, p. 177) extending the theorem to *n*
dimensions." In the same note Steinhaus
gave the following illustration, "at any moment, there are two antipodal points
on the Earth’s surface that have the same temperature and the same atmospheric
pressure."

See the entry HAM SANDWICH THEOREM for the Steinhaus note.

The term **BOX-COX TRANSFORMATION** is unusual in being inspired—indirectly—by
a comic opera. *Cox and Box*, with a libretto by Francis Cowley Burnand based on the farce *Box and Cox*, was
Arthur Sullivan’s first. It tells of a landlord’s scheme to get
double rent from a single room: by day he lets it to Mr. Box (a printer who is
out all night) and by night to Mr. Cox (a hatter who works all day). The
statisticians G. E. P. Box
and D. R. Cox
were serving on a committee and the other members thought they should write
a paper together. "We said, 'Well, obviously the
thing to write about is transformations'" recalled Box to M. H. DeGroot in
"A Conversation with George Box," *Statistical Science*, **2**, (1987), p.
254. The resulting Box and Cox paper is "An Analysis of Transformations," *Journal
of the Royal Statistical Society, Series B*, **26**, (1964),
pp. 211-246.

**BOX-JENKINS APPROACH, METHODS etc.** are terms referring to the form of time series analysis presented
in the 1970 book *Time Series Analysis: Forecasting and Control*, by
George Box and
Gwilym Jenkins. The book “is
concerned with the building of stochastic (statistical) models for discrete
time series in the time-domain and the use of such models in important areas
of application.” (Preface.)

**BOYER’S LAW.** See EPONYMY.

The term **BRACHISTOCHRONE** was introduced by Johann Bernoulli
(1667-1748). Smith (vol. 2, page 326) says the term is "due to
the Bernoullis."

*Brachystochrone* is found in English in about 1774 in
*A survey of experimental philosophy, considered in its present state of improvement* by
Oliver Goldsmith:
“The curve of a cycloid, which was afterwards called by the hard name of a Brachystochrone, or the line of quickest descent.” [OED]

**BRANCHING PROCESS.** The term seems
to have been introduced by A. N. Kolmogorov and N. A. Dmitriev in 1947 ("Branching
Stochastic Processes," *Doklady Akademii Nauk*, USSR, **56**, 5-8).
However, there were many investigations of such processes earlier in the century
and even in the 19^{th} century. The French mathematician I. J. Bienaymé
studied the process in 1845! His "De la Loi de Multiplication et de la Durée
des Familles" is reprinted in Kendall (1975).

Bienaymé’s contribution was overlooked until recently but another investigation was more visible. Francis Galton was also interested in the extinction of surnames:

In each generation *a*_{0} per cent. of the adult males have no male children who reach adult life; *a*_{1} have one such male child; *a*_{2} have two; and so on up to *a*_{5} who have five. Find (1) what proportion of the surnames will have become extinct after *r* generations; and (2) how many instances there will be of the same surname being held by *m* persons.

Galton’s friend, H. W. Watson, tackled these questions in "On the Probability of Extinction of Families" (1874). The name "Galton-Watson process" recalls their work.

The process re-appeared in other contexts, e.g. in the genetic work of R. A. Fisher ("On the Dominance Ratio," 1922) and J. B. S. Haldane.

[John Aldrich, based on D. G. Kendall "The Genealogy of Genealogy: Branching
Processes before (and after) 1873" *Bulletin of the London Mathematical Society*,
**7**, (1975), 225-253 and C. C. Heyde & E. Seneta
*I. J. Bienaymé: Statistical Theory Anticipated,* 1977.]

The terms **BRA VECTOR** and **KET VECTOR** were introduced by
Paul Adrien Maurice Dirac (1902-1984).
The terms appear in 1947 in
*Princ. Quantum Mech.* by Dirac:
"It is desirable to have a special name for describing the vectors which
are connected with the states of a system in quantum mechanics, whether they
are in a space of a finite or an infinite number of dimensions. We shall call
them ket vectors, or simply kets, and denote a general one of them by a special
symbol |>. ...
We
shall call the new vectors bra vectors, or simply bras, and denote a general
one of them by the symbol <|, the mirror image of the symbol for a ket vector" (OED2).

**BRIGGSIAN LOGARITHM,** referring to the COMMON LOGARITHM
(logarithm to base 10), is named after
Henry Briggs.

*Briggs’s logarithm* is found in 1706 in *Synopsis Palmariorum Matheseos: Or, a New Introduction to the Mathematics Containing the Principles of Arithmetic & Geometry Demonstrated, In a Short and Easie Method*:
“In making Briggs’s Logarithms, the Index (n) must be 2,3025850, &c. as was hinted before;” [Google print search, James A. Landau].

See the entries COMMON LOGARITHM, LOGARITHM, NAPIERIAN LOGARITHM, NATURAL LOGARITHM.

**BROKEN LINE.** A Google print search by James A. Landau finds “the Logarithmick broken Line” in 1705 in
*The posthumous Works of Robert Hooke, M.D. S.R.S. Geom. Prof. Gresh. &c. Containing his Cutlerian Lectures, and Other Discourses Read at the Meetings of the Illustrious Royal Society*: “the Curve Line drawn through the Points C, f 1,2,3,4 will represent [page 530] the Logarithmick broken Line being composed of the Diagonal Lines fC, hg, ki, ml, on &c. in which BC, df, s 1, f 2, v 3, x 4, the ordinates to the Line AB, shall represent the absolute numbers which are here a rank of continual proportionals answering to the numbers 1, 2, 3, 4, 5, b or nought, and Bd, Bs, Br, Bv, Bx.” [James
A. Landau]

*Broken line* is found in 1852 in a French edition (edited by M. A. Blanchet) of
*Éléments de géométrie* by Adrien Marie Legendre:
"Une ligne brisée ou polygonale est une ligne composée de lignes droites."
(A broken line or polygonal line is a line composed of straight lines.)
The term may appear in the original 1794 edition, which I have not seen.

*Broken line* is found in 1852 in
*Elements of geometry
and trigonometry, from the works of A. M. Legendre.
Revised and adapted to the course of mathematical instruction
in the United States, by Charles Davies*:
"5. A Straight Line is one which lies
in the same direction between any two of its points.
6. A Broken Line is one made up of straight lines,
not lying in the same direction."

*Broken line* is found in 1852 in
*Elements of the differential and
integral calculus* by Charles Davies:
"But the arc POM can never be less
than the chord PM, nor greater than the broken line
PNM which contains it; hence, the limit of the ratio
POM/PM = 1; and consequently, the differential
of the arc is equal to the differential of the chord."

*Broken line* is found in 1852 in
*Elements of plane trigonometry,
with its application to mensuration of heights and distances,
surveying and navigation* by William Smyth:
"Instead of a broken line, a field is sometimes
bounded by a line irregularly curves, as by the
margin of a brook, river, or lake. In this case
(fig. 60) we run, as before, a chain line as near
the boundary as possible, and by means of offsets
determine a sufficient number of points in the
curve to draw it." [These three citations were
found using the University of Michigan Historic Math Collection.]

According to Schwartzman (page 38), the "broken line," meaning a curve composed of connected straight line segments, was adopted "around 1898" by David Hilbert (1862-1943).

**BROUWER’S FIXED-POINT THEOREM.** This appears in
L. E. J. Brouwer’s
“Ueber eineindeutige, stetige Transformationen von Flächen in sich”
*Math. Ann.*, **69** (1910) pp. 176–180.
A *JSTOR* search found a reference to J. W. Alexander’s “Note on Brouwer’s fixed point theorem” of 1924.
See the *Encyclopedia of
Mathematics* entry.

**BROWNIAN MOTION.** In the course
of the 20th century the physical phenomenon described by the botanist
Robert Brown
in 1827 was described in mathematical terms and gradually "Brownian
motion" came to refer as much to the mathematical formalism as to the phenomenon.
Mathematical theories were developed by, inter alia, A. Einstein ("Zur
Theorie der Brownschen Bewegung" (1905)). The "Brownian motion process"
of J. L. Doob’s *Stochastic Processes* (1954) is a type of stochastic process
divested of physical application. Doob states that the process "was first
discussed by Bachelier
[Théorie
de la Spéculation 1900] and later,
more rigorously by Wiener ["Differential-space" *J. Math. and Phys.*
**2** (1923) 131-174]. It is sometimes called the Wiener process." An
earlier term in physics (and mathematics) was "Brownian movement."
This slowly gave way to "Brownian motion," although David (2001) reports
an early appearance of "Brownian motion" in 1892 in W. Ramsay’s Report
of a paper read to the Chemical Society, London. *Nature,* **45,** 429/2.
[John Aldrich]

See FOKKER-PLANCK EQUATION and WIENER PROCESS.

**BRUN’S CONSTANT** (named for Viggo Brun (1882-1978)) was coined by R. P. Brent in "Irregularities
in the distribution of primes and twin primes," *Math. Comp.* 29
(1975), according to *Algorithmic Number Theory* by Bach and
Shallit [Paul Pollack].

**BUILDING.** See the entry APARTMENT, BUILDING and CHAMBER.

**BUNDLE.** See the entry FIBER BUNDLE.

**BURALI-FORTI PARADOX** is now famous as the earliest
paradox of set theory. It refers to a result in
Cesare Burali-Forti’s paper,
"Una questione sui numeri transfiniti" *Rendiconti
del Circolo Matematico di Palermo*, **11**, (1897), 154-164 (translated in Heijenoort
(1967)). Heijenoort comments that "Burali-Forti himself considered the
contradiction as establishing, by *reductio ad absurdum*, the result that
the natural ordering of ordinals is just a partial ordering." Bertrand
Russell called the result "le paradoxe du Burali-Forti" in his
"Les Paradoxes de la Logique,"
*Revue de métaphysique et de morale* (1906) p. 638.

See PARADOX.

**BURNSIDE PROBLEM.** In 1902
William Burnside
wrote, "A still undecided point in the theory of
discontinuous groups is whether the group order of a group may be not finite,
while the order of every operation it contains is finite." "On an Unsettled Question in the Theory
of Discontinuous Groups." *Quart. J. Pure Appl. Math.* **33**,
230-238, 1902. For the history of the problem see
St. Andrews
and Mathworld.

The term **BYTE** was coined in 1956 by Dr. Werner Buchholz of IBM.
A question-and-answer session at an ACM conference on the history of
programming languages included this exchange:

JOHN GOODENOUGH: You mentioned that the term "byte" is used in JOVIAL. Where did the term come from?

JULES SCHWARTZ (inventor of JOVIAL): As I recall, the AN/FSQ-31, a totally different computer than the 709, was byte oriented. I don't recall for sure, but I'm reasonably certain the description of that computer included the word "byte," and we used it.

FRED BROOKS: May I speak to that? Werner Buchholz coined the word as part of the definition of STRETCH, and the AN/FSQ-31 picked it up from STRETCH, but Werner is very definitely the author of that word.

SCHWARTZ: That’s right. Thank you.