Probability is one of the basic concepts of probability theory. There are several definitions of this concept. We first give the definition that is called classical. We then point out the weaknesses of this definition and give other definitions that overcome its shortcomings.

Consider an example. Let an urn contain 6 identical, thoroughly mixed balls: 2 red, 3 blue and 1 white. Obviously, the possibility of drawing a colored (i.e., red or blue) ball at random from the urn is greater than the possibility of drawing a white ball. Can this possibility be characterized by a number? It turns out that it can. This number is called the probability of the event (here, the appearance of a colored ball). Thus, probability is a number that characterizes the degree of possibility of the occurrence of an event.

Let us set ourselves the task of giving a quantitative estimate of the possibility that a ball taken at random is colored. The appearance of a colored ball will be considered as event A. Each of the possible results of the trial (the trial consists in drawing a ball from the urn) will be called an elementary outcome (elementary event). Denote the elementary outcomes by w_1, w_2, w_3, etc. In our example, the following 6 elementary outcomes are possible: w_1, a white ball has appeared; w_2, w_3, a red ball has appeared; w_4, w_5, w_6, a blue ball has appeared. It is easy to see that these outcomes form a complete group of pairwise incompatible events (exactly one ball will necessarily appear) and that they are equally possible (the ball is taken out at random, and the balls are identical and thoroughly mixed).

Those elementary outcomes in which the event of interest occurs are called favorable to this event. In our example, the following 5 outcomes favor event A (the appearance of a colored ball): w_2, w_3, w_4, w_5, w_6.

Thus, event A is observed if one of the elementary outcomes favoring A occurs in the trial, no matter which one; in our example, A is observed if w_2 or w_3 or w_4 or w_5 or w_6 occurs. In this sense, event A is subdivided into several elementary events (w_2, w_3, w_4, w_5, w_6), whereas an elementary event is not subdivided into other events. This is the difference between event A and an elementary event (elementary outcome).

The ratio of the number of elementary outcomes favorable to the event A to their total number is called the probability of the event A and is denoted by P(A). In the example under consideration, there are 6 elementary outcomes; of these, 5 favor event A. Therefore, the probability that the drawn ball will be colored is P(A) = 5 / 6. This number gives the quantitative estimate of the degree of possibility of the appearance of a colored ball that we wanted to find. We now give the definition of probability.

The probability of event A is the ratio of the number of outcomes favorable to this event to the total number of all equally possible incompatible elementary outcomes that form a complete group. So, the probability of event A is determined by the formula

P(A) = m / n,

where m is the number of elementary outcomes favoring A and n is the number of all possible elementary outcomes of the trial.
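The counting behind the classical definition can be sketched in a few lines of Python. This is only an illustration of the urn example above; the names `urn` and `classical_probability` are chosen here and are not part of the original text.

```python
from fractions import Fraction

# Urn from the example: 2 red, 3 blue, 1 white ball (6 equally possible outcomes).
urn = ["red", "red", "blue", "blue", "blue", "white"]

def classical_probability(outcomes, is_favorable):
    """Return m / n: favorable outcomes over all equally possible outcomes."""
    m = sum(1 for outcome in outcomes if is_favorable(outcome))
    n = len(outcomes)
    return Fraction(m, n)

p_colored = classical_probability(urn, lambda ball: ball in ("red", "blue"))
p_white = classical_probability(urn, lambda ball: ball == "white")
print(p_colored, p_white)  # 5/6 1/6
```

Exact fractions are used so that the result can be compared directly with 5 / 6.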

It is assumed here that the elementary outcomes are incompatible, equally possible, and form a complete group. The following properties follow from the definition of probability:

Property 1. The probability of a certain event is equal to one.

Indeed, if the event is certain, then each elementary outcome of the trial favors the event. In this case, m = n, therefore,

P(A) = m / n = n / n = 1.

Property 2. The probability of an impossible event is zero.

Indeed, if the event is impossible, then none of the elementary outcomes of the trial favors the event. In this case, m = 0, therefore,

P(A) = m / n = 0 / n = 0.

Property 3. The probability of a random event is a positive number between zero and one.

Indeed, only a part of the total number of elementary outcomes of the trial favors a random event. In this case 0 < m < n, so 0 < m / n < 1, and therefore

0 < P(A) < 1.

So, the probability of any event satisfies the double inequality

0 ≤ P(A) ≤ 1.

Remark. Modern rigorous courses in probability theory are built on a set-theoretic basis. We confine ourselves to the presentation in the language of set theory of those concepts that were considered above.

Let one and only one of the events w_i (i = 1, 2, ..., n) occur as a result of the trial. The events w_i are called elementary events (elementary outcomes). It already follows from this that elementary events are pairwise incompatible. The set of all elementary events that can appear in a trial is called the space of elementary events W, and the elementary events themselves are called points of the space W.

Event A is identified with the subset of the space W whose elements are the elementary outcomes favoring A; event B is the subset of W whose elements are the outcomes favorable to B, and so on. Thus, the set of all events that can occur in a trial is the set of all subsets of W. W itself occurs with any outcome of the trial, so W is a certain event; the empty subset of the space W is an impossible event (it does not occur for any outcome of the trial).

Note that elementary events are distinguished from all other events by the fact that each of them contains only one element of W.

Each elementary outcome w_i is assigned a positive number p_i, the probability of this outcome, and

p_1 + p_2 + ... + p_n = 1.

By definition, the probability P(A) of an event A is equal to the sum of the probabilities of the elementary outcomes favoring A. From this it is easy to see that the probability of a certain event is one, that of an impossible event is zero, and that of an arbitrary event lies between zero and one.

Consider an important special case when all outcomes are equally likely. The number of outcomes is n, the sum of the probabilities of all outcomes is equal to one; hence the probability of each outcome is 1/n. Let event A be favored by m outcomes. The probability of event A is equal to the sum of the probabilities of outcomes favoring A:

P(A) = 1 / n + 1 / n + ... + 1 / n.

Considering that the number of terms is equal to m, we have

P(A) = m / n.

The classical definition of probability is obtained.
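The set-theoretic definition can be tried out directly: an event is a set of elementary outcomes and its probability is the sum of their probabilities. The following is a minimal sketch for the equally likely case with n = 6; the labels "w1" ... "w6" and the function name are assumptions made for the illustration.

```python
from fractions import Fraction

def event_probability(p, event):
    """Sum the probabilities p[w] of the elementary outcomes w that favor the event."""
    return sum((p[w] for w in event), Fraction(0))

# Equally likely case: n = 6 outcomes, each with probability 1 / n.
outcomes = ["w1", "w2", "w3", "w4", "w5", "w6"]
p = {w: Fraction(1, len(outcomes)) for w in outcomes}

A = {"w2", "w3", "w4", "w5", "w6"}  # colored ball, m = 5 favorable outcomes
certain = set(outcomes)              # the whole space W
impossible = set()                   # the empty subset of W

print(event_probability(p, A))           # 5/6
print(event_probability(p, certain))     # 1
print(event_probability(p, impossible))  # 0
```

Replacing the dictionary `p` by unequal positive numbers that sum to one gives the general case, in which the ratio m / n no longer applies.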

The construction of a logically complete probability theory is based on the axiomatic definition of a random event and its probability. In the system of axioms proposed by A. N. Kolmogorov, the elementary event and probability are undefined (primitive) concepts. Here are the axioms that define probability:

1. Each event A is assigned a non-negative real number P(A). This number is called the probability of event A.

2. The probability of a certain event is equal to one: P(W) = 1, where W is the certain event (the whole space of elementary events).

3. The probability of occurrence of at least one of the pairwise incompatible events is equal to the sum of the probabilities of these events.

Based on these axioms, the properties of probabilities and the relationships between them are derived as theorems.

3. Statistical definition of probability, relative frequency.

The classical definition does not require an experiment. Real applied problems, however, often have an infinite number of outcomes, and in that case the classical definition cannot give an answer. In such problems we therefore use the statistical definition of probability, which is computed after the experiment has been carried out.

The statistical probability w(A), or relative frequency, is the ratio of the number m of trials in which the event occurred to the total number n of trials actually conducted:

w(A) = m / n.

The relative frequency of an event has the stability property: for every ε > 0,

lim_{n→∞} P(|m / n - p| < ε) = 1 (the stability property of the relative frequency),

where p is the probability of the event in a single trial.
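The stability of the relative frequency can be observed in a simple simulation. The sketch below assumes independent trials with a fixed success probability p and uses a seeded pseudo-random generator; the function name and the chosen numbers of trials are illustrative only.

```python
import random

def relative_frequency(n_trials, p, rng):
    """Run n_trials independent trials with success probability p; return m / n."""
    m = sum(1 for _ in range(n_trials) if rng.random() < p)
    return m / n_trials

rng = random.Random(0)  # fixed seed so that the run is reproducible
for n in (100, 10_000, 100_000):
    # The printed frequencies should settle near p = 0.5 as n grows.
    print(n, relative_frequency(n, 0.5, rng))
```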

4. Geometric probabilities.

In the geometric approach to the definition of probability, an arbitrary set Ω of finite Lebesgue measure on the line, in the plane or in space is taken as the space of elementary events. The events are all measurable subsets of Ω.

The probability of event A is determined by the formula

P(A) = μ(A) / μ(Ω),

where μ(A) denotes the Lebesgue measure of the set A. With this definition of events and probabilities, all of A. N. Kolmogorov's axioms are satisfied.

In specific problems that reduce to the above probabilistic scheme, the trial is interpreted as a random selection of a point in some region Ω, and the event A as the chosen point falling into some subregion A of Ω. This requires that all points of Ω have the same chance of being selected. This requirement is usually expressed by the words "at random", "randomly", etc.

The randomness of events is associated with the impossibility of predicting in advance the outcome of a particular trial. However, if we consider, for example, repeated tosses of a coin with outcomes ω_1, ω_2, …, ω_n, it turns out that a given side comes up in approximately half of the outcomes (n / 2); this regularity corresponds to the concept of probability.

The probability of an event A is understood as a numerical characteristic of the possibility of the occurrence of A. We denote this numerical characteristic by P(A). There are several approaches to defining probability. The main ones are the statistical, classical and geometric approaches.

Let n trials be carried out, and let some event A occur n_A times in them. The number n_A is called the absolute frequency (or simply the frequency) of the event A, and the ratio w(A) = n_A / n is called the relative frequency of occurrence of the event A. The relative frequency of any event is characterized by the following properties:

The basis for applying the methods of probability theory to the study of real processes is the objective existence of random events that have the property of frequency stability. Numerous trials of the event A under study show that for large n the relative frequency w(A) remains approximately constant.

The statistical definition of probability is that the probability of an event A is taken to be the constant value P(A) around which the values of the relative frequency w(A) fluctuate as the number of trials n increases without bound.

Remark 1. Note that the range of the probability of a random event, from zero to one, was chosen by B. Pascal for convenience of calculation and application. In his correspondence with P. Fermat, Pascal pointed out that any other interval could have been chosen, for example, from zero to one hundred. In the problems below in this tutorial, probabilities are sometimes given as percentages, i.e. from zero to one hundred. In this case, the percentages given in the problems must be converted into fractions, i.e. divided by 100.

Example 1. Ten series of coin tosses were carried out, 1000 tosses in each. The values of w(A) in the series are 0.501; 0.485; 0.509; 0.536; 0.485; 0.488; 0.500; 0.497; 0.494; 0.484. These frequencies cluster around P(A) = 0.5.

This example confirms that the relative frequency w(A) is approximately equal to P(A), i.e. w(A) ≈ P(A).
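How closely the ten frequencies of Example 1 cluster around 0.5 can be checked with a short calculation. The variable names below are chosen for this illustration; the data are the ten values quoted in the example.

```python
# Relative frequencies from the ten series of 1000 tosses in Example 1.
freqs = [0.501, 0.485, 0.509, 0.536, 0.485, 0.488, 0.500, 0.497, 0.494, 0.484]

mean_freq = sum(freqs) / len(freqs)          # average over the ten series
max_dev = max(abs(f - 0.5) for f in freqs)   # largest deviation from 0.5

print(round(mean_freq, 4), round(max_dev, 3))
```

The average comes out at about 0.4979, and no single series deviates from 0.5 by more than about 0.036.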

Classical and statistical definitions of probability. Geometric probability.

The basic concept of probability theory is the concept of a random event. A random event is an event that, under certain conditions, may or may not occur. For example, hitting or missing an object when firing at this object with a given weapon is a random event.

An event is called certain if, as a result of the test, it necessarily occurs. An impossible event is an event that, as a result of a test, cannot occur.

Random events are said to be incompatible in a given trial if no two of them can occur together.

Random events form a complete group if, on each trial, any one of them may occur and no other event incompatible with them can occur.

Consider a complete group of equally possible incompatible random events. Such events will be called outcomes. An outcome is called favorable to event A if the occurrence of this outcome entails the occurrence of event A.

The probability of an event A is the ratio of the number m of outcomes favorable to this event to the total number n of all equally possible incompatible elementary outcomes that form a complete group: P(A) = m / n.

Geometric probability is one way of specifying probability. Let Ω be a bounded set of Euclidean space with volume λ(Ω) (respectively, length or area in the one-dimensional or two-dimensional case), let ω be a point taken at random from Ω, and let the probability that the point is taken from a subset x of Ω be proportional to its volume λ(x). Then the geometric probability of the subset x is defined as the ratio of volumes: P(x) = λ(x) / λ(Ω). The geometric definition of probability is often used in Monte Carlo methods, for example, to approximate the values of multiple definite integrals.
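The link with Monte Carlo methods can be illustrated by estimating a ratio of areas. The sketch below assumes that Ω is the unit square and that the subset is the quarter disc x² + y² ≤ 1, so the exact ratio is π / 4; both choices and all names are made up for the illustration.

```python
import math
import random

def geometric_probability(n_points, in_subset, rng):
    """Estimate the area ratio for a subset of the unit square [0, 1] x [0, 1]."""
    hits = sum(1 for _ in range(n_points) if in_subset(rng.random(), rng.random()))
    return hits / n_points

def in_quarter_disc(x, y):
    # The subset: points of the square inside the unit circle.
    return x * x + y * y <= 1.0

estimate = geometric_probability(200_000, in_quarter_disc, random.Random(0))
print(estimate, math.pi / 4)  # the estimate should be close to pi / 4
```

The same scheme approximates a definite integral whenever the integral can be written as the area (or volume) of a region inside a box.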

Theorems of addition and multiplication of probabilities

The sum of two events A and B is the event C, which consists in the occurrence of at least one of the events A or B.

Addition theorem

The probability of the sum of two incompatible events is equal to the sum of the probabilities of these events:

P (A + B) = P (A) + P (B).

In the case when events A and B are compatible, the probability of their sum is expressed by the formula

P(A + B) = P(A) + P(B) - P(AB),

where AB is the product of events A and B.
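Both forms of the addition theorem can be verified on a small space of equally possible outcomes. The example below uses one roll of a fair die, which is an assumption made for the illustration and does not come from the text; events are represented as Python sets.

```python
from fractions import Fraction

space = set(range(1, 7))  # one roll of a fair die (illustrative choice)

def P(event):
    """Classical probability: favorable outcomes over all outcomes."""
    return Fraction(len(event & space), len(space))

A = {2, 4, 6}  # an even number is rolled
B = {4, 5, 6}  # more than three is rolled

union = A | B  # the sum A + B
both = A & B   # the product AB
assert P(union) == P(A) + P(B) - P(both)  # compatible events

C = {1}        # incompatible with A
assert P(A | C) == P(A) + P(C)            # incompatible events

print(P(union))  # 2/3
```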

Two events are said to be dependent if the probability of one of them depends on the occurrence or non-occurrence of the other. In the case of dependent events, the concept of the conditional probability of an event is introduced.

The conditional probability P(A/B) of event A is the probability of event A calculated assuming that event B has occurred. Similarly, P(B/A) denotes the conditional probability of an event B, provided that the event A has occurred.

The product of two events A and B is the event C, which consists in the joint occurrence of the event A and the event B.

Probability multiplication theorem

The probability of the product of two events is equal to the probability of one of them multiplied by the conditional probability of the other, given that the first has occurred:

P(AB) = P(A) P(B/A), or P(AB) = P(B) P(A/B).

Corollary. The probability of the joint occurrence of two independent events A and B is equal to the product of the probabilities of these events:

P(AB) = P(A) P(B).

Corollary. In the case of n identical independent trials, in each of which event A appears with probability p, the probability of occurrence of event A at least once is equal to 1 - (1 - p)^n.
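The multiplication theorem for dependent events can be checked by direct enumeration. The sketch below assumes an urn with 2 red, 3 blue and 1 white ball and two draws without replacement, with A = "the first ball is red" and B = "the second ball is red"; this setup is an illustration, not an example from the text.

```python
from fractions import Fraction
from itertools import permutations

urn = ["red", "red", "blue", "blue", "blue", "white"]
# Every ordered pair of distinct ball indices is one equally possible outcome
# of drawing two balls without replacement.
pairs = list(permutations(range(len(urn)), 2))

def first_red(pair):
    return urn[pair[0]] == "red"

def second_red(pair):
    return urn[pair[1]] == "red"

p_a = Fraction(sum(first_red(x) for x in pairs), len(pairs))

# Conditional probability P(B/A): restrict the space to outcomes where A occurred.
given_a = [x for x in pairs if first_red(x)]
p_b_given_a = Fraction(sum(second_red(x) for x in given_a), len(given_a))

p_ab = Fraction(sum(first_red(x) and second_red(x) for x in pairs), len(pairs))

assert p_ab == p_a * p_b_given_a
print(p_a, p_b_given_a, p_ab)  # 1/3 1/5 1/15
```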

The probability that at least one event will occur. Example. Bayes formula.

The probability of making at least one mistake on a notebook page is p=0.1. There are 7 written pages in the notebook. What is the probability P that there is at least one error in the notebook?

The probability of an event A consisting in the occurrence of at least one of the mutually independent events A_1, A_2, ..., A_n is equal to the difference between one and the product of the probabilities of the opposite events Ā_1, Ā_2, ..., Ā_n:

P(A) = 1 - q_1 q_2 … q_n.

The probability of the opposite event is q = 1 - p.

In particular, if all the events have the same probability p, then the probability of the occurrence of at least one of them is

P(A) = 1 - q^n = 1 - (1 - p)^n = 1 - (1 - 0.1)^7 ≈ 0.522.

Answer: 0.522
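The arithmetic of the notebook example can be reproduced as follows. The function name `at_least_one` is chosen for this sketch; it assumes the events are independent, as in the statement above.

```python
def at_least_one(probabilities):
    """Return 1 - q_1 * q_2 * ... * q_n for independent events with probabilities p_i."""
    q_product = 1.0
    for p in probabilities:
        q_product *= 1.0 - p  # probability that this event does not occur
    return 1.0 - q_product

# Seven pages, each with error probability 0.1.
p_notebook = at_least_one([0.1] * 7)
print(round(p_notebook, 3))  # 0.522
```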

Bayes formula.

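Let us assume that some experiment is being carried out, and that n hypotheses H_1, H_2, ..., H_n, which are the only possible ones and are mutually incompatible, can be stated about the conditions under which it is conducted, with probabilities P(H_1), P(H_2), ..., P(H_n). Let the event A occur or not occur as a result of the experiment, and let it be known that if the experiment takes place when the hypothesis H_i holds, then the probability of A is P(A/H_i). What are the probabilities of the hypotheses if it becomes known that event A has happened? In other words, we are interested in the probabilities P(H_i/A).

Based on relations (4) and (5), we have

P(A H_i) = P(A) P(H_i/A) = P(H_i) P(A/H_i),

whence

P(H_i/A) = P(H_i) P(A/H_i) / P(A).

But according to the total probability formula,

P(A) = P(H_1) P(A/H_1) + P(H_2) P(A/H_2) + ... + P(H_n) P(A/H_n),

therefore

P(H_i/A) = P(H_i) P(A/H_i) / (P(H_1) P(A/H_1) + ... + P(H_n) P(A/H_n)). (12)

Formula (12) is called the Bayes formula.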
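As a sketch of how the Bayes formula is applied, the function below takes the prior probabilities of the hypotheses and the conditional probabilities of A under each of them, and returns the probabilities of the hypotheses given that A occurred. The numbers (two machines, their shares of output and defect rates) are hypothetical and serve only as an illustration.

```python
def bayes(priors, likelihoods):
    """Return P(H_i/A) for every hypothesis from P(H_i) and P(A/H_i)."""
    joint = [p_h * p_a_h for p_h, p_a_h in zip(priors, likelihoods)]
    p_a = sum(joint)  # total probability formula for P(A)
    return [j / p_a for j in joint]

# Hypothetical data: two machines produce 60% and 40% of all parts,
# with defect rates 1% and 3%; A = "a randomly chosen part is defective".
posteriors = bayes([0.6, 0.4], [0.01, 0.03])
print([round(p, 3) for p in posteriors])  # [0.333, 0.667]
```

Although the second machine produces fewer parts, a defective part is twice as likely to have come from it, because its defect rate is three times higher.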

6. Bernoulli formula. Examples.

The Bernoulli formula is a formula in probability theory that gives the probability that an event A occurs a given number of times in a series of independent trials. When the number of trials is large, it saves a great many calculations (additions and multiplications of probabilities). It is named after the outstanding Swiss mathematician Jacob Bernoulli, who derived it.


Theorem. If the probability p of the occurrence of the event A in each trial is constant, then the probability that the event A occurs exactly k times in n independent trials is equal to

P_n(k) = C(n, k) p^k q^(n - k), where q = 1 - p.

Proof. Since n independent trials are carried out under the same conditions and the event A occurs in each of them with probability P(A) = p, the opposite event Ā occurs with probability P(Ā) = 1 - p = q. Let A_i denote the occurrence of the event in trial number i. Since the conditions of the trials are the same, these probabilities are equal: P(A_1) = P(A_2) = ... = P(A_n) = p. Let the event occur k times in the n trials; then it does not occur in the remaining n - k trials. The event can occur k times in n trials in various combinations, the number of which is equal to the number of combinations of n elements taken k at a time. This number is found by the formula C(n, k) = n! / (k! (n - k)!). The probability of each combination is equal to the product of the probabilities, p^k q^(n - k). Applying the addition theorem for the probabilities of incompatible events, we obtain the final Bernoulli formula: P_n(k) = C(n, k) p^k q^(n - k).
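A direct computation with the Bernoulli formula, taken here in its standard form P_n(k) = C(n, k) p^k q^(n - k) with q = 1 - p, might look as follows. The function name and the coin example are chosen for illustration.

```python
from math import comb

def bernoulli(n, k, p):
    """Probability of exactly k occurrences in n independent trials."""
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k)

# Illustrative numbers: exactly 2 heads in 5 tosses of a fair coin.
print(bernoulli(5, 2, 0.5))  # 0.3125

# The probabilities over k = 0, ..., n form a complete group and sum to one.
total = sum(bernoulli(5, k, 0.5) for k in range(6))
```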

Local and integral theorems of Laplace. Examples.

Local and integral Laplace theorems

Local Laplace theorem. The probability that in n independent trials, in each of which the probability of occurrence of an event is equal to p (0 < p < 1), the event occurs exactly k times (in any order) is approximately equal to (the larger n, the more accurate the approximation)

P_n(k) ≈ φ(x) / √(npq), where φ(x) = (1 / √(2π)) e^(-x² / 2), x = (k - np) / √(npq), q = 1 - p.

To determine the values of φ(x), you can use a special table.

Integral theorem of Laplace. The probability that in n independent trials, in each of which the probability of occurrence of an event is equal to p (0 < p < 1), the event occurs at least k_1 times and at most k_2 times is approximately equal to

P_n(k_1; k_2) ≈ Φ(x'') - Φ(x'), where x' = (k_1 - np) / √(npq), x'' = (k_2 - np) / √(npq), q = 1 - p.

Here Φ(x) = (1 / √(2π)) ∫_0^x e^(-z² / 2) dz is the Laplace function. The values of the Laplace function are found from a special table.
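Instead of a table, the Laplace function can be evaluated through the error function. The sketch below assumes the normalization Φ(x) = (1 / √(2π)) ∫_0^x e^(-z² / 2) dz, for which Φ(x) = erf(x / √2) / 2, and compares the approximation with the exact binomial sum. The numbers n = 400, p = 0.2, k_1 = 70, k_2 = 100 are illustrative and not taken from the text.

```python
from math import comb, erf, sqrt

def laplace_function(x):
    """Phi(x) = (1 / sqrt(2*pi)) * integral from 0 to x of exp(-z**2 / 2) dz."""
    return 0.5 * erf(x / sqrt(2.0))

def integral_laplace(n, p, k1, k2):
    """Approximate probability of between k1 and k2 occurrences in n trials."""
    s = sqrt(n * p * (1.0 - p))
    x1 = (k1 - n * p) / s
    x2 = (k2 - n * p) / s
    return laplace_function(x2) - laplace_function(x1)

def exact(n, p, k1, k2):
    """Exact probability: sum of Bernoulli-formula terms for k1 <= k <= k2."""
    return sum(comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(k1, k2 + 1))

approx_value = integral_laplace(400, 0.2, 70, 100)
exact_value = exact(400, 0.2, 70, 100)
print(approx_value, exact_value)
```

The two values agree to within a few hundredths; the approximation improves as n grows.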

Example. Find the probability that event A occurs exactly 70 times in 243 trials if the probability of this event occurring in each trial is 0.25.

Solution. By condition, n = 243; k = 70; p = 0.25; q = 0.75. Since n = 243 is a fairly large number, we use the local Laplace theorem: P_n(k) ≈ φ(x) / √(npq), where x = (k - np) / √(npq).

Find the value of x: x = (70 - 243 · 0.25) / √(243 · 0.25 · 0.75) = 9.25 / 6.75 ≈ 1.37. From the table we find φ(1.37) = 0.1561. The desired probability is

P_243(70) ≈ (1 / 6.75) · 0.1561 ≈ 0.0231.
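The example can be checked numerically, computing φ(x) directly instead of reading it from a table and comparing with the exact value given by the Bernoulli formula. The function names are chosen for this sketch.

```python
from math import comb, exp, pi, sqrt

def phi(x):
    """Density of the standard normal law: exp(-x**2 / 2) / sqrt(2*pi)."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

def local_laplace(n, k, p):
    """Local Laplace approximation phi(x) / sqrt(npq), x = (k - np) / sqrt(npq)."""
    s = sqrt(n * p * (1.0 - p))
    x = (k - n * p) / s
    return phi(x) / s

def exact_binomial(n, k, p):
    """Exact value from the Bernoulli formula."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

approx = local_laplace(243, 70, 0.25)
print(round(approx, 4))  # 0.0231
print(exact_binomial(243, 70, 0.25))  # exact value, for comparison
```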

Numerical characteristics of discrete quantities. Examples

Numerical characteristics of discrete random variables

The distribution law fully characterizes a random variable. However, when it is impossible to find the distribution law, or when this is not required, one can limit oneself to finding quantities called numerical characteristics of the random variable. These quantities determine some average value around which the values of the random variable are grouped, and the degree of their dispersion around this average value.

Definition. The mathematical expectation of a discrete random variable is the sum of the products of all its possible values and their probabilities:

M(X) = x_1 p_1 + x_2 p_2 + ... + x_n p_n + ...

The mathematical expectation exists if the series on the right side of the equality converges absolutely.

From the point of view of probability, we can say that the mathematical expectation is approximately equal to the arithmetic mean of the observed values of the random variable.
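For a random variable with finitely many values the definition reduces to a finite sum, which can be sketched as follows. The fair die is an illustrative choice, and the function name is an assumption of this sketch.

```python
def expectation(values, probabilities):
    """Sum of the products of the possible values and their probabilities."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    return sum(x * p for x, p in zip(values, probabilities))

faces = [1, 2, 3, 4, 5, 6]
die_mean = expectation(faces, [1 / 6] * 6)
print(die_mean)  # about 3.5
```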

Theoretical moments. Examples.

The idea of the method of moments is to equate theoretical and empirical moments. Therefore, we begin by discussing these concepts.

Let X_1, ..., X_n be an independent sample from a distribution that depends on an unknown parameter θ = (θ_1, ..., θ_r). The theoretical moment of order k is the function m_k(θ) = M(ξ^k), where ξ is a random variable with distribution function F_θ. We especially note that the theoretical moment is a function of the unknown parameters, since the distribution depends on these parameters. We will assume that the mathematical expectations exist at least for k = 1, ..., r. The empirical moment of order k is the quantity

a_k = (X_1^k + X_2^k + ... + X_n^k) / n.

Note that, by definition, empirical moments are functions of the sample. Notice that a_1 is the well-known sample mean.

In order to find estimates of the unknown parameters using the method of moments, one should:

1. explicitly compute the theoretical moments m_k(θ) and construct the following system of equations for the unknowns θ_1, ..., θ_r:

m_k(θ_1, ..., θ_r) = a_k, k = 1, ..., r. (35)

In this system, the empirical moments a_1, ..., a_r are considered as fixed parameters.

2. solve system (35) with respect to the variables θ_1, ..., θ_r. Since the right side of the system depends on the sample, the result will be functions of X_1, ..., X_n. These are the desired estimates of the parameters by the method of moments.
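The two steps can be sketched on a simple case that is not taken from the text: a distribution with unknown mean and variance, for which the first theoretical moment is the mean and the second is the variance plus the squared mean. Equating them to the empirical moments and solving gives the estimates below; the sample is made up for the illustration.

```python
def empirical_moment(sample, k):
    """Empirical moment of order k: the average of the k-th powers of the sample."""
    return sum(x**k for x in sample) / len(sample)

def moments_estimate_mean_variance(sample):
    """Solve m1 = mean, m2 = variance + mean**2 for (mean, variance)."""
    m1 = empirical_moment(sample, 1)
    m2 = empirical_moment(sample, 2)
    return m1, m2 - m1 * m1

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations
print(moments_estimate_mean_variance(sample))  # (5.0, 4.0)
```

Here the system has two equations and is solved in closed form; for other families of distributions the system may have to be solved numerically.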

12. Chebyshev's inequality. The law of large numbers.

The Chebyshev inequality, also known as the Bienaymé-Chebyshev inequality, is a well-known inequality in measure theory and probability theory. It was first obtained by Bienaymé in 1853, and later also by Chebyshev. The inequality used in measure theory is more general; in probability theory, its corollary is used.

Chebyshev's inequality in measure theory

The Chebyshev inequality in measure theory describes the relationship between the Lebesgue integral and the measure. An analogue of this inequality in probability theory is the Markov inequality. Chebyshev's inequality is also used to prove the embedding of the space L_p in the weak L_p space.


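Let (X, F, μ) be a measure space. Let also E be a measurable subset of X, f a function summable on E, and t > 0.

Then the inequality is true:

μ({x ∈ E : |f(x)| ≥ t}) ≤ (1 / t) ∫_E |f| dμ.

More generally:

If g is a non-negative real measurable function that is non-decreasing on its domain of definition and g(t) > 0, then

μ({x ∈ E : f(x) ≥ t}) ≤ (1 / g(t)) ∫_E g(f(x)) dμ.

In terms of the space L_p: let f ∈ L_p(X, F, μ). Then

μ({x ∈ X : |f(x)| > t}) ≤ ||f||_p^p / t^p.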

Chebyshev's inequality in probability theory

Chebyshev's inequality in probability theory states that a random variable mostly takes values close to its mean. More precisely, it gives an estimate of the probability that a random variable takes a value that is far from its mean. Chebyshev's inequality is a consequence of Markov's inequality.

Let a random variable X be defined on a probability space (Ω, F, P), and let its mathematical expectation M(X) and variance D(X) be finite. Then

P(|X - M(X)| ≥ a) ≤ D(X) / a², where a > 0.

If a = kσ, where σ = √D(X) is the standard deviation and k > 0, then we get

P(|X - M(X)| ≥ kσ) ≤ 1 / k².

In particular, a random variable with finite variance deviates from the mean by 2 or more standard deviations with probability at most 1/4. It deviates from the mean by 3 or more standard deviations with probability at most 1/9.
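The inequality, taken in its standard form P(|X - M(X)| ≥ a) ≤ D(X) / a², can be compared with exact tail probabilities on a small example. The sketch below uses a fair die, which is an illustrative choice; exact fractions are used so that the comparison involves no rounding.

```python
from fractions import Fraction

values = range(1, 7)  # faces of a fair die (illustrative choice)
p = Fraction(1, 6)

mean = sum(x * p for x in values)                    # 7/2
variance = sum((x - mean) ** 2 * p for x in values)  # 35/12

def tail(a):
    """Exact probability that X deviates from its mean by at least a."""
    return sum(p for x in values if abs(x - mean) >= a)

for a in (Fraction(3, 2), 2, Fraction(5, 2)):
    bound = variance / a**2
    assert tail(a) <= bound  # Chebyshev's bound holds in each case
    print(a, tail(a), bound)
```

The bound is far from tight here (for a = 2 the exact probability is 1/3 against a bound of 35/48), which is typical: the inequality uses only the variance and holds for any distribution.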

Law of Large Numbers

The basic concepts of probability theory are those of a random event and a random variable. It is impossible to predict in advance the result of a trial in which one or another event, or some specific value of a random variable, may or may not appear, since the outcome of the trial depends on many random causes that cannot be taken into account.

However, when trials are repeated many times, regularities characteristic of mass random phenomena are observed. These regularities have the property of stability. The essence of this property is that the specific features of each individual random phenomenon have almost no effect on the average result of a large number of similar phenomena, and the characteristics of random events and random variables observed in trials become practically non-random as the number of trials increases without bound.

Let a large series of experiments of the same type be carried out. The outcome of each individual experiment is random and cannot be determined in advance. Despite this, the average result of the entire series of experiments loses its random character and becomes regular.

For practice, it is very important to know the conditions under which the combined action of very many random causes leads to a result that is almost independent of chance, since this makes it possible to foresee the course of phenomena. These conditions are stated in the theorems that bear the general name of the law of large numbers.

The law of large numbers should not be understood as any single general law associated with large numbers. The law of large numbers is a collective name for several theorems, from which it follows that, as the number of trials increases without bound, average values tend to certain constants.

These include the Chebyshev and Bernoulli theorems. Chebyshev's theorem is the most general law of large numbers, Bernoulli's theorem is the simplest.
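The tendency of averages towards a constant can be observed in a simulation. The sketch below assumes independent rolls of a fair die, whose mathematical expectation is 3.5, and uses a seeded pseudo-random generator; the setup and names are illustrative.

```python
import random

def sample_mean(n, rng):
    """Arithmetic mean of n independent rolls of a fair die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so that the run is reproducible
for n in (10, 1_000, 100_000):
    # The printed averages should approach 3.5 as n grows.
    print(n, sample_mean(n, rng))
```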

The proofs of the theorems united under the term "law of large numbers" are based on Chebyshev's inequality, which bounds the probability that a random variable X deviates from its mathematical expectation M(X):

P(|X - M(X)| ≥ ε) ≤ D(X) / ε², where D(X) is the variance of X and ε > 0.

Mathematical formulation

