Distribution of a random variable X. Discrete random variables. The three-sigma rule

As is known, a random variable is a variable that can take on certain values depending on chance. Random variables are denoted by capital letters of the Latin alphabet (X, Y, Z), and their values by the corresponding lowercase letters (x, y, z). Random variables are divided into discontinuous (discrete) and continuous.

A discrete random variable is a random variable that takes only a finite or countably infinite set of values with certain non-zero probabilities.

The distribution law of a discrete random variable is a correspondence that links the values of the random variable with their probabilities. The distribution law can be specified in one of the following ways.

1. The distribution law can be given by a table listing the possible values x_i and their probabilities p_i.

2. The distribution law can be given analytically, by a formula; for example, the Poisson law P(X = k) = (λ^k / k!)·e^(−λ), where λ > 0, k = 0, 1, 2, … .

The law can also be described by the distribution function F(x), which determines for each value x the probability that the random variable X takes a value less than x, i.e. F(x) = P(X < x).

Properties of the function F(x)

3. The distribution law can also be specified graphically, by a distribution polygon (see Problem 3).

Note that to solve some problems it is not necessary to know the distribution law itself. In some cases it is enough to know one or several numbers that reflect the most important features of the distribution: a number with the meaning of the "average value" of the random variable, or a number showing the average size of the deviation of the random variable from its average value. Numbers of this kind are called numerical characteristics of a random variable.

Basic numerical characteristics of a discrete random variable:

  • Mathematical expectation (mean value) of a discrete random variable: M(X) = Σ xᵢpᵢ.
    For the binomial distribution M(X) = np; for the Poisson distribution M(X) = λ.
  • Variance (dispersion) of a discrete random variable: D(X) = M[(X − M(X))²], or equivalently D(X) = M(X²) − [M(X)]². The difference X − M(X) is called the deviation of the random variable from its mathematical expectation.
    For the binomial distribution D(X) = npq; for the Poisson distribution D(X) = λ.
  • Standard deviation (root-mean-square deviation): σ(X) = √D(X).
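For concreteness, these characteristics can be computed directly from a distribution table. A minimal sketch (our own illustration, using a fair die as the example):

```python
# Numerical characteristics of a discrete random variable given its
# distribution table: values xs with probabilities ps.
# Example: a fair die, where M(X) = 3.5 and D(X) = 35/12.

def mean(xs, ps):
    # M(X) = sum of x_i * p_i
    return sum(x * p for x, p in zip(xs, ps))

def variance(xs, ps):
    # D(X) = M[(X - M(X))^2]
    m = mean(xs, ps)
    return sum(p * (x - m) ** 2 for x, p in zip(xs, ps))

xs = [1, 2, 3, 4, 5, 6]
ps = [1 / 6] * 6

M = mean(xs, ps)
D = variance(xs, ps)
sigma = D ** 0.5
print(M, D, sigma)  # 3.5, ~2.9167, ~1.7078
```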

Examples of solving problems on the topic "The law of distribution of a discrete random variable"

Task 1.

1,000 lottery tickets have been issued: 5 of them win 500 rubles, 10 win 100 rubles, 20 win 50 rubles, and 50 win 10 rubles. Determine the probability distribution law of the random variable X, the winnings per ticket.

Solution. According to the conditions of the problem, the following values of the random variable X are possible: 0, 10, 50, 100 and 500.

The number of non-winning tickets is 1000 − (5 + 10 + 20 + 50) = 915, so P(X=0) = 915/1000 = 0.915.

Similarly we find the other probabilities: P(X=10) = 50/1000 = 0.05, P(X=50) = 20/1000 = 0.02, P(X=100) = 10/1000 = 0.01, P(X=500) = 5/1000 = 0.005. We present the resulting law in the form of a table:
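These probabilities are easy to cross-check numerically; a sketch of ours (the expectation of 5 rubles per ticket is an extra, not computed in the text):

```python
# Task 1 cross-check: the winnings distribution of a lottery ticket.
wins = [0, 10, 50, 100, 500]   # possible winnings, rubles
counts = [915, 50, 20, 10, 5]  # number of tickets with each winning
total = 1000

probs = [c / total for c in counts]
print(probs)  # [0.915, 0.05, 0.02, 0.01, 0.005]

# The probabilities of a distribution law must sum to 1.
print(sum(probs))

# Average winnings per ticket.
expectation = sum(x * p for x, p in zip(wins, probs))
print(expectation)
```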

Task 2.

Find the mathematical expectation of the number of points X obtained when rolling a fair die.

Solution. M(X) = 1·1/6 + 2·1/6 + 3·1/6 + 4·1/6 + 5·1/6 + 6·1/6 = (1+2+3+4+5+6)/6 = 21/6 = 3.5.

Task 3.

The device consists of three independently operating elements. The probability of failure of each element in one experiment is 0.1. Draw up a distribution law for the number of failed elements in one experiment, build a distribution polygon. Find the distribution function F(x) and plot it. Find the mathematical expectation, variance and standard deviation of a discrete random variable.

Solution. 1. The discrete random variable X (the number of failed elements in one experiment) has the following possible values: x1 = 0 (none of the elements of the device failed), x2 = 1 (one element failed), x3 = 2 (two elements failed) and x4 = 3 (three elements failed).

The failures of the elements are independent of one another and the failure probabilities of the elements are equal, so Bernoulli's formula is applicable. Given that, by the condition, n = 3, p = 0.1, q = 1 − p = 0.9, we determine the probabilities of the values:
P3(0) = C3^0·p^0·q^3 = q^3 = 0.9^3 = 0.729;
P3(1) = C3^1·p^1·q^2 = 3·0.1·0.9^2 = 0.243;
P3(2) = C3^2·p^2·q^1 = 3·0.1^2·0.9 = 0.027;
P3(3) = C3^3·p^3·q^0 = p^3 = 0.1^3 = 0.001.
Check: Σpᵢ = 0.729 + 0.243 + 0.027 + 0.001 = 1.

Thus, the desired binomial distribution law X has the form:

2. On the abscissa axis we plot the possible values xᵢ, and on the ordinate axis the corresponding probabilities pᵢ. We construct the points M1(0; 0.729), M2(1; 0.243), M3(2; 0.027), M4(3; 0.001). Connecting these points with line segments, we obtain the desired distribution polygon.

3. Find the distribution function F(x) = P(X < x).

For x ≤ 0 we have F(x) = P(X < 0) = 0;
for 0 < x ≤ 1: F(x) = P(X < 1) = P(X = 0) = 0.729;
for 1 < x ≤ 2: F(x) = P(X < 2) = P(X = 0) + P(X = 1) = 0.729 + 0.243 = 0.972;
for 2 < x ≤ 3: F(x) = P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2) = 0.972 + 0.027 = 0.999;
for x > 3: F(x) = 1, since the event is certain.

Graph of the function F(x)

4. For the binomial distribution of X:
- mathematical expectation M(X) = np = 3·0.1 = 0.3;
- variance D(X) = npq = 3·0.1·0.9 = 0.27;
- standard deviation σ(X) = √D(X) = √0.27 ≈ 0.52.
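The whole of Task 3 can be reproduced in a few lines; a sketch of ours using Bernoulli's formula directly:

```python
# Task 3 cross-check: binomial law with n = 3, p = 0.1.
from math import comb, sqrt

n, p = 3, 0.1
q = 1 - p

# P_n(k) = C(n, k) * p^k * q^(n-k), rounded as in the text.
pmf = [round(comb(n, k) * p**k * q**(n - k), 3) for k in range(n + 1)]
print(pmf)  # [0.729, 0.243, 0.027, 0.001]

M = n * p        # 0.3
D = n * p * q    # 0.27
sigma = sqrt(D)  # ~0.52
print(M, D, round(sigma, 2))
```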

Normal law of probability distribution

Without exaggeration, it can be called a philosophical law. Observing various objects and processes of the world around us, we often find that of something there is too little, of something too much, and that there is a "norm":


Above is the basic form of the density function of the normal probability distribution, and I welcome you to this most interesting lesson.

What examples can be given? There are simply countless. For example: the height and weight of people (and not only people), their physical strength, mental abilities, etc. There is a "bulk" (in one sense or another), and there are deviations in both directions.

There are also various characteristics of inanimate objects (the same dimensions, weight). There is the random duration of processes, for example, the time of a hundred-meter race or the transformation of resin into amber. From physics, air molecules come to mind: among them there are slow ones and there are fast ones, but most move at "standard" speeds.

Next, we step away from the center by one more standard deviation and calculate the height:

We mark the points on the drawing (in green) and see that this is quite enough.

At the final stage we carefully draw the graph, and especially carefully reflect its convexity/concavity! Well, you probably realized long ago that the abscissa axis is a horizontal asymptote, and it is absolutely forbidden to "cross" it!

In an electronic write-up of the solution, the graph is easy to build in Excel, and, unexpectedly even for myself, I recorded a short video on this topic. But first let's talk about how the shape of the normal curve changes depending on the values of a and σ.

When "a" increases or decreases (with "sigma" unchanged), the graph retains its shape and moves right or left, respectively. For example, when a = 0 the function takes a simpler form and our graph "moves" 3 units to the left, exactly to the origin:


A normally distributed quantity with zero mathematical expectation has received a completely natural name: centered; its density function is even, and its graph is symmetric about the y-axis.

When "sigma" changes (with "a" constant), the graph "stays in place" but changes shape. When sigma increases, the curve becomes lower and more stretched out, like an octopus spreading its tentacles. Conversely, when sigma decreases, the graph becomes narrower and taller, a "surprised octopus". Thus, when "sigma" is halved, the previous curve narrows and stretches upward by a factor of two:

Everything is in full accordance with geometric transformations of graphs.

The normal distribution with unit "sigma" is called normalized, and if it is also centered (our case), such a distribution is called standard. Its density function is even simpler and has already been encountered in the local Laplace theorem: φ(z) = (1/√(2π))·e^(−z²/2). The standard distribution has found wide application in practice, and very soon we will finally understand its purpose.
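As a small illustration (ours, not from the lesson), the standard density can be evaluated directly:

```python
# The standard normal density phi(x) = (1 / sqrt(2*pi)) * exp(-x**2 / 2).
from math import exp, pi, sqrt

def phi(x):
    return exp(-x * x / 2) / sqrt(2 * pi)

print(round(phi(0), 4))   # 0.3989 -- the maximum, at the center
print(phi(1) == phi(-1))  # True -- the function is even
```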

Now let's watch a movie:

Yes, quite right: the probability distribution function has somehow undeservedly remained in the shadows. Let us recall its definition:
F(x) is the probability that a random variable takes a value LESS than x, where x "runs" over all real values up to "plus" infinity.

Inside the integral a different letter is usually used, so that the notations do not "overlap": to each value x there corresponds an improper integral, which equals some number between 0 and 1.

Almost none of these values can be calculated exactly, but, as we have just seen, with modern computing power this is not a problem. So, for the standard distribution function, the corresponding Excel function contains just one argument:

=NORMSDIST(z)

One, two - and you're done:
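If Excel is not at hand, the same value is easy to get in any language with an error function; a sketch of ours using the standard identity Φ(z) = (1 + erf(z/√2))/2:

```python
# A NORMSDIST(z) equivalent: the standard normal distribution function.
from math import erf, sqrt

def normsdist(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

print(round(normsdist(0), 4))     # 0.5
print(round(normsdist(1.96), 4))  # 0.975
```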

The drawing clearly shows all the properties of the distribution function; of the technical nuances, pay attention here to the horizontal asymptotes and the inflection point.

Now let us recall one of the key tasks of the topic: how to find the probability that a normal random variable takes a value from the interval (α, β). Geometrically, this probability equals the area between the normal curve and the x-axis over the corresponding section:

but grinding out an approximate value of the integral each time is unreasonable, and therefore it is more rational to use the "easy" formula:
P(α < X < β) = Φ((β − a)/σ) − Φ((α − a)/σ), where Φ is the Laplace function.

! Also remember that Φ(−z) = −Φ(z).

Here you can use Excel again, but there are a couple of significant "buts": first, it is not always at hand, and second, "ready-made" values will most likely raise questions from the teacher. Why?

I have talked about this before: at one time (and not so long ago) an ordinary calculator was a luxury, and the "manual" way of solving the problem under consideration is still preserved in the educational literature. Its essence is to standardize the values "alpha" and "beta", that is, to reduce the solution to the standard distribution:

Note: the function is easy to obtain from the general case by a linear substitution. Then:

and from this substitution follows the formula z = (x − a)/σ for passing from the values of an arbitrary distribution to the corresponding values of the standard distribution.
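The standardization and the "easy" formula can be sketched together; here Phi0 denotes the (odd) Laplace function from the tables, and the example numbers are ours, not from the text:

```python
# P(alpha < X < beta) = Phi0((beta - a)/sigma) - Phi0((alpha - a)/sigma),
# where Phi0 is the Laplace function, odd: Phi0(-z) = -Phi0(z).
from math import erf, sqrt

def laplace(z):
    # Phi0(z) = (1/sqrt(2*pi)) * integral of exp(-t^2/2) from 0 to z
    return 0.5 * erf(z / sqrt(2))

def interval_prob(alpha, beta, a, sigma):
    return laplace((beta - a) / sigma) - laplace((alpha - a) / sigma)

# Example values (ours): a = 0, sigma = 1, interval (-1, 1).
p = interval_prob(-1, 1, 0, 1)
print(round(p, 4))  # 0.6827
```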

Why is this needed? The fact is that these values were scrupulously calculated by our predecessors and summarized in a special table, found in many probability textbooks. Even more common is the table of values of the Laplace function, which we already dealt with in the Laplace integral theorem:

If we have at our disposal a table of values ​​of the Laplace function , then we solve through it:

Fractional values are traditionally rounded to 4 decimal places, as in the standard table. For checking, see Item 5 of the design layout.

I remind you of this distinction, and to avoid confusion always keep track of WHICH function's table is in front of your eyes.

The answer is required as a percentage, so the calculated probability must be multiplied by 100 and the result supplied with a meaningful comment:

- with a flight from 5 to 70 m, approximately 15.87% of the shells will fall

Now practise on your own:

Example 3

The diameter of bearings manufactured at the factory is a random variable normally distributed with an expectation of 1.5 cm and a standard deviation of 0.04 cm. Find the probability that the size of a randomly taken bearing ranges from 1.4 to 1.6 cm.

In the sample solution and below I use the Laplace function, as the most common option. By the way, note that according to the wording, the endpoints of the interval may be included in the consideration here; however, this is not critical.

And already in this example we have met a special case: the interval is symmetric with respect to the mathematical expectation. In such a situation it can be written in the form |X − a| < δ and, using the oddness of the Laplace function, the working formula can be simplified:


The parameter δ is called the deviation from the mathematical expectation, and the double inequality can be "packed" using the modulus sign:

P(|X − a| < δ) is the probability that the value of the random variable deviates from the mathematical expectation by less than δ.

Well, a solution that fits in one line :)
P(|X − 1.5| < 0.1) is the probability that the diameter of a bearing taken at random differs from 1.5 cm by no more than 0.1 cm.
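A numeric check of the bearing example (our sketch; the data a = 1.5 cm, σ = 0.04 cm, δ = 0.1 cm come from the problem):

```python
# P(|X - a| < delta) = 2 * Phi0(delta / sigma) for the bearing problem.
from math import erf, sqrt

def laplace(z):
    return 0.5 * erf(z / sqrt(2))

a, sigma, delta = 1.5, 0.04, 0.1
p = 2 * laplace(delta / sigma)  # delta / sigma = 2.5
print(round(p, 4))  # 0.9876
```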

The result of this problem turned out to be close to unity, but I would like even more reliability, namely, to find the boundaries within which almost all the diameters lie. Is there a criterion for this? There is! The question is answered by the so-called

three sigma rule

Its essence is that it is practically certain that a normally distributed random variable will take a value from the interval (a − 3σ; a + 3σ).

Indeed, the probability of deviating from the expectation by less than 3σ is:
P(|X − a| < 3σ) = 2Φ(3) = 2·0.49865 = 0.9973, or 99.73%.

In terms of "bearings" - these are 9973 pieces with a diameter of 1.38 to 1.62 cm and only 27 "substandard" copies.

In practical research the three-sigma rule is usually applied in the opposite direction: if it is found statistically that almost all values of the random variable under study fit into an interval of 6 standard deviations, then there are good reasons to believe that this variable is normally distributed. Verification is carried out using the theory of statistical hypothesis testing.

Let us continue solving the harsh Soviet problems:

Example 4

The random weighing error is distributed according to the normal law with zero mathematical expectation and a standard deviation of 3 grams. Find the probability that the next weighing will be carried out with an error not exceeding 5 grams in absolute value.

The solution is very simple. By the condition, a = 0 and σ = 3, and we note immediately that at the next weighing we will almost certainly get a result accurate to within 9 grams (the three-sigma rule). But the problem asks about a narrower deviation, and by the formula:

- the probability that the next weighing will be carried out with an error not exceeding 5 grams.

Answer:
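The answer can be cross-checked numerically (our sketch; the result is about 0.904):

```python
# Example 4: a = 0, sigma = 3 g; P(|X| < 5) = 2 * Phi0(5/3).
from math import erf, sqrt

def laplace(z):
    return 0.5 * erf(z / sqrt(2))

sigma, delta = 3, 5
p = 2 * laplace(delta / sigma)
print(round(p, 4))  # about 0.904
```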

The solved problem differs fundamentally from the seemingly similar Example 3 of the lesson on the uniform distribution. There the error came from rounding the measurement results; here we are talking about the random error of the measurements themselves. Such errors arise from the technical characteristics of the device itself (the range of permissible errors is usually indicated in its data sheet), and also through the fault of the experimenter, when, for example, we take readings "by eye" from the pointer of the same scales.

Among others, there are also so-called systematic measurement errors. These are non-random errors that arise from incorrect setup or operation of the device. For example, unadjusted floor scales may consistently "add" a kilogram, and the seller then systematically short-weights the buyers. Or not systematically, since one can also shortchange. In any case, such an error is not random, and its expectation differs from zero.

…I am urgently developing a sales training course =)

Let's solve the problem on our own:

Example 5

The diameter of a roller is a normally distributed random variable; its standard deviation is σ mm. Find the length of the interval, symmetric with respect to the mathematical expectation, into which the diameter of the roller will fall with the given probability.

Item 5* of the design layout will help. Note that the mathematical expectation is not known here, but this does not in the least interfere with solving the problem.

And an exam-style problem, which I highly recommend for consolidating the material:

Example 6

A normally distributed random variable X is given by its parameters a (mathematical expectation) and σ (standard deviation). Required:

a) write down the probability density and schematically depict its graph;
b) find the probability that X takes a value from the interval (α, β);
c) find the probability that X deviates from a in absolute value by no more than δ;
d) applying the three-sigma rule, find the practically possible values of the random variable X.

Such problems are offered everywhere, and over the years of practice I have solved hundreds and hundreds of them. Be sure to practice drawing by hand and using paper tables ;)

Well, I will analyze an example of increased complexity:

Example 7

The probability density of a random variable has the given form. Find σ, the mathematical expectation, the variance, the distribution function F(x), plot the density and distribution functions, and find the required probability.

Solution: first of all, note that the condition says nothing about the nature of the random variable. By itself the presence of an exponential factor does not mean anything: it could be, for example, an exponential or indeed an arbitrary continuous distribution. Therefore the "normality" of the distribution still needs to be justified:

Since the function is defined for every real value x, and it can be reduced to the standard form of the normal density, the random variable is distributed according to the normal law.

Let us reduce it to this form. To do so, we complete the square and organize a three-story fraction:


Be sure to check, returning the exponent to its original form:

which is what we wanted to see.

Thus:
by the rule for powers we "pinch off" a factor. And here we can immediately write down the obvious numerical characteristics:

Now let us find the value of the parameter. Since the multiplier of the normal density has the form 1/(σ√(2π)), then:
from this we express the parameter and substitute it into our function,
after which we once again run our eyes over the record and make sure that the resulting function has the required form.

Let's plot the density:

and the plot of the distribution function :

If there is no Excel or even an ordinary calculator at hand, the last graph is easily built by hand! At the point x = a the distribution function takes the value 1/2, and here it is:

We can single out the most common laws of distribution of discrete random variables:

  • Binomial distribution law
  • Poisson distribution law
  • Geometric distribution law
  • Hypergeometric distribution law

For the given distributions of discrete random variables, the probabilities of their values and the numerical characteristics (mathematical expectation, variance, etc.) are calculated by specific formulas. It is therefore very important to know these types of distributions and their basic properties.


1. Binomial distribution law.

A discrete random variable $X$ is subject to the binomial probability distribution if it takes the values $0,\ 1,\ 2,\ \dots,\ n$ with probabilities $P\left(X=k\right)=C^k_n\cdot p^k\cdot \left(1-p\right)^{n-k}$. In fact, the random variable $X$ is the number of occurrences of the event $A$ in $n$ independent trials. The probability distribution law of the random variable $X$:

$\begin{array}{|c|c|c|c|c|}
\hline
X_i & 0 & 1 & \dots & n \\
\hline
p_i & P_n\left(0\right) & P_n\left(1\right) & \dots & P_n\left(n\right) \\
\hline
\end{array}$

For such a random variable, the expectation is $M\left(X\right)=np$, the variance is $D\left(X\right)=np\left(1-p\right)$.

Example. There are two children in a family. Assuming the probabilities of the birth of a boy and of a girl both equal to $0.5$, find the distribution law of the random variable $\xi$, the number of boys in the family.

Let the random variable $\xi$ be the number of boys in the family. The values that $\xi$ can take: $0,\ 1,\ 2$. The probabilities of these values can be found by the formula $P\left(\xi =k\right)=C^k_n\cdot p^k\cdot \left(1-p\right)^{n-k}$, where $n=2$ is the number of independent trials and $p=0.5$ is the probability of occurrence of the event in a series of $n$ trials. We get:

$P\left(\xi =0\right)=C^0_2\cdot (0.5)^0\cdot \left(1-0.5\right)^{2-0}=(0.5)^2=0.25;$

$P\left(\xi =1\right)=C^1_2\cdot 0.5\cdot \left(1-0.5\right)^{2-1}=2\cdot 0.5\cdot 0.5=0.5;$

$P\left(\xi =2\right)=C^2_2\cdot (0.5)^2\cdot \left(1-0.5\right)^{2-2}=(0.5)^2=0.25.$

Then the distribution law of the random variable $\xi $ is the correspondence between the values ​​$0,\ 1,\ 2$ and their probabilities, i.e.:

$\begin{array}{|c|c|c|c|}
\hline
\xi & 0 & 1 & 2 \\
\hline
P(\xi) & 0.25 & 0.5 & 0.25 \\
\hline
\end{array}$

The sum of the probabilities in the distribution law must equal $1$, i.e. $\sum_{i=1}^{n}P(\xi_i)=0.25+0.5+0.25=1$.

Expectation $M\left(\xi\right)=np=2\cdot 0.5=1$, variance $D\left(\xi\right)=np\left(1-p\right)=2\cdot 0.5\cdot 0.5=0.5$, standard deviation $\sigma\left(\xi\right)=\sqrt{D\left(\xi\right)}=\sqrt{0.5}\approx 0.707$.
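A quick numeric check of this example (our sketch):

```python
# Number of boys in a two-child family: binomial with n = 2, p = 0.5.
from math import comb, sqrt

n, p = 2, 0.5
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
print(pmf)  # [0.25, 0.5, 0.25]

M = n * p            # 1.0
D = n * p * (1 - p)  # 0.5
print(M, D, round(sqrt(D), 3))  # 1.0 0.5 0.707
```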

2. Poisson distribution law.

If a discrete random variable $X$ can take only non-negative integer values $0,\ 1,\ 2,\ \dots$ with probabilities $P\left(X=k\right)=\frac{\lambda^k}{k!}\cdot e^{-\lambda}$, then it is said to obey the Poisson distribution law with parameter $\lambda$. For such a random variable the mathematical expectation and the variance are equal to each other and equal to the parameter $\lambda$, i.e. $M\left(X\right)=D\left(X\right)=\lambda$.

Remark. The peculiarity of this distribution is that from experimental data we find estimates of $M\left(X\right)$ and $D\left(X\right)$; if the obtained estimates are close to each other, then we have grounds to assert that the random variable obeys the Poisson distribution law.

Example . Examples of random variables subject to the Poisson distribution law can be: the number of cars that will be serviced tomorrow by a gas station; the number of defective items in the manufactured product.

Example. The plant sent $500$ items to the base. The probability of an item being damaged in transit is $0.002$. Find the distribution law of the random variable $X$ equal to the number of damaged items; find $M\left(X\right)$ and $D\left(X\right)$.

Let the discrete random variable $X$ be the number of damaged items. It obeys the Poisson distribution law with parameter $\lambda=np=500\cdot 0.002=1$. The probabilities of the values are $P\left(X=k\right)=\frac{\lambda^k}{k!}\cdot e^{-\lambda}$. Obviously, it is impossible to list the probabilities of all values $X=0,\ 1,\ \dots,\ 500$, so we confine ourselves to the first few.

$P\left(X=0\right)=\frac{1^0}{0!}\cdot e^{-1}=0.368;$

$P\left(X=1\right)=\frac{1^1}{1!}\cdot e^{-1}=0.368;$

$P\left(X=2\right)=\frac{1^2}{2!}\cdot e^{-1}=0.184;$

$P\left(X=3\right)=\frac{1^3}{3!}\cdot e^{-1}=0.061;$

$P\left(X=4\right)=\frac{1^4}{4!}\cdot e^{-1}=0.015;$

$P\left(X=5\right)=\frac{1^5}{5!}\cdot e^{-1}=0.003;$

$P\left(X=6\right)=\frac{1^6}{6!}\cdot e^{-1}=0.001;$

$P\left(X=k\right)=\frac{\lambda^k}{k!}\cdot e^{-\lambda}.$

The distribution law of the random variable $X$:

$\begin{array}{|c|c|c|c|c|c|c|c|c|c|}
\hline
X_i & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \dots & k \\
\hline
P_i & 0.368 & 0.368 & 0.184 & 0.061 & 0.015 & 0.003 & 0.001 & \dots & \frac{\lambda^k}{k!}\cdot e^{-\lambda} \\
\hline
\end{array}$

For such a random variable, the mathematical expectation and variance are equal to each other and equal to the parameter $\lambda $, i.e. $M\left(X\right)=D\left(X\right)=\lambda =1$.
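The table above is easy to reproduce (our sketch):

```python
# Poisson probabilities for lambda = n*p = 500 * 0.002 = 1.
from math import exp, factorial

lam = 500 * 0.002

def poisson(k, lam):
    return lam**k / factorial(k) * exp(-lam)

probs = [round(poisson(k, lam), 3) for k in range(7)]
print(probs)  # [0.368, 0.368, 0.184, 0.061, 0.015, 0.003, 0.001]
```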

3. Geometric law of distribution.

If a discrete random variable $X$ can take only natural values $1,\ 2,\ \dots$ with probabilities $P\left(X=k\right)=p\left(1-p\right)^{k-1},\ k=1,\ 2,\ 3,\ \dots$, then we say that the random variable $X$ is subject to the geometric law of probability distribution. In fact, the geometric distribution describes Bernoulli trials up to the first success.

Example. Examples of random variables with a geometric distribution: the number of shots before the first hit on the target; the number of tests of a device before the first failure; the number of coin tosses until the first heads, and so on.

The mathematical expectation and variance of a random variable subject to the geometric distribution are, respectively, $M\left(X\right)=1/p$ and $D\left(X\right)=\left(1-p\right)/p^2$.

Example. On the way of the fish to the spawning ground there are $4$ locks. The probability of the fish passing through each lock is $p=3/5$. Construct the distribution series of the random variable $X$, the number of locks passed by the fish before the first stop at a lock. Find $M\left(X\right),\ D\left(X\right),\ \sigma\left(X\right)$.

Let the random variable $X$ be the number of locks passed by the fish before the first stop at a lock. Such a random variable obeys the geometric law of probability distribution. The values that the random variable $X$ can take: $1, 2, 3, 4$. The probabilities of these values are calculated by the formula $P\left(X=k\right)=pq^{k-1}$, where $p=2/5$ is the probability of the fish being stopped at a lock, $q=1-p=3/5$ is the probability of the fish passing through a lock, and $k=1,\ 2,\ 3,\ 4$.

$P\left(X=1\right)=\frac{2}{5}\cdot \left(\frac{3}{5}\right)^0=\frac{2}{5}=0.4;$

$P\left(X=2\right)=\frac{2}{5}\cdot \frac{3}{5}=\frac{6}{25}=0.24;$

$P\left(X=3\right)=\frac{2}{5}\cdot \left(\frac{3}{5}\right)^2=\frac{2}{5}\cdot \frac{9}{25}=\frac{18}{125}=0.144;$

$P\left(X=4\right)=\frac{2}{5}\cdot \left(\frac{3}{5}\right)^3+\left(\frac{3}{5}\right)^4=\frac{27}{125}=0.216.$

$\begin{array}{|c|c|c|c|c|}
\hline
X_i & 1 & 2 & 3 & 4 \\
\hline
P\left(X_i\right) & 0.4 & 0.24 & 0.144 & 0.216 \\
\hline
\end{array}$

Expected value:

$M\left(X\right)=\sum^n_{i=1}x_ip_i=1\cdot 0.4+2\cdot 0.24+3\cdot 0.144+4\cdot 0.216=2.176.$

Dispersion:

$D\left(X\right)=\sum^n_{i=1}p_i\left(x_i-M\left(X\right)\right)^2=0.4\cdot \left(1-2.176\right)^2+0.24\cdot \left(2-2.176\right)^2+0.144\cdot \left(3-2.176\right)^2+$

$+\ 0.216\cdot \left(4-2.176\right)^2\approx 1.377.$

Standard deviation:

$\sigma\left(X\right)=\sqrt{D\left(X\right)}=\sqrt{1.377}\approx 1.173.$
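A numeric check of the fish example (our sketch; note that the last value X = 4 also absorbs the case when the fish passes all four locks):

```python
# Truncated geometric distribution: stop probability p = 2/5 per lock,
# pass probability q = 3/5, four locks in total.
p, q = 2 / 5, 3 / 5

probs = [p, q * p, q**2 * p, q**3 * p + q**4]
print([round(x, 3) for x in probs])  # [0.4, 0.24, 0.144, 0.216]

xs = [1, 2, 3, 4]
M = sum(x * pr for x, pr in zip(xs, probs))
D = sum(pr * (x - M) ** 2 for x, pr in zip(xs, probs))
print(round(M, 3), round(D, 3))  # 2.176 1.377
```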

4. Hypergeometric distribution law.

Suppose there are $N$ objects, among which $m$ objects have a given property, and $n$ objects are drawn at random without replacement, among which $k$ objects have the property. The hypergeometric distribution makes it possible to estimate the probability that exactly $k$ objects in the sample have the given property. Let the random variable $X$ be the number of objects in the sample with the given property. Then the probabilities of the values of the random variable $X$ are:

$P\left(X=k\right)=\frac{C^k_m\,C^{n-k}_{N-m}}{C^n_N}$

Remark. The HYPGEOMDIST statistical function of the Excel Function Wizard $f_x$ allows you to compute the probability that a certain number of draws are successful.

$f_x\to$ Statistical $\to$ HYPGEOMDIST $\to$ OK. A dialog box appears that must be filled in: for the number of successes in the sample, specify $k$; the sample size is $n$; for the number of successes in the population, specify $m$; the population size is $N$.

The mathematical expectation and variance of a discrete random variable $X$ subject to the hypergeometric distribution law are $M\left(X\right)=nm/N$ and $D\left(X\right)=\frac{nm\left(1-\frac{m}{N}\right)\left(1-\frac{n}{N}\right)}{N-1}$.

Example. The credit department of a bank employs 5 specialists with a higher financial education and 3 specialists with a higher legal education. The bank's management decided to send 3 specialists for advanced training, selecting them at random.

a) Make a distribution series of the number of specialists with higher financial education who can be directed to advanced training;

b) Find the numerical characteristics of this distribution.

Let the random variable $X$ be the number of specialists with a higher financial education among the three selected. The values that $X$ can take: $0,\ 1,\ 2,\ 3$. This random variable $X$ follows the hypergeometric distribution with parameters: $N=8$ — population size, $m=5$ — number of successes in the population, $n=3$ — sample size, $k=0,\ 1,\ 2,\ 3$ — number of successes in the sample. The probabilities $P\left(X=k\right)$ can be calculated by the formula $P(X=k)=\frac{C_{m}^{k}\cdot C_{N-m}^{n-k}}{C_{N}^{n}}$. We have:

$P\left(X=0\right)=\frac{C^0_5\cdot C^3_3}{C^3_8}=\frac{1}{56}\approx 0.018;$

$P\left(X=1\right)=\frac{C^1_5\cdot C^2_3}{C^3_8}=\frac{15}{56}\approx 0.268;$

$P\left(X=2\right)=\frac{C^2_5\cdot C^1_3}{C^3_8}=\frac{15}{28}\approx 0.536;$

$P\left(X=3\right)=\frac{C^3_5\cdot C^0_3}{C^3_8}=\frac{5}{28}\approx 0.179.$

Then the distribution series of the random variable $X$:

$\begin{array}{|c|c|c|c|c|}
\hline
X_i & 0 & 1 & 2 & 3 \\
\hline
p_i & 0.018 & 0.268 & 0.536 & 0.179 \\
\hline
\end{array}$

Let us calculate the numerical characteristics of the random variable $X$ using the general formulas of the hypergeometric distribution.

$M\left(X\right)=\frac{nm}{N}=\frac{3\cdot 5}{8}=\frac{15}{8}=1.875.$

$D\left(X\right)=\frac{nm\left(1-\frac{m}{N}\right)\left(1-\frac{n}{N}\right)}{N-1}=\frac{3\cdot 5\cdot \left(1-\frac{5}{8}\right)\cdot \left(1-\frac{3}{8}\right)}{8-1}=\frac{225}{448}\approx 0.502.$

$\sigma\left(X\right)=\sqrt{D\left(X\right)}=\sqrt{0.502}\approx 0.7085.$
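A numeric check of the bank example (our sketch):

```python
# Hypergeometric distribution: N = 8, m = 5, n = 3.
from math import comb

N, m, n = 8, 5, 3

def hyper(k):
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

probs = [hyper(k) for k in range(n + 1)]
print([round(x, 3) for x in probs])  # [0.018, 0.268, 0.536, 0.179]

M = n * m / N
D = n * m * (1 - m / N) * (1 - n / N) / (N - 1)
print(M, round(D, 3))  # 1.875 0.502
```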

The distribution function of a random variable X is the function F(x) expressing, for each x, the probability that the random variable X takes a value smaller than x: F(x) = P(X < x).

Example 2.5. Given a series of distribution of a random variable

Find and graphically depict its distribution function.

Solution. According to the definition,

F(x) = 0 to the left of the smallest value of X; then F(x) = 0.4;
F(x) = 0.4 + 0.1 = 0.5 for 4 < x ≤ 5;
F(x) = 0.5 + 0.5 = 1 for x > 5.

So (see Fig. 2.1):


Distribution function properties:

1. The distribution function of a random variable is a non-negative function enclosed between zero and one: 0 ≤ F(x) ≤ 1.

2. The distribution function of a random variable is a non-decreasing function on the entire number axis, i.e. if x₂ > x₁ then F(x₂) ≥ F(x₁).

3. At minus infinity the distribution function equals zero, and at plus infinity it equals one, i.e. F(−∞) = 0, F(+∞) = 1.

4. The probability that a random variable X falls in the interval (a, b) equals the definite integral of its probability density from a to b (see Fig. 2.2), i.e. P(a < X < b) = ∫ from a to b of p(x) dx.


Fig. 2.2

3. The distribution function of a continuous random variable (see Fig. 2.3) can be expressed in terms of the probability density by the formula:

F(x) = ∫ from −∞ to x of p(t) dt. (2.10)

4. The improper integral over infinite limits of the probability density of a continuous random variable equals one: ∫ from −∞ to +∞ of p(x) dx = 1.

Geometrically, properties 1 and 4 of the probability density mean that its plot, the distribution curve, lies not below the x-axis, and the total area of the figure bounded by the distribution curve and the x-axis equals one.

For a continuous random variable X, the expected value M(X) and variance D(X) are determined by the formulas:

M(X) = ∫ from −∞ to +∞ of x·p(x) dx (if the integral converges absolutely);

D(X) = ∫ from −∞ to +∞ of (x − M(X))²·p(x) dx, or D(X) = ∫ from −∞ to +∞ of x²·p(x) dx − [M(X)]² (if these integrals converge).

Along with the numerical characteristics noted above, the concept of quantiles and percentage points is used to describe a random variable.

The quantile of level q (or q-quantile) is the value x_q of the random variable at which its distribution function takes the value q, i.e. F(x_q) = q.

  • The 100·q% point is the quantile x_(1−q).
  • Example 2.8.

Using the data of Example 2.6, find the quantile x₀.₃ and the 30% point of the random variable X.

Solution. By definition (2.16), F(x₀.₃) = 0.3, i.e.

x₀.₃/2 = 0.3, whence the quantile x₀.₃ = 0.6. The 30% point of the random variable X is the quantile x₀.₇, found similarly from the equation x₀.₇/2 = 0.7, whence x₀.₇ = 1.4.

Among the numerical characteristics of a random variable, the initial ν_k and central μ_k moments of k-th order are also used; they are determined for discrete and continuous random variables by the formulas:


Consider discrete distributions that are often used in modeling service systems.

Bernoulli distribution. The Bernoulli scheme is a sequence of independent trials, in each of which only two outcomes are possible, "success" and "failure", with probabilities p and q = 1 − p. Let the random variable X take two values with the corresponding probabilities: P(X = 1) = p, P(X = 0) = q.

The Bernoulli distribution function has the form F(x) = 0 for x ≤ 0, F(x) = q for 0 < x ≤ 1, and F(x) = 1 for x > 1.

Its graph is shown in Fig. 11.1.

A random variable with such a distribution is equal to the number of successes in one trial of the Bernoulli scheme.

The generating function, according to (11.1) and (11.15), is calculated as φ(z) = q + pz.

Fig. 11.1.

Using formula (11.6), we find the mathematical expectation of the distribution: M(X) = φ′(1) = p.

We calculate the second derivative of the generating function according to (11.17): φ″(z) = 0. By (11.7) we obtain the distribution variance: D(X) = φ″(1) + φ′(1) − [φ′(1)]² = p − p² = pq.

The Bernoulli distribution plays a large role in queueing theory, being a model of any random experiment whose outcomes belong to two mutually exclusive classes.
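The derivation above (moments from the derivatives of the generating function φ(z) = q + pz) can be sketched numerically; p = 0.3 is an illustrative value, and the derivatives are taken by finite differences rather than analytically:

```python
# Bernoulli moments via the generating function φ(z) = q + pz:
# M(X) = φ'(1),  D(X) = φ''(1) + φ'(1) − [φ'(1)]².
# p = 0.3 is an illustrative value; derivatives are taken numerically.

p = 0.3
q = 1.0 - p

def phi(z):
    return q + p * z

h = 1e-5
d1 = (phi(1 + h) - phi(1 - h)) / (2 * h)              # φ'(1)
d2 = (phi(1 + h) - 2 * phi(1) + phi(1 - h)) / h ** 2  # φ''(1), zero here

mean = d1
var = d2 + d1 - d1 ** 2

print(mean, var)  # close to p = 0.3 and pq = 0.21
```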

Geometric distribution. Assume that events occur at discrete times independently of each other. The probability that an event occurs is p, and the probability that it does not is q = 1 − p; the event may be, for example, a customer arriving to place an order.

Denote by p_k the probability that the event occurs for the first time at time k, i.e. the k-th client placed an order while the previous k − 1 clients did not. Then the probability of this compound event is determined by the multiplication theorem for independent events: p_k = q^(k−1) p.

The probabilities of events with a geometric distribution are shown in Fig. 11.2.

The sum of the probabilities of all possible events, Σ_{k=1}^{∞} q^(k−1) p = p/(1 − q) = 1, is a geometric progression, hence the distribution is called geometric.

A random variable X with a geometric distribution has the meaning of the number of the first successful trial in the Bernoulli scheme.

Fig. 11.2.

Let us determine the probability that the event occurs later than time k: P(X > k) = q^k, and the geometric distribution function: F(k) = P(X ≤ k) = 1 − q^k.

Let us calculate the generating function of the geometric distribution according to (11.1) and (11.20): φ(z) = pz/(1 − qz), the mathematical expectation according to (11.6): M(X) = 1/p, and the variance according to (11.7): D(X) = q/p².

The geometric distribution is considered the discrete counterpart of the continuous exponential distribution and likewise has a number of properties useful for modeling service systems. In particular, like the exponential distribution, the geometric distribution has no memory:

P(X > i + j | X > i) = P(X > j),

i.e. if i trials have failed, the probability that more than j further trials are needed for the first success is the same as the probability that a fresh series of trials requires more than j trials for the first success. In other words, previous trials have no effect on future ones, and the trials are independent. This is often true in practice, since orders arrive at random.
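The memoryless property can be verified directly from the tail formula P(X > k) = q^k; p, i, and j below are arbitrary illustrative values:

```python
# Memorylessness of the geometric distribution:
# P(X > i + j | X > i) = P(X > j), since P(X > k) = q^k.
# p = 0.25, i = 4, j = 7 are illustrative values.

p = 0.25
q = 1.0 - p

def tail(k):
    # P(X > k): the first k trials all fail
    return q ** k

i, j = 4, 7
lhs = tail(i + j) / tail(i)  # conditional probability P(X > i + j | X > i)
rhs = tail(j)

print(lhs, rhs)  # the two sides coincide
```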

Let us consider an example of a system whose functioning parameters are subject to a geometric distribution.

A repairman has n identical spare parts at his disposal. Each part has a defect with probability q. During a repair, a part is installed in the device, which is then checked for operability. If the device does not work, the part is replaced with another. Consider the random variable X, the number of parts checked.

The probabilities of the number of checked parts have the values shown in the table:

x    1    2     3      …    n
P    p    qp    q²p    …    q^(n−1)

Here p = 1 − q.

The mathematical expectation of the number of checked parts is defined as M(X) = Σ_{k=1}^{n−1} k q^(k−1) p + n q^(n−1) = (1 − q^n)/p.

Binomial distribution. Consider the random variable

X = X_1 + X_2 + … + X_n,

where each X_i obeys the Bernoulli distribution with parameter p and the X_i are independent.

The random variable X equals the number of ones occurring in n trials, i.e. a random variable with a binomial distribution has the meaning of the number of successes in n independent trials.

According to (11.9), the generating function of the sum of mutually independent random variables, each of which has a Bernoulli distribution, equals the product of their generating functions (11.17): φ(z) = (q + pz)^n. (11.26)

Expanding the generating function (11.26) into a series, we obtain

(q + pz)^n = Σ_{k=0}^{n} C_n^k p^k q^(n−k) z^k.

In accordance with the definition of the generating function (11.1), the probability that the random variable X takes the value k is

P(X = k) = C_n^k p^k q^(n−k),

where C_n^k = n!/(k!(n − k)!) are the binomial coefficients.

Since k ones can be arranged over n places in C_n^k ways, the number of samples containing k ones is obviously the same.

The distribution function for the binomial law is calculated by the formula F(x) = Σ_{k < x} C_n^k p^k q^(n−k).

The distribution is called binomial because the probabilities are in form the terms of the expansion of the binomial:

(p + q)^n = Σ_{k=0}^{n} C_n^k p^k q^(n−k).

It is clear that the total probability of all possible outcomes equals 1: Σ_{k=0}^{n} C_n^k p^k q^(n−k) = (p + q)^n = 1.

From (11.29) one can obtain a number of useful properties of the binomial coefficients. For example, setting p = 1, q = 1 we get Σ_{k=0}^{n} C_n^k = 2^n.

If we put p = 1, q = −1, then Σ_{k=0}^{n} (−1)^(n−k) C_n^k = 0.

For any n ≥ k, the relations C_n^k = C_n^(n−k) and C_n^k + C_n^(k−1) = C_(n+1)^k are valid.

The probabilities that in n trials the event occurs: 1) fewer than k times; 2) more than k times; 3) at least k times; 4) at most k times are found, respectively, by the formulas: P(X < k) = Σ_{i=0}^{k−1} P_n(i); P(X > k) = Σ_{i=k+1}^{n} P_n(i); P(X ≥ k) = Σ_{i=k}^{n} P_n(i); P(X ≤ k) = Σ_{i=0}^{k} P_n(i).

Using (11.6), we define the expectation of the binomial distribution M(X) = np, and according to (11.7) the variance D(X) = npq.
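The binomial formulas above (probabilities summing to one, mean np, variance npq) can be checked directly; n = 12 and p = 0.35 are illustrative values:

```python
# Check of the binomial distribution: Σ P(X = k) = 1, M(X) = np, D(X) = npq.
# n = 12 and p = 0.35 are illustrative values.
import math

n, p = 12, 0.35
q = 1.0 - p

pmf = [math.comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)]

total = sum(pmf)
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(total, mean, var)  # 1, np = 4.2, npq = 2.73
```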

Let us consider several examples of systems whose operation parameters are described by the binomial distribution.

1. A batch of 10 products contains one non-standard product. Let us find the probability that in a random sample of 5 products all are standard (event A).

The number of all random samples is n = C_10^5, and the number of samples favoring the event is m = C_9^5. Thus the desired probability is P(A) = C_9^5 / C_10^5 = 0.5.

2. In a new apartment, 2k new electric lamps are switched on. Each lamp burns out during the year with probability p. Let us find the probability that during the year at least half of the initially switched-on lamps will have to be replaced with new ones (event A): P(A) = Σ_{i=k}^{2k} C_2k^i p^i q^(2k−i).

3. A person belonging to a certain group of consumers prefers product 1 with probability 0.2, product 2 with probability 0.3, product 3 with probability 0.4, and product 4 with probability 0.1. A group consists of 6 consumers. Find the probabilities of the following events: A — the group includes at least 4 consumers who prefer product 3; B — there is at least one consumer in the group who prefers product 4.

These probabilities are: P(A) = Σ_{k=4}^{6} C_6^k 0.4^k 0.6^(6−k) ≈ 0.179; P(B) = 1 − 0.9^6 ≈ 0.469.
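The probabilities of example 3 can be computed directly from the binomial formula (a sketch; the helper name `binom_pmf` is ours):

```python
# Example 3: P(A) — at least 4 of 6 consumers prefer product 3 (p = 0.4);
# P(B) — at least one of 6 prefers product 4 (p = 0.1).
import math

def binom_pmf(n, k, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

p_A = sum(binom_pmf(6, k, 0.4) for k in (4, 5, 6))
p_B = 1 - binom_pmf(6, 0, 0.1)   # 1 − 0.9^6

print(round(p_A, 4), round(p_B, 4))  # 0.1792 0.4686
```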

For large n the probability calculations become cumbersome, so limit theorems are used.

The local Laplace theorem states that the probability P_n(k) is determined by the formula

P_n(k) ≈ φ(x)/√(npq),

where φ(x) = (1/√(2π)) e^(−x²/2) is the Gaussian function and x = (k − np)/√(npq).

The integral Laplace theorem is used to calculate the probability that in n independent trials the event occurs at least k_1 times and at most k_2 times:

P_n(k_1, k_2) ≈ Φ(x_2) − Φ(x_1),

where Φ(x) = (1/√(2π)) ∫_0^x e^(−t²/2) dt is the Laplace function, x_1 = (k_1 − np)/√(npq), x_2 = (k_2 − np)/√(npq).

Let's consider examples of using these theorems.

1. The sewing workshop produces tailor-made clothes, among which 90% are of the highest quality. Find the probability that among 200 products there will be at least 160 and at most 170 of the highest quality.

Solution: here n = 200, p = 0.9, np = 180, √(npq) = √18 ≈ 4.24, so x_1 = (160 − 180)/4.24 ≈ −4.71, x_2 = (170 − 180)/4.24 ≈ −2.36, and P_200(160, 170) ≈ Φ(−2.36) − Φ(−4.71) ≈ −0.4909 + 0.5000 ≈ 0.009.
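The Laplace approximation for this example can be compared against the exact binomial sum (a sketch; the normal CDF is built from `math.erf` instead of the tabulated Laplace function, shifting it by 0.5):

```python
# Sewing-workshop example: integral Laplace approximation of
# P(160 ≤ X ≤ 170) for n = 200, p = 0.9, compared with the exact sum.
import math

n, p = 200, 0.9
q = 1 - p
mu = n * p                    # np = 180
sigma = math.sqrt(n * p * q)  # √(npq) = √18

def norm_cdf(x):
    # standard normal CDF; equals 0.5 + Φ(x) for the Laplace function Φ
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

approx = norm_cdf((170 - mu) / sigma) - norm_cdf((160 - mu) / sigma)
exact = sum(math.comb(n, k) * p ** k * q ** (n - k) for k in range(160, 171))

print(approx, exact)
```

The two values differ noticeably here because the interval lies far out in the tail, where the normal approximation is at its weakest.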

2. An insurance company has 12,000 clients. Each of them, insuring against an accident, contributes 10 thousand rubles. The probability of an accident is p = 0.006, and the payment to the victim is 1 million rubles. Let us find the profit of the insurance company guaranteed with probability 0.995; in other words, what profit can the insurance company count on at a risk level of 0.005?

Solution: The total contribution of all clients is 12,000 · 10 = 120,000 thousand rubles. The company's profit depends on the number k of accidents and is determined by the equality R = 120,000 − 1,000k thousand rubles.

Therefore, it is necessary to find a number M such that the probability of the event P(k > M) does not exceed 0.005. Then, with probability 0.995, the profit R = 120,000 − 1,000M thousand rubles is assured.

The inequality P(k > M) ≤ 0.005 is equivalent to P(0 ≤ k ≤ M) ≥ 0.995. To estimate this probability, we use the integral Laplace theorem with n = 12,000, p = 0.006, q = 0.994, so that np = 72 and √(npq) ≈ 8.5. Since x_1 = (0 − 72)/8.5 ≈ −8.5, we have Φ(x_1) ≈ −0.5.

Thus, it is necessary to find M for which Φ((M − 72)/8.5) + 0.5 ≥ 0.995, i.e. Φ((M − 72)/8.5) ≥ 0.495. We find (M − 72)/8.5 ≥ 2.58, whence M ≥ 72 + 22 = 94.

So, with probability 0.995 the company is guaranteed a profit of at least R = 120,000 − 1,000 · 94 = 26,000 thousand rubles, i.e. 26 million rubles.
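The search for M can be sketched as a loop over the normal approximation (a sketch assuming, as in the text, that the left tail Φ(x_1) ≈ −0.5 is negligible, which the full normal CDF absorbs automatically):

```python
# Insurance example: smallest M with P(0 ≤ k ≤ M) ≥ 0.995
# for n = 12,000, p = 0.006, via the normal (Laplace) approximation.
import math

n, p = 12_000, 0.006
mu = n * p                          # 72 expected accidents
sigma = math.sqrt(n * p * (1 - p))  # ≈ 8.46

def norm_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

M = 0
while norm_cdf((M - mu) / sigma) < 0.995:
    M += 1

profit = 120_000 - 1_000 * M        # thousand rubles
print(M, profit)  # M = 94, profit = 26,000 thousand rubles
```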

It is often necessary to determine the most probable number k_0: the probability of the outcome with k_0 successes exceeds, or at least is not less than, the probabilities of the other possible outcomes. The most probable number k_0 is determined from the double inequality np − q ≤ k_0 ≤ np + p.

3. Let there be 25 samples of consumer goods. The probability that each sample is acceptable to the client is 0.7. It is necessary to determine the most probable number of samples that will be acceptable to customers. By (11.39), 25 · 0.7 − 0.3 ≤ k_0 ≤ 25 · 0.7 + 0.7, i.e. 17.2 ≤ k_0 ≤ 18.2, whence k_0 = 18.

Poisson distribution. The Poisson distribution gives the probability that, for a very large number of trials n, in each of which the probability of the event p is very small, the event occurs exactly k times.

Let the product np = λ; this means that the average number of occurrences of the event in different series of trials, i.e. for various n, remains unchanged. In this case, the Poisson distribution can be used to approximate the binomial distribution:

P(X = k) = λ^k e^(−λ)/k!, (11.40)

since for large n, (1 − λ/n)^n → e^(−λ).

The generating function of the Poisson distribution is calculated from (11.1) as

φ(z) = Σ_{k=0}^{∞} (λ^k e^(−λ)/k!) z^k = e^(λ(z−1)), (11.41)

where, by the Maclaurin formula, e^(λz) = Σ_{k=0}^{∞} (λz)^k/k!. (11.42)

In accordance with the property of the coefficients of the generating function, the probability of k successes with an average number of successes λ is calculated as (11.40).

Fig. 11.3 shows the probabilities of the Poisson distribution.

The generating function of the Poisson distribution can also be obtained by expanding the generating function of the binomial distribution in a series for np = λ as n → ∞, using the Maclaurin formula (11.42):


Fig. 11.3.

We define the mathematical expectation by (11.6): M(X) = λ, and the variance by (11.7): D(X) = λ.
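The approximation of the binomial by the Poisson law for large n and small p can be illustrated numerically; n = 1000 and p = 0.002 are hypothetical values giving λ = 2:

```python
# Poisson approximation of the binomial for large n, small p, np = λ.
# n = 1000, p = 0.002 (λ = 2) are illustrative values.
import math

n, p = 1000, 0.002
lam = n * p

binoms = [math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(6)]
poissons = [lam ** k * math.exp(-lam) / math.factorial(k) for k in range(6)]

for k in range(6):
    print(k, round(binoms[k], 5), round(poissons[k], 5))
```

For these parameters the two sets of probabilities agree to about three decimal places.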

Consider an example of a system with a Poisson distribution of parameters.

The company sent 500 products to the store. The probability of a product being damaged in transit is 0.002. Find the probabilities that the number of products damaged in transit is: exactly 3 (event A); fewer than 3 (event B); more than 3 (event C); at least one (event D).

The number n = 500 is large, the probability p = 0.002 is small, and the events in question (product damage) are independent, so the Poisson formula (11.40) can be used.

With λ = np = 500 · 0.002 = 1 we get: P(A) = 1³e^(−1)/3! ≈ 0.0613; P(B) = (1 + 1 + 1/2)e^(−1) ≈ 0.9197; P(C) = 1 − P(B) − P(A) ≈ 0.0190; P(D) = 1 − e^(−1) ≈ 0.6321.
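The four probabilities of this example can be computed directly from the Poisson formula (11.40):

```python
# Transit-damage example with λ = np = 500 · 0.002 = 1.
import math

lam = 500 * 0.002

def pois(k):
    # Poisson formula (11.40): P(X = k) = λ^k e^(−λ) / k!
    return lam ** k * math.exp(-lam) / math.factorial(k)

p_A = pois(3)                      # exactly 3 damaged
p_B = pois(0) + pois(1) + pois(2)  # fewer than 3
p_C = 1 - p_B - p_A                # more than 3
p_D = 1 - pois(0)                  # at least one

print(round(p_A, 4), round(p_B, 4), round(p_C, 4), round(p_D, 4))
# 0.0613 0.9197 0.019 0.6321
```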

The Poisson distribution has a number of useful properties for modeling service systems.

1. The sum X = X_1 + X_2 of random variables, each with a Poisson distribution, is also distributed according to Poisson's law.

If the random variables have the generating functions φ_1(z) = e^(λ_1(z−1)) and φ_2(z) = e^(λ_2(z−1)), then, according to (11.9), the generating function of the sum of independent random variables with Poisson distributions has the form

φ(z) = φ_1(z) φ_2(z) = e^((λ_1 + λ_2)(z−1)).

The parameter of the resulting distribution is λ_1 + λ_2.

2. If the number of elements N of a set obeys the Poisson distribution with parameter λ, and each element is selected independently with probability p, then the sample size Y is distributed according to the Poisson law with parameter pλ.

Let Y = X_1 + X_2 + … + X_N, where each X_i corresponds to the Bernoulli distribution and N to the Poisson distribution. The corresponding generating functions, according to (11.17) and (11.41), are φ_X(z) = q + pz and φ_N(z) = e^(λ(z−1)).

The generating function of the random variable Y is calculated according to (11.14):

φ_Y(z) = φ_N(φ_X(z)) = e^(λ(q + pz − 1)) = e^(pλ(z−1)),

i.e. the generating function corresponds to the Poisson distribution with parameter pλ.
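Property 2 (Poisson thinning) can also be checked by simulation; λ = 5 and p = 0.4 are hypothetical values, and the Poisson sampler uses Knuth's product-of-uniforms method:

```python
# Poisson thinning: N ~ Poisson(λ), each of the N elements kept with
# probability p, so the kept count Y ~ Poisson(pλ). λ = 5, p = 0.4.
import math
import random

random.seed(1)
lam, p = 5.0, 0.4
trials = 100_000

def poisson_sample(lam):
    # Knuth's method: count uniforms until their product drops below e^(−λ)
    limit = math.exp(-lam)
    k, prod = 0, random.random()
    while prod > limit:
        prod *= random.random()
        k += 1
    return k

ys = []
for _ in range(trials):
    n = poisson_sample(lam)
    ys.append(sum(1 for _ in range(n) if random.random() < p))

mean = sum(ys) / trials
var = sum((y - mean) ** 2 for y in ys) / trials
print(mean, var)  # both close to pλ = 2, as for a Poisson(pλ) variable
```

Equal sample mean and variance is exactly the signature of a Poisson law, matching the parameter pλ derived above.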

3. As a consequence of property 2, the following property holds. If the number of elements of a set is distributed according to the Poisson law with parameter λ, and the set is randomly divided with probabilities p_1 and p_2 = 1 − p_1 into two groups, then the group sizes N_1 and N_2 are independent and Poisson-distributed with parameters p_1λ and p_2λ.

For ease of use, we present the results obtained for the discrete distributions in Tables 11.1 and 11.2.

Table 11.1. Main characteristics of discrete distributions

Distribution | P(X = k)                 | Range           | Parameters | M(X) | D(X)
Bernoulli    | P(X = 1) = p, P(X = 0) = q | k = 0, 1      | p + q = 1  | p    | pq
Geometric    | p(1 − p)^(k−1)           | k = 1, 2, …     | p          | 1/p  | (1 − p)/p²
Binomial     | C_n^k p^k (1 − p)^(n−k)  | k = 0, 1, …, n  | n, p       | np   | np(1 − p)
Poisson      | λ^k e^(−λ)/k!            | k = 0, 1, 2, …  | λ          | λ    | λ

Table 11.2. Generating functions of discrete distributions

Bernoulli: φ(z) = q + pz
Geometric: φ(z) = pz/(1 − qz)
Binomial:  φ(z) = (q + pz)^n
Poisson:   φ(z) = e^(λ(z−1))

TEST QUESTIONS

  • 1. What probability distributions are classified as discrete?
  • 2. What is a generating function and what is it used for?
  • 3. How to calculate the moments of random variables using the generating function?
  • 4. What is the generating function of the sum of independent random variables?
  • 5. What is called a composite distribution and how are the generating functions of composite distributions calculated?
  • 6. Give the main characteristics of the Bernoulli distribution, give an example of its use in service tasks.
  • 7. Give the main characteristics of the geometric distribution, give an example of use in service tasks.
  • 8. Give the main characteristics of the binomial distribution, give an example of use in service tasks.
  • 9. Give the main characteristics of the Poisson distribution, give an example of its use in service tasks.