Benford’s Law. According to Benford’s law, a variety of different data sets include numbers with leading (first) digits that follow the distribution shown in the table below. In Exercises 21–24, test for goodness-of-fit with the distribution described by Benford’s law.

| Leading digit | Benford's law: distribution of leading digits |
|---|---|
| 1 | 30.10% |
| 2 | 17.60% |
| 3 | 12.50% |
| 4 | 9.70% |
| 5 | 7.90% |
| 6 | 6.70% |
| 7 | 5.80% |
| 8 | 5.10% |
| 9 | 4.60% |
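The exercise gives the percentages without their source. They agree, to one decimal place, with the standard statement of Benford's law, \(P(d) = \log_{10}\left(1 + \frac{1}{d}\right)\). A minimal Python sketch reproduces the table from that formula; the name `benford` is chosen here for illustration and does not come from the exercise.

```python
import math

# Standard statement of Benford's law: P(d) = log10(1 + 1/d), d = 1..9.
# The dictionary name `benford` is an illustrative choice.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Print each probability as a percentage rounded to one decimal place.
for digit, prob in benford.items():
    print(f"{digit}: {100 * prob:.1f}%")
```

The nine probabilities sum to 1 because the logarithms telescope to \(\log_{10} 10\).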

Author’s Computer Files The author recorded the leading digits of the sizes of the electronic document files for the current edition of this book. The leading digits have frequencies of 55, 25, 17, 24, 18, 12, 12, 3, and 4 (corresponding to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively). Using a 0.05 significance level, test for goodness-of-fit with Benford’s law.

Short Answer

There is not enough evidence to conclude that the observed frequencies of the leading digits of the sizes of the electronic document files are not the same as the frequencies expected from Benford’s law.

Step by step solution

01

Given information

The frequencies of the leading digits of the sizes of the author's electronic document files are recorded.

02

Check the requirements

Assume that random sampling is conducted.

Let O denote the observed frequencies of the leading digits.

The observed frequencies are noted below:

\({O_1} = 55,\;{O_2} = 25,\;{O_3} = 17,\;{O_4} = 24,\;{O_5} = 18,\;{O_6} = 12,\;{O_7} = 12,\;{O_8} = 3,\;{O_9} = 4\)

The sum of all observed frequencies is computed below:

\(n = 55 + 25 + \cdots + 4 = 170\)

Let E denote the expected frequencies.

Let \({p_i}\) be the expected proportion of the i-th leading digit under Benford’s law, so that the expected frequency is \({E_i} = n{p_i}\).

| Leading digit | Benford's law: distribution of leading digits | Proportion | Expected frequency |
|---|---|---|---|
| 1 | 30.10% | \({p_1} = \frac{{30.1}}{{100}} = 0.301\) | \({E_1} = n{p_1} = 170\left( {0.301} \right) = 51.17\) |
| 2 | 17.60% | \({p_2} = \frac{{17.6}}{{100}} = 0.176\) | \({E_2} = n{p_2} = 170\left( {0.176} \right) = 29.92\) |
| 3 | 12.50% | \({p_3} = \frac{{12.5}}{{100}} = 0.125\) | \({E_3} = n{p_3} = 170\left( {0.125} \right) = 21.25\) |
| 4 | 9.70% | \({p_4} = \frac{{9.7}}{{100}} = 0.097\) | \({E_4} = n{p_4} = 170\left( {0.097} \right) = 16.49\) |
| 5 | 7.90% | \({p_5} = \frac{{7.9}}{{100}} = 0.079\) | \({E_5} = n{p_5} = 170\left( {0.079} \right) = 13.43\) |
| 6 | 6.70% | \({p_6} = \frac{{6.7}}{{100}} = 0.067\) | \({E_6} = n{p_6} = 170\left( {0.067} \right) = 11.39\) |
| 7 | 5.80% | \({p_7} = \frac{{5.8}}{{100}} = 0.058\) | \({E_7} = n{p_7} = 170\left( {0.058} \right) = 9.86\) |
| 8 | 5.10% | \({p_8} = \frac{{5.1}}{{100}} = 0.051\) | \({E_8} = n{p_8} = 170\left( {0.051} \right) = 8.67\) |
| 9 | 4.60% | \({p_9} = \frac{{4.6}}{{100}} = 0.046\) | \({E_9} = n{p_9} = 170\left( {0.046} \right) = 7.82\) |

Since every expected frequency is at least 5 (the smallest is 7.82), the requirements of the test are satisfied.
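The expected frequencies and the requirement check can be reproduced with a short Python sketch. The variable names (`observed`, `benford_pct`, `expected`, `all_at_least_5`) are illustrative choices, not part of the exercise.

```python
# Observed leading-digit frequencies for digits 1..9 (from the exercise).
observed = [55, 25, 17, 24, 18, 12, 12, 3, 4]

# Benford's law percentages for digits 1..9 (from the table in the exercise).
benford_pct = [30.1, 17.6, 12.5, 9.7, 7.9, 6.7, 5.8, 5.1, 4.6]

# Sample size n and expected frequencies E_i = n * p_i.
n = sum(observed)
expected = [n * pct / 100 for pct in benford_pct]

# Requirement for the chi-square goodness-of-fit test: every E_i >= 5.
all_at_least_5 = all(e >= 5 for e in expected)

for digit, e in enumerate(expected, start=1):
    print(f"E_{digit} = {e:.2f}")
print("n =", n, "| all expected >= 5:", all_at_least_5)
```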

03

State the hypotheses

The null hypothesis for conducting the given test is as follows:

The observed frequencies of leading digits are the same as the frequencies expected from Benford’s law.

The alternative hypothesis is as follows:

The observed frequencies of leading digits are not the same as the frequencies expected from Benford’s law.

The test is right-tailed.

If the value of the test statistic is greater than the critical value, the null hypothesis is rejected.

04

Conduct the hypothesis test

The table below shows the necessary calculations:

| Leading digit | \(O\) | \(E\) | \(\left( {O - E} \right)\) | \(\frac{{{{\left( {O - E} \right)}^2}}}{E}\) |
|---|---|---|---|---|
| 1 | 55 | 51.17 | 3.83 | 0.286670 |
| 2 | 25 | 29.92 | -4.92 | 0.809037 |
| 3 | 17 | 21.25 | -4.25 | 0.850000 |
| 4 | 24 | 16.49 | 7.51 | 3.420261 |
| 5 | 18 | 13.43 | 4.57 | 1.555093 |
| 6 | 12 | 11.39 | 0.61 | 0.032669 |
| 7 | 12 | 9.86 | 2.14 | 0.464462 |
| 8 | 3 | 8.67 | -5.67 | 3.708062 |
| 9 | 4 | 7.82 | -3.82 | 1.866036 |

The value of the test statistic is:

\({\chi ^2} = \sum {\frac{{{{\left( {O - E} \right)}^2}}}{E}} = 0.286670 + 0.809037 + \cdots + 1.866036 = 12.992\)

Thus, \({\chi ^2} = 12.992\).

Let k be the number of digits, equal to 9.

The degrees of freedom for \({\chi ^2}\) are computed below:

\(df = k - 1 = 9 - 1 = 8\)
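As a check on the arithmetic, the following Python sketch sums \(\frac{(O - E)^2}{E}\) over the nine digits and computes the degrees of freedom. The expected values are those listed in the calculation table; the variable names are illustrative.

```python
# Observed and expected frequencies for leading digits 1..9,
# taken from the calculation table in this step.
observed = [55, 25, 17, 24, 18, 12, 12, 3, 4]
expected = [51.17, 29.92, 21.25, 16.49, 13.43, 11.39, 9.86, 8.67, 7.82]

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: number of categories minus one.
df = len(observed) - 1

print(f"chi-square = {chi_square:.3f}, df = {df}")
```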

05

State the conclusion

The critical value of \({\chi ^2}\) at \(\alpha = 0.05\) with 8 degrees of freedom is 15.507, taken from the chi-square table.

The p-value is:

\(p\text{-value} = P\left( {{\chi ^2} > 12.992} \right) = 0.112\)

Since the test statistic (12.992) is less than the critical value (15.507) and the p-value (0.112) is greater than 0.05, we fail to reject the null hypothesis.

There is not enough evidence to conclude that the observed frequencies of the leading digits of the sizes of the electronic document files are not the same as the frequencies expected from Benford’s law.
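The critical value and the p-value can also be checked without a chi-square table. The Python sketch below uses only the standard library. It relies on a closed form for the right-tail probability that holds only when the degrees of freedom are even, which is the case here (df = 8): \(P\left( {{\chi ^2} > x} \right) = e^{-x/2}\sum_{j=0}^{df/2 - 1} \frac{(x/2)^j}{j!}\). The helper name `chi2_sf_even_df` and the bisection search are illustrative choices, not the method used in the solution above.

```python
import math

def chi2_sf_even_df(x, df):
    """Right-tail probability P(X > x) for a chi-square variable.

    Hypothetical helper: uses the closed form that is valid only
    for a positive EVEN number of degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    term, total = 1.0, 0.0
    for j in range(df // 2):
        total += term            # add (x/2)^j / j!
        term *= half / (j + 1)   # advance to the next term
    return math.exp(-half) * total

chi_square, df, alpha = 12.992, 8, 0.05
p_value = chi2_sf_even_df(chi_square, df)

# Critical value: find x with P(X > x) = alpha by bisection.
# The tail probability decreases as x grows, so move the lower
# bound up while the tail is still larger than alpha.
low, high = 0.0, 100.0
for _ in range(100):
    mid = (low + high) / 2
    if chi2_sf_even_df(mid, df) > alpha:
        low = mid
    else:
        high = mid
critical_value = (low + high) / 2

reject_null = chi_square > critical_value
print(f"p-value = {p_value:.3f}, critical value = {critical_value:.3f}")
print("reject H0:", reject_null)
```

Both decision rules agree: the statistic is below the critical value and the p-value exceeds 0.05, so the null hypothesis is not rejected.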


Most popular questions from this chapter

In Exercises 5–20, conduct the hypothesis test and provide the test statistic and the P-value and/or critical value, and state the conclusion.

California Daily 4 Lottery The author recorded all digits selected in California’s Daily 4 Lottery for the 60 days preceding the time that this exercise was created. The frequencies of the digits from 0 through 9 are 21, 30, 31, 33, 19, 23, 21, 16, 24, and 22. Use a 0.05 significance level to test the claim of lottery officials that the digits are selected in a way that they are equally likely.

In Exercises 5–20, conduct the hypothesis test and provide the test statistic and the P-value and/or critical value, and state the conclusion.

Testing a Slot Machine The author purchased a slot machine (Bally Model 809) and tested it by playing it 1197 times. There are 10 different categories of outcomes, including no win, win jackpot, win with three bells, and so on. When testing the claim that the observed outcomes agree with the expected frequencies, the author obtained a test statistic of \({\chi ^2} = 8.185\). Use a 0.05 significance level to test the claim that the actual outcomes agree with the expected frequencies. Does the slot machine appear to be functioning as expected?

In a clinical trial of the effectiveness of echinacea for preventing colds, the results in the table below were obtained (based on data from “An Evaluation of Echinacea angustifolia in Experimental Rhinovirus Infections,” by Turner et al., New England Journal of Medicine, Vol. 353, No. 4). Use a 0.05 significance level to test the claim that getting a cold is independent of the treatment group. What do the results suggest about the effectiveness of echinacea as a prevention against colds?

| Treatment Group | Placebo | Echinacea: 20% Extract | Echinacea: 60% Extract |
|---|---|---|---|
| Got a Cold | 88 | 48 | 42 |
| Did Not Get a Cold | 15 | 4 | 10 |

American Idol Contestants on the TV show American Idol competed to win a singing contest. At one point, the website WhatNotToSing.com listed the actual numbers of eliminations for different orders of singing, and the expected number of eliminations was also listed. The results are in the table below. Use a 0.05 significance level to test the claim that the actual eliminations agree with the expected numbers. Does there appear to be support for the claim that the leadoff singers appear to be at a disadvantage?

| Singing Order | 1 | 2 | 3 | 4 | 5 | 6 | 7–12 |
|---|---|---|---|---|---|---|---|
| Actual Eliminations | 20 | 12 | 9 | 8 | 6 | 5 | 9 |
| Expected Eliminations | 12.9 | 12.9 | 9.9 | 7.9 | 6.4 | 5.5 | 13.5 |

In Exercises 1–4, use the following listed arrival delay times (minutes) for American Airlines flights from New York to Los Angeles. Negative values correspond to flights that arrived early. Also shown are the SPSS results for analysis of variance. Assume that we plan to use a 0.05 significance level to test the claim that the different flights have the same mean arrival delay time.

| Flight | Arrival delay times (minutes) |
|---|---|
| Flight 1 | -32, -25, -26, -6, 5, -15, -17, -36 |
| Flight 19 | -5, -32, -13, -9, -19, 49, -30, -23 |
| Flight 21 | -23, 28, 103, -19, -5, -46, 13, -3 |

Test Statistic What is the value of the test statistic? What distribution is used with the test statistic?
