Benford’s Law. According to Benford’s law, a variety of different data sets include numbers with leading (first) digits that follow the distribution shown in the table below. In Exercises 21–24, test for goodness-of-fit with the distribution described by Benford’s law.

| Leading Digit | Benford's Law: Distribution of Leading Digits |
|---|---|
| 1 | 30.10% |
| 2 | 17.60% |
| 3 | 12.50% |
| 4 | 9.70% |
| 5 | 7.90% |
| 6 | 6.70% |
| 7 | 5.80% |
| 8 | 5.10% |
| 9 | 4.60% |
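The percentages in the table agree with the usual closed form of Benford's law, \(P(d) = \log_{10}(1 + 1/d)\). That formula is not stated in the exercise, so the sketch below is a side check rather than part of the solution, and the variable names are illustrative.

```python
import math

# Percentages from the table above, indexed by leading digit 1..9.
table_percent = {1: 30.1, 2: 17.6, 3: 12.5, 4: 9.7, 5: 7.9,
                 6: 6.7, 7: 5.8, 8: 5.1, 9: 4.6}

# Standard closed form of Benford's law (an assumption here; the exercise
# only supplies the table): P(d) = log10(1 + 1/d).
formula_percent = {d: 100 * math.log10(1 + 1 / d) for d in range(1, 10)}

for d in range(1, 10):
    print(f"digit {d}: table {table_percent[d]:.1f}%, "
          f"formula {formula_percent[d]:.2f}%")
```

Each tabulated value is the formula value rounded to one decimal place, which is why the solution can use the table directly.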

Tax Cheating? Frequencies of leading digits from IRS tax files are 152, 89, 63, 48, 39, 40, 28, 25, and 27 (corresponding to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively, based on data from Mark Nigrini, who provides software for Benford data analysis). Using a 0.05 significance level, test for goodness-of-fit with Benford’s law. Does it appear that the tax entries are legitimate?

Short Answer

There is not enough evidence to conclude that the observed frequencies of the leading digits differ from the frequencies expected under Benford’s law.

Yes, it appears that the tax entries are legitimate.

Step by step solution

01

Given information

The frequencies of the different leading digits from IRS tax files are recorded.

02

Check the requirements

Assume that random sampling is conducted.

Let O denote the observed frequencies of the leading digits.

The observed frequencies are noted below:

\[\begin{array}{lll}{O_1} = 152 & {O_2} = 89 & {O_3} = 63\\{O_4} = 48 & {O_5} = 39 & {O_6} = 40\\{O_7} = 28 & {O_8} = 25 & {O_9} = 27\end{array}\]

The sum of all observed frequencies is computed below:

\[n = 152 + 89 + \cdots + 27 = 511\]

Let E denote the expected frequencies.

Let \({p_i}\) denote the expected proportion and \({E_i} = n{p_i}\) the expected frequency of the ith leading digit, as given by Benford’s law.

| Leading Digit | Benford's Law: Distribution of Leading Digits | Proportions | Expected Frequencies |
|---|---|---|---|
| 1 | 30.10% | \({p_1} = \frac{30.1}{100} = 0.301\) | \({E_1} = n{p_1} = 511(0.301) = 153.811\) |
| 2 | 17.60% | \({p_2} = \frac{17.6}{100} = 0.176\) | \({E_2} = n{p_2} = 511(0.176) = 89.936\) |
| 3 | 12.50% | \({p_3} = \frac{12.5}{100} = 0.125\) | \({E_3} = n{p_3} = 511(0.125) = 63.875\) |
| 4 | 9.70% | \({p_4} = \frac{9.7}{100} = 0.097\) | \({E_4} = n{p_4} = 511(0.097) = 49.567\) |
| 5 | 7.90% | \({p_5} = \frac{7.9}{100} = 0.079\) | \({E_5} = n{p_5} = 511(0.079) = 40.369\) |
| 6 | 6.70% | \({p_6} = \frac{6.7}{100} = 0.067\) | \({E_6} = n{p_6} = 511(0.067) = 34.237\) |
| 7 | 5.80% | \({p_7} = \frac{5.8}{100} = 0.058\) | \({E_7} = n{p_7} = 511(0.058) = 29.638\) |
| 8 | 5.10% | \({p_8} = \frac{5.1}{100} = 0.051\) | \({E_8} = n{p_8} = 511(0.051) = 26.061\) |
| 9 | 4.60% | \({p_9} = \frac{4.6}{100} = 0.046\) | \({E_9} = n{p_9} = 511(0.046) = 23.506\) |

Since every expected frequency is at least 5 (the smallest is 23.506), the requirements of the test are satisfied.
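The expected frequencies and the requirement check can be reproduced with a few lines of Python. This is only a sketch of the arithmetic above; the names `observed`, `benford`, and `expected` are illustrative and do not come from the exercise.

```python
# Observed leading-digit counts for digits 1..9, from the exercise.
observed = [152, 89, 63, 48, 39, 40, 28, 25, 27]

# Benford proportions for digits 1..9, taken from the table.
benford = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

# Sample size and expected frequencies E_i = n * p_i.
n = sum(observed)
expected = [n * p for p in benford]

# Requirement for the goodness-of-fit test: every expected frequency >= 5.
requirement_met = all(e >= 5 for e in expected)

for digit, e in enumerate(expected, start=1):
    print(f"digit {digit}: E = {e:.3f}")
print("n =", n, "| requirement met:", requirement_met)
```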

03

State the hypotheses

The null hypothesis for conducting the given test is as follows:

The observed frequencies of leading digits are the same as the frequencies expected from Benford’s law.

The alternative hypothesis is as follows:

The observed frequencies of leading digits are not the same as the frequencies expected from Benford’s law.
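In symbols, the two hypotheses can be written as follows. The notation \({\pi _i}\) for the true proportion of tax entries with leading digit i is introduced here for illustration and is not part of the original solution; \({p_i}\) is the Benford proportion for digit i.

\[\begin{aligned}{H_0}&: {\pi _i} = {p_i}\;\text{for every}\;i = 1, 2, \ldots ,9\\{H_1}&: {\pi _i} \ne {p_i}\;\text{for at least one}\;i\end{aligned}\]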

04

Conduct the hypothesis test

The table below shows the necessary calculations:

| Leading Digit | O | E | \(\left( {O - E} \right)\) | \(\frac{{{{\left( {O - E} \right)}^2}}}{E}\) |
|---|---|---|---|---|
| 1 | 152 | 153.811 | -1.811 | 0.021323 |
| 2 | 89 | 89.936 | -0.936 | 0.009741 |
| 3 | 63 | 63.875 | -0.875 | 0.011986 |
| 4 | 48 | 49.567 | -1.567 | 0.049539 |
| 5 | 39 | 40.369 | -1.369 | 0.046426 |
| 6 | 40 | 34.237 | 5.763 | 0.970067 |
| 7 | 28 | 29.638 | -1.638 | 0.090527 |
| 8 | 25 | 26.061 | -1.061 | 0.043196 |
| 9 | 27 | 23.506 | 3.494 | 0.519358 |

The value of the test statistic is equal to:

\[\begin{aligned}{\chi ^2} &= \sum {\frac{{{{\left( {O - E} \right)}^2}}}{E}} \\ &= 0.021323 + 0.009741 + \cdots + 0.519358\\ &= 1.762163\end{aligned}\]

Thus, \({\chi ^2} = 1.762\).

Let k be the number of categories (leading digits), so k = 9.

The degrees of freedom for \({\chi ^2}\) are computed below:

\[df = k - 1 = 9 - 1 = 8\]
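As a cross-check on the table and the sum, the test statistic and degrees of freedom can be recomputed directly from the counts. The sketch below assumes nothing beyond the data in the exercise; the variable names are illustrative.

```python
# Observed counts and Benford proportions for leading digits 1..9.
observed = [152, 89, 63, 48, 39, 40, 28, 25, 27]
benford = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

n = sum(observed)
expected = [n * p for p in benford]

# Per-digit contributions (O - E)^2 / E, then their sum.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

# Degrees of freedom: number of categories minus one.
df = len(observed) - 1

for digit, c in enumerate(contributions, start=1):
    print(f"digit {digit}: contribution = {c:.6f}")
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

Digit 6 contributes the most (about 0.97), yet even that is small relative to what would be needed to approach the critical value.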

05

State the conclusion

The critical value of \({\chi ^2}\) at \(\alpha = 0.05\) with 8 degrees of freedom is equal to 15.507, obtained using the chi-square table.

The p-value is equal to 0.987.

Since the test statistic (1.762) is less than the critical value (15.507) and the p-value (0.987) is greater than 0.05, we fail to reject the null hypothesis.
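The p-value and the critical value can be checked without a table. For an even number of degrees of freedom, the right-tail area of the chi-square distribution has the closed form \({e^{ - x/2}}\sum\nolimits_{j = 0}^{df/2 - 1} {{{(x/2)}^j}/j!} \). The sketch below uses that identity with the standard library only; the helper name `chi2_sf_even_df` is hypothetical, and a statistics package would normally be used instead.

```python
import math

def chi2_sf_even_df(x, df):
    """Right-tail area P(X > x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    partial_sum = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * partial_sum

chi_square = 1.762163   # test statistic from the solution
df = 8                  # degrees of freedom
alpha = 0.05            # significance level
critical_value = 15.507 # table value quoted in the solution

p_value = chi2_sf_even_df(chi_square, df)
tail_at_critical = chi2_sf_even_df(critical_value, df)
reject_null = chi_square > critical_value

print(f"p-value = {p_value:.3f}")
print(f"tail area beyond the critical value = {tail_at_critical:.4f}")
print("reject H0:", reject_null)
```

The tail area beyond 15.507 comes out very close to 0.05, which confirms that the table value is the right cutoff for this test.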

There is not enough evidence to conclude that the observed frequencies of the leading digits differ from the frequencies expected under Benford’s law.

Yes, it appears that the tax entries are legitimate.

Most popular questions from this chapter

Author’s Computer Files The author recorded the leading digits of the sizes of the electronic document files for the current edition of this book. The leading digits have frequencies of 55, 25, 17, 24, 18, 12, 12, 3, and 4 (corresponding to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively). Using a 0.05 significance level, test for goodness-of-fit with Benford’s law.

Exercises 1–5 refer to the sample data in the following table, which summarizes the last digits of the heights (cm) of 300 randomly selected subjects (from Data Set 1 “Body Data” in Appendix B). Assume that we want to use a 0.05 significance level to test the claim that the data are from a population having the property that the last digits are all equally likely.

| Last Digit | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 30 | 35 | 24 | 25 | 35 | 36 | 37 | 27 | 27 | 24 |

Given that the P-value for the hypothesis test is 0.501, what do you conclude? Does it appear that the heights were obtained through measurement or that the subjects reported their heights?

Is the hypothesis test left-tailed, right-tailed, or two-tailed?

In a study of high school students at least 16 years of age, researchers obtained survey results summarized in the accompanying table (based on data from “Texting While Driving and Other Risky Motor Vehicle Behaviors Among U.S. High School Students,” by O’Malley, Shults, and Eaton, Pediatrics, Vol. 131, No. 6). Use a 0.05 significance level to test the claim of independence between texting while driving and irregular seat belt use. Are those two risky behaviors independent of each other?


| | Irregular Seat Belt Use? Yes | Irregular Seat Belt Use? No |
|---|---|---|
| Texted while driving | 1737 | 2048 |
| No texting while driving | 1945 | 2775 |

Weather-Related Deaths Review Exercise 5 involved weather-related U.S. deaths. Among the 450 deaths included in that exercise, 320 are males. Use a 0.05 significance level to test the claim that among those who die in weather-related deaths, the percentage of males is equal to 50%. Provide an explanation for the results.
