Benford’s Law. According to Benford’s law, a variety of different data sets include numbers with leading (first) digits that follow the distribution shown in the table below. In Exercises 21–24, test for goodness-of-fit with the distribution described by Benford’s law.

| Leading digit | Benford's law: distribution of leading digits |
|---|---|
| 1 | 30.10% |
| 2 | 17.60% |
| 3 | 12.50% |
| 4 | 9.70% |
| 5 | 7.90% |
| 6 | 6.70% |
| 7 | 5.80% |
| 8 | 5.10% |
| 9 | 4.60% |
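The percentages in the table agree with the usual closed form of Benford’s law, \(P(d) = \log_{10}(1 + 1/d)\) for leading digit \(d\). That formula is not stated in the exercise, so treat the following as a minimal sketch for cross-checking the table; the name `benford_percent` is illustrative only.

```python
import math

# Benford's law in closed form (an assumption here; the exercise only gives the table):
# P(d) = log10(1 + 1/d) for leading digit d = 1..9, expressed as a percentage
benford_percent = {d: round(100 * math.log10(1 + 1 / d), 1) for d in range(1, 10)}

for digit, pct in benford_percent.items():
    print(f"{digit}: {pct:.1f}%")
```

Rounded to one decimal place, the computed percentages match the tabulated values for all nine digits.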

Detecting Fraud When working for the Brooklyn district attorney, investigator Robert Burton analyzed the leading digits of the amounts from 784 checks issued by seven suspect companies. The frequencies were found to be 0, 15, 0, 76, 479, 183, 8, 23, and 0, and those digits correspond to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively. If the observed frequencies are substantially different from the frequencies expected with Benford’s law, the check amounts appear to result from fraud. Use a 0.01 significance level to test for goodness-of-fit with Benford’s law. Does it appear that the checks are the result of fraud?

Short Answer

There is enough evidence to conclude that the observed frequencies are not the same as the frequencies expected from Benford’s law.

Because the observed frequencies differ substantially from the expected frequencies, the check amounts appear to be the result of fraud.

Step by step solution

01

Given information

The frequencies of the different leading digits of the amounts of 784 checks are recorded.

02

Check the requirements

Assume that random sampling is conducted.

Let O denote the observed frequencies of the leading digits.

The observed frequencies are noted below:

\[\begin{aligned}O_1 &= 0, & O_2 &= 15, & O_3 &= 0,\\ O_4 &= 76, & O_5 &= 479, & O_6 &= 183,\\ O_7 &= 8, & O_8 &= 23, & O_9 &= 0\end{aligned}\]

The sum of all observed frequencies is computed below:

\[\begin{aligned}n &= 0 + 15 + \cdots + 0\\ &= 784\end{aligned}\]

Let E denote the expected frequencies.

Let \(p_i\) be the expected proportion of the ith leading digit as given by Benford’s law, so that the expected frequency is \(E_i = np_i\).

| Leading digit | Benford's law: distribution of leading digits | Proportion \(p_i\) | Expected frequency \(E_i = np_i\) |
|---|---|---|---|
| 1 | 30.10% | \(p_1 = \frac{30.1}{100} = 0.301\) | \(E_1 = 784(0.301) = 235.984\) |
| 2 | 17.60% | \(p_2 = \frac{17.6}{100} = 0.176\) | \(E_2 = 784(0.176) = 137.984\) |
| 3 | 12.50% | \(p_3 = \frac{12.5}{100} = 0.125\) | \(E_3 = 784(0.125) = 98\) |
| 4 | 9.70% | \(p_4 = \frac{9.7}{100} = 0.097\) | \(E_4 = 784(0.097) = 76.048\) |
| 5 | 7.90% | \(p_5 = \frac{7.9}{100} = 0.079\) | \(E_5 = 784(0.079) = 61.936\) |
| 6 | 6.70% | \(p_6 = \frac{6.7}{100} = 0.067\) | \(E_6 = 784(0.067) = 52.528\) |
| 7 | 5.80% | \(p_7 = \frac{5.8}{100} = 0.058\) | \(E_7 = 784(0.058) = 45.472\) |
| 8 | 5.10% | \(p_8 = \frac{5.1}{100} = 0.051\) | \(E_8 = 784(0.051) = 39.984\) |
| 9 | 4.60% | \(p_9 = \frac{4.6}{100} = 0.046\) | \(E_9 = 784(0.046) = 36.064\) |

Since every expected frequency is at least 5 (the smallest is 36.064), the requirements of the test are met.
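The expected frequencies and the requirement check can be reproduced with a few lines of Python. This is a minimal sketch: the proportions are copied from Benford’s law table, and the variable names are illustrative only.

```python
# Benford proportions for leading digits 1..9, copied from the table
proportions = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]
n = 784  # total number of checks

# Expected frequency for each digit: E_i = n * p_i (rounded to 3 decimals for display)
expected = [round(n * p, 3) for p in proportions]

# Goodness-of-fit requirement: every expected frequency is at least 5
requirement_met = all(e >= 5 for e in expected)

for digit, e in enumerate(expected, start=1):
    print(f"E_{digit} = {e}")
print("Requirement met:", requirement_met)
```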

03

State the hypotheses

The null hypothesis for conducting the given test is as follows:

The observed frequencies are the same as the frequencies expected from Benford’s law.

The alternative hypothesis is as follows:

The observed frequencies are not the same as the frequencies expected from Benford’s law.

04

Compute the test statistic

The table below shows the necessary calculations:

| Leading digit | O | E | \(O - E\) | \(\frac{(O - E)^2}{E}\) |
|---|---|---|---|---|
| 1 | 0 | 235.984 | -235.984 | 235.984 |
| 2 | 15 | 137.984 | -122.984 | 109.6146 |
| 3 | 0 | 98 | -98 | 98 |
| 4 | 76 | 76.048 | -0.048 | 0.00003 |
| 5 | 479 | 61.936 | 417.064 | 2808.421 |
| 6 | 183 | 52.528 | 130.472 | 324.0737 |
| 7 | 8 | 45.472 | -37.472 | 30.87946 |
| 8 | 23 | 39.984 | -16.984 | 7.214292 |
| 9 | 0 | 36.064 | -36.064 | 36.064 |

The value of the test statistic is equal to:

\[\begin{aligned}\chi^2 &= \sum \frac{(O - E)^2}{E}\\ &= 235.984 + 109.6146 + \cdots + 36.064\\ &= 3650.251\end{aligned}\]

Thus, \(\chi^2 = 3650.251\).

Let k be the number of digits, which is 9.

The degrees of freedom for \(\chi^2\) are computed below:

\[\begin{aligned}df &= k - 1\\ &= 9 - 1\\ &= 8\end{aligned}\]
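As a cross-check on the hand calculation, the test statistic and degrees of freedom can be recomputed directly from the observed counts and Benford proportions. This is a minimal sketch with illustrative variable names; it should agree with 3650.251 up to rounding.

```python
# Observed counts of leading digits 1..9 from the 784 checks
observed = [0, 15, 0, 76, 479, 183, 8, 23, 0]
# Benford proportions for leading digits 1..9
proportions = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

n = sum(observed)
expected = [n * p for p in proportions]

# Per-digit contributions (O - E)^2 / E, then their sum
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

# Degrees of freedom: number of categories minus 1
df = len(observed) - 1

print(f"chi-square = {chi_square:.3f}, df = {df}")
```

Digit 5 alone contributes about 2808 of the total, which shows where the mismatch with Benford’s law is concentrated.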

05

State the decision

From the chi-square table, the critical value of \(\chi^2\) at \(\alpha = 0.01\) with 8 degrees of freedom is 20.090.

The p-value is:

\[\begin{aligned}p\text{-value} &= P\left(\chi^2 > 3650.251\right)\\ &\approx 0.000\end{aligned}\]

Since the test statistic value is greater than the critical value and the p-value is less than 0.01, the null hypothesis is rejected.
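The decision rule can also be checked numerically. The sketch below avoids third-party libraries by using the closed-form upper-tail probability of the chi-square distribution, which holds only for an even number of degrees of freedom (here 8). The helper name `chi_square_sf_even_df` is hypothetical, and the critical value is the table value quoted in this step.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(chi2 > x); this closed form is valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    # P(chi2 > x) = exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    partial_sum = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * partial_sum

alpha = 0.01
df = 8
chi_square = 3650.251      # test statistic from the previous step
critical_value = 20.090    # chi-square table value for alpha = 0.01, df = 8

p_value = chi_square_sf_even_df(chi_square, df)
reject_null = chi_square > critical_value and p_value < alpha

print("Tail area at the critical value:", chi_square_sf_even_df(critical_value, df))
print("p-value:", p_value)
print("Reject null hypothesis:", reject_null)
```

The tail area at 20.090 comes out close to 0.01, consistent with the table, while the p-value for 3650.251 is too small to represent in floating point and underflows to zero.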

06

State the conclusion

There is enough evidence to conclude that the observed frequencies are not the same as the frequencies expected from Benford’s law.

Because the observed frequencies differ substantially from the expected frequencies, the check amounts appear to be the result of fraud.

Most popular questions from this chapter

Exercises 1–5 refer to the sample data in the following table, which summarizes the last digits of the heights (cm) of 300 randomly selected subjects (from Data Set 1 “Body Data” in Appendix B). Assume that we want to use a 0.05 significance level to test the claim that the data are from a population having the property that the last digits are all equally likely.

| Last digit | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 30 | 35 | 24 | 25 | 35 | 36 | 37 | 27 | 27 | 24 |

Given that the P-value for the hypothesis test is 0.501, what do you conclude? Does it appear that the heights were obtained through measurement or that the subjects reported their heights?

In Exercises 5–20, conduct the hypothesis test and provide the test statistic and the P-value and/or critical value, and state the conclusion.

Police Calls Repeat Exercise 11 using these observed frequencies for police calls received during the month of March: Monday (208); Tuesday (224); Wednesday (246); Thursday (173); Friday (210); Saturday (236); Sunday (154). What is a fundamental error with this analysis?

Weather-Related Deaths Review Exercise 5 involved weather-related U.S. deaths. Among the 450 deaths included in that exercise, 320 are males. Use a 0.05 significance level to test the claim that among those who die in weather-related deaths, the percentage of males is equal to 50%. Provide an explanation for the results.

Weather-Related Deaths For a recent year, the numbers of weather-related U.S. deaths for each month were 28, 17, 12, 24, 88, 61, 104, 32, 20, 13, 26, 25 (listed in order beginning with January). Use a 0.01 significance level to test the claim that weather-related deaths occur in the different months with the same frequency. Provide an explanation for the result.

In Exercises 5–20, conduct the hypothesis test and provide the test statistic and the P-value and/or critical value, and state the conclusion.

Baseball Player Births In his book Outliers, author Malcolm Gladwell argues that more baseball players have birth dates in the months immediately following July 31, because that was the age cutoff date for nonschool baseball leagues. Here is a sample of frequency counts of months of birth dates of American-born Major League Baseball players starting with January: 387, 329, 366, 344, 336, 313, 313, 503, 421, 434, 398, 371. Using a 0.05 significance level, is there sufficient evidence to warrant rejection of the claim that American-born Major League Baseball players are born in different months with the same frequency? Do the sample values appear to support Gladwell’s claim?
