Benford’s Law. According to Benford’s law, a variety of different data sets include numbers with leading (first) digits that follow the distribution shown in the table below. In Exercises 21–24, test for goodness-of-fit with the distribution described by Benford’s law.

| Leading digit | Benford's law: distribution of leading digits |
|---|---|
| 1 | 30.10% |
| 2 | 17.60% |
| 3 | 12.50% |
| 4 | 9.70% |
| 5 | 7.90% |
| 6 | 6.70% |
| 7 | 5.80% |
| 8 | 5.10% |
| 9 | 4.60% |
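The percentages in the table can be cross-checked against the usual closed-form statement of Benford's law, \(P(d) = \log_{10}\left(1 + \frac{1}{d}\right)\). That formula is not given in the exercise; it is brought in here as outside background, and the variable names below are illustrative only. A minimal sketch:

```python
import math

# Percentages as listed in the table for leading digits 1..9.
table_percent = [30.1, 17.6, 12.5, 9.7, 7.9, 6.7, 5.8, 5.1, 4.6]

# Assumption: Benford's law in its standard form, P(d) = log10(1 + 1/d).
formula_percent = [100 * math.log10(1 + 1 / d) for d in range(1, 10)]

for digit, (listed, exact) in enumerate(zip(table_percent, formula_percent), start=1):
    print(f"digit {digit}: table {listed:.1f}%, formula {exact:.2f}%")
```

Each formula value rounds to the tabulated percentage, so the table can be read as the formula rounded to one decimal place.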

Author’s Check Amounts. Exercise 21 lists the observed frequencies of leading digits from amounts on checks from seven suspect companies. Here are the observed frequencies of the leading digits from the amounts on the most recent checks written by the author at the time this exercise was created: 83, 58, 27, 21, 21, 21, 6, 4, 9. (Those observed frequencies correspond to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively.) Using a 0.01 significance level, test the claim that these leading digits are from a population of leading digits that conform to Benford’s law. Does the conclusion change if the significance level is 0.05?

Short Answer

At \(\alpha = 0.01\), there is not enough evidence to conclude that the observed frequencies of the leading digits differ from the frequencies expected from Benford’s law.

At \(\alpha = 0.05\), it can be concluded that the observed frequencies of the leading digits from the check amounts differ from the frequencies expected from Benford’s law.

Thus, the conclusion changes when the significance level changes from 0.01 to 0.05.

Step by step solution

01

Given information

The frequencies of the different leading digits of the amounts of checks are recorded.

02

Check the requirements

Assume that random sampling is conducted.

Let O denote the observed frequencies of the leading digits.

The observed frequencies are noted below:

\(\begin{aligned}O_1 &= 83, & O_2 &= 58, & O_3 &= 27,\\ O_4 &= 21, & O_5 &= 21, & O_6 &= 21,\\ O_7 &= 6, & O_8 &= 4, & O_9 &= 9\end{aligned}\)

The sum of all observed frequencies is computed below:

\(\begin{aligned}n &= 83 + 58 + \cdots + 9\\ &= 250\end{aligned}\)

Let E denote the expected frequencies.

Let \(p_i\) denote the expected proportion and \(E_i = np_i\) the expected frequency of the ith leading digit, as given by Benford’s law.

| Leading digit | Benford's law: distribution of leading digits | Proportion | Expected frequency |
|---|---|---|---|
| 1 | 30.10% | \(p_1 = \frac{30.1}{100} = 0.301\) | \(E_1 = np_1 = 250\left(0.301\right) = 75.25\) |
| 2 | 17.60% | \(p_2 = \frac{17.6}{100} = 0.176\) | \(E_2 = np_2 = 250\left(0.176\right) = 44\) |
| 3 | 12.50% | \(p_3 = \frac{12.5}{100} = 0.125\) | \(E_3 = np_3 = 250\left(0.125\right) = 31.25\) |
| 4 | 9.70% | \(p_4 = \frac{9.7}{100} = 0.097\) | \(E_4 = np_4 = 250\left(0.097\right) = 24.25\) |
| 5 | 7.90% | \(p_5 = \frac{7.9}{100} = 0.079\) | \(E_5 = np_5 = 250\left(0.079\right) = 19.75\) |
| 6 | 6.70% | \(p_6 = \frac{6.7}{100} = 0.067\) | \(E_6 = np_6 = 250\left(0.067\right) = 16.75\) |
| 7 | 5.80% | \(p_7 = \frac{5.8}{100} = 0.058\) | \(E_7 = np_7 = 250\left(0.058\right) = 14.5\) |
| 8 | 5.10% | \(p_8 = \frac{5.1}{100} = 0.051\) | \(E_8 = np_8 = 250\left(0.051\right) = 12.75\) |
| 9 | 4.60% | \(p_9 = \frac{4.6}{100} = 0.046\) | \(E_9 = np_9 = 250\left(0.046\right) = 11.5\) |

Since every expected frequency is at least 5 (the smallest is 11.5), the requirements of the goodness-of-fit test are met.
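The expected frequencies and the requirement check can be reproduced in a few lines. This is a minimal sketch using the proportions and observed counts from this exercise; the variable names are illustrative and not part of the textbook solution.

```python
# Benford proportions for leading digits 1..9, as given in the exercise table.
benford = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]
# Observed frequencies of leading digits 1..9 on the author's checks.
observed = [83, 58, 27, 21, 21, 21, 6, 4, 9]

n = sum(observed)                                   # total number of checks
expected = [round(n * p, 2) for p in benford]       # E_i = n * p_i
requirement_met = all(e >= 5 for e in expected)     # every E_i must be at least 5

for digit, e in enumerate(expected, start=1):
    print(f"digit {digit}: E = {e}")
print(f"n = {n}, all expected frequencies >= 5: {requirement_met}")
```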

03

State the hypotheses

The null hypothesis for conducting the given test is as follows:

The observed frequencies of leading digits are the same as the frequencies expected from Benford’s law.

The alternative hypothesis is as follows:

The observed frequencies of leading digits are not the same as the frequencies expected from Benford’s law.

04

Conduct the hypothesis test

The table below shows the necessary calculations:

| Leading digit | O | E | \(\left( {O - E} \right)\) | \(\frac{{{{\left( {O - E} \right)}^2}}}{E}\) |
|---|---|---|---|---|
| 1 | 83 | 75.25 | 7.75 | 0.798173 |
| 2 | 58 | 44 | 14 | 4.454545 |
| 3 | 27 | 31.25 | -4.25 | 0.578 |
| 4 | 21 | 24.25 | -3.25 | 0.435567 |
| 5 | 21 | 19.75 | 1.25 | 0.079114 |
| 6 | 21 | 16.75 | 4.25 | 1.078358 |
| 7 | 6 | 14.5 | -8.5 | 4.982759 |
| 8 | 4 | 12.75 | -8.75 | 6.004902 |
| 9 | 9 | 11.5 | -2.5 | 0.543478 |

The value of the test statistic is:

\(\begin{aligned}{\chi ^2} &= \sum {\frac{{{{\left( {O - E} \right)}^2}}}{E}} \\ &= 0.798173 + 4.454545 + \cdots + 0.543478\\ &= 18.955\end{aligned}\)

Thus, \({\chi ^2} = 18.955\).

Let k be the number of categories (leading digits), which is 9.

The degrees of freedom for \({\chi ^2}\) are computed below:

\(\begin{aligned}df &= k - 1\\ &= 9 - 1\\ &= 8\end{aligned}\)
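The sum and the degrees of freedom above can be verified directly. The following is a small sketch, with illustrative variable names, that recomputes each cell's contribution from the observed and expected frequencies in this solution.

```python
# Observed and expected frequencies for leading digits 1..9 (from the solution).
observed = [83, 58, 27, 21, 21, 21, 6, 4, 9]
expected = [75.25, 44.0, 31.25, 24.25, 19.75, 16.75, 14.5, 12.75, 11.5]

# Contribution of each digit: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

chi_square = sum(contributions)   # goodness-of-fit test statistic
df = len(observed) - 1            # k - 1 categories

for digit, c in enumerate(contributions, start=1):
    print(f"digit {digit}: {c:.6f}")
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

Digits 2, 7, and 8 account for most of the statistic: 2 is observed more often than Benford's law predicts, while 7 and 8 are observed less often.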

05

State the conclusion

Using the chi-square table, the critical value of \({\chi ^2}\) at \(\alpha = 0.01\) with 8 degrees of freedom is 20.090.

The p-value is

\(\begin{aligned}p\text{-value} &= P\left( {{\chi ^2} > 18.955} \right)\\ &= 0.015\end{aligned}\)

Since the test statistic is less than the critical value and the p-value is greater than 0.01, we fail to reject the null hypothesis.

There is not enough evidence to conclude that the observed frequencies of the leading digits differ from the frequencies expected from Benford’s law.
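The p-value can be checked without a statistics package. For an even number of degrees of freedom, the right-tail probability of the chi-square distribution reduces to a finite sum, \(P\left( {\chi ^2} > x \right) = e^{-x/2}\sum_{j=0}^{df/2 - 1} \frac{(x/2)^j}{j!}\). The sketch below relies on that identity; the function name `chi2_sf_even_df` is a hypothetical helper written for this illustration, not a library call.

```python
import math

def chi2_sf_even_df(x, df):
    """Right-tail probability P(chi-square > x) for a positive even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2
    term = 1.0    # j = 0 term of the sum
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j      # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total

p_value = chi2_sf_even_df(18.955, 8)
print(f"p-value = {p_value:.3f}")
print("reject H0 at alpha = 0.01:", p_value < 0.01)
```

In practice a statistics library would normally be used for this; the hand-rolled sum is shown only because df = 8 happens to be even.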

06

Change the level of significance to 0.05

The value of the chi-square test statistic is equal to 18.955.

The critical value of \({\chi ^2}\) at \(\alpha = 0.05\) with 8 degrees of freedom is 15.507.

The p-value is

\(\begin{aligned}p\text{-value} &= P\left( {{\chi ^2} > 18.955} \right)\\ &= 0.015\end{aligned}\)

Since the test statistic value is greater than the critical value and the p-value is less than 0.05, the null hypothesis is rejected at a 0.05 level of significance.

Thus, it can be concluded that the observed frequencies of the leading digits from the amounts of checks are not the same as the expected frequencies using Benford’s law.

Thus, the conclusion changes as the level of significance changes from 0.01 to 0.05.
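The two decisions can be placed side by side with the critical-value rule. This is a brief sketch that takes the test statistic and the table critical values quoted in this solution as inputs; the dictionary names are illustrative.

```python
chi_square = 18.955  # test statistic from the solution

# Critical values for df = 8, as read from the chi-square table in the solution.
critical_values = {0.01: 20.090, 0.05: 15.507}

decisions = {}
for alpha, critical in critical_values.items():
    # Right-tailed test: reject H0 when the statistic exceeds the critical value.
    decisions[alpha] = "reject H0" if chi_square > critical else "fail to reject H0"
    print(f"alpha = {alpha}: critical value = {critical}, decision = {decisions[alpha]}")
```

The statistic falls between the two critical values, which is why the conclusion depends on the chosen significance level.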
