Benford’s Law. According to Benford’s law, a variety of different data sets include numbers with leading (first) digits that follow the distribution shown in the table below. In Exercises 21–24, test for goodness-of-fit with the distribution described by Benford’s law.

Leading digit    Benford's law: distribution of leading digits
1                30.10%
2                17.60%
3                12.50%
4                9.70%
5                7.90%
6                6.70%
7                5.80%
8                5.10%
9                4.60%
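The percentages in the table agree with the usual closed form of Benford's law, \(P(d) = \log_{10}(1 + 1/d)\), rounded to one decimal place. The exercise itself only supplies the table, so the formula is brought in here as background. A minimal sketch (the variable name `benford` is illustrative):

```python
import math

# Benford's law: P(d) = log10(1 + 1/d) for leading digits d = 1, ..., 9.
# The formula is background knowledge; the exercise only gives the table.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for digit, prob in benford.items():
    # Print each probability as a percentage with one decimal place.
    print(f"{digit}: {100 * prob:.1f}%")
```

The nine probabilities sum to 1, because the sum of the logarithms telescopes to \(\log_{10} 10\).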

Author’s Computer Files The author recorded the leading digits of the sizes of the electronic document files for the current edition of this book. The leading digits have frequencies of 55, 25, 17, 24, 18, 12, 12, 3, and 4 (corresponding to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively). Using a 0.05 significance level, test for goodness-of-fit with Benford’s law.

Short Answer

There is not enough evidence to conclude that the observed frequencies of the leading digits of the sizes of the electronic document files are not the same as the frequencies expected from Benford’s law.

Step by step solution

01

Given information

The frequencies of the leading digits of the sizes of the author's electronic document files are recorded.

02

Check the requirements

Assume that random sampling is conducted.

Let O denote the observed frequencies of the leading digits.

The observed frequencies are noted below:

\(O_1 = 55,\; O_2 = 25,\; O_3 = 17,\; O_4 = 24,\; O_5 = 18,\; O_6 = 12,\; O_7 = 12,\; O_8 = 3,\; O_9 = 4\)

The sum of all observed frequencies is computed below:

\(\begin{aligned}n &= 55 + 25 + \cdots + 4\\ &= 170\end{aligned}\)

Let E denote the expected frequencies.

Let \(p_i\) and \(E_i = np_i\) denote the expected proportion and the expected frequency of the i-th leading digit, as given by Benford’s law.

Leading digit    Benford's law    Proportion \(p_i\)    Expected frequency \(E_i = np_i\)
1                30.10%           \(p_1 = 0.301\)       \(E_1 = 170(0.301) = 51.17\)
2                17.60%           \(p_2 = 0.176\)       \(E_2 = 170(0.176) = 29.92\)
3                12.50%           \(p_3 = 0.125\)       \(E_3 = 170(0.125) = 21.25\)
4                9.70%            \(p_4 = 0.097\)       \(E_4 = 170(0.097) = 16.49\)
5                7.90%            \(p_5 = 0.079\)       \(E_5 = 170(0.079) = 13.43\)
6                6.70%            \(p_6 = 0.067\)       \(E_6 = 170(0.067) = 11.39\)
7                5.80%            \(p_7 = 0.058\)       \(E_7 = 170(0.058) = 9.86\)
8                5.10%            \(p_8 = 0.051\)       \(E_8 = 170(0.051) = 8.67\)
9                4.60%            \(p_9 = 0.046\)       \(E_9 = 170(0.046) = 7.82\)

As all the expected values are higher than 5, the requirements of the test are satisfied.
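The expected frequencies and the requirement check can be reproduced with a few lines of Python. This is a sketch using the observed counts and the tabulated Benford proportions from the exercise; the variable names are illustrative.

```python
# Observed leading-digit counts for digits 1..9 (from the exercise).
observed = [55, 25, 17, 24, 18, 12, 12, 3, 4]

# Benford proportions as tabulated (rounded to three decimals).
proportions = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

n = sum(observed)  # total sample size

# Expected frequency for each digit: E_i = n * p_i, rounded to two decimals.
expected = [round(n * p, 2) for p in proportions]

# Requirement for the chi-square goodness-of-fit test: every E_i is at least 5.
requirement_met = all(e >= 5 for e in expected)

print(n, expected, requirement_met)
```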

03

State the hypotheses

The null hypothesis for conducting the given test is as follows:

The observed frequencies of leading digits are the same as the frequencies expected from Benford’s law.

The alternative hypothesis is as follows:

The observed frequencies of leading digits are not the same as the frequencies expected from Benford’s law.

The test is right-tailed.

If the value of the test statistic is greater than the critical value, the null hypothesis is rejected. (The \({\chi ^2}\) statistic is never negative, so no absolute value is needed.)

04

Conduct the hypothesis test

The table below shows the necessary calculations:

Leading digit    O     E        \(O - E\)    \(\frac{(O - E)^2}{E}\)
1                55    51.17    3.83         0.286670
2                25    29.92    -4.92        0.809037
3                17    21.25    -4.25        0.850000
4                24    16.49    7.51         3.420261
5                18    13.43    4.57         1.555093
6                12    11.39    0.61         0.032669
7                12    9.86     2.14         0.464462
8                3     8.67     -5.67        3.708062
9                4     7.82     -3.82        1.866036

The value of the test statistic is equal to:

\(\begin{aligned}\chi^2 &= \sum \frac{(O - E)^2}{E}\\ &= 0.286670 + 0.809037 + \cdots + 1.866036\\ &= 12.992\end{aligned}\)

Thus, \({\chi ^2} = 12.992\).

Let k be the number of digits, equal to 9.

The degrees of freedom for \({\chi ^2}\) are computed below:

\(\begin{aligned}df &= k - 1\\ &= 9 - 1\\ &= 8\end{aligned}\)
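As a check on the arithmetic, the test statistic and the degrees of freedom can be recomputed directly from the observed and expected frequencies. The sketch below uses plain Python with illustrative variable names.

```python
# Observed counts and expected frequencies (E_i = 170 * p_i) for digits 1..9.
observed = [55, 25, 17, 24, 18, 12, 12, 3, 4]
expected = [51.17, 29.92, 21.25, 16.49, 13.43, 11.39, 9.86, 8.67, 7.82]

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: number of categories minus one.
df = len(observed) - 1

print(round(chi_square, 3), df)
```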

05

State the conclusion

The critical value of \({\chi ^2}\) at \(\alpha = 0.05\) with 8 degrees of freedom is equal to 15.507, taken from the chi-square table.

The p-value is:

\(\begin{aligned}p\text{-value} &= P\left( {\chi ^2} > 12.992 \right)\\ &= 0.112\end{aligned}\)

Since the test statistic (12.992) is less than the critical value (15.507) and the p-value (0.112) is greater than 0.05, we fail to reject the null hypothesis.

There is not enough evidence to conclude that the observed frequencies of the leading digits of the sizes of the electronic document files are not the same as the frequencies expected from Benford’s law.
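The p-value and the decision can be double-checked without a statistics library. For an even number of degrees of freedom, the right-tail chi-square probability has the closed form \(e^{-x/2}\sum_{j=0}^{df/2-1} (x/2)^j / j!\). The sketch below relies on that identity; the helper name `chi2_sf_even_df` is hypothetical, and in practice a statistics package would normally be used instead.

```python
import math


def chi2_sf_even_df(x, df):
    """Right-tail probability P(chi-square > x); valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = x / 2
    # exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


chi_square = 12.992      # test statistic from the solution
df = 8                   # degrees of freedom
alpha = 0.05             # significance level
critical_value = 15.507  # from the chi-square table

p_value = chi2_sf_even_df(chi_square, df)

# Right-tailed test: reject only if the statistic exceeds the critical value
# (equivalently, if the p-value is below alpha).
reject_null = chi_square > critical_value

print(round(p_value, 3), reject_null)
```

Evaluating the same helper at the critical value 15.507 gives a tail probability of about 0.05, consistent with the table lookup.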

Most popular questions from this chapter

The accompanying table is from a study conducted with the stated objective of addressing cell phone safety by understanding why we use a particular ear for cell phone use. (See “Hemispheric Dominance and Cell Phone Use,” by Seidman, Siegel, Shah, and Bowyer, JAMA Otolaryngology—Head & Neck Surgery, Vol. 139, No. 5.)

The goal was to determine whether the ear choice is associated with auditory or language brain hemispheric dominance. Assume that we want to test the claim that handedness and cell phone ear preference are independent of each other.

a. Use the data in the table to find the expected value for the cell that has an observed frequency of 3. Round the result to three decimal places.

b. What does the expected value indicate about the requirements for the hypothesis test?

                 Right Ear    Left Ear    No Preference
Right-Handed     436          166         40
Left-Handed      16           50          3

In Exercises 5–20, conduct the hypothesis test and provide the test statistic and the P-value and/or critical value, and state the conclusion.

Baseball Player Births In his book Outliers, author Malcolm Gladwell argues that more baseball players have birth dates in the months immediately following July 31, because that was the age cutoff date for nonschool baseball leagues. Here is a sample of frequency counts of months of birth dates of American-born Major League Baseball players starting with January: 387, 329, 366, 344, 336, 313, 313, 503, 421, 434, 398, 371. Using a 0.05 significance level, is there sufficient evidence to warrant rejection of the claim that American-born Major League Baseball players are born in different months with the same frequency? Do the sample values appear to support Gladwell’s claim?

Alert nurses at the Veteran’s Affairs Medical Center in Northampton, Massachusetts, noticed an unusually high number of deaths at times when another nurse, Kristen Gilbert, was working. Those same nurses later noticed missing supplies of the drug epinephrine, which is a synthetic adrenaline that stimulates the heart. Kristen Gilbert was arrested and charged with four counts of murder and two counts of attempted murder. When seeking a grand jury indictment, prosecutors provided a key piece of evidence consisting of the table below. Use a 0.01 significance level to test the defense claim that deaths on shifts are independent of whether Gilbert was working. What does the result suggest about the guilt or innocence of Gilbert?

                           Shifts With a Death    Shifts Without a Death
Gilbert Was Working        40                     217
Gilbert Was Not Working    34                     1350

Tax Cheating? Frequencies of leading digits from IRS tax files are 152, 89, 63, 48, 39, 40, 28, 25, and 27 (corresponding to the leading digits of 1, 2, 3, 4, 5, 6, 7, 8, and 9, respectively, based on data from Mark Nigrini, who provides software for Benford data analysis). Using a 0.05 significance level, test for goodness-of-fit with Benford’s law. Does it appear that the tax entries are legitimate?

In Exercises 5–20, conduct the hypothesis test and provide the test statistic and the P-value and/or critical value, and state the conclusion.

Police Calls Repeat Exercise 11 using these observed frequencies for police calls received during the month of March: Monday (208); Tuesday (224); Wednesday (246); Thursday (173); Friday (210); Saturday (236); Sunday (154). What is a fundamental error with this analysis?
