
Question: Suppose that a Bayesian spam filter is trained on a set of 10,000 spam messages and 5000 messages that are not spam. The word “enhancement” appears in 1500 spam messages and 20 messages that are not spam, while the word “herbal” appears in 800 spam messages and 200 messages that are not spam. Estimate the probability that a received message containing both the words “enhancement” and “herbal” is spam. Will the message be rejected as spam if the threshold for rejecting spam is 0.9?

Short Answer

Answer:

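The estimated probability that the message is spam is about 0.987, which exceeds the threshold of 0.9, so the incoming message containing “enhancement” and “herbal” will be rejected.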

Step by step solution

01

Given

A Bayesian spam filter is trained on a set of 10,000 spam messages and 5000 messages that are not spam. The word “enhancement” appears in 1500 spam messages and 20 messages that are not spam, while the word “herbal” appears in 800 spam messages and 200 messages that are not spam.

To find: The probability that a message containing both words is spam, and whether the message will be rejected as spam.

02

Formula

Bayes’ Formula:

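Suppose that E is an event from a sample space S and that \({F_1},{F_2}, \ldots ,{F_n}\) are mutually exclusive events such that \(\bigcup\limits_{i = 1}^n {{F_i}} = S\). Assume that \(p(E) \ne 0\) and \(p\left( {{F_i}} \right) \ne 0\) for \(i = 1,2, \ldots ,n\). Then

\(p\left( {{F_j}\mid E} \right) = \frac{{p\left( {E\mid {F_j}} \right)p\left( {{F_j}} \right)}}{{\sum\limits_{i = 1}^n {p\left( {E\mid {F_i}} \right)p\left( {{F_i}} \right)} }}\)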
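As a sketch of how Bayes’ formula leads to the two-word spam estimate used in the calculation step, let S be the event that a message is spam, and let \({E_1}\) and \({E_2}\) be the events that it contains “enhancement” and “herbal” respectively (these symbols are introduced here for illustration). Taking \({F_1} = S\) and \({F_2} = \overline S \), and assuming that \(p(S) = p(\overline S ) = \frac{1}{2}\) and that the two words occur independently within spam and within non-spam messages:

\(\begin{array}{l}p\left( {S\mid {E_1} \cap {E_2}} \right) = \frac{{p\left( {{E_1} \cap {E_2}\mid S} \right)p(S)}}{{p\left( {{E_1} \cap {E_2}\mid S} \right)p(S) + p\left( {{E_1} \cap {E_2}\mid \overline S } \right)p(\overline S )}}\\ = \frac{{p\left( {{E_1}\mid S} \right)p\left( {{E_2}\mid S} \right)}}{{p\left( {{E_1}\mid S} \right)p\left( {{E_2}\mid S} \right) + p\left( {{E_1}\mid \overline S } \right)p\left( {{E_2}\mid \overline S } \right)}}\end{array}\)

The equal factors of \(\frac{1}{2}\) cancel in the second line, which is why only the per-word rates in spam and in non-spam messages appear in the final expression.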

03

Calculation

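Here, p(w) is the estimated probability that a spam message contains the word w, and q(w) is the estimated probability that a message that is not spam contains w:

p(enhancement) =\(\frac{{1500}}{{10000}} = 0.15\)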

q(enhancement) =\(\frac{{20}}{{5000}} = 0.004\)

p(herbal) =\(\frac{{800}}{{10000}} = 0.08\)

q(herbal) =\(\frac{{200}}{{5000}} = 0.04\)
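The four estimates above are simply relative frequencies in the training set. A minimal Python sketch of that step follows; the counts come from the problem statement, while the names `SPAM_TOTAL`, `HAM_TOTAL`, `spam_counts`, `ham_counts`, and `word_rates` are hypothetical and chosen only for this illustration.

```python
# Training-set sizes and word counts from the problem statement.
SPAM_TOTAL = 10_000   # number of spam messages
HAM_TOTAL = 5_000     # number of messages that are not spam

spam_counts = {"enhancement": 1500, "herbal": 800}
ham_counts = {"enhancement": 20, "herbal": 200}


def word_rates(word):
    """Return (p, q): the fraction of spam and of non-spam messages containing word."""
    p = spam_counts[word] / SPAM_TOTAL
    q = ham_counts[word] / HAM_TOTAL
    return p, q


for word in spam_counts:
    p, q = word_rates(word)
    print(f"{word}: p = {p:.3f}, q = {q:.3f}")
```

Dividing by the size of each class separately matters here: using 1000 instead of 10,000 for the spam set would give an impossible rate of 1.5.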

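Assume that the two words occur independently of each other, both in spam and in messages that are not spam, and that an incoming message is equally likely to be spam or not. Then the estimated probability that a message containing both words is spam is

\(\begin{array}{l}p({\rm{\;enhancement, herbal\;}}) = \frac{{p({\rm{\;enhancement\;}})p({\rm{\;herbal\;}})}}{{p({\rm{\;enhancement\;}})p({\rm{\;herbal\;}}) + q({\rm{\;enhancement\;}})q({\rm{\;herbal\;}})}}\\ = \frac{{(0.15)(0.08)}}{{(0.15)(0.08) + (0.004)(0.04)}}\\ = \frac{{0.012}}{{0.01216}}\\ \approx 0.987\end{array}\)

p(enhancement, herbal) ≈ 0.987 is greater than the threshold of 0.9.

So, the incoming message containing “enhancement” and “herbal” will be rejected.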
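The combination step and the threshold test can be checked with a short Python sketch. It assumes, as the calculation does, that the words are independent within each class and that spam and non-spam messages are equally likely; the function name `spam_probability` and the constant `THRESHOLD` are hypothetical names used only for this example.

```python
def spam_probability(p_values, q_values):
    """Estimate the probability of spam from per-word rates.

    p_values: rates of each word among spam messages
    q_values: rates of the same words among non-spam messages
    Assumes independent words and equal prior probability of spam and non-spam.
    """
    spam_term = 1.0
    ham_term = 1.0
    for p, q in zip(p_values, q_values):
        spam_term *= p   # product of the spam rates
        ham_term *= q    # product of the non-spam rates
    return spam_term / (spam_term + ham_term)


THRESHOLD = 0.9  # rejection threshold from the problem statement

# Rates for "enhancement" and "herbal" from the calculation step.
r = spam_probability([0.15, 0.08], [0.004, 0.04])
print(f"r = {r:.3f}")                               # r = 0.987
print("rejected" if r > THRESHOLD else "accepted")  # rejected
```

The same function accepts any number of words, since each additional word just contributes one more factor to each product.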

04

Final Answer

The incoming message containing “enhancement” and “herbal” will be rejected.