
If \(R_{i}\) denotes the random amount that is earned in period \(i\), then \(\sum_{i=1}^{\infty} \beta^{i-1} R_{i}\), where \(0<\beta<1\) is a specified constant, is called the total discounted reward with discount factor \(\beta\). Let \(T\) be a geometric random variable with parameter \(1-\beta\) that is independent of the \(R_{i}\). Show that the expected total discounted reward is equal to the expected total (undiscounted) reward earned by time \(T\). That is, show that $$ E\left[\sum_{i=1}^{\infty} \beta^{i-1} R_{i}\right]=E\left[\sum_{i=1}^{T} R_{i}\right] $$

Short Answer

Expert verified
In conclusion, using the properties of geometric random variables, the linearity of expectation, and the independence of \(T\) and \(R_i\), we have shown that the expected total discounted reward is equal to the expected total undiscounted reward earned by time \(T\). Specifically, we have demonstrated that: $$ E\left[\sum_{i=1}^{\infty} \beta^{i-1} R_{i}\right]=E\left[\sum_{i=1}^{T} R_{i}\right]. $$

Step by step solution

01

Step 1: Expected Total Discounted Reward

To find the expected total discounted reward, we use the linearity of expectation: for any random variable \(X\) and constant \(c\), \(E[cX] = cE[X]\), and the expectation of a sum is the sum of the expectations. (Exchanging the expectation with the infinite sum is justified, for instance, when the rewards are nonnegative, by monotone convergence.) This gives $$ E\left[\sum_{i=1}^{\infty} \beta^{i-1} R_{i}\right] = \sum_{i=1}^{\infty} E\left[\beta^{i-1} R_{i}\right]. $$ Since \(\beta^{i-1}\) is a constant for each fixed \(i\), we can write $$ \sum_{i=1}^{\infty} E\left[\beta^{i-1} R_{i}\right] = \sum_{i=1}^{\infty} \beta^{i-1} E[R_i]. $$
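As a quick numerical check of this last expression, here is a minimal Python sketch. It assumes, purely for illustration, that the rewards share a common mean \(\mu = 2\) (the problem fixes no reward distribution) and truncates the infinite series at a large \(N\):

```python
# Minimal sketch: sum_{i=1}^{N} beta^{i-1} * E[R_i], with E[R_i] = mu (illustrative)
beta = 0.9     # discount factor, 0 < beta < 1
mu = 2.0       # assumed common mean E[R_i]; not fixed by the problem
N = 10_000     # truncation point for the infinite series

# Geometric series: mu * (1 - beta**N) / (1 - beta), essentially mu / (1 - beta)
discounted = sum(beta ** (i - 1) * mu for i in range(1, N + 1))
print(discounted, mu / (1 - beta))  # both close to 20.0
```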
02

Step 2: Expected Total Undiscounted Reward by Time \(T\)

We now evaluate the expected total undiscounted reward earned by time \(T\) by conditioning on \(T\). Since \(T\) is independent of the \(R_i\), the joint distribution factors; for discrete rewards, the joint probability mass function (PMF) of \(T=t\) and \(R_i=r_i\) for \(i=1,2,\dots,t\) is $$ P(T=t, R_1=r_1, R_2=r_2, \dots, R_t=r_t) = P(T=t) \prod_{i=1}^{t} P(R_i=r_i). $$ Consequently, conditioning on \(T=t\) does not change the distribution of the rewards, and for each fixed \(t\), $$ E\left[\sum_{i=1}^{t} R_{i} \;\middle|\; T=t\right] = \sum_{i=1}^{t} E[R_i]. $$ Summing over the possible values of \(T\), weighted by its PMF, yields $$ E\left[\sum_{i=1}^{T} R_{i}\right] = \sum_{t=1}^{\infty} P(T=t) \sum_{i=1}^{t} E[R_i]. $$
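The last identity can be checked by simulation. Below is a minimal sketch under the illustrative assumption of i.i.d. exponential rewards with mean \(\mu\); any rewards independent of \(T\) would do. Since \(E[T] = 1/(1-\beta)\) for a geometric random variable with parameter \(1-\beta\), the estimate should land near \(\mu/(1-\beta)\):

```python
import numpy as np

rng = np.random.default_rng(0)
beta, mu, trials = 0.9, 2.0, 100_000

# T ~ Geometric(1 - beta) on {1, 2, ...}, drawn independently of the rewards
T = rng.geometric(1 - beta, size=trials)

# Conditional on T = t, the sum of t i.i.d. mean-mu rewards has mean t * mu,
# so E[sum_{i=1}^T R_i] = E[T] * mu = mu / (1 - beta)
totals = np.array([rng.exponential(mu, size=t).sum() for t in T])
print(totals.mean(), mu / (1 - beta))  # agree up to Monte Carlo error
```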
03

Step 3: Show That Both Expressions Are Equal

We can now show that the two expressions are equal. Since \(T\) is a geometric random variable with parameter \(1-\beta\), its PMF is \(P(T=t) = \beta^{t-1}(1-\beta)\) for \(t = 1, 2, \dots\): the process "continues" through each of the first \(t-1\) periods with probability \(\beta\) and "stops" with probability \(1-\beta\). Therefore $$ E\left[\sum_{i=1}^{T} R_{i}\right] = \sum_{t=1}^{\infty} P(T=t) \sum_{i=1}^{t} E[R_i] = \sum_{t=1}^{\infty} \beta^{t-1}(1-\beta) \sum_{i=1}^{t} E[R_i]. $$ Interchanging the order of summation (the term \(E[R_i]\) appears once for every \(t \ge i\)), we obtain $$ \sum_{t=1}^{\infty} \beta^{t-1}(1-\beta) \sum_{i=1}^{t} E[R_i] = \sum_{i=1}^\infty E[R_i] \sum_{t=i}^\infty \beta^{t-1}(1-\beta). $$ The inner sum is a geometric series: $$ \sum_{t=i}^{\infty} \beta^{t-1}(1-\beta) = (1-\beta)\,\frac{\beta^{i-1}}{1-\beta} = \beta^{i-1}, $$ which is exactly \(P(T \ge i)\). Substituting this back and applying Step 1 gives $$ \sum_{i=1}^\infty \beta^{i-1} E[R_i] = E\left[\sum_{i=1}^{\infty} \beta^{i-1} R_{i}\right]. $$ Therefore, we have shown that the expected total discounted reward is equal to the expected total undiscounted reward earned by time \(T\). That is, $$ E\left[\sum_{i=1}^{\infty} \beta^{i-1} R_{i}\right]=E\left[\sum_{i=1}^{T} R_{i}\right]. $$
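Finally, a Monte Carlo sketch comparing both sides directly, under the same illustrative i.i.d. exponential setup (an assumption, not part of the problem): both estimates should land near \(\mu/(1-\beta) = 20\), agreeing up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(1)
beta, mu = 0.9, 2.0
trials, N = 20_000, 200  # N truncates the discounted sum; beta**N ~ 7e-10

# Left side: E[sum_i beta^{i-1} R_i], estimated from i.i.d. exponential rewards
R = rng.exponential(mu, size=(trials, N))
lhs = (R @ (beta ** np.arange(N))).mean()

# Right side: E[sum_{i=1}^T R_i] with T ~ Geometric(1 - beta), independent of the R_i
T = rng.geometric(1 - beta, size=trials)
rhs = np.array([rng.exponential(mu, size=t).sum() for t in T]).mean()

print(lhs, rhs)  # both near mu / (1 - beta) = 20.0
```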


Most popular questions from this chapter

\(A, B\), and \(C\) are evenly matched tennis players. Initially \(A\) and \(B\) play a set, and the winner then plays \(C\). This continues, with the winner always playing the waiting player, until one of the players has won two sets in a row. That player is then declared the overall winner. Find the probability that \(A\) is the overall winner.

Two players alternate flipping a coin that comes up heads with probability \(p\). The first one to obtain a head is declared the winner. We are interested in the probability that the first player to flip is the winner. Before determining this probability, which we will call \(f(p)\), answer the following questions. (a) Do you think that \(f(p)\) is a monotone function of \(p\)? If so, is it increasing or decreasing? (b) What do you think is the value of \(\lim_{p \rightarrow 1} f(p)\)? (c) What do you think is the value of \(\lim_{p \rightarrow 0} f(p)\)? (d) Find \(f(p)\).

A coin having probability \(p\) of coming up heads is continually flipped. Let \(P_{j}(n)\) denote the probability that a run of \(j\) successive heads occurs within the first \(n\) flips. (a) Argue that $$ P_{j}(n)=P_{j}(n-1)+p^{j}(1-p)\left[1-P_{j}(n-j-1)\right] $$ (b) By conditioning on the first non-head to appear, derive another equation relating \(P_{j}(n)\) to the quantities \(P_{j}(n-k)\), \(k=1, \ldots, j\).

Suppose that \(X\) and \(Y\) are independent random variables with probability density functions \(f_{X}\) and \(f_{Y}\). Determine a one-dimensional integral expression for \(P[X+Y<x]\).

Each element in a sequence of binary data is either 1 with probability \(p\) or 0 with probability \(1-p\). A maximal subsequence of consecutive values having identical outcomes is called a run. For instance, if the outcome sequence is \(1,1,0,1,1,1,0\), the first run is of length 2, the second is of length 1, and the third is of length 3. (a) Find the expected length of the first run. (b) Find the expected length of the second run.
