Expectation Sunday, May 2 2010

I love the subject of probability. Almost every book I’ve read on Probability, the explanation of expectation has been unintuitive and de-motivating. Except a few – one that is written by my professor, Prof. Norm Matloff, which is available here.

Consider that I am buying 10 oranges, which sums up to a weight of 1lb. Now the weight of one orange on an average is ${\frac{1}{10}}$lb. This concept is so simple that there is nothing to explain here. We all know what an average value means.

Expectation is nothing but an average value. Assume the case of throwing a die ${n}$ times. Let us assume we have a variable that marks the value we get for every throw. Let ${X_i}$ denote the value we get in the ${i^{\text{th}}}$ throw. So after ${n}$ throws I get a sum of values:

$\displaystyle X_1 + X_2 + ... + X_i + ... + X_n$

.

Let us denote the sum by a variable ${S}$; then: ${S = X_1 + X_2 + ... + X_i + ... + X_n}$.

Now what is the average value that we get for each throw. That would be ${\frac{S}{n}}$.

As a side note, these variables, the ${\left(X_i'\right)s}$ are called random variables. Why are they called random variables? Just because they are initialized to some value randomly. I can not say ${X_1}$ will always be 1. All I can say is ${X_1}$ will be an integer between 1 and 6.

Let us try rewriting our equation for the sum ${S}$ in a fancy way:

$\displaystyle \begin{array}{rcl} S = & & (0. \text{number of times we throw a 0}\\ & + & 1.\text{number of times we throw a 1}\\ & + & ... \\ & + & 6.\text{number of times we throw a 6}) \end{array}$

If we take this sum over our ${n}$ throws, we get the exact sum ${S}$.

With this new equation if we rewrite our average:

$\displaystyle \begin{array}{rcl} \frac{S}{n} = & & (0. \text{number of times we throw a 0}+ \\ & & 1.\text{number of times we throw a 1}+\\ & & ...+\\ & & 6.\text{number of times we throw a 6}) / n \end{array}$

That is

$\displaystyle \begin{array}{rcl} \frac{S}{n} & = &(\frac{0. \text{number of times we throw a 0}}{n})+\\ & &(\frac{1. \text{number of times we throw a 1}}{n})+\\ & & ...+\\ & &(\frac{6. \text{number of times we throw a 6}}{n}) \end{array}$

$\displaystyle \frac{\text{number of times we throw a 0}}{n} =\\ \text{probability of getting a 0 on throwing a die}$

$\displaystyle \frac{\text{number of times we throw a 1}}{n} =\\ \text{probability of getting a 1 on throwing a die}$

And so on.

Then our average becomes:

$\displaystyle \begin{array}{rcl} \frac{S}{n} & = & 0.\text{probability of getting a 0 in throwing a die}\\ & + & 1. \text{probability of getting a 1 in throwing a die} \\ & + & .... \\ & + & 6.\text{probability of getting a 6 in throwing a die} \end{array}$

We can write this concisely as:

$\displaystyle \begin{array}{rcl} \frac{S}{n} & = & \sum_{i=1}^{6}i.[\text{probability of getting an }i\text{ in throwing a die}]\\ & = & \sum_{i=1}^{6}i.P[X=i] \end{array}$

where we wrote a generic single random variable ${X}$ instead of differentiating each of them with ${X_i}$‘s.

We can call this average ${\frac{S}{n}}$ by a fancy name too, expectation. The mathematical notation of which would be ${E[X]}$.

This gives ${E[X] = \sum_{i=1}^{6}i.P[X=i] = 3.5\text{(by doing the cumbersome arithmetic)}}$, in case of rolling a die. Here the sum is taken over all possible values that ${X}$ can take.

Let us denote that as a set, named ${A}$, which in our case will be ${A = \{1, 2, ...., 6\}}$.

Using this we can write our generic equation for expectation (as found in many textbooks):

$\displaystyle E[X] = \sum_{i \in A}i.P[X=i]$

And finally, the word expectation, as my professor explains, says only one thing about that value; “you should never expect the expected value”.

Random Question (1) Sunday, Oct 25 2009

Are there transformers for P problems, like SAT in NP-Complete? For all problems X in P, there’s a single problem Y s.t. Y is polynomial time reducible to X?

Revisiting Algorithm Analysis Friday, Apr 24 2009

Yet again, another subject that I dreaded due to the incomprehensible and intangible approach in which the subject was taught. One fundamental question I had was, knowing the definition of Big O notation, what does $O(f(n))$ really mean?

Let me try to put them in my perspective. Think that I am planning to go and buy a new shirt and I do not know what its exact cost is. But to make sure that I have enough money to buy one, I kind of predict, it would cost at the maximum of say a \$1000 (I would love it as a gift, better). That is sort of the upper bound. I know that the cost will not go beyond that; even in the worst case.

Big O notation is similar to that. It defines an upperbound to the complexity of the function. Consider the case of sorting $n$ elements, the time taken (considering time complexity, we can worry about space complexity some other time) is given by the function $f(n) = n^3+n^2+1$. By the definition of Big O, the upper bound is $O(f(n)) = n^3$. We can get this by removing all constants, and lower order terms, leaving us with just $n^3$.

By definition, $f(n) = O(g(n))$ means that for some constants $n_0>0$ and $C>0$, $0 \leq f(n) \leq C.g(n)$, $\forall n \geq n_0$

Let us try to interpret this. $0 \leq f(n)$ part means that we are considering the positive values of $f(n)$.

Following shows how the graphs of the functions $f(x) = x^3+x^2+14$, $f(x) = x^3$ and $f(x) = 2.x^3$ looks like.

It is can be seen that $f(x) = 2.x^3$ is above $f(n) = x^3+x^2+1 \forall x \geq 0$.

So we have for $f(n) = n^3+n^2+1$,
$O(f(n) = g(n)$, where $g(n) = n^3, C = 2 and n_0 = 0$.

Taylor’s Formula Wednesday, Nov 14 2007

Let $f(x)$ be a function continous over the closed interval, [$a,b$] such that$f(x)$ has $(n+1)^{th}$ derivative, which is written as $f^{(n+1)}(x)$; then:

$f(b) = f(a) + f'(a) (b-a) + ... + \frac{f^{(n)}(a)}{n!} (b-a)^n + \frac{f^{(n+1)}(z)}{(n+1)!} (b-a)^{(n+1)}$

PROOF:

Let

$H = f(b) - f(a) - f'(a) (b-a) - .. - \frac{f^{(n)}(a) }{n!}(b-a)^n$

Also,

$K = \frac{H}{(b-a)^(n+1)}$

Consider a polynomial,

$g(x) = f(b) - f(x) - f'(x)(b-x) - f''(x)(b-x)^2... - \frac{f^{(n)}(x)}{n!}(b-x)^n - K(b-x)^{n+1}$

We can see that,

$g(a) = H - K(b-a)^{n+1} = 0$

Similarly,

$g(b) = f(b) - f(b) - f'(b)(b-b) - ... - \frac{f^{(n)}(b) }{n!}(b-b)^{n} - K(b-b)^{n+1} = 0$

Fuzzy Logic Tuesday, Apr 10 2007

This was one subject which kept haunting me through out the engineering curriculum. Not only was that my ma’am, teaching the subject, make any sense; but also, the text book, reminded me more of heiroglyphics.

Life went on smoothly, after some how I managed to pass the paper. Finally a job demanded me to shift to a far off place, requiring me to take care of myself. Its all easy I thought. All except a single job, washing. I hated washing. So I decided to buy one of the best washing machines around (all that my pocket could afford, though).

Seeing me, the shop keeper greeted me so well. I threw at him all my requirements. So he showed me a beautiful washing machine (which I hope works). It had a lot of features, gratifying the needs of the lazy soul inside me. He said it had an additional feature, it can automatically decide on the amount of water, detergent and all such inputs required, which is supported by fuzzy logic. Think of the devil, and here comes the devil himself. This theory, has decided with whatever mind it had, not to leave me alone, till my grave may be.

Even my shopkeeper keeps talking about fuzzy logic and its uses. The engineer inside me got hurt a lot. So a decision came to browse through a couple of pages on the web till I can spell the word fuzzy logic in a proper way.

I am now in my office. Is this sentance true. It is true if I am inside the office. It is false if I am outside the office. So what will happen if I am at the gate. It is partly true and partly false. Just like a half full glass is half empty too.

In normal, classic logic, every statement, proposition, has a value of 0 or 1, depending on it is true or false. So in a situation like the one mentioned previously, we cant assign a proper value like this. The value is 0.5 in that case.

Now, think of a statement P. Its truth value assignment is p. p=1 means the statement is true. p=0 means the statement is false. Here comes fun, think of those paradox that your friends used to bombard you with. Think, of a statement like, “this statement is false”.

Let’s call that statement P. Its truth value is p. The truth value for the converse of P (i.e. Not-P) is (1-p). The statement says, in words, that the truth value of P is (1-p). This means if p=0, p becomes (1-0) =1 and if p=1, (1-p) gives 0. So the value of p keeps oscillating. The only case where this becomes true is when p=0.5. That is permissible with fuzzy logic.

Euclidean Postulates Friday, Mar 30 2007

1. There is a unique straight line segment connecting any two points

2. Unlimited (continuous) extendability of any straight line segment .

3. Existence of a circle with any center and any value for radius .

4. Equality of right angles.

5. If two straight line segments, a and b in a plane intersect another straight line c such that the sum of the interior angles on the same side of c is less than two right angles, then a and b, extended far enough on that side of c, will intersect somewhere.

Playfair‘s Axiom is an alternate variation of the 5th Euclidean postulate (parallel postulate). It states that for any line and any point that is not on the line, there is a unique straight line through the point which is parallel to the line.

Parallel postulate has a lot of story attached to it. History says of a lot of people who tried to make the fifth postulate a theorem, derivable from the rest four. But finally Beltrami proved the independence of 5th postulate from others.

Remarkable is the effort by Saccheri, which speaks of the origins of elliptic geometry and hyperbolic geometry.

Beltrami Geometry (Stereographic Projection) Friday, Mar 30 2007

This is an alternate representaion for Hyperbolic geometry. Wikipedia says that this done by projecting a point on a sphere onto a plane tangential to the sphere at the point antinodal to the center of projection (that is the point diametrically opposite to the center of projection).

With this projection, any circle that crosses through the center of projection becomes a straight line.Any other circle (that does not touch center of projection), can become circles (possibly ellipses in the case of inclined circles).

For an example, think that we are projecting Earth. The center of projection is absolute North Pole. The plane of projection is parallel to the equitorial circle. Now consider the projection of a longitude. It will be a straight line.

Consider the case of a latitude. This becomes a circle. An extreme case will be the equator. It will be a bounding circle. It is called primitive of the projection. Any other latitude becomes a circle concentric to the primitive and inside it. Infact those latitudes in southern hemisphere will have projection with radius greater than the primitive (Still haven’t figured out how).

A beautiful description is also available from International Union of Crystallography. Stereography is used in crystallography it explains.

Hyperbolic Geometry Thursday, Mar 29 2007

What happens when sum of all angles of a triangle is not $180^\circ$. Is it possible? Can the area of a triangle be the sum of all its angles? Can a straight line be curved?

If the mind says, no. Its wrong. It is possible. Beyond the Euclidean geometry that we are used to though. The case is totally different with Hyperbolic spaces.

Think of a line drawn on a sphere, a straight line. There is no doubt that the line is perfectly straight. But when you see the line after drawing it, you can feel the curve in the line. The curvature is attributed to the spherical surface. So if you accept the curvature, or rather think of such a line on earth, the curvature is so big that we feel the plane as prefectly flat. So the line becomes perfectly straight. Thus it is the preception that makes the line straight or curved. Hyperbolic geometry is on a spherical universe. So when we try to see it in the Euclidean space, we get confused. We start thinking if the straight line is actually straight or not.

Now think of a sphere. A triangle is drawn on it. The projection is taken onto 2-d plane. The bounding circle will be the projection of the equator (actually this is one of the formal representation method for Hyperbolic geometry, called . The lines under this representation will be straight only if it is a diameter to the circle. With such a representaion, the angle doesnt always sum upto $180^\circ$, as it normally does. It falls short, and by how much depends on the area of the triangle.

$\alpha+\beta+\gamma = C\Delta$

Taking the constant C=1, we have the formula for the angle.

Yet again, this is not inside the universe of Hyperbolic geometry. Taking a triangle in a hyperbolic space, and trying to represent it in Euclidean one gives us this pain. There are alternate representations available, which keeps the lines straight, and angles add to $180^\circ$. But these add in other types of complications. It is even said that an equation for the distance between two points gets more complex this way.