a thoughtful web.
Share good ideas and conversation.   Login or Take a Tour!
Markov chains and binomial distribution - what is the nature of the relationship between them?

Hi everyone

I suspect it is quite simple actually but I don't have the knowledge of where I should be looking to solve this one... I thought I might pop this up on hubski in the meantime; I know there is a mathematical bent here. A small part of my work involves simulating sequences of years that are classified as either wet or dry. There is good reason for assuming that this behaviour can be modelled as a two-state markov chain, with different probabilities for transitioning from W>D, D>W etc.

Suppose then I have a sequence of years that I want to use to derive these transition probabilities e.g. D-WWW-DD-WW-DD-W-D-WWWW.

I can calculate the average dry and wet lengths - 1.5 and 2.5 years respectively. I can also estimate (I think) the transition probabilities by taking the number of transitions from a state over the chances that there had been to transition from that state. So assuming the first and last are not intersecting a sequence, I would have for the wet case 4 transitions over 10 wet years, i.e. P = 0.4. The same process for dry gives P = 0.67.

Now for the interesting part - I was wondering whether there was any relationship between the average length and the transition probability. Intuitively it should be the case, as the length will ultimately be determined by the behaviour caused by the probabilities.

Because each iteration behaves like a biased coin toss, I started thinking that the binomial distribution might have something to do with it. An interesting property I noticed is that 1 + the infinite sum of (P^n) from n=1 gives the average lengths as calculated above.

Of course each term of the sum here is the probability of n successess in n tests according to the binomial distribution. As I said, I am sure there is a fairly simple explanation, or perhaps I have gotten something wrong. Either way I suppose I hope someone could shed some light on this, or at least finds this interesting.

Bonus: some nice graphs from the model showing stochastic rainfall and evaporation predictions.  Cheers

De Waal

bfv  ·  34 days ago  ·  link  ·

Unless the weather is really weird in your neck of the woods, your markov chain is irreducible and ergodic, so it has a stationary distribution. With two states, the state your process ends up in is a Bernoulli random variable.

DWol  ·  34 days ago  ·  link  ·

Ah very cool... This set me off on the right track. I think I found an inductive type explanation which I will try to go through.

So supposing we find ourselves in a given state, there is a chance, p, that we will transition to the other state and hence end up with length of 1. On the other hand there is 1-p chance that we will stay, and have a length 1 + L' where L' is the length of whatever happens after that. So the length, L, of our sequence is then:

``  L = 1(p) + (1-p)(1+L')``

But L' is just the next iteration of the same kind of process that produces L, and so their expected values should be the same, so L = L' leaving:

``  L = 1/p``

Which gels with what I found above... But on the other a little learning is a dangerous thing so I am not 100% confident that it is up to scratch.

Incidentally, my class was the last set who did not have intro statistics as a core course... Instead we did electromagnetism, used approximately never in the rest of the curriculum. Meanwhile statistics knowledge is something we had to cobble together over time as required. Each time the activation energy of finding a textbook and doing it properly from the start was just a little bit too high for what was at stake. It makes me wonder about the different roles of structured or formalised learning compared to figuring it out yourself. Knowing where to start looking is one of the big types of knowledge that a curriculum affords you I think.

Considering the circumstances, now is probably as good a time as any to bury the hatchet and dive into khanacademy or whatever. Personally that is probably what I will do as this has felt like a big hole in my competency for a while now. But on a general level, I wonder about when the right way is to dive in, mess around, fail, learn by doing etc. vs. deciding that you should rather let someone else guide the process, even if it means starting at a very basic level. I'm sure we go back and forth between these, and it's probably not meaningful to try and say that there is a "optimum" way...

bfv  ·  34 days ago  ·  link  ·

That works if what you have is actually a Bernoulli process. You can tell if both rows of your transition matrix are identical.

Dala  ·  34 days ago  ·  link  ·

Considering the circumstances, now is probably as good a time as any to bury the hatchet and dive into khanacademy or whatever. Personally that is probably what I will do as this has felt like a big hole in my competency for a while now.

You should do that thing! I am working through third grade math on there right now because I have forgotten how to do pretty much all the math, and I also don't think I learned how to do any of it very well the first time. Join me!