The Initial (or Seed) values
The strangest part of the formula is the little "d".
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
What good does it do to multiply every page's PageRank by the same factor - and what's the purpose of adding (1-d) to every PageRank?
Consider the simple set-up to the left: Page1 links to all the other pages, and every subpage links back to Page1. This is specified in the spreadsheet as shown below:
How to specify the simple structure
Initial value set to 10
Initial value set to minus 1
Try the spreadsheet with different set-ups and different initial values: given enough iterations, the average PageRank always converges to 1, as long as you don't have any dead ends.
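If you don't have the spreadsheet at hand, the same experiment can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions that are not taken from the spreadsheet: the structure has three subpages, d is 0.85, and all pages are updated at once in each iteration. The names `pagerank`, `average` and `links` are made up for the illustration.

```python
def pagerank(links, d=0.85, initial=1.0, iterations=200):
    """Iterate PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    links maps each page to the list of pages it links to.
    Every page must have at least one outgoing link (no dead ends).
    """
    pr = {page: initial for page in links}
    for _ in range(iterations):
        # Every page starts each round with the (1-d) term.
        new_pr = {page: 1 - d for page in links}
        for source, targets in links.items():
            # A page splits its current PageRank evenly over its links.
            share = pr[source] / len(targets)
            for target in targets:
                new_pr[target] += d * share
        pr = new_pr
    return pr


def average(pr):
    return sum(pr.values()) / len(pr)


# Assumed structure: page 1 links to pages 2, 3 and 4, and each links back.
links = {1: [2, 3, 4], 2: [1], 3: [1], 4: [1]}

for seed in (10.0, -1.0):
    ranks = pagerank(links, initial=seed)
    print("seed", seed, "average", average(ranks), "ranks", ranks)
```

Both seed values should end up with the same PageRanks and an average of 1; only the number of iterations needed to get there depends on the seed.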
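So that's the purpose of adding (1-d) to every PageRank and multiplying by a damping factor d: to ensure that the average PageRank becomes 1, whatever the initial values. At Stanford they put it this way: "Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one". Strictly speaking, that statement fits the variant of the formula that uses (1-d)/N, where N is the number of pages; with the formula above, the PageRanks sum to N, which is the same thing as an average of 1.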
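Why the average ends up at 1 can be worked out in a few lines. The notation here is introduced only for this sketch: N is the number of pages, PR_k is the PageRank after k iterations, and S_k is the sum of all PageRanks after k iterations. It assumes that all pages are updated at once in each iteration and that no page is a dead end, so every page hands out its whole PageRank across its C(T) links.

```latex
\begin{align*}
S_{k+1} &= \sum_{A} PR_{k+1}(A)
         = N(1-d) + d \sum_{T} C(T)\,\frac{PR_k(T)}{C(T)}
         = N(1-d) + d\,S_k \\
S_{k+1} - N &= d\,(S_k - N) \\
S_k - N &= d^{\,k}\,(S_0 - N) \;\longrightarrow\; 0
         \quad \text{for } 0 \le d < 1
\end{align*}
```

So whatever the sum of the initial values S_0 is, the gap to N shrinks by a factor d in every iteration, and the average S_k/N approaches 1.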
Now, let's look at the value of the damping factor.