Comscore data for January 2008 paints a slightly different picture, although the relative positions/trends hold up. The total number of searches conducted at ‘core’ engines in January 2008 was 10.5B.
Other notable sites with major query volumes include eBay (460M/month), craigslist (250M), Amazon (160M), MySpace (375M), and Facebook (100M).
Compete reports slightly different numbers for Google & Ask: for July 2007, Google comes in at 66% (4.8B queries), Yahoo at 20% (1.44B queries), Microsoft at 10% (744M queries), and Ask.com at 3.3% (244M queries). Compete must be including AOL under the Google column, which is fair. I am not sure what else is counted under Google to make it 66% (counting Google.com only plus AOL makes it 50 + 5 = 55%). The number of queries on the Ask network may also not be counted appropriately in this data set: the discrepancy between the 244M queries reported here and the 400M reported by comScore is huge.
The random network theory of Erdos and Renyi, and its cluster-friendly extension by Watts and Strogatz, both insisted that the number of nodes with k links should decrease exponentially, a much faster decay than that predicted by a power law. They both told us, in rigorous mathematical terms, that hubs do not exist.
Before Barabasi’s group proposed the scale-free model, networks were modeled as static and randomly connected. Such models could capture a small world, but not the Web or the spread of AIDS or computer viruses. Real-world networks grow, and they exhibit preferential attachment (the rich get richer). Incorporating these two conditions into a network model helps capture the scale-free structure exhibited by the Web, for instance.
Scale-free networks can also be small-world, just like the random + clustered model proposed by Watts & Strogatz, meaning that in a few hops one can get from any node to any other. The difference comes in how these networks break down and how information spreads through them. In random networks, taking many nodes out won’t affect reachability across most of the network until a large percentage of the nodes are removed. Scale-free networks hold up similarly if nodes are removed at random, which is great news for the stability of a network like the Internet. However, if key hubs are removed from a scale-free network, you immediately get islands, crippling reachability across the network.

Consider the spread of information, a virus, or a disease: if the web of intimate links in human society were random, something like AIDS may not have spread the way it did. Unfortunately, that intimate-links graph apparently is a scale-free, hub-dominated network, so once a hub gets a disease it spreads rapidly through the hub’s many links. The silver lining is that curing only the hubs, which may be cost-effective, can drastically reduce the spread of the disease; if it were a random network, we would have to cure all the nodes to stop the spread. The challenge, I would think, is figuring out who the hubs are, especially in a disease-spread scenario. In the case of computers it is relatively easy to figure that out, given how much is known about the network.
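The growth + preferential attachment recipe is easy to simulate. Below is a minimal sketch (function names and parameters are my own, not from the book) of a Barabasi-Albert-style process: each new node attaches m links, favoring already well-linked nodes, and hubs emerge on their own.

```python
import random

def barabasi_albert(n, m=2, seed=42):
    """Grow a network one node at a time; each newcomer attaches m links,
    picking targets in proportion to their current number of links."""
    rng = random.Random(seed)
    attach = []   # each node appears here once per link it has, so a
                  # uniform pick from this list IS preferential attachment
    edges = set()
    # seed network: m+1 fully connected nodes
    for i in range(m + 1):
        for j in range(i):
            edges.add((j, i))
            attach += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:          # m distinct targets
            chosen.add(rng.choice(attach))
        for t in chosen:
            edges.add((t, new))
            attach += [t, new]
    return edges

edges = barabasi_albert(2000)
degree = {}
for a, b in edges:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1
degs = sorted(degree.values())
# hubs exist: the best-connected node has far more links than the median node
print("median degree:", degs[len(degs) // 2], "max degree:", degs[-1])
```

Contrast this with the random model below, where the maximum degree stays close to the average.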
Nielsen figures have been different, though the Google trend holds up. Ask and AOL swap places in the Nielsen figures. Per Nielsen, Google has 56% share, Yahoo 22%, MSN 8%, AOL 5%, and Ask 2% (excluding some network searches, I think).
According to HitWise, Google accounted for 65% of all US searches in a 4-week period in May 2007. Yahoo stood at 21%, MSN 8.4%, and Ask 4%.
Yahoo & MSN stats are quite close in the Hitwise and Nielsen measurements (21% and 8% respectively), and their comScore numbers are not far off either (26% and 10%). Ask is also in the 4-5% range per Hitwise and comScore (the 2% reported by Nielsen is due to the exclusion of Ask’s network searches, I believe). Only the Google numbers vary a lot across the various measurement systems. Differences could be due to sampling methodology, the way searches/queries are counted, etc.
So what would it take to double the GDP from here on? If India grows at 9% per annum, it would take 8 years, i.e. 2015. I would think that once an economy hits $1T, it kicks into higher gear and things accelerate further. So the next trillion may happen sooner, say 2013-2014, assuming growth of 10-11%. Crippling infrastructure (power shortages, clean water, roads, ports) could be our only brake. Agriculture: although agriculture’s share of GDP is declining, a majority of the country is engaged in it, so agriculture’s performance is key to driving up consumer demand, which clearly is a big chunk of the GDP.
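The doubling estimate is just compound-growth arithmetic: years to double = ln(2) / ln(1 + r). A quick sketch to check the numbers above:

```python
import math

def years_to_double(rate):
    """Years for GDP to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + rate)

print(round(years_to_double(0.09), 1))   # 9% growth: about 8 years
print(round(years_to_double(0.105), 1))  # midpoint of 10-11%: about 7 years
```

This is the same math behind the rule of 72 (72 / 9 = 8 years).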
“8:40 p.m.: Q.: Advice for the upcoming entrepreneur?
Gates: The idea of being at the forefront and increasing in size has been one of our greatest challenges. Our business is really about the passion.
Jobs: If you don’t love it, you’re going to fail. You’ve got to love it and you’ve got to have passion. And you’ve got to be a great talent scout, you can only build a great organization around great people.”
Being successful is all about having “passion” for what you do. That really is the essence. I have seen this in my own work: when I am passionate about what I do and really love it, that’s when my best creativity comes forward. When I don’t like what I do, I don’t even come close.
Strength of Weak Ties
Mark Granovetter identified a critical element in modeling real-world networks, called the strength of weak ties. In this model, we have very close ties to a few friends, forming a complete graph, implying all our friends are friends of one another too (strong ties). Some members of our close-friend circle have acquaintance relationships (weak ties) with others, who in turn have their own friend circles. So the entire human network is a connected graph made of lumps of close friends (strong ties), joined to other lumps through weak ties.
These weak ties are apparently what help us find jobs, at least better than our strong ties do. Weak ties lead us to new worlds and new opportunities that neither we nor our strongly-tied friends know of. Our close-friend circle is presumably aware of the same opportunities we are, so it is unlikely to open new doors.
Contrast this with the random model of Erdos and Renyi: in that model, any two arbitrary nodes are just as likely to be connected as our close friends are! That seems quite unlikely given what we know of our world. Granovetter says that social networks are not random: our close friends form a near-complete graph (strong ties) with a high clustering coefficient, and we are tied to acquaintances through weak ties.
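A toy illustration of the clustering coefficient (the people and names here are hypothetical, just for the sketch): a node’s clustering coefficient is the fraction of its neighbor pairs that are themselves linked, so a tight friend circle scores high while a weak-tie acquaintance contributes nothing to it.

```python
def clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2 * links / (k * (k - 1))

# Hypothetical: 'me' with three mutual close friends (strong ties)
# plus one acquaintance, 'dan' (weak tie), outside the circle.
adj = {
    "me":  {"ann", "bob", "cat", "dan"},
    "ann": {"me", "bob", "cat"},
    "bob": {"me", "ann", "cat"},
    "cat": {"me", "ann", "bob"},
    "dan": {"me"},   # weak tie: knows none of my close friends
}
print(clustering(adj, "me"))   # 3 of 6 neighbor pairs linked -> 0.5
print(clustering(adj, "ann"))  # all her neighbors know each other -> 1.0
```

In a purely random Erdos-Renyi world, this coefficient would be tiny for everyone; Granovetter’s point is that real social graphs keep it high inside the lumps.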
Duncan Watts & Steven Strogatz proposed a model where people are envisioned to live on a circle. We are connected to the nodes next to us and also to the ones one step beyond our immediate neighbors. This network offers a highly clustered world model, like Granovetter imagined, but it is also a large world: it would take many steps to reach a node diametrically opposite on the circle. Watts & Strogatz went on to add a few random links between distant nodes on the circle. This suddenly shrunk the distance not just between the newly linked nodes but also between their neighbors. Importantly, a few such long-distance links are enough to reduce the overall average separation between nodes, so the model accommodates the six-degrees world view as well. A few nodes/people have links to people living far off and thereby become bridges/connectors, reducing the hopping distance.
According to the book “this [Watts & Strogatz] model offered an elegant compromise between the completely random world of Erdos and Renyi, which is a small world but hostile to circles of friends, and a regular lattice, which displays high clustering but in which nodes are far from each other.”
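The shortcut effect is easy to verify with a small simulation (a sketch with sizes and seeds of my own choosing): build a ring lattice, measure the average shortest-path length, then add just a handful of random long-range links and measure again.

```python
import random
from collections import deque

def ring_lattice(n, k=2):
    """Ring where each node links to its k nearest neighbors on each side."""
    edges = set()
    for i in range(n):
        for d in range(1, k + 1):
            edges.add(frozenset((i, (i + d) % n)))
    return edges

def avg_path_length(n, edges):
    """Average shortest-path length over all node pairs (BFS from each node)."""
    adj = {i: set() for i in range(n)}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    total = pairs = 0
    for src in range(n):
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

n = 200
lattice = ring_lattice(n)
rng = random.Random(1)
with_shortcuts = set(lattice)
for _ in range(10):   # a handful of random long-range links
    with_shortcuts.add(frozenset(rng.sample(range(n), 2)))

lattice_apl = avg_path_length(n, lattice)
shortcut_apl = avg_path_length(n, with_shortcuts)
print(round(lattice_apl, 1), "->", round(shortcut_apl, 1))
```

Ten extra links out of 400 cut the average separation sharply while leaving the local clustering essentially intact, which is exactly the compromise the book describes.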
Six degrees of Separation
In 1967 Stanley Milgram, a Harvard professor, ran an interesting experiment. He chose two people in Boston as the targets and sent letters to randomly chosen people in the midwest (Omaha, Wichita). He asked these people to send the letters to the targets if they knew them directly; if not, they were to send the letters to personal acquaintances who they thought might know the targets directly. Apparently 42 of the 160 letters he sent made it to the targets (26%). The median number of intermediaries required to reach a target was 5.5; rounding up to 6 gives the famous claim that “any two people are separated by six degrees of separation”.
This model of the world says that we live in a small world: any two nodes in a large human network can, on average, be reached via 6 links. This does not imply that reaching a node 6 links away is easy, because at each node you would have to know which outbound link to pursue to get to your target. Without that knowledge, the search space grows exponentially and it becomes practically impossible to navigate to the target.
Random Network model
Paul Erdos and Alfred Renyi, both great mathematicians, assumed that complex networks are essentially random. Start with a large set of unconnected nodes and begin adding links between randomly chosen pairs. After a while, most of the nodes will be connected and each node will have approximately the same number of links. There may be some outliers with far more or far fewer links than most, but in general most nodes end up with approximately the same number of links, following a Poisson distribution.
Imagine a cocktail party with a large number of guests. You incentivize your guests to pass on a secret by introducing it to one node or a few nodes. The guests have to make acquaintances to pass on the message. Will the message reach everyone? Almost everyone, is the answer. If you plot a histogram of how many guests made 1, 2, … N acquaintances, the distribution will turn out to be Poisson, as per the random network model of Erdos and Renyi. A majority of the guests will have made the same number of acquaintances, and on either side of the peak the distribution diminishes rapidly, indicating that extreme variations are very rare.
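A quick sketch of the party as an Erdos-Renyi random graph (the sizes and probability are my own choices): link each pair of guests with a small fixed probability and look at the spread of acquaintance counts. Degrees hug the mean and no hubs appear.

```python
import random

def erdos_renyi(n, p, seed=7):
    """Link each pair of nodes independently with probability p;
    return the resulting degree (acquaintance count) of every node."""
    rng = random.Random(seed)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree

deg = erdos_renyi(1000, 0.006)   # ~6 expected acquaintances per guest
mean = sum(deg) / len(deg)
# the best-connected guest is only a few times above average, not a hub
print("mean degree:", round(mean, 2), "max degree:", max(deg))
```

Compare this with the preferential-attachment sketch earlier, where the maximum degree is an order of magnitude above the median: that gap is the difference between a Poisson world and a scale-free one.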
It is worth noting that Erdos and Renyi did not intend to model real-world phenomena like web-page distributions, cell-phone distributions, etc., with their random network model. They were purely interested in the mechanics of graphs.
To summarize, the random network model states that the average is the norm: most people have the same number of acquaintances, very few people know tons of others, and very few are completely isolated. This model does not explain what real-world networks actually look like; other models were derived to explain that.