I like DDG, and I even converted Chrome over to use it as its primary engine. Recently, though, I'm noticing more and more spam links and fewer and fewer links that really matter. I think I'm coming to the conclusion that DDG did this in the wrong order. They are developing a great frontend on top of others' backends, but to really beat Google, Microsoft, or Yahoo you need to develop a great backend that can filter results well and troll the web efficiently; then you can put an awesome frontend on it.
I still think the real silver bullet will be a backend system that can peer with other systems to gather data and can be customized to get deep results from a small subset of the web that interests a particular user (think of a corporation using it for internal search and to make its own site show up better on a common frontend). I even started a research project to begin working on some of the things needed to accomplish it [1].
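To make the peering idea a bit more concrete, here is a rough sketch in Python (made-up names; this is not code from SimpleMapReduce) of what the interface could look like: every peer answers the same query call, and a common frontend just fans the query out and merges whatever comes back. A corporate peer would simply be one whose index covers only its own sites, but covers them deeply.

    from dataclasses import dataclass
    from typing import Dict, List, Protocol, Tuple


    @dataclass
    class Result:
        url: str
        score: float   # the peer's own relevance estimate
        source: str    # which peer produced it


    class Peer(Protocol):
        def query(self, terms: str, limit: int = 10) -> List[Result]: ...


    class LocalIndexPeer:
        """A peer backed by a small, deeply indexed slice of the web."""

        def __init__(self, name: str, index: Dict[str, List[Tuple[str, float]]]):
            self.name = name
            self.index = index  # term -> [(url, score), ...]

        def query(self, terms: str, limit: int = 10) -> List[Result]:
            hits = self.index.get(terms, [])[:limit]
            return [Result(url, score, self.name) for url, score in hits]


    def federated_query(peers: List[Peer], terms: str, limit: int = 10) -> List[Result]:
        """Fan the query out to every peer and merge on the peers' own scores."""
        merged: List[Result] = []
        for peer in peers:
            merged.extend(peer.query(terms, limit))
        return sorted(merged, key=lambda r: r.score, reverse=True)[:limit]


    corp = LocalIndexPeer("corp", {"expense policy": [("intranet.corp/hr/expenses", 0.9)]})
    print(federated_query([corp], "expense policy"))

Everything hard (peer discovery, trust, freshness) is of course hidden behind that single query() call.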
Not to hijack further, but 'troll' is technically fine there.
troll ... 4.(intransitive, fishing, by extension) To fish using a line and bait or lures trailed behind a boat similarly to trawling; to lure fish with bait. [from circa 1600]
I would imagine there would be a few problems with that which might be insurmountable.
The first is speed. People already complain about DDG's speed, and relying on a host of external searches would only make that worse, especially since Google reports that a huge proportion of queries are unique (and therefore can't be served from a cache).
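To illustrate the latency point: even if you query all the peers concurrently, you pay for the slowest one unless you impose a deadline and accept partial results. A toy sketch (the sleeping function is a stand-in for a real remote peer, and all the numbers are made up):

    import random
    import time
    from concurrent.futures import ThreadPoolExecutor, wait


    def slow_peer_query(terms: str) -> list:
        """Stand-in for a remote peer search with unpredictable latency."""
        time.sleep(random.uniform(0.05, 1.0))
        return [f"{terms}: a result from this peer"]


    def query_with_deadline(n_peers: int, terms: str, deadline_s: float = 0.3) -> list:
        """Fan out to every peer concurrently, return whatever arrived in time."""
        pool = ThreadPoolExecutor(max_workers=n_peers)
        futures = [pool.submit(slow_peer_query, terms) for _ in range(n_peers)]
        done, _ = wait(futures, timeout=deadline_s)
        pool.shutdown(wait=False, cancel_futures=True)  # don't block on slow peers
        results = []
        for future in done:
            results.extend(future.result())
        return results  # a partial answer, but latency is bounded by the deadline


    print(query_with_deadline(8, "duckduckgo"))

So the trade-off is either waiting for the slowest peer on every unique query or routinely shipping incomplete result sets.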
Another is ranking. Who determines the most relevant result for a query? With multiple sources, which one to trust has to be decided somewhere. If you hand it over to the peers, they can game the system by insisting they are the most relevant. If you leave it to a central server, what incentive do the peers have to participate? And if you use an open algorithm, what's to stop people from gaming it?
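For what it's worth, one concrete way federated setups answer the "who decides relevance" question is to ignore the peers' self-reported scores and merge purely by rank position, e.g. something like reciprocal rank fusion damped by a per-peer trust weight. A sketch (illustrative names, not any real system); note the trust weights are exactly where the gaming problem reappears:

    from collections import defaultdict
    from typing import Dict, List


    def fuse(peer_rankings: Dict[str, List[str]],
             trust: Dict[str, float],
             k: int = 60) -> List[str]:
        """Merge per-peer ranked URL lists via trust-weighted reciprocal rank fusion."""
        scores: Dict[str, float] = defaultdict(float)
        for peer, urls in peer_rankings.items():
            weight = trust.get(peer, 1.0)           # how much we believe this peer
            for rank, url in enumerate(urls, start=1):
                scores[url] += weight / (k + rank)  # position counts, claimed scores don't
        return sorted(scores, key=scores.get, reverse=True)


    rankings = {
        "peer_a": ["example.com/x", "example.com/y"],
        "peer_b": ["example.com/y", "spammy.example/z"],
    }
    print(fuse(rankings, trust={"peer_a": 1.0, "peer_b": 0.5}))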
I don't think there is an easy answer to the search game. You either need to build on others' systems to have something compelling, or have millions (if not billions) in cash for enough runway to build your own system, improve it to the point it's worth using, and then turn a profit.
I don't even think it's the complexity of the problem that stops the second option; it's bandwidth and disk storage. My personal prediction is that once disks get big enough to store a sizable chunk of the web, coupled with enough bandwidth to crawl it in a reasonable time, you will see more innovation in the search space, because the barrier to entry will be lowered.
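Some very rough numbers to back that up. Every figure here is an assumption (a billion-page "chunk", ~100 KB per page, a saturated 1 Gbit/s link), not a measurement:

    # Every figure below is an assumption, not a measurement.
    PAGES          = 1e9     # a 1-billion-page "sizable chunk" of the web
    AVG_PAGE_BYTES = 100e3   # ~100 KB of HTML per page
    LINK_BPS       = 1e9     # a fully utilised 1 Gbit/s uplink

    raw_bytes  = PAGES * AVG_PAGE_BYTES               # raw HTML, before any index
    crawl_secs = raw_bytes * 8 / LINK_BPS             # transfer time alone
    print(f"storage: {raw_bytes / 1e12:.0f} TB")      # -> 100 TB
    print(f"crawl:   {crawl_secs / 86400:.1f} days")  # -> ~9.3 days at line rate

Under those assumptions you're looking at on the order of 100 TB of raw HTML and over a week of crawling at line rate, before you build an index, recrawl anything, or get rate-limited, which is why disk and bandwidth rather than cleverness look like the real barrier.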
We can probably assume query uniqueness isn't evenly distributed across people, either. Some people would probably always get cached queries, while others would need fresh results at least half the time.
That's probably true. I imagine it's also true that the people trying a new search engine are exactly the ones throwing unique queries at it most of the time.
With that in mind, and with the tech world generally being the crowd that drives adoption of new players in the search space, I can't see the approach working. Nobody would switch when queries are massively slower, even if the results were 99% accurate.
1. https://github.com/dkhenry/SimpleMapReduce