network

Aardvark Publishes A Research Paper Offering Unprecedented Insights Into Social Search

Source: http://feedproxy.google.com/~r/Techcrunch/~3/IMDRrISRf-8/

In 1998, Larry Page and Sergey Brin published a paper[PDF] titled Anatomy of a Large-Scale Hypertextual Search Engine, in which they outlined the core technology behind Google and the theory behind PageRank. Now, twelve years after that paper was published, the team behind social search engine Aardvark has drafted its own research paper that looks at the social side of search. Dubbed Anatomy of a Large-Scale Social Search Engine, the paper has just been accepted to WWW2010, the same conference where the classic Google paper was published.

Aardvark will be posting the paper in its entirety on its official blog at 9 AM PST, and they gave us the chance to take a sneak peek at it. It’s an interesting read to say the least, outlining some of the fundamental principles that could turn Aardvark and other social search engines into powerful complements to Google and its ilk. The paper likens Aardvark to a ‘Village’ search model, where answers come from the people in your social network; Google is part of ‘Library’ search, where the answers lie in already-written texts. The paper is well worth reading in its entirety (and most of it is pretty accessible), but here are some key points:

  • On traditional search engines like Google, the ‘long-tail’ of information can be acquired with the use of very thorough crawlers. With Aardvark, a breadth of knowledge is totally reliant on how many knowledgeable users are on the service. This leads Aardvark to conclude that “the strategy for increasing the knowledge base of Aardvark crucially involves creating a good experience for users so that they remain active and are inclined to invite their friends”. This will likely be one of Aardvark’s greatest challenges.
  • Beyond asking you about the topics you’re most familiar with, Aardvark will actually look at your past blog posts, existing online profiles, and tweets to identify what topics you know about.
  • If you seem to know about a topic and your friends do too, the system assumes you’re more knowledgeable than if you were the only one in a group of friends to know about that topic.
  • Aardvark concludes that while the amount of trust users place in information on engines like Google is related to a source website’s authority, the amount they trust a source on Aardvark is based on intimacy and how they’re connected to the person giving them information.
  • Some parts of the search process are actually easier for Aardvark’s technology than they are for traditional search engines. On Google, when you type in a query, the engine has to pair you up with exact websites that hold the answer to your query. On Aardvark, it only has to pair you with a person who knows about the topic — it doesn’t have to worry about actually finding the answer, and can be more flexible with how the query is worded.
  • As of October 2009, Aardvark had 90,361 users, of whom 55.9% had created content (asked or answered a question). The site’s average query volume was 3,167.2 questions per day, with the median active user asking 3.1 questions per month. Interestingly, mobile users are more active than desktop users. The Aardvark team attributes this to users wanting quick, short answers on their phones without having to dig for anything. They also think people are more used to using more natural language patterns on their phones.
  • The average query length was 18.6 words (median of 13) versus 2.2-2.9 words on a standard search engine.  Some of this difference comes from the more natural language people use (with words like “a”, “the”, and “if”).  It’s also because people tend to add more context to their queries, with the knowledge that it will be read by a human and will likely lead to a better answer.
  • 98.1% of questions asked on Aardvark were unique, compared with between 57 and 63% on traditional search engines.
  • 87.7% of questions submitted were answered, and nearly 60% of them were answered within 10 minutes. The median answering time was 6 minutes and 37 seconds, with the average question receiving two answers. 70.4% of answers were rated ‘good’, 14.1% ‘OK’, and 15.5% ‘bad’.
  • 86.7% of Aardvark users had been asked by Aardvark to answer a question, of whom 70% actually looked at the question and 38% could answer.  50% of all members had answered a question (including 75% of all users who had ever actually interacted with the site), though 20% of users accounted for 85% of answers.
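
To make the “pair you with a person” idea concrete, here is a minimal sketch of how such routing could work. It is an illustration only: the function names, the data layout, and the `friend_boost` weight are all hypothetical and are not taken from the Aardvark paper. It assumes a user scores higher on a topic when a larger share of their friends also know it, as described in the points above.

```python
# Hypothetical sketch: names, data layout, and weights are assumptions,
# not the method described in the Aardvark paper.

def topic_score(user, topic, knows, friends, friend_boost=0.5):
    """Score how likely `user` is to know `topic`.

    knows:   dict mapping user -> set of topics inferred for that user
    friends: dict mapping user -> set of that user's friends
    """
    if topic not in knows.get(user, set()):
        return 0.0
    circle = friends.get(user, set())
    if not circle:
        return 1.0
    # Share of the user's friends who also know the topic.
    share = sum(1 for f in circle if topic in knows.get(f, set())) / len(circle)
    return 1.0 + friend_boost * share


def best_answerer(asker, topic, knows, friends):
    """Return the asker's highest-scoring friend for the topic, or None."""
    scored = [
        (topic_score(candidate, topic, knows, friends), candidate)
        for candidate in sorted(friends.get(asker, set()))
    ]
    scored = [(score, candidate) for score, candidate in scored if score > 0]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]
```

Note that the sketch never looks for an answer; it only ranks people, which is the part the paper says is easier than matching a query to exact web pages.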

Tuesday, February 2nd, 2010 Uncategorized No Comments

Facebook is going down – pageviews, average stay, pages per visit – why?

From the Compete charts below, it is clear that Facebook is seeing a decline in pageviews, average stay, and pages per visit.  But why?

I know that I have reduced the time I spend on Facebook and I have also reduced the number of messages and other social actions as well.  And I have deleted virtually all of my personal and family photos and will not upload any more. These may be the first signs of a waning of Facebook due to a number of factors.

I can’t get my stuff back out

For example, Facebook has stated that it will not participate in OpenSocial because they do not want people to be able to export their content, conversations, photos, etc., out of Facebook and use them on another social network. I am concerned that I will not be able to retrieve or back up content which I believe is mine. I like to have control over my family photos, conversations with friends, etc. I am willing to accept as a “cost” of using the Facebook system the fact that they know who my friends are. But I am less willing, or outright unwilling, to continue putting my content where I cannot get it back in its entirety. (Google Docs, for example, just launched a feature that lets you export everything out of Google Docs into Microsoft Office formats.)

Ads in the stream, erosion of trust

A second issue mentioned in a previous post is the increase in advertising on Facebook and also the more unscrupulous practice of injecting ads “into the stream” — ads masquerading as status updates. These are harmful to the overall trust built up in the community and I have un-friended quite a few people whose accounts were clearly used to promote events, products, etc.

Ad-effectiveness sucks

From a prior post – http://bit.ly/EhiW9 – Facebook advertising metrics are absolutely abysmal. They keep trying to sell advertisers on the hundreds of billions of pageviews they throw off. But advertisers are getting smarter and more and more of them will buy ads on a cost-per-click basis (instead of a CPM, or cost-per-thousand-impressions, basis). This means that the ad revenues that Facebook enjoyed from gross INefficiencies will be decimated.
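
To see why a shift from CPM to cost-per-click pricing hurts a site with low click-through rates, here is a small sketch of the standard conversion between the two. The function names and the sample numbers are illustrative assumptions, not Facebook data.

```python
def effective_cpm(ctr, cpc):
    """Revenue per 1,000 impressions when ads are billed per click.

    ctr: click-through rate as a fraction (0.0005 means 0.05%)
    cpc: price paid per click, in dollars
    """
    return ctr * cpc * 1000


def cpc_equivalent(cpm, ctr):
    """Cost per click implied by a CPM price at a given click-through rate."""
    return cpm / (1000 * ctr)


# Hypothetical numbers, chosen only to illustrate the arithmetic.
print(effective_cpm(ctr=0.0005, cpc=0.50))
print(cpc_equivalent(cpm=5.00, ctr=0.0005))
```

With a hypothetical 0.05% click-through rate and $0.50 per click, 1,000 impressions earn about $0.25. An advertiser paying a $5 CPM at that same click-through rate is effectively paying about $10 per click, which is the kind of inefficiency that disappears once ads are bought per click.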


facebook-pageviews

facebook-average-stay

facebook-pages-per-visit

Friday, October 30th, 2009 Uncategorized No Comments

Map of IP addresses around the world used to commit Click-Fraud

Source: http://feeds.gawker.com/~r/gizmodo/full/~3/QE1Gthuy4_k/3-million-in-click-fraud-over-two-weeks-just-the-beginning

A recently disbanded click fraud ring in China racked up $3 million worth of clicks in two weeks. $3 million that we’re aware of. Just how detectable is this whole business of racking up fraudulent ad revenue clicks?

That intricate mess of lines above represents a portion of DormRing1, the click fraud bunch that was caught in China. The lines show the relationship of some of the IP addresses involved in the fraud and how they are connected to some fraudulent ad clicks. The whole network actually “involved 200,000 different IP addresses and racked up more than $3 million worth of fraudulent clicks across 2,000 advertisers in a two-week period.” Impressive and scary at the same time.
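
For a sense of scale, the quoted figures can be broken down into simple averages. This is plain arithmetic on the numbers in the quote; because the total is “more than $3 million,” the results are lower bounds, and the even split across IP addresses, advertisers, and days is an assumption made only for illustration.

```python
# Figures quoted in the post; "more than $3 million" makes these lower bounds.
fraud_dollars = 3_000_000
ip_addresses = 200_000
advertisers = 2_000
days = 14  # "a two-week period"

per_ip = fraud_dollars / ip_addresses          # dollars per IP address
per_advertiser = fraud_dollars / advertisers   # dollars per advertiser
per_day = fraud_dollars / days                 # dollars per day

print(f"per IP address: ${per_ip:,.2f}")
print(f"per advertiser: ${per_advertiser:,.2f}")
print(f"per day:        ${per_day:,.2f}")
```

That works out to about $15 per IP address and $1,500 per advertiser over the two weeks, or roughly $214,000 per day. Spread that thinly, each individual address may not look unusual on its own, which could be part of why rings like this are hard to spot.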

The trouble is that no one really knows how much ad revenue DormRing1 collected before they were caught. Click-fraud monitoring services such as Anchor Intelligence, the ones behind this catch, are evolving to keep up with the scale on which these rings are operating. It’s still difficult to judge just how well they’re doing as they’re having to infiltrate forums and gain the trust of the perpetrators in a manner reminiscent of drug busts. But as the criminals are getting more elaborate, the investigations are too.

That good news aside, do me a favor: after you read this post, comment, and all that jazz, refresh the page a few times and—Ah…I mean, heh…just kidding. [Tech Crunch]


Friday, October 9th, 2009 Uncategorized No Comments

Smaller social networks are losing even the few users they have…

…to larger social networks like facebook where more of the users’ friends actually are.

Hi5, Bebo, and even Ning — the social network predicted to “host some 4 million social networks serving up billions of page views daily” by Gina Bianchini (FastCompany: http://www.fastcompany.com/magazine/125/nings-infinite-ambition.html) — are losing traction.

hi5-bebo-ning-unique-visitors

Related: If you’re just a feature, someone else will just add you and your raison d’être vanishes (you “tweet” your status in Facebook, LinkedIn, MySpace, etc.).

Thursday, July 9th, 2009 Uncategorized No Comments

What is Web 3.0? Characteristics of Web 3.0

2009 06 16 What Is Web 3.0

2009 06 16 What Is Web 3.0 – Presentation Transcript

  1. What is Web 3.0? Dr. Augustine Fou. June 16, 2009.
  2. Evolution of the Internet [timeline figure; labels: microprocessor, internet, web, e-commerce, social networks, Web 1.0, Web 2.0, Web 3.0?; intervals: 40 yrs, 20 yrs, 10 yrs, 5 yrs, 2.5 yrs, 1.5 yrs; axis ends at “present”]
  3. Evolution of the “Web”: content, commerce, search, social networks, social content, social search, social commerce. As each stage reaches critical mass, the next stage is tipped into present.
  4. Key Characteristics [columns: web 1.0, web 2.0, web 3.0; timeline to present]
    • Speedy: more timely information and more efficient tools to find information
    • Collaborative: actions of users amass, police, and prioritize content
    • Trust-worthy: users establish trust networks and hone trust radars
    • Content: content destination sites and personal portals
    • Search: critical mass of content drives need for search engines
    • Commerce: commerce goes mainstream; digital goods rise
    • Ubiquitous: available at any time, anywhere, through any channel or device
    • Individualized: filtered and shared by friends or trust networks
    • Efficient: relevant and contextual information findable instantly

  5. Illustrative Examples – retail/shopping [columns: web 1.0, web 2.0, web 3.0; timeline to present]
    • what friends bought or want to buy
    • drag-to-share items which friends know friends are looking for
    • item collections
    • value in the aggregation
    • contextual reviews
    • reviews of reviews
    • what others bought
    • individualized recommendations
    Examples: overstock.com, amazon.com, FB app: MyFaveThings

  6. Illustrative Examples – social networks [columns: web 1.0, web 2.0, web 3.0; timeline to present]
    • aggregates all your online identities
    • syndicates all your updates to all social networks
    • social actions visible to friends
    • trust networks across geography, time, and interests
    • collection of personal homepages
    Examples: geocities.com, facebook.com, peoplebrowsr.com

  7. Illustrative Examples – restaurant reviews [columns: web 1.0, web 2.0, web 3.0; timeline to present]
    • Yelp content vetted through a user’s trust network and individual recommendations made based on situation and need, in real-time
    • user submitted reviews
    • related items based on similarity of user preferences
    • infrequent publication
    • centralized editorial control
    Examples: zagat’s, yelp, “need reco for great Italian” + GPS + Yelp 5-star (“Babbo, been there, love it”)

  8. Illustrative Examples – photos [columns: web 1.0, web 2.0, web 3.0; timeline to present]
    • real-time, contextual “do you like this knit shirt?”
    • friends give immediate feedback
    • share photos with friends and strangers
    • enable visitors to tag and comment
    • individual albums
    Examples: kodakgallery.com, flickr.com, ?

  9. Illustrative Examples – real estate [columns: web 1.0, web 2.0, web 3.0; timeline to present]
    • information vetted by fellow users, recommended directly and in context
    • listings plus relevant information like school zones, comparable sales, alerts
    • listings based on parameters
    Examples: corcoran.com, streeteasy.com, trulia iphone app

  10. Illustrative Examples – encyclopedia [columns: web 1.0, web 2.0, web 3.0; timeline to present]
    • content is ubiquitous and available through any channel or device
    • trust network proactively forwards relevant info to user who needs it
    • created, updated, and edited (policed) by user actions
    • digitized version of printed encyclopedia
    Examples: britannica.com, wikipedia.com, chacha.com

  11. Illustrative Examples – online coupons [columns: web 1.0, web 2.0, web 3.0; timeline to present]
    • coupons delivered contextually and proactively when user needs it (without the user even asking for it)
    • instant feedback
    • community action makes it more accurate and useful for others
    • collection of online coupons – value in the aggregation
    Examples: dealcatcher.com, retailmenot.com

trends – coupon sites, network TV sites

coupon sites: definitely headed upward with a spike in Dec 08.

coupon-sites

network TV sites are seeing healthy increases, likely due to “view full episodes” on their websites – but even this increase in traffic will not replace the advertising revenues lost on network television.

tv-network-sites

Monday, April 27th, 2009 integrated marketing, marketing, metrics No Comments

no, twitter will NOT be the next google

Every year around SXSW, there’s a surge in interest about twitter. This time around people have even gone as far as to proclaim twitter to be “the next google” or “the future of search” etc. Bollocks!

Here’s why:

1) distant from other social networks – While we are seeing a massive surge in interest and usage of twitter, it is still a long way off from the number of users of other social networks; it will take a long time to get to critical mass; and this is a prerequisite for twitter to assail the established habit of the majority of consumers to “google it.” — Google’s already a verb.

2) no business model – It remains to be seen whether Twitter can come up with a business model to survive for the long haul. Ads with search are proven. Ads on social networks are not. And given the 140-character limit, there’s hardly any space to add ads.

3) lead adopters’ perspective is skewed – Twitter is still mostly lead adopters and techies so far, so the perspectives on its potential may be skewed too positively. As more mainstream users start to use it, we’re likely to see more tweets about nose picking, waking up, making coffee, being bored, etc. This will quickly make the collective mass of content far less specialized and useful than it is now.

4) too few friends to matter – Most people have too few friends. Not everyone is a Scott Monty ( @scottmonty ) with nearly 15,000 followers. So while a user’s own circle of friends would be useful for real-time searches like “what restaurant should I go to right now?” the circle is too small to know everything about everything they want to search on. And even if you take it out to a few concentric circles from the original user who asked, that depends on people retweeting your question to their followers and ultimately someone notifying you when the network has arrived at an answer — not likely to happen.

5) topics only interesting to small circle of followers – Most topics tweeted are interesting to only a very small circle of followers, most likely not even to all the followers of a particular person. A great way to see this phenomenon is with twitt(url)y. It measures twitter intensity of a particular story and lists the most tweeted and retweeted stories. Out of the millions of users and billions of tweets, the most tweeted stories draw only about 100 to 500 tweets each, and recently (March 18) these included Apple’s iPhone OS 3.0 preview event, #skittles, and the shutdown of Denver’s Rocky Mountain News. Most other tweets are simply not important enough to enough people for them to retweet.

6) single purpose apps or social networks go away when other sites come along with more functionality or when big players simply add their functionality to their suite of services.

twitter

twitturly

Am I missing something here, people?  Agree with me or tell me I’m stupid @acfou :-)

Wednesday, March 18th, 2009 digital, social networks No Comments

last-ad accounting, last-ad-attribution model

Why the Click Is the Wrong Metric for Online Ads

http://adage.com/digital/article?article_id=134787

There is a whole ruckus around ad networks getting too little credit for helping to drive customers’ awareness and clicks for advertisers. In the past, ad networks wanted to claim credit for type-ins (people going to an advertiser’s site by typing the URL instead of clicking on an ad). They called this “view through” and the ad networks wanted these to be attributed to their showing the ad somewhere on their network.

Now they claim that it is not enough to get credit for only the last ad: the ad the user actually clicked on to get to the advertiser’s site, the one that can actually be tracked and properly attributed.

What’s at stake is the relatively large piece of “direct” or referrer-less traffic. Analytics packages can only assign these to type-ins or bookmarks since there was no referring site to attribute them to, let alone ad creative version, etc.

But while there is demonstrable lift in click rates when display ads and search ads are running at the same time (i.e., they reinforce and complement each other), it does not mean that ad networks can or should claim credit for the lift. After all, advertising running on one network COULD also cause a lift in results of ads running on another network if they are run simultaneously.

So the bottom line is if the click or the visit is not directly attributable, it should not be attributed.
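
Here is a minimal sketch of what that rule looks like in practice, assuming a simplified visit log. The field names (`referrer_ad`, `ads_viewed`) are hypothetical and do not come from any particular analytics package. Each visit is credited to the ad that was actually clicked; visits with no referrer stay in the “direct” bucket, no matter which ads the visitor may have seen earlier.

```python
from collections import Counter


def attribute_visits(visits):
    """Count visits per source under a last-ad (last-click) model.

    Each visit is a dict with a "referrer_ad" key: the ad the user clicked
    to arrive, or None for type-ins and bookmarks. Ads that were merely
    viewed ("ads_viewed") are deliberately ignored.
    """
    credit = Counter()
    for visit in visits:
        credit[visit.get("referrer_ad") or "direct"] += 1
    return credit


# Hypothetical visit log for illustration.
visits = [
    {"referrer_ad": "search_ad", "ads_viewed": ["display_ad"]},
    {"referrer_ad": None, "ads_viewed": ["display_ad"]},  # type-in after seeing an ad
    {"referrer_ad": "display_ad", "ads_viewed": []},
]
print(attribute_visits(visits))
```

In this log the display ad gets credit for one visit, the one where it was clicked. The type-in visit stays “direct” even though a display ad was viewed first, which is exactly the “view through” credit the ad networks are asking for.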

Monday, February 23rd, 2009 display advertising No Comments