Monday, 9 April 2007

SNA Map of my blog - second edition

My blog social network analysis (SNA) map (really an egonet map) from a few days ago was ok, but I wasn't completely satisfied. Inspired by this other visualisation, I decided to try using the Technorati Cosmos API to extract some new data.

This new map (and one containing all the data) looks quite different and overall I think it's a more interesting map. However, I still have some issues:

  • One of the limitations of both approaches is that I'm still only using incoming links (or rather, number of unique blogs linking to each blog), whereas I would like to use both outgoing and incoming links;
  • The other frustration is that I still needed to clean up the data from Technorati to remove duplicate blogs, which appear either under different domain names or as links to specific blog posts (I'm not the first to notice this) - I suspect the best thing to do would be to create my own startup blog search site and spider it myself; and
  • I need to find a better publishing tool for the map - Many Eyes's network visualisation tool has very limited functionality compared to the desktop SNA tools I normally use, which makes it difficult to explore the network.
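
As a rough illustration of the clean-up step (in Python, not the Excel and Access tools described in this post), duplicates of the kind mentioned above can be collapsed by reducing every link to a bare hostname and then applying a hand-made alias table. The function names and the alias table are hypothetical, and the example domains are placeholders rather than real data:

```python
from urllib.parse import urlsplit

def blog_key(url, aliases=None):
    """Reduce a link to a bare hostname so post-level links collapse to one blog."""
    # urlsplit only fills in the host when a scheme is present, so add one if missing.
    if "://" not in url:
        url = "http://" + url
    host = urlsplit(url).hostname or ""  # hostname is returned in lower case
    # Treat "www.example.com" and "example.com" as the same blog.
    if host.startswith("www."):
        host = host[4:]
    # Hypothetical alias table: maps an alternative domain to the preferred one.
    return (aliases or {}).get(host, host)

def unique_blogs(urls, aliases=None):
    """Return the distinct blog keys in first-seen order."""
    keys = (blog_key(u, aliases) for u in urls)
    return list(dict.fromkeys(k for k in keys if k))
```

The alias table still has to be built by hand, because nothing in a URL reveals that two different domains belong to the same blog.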

In my blog egonet, these are the top 20 most popular blogs based on the number of other blogs that link to them as reported by Technorati (that's number of unique blogs, not number of links):

www.elsua.net (37 incoming blogs)
www.anecdote.com.au (30 incoming blogs)
www.innovationcreators.com (28 incoming blogs)
www.fullcirc.com (26 incoming blogs)
www.mcgeesmusings.net (25 incoming blogs)
chieftech.blogspot.com (24 incoming blogs)
fastforwardblog.com (24 incoming blogs)
billives.typepad.com (23 incoming blogs)
blog.jackvinson.com (22 incoming blogs)
confusedofcalcutta.com (18 incoming blogs)
www.sdownes.co.uk (17 incoming blogs)
www.rossdawsonblog.com (17 incoming blogs)
www.michaelsampson.net (17 incoming blogs)
www.edbrill.com (17 incoming blogs)
engineerswithoutfears.blogspot.com (10 incoming blogs)
www.jasonkolb.com (7 incoming blogs)
mitchell.wordpress.com (6 incoming blogs)
planetkm.org (6 incoming blogs)
www.craigbellamy.net (6 incoming blogs)
rcd.typepad.com (5 incoming blogs)

Also, here are the top linkers within this network ("real" blogs only, again number of unique blogs, not number of links):

listics.com (18 outgoing blogs)
www.langemark.com (15 outgoing blogs)
intranetblog.blogware.com (12 outgoing blogs)
www.lucasmcdonnell.com (10 outgoing blogs)
billives.typepad.com (9 outgoing blogs)
chieftech.blogspot.com (9 outgoing blogs)
www.gerryriskin.com (8 outgoing blogs)
www.duperrin.com (8 outgoing blogs)
www.edu2do.com (7 outgoing blogs)
www.greenchameleon.com (7 outgoing blogs)
fastforwardblog.com (7 outgoing blogs)
mikeg.typepad.com (6 outgoing blogs)
c21org.typepad.com (5 outgoing blogs)
enterprise2rave.com (5 outgoing blogs)
www.fullcirc.com (5 outgoing blogs)
www.davidmaister.com (5 outgoing blogs)
www.mindthis.net (5 outgoing blogs)
www.zoliblog.com (5 outgoing blogs)
www.elsua.net (5 outgoing blogs)
jisi.dreamblog.jp (4 outgoing blogs)
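
For anyone wanting to reproduce counts like these, here is a minimal sketch in Python of how "number of unique blogs, not number of links" can be computed from a list of (linking blog, linked blog) pairs. The pair format and function names are assumptions for illustration, not the structure of the master table used for this post:

```python
from collections import defaultdict

def degree_counts(links):
    """links: iterable of (source_blog, target_blog) pairs, possibly repeated."""
    incoming = defaultdict(set)   # target -> set of distinct blogs linking to it
    outgoing = defaultdict(set)   # source -> set of distinct blogs it links to
    for source, target in links:
        if source == target:
            continue  # a blog linking to itself is not counted
        incoming[target].add(source)
        outgoing[source].add(target)
    in_counts = {blog: len(s) for blog, s in incoming.items()}
    out_counts = {blog: len(s) for blog, s in outgoing.items()}
    return in_counts, out_counts

def top(counts, n=20):
    """Highest counts first; ties are broken alphabetically by blog name."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

Because each blog's neighbours are stored in a set, ten links from the same blog count only once.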

Now to the technical stuff... If you're interested in knowing how I pulled all this together, these are the tools I used:

  • Technorati's API for the link data;
  • I wrote a Microsoft Excel macro to automatically download data from Technorati using the API;
  • I also used Microsoft Access to help deduplicate and tidy up data; and
  • Many Eyes for visualisation.

I was surprised how easy it was to use Excel to access the API - basically I wrote a script that built a URL with the correct parameters for each blog and then just "opened" it. The data was then saved into a master table. Once I wrote the script, getting the data was quick compared to cleaning up the data.
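
The URL-building step might look something like the sketch below, written in Python rather than as the Excel macro itself. The endpoint and the parameter names (key, url, type, limit) are assumptions and should be checked against the Technorati API documentation; the sketch only builds the URLs and does not fetch anything:

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the API documentation before use.
COSMOS_ENDPOINT = "http://api.technorati.com/cosmos"

def cosmos_url(blog_url, api_key, limit=100):
    """Build one request URL for a single blog (parameter names are assumed)."""
    params = {
        "key": api_key,    # personal API key from registration
        "url": blog_url,   # the blog whose incoming links are wanted
        "type": "weblog",  # assumed: ask for linking blogs rather than individual links
        "limit": limit,    # assumed: maximum number of results per request
    }
    # urlencode escapes characters such as ":" and "/" in the blog URL.
    return COSMOS_ENDPOINT + "?" + urlencode(params)

def build_requests(blogs, api_key):
    """Pair each blog with the URL that would be opened for it."""
    return [(blog, cosmos_url(blog, api_key)) for blog in blogs]
```

Each URL would then be opened in turn and the response appended to the master table.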

BTW If you're interested in SNA, you might like to take a look at this article, Small World! (PDF, 89KB).

4 comments:

  1. tomoaki sawada, 7:11 pm

    Hi James, I just found you have linked to my blog and your SNA base data includes my blog. Your blogroll includes a lot of the same names as mine. Just curious to see how your SNA map shows these blog names in a snapshot or video.
    Best regards. Tomoaki

  2. Hi,

    Thanks for linking to my Technorati visualizations.

    Recently I made a new draft using my own parser. I took my OPML-Blogroll from Bloglines, parsed the actual RSS-Feeds from the blogs and analyzed how they interconnect.

    http://www.metaportaldermedienpolemik.net/blog/Blog/2007-04-04/blogroll-graph

    Regards,
    Walter

  3. Hi James,
    Great stuff, I love My Cosmos! If you'd entertain a question from a non-techie, tho: is there another way to put the Cosmos data into Many V without running the API script you mentioned? I ask b/c I don't know how to write such a script (sad, I know). thanksmuch!

  4. Michele - I can send you an Excel 2003 spreadsheet that will download the data from the API for you, but you'll still need to do a bit of work to tidy up the data for duplicates etc. You'll also still need to register on Technorati for your API key (that's easier than it sounds!). Just drop me an email and I'll send you the Excel 2003 file to try out.

