Sunday, October 12, 2008

Data Mining in Social Networks

With huge social networks around, I assumed it would be easy to apply some typical data mining algorithm to any dataset I could extract from these networks using their released APIs.

It turned out that it isn't child's play, for two reasons:

1- The data from social networks is relational, so data objects are linked in one way or another. In typical propositional data, e.g. patient records, data objects are instead independent of one another. So a different breed of mining algorithms is required (e.g. Bayesian classifiers).

2- The data has to be extracted in the first place, keeping in mind the legality of the technique and other limitations.
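
To make the first point a bit more concrete, here is a tiny Python sketch of the difference between propositional and relational data. Everything in it is made up for illustration: the patient records, the user names, the friendship edges, and the helper functions `build_neighbors` and `fraction_of_smoking_friends` are all hypothetical, and this is not an algorithm taken from any paper. It only shows that in relational data a feature of one object can depend on the objects it is linked to.

```python
# Propositional data: each record stands alone (hypothetical patient records).
patients = [
    {"id": 1, "age": 54, "smoker": True},
    {"id": 2, "age": 31, "smoker": False},
    {"id": 3, "age": 47, "smoker": True},
]

# Relational data: records plus links between them
# (hypothetical users and friendship edges in a social network).
users = {
    "ann": {"smoker": True},
    "bob": {"smoker": False},
    "cat": {"smoker": True},
    "dan": {"smoker": False},
}
friendships = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("cat", "dan")]


def build_neighbors(edges):
    """Turn an undirected edge list into an adjacency map."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def fraction_of_smoking_friends(user, neighbors, attributes):
    """A feature of one user that is computed from the users linked to it."""
    friends = neighbors.get(user, set())
    if not friends:
        return 0.0
    return sum(attributes[f]["smoker"] for f in friends) / len(friends)


neighbors = build_neighbors(friendships)
features = {u: fraction_of_smoking_friends(u, neighbors, users) for u in users}

for user, value in features.items():
    print(user, round(value, 2))
```

Nothing like `fraction_of_smoking_friends` can be computed for the patient records, because there are no links to follow. In the network, changing one user's attribute changes the features of that user's friends, so the objects are no longer independent of one another.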

Anyway, we've got to find something for our data mining class project. I jumped from the typical and *boring* projects based on UCI data sets into Web Mining. Then I started noticing interesting approaches to mining social networks.

So I went looking for an introductory paper on mining social networks, and I found this paper hosted by the Knowledge Discovery Laboratory of the Computer Science department at Purdue University, here. It really serves the purpose, and the title "Data Mining in Social Networks" makes it more interesting for newbies. The authors are David Jensen and Jennifer Neville. I would recommend that anyone who wants a decent introduction to the subject go through this paper once!

~over and out
