Friday, October 31, 2008

Poke my name!

SALMAN is the 1565th most popular name in the USA. One in every 12,107 Americans is named SALMAN, and the popularity of the name SALMAN is 82.6 people per million.

If we compare the popularity statistics of SALMAN to the USA's population statistics, we can estimate that as of October 31, 2008, 01:04, there are 25,216 people named SALMAN in the United States, and the number of SALMANs is increasing by 217 people every year.
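Just to see whether these numbers hang together, here is a quick sanity check in Python. It is only a sketch that uses the figures quoted above; the variable names are mine, and the "implied population" is my own back-calculation, not something the site states.

```python
# Figures quoted above (typed in by hand).
one_in = 12_107            # "one in every 12,107 Americans"
estimated_people = 25_216  # estimated number of people with the name

# Popularity expressed as people per million.
per_million = 1_000_000 / one_in

# Population the site must have assumed (my back-calculation).
implied_population = estimated_people * one_in

print(f"{per_million:.1f} people per million")
print(f"implied US population: {implied_population:,}")
```

The per-million figure matches the quoted 82.6, and the implied population comes out at roughly 305 million.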

Usage of SALMAN as a first name is 75.86% and its usage as a middle name is 24.14%. The sum of the alphabetical positions of the letters in SALMAN is 60, and this makes SALMAN arithmetic buddies with words like Pure, Active, Dapper, and Holy.
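The "arithmetic buddies" bit is easy to check yourself. A minimal sketch, assuming the site simply counts A=1 through Z=26 (the function name `letter_sum` is mine, not theirs):

```python
def letter_sum(word: str) -> int:
    """Sum of alphabet positions (A=1 ... Z=26), ignoring case and non-letters."""
    return sum(ord(ch) - ord("a") + 1 for ch in word.lower() if "a" <= ch <= "z")


for word in ["Salman", "Pure", "Active", "Dapper", "Holy"]:
    print(word, letter_sum(word))
```

Under that assumption, all five words add up to 60.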

Yes :p
You get all this information about your name here

By the way, I was experimenting with (building from source in the Eclipse IDE) Carrot2's open-source clustering search engine demo browser. I was frustrated and tired, so the first thing I searched for was my name, and some cluster pointed me to this interesting junk :D

Wednesday, October 22, 2008

UnChrome your Chrome

According to Google, "Google Chrome is a browser that combines a minimal design with sophisticated technology to make the web faster, safer, and easier". Unfortunately, each Google Chrome installation contains a unique ID that allows its user to be identified. Google doesn't make it easy to remove this ID.

The UnChrome application was designed to help you with this task. It replaces your unique ID with Null values so that your browser cannot be identified any longer. The functionality of Google Chrome is not influenced by this. You only need to apply UnChrome once.

If Google Chrome is currently running, please close it. Afterwards, please click on the link below to anonymize your Chrome installation.

Article Link
Download Link

Sunday, October 12, 2008

Messing around Facebook..

What's the quickest way to find out the answers to the following questions about facebook users:

1- How many male computer science graduates from my *university* aged between 18 and 25 are registered on facebook?

2- How many male computer science graduates from my *country* aged between 18 and 25 are registered on facebook?

3- How many *female* graduates from my university aged between 18 and 25 are registered on facebook?

4- How many graduates from my university are registered on facebook?

The quickest thing that came to my mind was to use the API somehow; however, there is something else that facebook wants advertisers to use to answer such questions!! But hey, anyone can become an advertiser on facebook :D

So, yeah! It's pretty easy, and thanks to Alec Saunder for coming up with this trick!

To answer a countably infinite number of such questions, follow the steps that he has neatly outlined on his blog here.

To summarize, there is no big trick here. Facebook gives each advertiser a query builder that immediately returns *ONLY* the number of results for a specific input query. With a little investigation and an excessive show of curiosity, you can use this tool to build a strong data set of facebook statistics, which, by all means, can be called a representative sample of all World Wide Web users!

The following is a depiction of my lack of creativity in using this tool :p

There are 2,440 18-year-old males from China registered on Facebook.
There are 1,143,180 18-year-old males from the USA registered on Facebook.
There are 31,020 18-year-old males from India registered on Facebook.
There are 11,400 18-year-old males from Pakistan registered on Facebook.
There are 267,760 18-year-old males from the UK registered on Facebook.
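
Once you have a handful of counts like these, comparing them is trivial. Here is a minimal sketch that ranks the five numbers above and shows each country's share of their total. The counts are typed in by hand from the query builder results, nothing is fetched from facebook, and the variable names are mine.

```python
# Counts of 18-year-old males per country, copied by hand from the list above.
counts = {
    "China": 2_440,
    "USA": 1_143_180,
    "India": 31_020,
    "Pakistan": 11_400,
    "UK": 267_760,
}

total = sum(counts.values())

# Largest first: country, count, and share of the five-country total.
ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
for country, n in ranked:
    print(f"{country:<10}{n:>12,}{n / total:>8.1%}")
```

The shares are only relative to these five countries, so they say nothing about facebook as a whole.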

Data Mining in Social Networks

With huge social networks around, I felt it might be easy to apply some typical data mining algorithm to any dataset that I could extract from these networks using their released APIs.

So, it turned out that it isn't child's play, for two reasons:

1- the data from social networks is relational, and hence data objects are linked in one way or another. Contrary to this, in typical propositional data, e.g. patient records, data objects are independent of one another. So, a different breed of mining algorithms is required (e.g. relational versions of Bayesian classifiers).

2- the process of extracting the data has to take into account the legality of the technique and other limitations.
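
To make the first point a bit more concrete, here is a toy sketch of the difference. All the data is made up for illustration, and the helper names (`neighbours`, `share_of_friends_with`) are mine; this is not taken from any particular mining library or paper.

```python
# Propositional data: each record stands alone (toy, made-up values).
patients = [
    {"id": 1, "age": 34, "smoker": False},
    {"id": 2, "age": 58, "smoker": True},
]

# Relational data: objects plus the links between them (toy, made-up values).
users = {"ann": "cs", "bob": "cs", "cat": "math", "dan": "cs"}
friends = {("ann", "bob"), ("ann", "cat"), ("bob", "dan")}


def neighbours(user):
    """All users sharing an undirected friendship link with `user`."""
    return {b if a == user else a for a, b in friends if user in (a, b)}


def share_of_friends_with(user, major):
    """A relational feature: it depends on the attributes of linked objects."""
    linked = neighbours(user)
    if not linked:
        return 0.0
    return sum(users[f] == major for f in linked) / len(linked)


print(share_of_friends_with("ann", "cs"))
print(share_of_friends_with("bob", "cs"))
```

A feature for a patient can be read straight off that patient's row, while a feature for a user may need the attributes of everyone the user is linked to, which is why the usual independence assumption breaks down.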

Anyway, we had to find something for our data mining class project. I jumped into Web Mining from the typical and *boring* projects based on UCI data sets. Then, I started noticing interesting approaches to mining social networks.

So, I was looking for an introductory paper on mining social networks and I found this paper hosted by the Knowledge Discovery Laboratory of the Computer Science department at Purdue University, here. It really serves the purpose, and the title "Data Mining in Social Networks" makes it more interesting for newbies. The authors are David Jensen and Jennifer Neville. I would recommend that anyone interested in a decent introduction to the subject matter go through this paper once!

~over and out