Wednesday, August 05, 2009

Travelling Salesman Problem: Solved!

travelling_salesman_problem

[source: http://xkcd.com/399/]

Saturday, July 25, 2009

Digging Digg: Comment Mining, Popularity Prediction, and Social Network Analysis

Recently, one of my research papers was accepted at The 2009 International Conference on Web Information Systems and Mining (WISM '09). The conference proceedings will be published by IEEE-CS and indexed by both EI (Compendex) and ISTP. Following is the abstract:


Using comment information available from Digg we define
a co-participation network between users. We focus on
the analysis of this implicit network, and study the behavioral
characteristics of users. Using an entropy measure,
we infer that users at Digg are not highly focused and
participate across a wide range of topics. We also use the
comment data and social network derived features to predict
the popularity of online content linked at Digg using a
classification and regression framework. We show promising
results for predicting the popularity scores even after limiting
our feature extraction to the first few hours of comment
activity that follows a Digg submission.
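
To give a feel for the entropy measure mentioned in the abstract, here is a minimal sketch. It assumes plain Shannon entropy over the topics a user comments on; the exact definition used in the paper may differ, and the user data below is made up purely for illustration.

```python
import math
from collections import Counter


def topic_entropy(comment_topics):
    """Shannon entropy (in bits) of one user's comments across topics."""
    counts = Counter(comment_topics)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical users: each list holds the topic of every comment the user made.
focused_user = ["technology"] * 8
spread_user = ["technology", "science", "sports", "world"] * 2

print(topic_entropy(focused_user))  # 0.0 -> all comments in one topic
print(topic_entropy(spread_user))   # 2.0 -> evenly spread over four topics
```

A user who sticks to one topic scores 0, and the score grows as the comments spread evenly over more topics, which is the sense in which high entropy means "not highly focused".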


I am grateful to my advisor, Dr. Huzefa Rangwala, who pushed me real hard and stayed with me to get it done!

I am trying to move to wordpress and restructure my blog, but till then, I don't have any section to upload my paper. Anyways, it's available through George Mason University's Technical Reports Series for 2009, which can be located here: http://cs.gmu.edu/~tr-admin/papers/GMU-CS-TR-2009-7.pdf. Also, the paper will soon be available through IEEE.

Thursday, June 04, 2009

The next big thing. [2]

There was a change in plan; following is the configuration of my newest machine:

  • AMD Phenom II X4 940 (3.0 GHz, 4 cores)
  • Biostar TA790GX (AMD 790GX chipset, Radeon HD 3300 built in)
  • G.Skill 4 GB RAM (DDR2, 1066 MHz)
  • Western Digital 1 Terabyte HDD (32 MB cache, 7200 rpm)
  • Antec 300 ATX case and Antec 430 W PSU
  • Samsung 23" High Definition LCD Monitor (max. res. 1920 x 1080, 5 ms response time)
I don't know how much I saved, but I am sure that it's at least $300. I found really good deals. Anyways, adding up everything I paid, the sum was $650 for all of the above. I do not plan to overclock it, and I will not use it for gaming. The average CPU temperature is around 30°C, which is not bad. I did an Ubuntu 9.04 installation on it (with default options) and the following was the first error:

error 18 selected cylinder exceeds maximum supported by bios

I looked around for workarounds, and found nothing straightforward. A few fellows hinted that the boot partition shouldn't be huge because the kernel must be located in the first few gigs of the HDD. So, newbies who go with the default Ubuntu installation may well see this error (if they have a large hard drive and a BIOS like mine). Here is the simple workaround, assuming that it's a new system/build:
  • You'll need to re-install Ubuntu.
  • This time, choose the manual partitioning option.
  • The problem will be solved if you make a separate boot partition (/boot) at the beginning of the disk. Its size can even be 32 MB, but I chose 128 MB to be on the safe side (this has something to do with ppl who play with kernels).
Following is my new partition table:
  • /boot (128mb, primary partition)
  • / (20gb, primary partition)
  • swap (2gb, logical partition)
  • /home (900+gb, logical partition)

Benefits:

  • I can update my Ubuntu installation without messing with my home folder.
  • I don't really need swap, but I've got too much free space :p
  • Twenty gigs for the root partition (/) is enough for a default installation and plenty of software.
  • Above all, separating the boot partition helped me get rid of Error 18.

My next step is to use some hypervisor and ensure a separation of concerns. Primarily, I want to isolate my (future) webserver from everything else I'll be doing on this machine. I know Xen and VMware ESXi. I am going through comparisons of the two; with all I know by now, I might settle on Xen.

Above all, I am loving it :)

Thursday, May 28, 2009

If research papers had a comment section..

phd052709s

Tuesday, May 26, 2009

The next big thing.

It's going to be an Intel Quad Core based system; I'll get it all within hours. Here is the config:

Processor - Q9400 (6mb cache, 2.66ghz, 4 cores,1333 mhz)
Memory - DDR3 6 GB
Motherboard - Intel DP45SG
HDD - SAMSUNG Spinpoint 1TB F1 HD103UJ 7200 RPM 32MB Cache SATA 3.0Gb/s 3.5" Hard Drive
Power Supply - Coolmax M-500B 500 Watt ATX 12V
Graphics - MSI N95GT-MD512-OC GeForce 9500 GT 512MB 128-bit GDDR2 PCI Express 2.0 x16
Casing - Antec Three Hundred ATX Case

I am building this thing; it's fun. I am getting it all for $615, which is not bad at all! I was almost finalizing the Mac Mini, but within my budget, I was getting nothing better than a refurb Core Duo, 1 gig box; all crap.. I hope this one lives with me and my needs for at least a year.

p.s. Why did I buy one? Because I hate to shut down my notebook even once a month :p

Wednesday, April 08, 2009

Specialization is for Insects


The Origins of the Thesis

A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects.

-Robert A. Heinlein

just a joke, no offense to all my fellows around :)

Sunday, April 05, 2009

The bugs in my life

For over a week now, I've been dealing with numerous errors, exceptions, timeouts, and malfunctions related to the same problem. A few of them:

  • Error Code : 2006 - MySQL server has gone away,
  • Exception in thread "main" java.lang.OutOfMemoryError: Java heap space,
  • Error occurred during initialization of VM - Could not reserve enough space for object heap,
  • 100 thousand Null Pointer Exceptions,
  • java.lang.ArrayIndexOutOfBoundsException: Array index out of range ..
  • Error Code : 1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server ..
  • ssh: connect to host [server] port 22: Connection timed out
  • and many more :p

And in the meantime, I improved my track record of

  • Living with 1 coffee, 1 vitamin water and 1 meal per day..
  • Reworking the same 400 lines of code for the 5th time, and failing for the 5th time..
  • Thinking effectively for a complete solution .. while taking a bath..
  • Staring at random people and realizing the wrong after a bunch of seconds..

Friday, March 27, 2009

Talking about php frameworks

I am thinking of updating my blog. Here is the thing: I can do more productive stuff than searching for good looking Blogger templates.. I am an advocate of simple looking, text oriented blogs, but still, I hate the restricted and limited Blogger dashboard whenever I think of tweaking the layout somehow.

Anyways, a geek friend talked about CakePHP. I looked it up and it seems interesting. The last time I worked with PHP, there was no hype about PHP frameworks, and perhaps the only one I had heard about was by Zend. So, anyways, this world of frameworks is ranked here. After a couple of sneak peeks at Yii-powered sites, I declare it too boring to try out.. Next, I've got CodeIgniter, CakePHP, and Akelos.

On the other hand, I've got Ruby on Rails!

Now, I've worked with PHP and it's the easiest language I've worked with by far (considering the compiled languages I know). I wouldn't be comfortable working with a framework that might make the whole process a piece of cake (as in CakePHP), and at the same time I am thinking of sticking to PHP and doing more practical things with it instead of trying out something new like Ruby.

So, I'll think :) One stupid way of comparing them is the time it takes to make a blog with these frameworks: RoR takes 15 minutes, and CodeIgniter and Akelos take 20 minutes :p For a more intelligent comparative analysis, I'll look around for resources. I am planning to export all my posts from this blog to my new blog (I don't know how this works though) ..

Wednesday, March 25, 2009

OnLive

screen001

It’s a neat idea, but I doubt it will be the next big thing in the gaming world. It’s only targeting a small niche of gamers: those who have high speed internet connections!

The big question - why do I need it in the first place? If it’s not priced significantly lower than its big-fish competitors, then hardcore gamers would surely prefer the ugly big boxes over unexpected interruptions in their gaming caused by depending primarily on an internet connection.

[more about onLive]

Monday, March 23, 2009

An Honest Scammer

wow2

Saturday, February 14, 2009

honda student 1999 civic salman sedan jamali google thesis

A few days back, I bought a used car. It's an awesome deal, but still a used car, and hence I am looking for friends who are mechanics :p Anyways, I ran to Honda and asked them to inspect my car. They charged me $105 and, with a smiling face, smashed a bunch of repair recommendations on my face. So I was supposed to prioritize this list and tackle the items one by one under a limited, poor-ish and miserable budget. In the meantime, being a nerd googler, I was able to investigate a lot of the geeky terms of car mechanics, the common problems and their solutions. Just today, I realized that I know a lot more than I used to about a car's anatomy, e.g. I can refill my coolant, fuel my car by myself, etc.

Following are a few of my recent search strings on Google:

  • how to check fuse of auxiliary power outlet civic 99
  • cigarette outlet not working civic 99
  • is my coolant leaking? civic 99
  • what is oxygen sensor?
  • what is primary oxygen sensor 02 civic 99
  • how to attach a number plate on front bumper
  • lost my driver's license dmv reporting
  • zune fm transmitter
  • zune car kit
  • best fm transmitter reviews
  • pioneer car cd player deal
  • sony car cd player deal
  • magellon gps deals
  • garmin gps deals
  • garmin vs magellon vs tomtom
  • navteq versus teleatlas maps
  • samsung bluetooth headset deals
  • samsung wep200 vs wep500
  • samsung wep200 vs wep410
  • antifreez coolant autoparts
  • 1999 honda civic mpg
  • 1999 honda civic lx kbb
  • magellon 4210 review cnet
  • progressive insurance
  • what is liability insurance
  • geico
  • mr car wash
  • ..
  • ..

and it seems like this list will go on and on.. errr. Anyways, above all, I wish I could drive here the way we used to in Pakistan.. :D I don't like it, it's so peaceful here. I notice people actually waiting for the red to turn green and literally stopping when it's about to turn red! See, that's not the way, you're wasting fuel, your time and hmm.. some excitement :D We are not supposed to switch lanes with indicators and thank people for letting us cross.. it's all right, just show some aggression in your acceleration and they'll stop.. it always works :D

And what's up with these service stations here?! $90 just to check for the problem in the power outlet? $600-$700 to replace a timing belt that costs $40.. I know it's labor, but still.. it's so systematic that you just don't have any cheaper options, or perhaps I'll get to find a few eventually.

One more thing, every third mechanic is a Pakistani just like every third software engineer is an Indian and every third human being around is a Chinese :D

To summarize, after 2.5 months of craigslisting, I think I've got a nice deal! :)

tht's it for now, over n out.

Wednesday, February 04, 2009

Digg's new recommendation engine




Now this is what I call Science :D

Tuesday, February 03, 2009

Our jokes, about us.

I came across the following comics.. they all relate to the software development world. They might not be funny enough for non-techies, but they explain a lot about the source of our hopelessness :p

Suggestion: click on an image for a bigger size.


Sunday, February 01, 2009

.. and 694000 songs were downloaded illegally ..



a bunch of mind-blowing facts! But still, I am just wondering how this combinatorial explosion of facilities and ease of staying updated isn't enough for us to seek a few pretty simple reality checks on what's happening around us :p .. instead we rely on a bunch of cute & dumb anchors on the tv! grrr..

Sunday, January 25, 2009

A Green Leaf-Like Bug!

These images are the courtesy of a friend in Pakistan. I didn't try to discover if he loves photography, but I am sure that these are awesome captures. This would definitely amaze anyone like me; anyone living in a place where you waste 2 hours of planning after seeing a couple of dead cockroaches around the kitchen cabinets :p

Anyways, this thing is called Microcentrum retinerve. It sounds something like this; actually, this is how I can recall that it's very common around. Following is its classification:

  • Class Insecta
  • Order Orthoptera
  • Suborder Ensifera
  • Family Tettigoniidae
  • Species Microcentrum retinerve







[image courtesy: Yawar Abbas]

Sunday, January 11, 2009

How to quantify the frustration of our youth!

Ok. First of all, a disclaimer - I wasn't looking for one of these predicted results!



My query was like "how to know if a timing belt has been changed". After driving a car for 5-6 years back in my country.. all I learned about the technicalities of cars was limited to refueling, excellent gear shifting, and racing with the odd ones out in random traffic. I am also proud of the fact that because of me and my driving skills, a number of human beings sincerely thanked God that He saved them; who knows, that might have been a turning point in their lives towards submission to the Almighty :D

Anyways, I wanted to search for some articles about finding out the last time a car's timing belt was changed.. there was this Civic (yeh, still looking for a car), the owner didn't have any related receipts/records, so I thought that there must be some way for a mechanic to figure this out by a brief inspection.. that's it, over and out.

Friday, January 09, 2009

the ridiculous perspective.

Sometimes you want to sit & think, but sometimes you just want to sit. Sometimes you want to plan out things, but sometimes, you just don't want to think about plans. This isn't ridiculous, because I am sure that many a time we miss the 'kid' that we once were. That kid was so natural, free of troubles, and all stuffed up with naive patriotism and the desire to score high in games, only. Yes, I miss him, no shame!

But, above all, no one's to blame here; I know we're supposed to move on. I'll move on, Inshallah, but I have my memories.. so deep-rooted somewhere that no brainwashing detergent can erase their traces. I think of the recap as a seizure, a malfunction.. but when I recover from it, I always pop out with a lesson, a simple and clean guideline for the incoming tremors. And, I move on.

Still sometimes, I desire a free fall, as in skydiving; but sometimes, I just want to dive and fly away..

Monday, January 05, 2009

My car buying experience

First of all, I still couldn't buy one.. but anyways, I had 3 close encounters of actually using up my super duper budget of $3500 on the following cars (after analyzing 25+ cars):

1 - 1997 civic ex coupe automatic
2 - 1999 civic ex sedan automatic
3 - 2000 civic ex coupe automatic

For 1, we happily settled on a done deal. Later, the Carfax report told me that the car had a rolled-back odometer. Amazingly, we couldn't guess from the extremely humble speech of the seller that he was perhaps being untruthful, and that the rollback amounted to 45,000 miles!

Exhausted, as well as excited.. I went for 2: perfect history, affordable, I am all ready.. and there you go.. sold out! [yehh, and he didn't care to take the advertisement off the list!]

The biggest blow: number 3. Not affordable, but for the price a great deal. So, I thought I'd stretch my budget a little more. I did that. After a bunch of email ins and outs, we decided on a meeting time. I was desperate to just get it as it was, at any extra cost. And once again, I receive an email: "Sorry, sold out!" Perhaps what’s most disappointing about this one is that the seller listed it for ~$4000, I wanted to get it for $3600, and he sold it for $3400 !!! Yeh, I know something's wrong.. :(

So, I've got 20 more days to stay patient.. then my Carfax account will expire.. and I am not going to think about a car anymore..

Recommendation for craigslist: as soon as an item is sold-out, something should force the seller to close his advertisement.

Recommendation for buyers: as soon as you find out a good deal, don't wait for another good deal, don't push your luck :p

Recommendation for sellers: if you can't negotiate, don't face the buyers ;)

Friday, January 02, 2009

Status: sitting on notebook watching couch (& eating carpet)

~no comments!

Saturday, December 27, 2008

Tan Le Brings the Force to Life with Mind Control Device

The Entertainment Gathering 2008
Monterey, CA
Dec 12th, 2008

Tan Le, co-founder and president of Emotiv Systems, gives a live demo of a mind control device that uses a person's thoughts to input computer commands.

See demo video here!

PHP - tutorials and resources for all levels of expertise!

I had been planning to revise the things I did in my mere 4 months of professional/commercial work experience with PHP & MySQL (and Apache) on the Windows platform. I did a lot.. solved numerous complications, disciplined my interpreted programming language skills, and incorporated my ideas into a couple of projects..

What helped me the most? The experience and skills of my extraordinarily talented team lead;
What bored me the most? Coding the next .php without thinking of more durable and appealing solutions to recurrent problems..
What did I like? Scrum + Agility + Web Development .. exponential success!

But after 1.5 years, all I can recall are some echoes and the nightmares of being lost in some open-source OO PHP code. If I had the mountains of PHP code and MySQL stuff that I wrote in those 4 months, I would have been proud of my skills again in a mere 4 days..

Without it, I needed some resources for quick overviews that I could walk through while doing brainstorming for my thesis study 80% of my daily time.. So I found this digg.com entry:

25 Resources to Get You Started with PHP from Scratch
Dec 23rd in Web Roundups by Drew Douglass

I liked the smooth transition from overly simple walkthroughs to "Advanced and OOP techniques" stuff. Thanks Drew! This article can definitely forward anyone interested in almost any of PHP related things to the finest resources on the web.. give it a try!

over and out.

Thursday, November 13, 2008

If you can't look him straight in the eye

Following is the poem that Chuck Hagel, Nebraska's senior U.S. Senator, read out in his farewell to the U.S. Senate on 2nd October, 2008. Some consider the poem's actual author unknown, but this site claims that it's Peter "Dale" Wimbrow Sr. Anyways, it's so perfectly written that no one can just ignore it; yet, we can only wish..

When you get what you want in your struggle for self

And the world makes you king for a day,

Just go to the mirror and look at yourself

And see what that man has to say.

For it isn't your father or mother or wife

Whose judgment upon you must pass.

The fellow whose verdict counts most in your life

Is the one staring back from the glass.

You may be like Jack Horner and chisel a plum

And think you're a wonderful guy.

But the man in the glass says you're only a bum

If you can't look him straight in the eye.

He's the fellow to please--never mind all the rest,

For he's with you clear to the end.

And you've passed your most dangerous, difficult test

If the man in the glass is your friend.

You may fool the whole world down the pathway of years

And get pats on the back as you pass.

But your final reward will be heartache and tears

If you've cheated the man in the glass.

[source: http://hagel.senate.gov/public/index.cfm?FuseAction=Home.FarewellSpeech]

Friday, November 07, 2008

Subversion – Introduction

I have been working with Subversion for a while now, but I started right out of an immediate need, and hence without any conceptual background. This time, however, a lack of basic understanding hindered my learning, so I took out some time to skim through the most reliable source of information. I found this Subversion book titled "Version Control with Subversion", which can be downloaded here. It's time consuming to go through a 400-page book to learn about something secondary in importance to my (or any developer's) work, so I thought it might be useful to summarize things as I read them.

Following is the summarized selection I have extracted from the first chapter of the book..

A version control system tries to enable collaborative editing and sharing of data. Subversion is one such system. Different systems use different techniques to implement this collaborative environment; even Subversion supports a couple of different methods. It can manage any sort of file collection (not limited to source code). At its core, just like any other version control system, there is a repository; it stores information in the form of a file-system tree of files and directories (just like a typical file server). Clients connect to the repository to read and write files. The operations are synchronized, and each client sees the latest version of the files stored in the repository.

What distinguishes it from a file server, however, is its ability to remember all the changes ever made to any of the files or directories, as well as changes in the directory structure and the addition and deletion of files. The fundamental problem faced by all version control systems is this: how will the system allow users to share information, but prevent them from accidentally stepping on each other's feet? It's all too easy for users to accidentally overwrite each other's changes in the repository.

In sum, the problem reduces to the following question: how should the latest version of a file represent all the changes made by some writers while some reader is reading it? Anyways, two solutions have been proposed to this problem:

The Lock-Modify-Unlock Solution

Three problems:

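  1. Locking may cause administrative problems – you lock a file and go on vacation
  2. Locking may cause unnecessary serialization – two people who need to modify different portions of the same file still have to take turns
  3. Locking may create a false sense of security – Harry locks and edits A while Sally locks and edits B; if A and B depend on each other, their changes can still be incompatible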

The Copy-Modify-Merge Solution (Used by Subversion and many other systems)

In the fourth action, Harry failed to write the file back to the repository because his copy was based on an older version than the one currently in the repository. This is where the concept of “merge” comes in. Once Harry downloads the latest repository version, he merges the changes he made to the older version into it and writes the merged file back to the repository (A* above). Finally, Sally can read this updated file, which has her modifications as well as Harry’s. This solution is preferred over the Lock-Modify-Unlock solution in most cases, except when the files are sound files or other binary files, where it is almost impossible to merge changes made by multiple clients at the same time.
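
The copy-modify-merge cycle described above can be sketched with a toy, single-file repository. This is only an illustration of the idea, not how Subversion is implemented: the `Repository` class, the dictionary-based working copy and the line-by-line merge (which assumes nobody adds or removes lines) are all made up for this example.

```python
class OutOfDateError(Exception):
    """Raised when a commit is based on an older revision than the repository's."""


class Repository:
    """A toy single-file repository (hypothetical, for illustration only)."""

    def __init__(self, lines):
        self.revision = 1
        self.lines = list(lines)

    def checkout(self):
        # A working copy remembers the revision and content it was based on.
        return {"base_revision": self.revision,
                "base": list(self.lines),
                "lines": list(self.lines)}

    def commit(self, working_copy):
        if working_copy["base_revision"] != self.revision:
            raise OutOfDateError("working copy is out of date; update first")
        self.lines = list(working_copy["lines"])
        self.revision += 1
        working_copy["base_revision"] = self.revision
        working_copy["base"] = list(self.lines)

    def update(self, working_copy):
        """Merge repository changes into the working copy, line by line."""
        merged = []
        for base, mine, theirs in zip(working_copy["base"],
                                      working_copy["lines"],
                                      self.lines):
            if mine == base:
                merged.append(theirs)   # only the repository changed (or nobody did)
            elif theirs == base or theirs == mine:
                merged.append(mine)     # only the working copy changed
            else:
                raise ValueError("conflict: both sides changed the same line")
        working_copy.update(base_revision=self.revision,
                            base=list(self.lines),
                            lines=merged)


repo = Repository(["methodX: old", "methodY: old"])
harry = repo.checkout()
sally = repo.checkout()

sally["lines"][0] = "methodX: Sally's fix"
repo.commit(sally)                  # succeeds, repository moves to revision 2

harry["lines"][1] = "methodY: Harry's fix"
try:
    repo.commit(harry)              # rejected: Harry's copy is based on revision 1
except OutOfDateError:
    repo.update(harry)              # merge Sally's change into Harry's copy
    repo.commit(harry)              # now succeeds

print(repo.lines)
```

The final print shows both fixes in the repository. If Harry and Sally had edited the same line, the toy `update` would raise a conflict instead, which is the point where a human has to step in.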

SUBVERSION - IN ACTION

Repository URLs

You can access Subversion repositories through many different methods (on local disk or through various network protocols), depending on how your administrator has set things up for you. A repository location, however, is always a URL. Table 1.1, “Repository access URLs”, in the book describes how different URL schemes map to the available access methods.

Working Copies

Subversion has this concept of a working copy, which is essentially an up-to-date copy of the project source code that you’ll download (or check out) from the underlying repository. If the repository has multiple projects, then you’ll need to give the exact URL of the project subdirectory when issuing a checkout command to SVN, something like this:

svn checkout http://svn.example.com/repository/my_project

Now, there are two possibilities:

  1. You can checkout/download the source-code of my_project in some ordinary directory; you’ll need to get some typical SVN client to do so. I use TortoiseSVN, which can be downloaded here.
  2. Alternatively, you can checkout/download the source-code into some workspace project’s source folder of the IDE you use for development. I use Eclipse Ganymede these days, and the Subclipse v1.4 plugin does the job for me; it can be downloaded here.

It’s easy to get used to the synchronization mechanisms adopted by Subversion. Most of the time, the only operations one will use as a developer are Commit, Update, Merge, Compare, Restore, etc. I won’t get into the details of each command here (respecting the order of knowledge in the book). Once you’ve checked out the source code, it’s time for you to play around with it just like you can with any of your local projects. The important thing to note here is that whatever you modify will not have an effect on the original source code in the repository, hence the name ‘working copy’. But, at some point in time, you’ll need to incorporate (or commit) your changes into the original files in the project repository. This operation is known as COMMIT, and to do it, you’ll execute a command like this:

svn commit MyMainClass.java -m "Fixed a bug in main class"

If all goes well, you’ll be shown a confirmation that your changes have been committed and the repository is now updated. So, the next time you check out the same my_project code from the repository, you’ll see your updated MyMainClass.java code. There is just one last thing that needs to be understood. Consider the following case:

  1. You and some other developer checkout my_project at the same time (hence the same version).
  2. You make changes to MyMainClass.java’s methodX, and, you commit the code.
  3. Then, the other developer makes his/her changes to the same class’s methodY, and tries to commit. This commit however, will fail.

Reason: he/she is trying to commit a modified yet NOT up-to-date version of the file to the repository. It’s no longer up to date, because the latest version in the repository is YOUR MyMainClass.java.

To get over this problem, the other developer will issue an update command, like this:

svn update

Subversion will then automatically TRY to update this developer's working copy by incorporating the changes made by you (or any other changes in the latest version). The developer, however, will need to make some manual modifications if he/she was also changing the same method (i.e. methodX) or section of code.

Switching to Carrot2.. for now, over & out.

Sunday, November 02, 2008

Stage set - Hillary, Incominggg..

Venue: George Mason University, Fairfax, VA
Date: 2nd November, 2008

Saturday, November 01, 2008

Introduction to Carrot2 Clustering Engine/API


I was pretty much impressed by the easily comprehensible yet powerful facilities provided by this component based clustering engine; thanks to its developers for releasing its source code. Unlike this open-source engine, there is another clustering engine by Vivisimo, called Vivisimo Velocity, which is commercial and hence helps us in no way. Anyways, to see the best that can be done with VV, try Clusty, which efficiently clusters search results using the Vivisimo clustering technology. However, Clusty isn't state of the art in coming up with semantically near-perfect cluster labels, as discussed here.

Anyways, let's appreciate opensource - back to Carrot2.

So, Carrot2 is an Open Source Search Results Clustering Engine. It can automatically organize (cluster) search results into thematic categories, called clusters.

Carrot2 provides an architecture for acquiring search results from various sources (YahooAPI, GoogleAPI(deprecated), MSN Search API, eTools Meta Search, Alexa Web Search, PubMed, OpenSearch, Lucene index, SOLR), clustering the results and visualising the clusters. Currently, 5 clustering algorithms are available that are suitable for different kinds of document clustering tasks.


The architecture of Carrot2 is based on a pipeline of components of three types: input components, filter components and visualization components. The task of input components is to provide search results for clustering based on a user query. Filter components transform the results in some way (e.g. apply clustering, case normalization), and the visualisation components render the transformed results for the user.
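
To make the three component types concrete, here is a rough sketch of such a pipeline. It is NOT Carrot2's API: every function name, the canned search results and the keyword-based "clustering" are invented for illustration, and a real clustering algorithm like Lingo is far more involved.

```python
from collections import defaultdict


def input_component(query):
    """Stand-in for a search source: returns canned result titles for a query."""
    canned = {
        "jaguar": ["Jaguar Cars official site", "jaguar habitat facts",
                   "JAGUAR cars price list", "Jaguar habitat and diet"],
    }
    return canned.get(query, [])


def normalize_case(results):
    """Filter component: lower-case every result."""
    return [title.lower() for title in results]


def cluster_by_keyword(results, keywords=("cars", "habitat")):
    """Filter component: naive grouping by the first known keyword in a title."""
    clusters = defaultdict(list)
    for title in results:
        label = next((k for k in keywords if k in title.split()), "other")
        clusters[label].append(title)
    return dict(clusters)


def render(clusters):
    """Visualization component: turn clusters into printable text."""
    lines = []
    for label, titles in clusters.items():
        lines.append(f"{label} ({len(titles)})")
        lines.extend(f"  - {title}" for title in titles)
    return "\n".join(lines)


def run_pipeline(query, filters):
    """Chain the components: input -> filters (in order) -> visualization."""
    results = input_component(query)
    for apply_filter in filters:
        results = apply_filter(results)
    return render(results)


print(run_pipeline("jaguar", [normalize_case, cluster_by_keyword]))
```

The nice thing about this shape is that swapping the search source or the clustering step only means replacing one function in the chain, which is roughly the appeal of a component based design.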



I have successfully walked through the most basic example application of this API. It simply uses the Yahoo API and the Lingo clustering algorithm to cluster the results of a certain query. The best way to understand the mechanics of Carrot2 and to make the most of its abilities is to follow the code while studying the comprehensively written javadoc documentation.

So, I did so.. but for now it's about time for me to shut down my brain for the next 4-5 hours.. I'll continue this post and will try to put forward a precise text extracted from those comments in the next part, explaining in detail the things one needs to know to get started with Carrot2!

for now, over and out!

Friday, October 31, 2008

Poke my name!

SALMAN is the 1,565th most popular name in the USA. One in every 12,107 Americans is named SALMAN, and the popularity of the name SALMAN is 82.6 people per million.

If we compare the popularity statistics of SALMAN to the USA's population statistics, we can estimate that as of October.31.2008 01:04 there are 25,216 people named SALMAN in the United States, and the number of SALMANs is increasing by 217 people every year.

Usage of salman as a first name is 75.86% and its usage as a middle name is 24.14%. The sum of alphabetical order of letters in SALMAN is 60 and this makes SALMAN arithmetic buddies with words like Pure, Active, Dapper, Holy.
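
The "arithmetic buddies" claim is easy to check. The little sketch below assumes the usual A=1 ... Z=26 numbering (the site doesn't say which numbering it uses) and also redoes the "one in every N" figure from the 82.6 per million number quoted above.

```python
def letter_sum(word):
    """Sum of alphabet positions (A=1 ... Z=26), ignoring non-letters."""
    return sum(ord(ch) - ord("A") + 1 for ch in word.upper() if "A" <= ch <= "Z")


for word in ["SALMAN", "Pure", "Active", "Dapper", "Holy"]:
    print(word, letter_sum(word))   # every word prints 60

# 82.6 people per million means roughly one person in every 12,107.
one_in_every = round(1_000_000 / 82.6)
print(one_in_every)                 # 12107
```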


Yes :p
You get all this information about your name here

By the way, I was experimenting with (building from source in the Eclipse IDE) Carrot2's open-source clustering search engine demo browser. I was frustrated and tired; [so,] the first thing I searched for was my name, and some cluster pointed me to this interesting junk :D

Wednesday, October 22, 2008

UnChrome your Chrome

According to Google, "Google Chrome is a browser that combines a minimal design with sophisticated technology to make the web faster, safer, and easier". Unfortunately, each Google Chrome installation contains a unique ID that allows its user to be identified. Google doesn't make it easy to remove this ID.

The UnChrome application was designed to help you with this task. It replaces your unique ID with null values so that your browser can no longer be identified. The functionality of Google Chrome is not affected by this. You only need to apply UnChrome once.

If Google Chrome is currently running, please close it first. Afterwards, click on the link below to anonymize your Chrome installation.


Article Link
Download Link

Sunday, October 12, 2008

Messing around Facebook..

What's the quickest way to find out the answers to the following questions about Facebook users:

1- How many male computer science graduates from my *university* aged between 18 and 25 are registered on facebook?

2- How many male computer science graduates from my *country* aged between 18 and 25 are registered on facebook?

3- How many *female* graduates from my university aged between 18 and 25 are registered on facebook?

4- How many graduates from my university are registered on facebook?

The quickest thing that came to my mind was to use the API somehow; however, there is something else that Facebook wants advertisers to use to answer such questions!! But hey, anyone can turn out to be an advertiser on Facebook :D

So, yeah! It's pretty easy, and thanks to Alec Saunder for coming up with this trick!

To answer countably infinitely many such questions, follow the steps that he has neatly outlined on his blog here.

Summarizing it, there is no big trick here. Facebook provides each advertiser with a query builder that immediately returns *ONLY* the number of results for a specific input query. With a little investigative approach and an excessive show of curiosity, you can use this tool to build a strong data set of Facebook statistics, which, by all means, can be called a representative sample of all World Wide Web users!

Following is a depiction of my lack of creativity in using this tool :p

There are 2,440 18-year-old males from China registered on Facebook.
There are 1,143,180 18-year-old males from USA registered on Facebook.
There are 31,020 18-year-old males from India registered on Facebook.
There are 11,400 18-year-old males from Pakistan registered on Facebook.
There are 267,760 18-year-old males from UK registered on Facebook.
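Once a few of these counts are written down, they can be treated as a tiny data set. This is only a rough sketch of that idea: the numbers are the five counts listed above (typed in by hand, since the query builder only shows a count on screen), and the names `counts_18yo_males` and `ranked` are my own:

```python
# Counts of 18-year-old males per country, copied by hand from the list above.
counts_18yo_males = {
    "China": 2_440,
    "USA": 1_143_180,
    "India": 31_020,
    "Pakistan": 11_400,
    "UK": 267_760,
}

# Total across these five countries only (not all of Facebook).
total = sum(counts_18yo_males.values())

# Rank countries from largest to smallest count.
ranked = sorted(counts_18yo_males.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    for country, n in ranked:
        share = 100 * n / total
        print(f"{country:<10}{n:>12,}  {share:5.1f}% of these five")
```

The shares are relative to these five queries only, so they say nothing about countries that were not queried.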

Data Mining in Social Networks

With huge social networks around, I felt it might be easy to apply some typical data mining algorithm to any dataset that I could extract from these networks using their released APIs.

So, it turned out that it isn't child's play, for two reasons:

1- the data from social networks is relational, and hence data objects are linked in one way or another. Contrary to this, in typical propositional data, e.g. patient records, data objects are independent of one another. So, a different breed of mining algorithms is required (e.g. Bayesian classifiers).

2- the process of extracting the data has to take into consideration the legality of the technique, and other limitations.
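To make the first reason a bit more concrete, here is a toy sketch of the difference between propositional and relational data. All of the records, user names, and helper functions below are made up for illustration; this is not the method from any particular paper, just a "look at your neighbors" heuristic that only makes sense when objects are linked:

```python
from collections import Counter

# Propositional data: each record stands alone (made-up patient rows).
patients = [
    {"id": 1, "age": 34, "smoker": True},
    {"id": 2, "age": 51, "smoker": False},
]

# Relational data: objects plus links between them (made-up users).
labels = {"ann": "cs", "bob": "cs", "cat": "math", "dan": "cs"}
friends = {
    "ann": ["bob", "dan"],
    "bob": ["ann", "cat"],
    "cat": ["bob"],
    "dan": ["ann"],
}


def neighbor_label_counts(user: str) -> Counter:
    """Count the labels of a user's friends; this needs the links, not just the row."""
    return Counter(labels[f] for f in friends[user])


def majority_neighbor_label(user: str) -> str:
    """Guess a user's label from the most common label among their friends."""
    return neighbor_label_counts(user).most_common(1)[0][0]


if __name__ == "__main__":
    for user in friends:
        print(user, dict(neighbor_label_counts(user)), majority_neighbor_label(user))
```

Nothing like `neighbor_label_counts` can be computed for the patient rows, because there is nothing connecting one row to another; that extra structure is what relational algorithms have to deal with.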

Anyways, we've got to find something for our data mining class project. I jumped into Web Mining from the typical and *boring* projects based on UCI data sets. Then, I started noticing interesting approaches to mining social networks.

So, I was looking for an introductory paper on mining social networks and I found this paper hosted by the Knowledge Discovery Laboratory of the Computer Science department at Purdue University, here. It really serves the purpose, and the title "Data Mining in Social Networks" makes it more interesting for newbies. The authors are David Jensen and Jennifer Neville. I would recommend that anyone interested in a decent introduction to the subject go through this paper once!

~over and out

Tuesday, September 30, 2008

Underwater astonishments - Camouflaging Octopus Footage

I have always felt proud of my inquisitiveness and investigative approach to dig deeper into almost anything that amazes me (yeah, anything!). I have been very aggressively following the things people share on Digg, Dzone, and Youtube; and I usually get to know more about them, beyond the content that's shared..

But somehow, I don't know how, I missed this one. It was shared by a very good brother, and literally, this was beyond most of the great things I have ever seen or heard about. For me, being a computer freak, this was beyond calculations or algorithms.. and I am sure it's jaw-dropping footage of the beauty of nature, mashallah, for everyone out there associated with any sort of engineering or non-engineering field of study. The only comment I can put into words is that it's not out there without a purpose, it's not a coincidence, and it's not yet another illusion; but it's a real-world miracle of Allah, a fascination that carries numerous signs with it (at least for me).

Anyways, so it's an Octopus (Vulgaris) that, for various reasons, uses its ability to match the pattern, color, brightness, and texture of apparently anything that it's resting upon. Above all, this is the first time it's been caught on camera.
See the following video:



Following is the complete talk given by David Gallo. "David Gallo works to push the bounds of oceanic discovery. Active in undersea exploration (sometimes in partnership with legendary Titanic-hunter Robert Ballard), he was one of the first oceanographers to use a combination of manned submersibles and robots to map the ocean world with unprecedented clarity and detail.
He was a co-expedition leader during an exploration of the RMS Titanic and the German battleship Bismarck, using Russian Mir subs. On behalf of the Woods Hole labs, he appears around the country speaking on ocean and water issues, and leading tours of the deep-ocean submersible Alvin."