Consider the "friends/foes" feature of a typical set of online forums — you simply store who you like and who you do not like. Maybe you can get a notification of whether your friends are online. Maybe you ignore posts by your enemies, or have a separate, smaller ignore list for that purpose. This functionality is a degenerate form of a social graph. It is a first-order approximation that, if not generalized to use graph theory, precludes more powerful functionality.
A graph is simply a set of nodes that are connected by edges. The nodes can contain any information (but usually have the same data type) and the edges can contain information (they too usually have the same data type).
A social graph is simply a graph where each node is a person and each edge is a relationship between two people. We can make this graph more specific by making it a directed graph
with two edges between each person; one pointing from person A
to person B
and one pointing in the reverse direction.
At this point, our social graph contains all the proper "containers" for our original friends&foes list. It contains a place to store any information that relates two people. However, that would just be a "first order approximation" — that is, if one starts at one's own (the "root") node and only looks at the other nodes directly connected
What if we were to take an existing social graph and start traversing out from the root node more than one step? All of a sudden, we would be able to see the friends' list of our friends (but probably not of our enemies). This is what happens with Facebook.
Now, just traversing the graph manually is somewhat interesting, but we can look at the connectivity of a social graph and start understanding the "cliques" of friends. One can look for denser connectivity and do cool things like the Facebook Friend Wheel
. This, in and of itself, is quite cool. It would allow, for instance, testing of the six degrees of separation
meme. But more is possible.
higher order operations
What if we were to take the doubly-connected, directed graph discussed above and start encoding some sort of trust information? For example, I might trust a few other people's opinions very highly. Maybe they
opinions highly as well; while I might not want to trust these "second degree" people quite as much as my friends do, I will probably still give some credence to my friends' judgment. Therefore, if I store some kind of trust number in the graph's edges, I can apply some sort of decaying function (lower multiplier for higher order) to it and get a "trust metric" out. Then, I can, say, sort search results by this trust metric and see opinions that I am more likely to value, bubble to the top.
Now, a single integer to describe trust is very coarse and probably not too useful. What if we actually have a whole array of numbers? Perhaps we have different opinions of our friends' ideas on politics, religion, and technical skills (and those split into many subcategories). Why not simply have a whole array of "trust metrics"? There is no need to keep this system one-dimensional.
We can even get more
complicated and establish coupling between different trust metrics. For example, someone who is knowledgeable about C# and Haskell might be likely to know at least a little bit about Lisp. This means, with enough values for a given person, we might be able to infer
other values. This is a relatively unexplored phenomenon; it is perhaps most explored by Netflix with their movie recommendation system. The contest for that system has definitely established that people's preferences are more complicated than one might like to admit!
gaming & feedback
Just like any computer system, the social network described above would be ripe for gaming. People would strive to get high trust ratings; they will do this to feel good, to get popular, or to just game the system. There are two ways to prevent this:
- highly limit the information available that isn't one's own trust preferences
- a feedback system
- include human moderator input
- try to characterize the simpler problems so that they can be algorithmically handled
- the algorithm might simply make suggestions to the moderators