by: danah boyd

Over 40 people sent me a link to the New Scientist article: Pentagon sets its sights on social networking websites. First, thank you. Second, ::sigh:: I wish that i could say that i'm shocked, but i'm not. Still, i want to address it.

Those of you who knew my work at MIT knew that i was obsessed with the socio-structural information one could derive from email correspondences. Jeff Potter and i put together Social Network Fragments to visually convey how much information was available through just a single person's email archives. If you've ever CCed anyone, you've told everyone on that list meaningful information about your connections. Individual archives hold meaningful data about dozens of people's social networks. To show this, we put our visualization up at a gallery in New York to make a statement about the privacy implications. Of course, one person's data is nothing compared to the data that AOL, Hotmail and Yahoo! have. We couldn't find anyone who had never sent or received an email from each of those companies. I'd guess that each could generate a pretty decent model of the entire nation's social network.

Part of what makes email networks so powerful is the redundancy. It's not just the one email you received from someone or the fact that you have that person in your address book, but the fact that you have an ongoing dialogue. Repetitive CC patterns are also super informative. What emerges is pretty fascinating - you can see who operates as bridges and start to get a sense for different functioning clusters and the power of structural holes. Spam is blatantly apparent, but you can also find breakups and love affairs without even getting into content analysis.
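
To make the redundancy point concrete, here is a rough sketch of how repeated co-occurrence in message headers could be turned into a network. It is not the Social Network Fragments code; the sample `messages`, the names in it, the `min_count` cutoff and every function name are made up for illustration, and "bridge" is approximated as a person whose removal splits the network into more pieces.

```python
from collections import Counter
from itertools import combinations

# Hypothetical header data: (sender, recipients incl. CCs). No message content.
messages = [
    ("ann", ["bob", "cat"]),
    ("bob", ["ann", "cat"]),
    ("dan", ["eve", "fay"]),
    ("eve", ["dan", "fay"]),
    ("cat", ["dan"]),
    ("dan", ["cat"]),
    ("gus", ["ann"]),  # a one-off contact
]

def tie_strengths(messages):
    """Count how many messages each pair of people appears on together."""
    counts = Counter()
    for sender, recipients in messages:
        people = sorted({sender, *recipients})
        for a, b in combinations(people, 2):
            counts[(a, b)] += 1
    return counts

def build_graph(counts, min_count=2):
    """Keep only repeated ties (assumed cutoff), dropping one-off contacts."""
    graph = {}
    for (a, b), n in counts.items():
        if n >= min_count:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph

def components(graph, skip=None):
    """Connected groups of people, optionally pretending `skip` is absent."""
    seen, parts = set(), []
    for start in graph:
        if start == skip or start in seen:
            continue
        stack, part = [start], set()
        while stack:
            node = stack.pop()
            if node in part:
                continue
            part.add(node)
            stack.extend(n for n in graph[node] if n != skip and n not in part)
        seen |= part
        parts.append(part)
    return parts

def bridge_people(graph):
    """People whose removal increases the number of separate groups."""
    base = len(components(graph))
    return sorted(p for p in graph if len(components(graph, skip=p)) > base)

graph = build_graph(tie_strengths(messages))
print(bridge_people(graph))          # the people holding two clusters together
print(components(graph, skip="cat")) # the clusters once one bridge is ignored
```

Nothing here reads a single message body. Who appears with whom, and how often, is enough to separate repeated ties from one-off contacts and to see who connects otherwise separate groups.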

The government asked us to engage with them to help them track terrorists; we refused. But, given where we presented this work and who was in the audience, i can't pretend as though this work didn't help them think of ways to make sense of communication pattern networks and this has often haunted me. I'm all in favor of tracking down malicious individuals, but, as we've seen with the AT&T case, the government is happy to step all over individual privacy in the process. I understand why network researchers want to work for government agencies: infinite funding, computation power and the ability to access massive data sets. Still, i could not do it, as intriguing as the work is.

While our work was fascinating, in order for the big questions to be asked, you'd need to get one of the major three email providers to turn over their data. My hope was that this would never happen, but i have to say, i never thought a telco would sleep with the NSA. The fact of the matter is that the data that the NSA has because of AT&T is far far far more powerful than what they can derive from MySpace or other social networks. Why? It is behavior data, not articulated data. I will come back to that in a moment. But think for a moment... even if you don't subscribe to AT&T, do you know who your friends subscribe to? All of their data is included which means your conversations with them are too. Thus, even if you're like me and are boycotting AT&T, you're still in the system. We all are.

So behavioral and articulated... we've talked about this before, but there's a huge difference between saying you're friends with someone and actually being friends with them. Of course, this probably doesn't matter in a McCarthy era where any thread that connects you to a Communist is good enough. Friends on MySpace are equivalent to all the other familiar strangers you interact with every day - shopkeepers, cabbies, waiters, etc. If the government is really trying to gather information, they cannot be stupid enough to think that your list of 9000 friends is meaningful, but people have been accused of patronizing the wrong stores before. Of course, the value in the bazillion friends is that it provides starting information to find network clusters. In other words, it's one thing if you're friends with Lucky, but it's another if all of your friends are also friends with Lucky. What is most interesting though is if all of your friends are friends with Lucky and you aren't... experience has shown me that you and Lucky were once good friends/lovers and are now not on the best of terms.
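
The Lucky pattern is easy to state as a query over friends lists. The sketch below is only an illustration: the `friends` data, the function name and the 0.8 `threshold` are assumptions, and it does not touch any real MySpace interface. It flags people who are connected to most of your friends but not to you. What that gap means (old friends, ex-lovers, nothing at all) is a human interpretation the code cannot supply.

```python
from collections import Counter

# Hypothetical articulated friends lists (treated as mutual).
friends = {
    "you":   {"ann", "bob", "cat"},
    "ann":   {"you", "bob", "lucky"},
    "bob":   {"you", "ann", "lucky"},
    "cat":   {"you", "lucky", "dan"},
    "lucky": {"ann", "bob", "cat"},
    "dan":   {"cat"},
}

def conspicuous_absences(friends, person, threshold=0.8):
    """Non-friends of `person` who are friends with at least `threshold`
    (an assumed cutoff) of that person's friends."""
    mine = friends.get(person, set())
    if not mine:
        return []
    shared = Counter()
    for friend in mine:
        for other in friends.get(friend, set()):
            if other != person and other not in mine:
                shared[other] += 1
    return sorted(name for name, n in shared.items() if n / len(mine) >= threshold)

print(conspicuous_absences(friends, "you"))
```

In this toy data "lucky" is linked to all three of "you"'s friends and gets flagged, while "dan", who knows only one of them, does not.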

There is also a lot of other public data in MySpace that is meaningful and i've been using this for analysis purposes. Top 8s are quite significant... even more so when they change. Combined with the Top 8s of those people (etc.), you can start to get a really meaningful picture of cliques. Comments are meaningful (except for the "Thanks for the add" ones) but picture comments are even more meaningful and repeat comments from the same person are the most meaningful. By complementing friends lists, this material provides a layer of behavioral data on top of the articulated data.

All this aside, what bothers me the most about this is the fact that the government thinks this is OK just because it's possible to do. Some people will immediately argue that of course they should, it's public data! Whenever you leave your home, someone could track your movements, marking every time you enter and leave different buildings, marking what you're wearing and who you're speaking with in public, etc. People hire PIs to do precisely this (often when they assume their partners are cheating). No one would be cool with a government snoop sitting on every street corner marking the public paths of every citizen just because they could. Luckily, the overhead of this is so outrageous that we only do it when we are really concerned about a particular individual. Networked technologies not only make this easier, but they also make the snoop invisible. Problematically, people don't sweat the invasion so much because they can't see it.

An argument that people make is that you should have nothing to fear if you've done nothing wrong. This is sooooo irritating. First, this is only true if you are interested in upholding hegemonic cultural norms. The adorable gay couple next door are doing nothing wrong in my eyes, but their kissing is all sorts of problematic to a government that wants to ban their right to love each other. Aside from queer life, think about all of the decisions you made that aren't necessarily "normal" even if many of us live a pretty privileged life. Second, there's a difference between illegal and not exactly the best impression. I want the ability to pick my nose when i don't think anyone's looking and i don't want a camera to capture me scratching my ass on a cigarette break outside of work. That's just plain embarrassing. I don't want to always smile or stand up straight or pretend like i'm in a good mood just because an image might go down on my permanent record. That's just plain exhausting. Third, everything is context dependent. I've done nothing wrong when i stumble out of 1015 drunk as hell and hail a cab, but my drunken stumble is not something that i want to expose to my advisor or, frankly, the government. These are the types of images that people turn around to accuse me of being a bad citizen or clearly guilty of something else.

I will never forget sitting in the courtroom when my stepfather countersued my mother and accused her of cheating on him. We were all dumbfounded - i didn't think my mother had cheated and she was pretty sure she hadn't so we were all curious what this magical evidence was going to be. Apparently, he had hired a PI and he'd snapped photo after photo of... *my* high school boyfriend. Rob always looked older, but the fact that someone thought that my mom was dating him had me laughing for days. Yet, the humor of this paled in comparison to one utterly hysterical photo. It was taken pretty late at night and there was Rob walking the dog out back of our apartment near the woods. The dog was squatting and peeing and Rob was holding on to his penis peeing about 2 feet away from the peeing dog. This picture went down on the divorce record. I often tried to imagine how his Naval officer would've felt about this image.

Just because things can be made persistent or information about people's social lives can be revealed does not mean that it should be done. What the government is doing is not simply watching people in public - they are taking this data and computationally analyzing it to get to the core of people's practices. This is an invasion of privacy and an act of intense surveillance where the government is spying on its own people. They are doing so without a warrant and justifying it by saying that it is public. Just because people act in public does not mean that it should be stored, analyzed and graphed. Of course, i doubt the law is on my side on this one - it was not written for a world in which such data would be so easily accessible and most of the law concerns the collection of data, not the analysis of it.
