Using Twitter to Locate Justin Bieber’s Heart
Location-based services and unreliable user data.
Although much human activity—from communication to commerce, and from work to dating—has migrated online, physical location is still relevant in the digital world. As more online activity intersects with geography, location-aware systems have become prevalent, raising questions for service providers and users alike.
When people join social networks and other online platforms, they’re often prompted to specify a location as part of their user profile. Long assumed to contain reliable information, the location field has been used in the development and delivery of many online services.
But in a first-of-its-kind study, University of Minnesota assistant professor Brent Hecht and co-authors determined that 34% of user profiles on Twitter contain invalid or no geographic information in the location field. Invalid responses included imaginary places (“over the rainbow”), non-specific places (“in jail”), otherworldly locations (“Jupiter”), expressions related to privacy (“none of your business”), and a surprising number of references to pop culture (“in Justin Bieber’s heart”). Such a high rate of unreliable entries is likely to reduce the accuracy and utility of location-based services, from search results to targeted advertising.
Building on this discovery, the researchers wanted to see if location could be inferred in the absence of GPS or reliable user-provided location data. The study found that the content users shared via Twitter implicitly revealed location information, and that machine-learning techniques could use that information to predict location with surprising accuracy.
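To make the idea concrete, here is a minimal sketch of how words in tweets might be used to guess a user's region. It is not the study's method: the summary above does not describe the classifier or features the researchers used, so this sketch assumes a simple multinomial Naive Bayes model with add-one smoothing. The function names (tokenize, train, predict), the toy tweets, and the region labels are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data: (tweet text, region) pairs invented for illustration.
TOY_TWEETS = [
    ("traffic on the freeway again", "California"),
    ("beach day and tacos after work", "California"),
    ("snow on the lake this morning", "Minnesota"),
    ("hockey night and hotdish after work", "Minnesota"),
]


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_tweets):
    """Count how often each word appears under each region label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled_tweets:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary


def predict(text, model):
    """Return the region whose word statistics best explain the tweet."""
    word_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior: share of training tweets that carry this label.
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are skipped
                # Add-one smoothing keeps unseen word/label pairs above zero.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_TWEETS)
print(predict("tacos on the beach", model))
print(predict("hockey on the lake", model))
```

With only four training tweets the output shows nothing about real-world accuracy. The point is that ordinary words, with no explicit place names, can carry a regional signal that a model can learn from.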
This work has immediate relevance for online systems that deliver location-based services: the high rate of unreliable location information suggests that such systems may need to be redesigned to improve performance. In the longer term, the findings show that implicit cues in user behavior can be combined with machine-learning techniques to predict information that could improve services for users. At the same time, this discovery raises questions about privacy, security, and how to protect users from malicious inferences.
Hecht, B., Hong, L., Suh, B. and Chi, E.H. (2011). Tweets from Justin Bieber’s Heart: The Dynamics of the “Location” Field in User Profiles. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2011), pp. 237-246. New York: ACM Press.