Is it possible that Twitter’s users, rather than Twitter itself, are to blame for the micro-blogging platform’s relatively useless search engine? Perhaps. According to new research by Twitter’s data science team, Twitter search is often used as a tool for finding breaking news in real time, which makes it difficult for Twitter to assign lasting relevance to any given tweet or topic. So while the world bemoans Twitter search as useless, maybe we’re doing so through last generation’s Google-colored glasses, which don’t let us see Twitter for what it is and the challenges it faces.
In a Twitter Engineering blog post explaining the findings, analytics research scientist Jimmy Lin describes the problem of ranking tweets by relevance as partly a problem of time. In the case of breaking news, the system is simply overwhelmed by tweets and queries on that topic, which means Twitter’s relevancy models can’t always keep up in determining which tweets you probably want to see. While it’s relatively easy to build a simple search algorithm using “term frequency-inverse document frequency” (tf-idf) weighting when the overall corpus of documents is fairly static, it’s a lot harder when terms suddenly surge in popularity and a system has to constantly re-process the dataset in real time.
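To make the tf-idf point concrete, here is a minimal Python sketch of how a term's weight can shift when it suddenly surges. It is a toy illustration, not Twitter's actual ranking code: the corpus, the burst of tweets, the function names and the smoothed idf formula are all assumptions chosen for the example.

```python
import math
from collections import Counter


def idf(term, docs):
    """Smoothed inverse document frequency (one common variant, assumed here)."""
    n_docs = len(docs)
    doc_freq = sum(1 for doc in docs if term in doc)
    return math.log((1 + n_docs) / (1 + doc_freq)) + 1.0


def tf_idf(term, doc, docs):
    """Term frequency within one document, weighted by idf over the corpus."""
    tf = Counter(doc)[term] / len(doc)
    return tf * idf(term, docs)


# Hypothetical corpus of tokenized tweets before a news event.
corpus = [
    ["coffee", "morning", "commute"],
    ["earthquake", "felt", "downtown"],
    ["lunch", "plans", "downtown"],
    ["morning", "run", "park"],
]
tweet = corpus[1]
score_before = tf_idf("earthquake", tweet, corpus)

# Hypothetical burst of tweets once the story breaks.
burst = [
    ["earthquake", "now"],
    ["earthquake", "shaking"],
    ["big", "earthquake"],
    ["earthquake", "update"],
]
corpus_after = corpus + burst
score_after = tf_idf("earthquake", tweet, corpus_after)

print(f"before burst: {score_before:.3f}")
print(f"after burst:  {score_after:.3f}")
```

The same tweet scores lower for “earthquake” after the burst, because the term now appears in most of the corpus and its idf drops. In a static collection the weights could be computed once; with bursts like this they have to be recomputed as the corpus changes, which is the difficulty the paragraph above describes.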
These numbers from Twitter’s research help explain the problem:
Of course, this particular phenomenon doesn’t explain why Twitter’s search doesn’t go back further in time, or why its algorithms for ranking tweets based on source or the number of times they’ve been retweeted don’t appear to be very accurate. Even if relevancy improves, there’s still a lot to be desired in terms of getting Twitter to return the types of results users have come to expect.
Lin’s post also highlights another piece of research from Twitter that’s less noteworthy to individual users but probably more telling about the world as a whole. A visualization of Twitter usage patterns in New York City, Tokyo, Sao Paulo and Istanbul creates a picture of cultural and seasonal differences at play.
Twitter users in Tokyo, we see, tweet a lot less during the work day and also go to bed and wake up at about the same times throughout the year. Elsewhere, users show pretty distinct differences in activity as the seasons change. Lin also points out the afternoon lull in Sao Paulo. It’s difficult to discern the exact reason from looking at this chart, but the lull does coincide with Sao Paulo’s winter season and a generally later beginning to the tweeting day.
I’d love to see these results analyzed against other cultural datasets, or even just against a knowledge base of local customs and behaviors, to see how Twitter use — and web use, generally — comports or doesn’t comport with a region’s typical norms.