Early in January every year, nearly a thousand people who study how language works flock to the annual meeting of the Linguistic Society of America, held jointly with six smaller groups under its wing, including the Society for the Study of the Indigenous Languages of the Americas, the Association for Linguistic Evidence, and of course the American Dialect Society.
This year they migrated to Portland, Ore., for meetings January 8 through 11. There were hundreds of talks on the workings of language in general and of specific languages in particular. A typical example of a title would be the presidential address by Joan Maling of Brandeis University: “A Syntactic Rubin’s Vase: The Inherent Ambiguity of Non-promotional Passives and Unspecified Subject Constructions.” There has been much discussion here on Lingua Franca about passive constructions, but this talk, using evidence from Icelandic, Irish, Kaqchikel (a language spoken by Maya Indians of Guatemala), Polish, and Ukrainian, went a little deeper.
I’ll stick to something simpler, something that has recently hatched as an astonishing new way to learn about the dialects of American English, something that didn’t even exist a decade ago: Twitter.
In olden days, if you wanted to find out the words and pronunciations people were using, you’d have to interview them one by one, in person or at least on the phone. But that was before Twitter was launched in 2006.
The informality of a Twitter post allows dialect words and pronunciations to reveal themselves almost as naturally as in speech. “People tweet how they speak,” said Taylor Jones, a researcher at the University of Pennsylvania. The public nature of a Twitter post (you agree to that when you sign up) means posts can be used for research without special permission. And many posts are “geotagged,” indicating the author’s location. So without conducting any interviews, a researcher can search millions of tweets for dialect patterns. Make that billions: Only 5 percent of tweets are geotagged, but that makes 25 million of them every day, which adds up to roughly 9 billion a year.
And you don’t even need to write your own program. You can use the program called TwitteR (though you do have to register as a Twitter developer and take several other steps to do your specific kind of research).
What can you learn from mining Twitter data? Well, you can investigate the geographic distribution of hella (as in “hella good”) in California (mostly northern but also a bit in L.A.), as we learned from Robert Bayley, Chelsea Escalante, Renee Kemp, Alex Mendes, and Emily Moline of the University of California, Davis (who also provided instructions on how to use TwitteR). And you can watch the emergence and geographical spread of what Jack Grieve calls a “rising word” like unbothered (meaning unconcerned and disengaged), which is especially prevalent in the South. And just about anything else.
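For the curious, here is roughly what counting a dialect word by region might look like once the geotagged tweets are in hand. This is only a minimal sketch in Python, not the method any of the researchers above used: the sample tweets are invented for illustration, the region labels are assumed to have already been derived from each geotag, and the function name `share_by_region` is hypothetical.

```python
import re
from collections import Counter

# Invented sample records, for illustration only (not real tweets).
# Each record is (tweet text, region label assumed to come from the geotag).
SAMPLE_TWEETS = [
    ("that show was hella good", "Northern California"),
    ("Hella traffic on the bridge today", "Northern California"),
    ("totally unbothered by the rain", "Northern California"),
    ("traffic is bad again", "Southern California"),
    ("hella tired after practice", "Southern California"),
]


def share_by_region(tweets, word):
    """Return, for each region, the share of tweets containing `word`."""
    # Match the word on its own, ignoring case ("Hella" counts, "hellas" does not).
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    totals = Counter()  # all tweets seen per region
    hits = Counter()    # tweets per region that contain the word
    for text, region in tweets:
        totals[region] += 1
        if pattern.search(text):
            hits[region] += 1
    return {region: hits[region] / totals[region] for region in totals}


if __name__ == "__main__":
    for region, share in share_by_region(SAMPLE_TWEETS, "hella").items():
        print(f"{region}: {share:.0%} of sample tweets contain 'hella'")
```

Dividing by each region's total matters because a raw count would mostly show where the most people tweet, not where the word is most at home.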
So sit back and tweet away. You will be counted.