Crisis Text Line Releases De-Identified Dataset From 13 Million Conversations


New York-based startup Crisis Text Line has announced that it will release a massive dataset containing de-identified text message conversations between its crisis counselors and users. The new dataset comprises 13 million text messages from users in crisis, making it one of the largest single sources of information on mental health and crisis available, rivaled only by a CDC survey on the same topic published every two years. Access to the text messages will be free for researchers working across a multitude of fields, to help improve understanding of mental health issues and the events that precipitate a crisis.

Crisis Text Line is a spinoff from DoSomething, an organization focused on helping organize public service projects. While working on a teen outreach project within DoSomething, then-CEO Nancy Lublin realized that while there were dozens of resources available to teens in crisis, there were no national text-based outreach programs aimed at anonymously connecting teens with crisis counselors over text. In response, Lublin handed over the reins of DoSomething and launched Crisis Text Line to address the problem. Lublin piloted the service in Chicago and El Paso, TX, and within four months crisis counselors had received texts from teens living in every area code in the country. Because the team was still in pilot mode, it did no marketing to promote the service; early growth was attributed entirely to word of mouth among teens. In the two years since its launch, counselors have exchanged 13 million texts with users.


Crisis Text Line itself is a technology-savvy startup. The company uses real-time data analytics to screen texts and route them to the crisis counselors with the most relevant experience. Lublin explains, “We know if you type sex, oral, and Mormon, you’re questioning if you’re gay.” Software also follows the conversation as it unfolds and automatically pulls up resources that the counselor can offer, such as information on local drug treatment centers for patients showing signs of probable drug abuse. The team has also created a number of data visualizations of its growing dataset, including geographic maps showing where users are experiencing the most bullying, suicidal thoughts, or other issues.

Now, that data will be turned over to third-party researchers. While access to the data is free, researchers will still face a number of obstacles before they can use it in research projects. The company has established an ethics board and stringent data privacy standards with which all researchers will be required to comply.
