Mining Social Data and Putting it to Work

Sept. 2, 2015, 1:45 PM UTC

Editor’s Note: This article is authored by two BakerHostetler partners — an information governance practice leader and a new media, advertising, IT and privacy partner — and an associate.

By Judy Selby, Alan Friel and Jenna Felz of BakerHostetler

Social media data is a rich source of information for companies competing in today’s global marketplace.

Companies can exploit social data for a variety of purposes, such as gaining better knowledge about their customers, improving business intelligence, and developing a competitive edge in the global marketplace. Perhaps the most valuable aspect of “social data” is the variety of data points generated on social media platforms. Facebook alone makes nearly 60 different pieces of data available through its application programming interface (“API”), and its “like” button is pressed a staggering 2.7 billion times every day across the web. Facebook users post 684,478 pieces of content per minute, and Twitter users post more than 100,000 tweets per minute.

The amount of social data being generated is staggering, and many companies are trying to utilize this big data to increase their bottom line. For a technology that is so widely used, however, companies surprisingly have little guidance on the legal implications of collecting and using consumer data generated by social media.

This article is the first in a four-part series. In this article, we provide an overview of the methods by which companies collect social data. In our second article, we will discuss the ways companies analyze social data. In our third and fourth articles, we will identify the legal and ethical considerations companies should keep in mind in collecting, analyzing, and using social data, as well as the risks and challenges posed by the collection and use of social data.

The Types of Big Data Collected by Social Media

Social networking sites, chief among them Facebook, Twitter, YouTube, LinkedIn, Google+, and Pinterest, collect a plethora of information about consumers that can be split into two main categories. The first category is information that the social networking sites collect about their users, including age, name, sex, gender, interests, and occupation. The second category is information generated by the individual users themselves, including posts, photos, videos, and “likes.” Both types of data are available to companies through social network APIs. Companies can create complex, sophisticated algorithms to analyze this data, or use a third party to perform this analysis for a fee.

Collecting “Social Data”

Social media platforms offer APIs so that third parties can build applications that work with the platform’s own programming structure and integrate the platform’s features into their own websites. Through APIs, social media websites can share information seamlessly with companies’ apps, and application developers can access data from social networks in real time. The third party must agree to the platform’s terms of use and policies for uploading and publishing information via the API, which limit the ways in which apps can use social data about individual users.

Facebook API

In 2014, Facebook introduced a new version of its Facebook Graph API, which is the primary way to get data in and out of Facebook’s social graph. Facebook’s social graph is the largest social network data set in the world, and contains the largest number of defined relationships between the largest number of people among all websites. Facebook’s social graph is a representation of the information on Facebook, composed of:

  • Nodes (“things”, such as a Facebook User, a photo, a Facebook page, or a comment)

  • Edges (connections between those “things”, such as a Facebook page’s photos, or a photo’s comments)

  • Fields (information about those “things”, such as the birthday of a User, or the name of a Facebook page)
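
The three building blocks listed above can be pictured with a small data structure. The sketch below is only an illustration in Python: the Node class, the connect helper, and the sample records are hypothetical names and data invented for this example, and they are not part of Facebook’s API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A "thing" in the graph, such as a user, a page, a photo, or a comment."""
    node_id: str
    kind: str
    fields: dict = field(default_factory=dict)  # information about the thing
    edges: dict = field(default_factory=dict)   # edge name -> connected node ids


def connect(source: Node, edge_name: str, target: Node) -> None:
    """Record a named connection (an edge) from one node to another."""
    source.edges.setdefault(edge_name, []).append(target.node_id)


# Hypothetical sample data, not real Facebook records.
page = Node("page-1", "page", fields={"name": "Example Bakery"})
photo = Node("photo-1", "photo", fields={"caption": "Fresh bread"})
comment = Node("comment-1", "comment", fields={"message": "Looks great"})

connect(page, "photos", photo)       # a page's photos
connect(photo, "comments", comment)  # a photo's comments

print(page.fields["name"], page.edges["photos"])
```

In this picture, a question such as “which comments were left on this page’s photos” becomes a walk from a node along its edges, reading fields along the way.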

To access information contained on Facebook’s social graph, Facebook apps must use the Facebook Graph API. The Facebook Graph API allows Facebook apps to access around 60 different data points, including a User’s Facebook friend list (now restricted to only friends that have connected to the company’s application); a hashtag; a User’s photos on Facebook; Users that “like” a Facebook Page; whether two Users are Facebook friends; the number of “likes” a Facebook post has; and information about a User’s Facebook profile. It also allows companies to notify existing Users when a User’s friend registers with the app, and allows existing Users to invite their Facebook friends to join the app. For more information on how to use the Facebook Graph API, visit Facebook’s website.
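
As a rough illustration of what such a request looks like, the sketch below builds a Graph API address for a node, an optional edge, and a list of fields. It assumes the general address shape of host, version, node, and edge; the version string, the node ID, and the access token are placeholders, and the graph_url helper is a hypothetical name introduced here. The code only builds the address and does not contact Facebook.

```python
from urllib.parse import urlencode

GRAPH_HOST = "https://graph.facebook.com"


def graph_url(node_id, edge=None, fields=None, version="v2.4", access_token="TOKEN"):
    """Build a Graph API address for a node, optionally following one edge.

    The default version and token are placeholders for illustration only.
    """
    path = f"/{version}/{node_id}"
    if edge:
        path += f"/{edge}"
    params = {}
    if fields:
        params["fields"] = ",".join(fields)  # ask only for the fields needed
    params["access_token"] = access_token
    return f"{GRAPH_HOST}{path}?{urlencode(params)}"


# Hypothetical page ID; a real app would use an ID and token issued by Facebook.
print(graph_url("12345", edge="photos", fields=["id", "name"]))
```

Requesting only the fields an app actually needs also keeps the amount of personal data collected to a minimum.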

Twitter API and Twitter Ads API

The Twitter API is the interface to Twitter’s database of user data, which includes every Twitter User’s personal information, from their age to who their followers are and whom they follow. It is also the platform that third-party apps connect to in order to pull valuable user data to help businesses target their customers more effectively.

There are four main “objects” that the Twitter API tracks: Tweets, Users, Entities, and Places. Users can be anyone or anything. They tweet, follow, create lists, have a home_timeline, can be mentioned, and can be looked up in bulk. Tweets are the basic atomic building block of all things Twitter. Tweets can be embedded, replied to, favorited, unfavorited, and deleted. Entities provide metadata and additional contextual information about content posted on Twitter. Entities are never divorced from the content they describe. Places are specific, named locations with corresponding geo coordinates. They can be attached to Tweets by specifying a place_id when tweeting. Tweets associated with places are not necessarily issued from that location but could also potentially be about that location. Places can be searched for. Tweets can also be found by the place from which the Tweet was made or which the Tweet is about.
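
To make the four objects concrete, the sketch below reads a single tweet in the JSON form the Twitter API returns and pulls out its User, Entities, and Place. The payload is a shortened, hypothetical example written for this article rather than a real tweet, and the summarize helper is a name invented here for illustration.

```python
import json

# Hypothetical payload written for illustration; not a real tweet.
SAMPLE_TWEET = """
{
  "id_str": "1",
  "text": "Grand opening today #launch https://t.co/abc",
  "user": {"screen_name": "example_bakery", "followers_count": 120},
  "entities": {
    "hashtags": [{"text": "launch"}],
    "urls": [{"expanded_url": "https://example.com/opening"}],
    "user_mentions": []
  },
  "place": {"id": "place-1", "full_name": "Cleveland, OH"}
}
"""


def summarize(tweet):
    """Collect the User, Entities, and Place attached to one Tweet."""
    entities = tweet.get("entities", {})
    place = tweet.get("place") or {}  # place is null when no location is attached
    return {
        "author": tweet["user"]["screen_name"],
        "hashtags": [h["text"] for h in entities.get("hashtags", [])],
        "urls": [u["expanded_url"] for u in entities.get("urls", [])],
        "place": place.get("full_name"),
    }


summary = summarize(json.loads(SAMPLE_TWEET))
print(summary)
```

Note that the Entities travel inside the Tweet they describe, which mirrors the point that Entities are never divorced from their content.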

The Twitter Ads API was developed in 2013, and allows partners to integrate with the Twitter advertising platform in their own advertising solutions. Selected partners have the ability to create custom tools to manage and execute Twitter Ad campaigns. Twitter’s Advertising API provides programmatic access to advertising accounts. Partners can integrate their solutions with the API to promote Tweets or Twitter accounts, schedule campaigns, retrieve analytics, manage audiences, and more. To learn more about Twitter’s Ads API, visit Twitter’s website.

These are just two of the many social networking sites that provide valuable information through their APIs. YouTube, Pinterest, Google+, and LinkedIn also have APIs that can deliver important and timely social data. Companies can also create advanced algorithms to track the social media information flow in Twitter, which can trace: (1) the spread of a “hashtag” over the network; (2) the spread of a particular URL; and (3) the amount and spread of re-tweets of a company’s tweet.
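
A very simple version of such tracking might look like the sketch below, which counts how many distinct users posted each hashtag, how many distinct users shared each URL, and how many times each tweet was re-tweeted. The records, their layout, and the measure_spread helper are hypothetical and chosen for illustration; a real system would work on data collected through the Twitter API and would be considerably more elaborate.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified tweet records; not real Twitter data.
tweets = [
    {"id": 1, "user": "alice", "hashtags": ["Launch"],
     "urls": ["https://example.com/a"], "retweet_of": None},
    {"id": 2, "user": "bob", "hashtags": ["launch"],
     "urls": ["https://example.com/a"], "retweet_of": 1},
    {"id": 3, "user": "carol", "hashtags": ["launch"],
     "urls": [], "retweet_of": 1},
    {"id": 4, "user": "alice", "hashtags": ["launch"],
     "urls": [], "retweet_of": None},
]


def measure_spread(records):
    """Count distinct users per hashtag and per URL, and re-tweets per tweet."""
    hashtag_users = defaultdict(set)
    url_users = defaultdict(set)
    retweets = Counter()
    for record in records:
        for tag in record["hashtags"]:
            hashtag_users[tag.lower()].add(record["user"])  # ignore capitalization
        for url in record["urls"]:
            url_users[url].add(record["user"])
        if record["retweet_of"] is not None:
            retweets[record["retweet_of"]] += 1
    hashtag_reach = {tag: len(users) for tag, users in hashtag_users.items()}
    url_reach = {url: len(users) for url, users in url_users.items()}
    return hashtag_reach, url_reach, dict(retweets)


hashtag_reach, url_reach, retweet_counts = measure_spread(tweets)
print(hashtag_reach, url_reach, retweet_counts)
```

Counting distinct users rather than raw posts keeps one very active account from inflating the apparent reach of a hashtag or link.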

In our next article, we will discuss the ways social data can be analyzed and used by companies to spread brand awareness and increase profits.
