Real-time Social Network Analysis
The internet age and the growth of social networks have been a game changer for the human sciences. Where researchers and analysts previously had to design data collection processes carefully, people now share their thoughts and concerns on a daily (or even hourly) basis. The retail sector, for example, can get instant feedback on a new product.
However, the high frequency of the data makes collection and analysis extremely difficult. Moreover, the volume of data from social networks can exceed the capacity of in-house data processing systems.
In this use case, data pipelines are developed to collect, process, and store the data using the Cloud Pub/Sub, Cloud Dataflow, and BigQuery services of the Google Cloud Platform. The solution enables effective and holistic data collection and protects against data loss, which increases confidence in the data analysis. Moreover, the CAPEX and OPEX required for such a system are dramatically reduced by using infrastructure and services managed by Google Cloud. In addition, by calling the Google Cloud Natural Language API in the pipelines, one can analyze the text of the captured content on the fly and load the results (e.g. sentiment analysis) into BigQuery together with the captured data.
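The per-message logic of such a pipeline can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the workshop's actual code: the JSON field names (`id`, `created_at`, `text`), the row layout, and the helpers `parse_message`, `enrich`, `run_pipeline`, and `toy_sentiment` are all hypothetical. The scorer is passed in as a function so that a local stand-in can be used here; in a deployed pipeline it would presumably wrap a call to the Natural Language API, and the steps would run as Dataflow transforms between a Pub/Sub source and a BigQuery sink.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, Tuple

# Hypothetical scorer signature: text -> (score, magnitude).
SentimentFn = Callable[[str], Tuple[float, float]]


def parse_message(payload: bytes) -> Dict[str, object]:
    """Decode one message body, assumed to be UTF-8 encoded JSON."""
    record = json.loads(payload.decode("utf-8"))
    # Field names are assumptions made for this sketch.
    return {
        "post_id": str(record["id"]),
        "created_at": record["created_at"],
        "text": record["text"],
    }


def enrich(post: Dict[str, object], score_fn: SentimentFn) -> Dict[str, object]:
    """Attach sentiment results so they are stored next to the raw text."""
    score, magnitude = score_fn(str(post["text"]))
    return {**post, "sentiment_score": score, "sentiment_magnitude": magnitude}


def run_pipeline(
    payloads: Iterable[bytes], score_fn: SentimentFn
) -> Iterator[Dict[str, object]]:
    """Turn raw message bodies into rows shaped for a warehouse table."""
    for payload in payloads:
        yield enrich(parse_message(payload), score_fn)


def toy_sentiment(text: str) -> Tuple[float, float]:
    """Stand-in scorer for local runs; NOT the Natural Language API."""
    positive = {"love", "great", "good"}
    negative = {"hate", "bad", "broken"}
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    total = pos + neg
    score = (pos - neg) / total if total else 0.0
    return score, float(total)


if __name__ == "__main__":
    sample = json.dumps(
        {"id": 1, "created_at": "2020-01-01T00:00:00Z", "text": "I love this great product!"}
    ).encode("utf-8")
    for row in run_pipeline([sample], toy_sentiment):
        print(row)
```

Keeping the scorer injectable means the same row-building code can be exercised locally without cloud credentials, and the real API call can be swapped in later without changing the row layout.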
The solution was also presented as a hands-on workshop at the IWR (Interdisciplinary Center for Scientific Computing) of the University of Heidelberg.