Making ONA count with Natural Language Processing

Togy Jose
OrgLens

--

Organizational Network Analytics (ONA) is becoming increasingly relevant for community managers who want to identify influencers and drive change, and the allied field of Natural Language Processing (NLP) is a major force multiplier for ONA.

While ONA can help with identifying influencers, the next logical step is to work with them to drive change and manage perceptions. One way to do this is to analyze the community's conversations with NLP and share the findings with influencers, which closes the feedback loop.

NLP primarily helps with sensing the mood (using Sentiment Analysis) and identifying what the community is talking about (using Topic Modelling and Word Clouds). There are many tools (like R and Python) to conduct NLP at scale. In this article we will look at how R can help.

Topic Modelling: In active online communities that generate a lot of messages, community managers can only respond effectively if the messages are divided into logical groups. R packages use models like Latent Dirichlet Allocation (LDA) to achieve this. Once messages are bucketed into categories or topics, we can also assess the overall sentiment around each topic.

A few use cases for Topic Modelling are: 1) sorting the hundreds of questions you may have received in response to a town hall announcement, 2) reviewing the effectiveness of your help desk by improving the classification of support tickets, and 3) actively scanning the enterprise social media to see what your employees are talking about.
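
To make the idea of bucketing messages concrete, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling. It is written in plain Python for illustration and is not the R code this article refers to. The function name lda_gibbs, the hyperparameters alpha and beta, and the four toy messages are all assumptions made up for this example. With so little data the topics are not meaningful; the point is only to show the mechanics of assigning each message a mix of topics.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns one list per document holding
    the estimated proportion of each topic in that document.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]            # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # count of word v in topic k
    topic_total = [0] * n_topics                           # all tokens in topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(n_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Weight each topic by how much this document uses it and
                # how much the topic uses this word.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per document.
    return [
        [(count + alpha) / (len(doc) + n_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]

# Made-up town hall questions, already reduced to keywords.
messages = [
    "bonus payout date bonus policy",
    "parking shuttle office parking",
    "bonus policy payout",
    "office shuttle timing parking",
]
docs = [message.split() for message in messages]
theta = lda_gibbs(docs, n_topics=2)
for message, proportions in zip(messages, theta):
    print(message, "->", [round(p, 2) for p in proportions])
```

Each printed row shows how strongly a message belongs to each topic. The sampler is random, so the numbers depend on the seed, and in practice a maintained library is a better choice than a hand-rolled sampler.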

Sentiment Analysis: This is critical for assessing engagement levels. One way of doing Sentiment Analysis is to simply break messages into words, assign a positive or negative sentiment score to each word, and add up the scores to assess overall sentiment. While this may provide a high-level assessment, it can miss nuances like negation and sarcasm. For example, consider the message “I did not like that approach”: instead of a positive score, “like” should get a negative score because of the negation term “not” that precedes it. R packages address the negation problem by using “bi-grams”, where messages are broken into pairs of consecutive words; when the first word of a pair is a negation (eg: not), the sentiment score of the second word (eg: like) is reversed.
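
The difference between word-by-word scoring and bi-gram scoring can be sketched in a few lines. This is a plain Python illustration, not the R code this article refers to. The tiny LEXICON and NEGATIONS sets are made up for the example, and a real analysis would use a published sentiment lexicon.

```python
import re

# Hypothetical toy lexicon and negation list, for illustration only.
LEXICON = {"like": 1, "love": 2, "great": 2, "helpful": 1,
           "bad": -1, "hate": -2, "confusing": -1}
NEGATIONS = {"not", "no", "never", "without"}

def tokenize(message):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", message.lower())

def unigram_score(message):
    """Add up lexicon scores word by word, ignoring context."""
    return sum(LEXICON.get(word, 0) for word in tokenize(message))

def bigram_score(message):
    """Add up lexicon scores, reversing a word's score when the word
    immediately before it is a negation."""
    tokens = tokenize(message)
    total = 0
    # Pair every word with the word before it (None for the first word).
    for previous, word in zip([None] + tokens, tokens):
        score = LEXICON.get(word, 0)
        if previous in NEGATIONS:
            score = -score
        total += score
    return total

message = "I did not like that approach"
print(unigram_score(message))  # 1
print(bigram_score(message))   # -1
```

The word-by-word score reads the message as mildly positive, while the bi-gram score flips “like” to negative. Sarcasm is not handled by either approach.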

R also enables the analysis of higher-order n-grams, i.e. groups of three or more words. N-grams help highlight whether the community is engaged in conversations around specific themes like “human centred design” or “enabling a diverse workplace”. R packages like dplyr and tidytext are easy to learn and can provide these analytics.
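
As a rough illustration of how n-grams surface recurring themes, the sketch below counts three-word groups across a handful of messages. It uses plain Python rather than dplyr and tidytext, and the helper name ngrams and the sample messages are made up for this example.

```python
import re
from collections import Counter

def ngrams(message, n):
    """Return every run of n consecutive words in a message."""
    tokens = re.findall(r"[a-z']+", message.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Made-up community messages.
messages = [
    "We need more human centred design in our tools",
    "Human centred design starts with listening",
    "Enabling a diverse workplace takes effort",
]

trigram_counts = Counter(
    gram for message in messages for gram in ngrams(message, 3)
)
print(trigram_counts.most_common(1))  # [('human centred design', 2)]
```

Phrases that keep recurring rise to the top of the count, which is what makes them useful as theme markers.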

Word Clouds: A well visualised wordcloud can provide a good summary of frequently used #hashtags or words. Two key concepts for reducing noise in a wordcloud are stop words and tf-idf. Stop words are the most common words in the English language (eg: to, the, at, on etc.), which don't add much value in NLP. NLP packages come with ready-made stop-word lists that can be used to filter words out before they reach the wordcloud. The tf-idf (term frequency-inverse document frequency) measure is a slightly more nuanced approach to reducing noise. It differs from stop words in that it identifies words that may not be common in the English language in general, but are so common across the documents being analyzed that they add no value in that context. For example, if you are analyzing conversations of employees on the corporate social network of, say, Accenture, terms like “Accenture” or “Accenture.com” will be used about equally across all communities and will not provide any specific insights. Most NLP tools have packages for this analysis as well (eg: tidytext package in R). Words identified using tf-idf analysis can then be added to the stop-words list.
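
Here is a minimal sketch of both noise filters, again in plain Python for illustration rather than with tidytext. The STOP_WORDS set, the function names and the three community texts are made up for this example. It assumes the common definition of tf-idf: a word's share of a document multiplied by the natural log of (number of documents / number of documents containing the word).

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"to", "the", "at", "on", "is", "a", "our", "and"}

def tokenize(text):
    """Lowercase, split into words and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in STOP_WORDS]

def tf_idf(documents):
    """Return {document name: {word: tf-idf score}}."""
    tokens = {name: tokenize(text) for name, text in documents.items()}
    n_docs = len(tokens)
    # Number of documents in which each word appears at least once.
    doc_freq = Counter(
        word for words in tokens.values() for word in set(words)
    )
    scores = {}
    for name, words in tokens.items():
        counts = Counter(words)
        scores[name] = {
            word: (count / len(words)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return scores

# Made-up posts, one text per community.
communities = {
    "design": "Accenture design guild is sharing the design sprint templates",
    "payroll": "Accenture payroll team is moving the payroll cut-off date",
    "wellness": "Accenture wellness week is on and yoga sessions are open",
}

scores = tf_idf(communities)
print(scores["design"]["accenture"])  # 0.0

# Words that score zero everywhere are candidates for the stop-word list.
candidates = {
    word
    for doc_scores in scores.values()
    for word, score in doc_scores.items()
    if score == 0.0
}
print(candidates)  # {'accenture'}
```

Because “Accenture” appears in every community, its score is zero, while a word like “design” scores highest in the community that actually talks about it.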

This article is not meant to be an exhaustive overview of NLP; the focus has been on enabling community managers to leverage these tools to understand the zeitgeist and take the community’s engagement to the next level.

#R is a good option to generate code around NLP (even #machine #learning use cases). Here is the R code related to this article.

Hope you found this useful! Please feel free to provide feedback.

--

Togy Jose
OrgLens

Founder @ hrness.ai #graphanalytics #ml #ai #peopleanalytics #startup #networks #communities. Twitter: @togyjose