Integration Cookbook Final

Hey everyone, thanks for tuning in for the last five weeks as I detailed my experiences integrating different sentiment analysis APIs. To close off, I wanted to quickly summarize my overall impressions and final thoughts on each API.

AlchemyAPI

AlchemyAPI is straightforward and supports both plain text and URL analysis out of the box, which is useful for analyzing links. Once integrated, Alchemy returns an overall numerical sentiment score and a sentiment polarity, which it calls ‘type’, for every text document sent to its engine. Although the API works well, I found the AlchemyAPI engine slow, even when processing tweets one by one. I feel that sentiment analysis is not Alchemy’s strongest feature, but they support a diverse range of features and languages. Additionally, Alchemy offers a free trial of 1,000 transactions per day.

Semantria

Integrating Semantria with SemantAPI went smoothly, even when using batch mode. Semantria was the fastest of the six engines, taking only seconds on average to complete its analysis. Like AlchemyAPI, Semantria returns sentiment scores between -1 and 1, along with sentiment polarity values. The sentiment output from Semantria and AlchemyAPI was similar. To try the Semantria service, users can register for a free trial with 10,000 transactions.

Chatterbox

Chatterbox does not have its own infrastructure, so their API is published on the Mashape API marketplace. This doesn’t hamper integration; however, there is little documentation to fall back on. The Chatterbox API limits the size of text documents to 300 characters. The character limit presents issues when trying to analyze content longer than tweets and very short posts, and it makes Chatterbox difficult to compare against the other engines. On their website, Chatterbox claims 95% accuracy for all of their supported languages; however, as with all the other solutions, I experienced a sentiment accuracy of less than 80%. Users can try Chatterbox with SemantAPI using its free usage quota of 500 transactions per day.

Viralheat

Unlike the other services featured in SemantAPI, Viralheat’s primary focus is on social media marketing. However, they have a sentiment API, which I integrated with SemantAPI. Viralheat’s sentiment API extracts the sentiment score of any document with fewer than 360 characters. As with Chatterbox, the character limit prevents long content from being analyzed and reinforces Viralheat’s focus on social content. Additionally, although I only evaluated sentiment with SemantAPI, it is worth noting that Viralheat’s sentiment API lacks the feature extraction capabilities available with the other APIs. To use Viralheat, users can register for a free usage quota of 1,500 transactions per day.
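To make the effect of the character limits concrete, here is a minimal sketch that checks which of the two length-limited engines could accept a given document unmodified. The limits are the ones quoted in this post; the function name is hypothetical, and treating the limit as inclusive is my assumption rather than something either API's documentation confirms.

```python
# Character limits as quoted in this post.
CHAR_LIMITS = {"Chatterbox": 300, "Viralheat": 360}


def engines_accepting(text, limits=CHAR_LIMITS):
    """Return the length-limited engines that could take `text` as-is.

    Assumption: a document whose length equals the limit is still accepted.
    """
    return sorted(name for name, limit in limits.items() if len(text) <= limit)


if __name__ == "__main__":
    short_comment = "Great hotel, loved it."
    long_comment = "x" * 1566  # same length as the longest comment in the test set
    print(engines_accepting(short_comment))
    print(engines_accepting(long_comment))
```

A document between 301 and 360 characters would only fit Viralheat under these assumptions, which is one reason the two engines are hard to compare on anything longer than a tweet.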

Bitext

Bitext’s output is slightly different from the other services because it doesn’t return an overall sentiment score for the whole document. Instead, it extracts the sentiment for every sentence in the source text. To obtain an overall document sentiment score to compare against the other services, I calculated the average sentiment score across all the sentences. While I am not sure this was the most accurate method, I found it to be the best way to arrive at a document’s sentiment score, considering Bitext does not have any documentation related to this topic. To evaluate Bitext with SemantAPI, you can register for a 30-day trial with 1,000 API calls per day.
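The averaging step can be sketched in a few lines. This is only an illustration of the idea, not the SemantAPI code itself: the function name and the sample scores are hypothetical, and returning a neutral 0.0 for a document with no scored sentences is my assumption.

```python
from statistics import mean


def document_score(sentence_scores):
    """Average per-sentence sentiment scores into one document-level score."""
    if not sentence_scores:
        # Assumption: a document with no scored sentences counts as neutral.
        return 0.0
    return mean(sentence_scores)


if __name__ == "__main__":
    scores = [0.5, -0.25, 0.5]  # hypothetical per-sentence scores
    print(document_score(scores))  # prints 0.25
```

A plain mean gives every sentence the same weight regardless of its length, which is part of why I am not certain it is the most accurate way to combine the scores.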

Performance

Based on their performance, the services ranked as follows:

  1. Semantria thread execution time: 9,273.4935 milliseconds.
  2. Chatterbox thread execution time: 37,000.806 milliseconds.
  3. Alchemy thread execution time: 40,490.6459 milliseconds.
  4. Viralheat thread execution time: 70,057.9383 milliseconds.
  5. Bitext thread execution time: 121,559.5146 milliseconds.

While accuracy is critical for proper analysis, performance (speed of transaction) is also an important factor for real-time solutions. To test the performance of the six services on SemantAPI, I analyzed 100 TripAdvisor comments from Bellagio hotel guests. The comments differ in length, with the shortest comment being approximately 18 characters and the longest being 1,566 characters.
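To put the totals in per-document terms, the sketch below divides each reported thread execution time by the number of comments. It assumes each reported figure is the total time for the full 100-comment run; the function name is hypothetical.

```python
# Thread execution times reported above, in milliseconds.
TOTAL_MS = {
    "Alchemy": 40490.6459,
    "Bitext": 121559.5146,
    "Chatterbox": 37000.806,
    "Semantria": 9273.4935,
    "Viralheat": 70057.9383,
}
COMMENTS = 100  # size of the test set


def per_comment_ms(total_ms, comments=COMMENTS):
    """Average time per comment, assuming each total covers all comments."""
    return {name: total / comments for name, total in total_ms.items()}


if __name__ == "__main__":
    for name, avg in per_comment_ms(TOTAL_MS).items():
        print(f"{name}: {avg:.1f} ms per comment")
```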

Note: As Viralheat and Chatterbox are limited to 360 and 300 characters respectively, their results are not directly comparable with the other services. However, for the purpose of the performance test, I extrapolated their results to cover the analysis of all 100 comments.
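For readers who want to reproduce this kind of adjustment, one simple approach is to scale the measured time linearly by the share of comments an engine could actually process. The sketch below is an assumption for illustration, not necessarily the exact calculation behind the figures above; the function name and the sample numbers are hypothetical.

```python
def extrapolate_total_ms(measured_ms, processed, total=100):
    """Scale a time measured on `processed` comments up to `total` comments.

    Assumption: cost grows linearly with the number of comments, which
    ignores that the skipped comments are the longest ones.
    """
    if processed <= 0:
        raise ValueError("at least one comment must have been processed")
    return measured_ms * total / processed


if __name__ == "__main__":
    # Hypothetical figures: 60 of 100 comments fit the limit and took 1,500 ms.
    print(extrapolate_total_ms(1500.0, 60))  # prints 2500.0
```

Because the skipped comments are exactly the long ones, a linear scale like this probably understates what a full run would cost.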

George Kozlov is a software engineering guru. He specializes in software research, architecture and maintenance. He co-founded Semantria, and is currently their CTO and go-to guy when things need improving.
