When it comes to sentiment, we tend to think of news content as carrying the least. The stated goal of most news sources is to report events as accurately as possible, which usually means reporting the facts and refraining from judgment calls. We all know this doesn’t always happen, and in fact some sources even revel in their particular bias.

When we as readers are aware of this bias, we can think critically about the information given to us, compare across sources, and make our own value judgments, but this takes time and effort. Furthermore, the way bias is conveyed isn’t always transparent.

If censorship comes into play, it can create a new sort of bias, one that arises from having only limited information that favors a specific viewpoint. During my time working with sentiment analysis, I found the idea of using sentiment to characterize the voice of different news sources intriguing, especially cross-culturally.

I have been working with substantial amounts of Chinese data lately, and I thought it would be interesting to compare entity sentiment between two quintessentially opposite news sources: Xinhua, the state-run news agency of China, and the Chinese version of The Wall Street Journal.

I used five days of content from each news source and compared entity sentiment for a prescribed list of entities using Salience’s Chinese pack. I chose these entities based on their potential to expose key points of disagreement between sources. With that criterion, I came up with the following 14 entities:

  1. 中国       China
  2. 美国       America
  3. 俄罗斯    Russia
  4. 奥巴马    Obama
  5. 习近平    Xi Jinping
  6. 普京       Putin
  7. 巴西       Brazil
  8. 百度       Baidu
  9. 谷歌       Google
  10. 马克思    Marx
  11. 毛泽东    Mao Zedong
  12. 日本       Japan
  13. 钓鱼岛    Diaoyu Islands
  14. 斯诺登    Snowden

I obtained results for the user-defined entities listed above for both sets of content. For each entity, I counted the number of documents in which it was positive overall and the number of documents in which it was negative overall. Because each document contributes at most one hit per entity, an entity’s positive and negative hits together can never exceed the number of documents.
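The tally described above can be sketched in a few lines of Python. This is only an illustration, not the Salience API: the input format (a list of `(doc_id, entity, score)` tuples with one overall score per entity per document) and the function name `tally_polarity` are assumptions made for the example, and the scores shown are made up.

```python
from collections import Counter


def tally_polarity(doc_entity_scores):
    """Count, per entity, the documents where it is positive vs. negative overall.

    doc_entity_scores: iterable of (doc_id, entity, score) tuples. The score is
    a hypothetical per-document entity sentiment: > 0 positive, < 0 negative,
    0 neutral (neutral documents are not counted either way).
    """
    positive, negative = Counter(), Counter()
    seen = set()
    for doc_id, entity, score in doc_entity_scores:
        # Each document contributes at most one hit per entity.
        if (doc_id, entity) in seen:
            continue
        seen.add((doc_id, entity))
        if score > 0:
            positive[entity] += 1
        elif score < 0:
            negative[entity] += 1
    return positive, negative


# Made-up scores for illustration only; these are not results from the study.
rows = [
    ("doc1", "中国", 0.4),
    ("doc2", "中国", -0.2),
    ("doc3", "中国", 0.1),
    ("doc1", "日本", -0.5),
    ("doc2", "日本", 0.0),
]
positive, negative = tally_polarity(rows)
print(positive["中国"], negative["中国"])
```

Counting documents rather than individual mentions keeps one long article from dominating the tally for an entity.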

The results are shown in tables A and B. The green bar indicates the number of positive hits and the red indicates the number of negative hits. The translated name of the entity is colored either red or green based on its majority polarity. Can you guess which table corresponds to which source?

Table A


Table B


From history, we know communist governments tend to avoid self-criticism, and leaders are often exalted as god-like symbols. If we can apply that general understanding here, it becomes fairly easy to decipher which graph corresponds to which news source.

You guessed it: Table B, the one that sees “China”, “Xi Jinping” (the current Party leader), and “Mao Zedong” (the founding leader of the People’s Republic of China) as positive the majority of the time, is Xinhua. Table A is therefore The Wall Street Journal.

What stands out about The Wall Street Journal, a Western news source, is the lack of favoritism towards entities that are more inherently Western, like the “USA”, which is overwhelmingly negative. I think the more negative tone of The Wall Street Journal is actually quite in line with our understanding of most Western media.

Self-criticism is often a central theme in Western media; some might argue it is the prerogative of a democratic country, where free speech reigns, to be overly critical of those in power. It’s almost impossible for the President of the United States to please everyone, and he is often the subject of criticism. Even if the same is true of Xi Jinping in China, people there tend to be more reserved about voicing their opinions in public settings, and even more so as employees of the state-run news agency.

Let’s look at the graphs from a slightly different perspective. The following graphs emphasize the negative or positive polarity of each entity, by highlighting the difference between polarity tallies.
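As a rough sketch of that difference view, the net polarity of an entity is simply its positive tally minus its negative tally. The function names and the counts below are hypothetical, chosen only to illustrate the arithmetic; they are not the figures behind the graphs.

```python
def net_polarity(positive, negative):
    """Return positive-minus-negative hits for every entity in either tally."""
    entities = set(positive) | set(negative)
    return {e: positive.get(e, 0) - negative.get(e, 0) for e in entities}


def majority_label(net):
    """Label an entity by the sign of its net polarity."""
    if net > 0:
        return "positive"
    if net < 0:
        return "negative"
    return "tied"


# Hypothetical tallies for illustration only.
positive = {"China": 5, "Japan": 1}
negative = {"China": 2, "Japan": 4}

for entity, net in sorted(net_polarity(positive, negative).items()):
    print(entity, net, majority_label(net))
```

Plotting the net value instead of two separate bars makes the dominant polarity of each entity visible at a glance, at the cost of hiding how many documents mentioned it.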



The Wall Street Journal


These graphs emphasize the general positive bias of Xinhua, with a slight negative sentiment towards “Obama”, the “USA”, “Japan”, and the “Diaoyu Islands”. There are many ways we can speculate as to why this might be the case, and I’ve given the communist–democratic contrast as a possible underlying feature of these sentiments. Furthermore, it’s no mystery that there is anti-Japanese sentiment in China, most recently surrounding the territorial conflict over the Diaoyu Islands. You can form your own analysis from there.

If I can add my own speculations, it seems that, generally speaking, The Wall Street Journal does not hold back in criticizing whomever it would like to criticize. This includes its home country and that country’s leader, a stark contrast to the self-exalting Chinese news source. Interestingly, “China”, “Russia”, and “Mao” are painted positively overall in the WSJ, which went against my original (unstated) hypothesis.

I should note that the data set used for this analysis was relatively small, so we want to be careful not to draw any firm conclusions about what any of this may mean. Even so, these initial results are in line with the intuitions of many with regard to Chinese media, and particularly state-run news agencies in China versus news agencies originating in the West.

It would be worthwhile to continue this study in a more rigorous fashion with more data, across a larger span of time, and across more diverse sources to corroborate these initial findings. Anchoring entity sentiment around a particular event might also provide a more accurate representation of bias across news sources.

That way we would be able to say things like, “The WSJ is positive towards Japan and neutral towards China in the context of the Diaoyu Islands territorial conflict, but Xinhua is very negative towards Japan and slightly negative towards America.” This makes more sense since we tend to not attach inherent goodness or badness to entities but will instead view them negatively or positively within a certain context.

If a certain source always views something with a consistent sentiment polarity, regardless of context, then that may be grounds to assert the existence of a bias. Until then, hopefully these initial findings have sparked your interest in cross-cultural media perspectives and will encourage you to do your own comparative studies (and hopefully share them with us)!

Elizabeth Baran is a computational linguist at Lexalytics, Inc., where she recently headed the release of a Chinese language pack for the company’s core text analysis engine, Salience. She has a B.A. from Georgetown University in Chinese and a Master’s in Computational Linguistics from Brandeis University. Elizabeth speaks multiple languages and has lived and studied for extended periods of time in both China and France. She is currently working on expanding language support for the Lexalytics Salience engine.