By 苏剑林 | December 01, 2016
Some time ago, I participated in a ridiculous online competition: Aspect-based Sentiment Analysis in specific domains. The homepage is here. The task of the competition was to identify entities in a piece of text and then judge the sentiment toward each of them. For example, in the sentence "I like Honda, but I don't like Toyota," one needs to mark "Honda" and "Toyota"; from Honda's perspective the sentiment is positive, while from Toyota's perspective it is negative. In other words, the task amounts to combining entity recognition with sentiment analysis.
It sounds high-end, so why is it ridiculous? The competition task itself is quite good and worth researching. However, the organizers were very frustrating, mainly in two respects:

1. The competition was divided into preliminary, semi-final, and final stages. The preliminary lasted over a month, after which some contestants were selected for the semi-final. The semi-final simply tweaked the data slightly, with no change to the problem or the data domain, and it also lasted a month. What on earth was the point of this semi-final?

2. Take a look at what the contestants were discussing in the group:
Aowu 17:40:54
128004 【Hangzhou Deao Audi Certified Used Car】Audi ttcoupe45tfsiquattro 2015 536,900 RMB
Aowu 17:40:57
@Guoshuang Competition Guidance
Aowu 17:41:09
Where should I cut the perspective for this one?
Guoshuang Competition Guidance 17:41:19
Audi tt
Fengyun 20:19:47
I haven't driven many good cars, but I feel Honda's handling is better than Toyota and Nissan. Should "Toyota" and "Nissan" here be labeled as neg or neu?
Fengyun 20:20:00
I feel the standards for this are inconsistent between the preliminary and semi-final.
Fengyun 20:20:12
@Guoshuang Competition Guidance @Guoshuang Competition Guidance3
Guoshuang Competition Guidance 21:29:52
neu
Kk_asd 10:15:00
@Guoshuang Competition Guidance For Shanghai Volkswagen, should "Shanghai" be deleted?
Guoshuang Competition Guidance 10:15:18
No
Going Right 20:49:06
Is there a perspective like "Imported Ford"? @Guoshuang Competition Guidance
Going Right 20:49:16
Imported BMW?
Guoshuang Competition Guidance 20:54:43
No
Kk_asd 10:57:28
Kia Rio (起亚律动) appears a lot, should I mark "Kia"? @Guoshuang Competition Guidance
Guoshuang Competition Guidance 11:43:04
No
I won't say much more. If the organizers think this is machine learning, then so be it, but to me, it looks more like "Administrator Learning."
Anyway, since it's a ridiculous competition, I'll join in the absurdity. I don't expect any high rankings. Since the competition isn't over yet, I'll disclose my model first. If your score is lower than mine, you can use this template to boost your results.
In fact, my approach to this task is almost the same as in "Sequence-to-Sequence Core Entity Recognition Based on Bidirectional LSTM and Transfer Learning". I treat it as a sequence labeling problem, except that I replaced the LSTM with a GRU, which has fewer parameters. This time I tagged at the character level: 0 for non-entity characters, 1 for positive entities, 2 for neutral entities, and 3 for negative entities. That's it. Since the labeled corpus belongs to the automotive domain, I scraped some automotive corpora myself and wrote a GRU-based language model to train character vectors; I felt the Word2Vec approach to character vectors was too coarse and might not perform well on a small corpus.
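To make the setup concrete, here is a minimal sketch of this kind of character-level bidirectional GRU tagger, rendered in today's tensorflow.keras rather than the Keras version available at the time; all the sizes below (vocabulary, embedding dimension, sequence length, hidden units) are placeholder assumptions, not the hyperparameters I actually used:

```python
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Embedding, Bidirectional, GRU, Dense

# Placeholder hyperparameters -- not the values actually used.
vocab_size = 5000   # number of distinct characters
embed_dim = 128     # character-vector dimension
maxlen = 200        # padded sentence length, in characters
num_tags = 4        # 0: non-entity, 1: positive, 2: neutral, 3: negative

chars = Input(shape=(maxlen,), dtype='int32')
# The embedding can be initialized with the character vectors
# pretrained by the GRU language model on the scraped corpus.
x = Embedding(vocab_size, embed_dim, mask_zero=True)(chars)
x = Bidirectional(GRU(64, return_sequences=True))(x)
# A softmax over the four tags at every character position.
probs = Dense(num_tags, activation='softmax')(x)

model = Model(chars, probs)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```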
And then? There is no "then." The rest is basically a repetition of "Sequence-to-Sequence Core Entity Recognition Based on Bidirectional LSTM and Transfer Learning"; even the code is the same. Of course, the organizers eventually provided a list of entities in the automotive domain, so I used that list to force alignment during the Viterbi decoding in post-processing. The final transfer-learning step didn't bring much improvement; use it or not as you see fit.
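For reference, the post-processing decode is an ordinary Viterbi dynamic program over the per-character tag probabilities; the entity list enters by constraining which tags and transitions are allowed. The sketch below shows only the generic decoder (a forbidden transition is expressed as a -inf score); the function and its signature are my illustration here, not the packaged code:

```python
import numpy as np

def viterbi(node_scores, trans_scores):
    """Decode the best tag path.

    node_scores:  (seq_len, num_tags) log-probabilities from the tagger.
    trans_scores: (num_tags, num_tags) log transition scores; setting an
        entry to -np.inf forbids that transition, which is one simple way
        to impose hard constraints during decoding.
    """
    seq_len, num_tags = node_scores.shape
    dp = node_scores[0].copy()                       # best score ending in each tag
    back = np.zeros((seq_len, num_tags), dtype=int)  # backpointers
    for t in range(1, seq_len):
        # scores[i, j] = best path ending in tag i at t-1, then tag j at t
        scores = dp[:, None] + trans_scores + node_scores[t]
        back[t] = scores.argmax(axis=0)
        dp = scores.max(axis=0)
    # Trace the best path backwards from the best final tag.
    path = [int(dp.argmax())]
    for t in range(seq_len - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```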
The part I am most satisfied with is that the whole pipeline is end-to-end: once the corpus is provided, there is almost no manual intervention, and switching to a corpus from another domain would still get up and running quickly.
Preliminary accuracy was 0.56, and my current semi-final accuracy is 0.55. It's not great; the best score on the leaderboard is around 0.67. I don't know what methods they are using, but I hope some experts can provide guidance. Regardless, I don't plan to work on it anymore.
Download Package: Aspect-Based Domain Sentiment Analysis_Packaged.7z
Reproduction Notice: Please include the original address of this article: https://kexue.fm/archives/4118
If you found this article helpful, feel free to share it or leave a tip. The tip is not about making money; it lets me know how much genuine attention Scientific Spaces is receiving. Of course, if you ignore it, that will not affect your reading. Welcome, and thank you once again!