Based on Xception for Tencent Captcha Recognition (Samples + Code)

By 苏剑林 | July 24, 2017

Last year, I was fortunate enough to receive a batch of Tencent captcha samples from a fellow netizen. I did some research on them and documented the process in "End-to-End Tencent Captcha Recognition (46% Accuracy)".

That article later attracted considerable interest: some readers requested the samples, others asked for the model, and many joined the discussion, which I had not expected. In truth, the original model was fairly crude, and its accuracy was too low for practical use, so it had limited reference value. Over the past few days I revisited the problem and built a model with higher accuracy, and I am also releasing the samples publicly.

The approach remains the same as in "End-to-End Tencent Captcha Recognition (46% Accuracy)", except that the CNN component has been replaced with the standard Xception architecture. Readers could also experiment with VGG, ResNet50, and so on; for captcha recognition, all of these models are more than capable. I chose Xception because it has fewer layers and smaller weights, which I personally prefer.

Code

GitHub: https://github.com/bojone/n2n-ocr-for-qqcaptcha/

# (The code block from the original post is placed here)

Note that the pre-trained Xception weights were learned for ImageNet image classification and do not transfer directly to captcha recognition. All layers are therefore left unfrozen and trained, rather than freezing most of the weights as is usual when fine-tuning for a general classification task.
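As a rough illustration of this setup, the sketch below builds an Xception backbone with one softmax head per character and marks every layer as trainable. It is not the code from the repository: the 26-letter lowercase alphabet, the input shape (100, 200, 3), the Adam optimizer, and the names ALPHABET, NUM_CHARS, encode_label, decode_prediction, and build_model are all assumptions made for illustration. The Keras import is deferred into build_model so that the label helpers can be used without TensorFlow installed.

```python
import string

# Assumption: each captcha is 4 lowercase letters (26 classes per position).
ALPHABET = string.ascii_lowercase
NUM_CHARS = 4


def encode_label(text):
    """Map a captcha string to one class index per character position."""
    if len(text) != NUM_CHARS:
        raise ValueError("expected %d characters, got %r" % (NUM_CHARS, text))
    return [ALPHABET.index(ch) for ch in text.lower()]


def decode_prediction(per_position_probs):
    """Pick the most probable class at each position and join into a string."""
    chars = []
    for probs in per_position_probs:
        best = max(range(len(probs)), key=lambda i: probs[i])
        chars.append(ALPHABET[best])
    return "".join(chars)


def build_model(input_shape=(100, 200, 3), weights="imagenet"):
    """Xception backbone plus one softmax head per character (hypothetical sizes)."""
    # Imported lazily so the helpers above work without TensorFlow.
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import Xception

    image = layers.Input(shape=input_shape)
    backbone = Xception(include_top=False, weights=weights,
                        input_tensor=image, pooling="avg")

    # Keras layers are trainable by default; this is spelled out only to
    # contrast with the usual fine-tuning practice of freezing the backbone.
    for layer in backbone.layers:
        layer.trainable = True

    # One independent softmax classifier per character position.
    heads = [
        layers.Dense(len(ALPHABET), activation="softmax", name="char_%d" % i)(backbone.output)
        for i in range(NUM_CHARS)
    ]
    model = models.Model(inputs=image, outputs=heads)
    model.compile(optimizer="adam",
                  loss=["sparse_categorical_crossentropy"] * NUM_CHARS)
    return model
```

With this layout, encode_label("abcd") gives the four integer targets for one sample, and decode_prediction turns the four per-position probability vectors back into a string.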

Results

After training with the code above, the model's accuracy on the test set (a prediction counts as correct only if all four characters are correct) exceeds 85%. With more careful hyperparameter tuning (adjusting the learning rate, the number of iterations, the model architecture, and so on), it can exceed 90%.
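To make the metric concrete, here is a minimal sketch of the all-four-characters-correct accuracy, alongside the more lenient per-character accuracy for comparison. The function names and the toy data are hypothetical and are not taken from the repository.

```python
def captcha_accuracy(predicted, actual):
    """Fraction of captchas whose predicted string matches the label exactly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


def character_accuracy(predicted, actual):
    """Fraction of individual characters predicted correctly."""
    total = 0
    correct = 0
    for p, a in zip(predicted, actual):
        total += len(a)
        correct += sum(1 for pc, ac in zip(p, a) if pc == ac)
    return correct / total if total else 0.0


# Toy data: two captchas fully correct, two with a single wrong character.
predicted = ["abcd", "abcx", "wxyz", "qrst"]
actual = ["abcd", "abcd", "wxyz", "qrsu"]

print(captcha_accuracy(predicted, actual))    # 0.5
print(character_accuracy(predicted, actual))  # 0.875
```

The whole-captcha metric is the stricter of the two. If per-character errors were independent (an assumption that real models need not satisfy), a whole-captcha accuracy of 85% would correspond to a per-character accuracy of about 0.85^(1/4) ≈ 96%.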

You may also refer to the excellent article by Yang Peiwen, which uses CTC for the final classification: "Using Deep Learning to Crack Captchas".
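For readers unfamiliar with CTC, the sketch below shows only the standard greedy decoding step: take the most likely label at each time step, collapse consecutive repeats, then drop the blank symbol. It is a generic illustration, not code from Yang Peiwen's article; the function name and the choice of index 0 for the blank are assumptions.

```python
from itertools import groupby


def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse consecutive repeated labels, then remove blanks."""
    collapsed = [label for label, _ in groupby(frame_labels)]
    return [label for label in collapsed if label != blank]


# A blank between two identical labels keeps them as separate characters.
print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0]))  # [1, 1, 2]
```

Unlike the fixed four-head layout, this kind of output does not require the number of characters to be known in advance.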

Resources

The 100,000 captcha samples are publicly available below:

Link: https://pan.baidu.com/s/1mhO1sG4 Password: j2rj

Reprinting Note: Please include the original article address: https://kexue.fm/archives/4503 ("Based on Xception for Tencent Captcha Recognition (Samples + Code)")
