Semi-supervised deep learning approach to break common CAPTCHAs
Abstract
Manual data annotation is a time consuming activity. A novel strategy for automatic training of the CAPTCHA breaking system with no manual dataset creation is presented in this paper. We demonstrate the feasibility of the attack against a text-based CAPTCHA scheme utilizing similar network infrastructure used for Denial of Service attacks. The main goal of our research is to present a possible vulnerability in CAPTCHA systems when combining the brute-force attack with transfer learning. The classification step utilizes a simple convolutional neural network with 15 layers. Training stage uses automatically prepared dataset created without any human intervention and transfer learning for fine-tuning the deep neural network classifier. The designed system for breaking text-based CAPTCHAs achieved 80% classification accuracy after 6 fine-tuning steps for a 5 digit text-based CAPTCHA system. The results presented in this paper suggest, that even the simple attack with a large number of attacking computers can be an effective alternative to current CAPTCHA breaking systems.
Persistent identifier
http://hdl.handle.net/11012/203005Document type
Peer reviewedDocument version
PostprintFulltext will be available on 13. 04. 2022
Source
NEURAL COMPUTING & APPLICATIONS. 2021, vol. 33, issue 20, p. 13333-13343.https://link.springer.com/article/10.1007%2Fs00521-021-05957-0