How to Handle the 12306 CAPTCHA in Python

Published: 2025-05-08 10:40:05

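This article walks through an example of cracking the 12306 image CAPTCHA with Python. It is shared here for reference.

At some point, the 12306 login CAPTCHA changed to a "find the pictures that match the word" challenge. That is a clear step up in difficulty, to the point that solving it automatically takes image recognition. Some of the pictures, it has to be said, are downright perverse, to say nothing of how blurry they are; they were obviously lifted from image libraries on the web.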

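As it turned out, it was not long before Python code that cracks the 12306 image CAPTCHA surfaced online, to some surprise. As a netizen who loves to tinker and enjoys a bit of excitement, I naturally have to share it.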

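General flow of the code:

1. Download the CAPTCHA image, then cut it into tiles;

2. Analyze each tile with Baidu image search (百度识图);

3. Use a regular expression to pull the keywords out of the Baidu result page, then print them.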
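
The cutting in step 1 is plain grid arithmetic. The following is a minimal sketch of it, assuming the layout constants that appear in the article's `get_sub_img` function (a 4x2 grid of 67-pixel tiles, a 5-pixel gap, the first tile at offset 5, 41). The names `tile_box` and `all_tile_boxes` are made up for illustration; the boxes they return are in the `(left, top, right, bottom)` form that Pillow's `Image.crop` accepts.

```python
def tile_box(col, row, left0=5, top0=41, size=67, gap=5):
    """Return the (left, top, right, bottom) crop box of one CAPTCHA tile.

    The default offsets are assumptions copied from the article's listing.
    """
    if not (0 <= col < 4 and 0 <= row < 2):
        raise ValueError("tile index out of range")
    left = left0 + (size + gap) * col
    top = top0 + (size + gap) * row
    return (left, top, left + size, top + size)


def all_tile_boxes():
    """List the crop boxes of all eight tiles, row by row."""
    return [tile_box(col, row) for row in range(2) for col in range(4)]


if __name__ == "__main__":
    for box in all_tile_boxes():
        print(box)
```

Keeping the geometry in a pure function makes it easy to check the boxes without downloading an image first.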

Code:

#!/usr/bin/env python3
## FileName: fuck12306.py
## Author: MaoMao Wang <andelf@gmail.com>
## Created: Mon Mar 16 22:08:41 2015 by Shuu Wang
## Copyright: Feather (c) 2015
## Description: fuck fuck 12306
## Time-stamp: <2015-03-17 10:57:44 andelf>
import json
import re
import ssl
import urllib.parse
import urllib.request

from PIL import Image
from PIL import ImageFilter

# Hack around CERTIFICATE_VERIFY_FAILED by disabling HTTPS certificate checks.
# https://github.com/mtschirs/quizduellapi/issues/2
if hasattr(ssl, '_create_unverified_context'):
    ssl._create_default_https_context = ssl._create_unverified_context

UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/41.0.272.89 Safari/537.36")
pic_url = "https://kyfw.12306.cn/otn/passcodeNew/getPassCodeNew?module=login&rand=sjrand&0.21191171556711197"


def get_img():
    """Download the CAPTCHA image, save it as ./tmp.jpg, and open it."""
    resp = urllib.request.urlopen(pic_url)
    raw = resp.read()
    with open("./tmp.jpg", 'wb') as fp:
        fp.write(raw)
    return Image.open("./tmp.jpg")


def get_sub_img(im, x, y):
    """Crop the tile at column x, row y out of the CAPTCHA grid."""
    assert 0 <= x <= 3
    assert 0 <= y <= 2
    WIDTH = HEIGHT = 68
    left = 5 + (67 + 5) * x
    top = 41 + (67 + 5) * y
    right = left + 67
    bottom = top + 67
    return im.crop((left, top, right, bottom))


def baidu_stu_lookup(im):
    """Upload one tile to Baidu image search and return its keywords."""
    url = ("http://stu.baidu.com/n/image?fr=html5&needRawImageUrl=true"
           "&id=WU_FILE_0&name=233.png&type=image%2Fpng"
           "&lastModifiedDate=Mon+Mar+16+2015+20%3A49+GMT%2B0800+(CST)&size=")
    im.save("./query_temp_img.png")
    with open("./query_temp_img.png", 'rb') as fp:
        raw = fp.read()
    url = url + str(len(raw))
    req = urllib.request.Request(url, raw, {'Content-Type': 'image/png', 'User-Agent': UA})
    resp = urllib.request.urlopen(req)
    resp_url = resp.read().decode('utf-8')  # the response body is a plain URL
    url = "http://stu.baidu.com/n/searchpc?queryImageUrl=" + urllib.parse.quote(resp_url)
    req = urllib.request.Request(url, headers={'User-Agent': UA})
    resp = urllib.request.urlopen(req)
    html = resp.read().decode('utf-8', 'replace')
    return baidu_stu_html_extract(html)


def baidu_stu_html_extract(html):
    """Pull the keyword list out of the Baidu result page."""
    # pattern = re.compile(r'<script type="text/javascript">(.*?)</script>', re.DOTALL | re.MULTILINE)
    pattern = re.compile(r"keywords:'(.*?)'")
    matches = pattern.findall(html)
    if not matches:
        return '[UNKNOWN]'
    json_str = matches[0]
    # Undo the JavaScript string escaping: \x22 becomes ", \\ becomes \
    json_str = json_str.replace('\\x22', '"').replace('\\\\', '\\')
    # print(json_str)
    result = [item['keyword'] for item in json.loads(json_str)]
    return '|'.join(result) if result else '[UNKNOWN]'


def ocr_question_extract(im):
    """OCR the question text in the top strip of the CAPTCHA image."""
    # git@github.com:madmaze/pytesseract.git
    global pytesseract
    try:
        import pytesseract
    except ImportError:
        print("[ERROR] pytesseract not installed")
        return
    im = im.crop((127, 3, 260, 22))
    im = pre_ocr_processing(im)
    # im.show()
    return pytesseract.image_to_string(im).strip()


def pre_ocr_processing(im):
    """Flatten the background, then binarize the image for OCR."""
    im = im.convert("RGB")
    width, height = im.size
    # A blurred, max-filtered copy approximates the local background color.
    white = im.filter(ImageFilter.BLUR).filter(ImageFilter.MaxFilter(23))
    grey = im.convert('L')
    impix = im.load()
    whitepix = white.load()
    greypix = grey.load()
    for y in range(height):
        for x in range(width):
            greypix[x, y] = min(255, max(255 + impix[x, y][0] - whitepix[x, y][0],
                                         255 + impix[x, y][1] - whitepix[x, y][1],
                                         255 + impix[x, y][2] - whitepix[x, y][2]))
    new_im = grey.copy()
    binarize(new_im, 150)
    return new_im


def binarize(im, thresh=120):
    """Turn a greyscale image into pure black and white, in place."""
    assert 0 < thresh < 255
    assert im.mode == 'L'
    w, h = im.size
    for y in range(0, h):
        for x in range(0, w):
            if im.getpixel((x, y)) < thresh:
                im.putpixel((x, y), 0)
            else:
                im.putpixel((x, y), 255)


if __name__ == '__main__':
    im = get_img()
    # im = Image.open("./tmp.jpg")
    print('OCR Question:', ocr_question_extract(im))
    for y in range(2):
        for x in range(4):
            im2 = get_sub_img(im, x, y)
            result = baidu_stu_lookup(im2)
            print((y, x), result)
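
The keyword extraction in step 3 can be tried offline, without contacting any server. The following is a minimal sketch, assuming the result page embeds a `keywords:'...'` JavaScript string in which double quotes are written as `\x22`, which is what the regular expression in the listing suggests. The sample HTML and the name `extract_keywords` are made up for illustration and are not real Baidu output.

```python
import json
import re

# Hypothetical page fragment; a real result page may look different.
SAMPLE_HTML = (
    r"<script>var page = {keywords:'[{\x22keyword\x22:\x22cat\x22},"
    r"{\x22keyword\x22:\x22kitten\x22}]'};</script>"
)


def extract_keywords(html):
    """Return the keywords joined by '|', or '[UNKNOWN]' if none are found."""
    matches = re.findall(r"keywords:'(.*?)'", html)
    if not matches:
        return "[UNKNOWN]"
    # Undo the JavaScript escaping so the payload parses as JSON.
    json_str = matches[0].replace("\\x22", '"').replace("\\\\", "\\")
    keywords = [item["keyword"] for item in json.loads(json_str)]
    return "|".join(keywords) if keywords else "[UNKNOWN]"


if __name__ == "__main__":
    print(extract_keywords(SAMPLE_HTML))            # cat|kitten
    print(extract_keywords("<html>nothing</html>"))  # [UNKNOWN]
```

Returning a sentinel such as `[UNKNOWN]` instead of raising keeps the per-tile loop running when one lookup yields nothing.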
