
Fetching an image from the web with a Python socket raises UnicodeDecodeError

I am working through the book Python for Informatics. Chapter 12 uses a socket to fetch an image from the web, but I get this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 246: invalid start byte

The code works fine in Python 2.7, but in Python 3.5 it fails on this line:

picture = picture + bytes.decode(data)

The full code is below. Any help fixing it would be appreciated, thanks!


import socket
import time

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('www.py4inf.com', 80))
s = 'GET http://www.py4inf.com/cover.jpg HTTP/1.0\n\n'
mysock.send(str.encode(s))

count = 0
picture = ''
while True:
    data = mysock.recv(5120)
    if (len(data) < 1):
        break
    #time.sleep(0.25)
    count = count + len(data)
    print(len(data), count)
    picture = picture + bytes.decode(data)
mysock.close()

# Look for the end of header
pos = picture.find('\r\n\r\n')
print('Header length', pos)
print(picture[:pos])

# Skip past the header and save the picture data
picture = picture[pos+4:]
fhand = open('stuff.jpg', 'wb')
fhand.write(picture)
fhand.close()



爱吃鱼的程序员 2020-06-10 14:27:27 629 0
1 answer
  • https://developer.aliyun.com/profile/5yerqm5bn5yqg?spm=a2c6h.12873639.0.0.6eae304abcjaIB

    First you need to understand the format of an HTTP response:

    <status-line><headers><blankline>[<response-body>]
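As a rough illustration of that layout, the sketch below splits a raw response into its three parts. The function name `split_http_response` and the sample bytes are made up for this example and are not part of the book's code.

```python
def split_http_response(raw: bytes):
    """Split a raw HTTP response into (status_line, headers, body)."""
    # The blank line (CRLF CRLF) separates the header from the body.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line found; header is incomplete")
    # iso-8859-1 maps every byte to a character, so this decode cannot fail.
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Hypothetical sample response; the body is four bytes like those a JPEG starts with.
sample = (b"HTTP/1.1 200 OK\r\n"
          b"Content-Type: image/jpeg\r\n"
          b"Content-Length: 4\r\n"
          b"\r\n"
          b"\xff\xd8\xff\xe0")

status_line, headers, body = split_http_response(sample)
print(status_line)
print(headers["content-type"], len(body))
```

Only the header part is ever decoded to text here; the body stays as bytes.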

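    The response to the request in your code should look like this:

    HTTP/1.1 200 OK
    Date: Tue, 23 Feb 2016 09:32:03 GMT
    Server: Apache
    Last-Modified: Fri, 04 Dec 2015 19:05:04 GMT
    ETag: "b294001f-111a9-526172f5b7cc9"
    Accept-Ranges: bytes
    Content-Length: 70057
    Connection: close
    Content-Type: image/jpeg

    <image data>

    What the socket returns is encoded data (bytes, not str), so when looking for the end of the header you should use

    pos = picture.find('\r\n\r\n'.encode('utf-8'))

    The image itself is binary, so do not decode it; write it straight to the file.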
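To see why decoding the whole response fails while decoding only the header works, here is a small offline sketch. The `data` value is a made-up stand-in for what `recv()` would return, not real output from the server.

```python
# Hypothetical stand-in for a response: a text header followed by binary image bytes.
data = b"Content-Type: image/jpeg\r\n\r\n\xff\xd8\xff\xe0"

# Decoding everything fails, because 0xff can never start a UTF-8 sequence.
try:
    data.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as err:
    decoded_ok = False
    print("decode failed:", err.reason)

# Searching with a bytes pattern and decoding only the header is safe.
pos = data.find(b"\r\n\r\n")
header_text = data[:pos].decode("utf-8")
image_bytes = data[pos + 4:]   # left as bytes, ready for a file opened with 'wb'
print(header_text)
print(image_bytes)
```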

    The modified code looks like this:

    import socket
    import time

    mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    mysock.connect(('www.py4inf.com', 80))
    s = 'GET http://www.py4inf.com/cover.jpg HTTP/1.0\n\n'
    mysock.send(s.encode('utf-8'))

    count = 0
    picture = b''
    while True:
        data = mysock.recv(5120)
        if (len(data) < 1):
            break
        #time.sleep(0.25)
        count = count + len(data)
        print(len(data), count)
        picture = picture + data
    mysock.close()

    # Look for the end of header
    pos = picture.find('\r\n\r\n'.encode('utf-8'))
    header = picture[:pos]
    print('Header length', pos)
    print(picture[:pos].decode('utf-8'))

    # Skip past the header and save the picture data
    picture = picture[pos+4:]
    fhand = open('stuff.jpg', 'wb')
    fhand.write(picture)
    fhand.close()
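If you want to try the bytes-only receive loop without depending on the remote server, one option is a local socket pair. This is only a sketch under assumptions: the helper `read_until_closed`, the fake response, and the use of `socket.socketpair()` as a stand-in server are all introduced here for illustration.

```python
import socket


def read_until_closed(sock, bufsize=5120):
    """Collect everything the peer sends until it closes the connection."""
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:              # b'' means the peer closed the connection
            break
        chunks.append(data)       # keep raw bytes; no decoding here
    return b"".join(chunks)


# Stand-in for the real server: a local socket pair and a made-up response.
server_end, client_end = socket.socketpair()
fake_body = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + b"\x00" * 100
server_end.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 104\r\n\r\n" + fake_body)
server_end.close()

raw = read_until_closed(client_end)
client_end.close()

# Split at the blank line; decode only the header, keep the body as bytes.
header, _, picture = raw.partition(b"\r\n\r\n")
print(header.decode("utf-8"))
print(len(picture), picture == fake_body)
```

Collecting chunks in a list and joining once avoids rebuilding the bytes object on every iteration, though for a single small image the difference from `picture = picture + data` is negligible.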


    2020-06-10 14:27:42