Revising a Python Crawler: Scraping Taobao MM Photos

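This revision is a learning exercise and an optimization built on the original code. Since I have only just started learning, I kept some of the comments in the code so that I can review them later and revisit the syntax.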

Please read the original article first to see what the code is meant to do; I will not repeat that here.

Original article: Python爬虫实战四之抓取淘宝MM照片 (Python Crawler in Practice, Part 4: Scraping Taobao MM Photos)

Thanks to Qingcai (庆才哥) for the code and the approach.

What this revision changes

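  • The code is rewritten to run on Python 3.5, because the syntax and some modules differ from the version the original was written for
  • Cookie verification is added to deal with the redirect
  • A failure to save an image is now tolerated instead of stopping the program
  • Other minor changes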
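The fault-tolerance item can be illustrated with a minimal sketch. This is not the code from the post: the function names `save_image` and `save_all` and the file-naming scheme are hypothetical. It assumes Python 3, where the Python 2 `urllib2` functionality lives in `urllib.request` and `urllib.error`.

```python
import urllib.error
import urllib.request


def save_image(image_url, file_path, timeout=10):
    """Download one image; return True on success, False on any failure."""
    try:
        with urllib.request.urlopen(image_url, timeout=timeout) as response:
            data = response.read()
        with open(file_path, "wb") as f:
            f.write(data)
        return True
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # Report the problem and carry on, so one bad image
        # does not stop the whole run.
        print("skipped", image_url, "->", exc)
        return False


def save_all(image_urls, directory):
    """Try every URL and return how many images were saved."""
    saved = 0
    for index, url in enumerate(image_urls, start=1):
        # Hypothetical naming scheme: 1.jpg, 2.jpg, ...
        if save_image(url, "{}/{}.jpg".format(directory, index)):
            saved += 1
    return saved
```

Catching the exception inside the per-image function keeps the loop simple: the caller only sees a success flag and can count or log failures as it likes.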

Program flow

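After adding the cookie and disguising the request as a browser, the program first saves all the detail pages. The cookie has an expiry time and downloading every image takes too long, so the detail pages are saved locally first, and the image URLs are then extracted from them to fetch the images. Alternatively, the extracted URLs could be written to a file and loaded later to fetch the images. That is just another approach; either one works.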
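The second phase of this flow, reading the saved detail pages and pulling out image URLs, might look roughly like the sketch below. The regular expression, the `.html` file extension, and the function names are assumptions for illustration; the real page markup may need a different pattern.

```python
import os
import re

# Hypothetical pattern: src attributes of <img> tags that end in .jpg.
IMG_SRC = re.compile(r'<img[^>]+src="([^"]+?\.jpg)"', re.IGNORECASE)


def extract_image_urls(html):
    """Return the image URLs found in one saved detail page, in order."""
    urls = []
    for src in IMG_SRC.findall(html):
        # Protocol-relative URLs (//host/path) need a scheme before download.
        if src.startswith("//"):
            src = "https:" + src
        if src not in urls:  # drop duplicates, keep first occurrence
            urls.append(src)
    return urls


def collect_from_saved_pages(page_dir):
    """Read every saved .html file and map its file name to its image URLs."""
    result = {}
    for name in sorted(os.listdir(page_dir)):
        if name.endswith(".html"):
            path = os.path.join(page_dir, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                result[name] = extract_image_urls(f.read())
    return result
```

Because this phase works only on local files, it can be rerun or debugged at any time, even after the cookie has expired.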

Code

How to get the cookie

Note: copy only the cookie part, then remove the # in the code so that the cookie line is no longer commented out.

[Image: screenshot from the original post]
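As a rough sketch of where the copied cookie ends up, the snippet below attaches it, together with a browser-like User-Agent, to a request. The names `COOKIE`, `USER_AGENT`, `build_request`, and `fetch_page` are hypothetical, the cookie value is a placeholder, and the `gbk` page encoding is an assumption rather than something stated in the post.

```python
import urllib.request

# Paste the cookie string copied from the browser here.
# Placeholder value only; a real one looks like "name1=value1; name2=value2".
COOKIE = "name1=value1; name2=value2"

# Example browser User-Agent string; any common one should do.
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def build_request(url, cookie=COOKIE, user_agent=USER_AGENT):
    """Create a Request that carries the cookie and a browser User-Agent."""
    headers = {"Cookie": cookie, "User-Agent": user_agent}
    return urllib.request.Request(url, headers=headers)


def fetch_page(url, encoding="gbk"):
    """Fetch a page with the cookie attached and decode it to text."""
    # The encoding is an assumption; adjust it to match the actual page.
    with urllib.request.urlopen(build_request(url), timeout=10) as response:
        return response.read().decode(encoding, errors="replace")
```

Only the value of the `Cookie` request header is needed, which is why the note above says to copy just the cookie part.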

Result

[Images: four screenshots of the result from the original post]


I am just starting out, so everyone is welcome to learn and discuss this together.

Please credit the original author when reposting.
