Sending a POST request in Scrapy


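Question

I am trying to crawl the latest reviews from the Google Play store, and to get them I need to make a POST request.

With Postman, I get the desired response.

But a POST request from the terminal gives me a server error. For example, this curl command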

curl -H "Content-Type: application/json" -X POST -d '{"id": "com.supercell.boombeach", "reviewType": '0', "reviewSortOrder": '0', "pageNum":'0'}' https://play.google.com/store/getreviews

gives a server error, and

Scrapy just ignores this line:

frmdata = {"id": "com.supercell.boombeach", "reviewType": 0, "reviewSortOrder": 0, "pageNum": 0}
url = "https://play.google.com/store/getreviews"
yield Request(url, callback=self.parse, method="POST", body=urllib.urlencode(frmdata))
python python-2.7 scrapy web-crawler

Answer:

Make sure that every value in your formdata is a string (str/unicode), and send it with FormRequest:

frmdata = {"id": "com.supercell.boombeach", "reviewType": '0', "reviewSortOrder": '0', "pageNum":'0'}
url = "https://play.google.com/store/getreviews"
yield FormRequest(url, callback=self.parse, formdata=frmdata)
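
If the form values start out as integers, as in the question, one way to satisfy the string requirement is to convert them before building the request. The following is only a minimal sketch: `stringify_formdata` is a hypothetical helper name, not part of Scrapy, and it assumes plain `str()` conversion is acceptable for every value.

```python
def stringify_formdata(data):
    """Return a copy of `data` with every key and value converted to str."""
    return {str(key): str(value) for key, value in data.items()}


# Same payload as in the question, with integer values.
frmdata = stringify_formdata({
    "id": "com.supercell.boombeach",
    "reviewType": 0,
    "reviewSortOrder": 0,
    "pageNum": 0,
})
print(frmdata)
# The converted dict could then be passed as FormRequest(url, formdata=frmdata).
```

Converting in one place keeps the spider code free of quoted numbers scattered through the payload.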

I think this should do it. Checked in the Scrapy shell:

In [1]: from scrapy.http import FormRequest

In [2]: frmdata = {"id": "com.supercell.boombeach", "reviewType": '0', "reviewSortOrder": '0', "pageNum":'0'}

In [3]: url = "https://play.google.com/store/getreviews"

In [4]: r = FormRequest(url, formdata=frmdata)

In [5]: fetch(r)
2015-05-20 14:40:09+0530 [default] DEBUG: Crawled (200) <POST https://play.google.com/store/getreviews> (referer: None)
[s] Available Scrapy objects:
[s]   crawler    <scrapy.crawler.Crawler object at 0x7f3ea4258890>
[s]   item       {}
[s]   r          <POST https://play.google.com/store/getreviews>
[s]   request    <POST https://play.google.com/store/getreviews>
[s]   response   <200 https://play.google.com/store/getreviews>
[s]   settings   <scrapy.settings.Settings object at 0x7f3eaa205450>
[s]   spider     <Spider 'default' at 0x7f3ea3449cd0>
[s] Useful shortcuts:
[s]   shelp()           Shell help (print this help)
[s]   fetch(req_or_url) Fetch request (or URL) and update local objects
[s]   view(response)    View response in a browser
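
For reference, here is a rough sketch of what a form-encoded POST body for this payload looks like, using only the standard library. FormRequest sends its formdata URL-encoded and, by default, with the Content-Type header set to application/x-www-form-urlencoded. The question's plain Request passed an encoded body without that header, and the curl command declared a JSON body instead. Whether either difference is what caused the server error is an assumption, not something confirmed above.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2.7, as used in the question

# A list of pairs keeps the field order stable across Python versions.
frmdata = [
    ("id", "com.supercell.boombeach"),
    ("reviewType", "0"),
    ("reviewSortOrder", "0"),
    ("pageNum", "0"),
]

# URL-encode the fields into a form body.
body = urlencode(frmdata)

# Header that a form POST is expected to carry alongside this body.
headers = {"Content-Type": "application/x-www-form-urlencoded"}

print(body)
# id=com.supercell.boombeach&reviewType=0&reviewSortOrder=0&pageNum=0
```

If a plain Request has to be used instead of FormRequest, passing both this `body` and `headers` along with method="POST" would be the closest manual equivalent.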