4 8 2018

 

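  • URL: https://www.aqistudy.cn/html/city_detail.html

  • Analysis approach:

    • 1. The page provides query conditions. After you set them and click the query button, the matching data is loaded.

      • Query conditions:

        • City name

        • Time range of the query

      • After the query button is clicked, the page as a whole does not reload; only part of it is refreshed.

        • Explanation: clicking the query button sends an ajax request. That request refreshes part of the page and retrieves the indicator data that matches the query conditions.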

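    • 2. Goal: obtain the data that is loaded after the query button is clicked, that is, the data returned by the ajax request.

      • A packet-capture tool can capture the ajax request. From the captured packet we want to extract:

        • Request URL: https://www.aqistudy.cn/apinew/aqistudyapi.php

        • Request method: POST

        • Request parameter: d, whose value is an unreadable string

          • The parameter is dynamic (its value is different for every request)

        • Response data: an unreadable string

          • The response must eventually be rendered on the page, yet the captured response is not the plaintext that the page displays. The response data must therefore be ciphertext.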
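
A quick way to confirm the observation in step 2 is to check whether a captured response body parses as JSON or only as Base64 text. The following is a minimal sketch, assuming the ciphertext is Base64 as the captured samples in these notes suggest. `classify_body` is a hypothetical helper name, not part of the site's code, and the check is only a heuristic.

```python
import base64
import binascii
import json


def classify_body(body: str) -> str:
    """Return 'json', 'base64', or 'unknown' for a captured response body."""
    text = body.strip()
    try:
        json.loads(text)
        return "json"
    except ValueError:
        pass  # not JSON, so try Base64 next
    try:
        # validate=True rejects characters outside the Base64 alphabet
        base64.b64decode(text, validate=True)
        return "base64"
    except (binascii.Error, ValueError):
        return "unknown"


# Plain JSON, as an unencrypted API would return it
print(classify_body('{"success": true}'))
# The first 32 characters of the encrypted response captured in these notes
print(classify_body("zWuCNHIajZGK3SuGLkK7croI+D0P2Zs0"))
```

A body classified as `base64` is only a hint that it is encrypted. The decoded bytes still need the site's own decryption routine.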

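    • 3. Handling the dynamic request parameter

      • The request is an ajax request, so is the request parameter set somewhere in the JavaScript or jQuery code that sends it?

      • Clicking the query button sends the ajax request, so the button must have a click event bound to it, and the handler for that event runs the ajax request code.

      • Reading the ajax request code should reveal how the parameter value is produced.

    • 4. Locate the ajax request code.

      • Find the click event bound to the query button (using the Firefox developer tools).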

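    • 5. getData turns out to be the function triggered by clicking the query button, so look inside its implementation for the ajax request code.

    • 6. Implementation of getData:

      • type=="HOUR": the query is by hour

      • There is no ajax request code here, but two other functions are called, so the ajax request code must be inside their implementations: getAQIData(); getWeatherData();

    • 7. Implementation of getAQIData:

      • There is no ajax request code here either; it calls another function, getServerData.

        • Arguments passed to getServerData:

          • method = GETDETAIL

          • param = {city,type,starttime,endtime}

    • 8. Implementation of getServerData(method,param):

      • It can be located with the global search in the Chrome developer tools.

      • JS obfuscation: the site obfuscates the implementations of some of its more important functions so that they are hard to read.

      • JS deobfuscation: brute force, using an online tool at http://www.bm8.com.cn/jsConfusion/

      • After deobfuscation, the implementation of the function is found: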

function getServerData(method, object, callback, period) {
    const key = hex_md5(method + JSON.stringify(object));
    const data = getDataFromLocalStorage(key, period);
    if (!data) {
        var param = getParam(method, object);
        $.ajax({
            url: '../apinew/aqistudyapi.php',
            data: { d: param },
            type: "post",
            success: function (data) {
                data = decodeData(data);
                obj = JSON.parse(data);
                if (obj.success) {
                    if (period > 0) {
                        obj.result.time = new Date().getTime();
                        localStorageUtil.save(key, obj.result)
                    }
                    callback(obj.result)
                } else {
                    console.log(obj.errcode, obj.errmsg)
                }
            }
        })
    } else {
        callback(data)
    }
}

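  • The implementation of getServerData finally reveals the ajax request code, and shows:

    • The dynamic request parameter is the return value of getParam(method, object)

      • Function arguments

        • method: GETDETAIL

        • object: {city,type,starttime,endtime}

    • The function that decrypts the encrypted response: decodeData(data)

      • The argument data is the encrypted response; the return value is the decrypted plaintext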
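
The control flow of getServerData can be mirrored in Python to make the order of operations explicit: compute a cache key, return cached data if present, otherwise build the `d` parameter, post it, decrypt the reply and check `success`. This is only a sketch under stated assumptions. `get_param`, `post` and `decode_data` are hypothetical callables standing in for the site's getParam, the HTTP POST and decodeData; the in-memory dict stands in for localStorage and ignores the expiry that getDataFromLocalStorage presumably applies; and the MD5 key may differ from the site's hex_md5 for non-ASCII input.

```python
import hashlib
import json
import time
from typing import Any, Dict

_cache: Dict[str, Dict[str, Any]] = {}  # stand-in for the browser's localStorage


def cache_key(method: str, obj: Dict[str, Any]) -> str:
    """Approximate hex_md5(method + JSON.stringify(object))."""
    # JSON.stringify emits no spaces and keeps non-ASCII characters as they are.
    payload = json.dumps(obj, separators=(",", ":"), ensure_ascii=False)
    return hashlib.md5((method + payload).encode("utf-8")).hexdigest()


def get_server_data(method, obj, get_param, post, decode_data, period=0):
    """Mirror the control flow of getServerData; return obj.result or None."""
    key = cache_key(method, obj)
    cached = _cache.get(key)
    if cached is not None:
        return cached                       # cache hit: no request is sent
    param = get_param(method, obj)          # dynamic, encrypted value for `d`
    reply = json.loads(decode_data(post({"d": param})))
    if not reply.get("success"):
        print(reply.get("errcode"), reply.get("errmsg"))
        return None
    result = reply["result"]
    if period > 0:
        result["time"] = int(time.time() * 1000)  # like new Date().getTime()
        _cache[key] = result
    return result


# Usage with fake callables, so nothing is sent over the network
fake = get_server_data(
    "GETDETAIL",
    {"city": "北京", "type": "HOUR"},
    get_param=lambda m, o: "ENCRYPTED",
    post=lambda form: "CIPHERTEXT",
    decode_data=lambda text: '{"success": true, "result": {"rows": []}}',
)
print(fake)  # {'rows': []}
```

Passing the three operations in as callables keeps the sketch independent of how they are implemented, whether by hand-written Python or by calling the original JavaScript.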

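// General form of a jQuery $.ajax call, for reference
$.ajax({
    url: "address the request is sent to (to submit or read data)",
    dataType: "data type expected back from the server",
    type: "request method",
    async: "true/false",
    data: {id: func},
    success: function(data){ /* runs when the request succeeds */ },
    error: function(){ /* runs when the request fails */ }
});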
​
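  • JS reverse engineering

    • The logic in the JS code can be made callable from Python.

    • Approaches:

      • 1. Rewrite the JS functions as Python functions by hand

      • 2. Use a tool that runs the JS functions from Python, so no manual rewrite is needed

        • PyExecJS

  • About PyExecJS: PyExecJS is a library that lets Python run JavaScript code. Install it with pip install PyExecJS.

    • Note: PyExecJS needs a JavaScript runtime on the local machine. These notes use Node.js, so install it first.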

# Obtain the dynamic ajax request parameter at run time
import execjs

node = execjs.get()  # pick an available JavaScript runtime (Node.js here)

# Params
method = 'GETCITYWEATHER'
city = '北京'
query_type = 'HOUR'
start_time = '2018-01-25 00:00:00'
end_time = '2018-01-25 23:00:00'

# Compile javascript
# jsCode.js must define the getPostParamCode function called below
file = 'jsCode.js'
with open(file, encoding='utf-8') as f:
    ctx = node.compile(f.read())

# Get params
js = 'getPostParamCode("{0}", "{1}", "{2}", "{3}", "{4}")'.format(method, city, query_type, start_time, end_time)
params = ctx.eval(js)
print(params)
tdgHOYxwKdDSgYXe+RLPzYCgLvrddahasI5XXklB4gVLYqab+XRPpMD/oSqnJ/aEmFwzVEUhLnPzRy03+X1BI4qc9EYeRPqiKrT+f1JQExGQ4ii8kKvZhGH+nPffaX/xq5iLB6vblcvBC/L8e6Uxdoy1N4PQCvheW5e+Ue+GW8E/8rUazpxk5L1PcTdlYGVelfCF4Ggz/cipYbNwLxW5Z40n9tSmyybjH0FWbw5qXRgjV7WJpw3lryWXHtUFNufOqsYveofgGAr1AFFbQem3A8fU8jlTIBdC03Tq4ct5fqaGzlaX4LTG6L0M4sSm4y4MgdPb7WF8oXriA5oJ2CIidNVPeFXU7n/sp7H0d+It+H1322kuZ4aaJeDVdLsd7+SwfMiurRgSqmof3CPYWVIfeQ59WNd1MImbrpEd3TyKV+gjWWeqoYqzOQBLQVKlqQSXtQZPfonCMur3vCqrCZ+9RraKsnr0EoqQb1VimVTSLCU=
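
The `str.format` call above wraps each argument in double quotes by hand, which would produce invalid JavaScript if a value ever contained a quote or a backslash. One possible alternative, sketched here, is to serialize every argument with `json.dumps`, since a JSON string literal is also a valid JavaScript string literal. `build_js_call` is a hypothetical helper name introduced only for this illustration.

```python
import json


def build_js_call(func_name: str, *args) -> str:
    """Build a JavaScript call expression with safely quoted arguments."""
    # json.dumps escapes quotes, backslashes and non-ASCII characters, so the
    # result can be pasted into JavaScript source as-is.
    return "{}({})".format(func_name, ", ".join(json.dumps(a) for a in args))


expr = build_js_call("getPostParamCode", "GETCITYWEATHER", "北京", "HOUR",
                     "2018-01-25 00:00:00", "2018-01-25 23:00:00")
print(expr)
```

The city name is emitted as `\u5317\u4eac`, which the JavaScript engine reads back as the same two characters, so the expression is equivalent to the hand-built one.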

 

# Send the ajax request with the dynamic parameter to get the encrypted response
import execjs
import requests

node = execjs.get()  # pick an available JavaScript runtime (Node.js here)

# Params
method = 'GETCITYWEATHER'
city = '北京'
query_type = 'HOUR'
start_time = '2018-01-25 00:00:00'
end_time = '2018-01-25 23:00:00'

# Compile javascript
# jsCode.js must define the getPostParamCode function called below
file = 'jsCode.js'
with open(file, encoding='utf-8') as f:
    ctx = node.compile(f.read())

# Get params
js = 'getPostParamCode("{0}", "{1}", "{2}", "{3}", "{4}")'.format(method, city, query_type, start_time, end_time)
params = ctx.eval(js)

# Send the POST request; the encrypted parameter goes in form field "d"
url = 'https://www.aqistudy.cn/apinew/aqistudyapi.php'
response_text = requests.post(url, data={'d': params}, timeout=10).text
print(response_text)
zWuCNHIajZGK3SuGLkK7croI+D0P2Zs04EFeGWwtH/WEvYlmCQE8ZYJulHhzk+NKmSgV+5/BhyoLRoHfASQ6dEcad5P0dmzI29nOPVRoFury3HYpMkintvgOrVt0QwWsztKE7tc6vDiM1zSkDP0goA==
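
The `d` value is Base64 text, so it can contain `+`, `/` and `=`. In a form-encoded POST body these characters must be percent-encoded, otherwise a server would typically read `+` as a space. `requests` handles this automatically when it is given `data={'d': params}`. The sketch below uses the standard library's `urllib.parse.urlencode` only to illustrate what ends up in the request body; the sample value is a short made-up string, not a real parameter.

```python
from urllib.parse import parse_qs, urlencode

sample_param = "tdgHOYxw+RLPzYCg/oSqnJ/aEmU="  # made-up Base64-like value

# Form-encode the field the same way a dict passed as `data` is encoded
body = urlencode({"d": sample_param})
print(body)  # d=tdgHOYxw%2BRLPzYCg%2FoSqnJ%2FaEmU%3D

# Decoding the body gives back the original value unchanged
decoded = parse_qs(body)["d"][0]
print(decoded == sample_param)  # True
```

This matters mainly when the request is replayed with a tool that does not encode the body, for example when pasting the raw value into a hand-written request.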

 

# Decrypt the encrypted response data
import execjs
import requests

node = execjs.get()  # pick an available JavaScript runtime (Node.js here)

# Params
method = 'GETCITYWEATHER'
city = '北京'
query_type = 'HOUR'
start_time = '2018-01-25 00:00:00'
end_time = '2018-01-25 23:00:00'

# Compile javascript
# jsCode.js must define both getPostParamCode and decodeData
file = 'jsCode.js'
with open(file, encoding='utf-8') as f:
    ctx = node.compile(f.read())

# Get params
js = 'getPostParamCode("{0}", "{1}", "{2}", "{3}", "{4}")'.format(method, city, query_type, start_time, end_time)
params = ctx.eval(js)

# Send the POST request; the encrypted parameter goes in form field "d"
url = 'https://www.aqistudy.cn/apinew/aqistudyapi.php'
response_text = requests.post(url, data={'d': params}, timeout=10).text

# Decrypt the encrypted response with the site's own decodeData function
js = 'decodeData("{0}")'.format(response_text)
decrypted_data = ctx.eval(js)
print(decrypted_data)

 

  • The decrypted plaintext data that is returned

{"success":true,"errcode":0,"errmsg":"success","result":{"success":true,"data":{"total":24,"rows":[{"time":"2018-01-25 00:00:00","temp":"-7","humi":"35","wse":"1","wd":"\u4e1c\u5317\u98ce","tq":"\u6674"},{"time":"2018-01-25 01:00:00","temp":"-9","humi":"38","wse":"1","wd":"\u897f\u98ce","tq":"\u6674"},{"time":"2018-01-25 02:00:00","temp":"-10","humi":"40","wse":"1","wd":"\u4e1c\u5317\u98ce","tq":"\u6674"},{"time":"2018-01-25 03:00:00","temp":"-8","humi":"27","wse":"2","wd":"\u4e1c\u5317\u98ce","tq":"\u6674"},{"time":"2018-01-25 04:00:00","temp":"-8","humi":"26","wse":"2","wd":"\u4e1c\u98ce","tq":"\u6674"},{"time":"2018-01-25 05:00:00","temp":"-8","humi":"23","wse":"2","wd":"\u4e1c\u5317\u98ce","tq":"\u6674"},{"time":"2018-01-25 06:00:00","temp":"-9","humi":"27","wse":"2","wd":"\u4e1c\u5317\u98ce","tq":"\u591a\u4e91"},{"time":"2018-01-25 07:00:00","temp":"-9","humi":"24","wse":"2","wd":"\u4e1c\u5317\u98ce","tq":"\u591a\u4e91"},{"time":"2018-01-25 08:00:00","temp":"-9","humi":"25","wse":"2","wd":"\u4e1c\u98ce","tq":"\u6674\u8f6c\u591a\u4e91\u8f6c\u591a\u4e91\u95f4\u6674"},{"time":"2018-01-25 09:00:00","temp":"-8","humi":"21","wse":"3","wd":"\u4e1c\u5317\u98ce","tq":"\u6674\u8f6c\u591a\u4e91\u8f6c\u591a\u4e91\u95f4\u6674"},{"time":"2018-01-25 10:00:00","temp":"-7","humi":"19","wse":"3","wd":"\u4e1c\u5317\u98ce","tq":"\u6674\u8f6c\u591a\u4e91\u8f6c\u591a\u4e91\u95f4\u6674"},{"time":"2018-01-25 11:00:00","temp":"-6","humi":"18","wse":"3","wd":"\u4e1c\u5317\u98ce","tq":"\u591a\u4e91"},{"time":"2018-01-25 12:00:00","temp":"-6","humi":"17","wse":"3","wd":"\u4e1c\u5317\u98ce","tq":"\u591a\u4e91"},{"time":"2018-01-25 13:00:00","temp":"-5","humi":"17","wse":"2","wd":"\u4e1c\u5317\u98ce","tq":"\u591a\u4e91"},{"time":"2018-01-25 14:00:00","temp":"-5","humi":"16","wse":"2","wd":"\u4e1c\u98ce","tq":"\u591a\u4e91"},{"time":"2018-01-25 15:00:00","temp":"-5","humi":"15","wse":"2","wd":"\u5317\u98ce","tq":"\u591a\u4e91"},{"time":"2018-01-25 16:00:00","temp":"-5","humi":"16","wse":"2","wd":"\u4e1c\u5317\u98ce","tq":"\u591a\u4e91"},{"time":"2018-01-25 17:00:00","temp":"-5","humi":"16","wse":"2","wd":"\u4e1c\u98ce","tq":"\u591a\u4e91"},{"time":"2018-01-25 18:00:00","temp":"-6","humi":"18","wse":"2","wd":"\u4e1c\u98ce","tq":"\u6674\u95f4\u591a\u4e91"},{"time":"2018-01-25 19:00:00","temp":"-7","humi":"19","wse":"2","wd":"\u4e1c\u98ce","tq":"\u6674\u95f4\u591a\u4e91"},{"time":"2018-01-25 20:00:00","temp":"-7","humi":"19","wse":"1","wd":"\u4e1c\u98ce","tq":"\u6674\u95f4\u591a\u4e91"},{"time":"2018-01-25 21:00:00","temp":"-7","humi":"19","wse":"0","wd":"\u5357\u98ce","tq":"\u6674\u95f4\u591a\u4e91"},{"time":"2018-01-25 22:00:00","temp":"-7","humi":"22","wse":"1","wd":"\u4e1c\u5317\u98ce","tq":"\u6674\u95f4\u591a\u4e91"},{"time":"2018-01-25 23:00:00","temp":"-9","humi":"27","wse":"1","wd":"\u897f\u5357\u98ce","tq":"\u6674\u95f4\u591a\u4e91"}]}}}
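
Once decrypted, the payload is ordinary JSON, so the standard `json` module is enough to turn it into Python objects. The sketch below uses a two-row excerpt of the output above (so `total` no longer matches the row count) and assumes, from the field names and values alone, that `temp` is a temperature and `humi` a humidity reading; the source does not document the fields.

```python
import json

# Two rows excerpted from the decrypted output above
decrypted_data = (
    '{"success":true,"errcode":0,"errmsg":"success","result":{"success":true,'
    '"data":{"total":24,"rows":['
    '{"time":"2018-01-25 00:00:00","temp":"-7","humi":"35","wse":"1",'
    '"wd":"\\u4e1c\\u5317\\u98ce","tq":"\\u6674"},'
    '{"time":"2018-01-25 01:00:00","temp":"-9","humi":"38","wse":"1",'
    '"wd":"\\u897f\\u98ce","tq":"\\u6674"}]}}}'
)

payload = json.loads(decrypted_data)  # \uXXXX escapes become real characters
if not payload["success"]:
    raise RuntimeError("{}: {}".format(payload["errcode"], payload["errmsg"]))

rows = payload["result"]["data"]["rows"]
for row in rows:
    # Numeric fields arrive as strings, so convert them before any analysis
    print(row["time"], int(row["temp"]), int(row["humi"]), row["wd"], row["tq"])
```

Checking the top-level `success` flag first mirrors what the site's own getServerData does before it uses `result`.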

  • Anti-scraping mechanisms:

    • JS encryption

      • JS reverse engineering can defeat JS encryption

    • Dynamic request parameters

    • JS obfuscation
