
HttpClient 4.x: simulating a login and then requesting a protected URL

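I'd like to ask for some help. I need to use HttpClient 4.x to simulate logging in to a website and then open one of its links to scrape data. How should HttpClient's policies and other settings be configured? I create one HttpClient instance, use it to POST the login, and then use the same instance to request the protected resource, but the site answers with a "not logged in" error.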

HttpClient is configured like this:
    // Set component parameters; HTTP protocol version: 1.1/1.0/0.9
    HttpParams params = new BasicHttpParams();
    HttpProtocolParams.setVersion(params, HttpVersion.HTTP_1_1);
    HttpProtocolParams.setUserAgent(params, "HttpComponents/1.1");
    HttpProtocolParams.setUseExpectContinue(params, true);


    // Set timeouts
    int REQUEST_TIMEOUT = 10*1000; // connection timeout: 10 seconds
    int SO_TIMEOUT = 10*1000; // timeout while waiting for data: 10 seconds
    //HttpConnectionParams.setConnectionTimeout(params, REQUEST_TIMEOUT);
    //HttpConnectionParams.setSoTimeout(params, SO_TIMEOUT);
    params.setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, REQUEST_TIMEOUT);
    params.setParameter(CoreConnectionPNames.SO_TIMEOUT, SO_TIMEOUT);
 
    // Register the supported protocol schemes
    SchemeRegistry schreg = new SchemeRegistry();
    schreg.register(new Scheme("http", 80, PlainSocketFactory.getSocketFactory()));
    schreg.register(new Scheme("https", 443, SSLSocketFactory.getSocketFactory()));

    // Thread-safe connection manager for multiple connections
    PoolingClientConnectionManager pccm = new PoolingClientConnectionManager(schreg);
    pccm.setDefaultMaxPerRoute(20); // maximum concurrent connections per host
    pccm.setMaxTotal(100); // maximum concurrent connections for the whole client

    HttpClient httpClient = new DefaultHttpClient(pccm, params);

    // I have tried both of these cookie policies; neither works.
    //httpClient.getParams().setParameter(ClientPNames.COOKIE_POLICY, CookiePolicy.BROWSER_COMPATIBILITY);
    httpClient.getParams().setParameter(ClientPNames.COOKIE_POLICY, CookiePolicy.BEST_MATCH);

Could someone share a demo or explain how HttpClient should be configured?

Asked by 小旋风柴进 on 2016-03-05 14:45:44
1 answer
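  • No special configuration is needed. As long as you keep using the same HttpClient instance, it automatically sends the session cookie it stored after a successful login.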

    I have used it like this, and scraping works:

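    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.UnsupportedEncodingException;
    import java.util.List;

    import org.apache.http.HttpEntity;
    import org.apache.http.HttpResponse;
    import org.apache.http.NameValuePair;
    import org.apache.http.client.ClientProtocolException;
    import org.apache.http.client.entity.UrlEncodedFormEntity;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.params.HttpConnectionParams;
    import org.apache.http.params.HttpParams;
    import org.apache.http.protocol.HTTP;

    public class Spider {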
        private DefaultHttpClient httpClient;
        private HttpResponse response;
        private HttpEntity entity;
         
        public Spider()
        {
            this.httpClient = new DefaultHttpClient();
            HttpParams params = httpClient.getParams();
            /* connection timeout (ms) */
            HttpConnectionParams.setConnectionTimeout(params, 30000);
            /* read timeout (ms) */
            HttpConnectionParams.setSoTimeout(params, 30000);
        }
         
        public void post(String url, List<NameValuePair> nameValuePair) throws ClientProtocolException, IOException {
            HttpPost httpost = new HttpPost(url);
            if(nameValuePair != null)
            {
                httpost.setEntity(new UrlEncodedFormEntity(nameValuePair, HTTP.UTF_8));
            }
            this.response = this.httpClient.execute(httpost);
            this.entity = response.getEntity();                     
        }
         
        public void get(String url) throws ClientProtocolException, IOException {
            HttpGet httpGet = new HttpGet(url);
            this.response = this.httpClient.execute(httpGet);
            this.entity = response.getEntity();         
        }
         
        public void readResponseContent() throws UnsupportedEncodingException, IllegalStateException, IOException
        {
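            BufferedReader reader = new BufferedReader(new InputStreamReader(this.entity.getContent(), "utf-8"));
            try {
                // Read whatever you need from the reader here
            } finally {
                // Always release the connection, even if reading fails
                releaseEntity();
            }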
        }
         
        private void releaseEntity() throws IOException
        {
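            if(this.entity != null){
                // HttpEntity.consumeContent() is deprecated since 4.1; EntityUtils.consume is its replacement
                org.apache.http.util.EntityUtils.consume(this.entity);
            }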
        }
    }
    Answered 2019-07-17 18:53:29
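
The answer relies on the client keeping its cookies and replaying the session cookie on later requests. As a rough illustration of that mechanism in isolation, the sketch below uses only JDK classes (Java 11+), not Apache HttpClient: a throwaway local server built with `com.sun.net.httpserver.HttpServer`, and `java.net.http.HttpClient` with a `CookieManager` standing in for HttpClient 4.x's cookie store. The class name `SessionCookieDemo`, the `/login` and `/protected` paths, the form fields, and the `SESSIONID=abc123` cookie are all made up for the example.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.CookieManager;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class SessionCookieDemo {

    // Hypothetical session cookie issued by the throwaway server below.
    static final String SESSION_COOKIE = "SESSIONID=abc123";

    /** Starts a local server: /login sets the cookie, /protected requires it. */
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/login", ex -> {
            ex.getRequestBody().readAllBytes(); // drain the posted form
            ex.getResponseHeaders().add("Set-Cookie", SESSION_COOKIE + "; Path=/");
            reply(ex, 200, "logged in");
        });
        server.createContext("/protected", ex -> {
            List<String> cookies = ex.getRequestHeaders().get("Cookie");
            boolean ok = cookies != null
                    && cookies.stream().anyMatch(c -> c.contains(SESSION_COOKIE));
            reply(ex, ok ? 200 : 401, ok ? "secret data" : "not logged in");
        });
        server.start();
        return server;
    }

    static void reply(HttpExchange ex, int status, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        ex.sendResponseHeaders(status, bytes.length);
        try (OutputStream out = ex.getResponseBody()) {
            out.write(bytes);
        }
    }

    /** Optionally logs in, then fetches /protected with the SAME client; returns the status code. */
    static int fetchProtected(boolean loginFirst) {
        HttpServer server = null;
        try {
            server = startServer();
            String base = "http://127.0.0.1:" + server.getAddress().getPort();
            // One client with one cookie store, reused for every request.
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .cookieHandler(new CookieManager())
                    .build();
            if (loginFirst) {
                HttpRequest login = HttpRequest.newBuilder(URI.create(base + "/login"))
                        .header("Content-Type", "application/x-www-form-urlencoded")
                        .POST(HttpRequest.BodyPublishers.ofString("user=demo&pass=demo"))
                        .build();
                client.send(login, HttpResponse.BodyHandlers.ofString());
            }
            HttpRequest get = HttpRequest.newBuilder(URI.create(base + "/protected")).build();
            return client.send(get, HttpResponse.BodyHandlers.ofString()).statusCode();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException("demo request failed", e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("with login:    " + fetchProtected(true));
        System.out.println("without login: " + fetchProtected(false));
    }
}
```

Running `main` should report 200 for the logged-in case and 401 otherwise. With HttpClient 4.x, a comparable check is to keep the client typed as `DefaultHttpClient` and look at `getCookieStore().getCookies()` right after the login POST. If no session cookie shows up there, the login request itself probably did not succeed (wrong form fields, for instance), and changing the cookie policy is unlikely to help.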