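import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;

public class WebSpider {
    public static void main(String[] args) throws Exception {
        String urlString = "http://lggege.iteye.com/blog/173840";
        URL url = new URL(urlString);
        Object contentObj = url.getContent();
        if (contentObj instanceof InputStream) {
            // Encoding still needs proper handling; here the page is simply assumed to be UTF-8
            // and the bytes are decoded while reading.
            BufferedReader br = new BufferedReader(
                    new InputStreamReader((InputStream) contentObj, "UTF-8"));
            StringBuffer sb = new StringBuffer();
            String line;
            // readLine() returns null at end of stream; ready() is not a reliable end-of-stream test.
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
            br.close();
            System.out.println(sb.toString());
        }
    }
}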
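That is the code.
This step:
Object contentObj = url.getContent();
is where the request is actually sent to the server behind the URL and the data, that is, the page source, comes back.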
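To make that step a bit more concrete: according to the Javadoc, url.getContent() is shorthand for url.openConnection().getContent(). The sketch below spells the same request out with URLConnection, which also gives access to the Content-Type header, so the charset can be taken from the response instead of being hard-coded. The class name PageFetcher and the helpers charsetFrom, readAll and fetch are made up for this illustration, and falling back to UTF-8 when no charset is declared is an assumption, not something the server guarantees.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PageFetcher {

    // Hypothetical helper: picks the charset out of a Content-Type value such as
    // "text/html; charset=UTF-8". Falls back to UTF-8 (an assumption) when the
    // header is missing or names a charset this JVM does not know.
    public static Charset charsetFrom(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // malformed or unsupported charset name
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Reads the whole stream line by line, decoding with the given charset.
    // Every line is terminated with '\n' in the result.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Opens the connection explicitly; getInputStream() is the call that
    // actually connects and sends the request.
    public static String fetch(String urlString) throws IOException {
        URLConnection conn = new URL(urlString).openConnection();
        InputStream in = conn.getInputStream();
        return readAll(in, charsetFrom(conn.getContentType()));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch("http://lggege.iteye.com/blog/173840"));
    }
}
```

Pages that declare their encoding only in an HTML meta tag are not covered by this sketch; those would still be decoded with the UTF-8 fallback.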