Extracting the source code of a web page in Java
Hello everyone,
I am trying to extract the source code of a web page in Java.
My class takes the link to the page (http://....) as an argument and creates an output text file containing the source code.
The program works very well, except for Google.
When I pass it a link to a Google results page, the output file does not contain the true source code of that page.
To see what I mean by the true source of a Google results page, compare the code shown by Firefox (View -> Source) with the code shown by Google Chrome (developer options -> View source).
A Google results page is the page that appears when you run a search.
My Java code is as follows:
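import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

public class test {
    // Downloads the page at the given address and writes its bytes to source.txt.
    public static void getIpFrom(String adresse) {
        try {
            URL url = new URL(adresse);
            URLConnection uc = url.openConnection();
            // try-with-resources closes both streams even if reading fails
            try (InputStream in = uc.getInputStream();
                 FileOutputStream fos = new FileOutputStream(new File("source.txt"))) {
                int n;
                while ((n = in.read()) >= 0) {
                    fos.write(n);
                }
            }
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        // "web link" is a placeholder for the link to the page
        getIpFrom("web link");
    }
}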
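
One thing I have not ruled out is that the server answers differently depending on the request headers, since URLConnection does not identify itself as a browser by default. I do not know whether this is the cause for Google, so the following is only a minimal sketch of how to test that guess. The class name PageFetcher, the helper names open and copy, and the User-Agent value are all made up for illustration.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

class PageFetcher {
    // Hypothetical browser-like value; any real browser's User-Agent string could be used instead.
    static final String USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)";

    // Prepares the connection and sets the User-Agent header.
    // No request is sent until the input stream is opened.
    static URLConnection open(String address, String userAgent) throws IOException {
        URLConnection uc = new URL(address).openConnection();
        uc.setRequestProperty("User-Agent", userAgent);
        return uc;
    }

    // Copies every byte from in to out using a small buffer.
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) >= 0) {
            out.write(buffer, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is expected to be the link to the page
        URLConnection uc = open(args[0], USER_AGENT);
        try (InputStream in = uc.getInputStream();
             OutputStream fos = new FileOutputStream("source.txt")) {
            copy(in, fos);
        }
    }
}
```

If source.txt still differs from what the browser shows with this header set, the difference probably comes from something else, for example content that the browser builds with JavaScript after the page has loaded.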