Here's the NewsEntity.java:
[sourcecode language='java']
package org.gandhim.news.model;

/**
 *
 * @author gandhim
 */
public class NewsEntity {

    private Long id;
    private String datePublished;
    private String link;
    private String subTitle;
    private String title;

    public void setId(Long id) {
        this.id = id;
    }

    public Long getId() {
        return id;
    }

    public String getLink() {
        return link;
    }

    public void setDatePublished(String datePublished) {
        this.datePublished = datePublished;
    }

    public void setLink(String link) {
        this.link = link;
    }

    public void setSubTitle(String subTitle) {
        this.subTitle = subTitle;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public NewsEntity(String datePublished, String link, String subTitle, String title) {
        this.datePublished = datePublished;
        this.link = link;
        this.subTitle = subTitle;
        this.title = title;
    }

    public String getDatePublished() {
        return datePublished;
    }

    public String getSubTitle() {
        return subTitle;
    }

    public String getTitle() {
        return title;
    }
}
[/sourcecode]
And this is the parser, DetikParser.java:
[sourcecode language='java']
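package org.gandhim.news.parser;

import java.io.*;
import java.nio.*;
import java.net.*;
import java.util.List;
import java.util.LinkedList;
import org.gandhim.news.model.NewsEntity;

/**
 *
 * @author gandhim
 */
public class DetikParser {

    private static List<NewsEntity> newsList = new LinkedList<NewsEntity>();
    private static String url = "http://www.detik.com/indexberita/index.php?fuseaction=indeks.berita&idkanal=10";

    // NOTE: the name of this method was lost when the listing was rendered;
    // "getNews" is a placeholder, not necessarily the original name.
    public static List<NewsEntity> getNews() {
        setupProxy();
        return parseHtml(getRawHtml());
    }

    public static void setupProxy() {
        System.setProperty("http.proxyHost", "your_proxy_server_ip");
        System.setProperty("http.proxyPort", "8080");
    }

    public static String getRawHtml() {
        StringBuilder rawHtml = new StringBuilder();
        try {
            URL detik = new URL(url);
            URLConnection detikConnection = detik.openConnection();
            BufferedReader in = new BufferedReader(new InputStreamReader(detikConnection.getInputStream()));
            try {
                String readLine;
                while ((readLine = in.readLine()) != null) {
                    rawHtml.append(readLine).append("\n");
                }
            } finally {
                // always release the connection's stream
                in.close();
            }
        } catch (Exception e) {
            System.out.println("Error occurred: " + e.getMessage());
        }
        return rawHtml.toString();
    }

    // NOTE: several HTML marker strings in this method were stripped when the
    // listing was rendered. Where a marker could be inferred from context it is
    // flagged as an assumption; where it could not, the empty string "" is left
    // in place and must be filled in from the actual page source before this runs.
    public static List<NewsEntity> parseHtml(String rawHtml) {
        if (rawHtml == null || rawHtml.trim().equals("")) {
            System.exit(-1);
        }
        String startNews = "namakanalindex";
        int startPos, endPos;
        startPos = rawHtml.indexOf(startNews);
        // "<li>" is inferred from the list bullets left in the damaged listing (assumption)
        startPos = rawHtml.indexOf("<li>", startPos) + "<li>".length();
        // end-of-list marker lost in extraction
        endPos = rawHtml.indexOf("");
        String[] news = rawHtml.substring(startPos, endPos).trim().split("<li>");
        int awal, akhir; // awal = start index, akhir = end index
        String datePublished, link, subTitle, title;
        for (int i = 1; i < news.length; i++) {
            datePublished = "";
            link = "";
            subTitle = "";
            title = "";
            // publication date (start and end markers lost in extraction)
            awal = news[i].indexOf("") + "".length();
            akhir = news[i].indexOf("");
            datePublished = news[i].substring(awal, akhir);
            // link (start marker "<a href=\"" inferred from the end marker; assumption)
            awal = news[i].indexOf("<a href=\"") + "<a href=\"".length();
            akhir = news[i].indexOf("\" class=\"judulindex\">");
            link = news[i].substring(awal, akhir);
            // subtitle, if any (start and end markers lost in extraction)
            awal = news[i].indexOf("") + "".length();
            akhir = news[i].indexOf("");
            if (akhir != -1) {
                subTitle = news[i].substring(awal, akhir);
            }
            // title (end marker "</a>" inferred from the anchor markup; assumption)
            awal = news[i].indexOf("class=\"judulindex\">") + "class=\"judulindex\">".length();
            akhir = news[i].indexOf("</a>");
            title = news[i].substring(awal, akhir);
            NewsEntity aNews = new NewsEntity(datePublished, link, subTitle, title);
            // store in the list to be displayed in Readers
            newsList.add(aNews);
        }
        return newsList;
    }
}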
[/sourcecode]
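The parser pulls each field out of the page with an `indexOf`/`substring` pair. Here is a minimal sketch of that technique as a reusable helper. The class name `MarkerExtractor` and the method `between` are hypothetical, not part of the original code. Unlike the inline version, it searches for the end marker only after the start marker and returns an empty string when a marker is missing, so a changed page layout doesn't throw an exception.

```java
/** Hypothetical helper illustrating marker-based extraction; not from the original post. */
final class MarkerExtractor {

    private MarkerExtractor() {
    }

    /**
     * Returns the text between the first occurrence of startMarker and the
     * next occurrence of endMarker after it, or "" if either marker is missing.
     */
    static String between(String text, String startMarker, String endMarker) {
        int start = text.indexOf(startMarker);
        if (start == -1) {
            return "";
        }
        // move past the start marker itself
        start += startMarker.length();
        // look for the end marker only after the start marker
        int end = text.indexOf(endMarker, start);
        if (end == -1) {
            return "";
        }
        return text.substring(start, end);
    }
}
```

With a helper like this, each field in the loop could become a single call, for example `MarkerExtractor.between(news[i], "class=\"judulindex\">", "</a>")` for the title, assuming the title is closed by `</a>`.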
Now you just need to write the code that stores each NewsEntity from the list in a database. I suggest using Hibernate.
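As a rough sketch of that storage step, the loop can be written against a small interface so it runs without a database. Everything named here (`NewsStore`, `InMemoryNewsStore`, `NewsArchiver`) is hypothetical and only for illustration. A Hibernate-backed implementation of `save` would typically call `session.save(item)` inside a transaction. Note also that Hibernate expects mapped classes to have a no-argument constructor, which NewsEntity as listed does not have.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical storage abstraction; not part of the original post. */
interface NewsStore<T> {
    void save(T item);
}

/** In-memory stand-in so the sketch runs without a database. */
final class InMemoryNewsStore<T> implements NewsStore<T> {

    private final List<T> saved = new ArrayList<T>();

    public void save(T item) {
        saved.add(item);
    }

    List<T> all() {
        return saved;
    }
}

/** Walks the parsed list and hands each item to the store. */
final class NewsArchiver {

    private NewsArchiver() {
    }

    /** Saves every item and returns how many were stored. */
    static <T> int storeAll(List<T> items, NewsStore<T> store) {
        int count = 0;
        for (T item : items) {
            store.save(item);
            count++;
        }
        return count;
    }
}
```

In the real project `T` would be `NewsEntity`, and the list would be the one returned by the parser.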
I think Hibernate is a nightmare for me, sir...
since I got it in my class.. :D
but the taste NHibernate is good for me... hahaha...
mm... yami..yami..
Then I guess it's because MS provides tools that make coding with NHibernate easier, right?
It doesn't matter whether you use Hibernate or NHibernate; once you grasp the concept, the tool is a matter of syntax "only" :)