Web-Harvest: An Open-Source Java Tool for Web Data Extraction

jopen, 12 years ago

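Web-Harvest is an open-source web data extraction tool written in Java. It fetches the web pages you specify and extracts useful data from them. To do this, it relies mainly on techniques such as XSLT, XQuery, and regular expressions for manipulating text and XML content.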
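To make the idea of combining XML queries with regular expressions concrete, here is a minimal sketch that uses only the JDK. It is not Web-Harvest's own API or configuration format; the class name `ExtractSketch`, the `<item>`/`<price>` element names, and the price pattern are all hypothetical and chosen purely for illustration. It selects nodes with an XPath expression and then narrows each node's text with a regular expression, which is the same kind of two-step processing the paragraph above describes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: plain JDK classes, not the Web-Harvest library.
final class ExtractSketch {
    // Hypothetical pattern: a decimal number with two fraction digits, e.g. 19.99.
    private static final Pattern PRICE = Pattern.compile("(\\d+\\.\\d{2})");

    /** Selects every item/price element with XPath, then extracts the number with a regex. */
    static List<String> extractPrices(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                "//item/price",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);

        List<String> prices = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Matcher m = PRICE.matcher(nodes.item(i).getTextContent());
            if (m.find()) {
                prices.add(m.group(1)); // keep only the numeric part
            }
        }
        return prices;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items>"
                + "<item><price>USD 19.99</price></item>"
                + "<item><price>USD 5.00</price></item>"
                + "</items>";
        System.out.println(extractPrices(xml));
    }
}
```

In Web-Harvest itself, steps like these are typically declared in an XML configuration rather than written by hand in Java, so treat the sketch as an explanation of the underlying technique rather than a usage guide.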

1. Welcome screen with quick links

2. Web-Harvest XML editing with auto-completion support (Ctrl + Space)

3. Defining initial variables that are pushed to the Web-Harvest context before execution starts

4. Settings dialog

5. Viewing the execution result as XML and testing XPath expressions against it

6. Viewing downloaded images while execution is in progress

7. Checking attributes of HTTP execution

8. Debugging

Project homepage: http://www.open-open.com/lib/view/home/1350031305025