Jsoup 1.10.1 released: an HTML parser for Java
Jsoup 1.10.1 has been released. Jsoup is an HTML parser for Java that can parse a URL or a string of HTML directly. It offers a convenient API for extracting and manipulating data through DOM traversal, CSS selectors, and jQuery-like methods. The changes in this release are listed under the headings below.
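As a rough illustration of the parsing and selection API described above, the following minimal sketch parses an HTML string and collects link text with a CSS selector. The class name `ParseSketch`, the helper `linkTexts`, and the sample markup are made up for this example; it assumes the jsoup jar is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ParseSketch {

    // Hypothetical helper: returns the text of every <a> element that has an href attribute.
    static List<String> linkTexts(String html) {
        Document doc = Jsoup.parse(html);           // parse an HTML string into a DOM
        List<String> texts = new ArrayList<>();
        for (Element link : doc.select("a[href]")) { // CSS selector, similar to jQuery
            texts.add(link.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Sample markup invented for this sketch.
        String html = "<p>See <a href=\"https://jsoup.org/\">jsoup</a> and "
                + "<a href=\"/download\">downloads</a>.</p>";
        System.out.println(linkTexts(html));
    }
}
```

To parse a live page instead of a string, `Jsoup.connect(url).get()` returns the same kind of `Document`.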
Improvements
[*] Improved support for extended HTML entities, including supplemental characters and multiple character references. Also reduced memory consumption of the entity tables.
[*] Added support for *|E wildcard namespace selectors.
[*] Added support for setting multiple connection headers in Jsoup.connect at once with Connection.headers(Map)
[*] Added support for setting/overriding the response character set in Connection.Response, for cases where the charset is not defined by the server, or is defined incorrectly.
[*] Improved the performance of [...]
[*] Improved performance of HTML output by reducing the creation of temporary attribute list iterators.
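Three of the additions listed above can be sketched as follows: `Connection.headers(Map)`, the `*|E` wildcard namespace selector, and overriding the response charset via `Connection.Response`. This is only an illustrative sketch, not taken from the release notes. The class name, method names, header values, and sample markup are assumptions, and `fetchAs` is not called in `main` because it needs network access.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ReleaseFeaturesSketch {

    // Connection.headers(Map): set several request headers in one call.
    // Nothing is sent over the network until execute(), get() or post() is called.
    static Connection withHeaders(String url) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept-Language", "en");      // example values only
        headers.put("Cache-Control", "no-cache");
        return Jsoup.connect(url).headers(headers);
    }

    // *|E selector: count elements named "def" under any namespace prefix.
    static int countAnyNamespace(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("*|def").size();
    }

    // Connection.Response.charset(String): override the charset before parsing,
    // for servers that omit or misreport it. Requires network access.
    static Document fetchAs(String url, String charset) throws IOException {
        Connection.Response response = Jsoup.connect(url).execute();
        response.charset(charset);
        return response.parse();
    }

    public static void main(String[] args) {
        Connection con = withHeaders("https://example.com/");
        System.out.println(con.request().header("Accept-Language"));
        System.out.println(countAnyNamespace(
                "<div><abc:def>Hello</abc:def><xyz:def>There</xyz:def></div>"));
    }
}
```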
Bug fixes
[*] Fixed an issue when converting to the W3CDom XML, where valid (but ugly) HTML attribute names containing characters like " could not be converted into valid XML attribute names. These attribute names are now normalized if possible, or not added to the XML DOM.
[*] Fixed an OOB exception when loading an empty-body URL and parsing with the XML parser.
[*] Fixed an issue where attribute names starting with a slash would be parsed incorrectly.
[*] Charset encoders from OutputSettings are no longer reused, which makes output thread-safe.
[*] Fixed an issue in connections with a requestBody where a custom content-type header could be ignored.
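The W3CDom fix in the list above concerns attribute names that are legal in HTML but not in XML. The sketch below, with a made-up class name and sample markup, converts such a document with `W3CDom.fromJsoup` and reads back an ordinary attribute. It assumes a jsoup version that includes the fix, so the conversion completes and the unusual attribute names are normalized or skipped rather than causing an error.

```java
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;
import org.w3c.dom.Element;

public class W3cDomSketch {

    // Hypothetical helper: converts HTML to a W3C DOM and returns the body's style attribute.
    static String bodyStyle(String html) {
        org.jsoup.nodes.Document jsoupDoc = Jsoup.parse(html);
        // Convert the jsoup document to an org.w3c.dom.Document.
        org.w3c.dom.Document w3cDoc = new W3CDom().fromJsoup(jsoupDoc);
        Element body = (Element) w3cDoc.getElementsByTagName("body").item(0);
        return body.getAttribute("style");
    }

    public static void main(String[] args) {
        // The stray quote characters yield attribute names that are valid HTML but invalid XML.
        String html = "<html><head></head><body style=\"color: red\" \" name\"></body></html>";
        System.out.println(bodyStyle(html));
    }
}
```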
See the full changelog and release notes for the complete list of changes.
Downloads:
[*] https://jsoup.org/download
[*] Source code (zip)
[*] Source code (tar.gz)