lizongbo at 618119.com: work, life, Android, front end, Linode, Ubuntu, nginx, java, apache, tomcat, Resin, mina, Hessian, XMPP, RPC

August 21, 2011

Checking WAP 1.0 and WAP 2.0 page syntax with DTD validation when developing mobile WAP sites

Filed under: Xhtml | lizongbo @ 13:40

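Validating the WAP 1.0 and WAP 2.0 pages of a mobile WAP site against their DTDs is well worth the effort: it keeps pages compatible with mobile handsets, and it pushes developers to write higher-quality markup.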

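Back when I first did ordinary web page development, I dove in with only a half understanding of HTML syntax. I never paid attention to unclosed tags or incorrect nesting; as long as the page displayed correctly in IE6, that was good enough.
The consequence was that whenever a new block of content had to be inserted somewhere in a page, the whole layout fell apart and looked terrible, and I had no idea how to fix it. It was painful.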

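Later I moved on to WAP page development. WAP pages use the XML markup of WAP 1.0, and the parsers in the many knockoff (shanzhai) phones, unlike desktop browsers, do not tolerate incorrect markup, so all kinds of compatibility problems appeared.
So I learned to check pages by opening them in M3Gate: only a page that M3Gate opens without errors can be trusted to display correctly on the various knockoff phones.
Later still, after some study, I found that M3Gate's check has the same effect as DTD validation, and so I worked out how to use DTD validation to find tag-nesting errors and similar problems.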

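Mobile devices evolve quickly, and the WAP 2.0 era soon arrived. Handset browsers of this period are far more tolerant of WAP 2.0 markup.
They can absorb small mistakes, but inexplicable errors still crop up. Since the XML elements of WAP 2.0 are also defined by a DTD, DTD validation remains the tool for the job.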

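The automated monitoring and validation flow for WAP 1.0 and WAP 2.0 is as follows:
The test tool visits each page, fetches its XML content, and stores it as a string. It then replaces the DTD declaration in the page with the corresponding local mapping (so that no DTD file is fetched over the network, which would hurt performance),
producing a new string that is parsed with Java's XML parser. Any exception or validation error reported during parsing pinpoints a place where the page content is wrong, and the developers are then notified to fix the bug.
Problems such as an img element missing its alt attribute are typically caught promptly this way.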
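
The string-replacement step of this flow could look roughly like the sketch below. The monitoring tool's real code is not shown in this post, so the class name `DoctypeRewriter`, the method `useLocalDtd`, and the local `file:` URI in the usage note are all hypothetical; the sketch only handles a `PUBLIC` DOCTYPE with double-quoted identifiers.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: points a page's DOCTYPE at a local copy of the DTD. */
final class DoctypeRewriter {

    // Matches <!DOCTYPE root PUBLIC "publicId" "systemId"> and captures the three parts.
    private static final Pattern DOCTYPE = Pattern.compile(
            "<!DOCTYPE\\s+(\\S+)\\s+PUBLIC\\s+\"([^\"]*)\"\\s+\"([^\"]*)\"\\s*>");

    /**
     * Replaces the system identifier of the first PUBLIC DOCTYPE declaration with
     * localDtdUri. Returns the input unchanged when no such declaration is found.
     */
    static String useLocalDtd(String xml, String localDtdUri) {
        Matcher m = DOCTYPE.matcher(xml);
        if (!m.find()) {
            return xml;
        }
        String replacement = "<!DOCTYPE " + m.group(1) + " PUBLIC \"" + m.group(2)
                + "\" \"" + localDtdUri + "\">";
        return xml.substring(0, m.start()) + replacement + xml.substring(m.end());
    }
}
```

For example, `DoctypeRewriter.useLocalDtd(page, "file:///opt/dtds/wml_1.1.xml")` would leave the public identifier alone and make the parser read the DTD from that (assumed) local path.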

The production environment is checked by an automated monitoring script that visits the service periodically.

The code for DTD validation follows:

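1. LocalEntityResolver, which maps remote DTDs and other entities to local copies (the automated monitoring swaps the DTD declaration for a local path by string replacement; here I map directly through an EntityResolver):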

[code]
package com.lizongbo.xml;

import java.io.IOException;
import java.io.InputStream;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/**
 * Maps the remote DTD and schema files that an XML document depends on
 * to local copies on the classpath.
 *
 * @author lizongbo
 */
public class LocalEntityResolver implements EntityResolver {

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        System.out.println("publicId==" + publicId + ",systemId==" + systemId);
        if ("-//W3C//DTD XHTML 1.0 Strict//EN".equals(publicId)) {
            String url = "/com/lizongbo//validation/dtds/www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd";
            System.out.println(systemId + " LocalEntityResolver yyyy map to:" + url);
            InputSource is = open(url, publicId, systemId);
            if (is != null) {
                return is;
            }
        }
        if (systemId == null) {
            return null;
        }
        // XHTML modularization modules and entity sets, matched by file name.
        String[] ids = { "/xhtml-attribs-1.mod", "/xhtml-notations-1.mod",
                "/xhtml-datatypes-1.mod", "/xhtml-qname-1.mod",
                "/xhtml-mobile10-model-1.mod", "xhtml-charent-1.mod",
                "/xhtml-symbol.ent", "/xhtml-lat1.ent", "/xhtml-special.ent",
                "/xhtml-inlstruct-1.mod", "/xhtml-inlphras-1.mod",
                "/xhtml-blkstruct-1.mod", "xhtml-blkphras-1.mod" };
        String fileName = systemId.substring(systemId.lastIndexOf("/") + 1);
        for (String id : ids) {
            if (systemId.endsWith(id)) {
                // Try the three local directories in turn.
                String url = "/com/lizongbo/web/validation/dtds/www.w3.org/xhtml-modularization/DTD/" + fileName;
                System.out.println(" local " + systemId + " try map to:" + url);
                if (!exists(url)) {
                    url = "/com/lizongbo/web/validation/dtds/www.w3.org/TR/xhtml-modularization/DTD/" + fileName;
                    System.out.println(" local " + systemId + " retry map to:" + url);
                }
                if (!exists(url)) {
                    url = "/com/lizongbo/web/validation/dtds/www.wapforum.org/DTD/" + fileName;
                    System.out.println(" local " + systemId + " retry map to:" + url);
                }
                InputSource is = open(url, publicId, systemId);
                if (is != null) {
                    return is;
                }
            }
        }
        // Everything else: mirror the remote URL (minus "http://") under a local root.
        String path = systemId.replace("http://", "");
        String url;
        if (systemId.endsWith(".xsd")) {
            url = "/com/lizongbo/web/validation/xsds/" + path;
            System.out.println(systemId + " LocalEntityResolver 22 map to:" + url);
        } else {
            url = "/com/lizongbo/web/validation/dtds/" + path;
            System.out.println(systemId + " LocalEntityResolver 11 map to:" + url);
            if (!exists(url)) {
                url = "/com/lizongbo/web/validation/dtds/www.w3.org/TR/xhtml-modularization/DTD/" + path;
                System.out.println(" local " + systemId + " retry map to:" + url);
            }
        }
        return open(url, publicId, systemId);
    }

    // Look resources up through this class rather than String.class, so that the
    // application class loader is used instead of the bootstrap loader.
    private static boolean exists(String url) {
        return LocalEntityResolver.class.getResource(url) != null;
    }

    /** Returns an InputSource for the local resource, or null if it is not bundled. */
    private static InputSource open(String url, String publicId, String systemId) {
        InputStream in = LocalEntityResolver.class.getResourceAsStream(url);
        if (in == null) {
            return null; // the parser then falls back to opening systemId itself
        }
        InputSource is = new InputSource(in);
        is.setPublicId(publicId);
        is.setSystemId(systemId);
        return is;
    }
}

[/code]
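
To see the EntityResolver mechanism in isolation, without any DTD files on the classpath, a reduced sketch can serve the DTD from a string. Everything here is made up for illustration: the class `InMemoryResolverDemo`, the two-element stand-in DTD, and the deliberately unreachable system identifier under `.invalid`. If the resolver were not consulted, the parser would try to fetch that URL and fail.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class InMemoryResolverDemo {

    // Tiny stand-in DTD (an assumption for this demo, not a real WAP DTD).
    static final String DTD = "<!ELEMENT page (title)>\n<!ELEMENT title (#PCDATA)>";

    // Deliberately unreachable: names under ".invalid" never resolve.
    static final String SYSTEM_ID = "http://dtd.example.invalid/page.dtd";

    /** Returns the validity errors found in xml; the DTD is served from memory. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        // Hand back the local DTD text whenever the parser asks for SYSTEM_ID.
        db.setEntityResolver((publicId, systemId) ->
                SYSTEM_ID.equals(systemId) ? new InputSource(new StringReader(DTD)) : null);
        List<String> errors = new ArrayList<>();
        db.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());
            }
        });
        db.parse(new InputSource(new StringReader(xml)));
        return errors;
    }
}
```

A document that starts with `<!DOCTYPE page SYSTEM "http://dtd.example.invalid/page.dtd">` and contains `<page><title>hi</title></page>` should come back with an empty error list, with no network access involved.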

2. The actual validation code:
[code]
package com.lizongbo.xml;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.URL;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/**
 * DTD validation utility.
 *
 * @author lizongbo
 */
public class DTDUtil {

    public static void main(String[] args) {
        String[] urls = {
                "http://3g.sina.com.cn/?vt=1&pos=200", // Sina, WAP 1.0
                "http://3g.sina.com.cn/?vt=3&pos=200", // Sina, WAP 2.0
                "http://wap.sohu.com/?nid=1&v=1", // Sohu
                "http://wap.sohu.com/?nid=1&v=3", // Sohu
                "http://info50.3g.qq.com/g/s?aid=index&g_ut=1", // Tencent Mobile, WAP 1.0
                "http://info50.3g.qq.com/g/s?aid=index&g_ut=2", // Tencent Mobile, WAP 2.0
        };
        for (String urlStr : urls) {
            try {
                String xml = downloadUrl(urlStr, null, "UTF-8");
                // System.out.println(xml);
                validateWap(xml);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        System.out.println("Over");
    }

    /**
     * Validates WAP page content against its DTD. A page that is not well-formed
     * makes the parser throw; validity errors go to the parser's default error
     * handler, which prints them.
     */
    public static void validateWap(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        // xml = setDTDforWap20(xml);
        // System.out.println(xml);
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        db.setEntityResolver(new LocalEntityResolver());
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        System.out.println(doc.getFirstChild());
    }

    /** Downloads a page with a handset User-Agent and returns its body as a string. */
    public static String downloadUrl(String urlStr, String referer, String encoding)
            throws IOException {
        System.out.println(urlStr);
        URL url = new URL(urlStr);
        // To go through an HTTP proxy instead, pass e.g.
        // new Proxy(Proxy.Type.HTTP, new InetSocketAddress("proxy.lizongbo.com", 8080))
        Proxy proxy = Proxy.NO_PROXY;
        HttpURLConnection httpConn = (HttpURLConnection) url.openConnection(proxy);
        try {
            httpConn.setRequestProperty("User-Agent", "Nokia");
            httpConn.setRequestMethod("GET");
            if (referer != null) {
                httpConn.setRequestProperty("Referer", referer);
            }
            httpConn.setConnectTimeout(1000);
            InputStream body;
            if (httpConn.getResponseCode() != 200) {
                System.err.println("error:" + httpConn.getResponseMessage());
                body = httpConn.getErrorStream();
            } else {
                body = httpConn.getInputStream();
            }
            System.out.println(httpConn.getHeaderFields());
            StringBuilder sb = new StringBuilder();
            if (body != null) { // getErrorStream() returns null when there is no error body
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(body, encoding))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        sb.append(line).append('\n');
                    }
                }
            }
            return sb.toString();
        } catch (IOException e) {
            System.out.println(e.getMessage());
            throw e;
        } finally {
            // Close the connection whether or not the download succeeded.
            httpConn.disconnect();
        }
    }
}

[/code]
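
With no ErrorHandler set, a validating DocumentBuilder only prints validity errors. A monitoring job that needs to act on them could collect them instead. The sketch below is one way to do that and is not part of the original tool: `CollectingErrorHandler` is a hypothetical name, and the usage note relies on a small internal DTD subset made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

/** Hypothetical ErrorHandler that records validity errors instead of printing them. */
final class CollectingErrorHandler implements ErrorHandler {

    final List<String> errors = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        // Warnings are ignored in this sketch.
    }

    @Override
    public void error(SAXParseException e) {
        // Validity errors (undeclared elements, missing attributes, bad nesting) arrive here.
        errors.add("Line " + e.getLineNumber() + ": " + e.getMessage());
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        // Well-formedness errors are unrecoverable, so rethrow them.
        throw e;
    }

    /** Parses xml with DTD validation on and returns every validity error found. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        CollectingErrorHandler handler = new CollectingErrorHandler();
        db.setErrorHandler(handler);
        db.parse(new InputSource(new StringReader(xml)));
        return handler.errors;
    }
}
```

Given a document whose DTD declares `alt` as `#REQUIRED` on `img`, a page containing `<img src="a.png"/>` should yield one collected error that names the missing `alt` attribute, which is the kind of problem mentioned earlier.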

The example above shows that the WAP 1.0 pages of Sina, Sohu, and Tencent Mobile all validate OK.
Sina's WAP 2.0 (3G edition) page, however, declares xhtml1-strict.dtd rather than http://www.wapforum.org/DTD/xhtml-mobile10.dtd.

Validating it against xhtml1-strict.dtd gives these errors:
Error: URI=null Line=127: Attribute "width" must be declared for element type "input".
Error: URI=null Line=127: Attribute "height" must be declared for element type "input".
Error: URI=null Line=127: The content of element type "form" must match "(p|h1|h2|h3|h4|h5|h6|div|ul|ol|dl|pre|hr|blockquote|address|fieldset|table|noscript|ins|del|script)*".
Error: URI=null Line=155: The content of element type "form" must match "(p|h1|h2|h3|h4|h5|h6|div|ul|ol|dl|pre|hr|blockquote|address|fieldset|table|noscript|ins|del|script)*".
Error: URI=null Line=236: Attribute value "gototop" of type ID must be unique within the document.
Sohu's WAP 2.0 (color edition) page does declare http://www.wapforum.org/DTD/xhtml-mobile10.dtd, but its content uses elements defined by XHTML 1.
Validating it against xhtml-mobile10.dtd gives these errors:
Error: URI=null Line=63: Element type "script" must be declared.
Error: URI=null Line=99: The content of element type "div" must match "(h1|h2|h3|h4|h5|h6|ul|ol|dl|p|div|pre|blockquote|address|hr|table|form|fieldset|br|span|em|strong|dfn|code|samp|kbd|var|cite|abbr|acronym|q|i|b|big|small|a|img|object|input|select|textarea|label)".
Error: URI=null Line=100: Attribute "name" must be declared for element type "form".
Error: URI=null Line=100: Attribute "accept-charset" must be declared for element type "form".
Error: URI=null Line=107: Attribute "onclick" must be declared for element type "a".
Error: URI=null Line=108: Attribute "onclick" must be declared for element type "a".
Error: URI=null Line=109: Attribute "onclick" must be declared for element type "a".
Error: URI=null Line=110: The content of element type "form" must match "(h1|h2|h3|h4|h5|h6|ul|ol|dl|p|div|pre|blockquote|address|hr|table|fieldset)+".
Error: URI=null Line=112: Element type "font" must be declared.
Error: URI=null Line=112: The content of element type "div" must match "(h1|h2|h3|h4|h5|h6|ul|ol|dl|p|div|pre|blockquote|address|hr|table|form|fieldset|br|span|em|strong|dfn|code|samp|kbd|var|cite|abbr|acronym|q|i|b|big|small|a|img|object|input|select|textarea|label)".

Tencent Mobile's WAP 2.0 page, validated against xhtml-mobile10.dtd, produces the fewest errors, just one:
Error: URI=null Line=141: The content of element type "form" must match "(h1|h2|h3|h4|h5|h6|ul|ol|dl|p|div|pre|blockquote|address|hr|table|fieldset)+".
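
The recurring "form" error means that form may only contain block-level children, so an input placed directly inside form is rejected. The sketch below reproduces that rule with a heavily reduced content model; the class `FormContentModelDemo` and its five-line DTD are assumptions made for the demo, not the real xhtml-mobile10.dtd. Unlike the default handler, this one aborts on the first validity error.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class FormContentModelDemo {

    // Reduced content model (an assumption for this demo): form accepts only block children.
    static final String DOCTYPE = "<!DOCTYPE form ["
            + "<!ELEMENT form (p|div)+>"
            + "<!ELEMENT p (#PCDATA|input)*>"
            + "<!ELEMENT div (#PCDATA|input)*>"
            + "<!ELEMENT input EMPTY>"
            + "<!ATTLIST input name CDATA #IMPLIED>"
            + "]>";

    /** Returns false as soon as body hits a validity or well-formedness error. */
    static boolean isValid(String body) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setValidating(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        db.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // fail fast instead of continuing past the first validity error
            }
        });
        try {
            db.parse(new InputSource(new StringReader(DOCTYPE + body)));
            return true;
        } catch (SAXParseException e) {
            return false;
        }
    }
}
```

Under these assumptions, `isValid("<form><input name=\"q\"/></form>")` is rejected, while wrapping the input in a `p` element makes the same form pass, which is the usual fix for this class of error.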

 

 
