描述问题
详细过程见代码
上下文环境
---
Metadata-Version: 1.1
Name: lxml
Version: 3.5.0
Summary: Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
Home-page: http://lxml.de/
Author: lxml dev team
Author-email: lxml-dev@lxml.de
License: UNKNOWN
Location: /usr/lib/python2.7/dist-packages
Requires:
Classifiers:
Development Status :: 5 - Production/Stable
Intended Audience :: Developers
Intended Audience :: Information Technology
License :: OSI Approved :: BSD License
Programming Language :: Cython
Programming Language :: Python :: 2
Programming Language :: Python :: 2.6
Programming Language :: Python :: 2.7
Programming Language :: Python :: 3
Programming Language :: Python :: 3.2
Programming Language :: Python :: 3.3
Programming Language :: Python :: 3.4
Programming Language :: Python :: 3.5
Programming Language :: C
Operating System :: OS Independent
Topic :: Text Processing :: Markup :: HTML
Topic :: Text Processing :: Markup :: XML
Topic :: Software Development :: Libraries :: Python Modules
重现
相关代码
from lxml.etree import XML
doc = """
<?xml version="1.0" encoding="utf-8"?>
<wikimedia>
<projects>
<project name="Wikipedia" launch="2001-01-05">
<editions>
<edition language="English">en.wikipedia.org</edition>
<edition language="German">de.wikipedia.org</edition>
<edition language="French">fr.wikipedia.org</edition>
<edition language="Polish">pl.wikipedia.org</edition>
<edition language="Spanish">es.wikipedia.org</edition>
</editions>
</project>
<project name="Wiktionary" launch="2002-12-12">
<editions>
<edition language="English">en.wiktionary.org</edition>
<edition language="French">fr.wiktionary.org</edition>
<edition language="Vietnamese">vi.wiktionary.org</edition>
<edition language="Turkish">tr.wiktionary.org</edition>
<edition language="Spanish">es.wiktionary.org</edition>
</editions>
</project>
</projects>
</wikimedia>"""
page = XML(doc)
报错信息
XMLSyntaxError: XML declaration allowed only at the start of the document, line 2, column 6
相关截图
已经尝试哪些方法仍然没有解决(附上相关链接)
google-搜索的信息, 但是看不懂
问题简化
xml字符串如果第一行是文档声明,一定要放在第一行,这个是标准,后面不要有"\n"
https://msdn.microsoft.com/zh-cn/library/ms256048.aspx
或者你用代码strip下
doc = """<?xml version="1.0" encoding="utf-8"?>
<wikimedia>
<projects>
<project name="Wikipedia" launch="2001-01-05">
<editions>
<edition language="English">en.wikipedia.org</edition>
<edition language="German">de.wikipedia.org</edition>
<edition language="French">fr.wikipedia.org</edition>
<edition language="Polish">pl.wikipedia.org</edition>
<edition language="Spanish">es.wikipedia.org</edition>
</editions>
</project>
<project name="Wiktionary" launch="2002-12-12">
<editions>
<edition language="English">en.wiktionary.org</edition>
<edition language="French">fr.wiktionary.org</edition>
<edition language="Vietnamese">vi.wiktionary.org</edition>
<edition language="Turkish">tr.wiktionary.org</edition>
<edition language="Spanish">es.wiktionary.org</edition>
</editions>
</project>
</projects>
</wikimedia>"""