2016-08-28 12 views

Python pandas problems: read_html and installing python3-lxml

I am trying to run the following code, without success. As far as I know, it contains no syntax errors.

import quandl
import pandas as pd

# read_html returns a list of DataFrames, one per table found on the page
fifty_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
print(fifty_states)

I get the following error when I run this code:

Traceback (most recent call last):

File "C:/Users/Dave/Documents/Python Files/helloworld.py", line 15, in fiddy_states = pd.read_html(' http://simple.wikipedia.org/wiki/List_of_U.S._states ')

File "C:\Python35\lib\site-packages\pandas\io\html.py", line 874, in read_html parse_dates, tupleize_cols, thousands, attrs, encoding)

File "C:\Python35\lib\site-packages\pandas\io\html.py", line 726, in _parse parser = _parser_dispatch(flav)

File "C:\Python35\lib\site-packages\pandas\io\html.py", line 685, in _parser_dispatch raise ImportError("lxml not found, please install it")

ImportError: lxml not found, please install it

I'm not sure why this happens, since I (should) have all the packages required to run this code. I'm having trouble installing lxml and python3-lxml: neither package can be installed. As a fallback, I have installed the following:

python-dev libxml2-dev libxslt1-dev zlib1g-dev

in addition to 'html5lib', which I've read is a suitable substitute for lxml.

I'm not sure what else to do at this point, because the fixes suggested for similar problems (i.e. installing lxml) don't work for me (I can't install lxml in any form via pip on the command line).

Any help is appreciated.

Edit: It looks like lxml was never installed on my computer. Strangely, I can't install it via pip install lxml either. Here are the error logs I get when I try to install it:

Collecting lxml 
    Using cached lxml-3.6.4.tar.gz 
Building wheels for collected packages: lxml 
    Running setup.py bdist_wheel for lxml ... error 
    Complete output from command c:\python35\python.exe -u -c "import setuptools, 
tokenize;__file__='C:\\Users\\Dwang\\AppData\\Local\\Temp\\pip-build-738bf61u\\l 
xml\\setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().rep 
lace('\r\n', '\n'), __file__, 'exec'))" bdist_wheel -d C:\Users\Dwang\AppData\Lo 
cal\Temp\tmpm9z4yol6pip-wheel- --python-tag cp35: 
    Building lxml version 3.6.4. 
    Building without Cython. 
    ERROR: b"'xslt-config' is not recognized as an internal or external command,\r 
\noperable program or batch file.\r\n" 
    ** make sure the development packages of libxml2 and libxslt are installed ** 

    Using build configuration of libxslt 
    running bdist_wheel 
    running build 
    running build_py 
    creating build 
    creating build\lib.win-amd64-3.5 
    creating build\lib.win-amd64-3.5\lxml 
    copying src\lxml\builder.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\cssselect.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\doctestcompare.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\ElementInclude.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\pyclasslookup.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\sax.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\usedoctest.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\_elementpath.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\__init__.py -> build\lib.win-amd64-3.5\lxml 
    creating build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\__init__.py -> build\lib.win-amd64-3.5\lxml\includes 

    creating build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\builder.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\clean.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\defs.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\diff.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\ElementSoup.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\formfill.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\html5parser.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\soupparser.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\usedoctest.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\_diffcommand.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\_html5builder.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\_setmixin.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\__init__.py -> build\lib.win-amd64-3.5\lxml\html 
    creating build\lib.win-amd64-3.5\lxml\isoschematron 
    copying src\lxml\isoschematron\__init__.py -> build\lib.win-amd64-3.5\lxml\iso 
schematron 
    copying src\lxml\lxml.etree.h -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\lxml.etree_api.h -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\includes\c14n.pxd -> build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\config.pxd -> build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\dtdvalid.pxd -> build\lib.win-amd64-3.5\lxml\include 
s 
    copying src\lxml\includes\etreepublic.pxd -> build\lib.win-amd64-3.5\lxml\incl 
udes 
    copying src\lxml\includes\htmlparser.pxd -> build\lib.win-amd64-3.5\lxml\inclu 
des 
    copying src\lxml\includes\relaxng.pxd -> build\lib.win-amd64-3.5\lxml\includes 

    copying src\lxml\includes\schematron.pxd -> build\lib.win-amd64-3.5\lxml\inclu 
des 
    copying src\lxml\includes\tree.pxd -> build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\uri.pxd -> build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\xinclude.pxd -> build\lib.win-amd64-3.5\lxml\include 
s 
    copying src\lxml\includes\xmlerror.pxd -> build\lib.win-amd64-3.5\lxml\include 
s 
    copying src\lxml\includes\xmlparser.pxd -> build\lib.win-amd64-3.5\lxml\includ 
es 
    copying src\lxml\includes\xmlschema.pxd -> build\lib.win-amd64-3.5\lxml\includ 
es 
    copying src\lxml\includes\xpath.pxd -> build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\xslt.pxd -> build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\etree_defs.h -> build\lib.win-amd64-3.5\lxml\include 
s 
    copying src\lxml\includes\lxml-version.h -> build\lib.win-amd64-3.5\lxml\inclu 
des 
    creating build\lib.win-amd64-3.5\lxml\isoschematron\resources 
    creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\rng 
    copying src\lxml\isoschematron\resources\rng\iso-schematron.rng -> build\lib.w 
in-amd64-3.5\lxml\isoschematron\resources\rng 
    creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl 
    copying src\lxml\isoschematron\resources\xsl\RNG2Schtrn.xsl -> build\lib.win-a 
md64-3.5\lxml\isoschematron\resources\xsl 
    copying src\lxml\isoschematron\resources\xsl\XSD2Schtrn.xsl -> build\lib.win-a 
md64-3.5\lxml\isoschematron\resources\xsl 
    creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schematr 
on-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_abstract 
_expand.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-sche 
matron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_dsdl_inc 
lude.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schemat 
ron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematr 
on_message.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-s 
chematron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematr 
on_skeleton_for_xslt1.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resource 
s\xsl\iso-schematron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_svrl_for 
_xslt1.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schem 
atron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\readme.txt - 
> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1 
    running build_ext 
    building 'lxml.etree' extension 
    error: Unable to find vcvarsall.bat 

    ---------------------------------------- 
    Failed building wheel for lxml 
    Running setup.py clean for lxml 
Failed to build lxml 
Installing collected packages: lxml 
    Running setup.py install for lxml ... error 
    Complete output from command c:\python35\python.exe -u -c "import setuptools 
, tokenize;__file__='C:\\Users\\Dwang\\AppData\\Local\\Temp\\pip-build-738bf61u\ 
\lxml\\setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().r 
eplace('\r\n', '\n'), __file__, 'exec'))" install --record C:\Users\Dwang\AppDat 
a\Local\Temp\pip-4_tf2u3a-record\install-record.txt --single-version-externally- 
managed --compile: 
    Building lxml version 3.6.4. 
    Building without Cython. 
    ERROR: b"'xslt-config' is not recognized as an internal or external command, 
\r\noperable program or batch file.\r\n" 
    ** make sure the development packages of libxml2 and libxslt are installed * 
* 

    Using build configuration of libxslt 
    running install 
    running build 
    running build_py 
    creating build 
    creating build\lib.win-amd64-3.5 
    creating build\lib.win-amd64-3.5\lxml 
    copying src\lxml\builder.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\cssselect.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\doctestcompare.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\ElementInclude.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\pyclasslookup.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\sax.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\usedoctest.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\_elementpath.py -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\__init__.py -> build\lib.win-amd64-3.5\lxml 
    creating build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\__init__.py -> build\lib.win-amd64-3.5\lxml\includ 
es 
    creating build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\builder.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\clean.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\defs.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\diff.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\ElementSoup.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\formfill.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\html5parser.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\soupparser.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\usedoctest.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\_diffcommand.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\_html5builder.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\_setmixin.py -> build\lib.win-amd64-3.5\lxml\html 
    copying src\lxml\html\__init__.py -> build\lib.win-amd64-3.5\lxml\html 
    creating build\lib.win-amd64-3.5\lxml\isoschematron 
    copying src\lxml\isoschematron\__init__.py -> build\lib.win-amd64-3.5\lxml\i 
soschematron 
    copying src\lxml\lxml.etree.h -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\lxml.etree_api.h -> build\lib.win-amd64-3.5\lxml 
    copying src\lxml\includes\c14n.pxd -> build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\config.pxd -> build\lib.win-amd64-3.5\lxml\include 
s 
    copying src\lxml\includes\dtdvalid.pxd -> build\lib.win-amd64-3.5\lxml\inclu 
des 
    copying src\lxml\includes\etreepublic.pxd -> build\lib.win-amd64-3.5\lxml\in 
cludes 
    copying src\lxml\includes\htmlparser.pxd -> build\lib.win-amd64-3.5\lxml\inc 
ludes 
    copying src\lxml\includes\relaxng.pxd -> build\lib.win-amd64-3.5\lxml\includ 
es 
    copying src\lxml\includes\schematron.pxd -> build\lib.win-amd64-3.5\lxml\inc 
ludes 
    copying src\lxml\includes\tree.pxd -> build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\uri.pxd -> build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\xinclude.pxd -> build\lib.win-amd64-3.5\lxml\inclu 
des 
    copying src\lxml\includes\xmlerror.pxd -> build\lib.win-amd64-3.5\lxml\inclu 
des 
    copying src\lxml\includes\xmlparser.pxd -> build\lib.win-amd64-3.5\lxml\incl 
udes 
    copying src\lxml\includes\xmlschema.pxd -> build\lib.win-amd64-3.5\lxml\incl 
udes 
    copying src\lxml\includes\xpath.pxd -> build\lib.win-amd64-3.5\lxml\includes 

    copying src\lxml\includes\xslt.pxd -> build\lib.win-amd64-3.5\lxml\includes 
    copying src\lxml\includes\etree_defs.h -> build\lib.win-amd64-3.5\lxml\inclu 
des 
    copying src\lxml\includes\lxml-version.h -> build\lib.win-amd64-3.5\lxml\inc 
ludes 
    creating build\lib.win-amd64-3.5\lxml\isoschematron\resources 
    creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\rng 
    copying src\lxml\isoschematron\resources\rng\iso-schematron.rng -> build\lib 
.win-amd64-3.5\lxml\isoschematron\resources\rng 
    creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl 
    copying src\lxml\isoschematron\resources\xsl\RNG2Schtrn.xsl -> build\lib.win 
-amd64-3.5\lxml\isoschematron\resources\xsl 
    copying src\lxml\isoschematron\resources\xsl\XSD2Schtrn.xsl -> build\lib.win 
-amd64-3.5\lxml\isoschematron\resources\xsl 
    creating build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schema 
tron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_abstra 
ct_expand.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-sc 
hematron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_dsdl_i 
nclude.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schem 
atron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schema 
tron_message.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso 
-schematron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schema 
tron_skeleton_for_xslt1.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resour 
ces\xsl\iso-schematron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_svrl_f 
or_xslt1.xsl -> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-sch 
ematron-xslt1 
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\readme.txt 
-> build\lib.win-amd64-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt 
1 
    running build_ext 
    building 'lxml.etree' extension 
    error: Unable to find vcvarsall.bat 

    ---------------------------------------- 
Command "c:\python35\python.exe -u -c "import setuptools, tokenize;__file__='C:\ 
\Users\\Dwang\\AppData\\Local\\Temp\\pip-build-738bf61u\\lxml\\setup.py';exec(co 
mpile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __ 
file__, 'exec'))" install --record C:\Users\Dwang\AppData\Local\Temp\pip-4_tf2u3 
a-record\install-record.txt --single-version-externally-managed --compile" faile 
d with error code 1 in C:\Users\Dwang\AppData\Local\Temp\pip-build-738bf61u\lxml 
\ 

Answer


From what I understand, and according to the docs, if read_html() cannot use lxml it should fall back to html5lib, but it looks like that is not happening in your case and the error is raised instead.

Try specifying the flavor explicitly:

fifty_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states', flavor='html5lib')
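
Since the automatic fallback did not kick in here, one option is to pick the flavor yourself based on which parser packages are importable. The sketch below is only an illustration and not part of pandas: 'available_parsers' and 'choose_flavor' are hypothetical helper names, and it assumes that the 'html5lib' flavor needs both the bs4 (BeautifulSoup4) and html5lib packages.

```python
import importlib.util


def available_parsers(candidates=("lxml", "bs4", "html5lib")):
    """Return the candidate parser packages that are importable here."""
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


def choose_flavor(installed):
    """Pick a read_html flavor from a list of installed package names.

    Assumption: the 'html5lib' flavor requires both bs4 and html5lib.
    Returns None when no usable parser is installed.
    """
    if "lxml" in installed:
        return "lxml"
    if "bs4" in installed and "html5lib" in installed:
        return "html5lib"
    return None


# Hypothetical usage: decide once, then pass the result to pd.read_html.
flavor = choose_flavor(available_parsers())
print("flavor to pass to pd.read_html:", flavor)
```

The result can then be passed along as pd.read_html(url, flavor=flavor). If it is None, neither parser stack is available and one of them has to be installed first.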

Oh my... Thanks! That worked :) Would I have to do this every time I call 'read_html()' from now on? Also, do you have any idea why my 'lxml' isn't working? – wowdavers


@D.Wang I reproduced the problem in a virtual environment: I uninstalled lxml and ran your code unchanged, and got exactly the same error as you, 'lxml not found, please install it'. Then I installed 'lxml' again via 'pip install lxml' inside the activated virtual environment and ran the code again, and it worked. Make sure 'lxml' is actually installed in the same environment you run the script from. – alecxe
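
One way to check this is to ask the interpreter that runs the script where it lives and whether it can see lxml. This is a minimal sketch using only the standard library; 'describe_environment' is a hypothetical helper name, not something pandas or pip provides.

```python
import importlib.util
import sys


def describe_environment(package="lxml"):
    """Report which interpreter is running and whether it can import `package`."""
    spec = importlib.util.find_spec(package)
    return {
        "executable": sys.executable,   # interpreter running this script
        "prefix": sys.prefix,           # root of the active environment
        "package": package,
        "installed": spec is not None,
        "location": spec.origin if spec is not None else None,
    }


for key, value in describe_environment().items():
    print("{}: {}".format(key, value))
```

If 'installed' comes back False, running 'python -m pip install lxml' with that same interpreter (the path printed as 'executable') targets the environment the script actually uses.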


I was under the impression that 'lxml' came pre-installed along with another package. Apparently not... It looks like 'lxml' is not installed on my computer. However, when I try to install it, I get an error saying that it cannot be built. I've edited my original post to include the error logs. – wowdavers