How can I print any website content? (Using something like my code)

I want to open a website, read its content, store it in a variable, and print it.

from urllib.request import urlopen

url = "https://example.com"  # placeholder: substitute any website URL

content = urlopen(url).read().decode('utf-8')

print(content)

The expected result is that I get the text that is written on the page.

1 answer

  • answered 2019-07-10 22:11 Harry_pb

    Python has several libraries for this. Here is an example using requests and BeautifulSoup that fetches a page and prints its parsed HTML, to get you started:

    import requests
    from bs4 import BeautifulSoup as soup

    url = "https://en.wikipedia.org/wiki/List_of_multinational_corporations"
    page = requests.get(url)                     # download the page
    page_html = page.content                     # raw response body (bytes)
    page_soup = soup(page_html, "html.parser")   # parse the HTML
    print(page_soup)
    

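    With urlopen, as in your code, you can try the following (Python 3 syntax):

    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    url = "https://en.wikipedia.org/wiki/List_of_multinational_corporations"
    r = urlopen(url).read()                      # raw response body (bytes)
    soup = BeautifulSoup(r, "html.parser")       # name the parser explicitly
    print(type(soup))
    print(soup.prettify()[0:1000])               # first 1000 characters only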
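
Both snippets above print the page's HTML markup. If "what is written in the page" means only the readable text, the tags have to be stripped. With BeautifulSoup that is usually done with `get_text()`. The sketch below shows the same idea using only the standard library's `html.parser`, so it runs without extra packages. The names `TextExtractor`, `visible_text`, and the `sample` string are made up for illustration, and skipping only `<script>` and `<style>` is a simplifying assumption, not a complete definition of "visible" text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}  # assumption: only these are hidden

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped tag
        self.chunks = []       # non-empty text pieces, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Return the text content of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical sample document; in practice pass in the decoded page,
# e.g. the `content` string from the question's code.
sample = (
    "<html><head><title>Demo</title><style>p {color: red}</style></head>"
    "<body><h1>Hello</h1><p>World &amp; more</p>"
    "<script>var x = 1;</script></body></html>"
)
print(visible_text(sample))
```

To apply this to a real page, pass the decoded string from `urlopen(url).read().decode('utf-8')` to `visible_text`. Real-world pages often contain navigation menus and other clutter, so the result may still need filtering.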