Using requests in Python to download an xls file

On this page there is a link to download an xls file (under "Adjuntos", i.e. attachments): https://www.banrep.gov.co/es/emisiones-vigentes-el-dcv

The link to download the xls file is: https://www.banrep.gov.co/sites/default/files/paginas/emisiones/EMISIONES.xls

I was using this code to automatically download that file:

import requests
import os

path = os.path.abspath(os.getcwd())  # directory where the file will be saved

path = path.replace("\\", '/')+'/'

url = 'https://www.banrep.gov.co/sites/default/files/paginas/emisiones/EMISIONES.xls'

myfile = requests.get(url, verify=False)

open(path+'EMISIONES.xls', 'wb').write(myfile.content)

This code was working well, but suddenly the downloaded file started coming out corrupted.

If I run the code, it raises this warning:

InsecureRequestWarning: Unverified HTTPS request is being made to host 'www.banrep.gov.co'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
  warnings.warn(

1 answer

  • answered 2022-01-23 03:05 Cristian Quintero

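    The error is related to how your request is being built. The status_code returned by the request is 403 (Forbidden), so what gets written to disk is the server's error response rather than the spreadsheet. You can see the status by checking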

    myfile.status_code
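
A minimal sketch of that check, assuming the "corrupted" file is simply the body of the 403 response saved under an .xls name. The helper name save_if_ok is hypothetical (not part of requests); it only writes the file when the status is 200.

```python
import requests


def save_if_ok(response, destination):
    """Write the response body to `destination` only if the request succeeded.

    `response` is a requests.Response; returns True when the file was written.
    """
    if response.status_code != 200:
        # Report the failure instead of saving an error page as a spreadsheet.
        print(f"Download failed: HTTP {response.status_code} {response.reason}")
        return False
    with open(destination, "wb") as fh:
        fh.write(response.content)
    return True
```

It would be called as save_if_ok(requests.get(url, timeout=30), 'EMISIONES.xls'), with url being the link from the question. Calling myfile.raise_for_status() is an alternative that raises requests.HTTPError on 4xx and 5xx responses.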
    

    My guess is that the server is rejecting the request because of missing cookies or headers in your GET request. I suggest you look at the headers your browser sends when it requests the same URL and compare them with what requests sends.

    TIP: open your web browser's developer tools and use the Network tab to identify the headers.
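
As a rough illustration of the tip, headers copied from the Network tab can be attached to the request and inspected before anything is sent. The header values below are placeholders, not values known to satisfy this server, and build_request is a hypothetical helper name.

```python
import requests

URL = "https://www.banrep.gov.co/sites/default/files/paginas/emisiones/EMISIONES.xls"

# Placeholder values (assumptions): replace them with the headers shown in the
# browser's Network tab for the same download.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept": "*/*",
    "Referer": "https://www.banrep.gov.co/es/emisiones-vigentes-el-dcv",
}


def build_request(url, headers):
    """Build a GET request with the given headers without sending it."""
    return requests.Request("GET", url, headers=headers).prepare()


prepared = build_request(URL, BROWSER_HEADERS)

# Print the headers that would go out, to compare against the browser's.
for name, value in prepared.headers.items():
    print(f"{name}: {value}")
```

The prepared request can then be sent with requests.Session().send(prepared, timeout=30). Whether these particular headers are enough to avoid the 403 is something only a test against the site can confirm.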

    To deal with cookies, first request an earlier page on www.banrep.gov.co so that the cookies are collected the way a browser would collect them, and reuse them for the download by using a requests.Session

    session_ = requests.Session()
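
A sketch of the session idea, assuming the page from the question is the one that sets the relevant cookies (that is a guess, not something confirmed for this site). The function name download_with_session and the Referer header are illustrative choices.

```python
import requests

LANDING_URL = "https://www.banrep.gov.co/es/emisiones-vigentes-el-dcv"
FILE_URL = "https://www.banrep.gov.co/sites/default/files/paginas/emisiones/EMISIONES.xls"


def download_with_session(session, landing_url, file_url, destination, timeout=30):
    """Visit the landing page, then download the file with the same session.

    Returns the number of bytes written; raises requests.HTTPError on 4xx/5xx.
    """
    # Visit the landing page first so the session stores any cookies it sets.
    landing = session.get(landing_url, timeout=timeout)
    landing.raise_for_status()

    # The same session sends those cookies back automatically.
    response = session.get(
        file_url, headers={"Referer": landing_url}, timeout=timeout
    )
    response.raise_for_status()

    with open(destination, "wb") as fh:
        fh.write(response.content)
    return len(response.content)
```

A typical call would be download_with_session(requests.Session(), LANDING_URL, FILE_URL, 'EMISIONES.xls'). If the server also checks headers such as User-Agent, they can be set once for every request with session.headers.update({...}).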
    

    Before coding you could try to test your requests using Postman, or other REST API test software.
