Puppeteer is unable to scrape data in headless mode but works in non-headless mode. How to fix?

When I run my Node API with headless: false, Puppeteer opens a browser instance and I can get the data. But when I use headless: true, the site returns "access denied" and nothing is scraped. My code is below.

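const puppeteer = require('puppeteer');

// `url` and `res` come from the surrounding route handler (not shown).
(async () => {
  const browser = await puppeteer.launch({
    headless: false // works; with headless: true the site returns "access denied"
  });
  // Reuse the tab that Puppeteer opens at launch.
  const page = await browser.pages();
  await page[0].goto(url);

  // Read the product title from the first element with class "p-name".
  const my = await page[0].evaluate(() => {
    let title = document.getElementsByClassName('p-name')[0].innerHTML.trim();
    return title;
  });
  console.log(my);
  res.status(200).json(my);
  await browser.close();
})();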

I searched for a solution and found this one (Puppeteer opens an empty tab in non-headless mode). It did not solve my problem completely, but it helped me close the additional browser windows that opened. Thanks in advance.

The URL I want to scrape is: https://www.macys.com/shop/product/nike-big-boys-sportswear-t-shirt?ID=11252136&CategoryID=6086&swatchColor=Dark%20Gray%20Heather

1 answer

  • answered 2021-05-06 09:21 madhu P

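    I think you have to set the user agent. Headless Chrome's default user agent typically contains "HeadlessChrome", which a site can use to deny access.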

    await page[0].setUserAgent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36")
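
Instead of hard-coding a user-agent string, one option is to read the browser's own user agent and rename the "HeadlessChrome" token, so the version stays in sync with the installed Chrome. This is a minimal sketch, not a guaranteed fix: `toRegularUserAgent` and `openPageWithRegularUserAgent` are hypothetical names, and it assumes the site's check depends on that token alone.

```javascript
// Hypothetical helper: turn a headless user-agent string into its
// regular-Chrome equivalent by renaming the "HeadlessChrome" token.
// Strings without that token are returned unchanged.
function toRegularUserAgent(userAgent) {
  return userAgent.replace('HeadlessChrome', 'Chrome');
}

// Sketch of how the helper could be wired in. Puppeteer is required
// inside the function, so it is only loaded when this is called.
async function openPageWithRegularUserAgent(url) {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({ headless: true });
  const [page] = await browser.pages();

  // browser.userAgent() resolves to the browser's default user agent.
  const defaultUserAgent = await browser.userAgent();
  await page.setUserAgent(toRegularUserAgent(defaultUserAgent));

  await page.goto(url);
  return { browser, page };
}
```

The caller is responsible for calling browser.close() on the returned browser once scraping is done.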
    

    The code below worked for me.

    const puppeteer = require("puppeteer");

    async function test() {
      const browser = await puppeteer.launch({
        headless: true
      });
      const page = await browser.pages();
      // Set a regular desktop Chrome user agent before navigating.
      await page[0].setUserAgent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36");
      await page[0].goto("https://www.macys.com/shop/product/nike-big-boys-sportswear-t-shirt?ID=11252136&CategoryID=6086&swatchColor=Dark%20Gray%20Heather");
      // Save a screenshot to check what the page looks like in headless mode.
      await page[0].screenshot({path: 'screenshot.png'});
      const my = await page[0].evaluate(() => {
        let title = document.getElementsByClassName('p-name')[0].innerHTML.trim();
        return title;
      });
      console.log(my);

      await browser.close();
    }

    test();
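
When the site denies access, the 'p-name' element is probably missing and the evaluate call fails with an unhelpful TypeError. A possible way to make the failure explicit is to check the HTTP status and page title before reading the DOM. This is a sketch under assumptions: `looksBlocked` and `scrapeTitle` are hypothetical names, and the status codes and "access denied" title pattern are guesses at what a block page looks like, not something confirmed for this site.

```javascript
// Hypothetical helper: guess from the HTTP status and page title whether
// the site served a block page instead of the product page.
function looksBlocked(status, title) {
  const deniedStatus = status === 401 || status === 403 || status === 429;
  const deniedTitle = /access denied/i.test(title || '');
  return deniedStatus || deniedTitle;
}

// Navigate with an existing Puppeteer page and return the product title,
// or null if the 'p-name' element is absent.
async function scrapeTitle(page, url) {
  // page.goto() resolves to the main response, which may be null.
  const response = await page.goto(url);
  const status = response ? response.status() : 0;
  const title = await page.title();

  if (looksBlocked(status, title)) {
    throw new Error(`Blocked: HTTP ${status}, page title "${title}"`);
  }

  return page.evaluate(() => {
    const element = document.getElementsByClassName('p-name')[0];
    return element ? element.innerHTML.trim() : null;
  });
}
```

Called as `await scrapeTitle(page[0], url)` after setting the user agent, this throws a descriptive error when the request is blocked instead of failing inside evaluate.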