
Bypassing nginx hotlink protection with node.js: an image download case study

Original · Updated: 2023-04-15 15:58:19 · Submitted by: shichen2014

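This article shows how to get past nginx hotlink protection from node.js in order to download images. Using a concrete case, it explains how Referer-based hotlink protection works and how to download protected images from node.js (the author's stack includes the axios library). Readers who need this may find it a useful reference.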

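Problem

Today a project requirement called for collecting information from several websites, including some blockchain statistics charts.

My stack is node.js with the axios library; the test code below, however, sends the GET request with Node's built-in http and https modules and saves the image locally: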

import fs from 'fs';
import path from 'path';
import http from 'http';
import https from 'https';

// ES modules have no built-in __dirname; path.resolve() returns the current working directory.
const __dirname = path.resolve();
const filePath = path.join(__dirname, 'imgtmp');
fs.mkdirSync(filePath, { recursive: true }); // make sure the target directory exists

function downloadfile(url, filename, callback) {
    try {
        console.log('Downloading file:', filename);
        // Pick the module that matches the URL scheme.
        const mod = url.startsWith('https://') ? https : http;
        const req = mod.get(url, {
            headers: {
                "Content-Type": "application/x-www-form-urlencoded"
            }
        }, (res) => {
            const writePath = path.join(filePath, filename);
            const file = fs.createWriteStream(writePath);
            // Note: the response body is written as-is, whatever the status code is.
            res.pipe(file);
            file.on("error", (error) => {
                console.log("There was an error writing the file. Details:", error);
            });
            // "finish" fires once all data has been flushed to disk.
            file.on("finish", () => {
                file.close();
                console.log("Completely downloaded.");
            });
            // "close" fires after the stream has been closed.
            file.on("close", () => {
                callback(filename);
            });
        });

        req.on("error", (error) => {
            console.log(`Error downloading file. Details: ${error}`);
        });
    } catch (error) {
        console.log('Image download failed!', error);
    }
}

const url = 'https://xx.xxxx.com/d/file/zxgg/a2cffb8166f07c0232eca49f8c9cc242.jpg'; // image URL
const filename = path.basename(url);
downloadfile(url, filename, () => {
    console.log(filename, "downloaded successfully");
});

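Running the code, the output reports that the file was downloaded successfully!

However, when I opened the image I was stunned: the viewer reported it as corrupted, and the file was only 304 bytes.

My guess was that the file actually contained an error message, so I opened it as text in EditPlus and, sure enough, found the error message.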
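The same check can be done in code by looking at the first few bytes of the saved file, since real image files start with a fixed signature. The following is a minimal sketch, assuming only JPEG, PNG and GIF need to be recognized; the function name sniffImageType and the sample error-page text are illustrative and do not come from the original site.

```javascript
// Identify common image formats from the leading bytes of a file ("magic numbers").
// Returns 'jpeg', 'png', 'gif', or null when no known signature matches.
function sniffImageType(bytes) {
    const signatures = [
        { type: 'jpeg', magic: [0xff, 0xd8, 0xff] },
        { type: 'png', magic: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
        { type: 'gif', magic: [0x47, 0x49, 0x46, 0x38] }, // ASCII "GIF8"
    ];
    for (const { type, magic } of signatures) {
        if (bytes.length >= magic.length && magic.every((b, i) => bytes[i] === b)) {
            return type;
        }
    }
    return null;
}

// The start of a JPEG file is recognized.
console.log(sniffImageType(Buffer.from([0xff, 0xd8, 0xff, 0xe0]))); // jpeg
// A hypothetical HTML error page saved under a .jpg name is not.
console.log(sniffImageType(Buffer.from('<html><head><title>403 Forbidden</title>'))); // null
```

In the download script, the Buffer returned by fs.readFileSync(writePath) could be passed to this function once the write stream has closed.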

Solution

A search on Baidu confirmed that the nginx server hosting the image uses Referer-based hotlink protection, and further searching turned up the key to the problem.
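Referer-based hotlink protection boils down to the server comparing the request's Referer header with a list of sites that are allowed to embed the resource. The sketch below is a simplified model of that decision, not nginx's actual implementation; the function name isRefererAllowed and the allowedHosts list are hypothetical. nginx's valid_referers directive can also be configured to accept requests without a Referer (its none parameter); this model assumes it is not, which matches the failed download above.

```javascript
// Simplified model of a Referer allow-list check (not nginx's real logic).
// allowedHosts is a hypothetical list of host names permitted to embed the image.
function isRefererAllowed(referer, allowedHosts) {
    if (!referer) {
        return false; // this model rejects requests that carry no Referer header
    }
    let host;
    try {
        host = new URL(referer).hostname;
    } catch {
        return false; // malformed Referer value
    }
    return allowedHosts.includes(host);
}

const allowedHosts = ['www.xxxx.com'];
console.log(isRefererAllowed(undefined, allowedHosts)); // false
console.log(isRefererAllowed('https://www.xxxx.com/', allowedHosts)); // true
console.log(isRefererAllowed('https://other.example/', allowedHosts)); // false
```

A plain node.js request sends no Referer at all, so under this model it falls into the first case and is rejected.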

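Opening the page in Chrome and inspecting the image request in the developer tools shows the Referer header that the browser sends.

The site expects the Referer to be its own address, so I improved the code by adding a Referer entry to headers, "Referer": 'https://www.xxxx.com/':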

import fs from 'fs';
import path from 'path';
import http from 'http';
import https from 'https';

// ES modules have no built-in __dirname; path.resolve() returns the current working directory.
const __dirname = path.resolve();
const filePath = path.join(__dirname, 'imgtmp');
fs.mkdirSync(filePath, { recursive: true }); // make sure the target directory exists

function downloadfile(url, filename, callback) {
    try {
        console.log('Downloading file:', filename);
        // Pick the module that matches the URL scheme.
        const mod = url.startsWith('https://') ? https : http;
        const req = mod.get(url, {
            headers: {
                "Content-Type": "application/x-www-form-urlencoded",
                // Send the site's own address as Referer so the hotlink check passes.
                "Referer": 'https://www.xxxx.com/'
            }
        }, (res) => {
            const writePath = path.join(filePath, filename);
            const file = fs.createWriteStream(writePath);
            res.pipe(file);
            file.on("error", (error) => {
                console.log("There was an error writing the file. Details:", error);
            });
            // "finish" fires once all data has been flushed to disk.
            file.on("finish", () => {
                file.close();
                console.log("Completely downloaded.");
            });
            // "close" fires after the stream has been closed.
            file.on("close", () => {
                callback(filename);
            });
        });

        req.on("error", (error) => {
            console.log(`Error downloading file. Details: ${error}`);
        });
    } catch (error) {
        console.log('Image download failed!', error);
    }
}

const url = 'https://xx.xxxx.com/d/file/zxgg/a2cffb8166f07c0232eca49f8c9cc242.jpg'; // image URL
const filename = path.basename(url);
downloadfile(url, filename, () => {
    console.log(filename, "downloaded successfully");
});

Running the code again, the image downloads successfully and opens normally!
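The script reports success for any response it receives, which is how the 304-byte error file went unnoticed at first. One possible guard, sketched here under the assumption that the server labels images with an image/* Content-Type, is to inspect the status code and headers before writing anything to disk. The helper name isImageResponse is illustrative.

```javascript
// Decide whether an HTTP response is worth saving as an image.
// statusCode and headers mirror res.statusCode and res.headers in Node's http module
// (Node lower-cases the header names in res.headers).
function isImageResponse(statusCode, headers) {
    const contentType = headers['content-type'] || '';
    return statusCode === 200 && contentType.toLowerCase().startsWith('image/');
}

console.log(isImageResponse(200, { 'content-type': 'image/jpeg' })); // true
console.log(isImageResponse(403, { 'content-type': 'text/html' })); // false
```

Inside the response callback, the script could call isImageResponse(res.statusCode, res.headers) before fs.createWriteStream and, when it returns false, call res.resume() to discard the body and log the status instead of saving the file.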

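Afterword

I also tested a second approach: use playwright to open the page in a browser, then call await page.locator('selector path').screenshot({ path: 'image save path' }); to save a screenshot of the image element.

Comparing the two, the playwright screenshot approach has to build a selector for each image element while iterating over them, which means walking up from the current element through parentNode and scanning childNodes, so the algorithm is relatively complex. The result of screenshot is also slightly blurrier than the original image. I therefore recommend fetching the original image with a request that passes the Referer header (with axios or, as above, Node's built-in modules).

PS: While debugging the second approach I wrote a function that walks up the DOM to build a selector. It is provided here for reference; corrections are welcome.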

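/**
 * Build a CSS selector path for an element by walking up through its ancestors.
 */
function getSelectorPath(element) {
    // An id is assumed to be unique, so it can anchor the path on its own.
    if (element.id) {
      return '#' + element.id;
    }
    // Stop at <body>, or at any element that has no parent element.
    if (element === document.body || !element.parentElement) {
      return element.tagName.toLowerCase();
    }

    // Count the element siblings that precede this one to get its :nth-child position.
    let ix = 0;
    const siblings = element.parentNode.childNodes;
    for (let i = 0; i < siblings.length; i++) {
      const sibling = siblings[i];
      // Compare by identity: comparing innerHTML would match the first sibling with identical content.
      if (sibling === element) {
        return `${getSelectorPath(element.parentNode)} > ${element.tagName.toLowerCase()}:nth-child(${ix + 1})`;
      }
      if (sibling.nodeType === 1) {
        ix++;
      }
    }
}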
