
Implementing a Small Image Database Retrieval Feature with OpenCV

Updated: 2021-12-14 15:15:30 Author: Brook_icv


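This article wraps up the previous articles in the series by implementing a small image retrieval application.

A small image retrieval application can be divided into two parts:

train: build the feature database of the image set.
retrieval: given a query image, return the most similar images from the image library.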

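The image database is built as follows:

Generate the visual vocabulary (Vocabulary) of the image set:

    Extract the SIFT features of every image in the image set.

    Cluster the resulting set of SIFT features; the cluster centers are the Vocabulary.

Re-encode every image in the image set. Either BoW or VLAD can be used; VLAD is chosen here.
Stack the VLAD representations of all images into a single VLAD table. This table is the database that queries run against.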

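Once the query data for the image set is available, the most similar database image for any given image is found as follows:

Extract the SIFT features of the image.
Load the Vocabulary and represent the image as a VLAD vector.
Search the image database for the vector most similar to that VLAD vector.

Building the feature database is usually done offline, whereas querying has to run in real time. The basic flow is shown in the figure below:

Figure: the system consists of two parts, the offline training process and the online retrieval.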

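Implementation of the functional modules

The rest of this article uses VLAD to represent images and implements a small image database retrieval program. The following functional modules are needed:

Feature point extraction
Building the Vocabulary
Building the database

Step 1: feature point extraction

Both BoW and VLAD are built on local image features. The local feature chosen here is SIFT, used in its RootSift extension. Extracting stable feature points is especially important. This article uses the SIFT detector provided by OpenCV, instantiated as follows: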

auto fdetector = xfeatures2d::SIFT::create(0,3,0.2,10);

create is declared as follows:

static Ptr<SIFT> cv::xfeatures2d::SIFT::create(
    int    nfeatures = 0,
    int    nOctaveLayers = 3,
    double contrastThreshold = 0.04,
    double edgeThreshold = 10,
    double sigma = 1.6
)

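nfeatures: the number of feature points to keep. Every SIFT feature point is given a score computed from its local contrast. When this value is set, the points are ranked by score and only the top nfeatures are returned; the default of 0 keeps all of them.
nOctaveLayers: the number of layers in each octave. D. Lowe's paper uses 3. (It is the number of octaves, not this value, that is computed automatically from the image resolution.)
contrastThreshold: filters out unstable, low-contrast feature points. The larger the value, the fewer feature points are extracted.
edgeThreshold: filters out feature points that lie on edges. The larger the value, the more feature points are extracted.
sigma: the sigma of the Gaussian filter applied to the input image at octave 0.

Some personal observations.

The parameters that mainly need tuning are contrastThreshold and edgeThreshold. contrastThreshold filters out unstable feature points in smooth regions, and edgeThreshold filters out unstable, edge-like keypoints. Aim for a moderate number of feature points, neither too many nor too few. How to balance the two thresholds depends on whether the target to be extracted is a fairly smooth region or a heavily textured one.

For some images the chosen parameters may be too strict and yield too few feature points. In that case, fall back to a looser set of parameters: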

auto fdetector = xfeatures2d::SIFT::create(0,3,0.2,10);
fdetector->detectAndCompute(img,noArray(),kpts,feature);

if(kpts.size() < 10){
    fdetector = xfeatures2d::SIFT::create();
    fdetector->detectAndCompute(img,noArray(),kpts,feature);
}

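The threshold of 10 can be adjusted to suit the situation.

For more on SIFT, see this article:

Image Retrieval (1): SIFT Revisited, a vlfeat-based implementation. It extracts SIFT features with the lightweight vision library vlfeat. The features it extracts seem somewhat more stable, but it is less convenient to use than OpenCV.
www.dbjr.com.cn/article/181945.htm

For RootSift and VLAD, see the earlier article www.dbjr.com.cn/article/231900.htm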
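
Since RootSift is used throughout but not shown here, the conversion can be sketched as follows. This is a minimal illustration in plain C++ on a single descriptor stored in a std::vector<float>; the function name toRootSift is hypothetical and is not part of the article's RootSiftDetector. It applies the usual RootSIFT recipe: L1-normalize the descriptor, then take the square root of each element.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Convert one SIFT descriptor to RootSIFT in place:
// L1-normalize, then take the element-wise square root.
// SIFT descriptor entries are non-negative, which the sqrt relies on.
void toRootSift(std::vector<float> &desc)
{
    float l1 = 0.0f;
    for (float v : desc) {
        assert(v >= 0.0f);
        l1 += v;
    }
    if (l1 <= 0.0f) return; // all-zero descriptor: nothing to normalize
    for (float &v : desc) v = std::sqrt(v / l1);
}
```

A side effect of this recipe is that the converted descriptor has unit L2 norm, because the squared entries sum to the L1-normalized total of 1.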

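Step 2: building the Vocabulary

Building the Vocabulary amounts to clustering the feature points extracted from the images. First extract the SIFT features of the images in the library and extend them to RootSift, then cluster the RootSift descriptors to obtain the Vocabulary.
A class Vocabulary is created here, with the following main methods:

create: clusters the extracted feature points to build the visual vocabulary (Vocabulary)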

void Vocabulary::create(const std::vector<cv::Mat> &features,int k)
{
    Mat f;
    vconcat(features,f);
    vector<int> labels;
    kmeans(f,k,labels,TermCriteria(TermCriteria::COUNT + TermCriteria::EPS,100,0.01),3,cv::KMEANS_PP_CENTERS,m_voc);
    m_k = k;
}
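
To make the clustering step concrete without depending on OpenCV, here is a rough sketch of the Lloyd iterations that k-means is built on. It is not what cv::kmeans does internally: as a stated simplification, the centers are seeded with the first k descriptors instead of KMEANS_PP_CENTERS, there is a single attempt, and a fixed iteration count replaces the TermCriteria. The names Vec, sqDist, nearestCenter and buildVocabulary are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Vec = std::vector<float>;

// Squared Euclidean distance between two descriptors of equal length.
float sqDist(const Vec &a, const Vec &b)
{
    assert(a.size() == b.size());
    float s = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        float d = a[i] - b[i];
        s += d * d;
    }
    return s;
}

// Index of the center closest to x.
std::size_t nearestCenter(const Vec &x, const std::vector<Vec> &centers)
{
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t c = 0; c < centers.size(); ++c) {
        float d = sqDist(x, centers[c]);
        if (d < bestDist) { bestDist = d; best = c; }
    }
    return best;
}

// Plain Lloyd iterations. Assumption: centers are seeded with the first k
// descriptors, which is simpler (and weaker) than KMEANS_PP_CENTERS.
std::vector<Vec> buildVocabulary(const std::vector<Vec> &data, std::size_t k, int iterations)
{
    assert(k > 0 && data.size() >= k);
    std::vector<Vec> centers(data.begin(), data.begin() + k);
    const std::size_t dim = data[0].size();
    for (int it = 0; it < iterations; ++it) {
        std::vector<Vec> sums(k, Vec(dim, 0.0f));
        std::vector<std::size_t> counts(k, 0);
        // Assignment step: add each descriptor to its nearest center's sum.
        for (const Vec &x : data) {
            std::size_t c = nearestCenter(x, centers);
            for (std::size_t j = 0; j < dim; ++j) sums[c][j] += x[j];
            ++counts[c];
        }
        // Update step: move each center to the mean of its members.
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0) continue; // keep an empty cluster's old center
            for (std::size_t j = 0; j < dim; ++j)
                centers[c][j] = sums[c][j] / static_cast<float>(counts[c]);
        }
    }
    return centers;
}
```

For example, clustering the four 2-D points (0,0), (10,10), (0,2), (10,12) with k = 2 settles on the centers (0,1) and (10,11). In the real pipeline the rows would be 128-dimensional RootSift descriptors and k would be the vocabulary size, such as the 64 used later in this article.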

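load and save: for convenience, the generated visual vocabulary (Vocabulary) needs to be saved to a file (.yml) and loaded back
transform_vlad: converts the features of an input image into their VLAD representation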

void Vocabulary::transform_vlad(const cv::Mat &f,cv::Mat &vlad)
{
    // Find the nearest center
    Ptr<FlannBasedMatcher> matcher = FlannBasedMatcher::create();
    vector<DMatch> matches;
    matcher->match(f,m_voc,matches);
    // Compute vlad
    Mat responseHist(m_voc.rows,f.cols,CV_32FC1,Scalar::all(0));
    for( size_t i = 0; i < matches.size(); i++ ){
        auto queryIdx = matches[i].queryIdx;
        int trainIdx = matches[i].trainIdx; // cluster index
        Mat residual;
        subtract(f.row(queryIdx),m_voc.row(trainIdx),residual,noArray());
        add(responseHist.row(trainIdx),residual,responseHist.row(trainIdx),noArray(),responseHist.type());
    }

    // l2-norm
    auto l2 = norm(responseHist,NORM_L2);
    if(l2 > 0) responseHist /= l2; // skip normalization when every residual is zero
    //normalize(responseHist,responseHist,1,0,NORM_L2);

    //Mat vec(1,m_voc.rows * f.cols,CV_32FC1,Scalar::all(0));
    vlad = responseHist.reshape(0,1); // Reshape the matrix to 1 x (k*d) vector
}
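
As a cross-check on transform_vlad, the same computation can be written in plain C++ and run on a tiny hand-verifiable input. This is only a sketch under stated assumptions: the nearest visual word is found by a brute-force scan rather than FlannBasedMatcher, and the names Vec and computeVlad are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Vec = std::vector<float>;

// Encode a set of descriptors as a VLAD vector of length k * d, where
// centers holds the k visual words and every descriptor has dimension d.
Vec computeVlad(const std::vector<Vec> &descriptors, const std::vector<Vec> &centers)
{
    assert(!centers.empty());
    const std::size_t k = centers.size();
    const std::size_t d = centers[0].size();
    Vec vlad(k * d, 0.0f);

    for (const Vec &x : descriptors) {
        assert(x.size() == d);
        // Find the nearest visual word by squared Euclidean distance.
        std::size_t best = 0;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t c = 0; c < k; ++c) {
            float dist = 0.0f;
            for (std::size_t j = 0; j < d; ++j) {
                float diff = x[j] - centers[c][j];
                dist += diff * diff;
            }
            if (dist < bestDist) { bestDist = dist; best = c; }
        }
        // Accumulate the residual into that word's block.
        for (std::size_t j = 0; j < d; ++j)
            vlad[best * d + j] += x[j] - centers[best][j];
    }

    // Global L2 normalization, skipped when all residuals are zero.
    float sq = 0.0f;
    for (float v : vlad) sq += v * v;
    if (sq > 0.0f) {
        float norm = std::sqrt(sq);
        for (float &v : vlad) v /= norm;
    }
    return vlad;
}
```

With the two visual words (0,0) and (10,10) and the descriptors (1,0), (2,0), (10,14), the residuals sum to (3,0) for the first word and (0,4) for the second. The unnormalized vector (3,0,0,4) has L2 norm 5, so the result is (0.6, 0, 0, 0.8).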

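To summarize, class Vocabulary provides the following:

Building the visual vocabulary (Vocabulary) from a list of images
Saving the generated Vocabulary to disk, together with a load method
Representing an image as a VLAD vector

Step 3: creating the image database

The image database is simply the collection of the images' VLAD representations. A query against it returns the images whose VLAD vectors are similar to that of the query image.
This article builds a simple database on top of OpenCV's Mat: a single Mat holds the matrix formed by the VLAD vectors of all images, so retrieval is really a search over that Mat.
A class Database is declared with the following functions:

add: adds an image to the database
save and load: save the database to a file (.yml) and load it back
retrieval: builds an index over the Mat of stored VLAD vectors and returns the most similar result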
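
The add and retrieval operations listed above can be sketched without OpenCV as follows. This is a hypothetical stand-in, not the article's Database class: the class name VladTable is invented, the vectors live in std::vector rows instead of a Mat, and the search is a brute-force linear scan in place of an index, which is only reasonable for a small table.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the article's Database: one VLAD row per image.
class VladTable {
public:
    // Store an image's identifier together with its VLAD vector.
    void add(const std::string &id, const std::vector<float> &vlad)
    {
        assert(m_rows.empty() || vlad.size() == m_rows[0].size());
        m_ids.push_back(id);
        m_rows.push_back(vlad);
    }

    // Return (id, squared L2 distance) of the closest stored vector.
    // Requires a non-empty table.
    std::pair<std::string, float> retrieval(const std::vector<float> &query) const
    {
        assert(!m_rows.empty());
        std::size_t best = 0;
        float bestDist = std::numeric_limits<float>::max();
        for (std::size_t i = 0; i < m_rows.size(); ++i) {
            assert(query.size() == m_rows[i].size());
            float dist = 0.0f;
            for (std::size_t j = 0; j < query.size(); ++j) {
                float diff = query[j] - m_rows[i][j];
                dist += diff * diff;
            }
            if (dist < bestDist) { bestDist = dist; best = i; }
        }
        return {m_ids[best], bestDist};
    }

private:
    std::vector<std::string> m_ids;          // image identifiers, one per row
    std::vector<std::vector<float>> m_rows;  // VLAD vectors, one per image
};
```

Because the VLAD vectors are L2-normalized, ranking by squared L2 distance gives the same order as ranking by cosine similarity, so either could serve as the score.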

Step 4: Trainer

Feature point extraction, vocabulary construction, and the database of VLAD-encoded images are now in place. They are combined here into a Trainer class to make training convenient.

class Trainer{

public:

    Trainer();
    ~Trainer();

    Trainer(int k,int pcaDim,const std::string &imageFolder,
        const std::string &path,const std::string &identifier,std::shared_ptr<RootSiftDetector> detector);
    
    void createVocabulary();
    void createDb();

    void save();

private:

    int m_k; // The size of vocabulary
    int m_pcaDimension; // The number of dimensions retained after PCA

    Vocabulary* m_voc;
    Database* m_db;

private:

    /*
        Image folder
    */
    std::string m_imageFolder;

    /*
        training result identifier,the name suffix of vocabulary and database
        voc-identifier.yml,db-identifier.yml
    */
    std::string m_identifier;

    /*
        The location of training result
    */
    std::string m_resultPath;
};

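Using Trainer requires configuring:

The directory containing the image set.
The size of the visual vocabulary (the number of cluster centers).
The number of VLAD dimensions retained after PCA. This can be ignored for now: set it to 0 to skip PCA.
The path where the training results are saved.
The training results are saved as yml files named voc-m_identifier.yml and db-m_identifier.yml. To make it easy to test data trained with different parameters, the suffix m_identifier distinguishes one set of training results from another.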
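
The naming rule can be captured in a small helper. The function name resultFile is hypothetical; the article's own code builds the same kind of string inline with a stringstream.

```cpp
#include <cassert>
#include <string>

// Build "<resultPath>/<prefix>-<identifier>.yml", where prefix is
// "voc" for the vocabulary file or "db" for the database file.
std::string resultFile(const std::string &resultPath,
                       const std::string &prefix,
                       const std::string &identifier)
{
    assert(!identifier.empty()); // an empty suffix would make runs indistinguishable
    return resultPath + "/" + prefix + "-" + identifier + ".yml";
}
```

Calling resultFile("/home/test/build", "voc", "test-200-vl-64") returns "/home/test/build/voc-test-200-vl-64.yml"; the directory in this call is only an example value.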

It is used as follows:

int main(int argc, char *argv[])
{
    const string image_200 = "/home/test/images-1";
    const string image_6k = "/home/test/images/sync_down_1";
    
    auto detector = make_shared<RootSiftDetector>(5,5,10);
    Trainer trainer(64,0,image_200,"/home/test/projects/imageRetrievalService/build","test-200-vl-64",detector);

    trainer.createVocabulary();
    trainer.createDb();
    
    trainer.save();

    return 0;
}

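To keep things short, none of this is exposed as command-line parameters. Before running, set the image path and the path where the training results are saved.

Step 5: Searcher

Database already implements a retrieval method. It is wrapped in one more layer here to fit application-level requirements better, such as image preprocessing, tiling, multithreading, and filtering of query results. Searcher is tightly coupled to the concrete application, so only a simple retrieval method and the configuration of query parameters are implemented here.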

class Searcher{

public:
    Searcher();
    ~Searcher();

    void init(int keyPointThreshold);
    void setDatabase(std::shared_ptr<Database> db);

    void retrieval(cv::Mat &query,const std::string &group,std::string &md5,double &score);

    void retrieval(std::vector<char> bins,const std::string &group,std::string &md5,double &score);

private:
    int m_keyPointThreshold;

    std::shared_ptr<Database> m_db;
};

Using it is straightforward: load the Vocabulary and the Database from file, then set the Searcher's parameters.

Vocabulary voc;

stringstream ss;
ss << path << "/voc-" << identifier << ".yml";

cout << "Load vocabulary from " << ss.str() << endl;
voc.load(ss.str());

cout << "Load vocabulary successful." << endl;

auto detector = make_shared<RootSiftDetector>(5,0.2,10);

auto db = make_shared<Database>(detector);

cout << "Load database from " << path << "/db-" << identifier << ".yml" << endl;
db->load1(path,identifier);
db->setVocabulary(voc);
cout << "Load database successful." << endl;

Searcher s;
s.init(10);
s.setDatabase(db);

Summary

The figures summarize the whole flow:

Creating the Vocabulary
Creating the Database
Searching for the list of similar images
