
Django: Fixing distinct() Not Removing Duplicate Rows

Updated: 2021-03-22 19:07:00  Author: CQ-LQJ

This article explains why distinct() may fail to remove duplicate rows in Django and how to work around it. It should serve as a useful reference.

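Querying multiple non-duplicate records in MySQL with distinct

How can distinct be used to query multiple non-duplicate records in MySQL?

Sometimes you want to use distinct to remove duplicates from a queryset. The Django documentation gives these examples: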

>>> Author.objects.distinct()
[...]
 
>>> Entry.objects.order_by('pub_date').distinct('pub_date')
[...]
 
>>> Entry.objects.order_by('blog').distinct('blog')
[...]
 
>>> Entry.objects.order_by('author', 'pub_date').distinct('author', 'pub_date')
[...]
 
>>> Entry.objects.order_by('blog__name', 'mod_date').distinct('blog__name', 'mod_date')
[...]
 
>>> Entry.objects.order_by('author', 'pub_date').distinct('author')
[...]
Note

The Django documentation specifically notes that the fields passed to distinct() must also be given to order_by(), and must come first there.
When you specify field names, you must provide an order_by() in the QuerySet, and the fields in order_by() must start with the fields in distinct(), in the same order.

For example, SELECT DISTINCT ON (a) gives you the first row for each value in column a. If you don't specify an order, you'll get some arbitrary row.

I followed this exactly, but with a MySQL database the query raised this exception:

raise NotImplementedError('DISTINCT ON fields is not supported by this database backend')
NotImplementedError: DISTINCT ON fields is not supported by this database backend
In other words, the database backend does not support it.

Of course, you can deduplicate in Python instead:

items = []
for item in query_set:
  if item not in items:
    items.append(item)
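
The loop above rescans the list for every item, and it compares whole objects rather than a chosen field. A minimal sketch of a variant that keeps the first item per key and tracks seen keys in a set is shown below. The helper name `dedupe_by` and its `key` argument are hypothetical, and plain dicts stand in for model instances.

```python
from typing import Callable, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def dedupe_by(items: Iterable[T], key: Callable[[T], Hashable]) -> List[T]:
    """Keep the first item seen for each key, preserving input order."""
    seen = set()
    unique = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            unique.append(item)
    return unique


# Illustrative rows; in a real project these would come from a queryset.
rows = [
    {"id": 1, "finish_time": "2021-03-01"},
    {"id": 2, "finish_time": "2021-03-01"},
    {"id": 3, "finish_time": "2021-03-02"},
]
unique_rows = dedupe_by(rows, key=lambda r: r["finish_time"])
print([r["id"] for r in unique_rows])  # [1, 3]
```

With model instances the key would read an attribute instead, for example `key=lambda o: o.finish_time`. Note that this still fetches every row from the database before filtering, so it suits small result sets best.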

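First, note that there are two ways to run a model query in Django.

The first is to use the query API that Django provides, such as filter, values, distinct and order_by.

Code: LOrder.objects.values('finish_time').distinct()

Note what the official documentation says here:

Examples (those after the first will only work on PostgreSQL):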

>>> Author.objects.distinct()
[...]
>>> Entry.objects.order_by('pub_date').distinct('pub_date')
[...]
>>> Entry.objects.order_by('blog').distinct('blog')
[...]
>>> Entry.objects.order_by('author', 'pub_date').distinct('author', 'pub_date')
[...]
>>> Entry.objects.order_by('blog__name', 'mod_date').distinct('blog__name', 'mod_date')
[...]
>>> Entry.objects.order_by('author', 'pub_date').distinct('author')

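Because I am using MySQL, only the first form of distinct (with no field arguments) is available. Alternatively, it can be combined with values() like this: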

LOrder.objects.values('finish_time').distinct().order_by('finish_time')

The second is to use a raw SQL query.

LOrder.objects.raw('SELECT DISTINCT id,finish_time FROM keywork_lorder group by finish_time')

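The statement above deduplicates directly in MySQL. Two points need special attention.

First, there is one field that a raw SQL query cannot leave out. The official documentation says:

There is only one field that you can't leave out: the primary key field. Django uses the primary key to identify model instances, so it must always be included in a raw query. An InvalidQuery exception will be raised if you forget to include the primary key.

That is, if your SQL statement is 'SELECT DISTINCT finish_time FROM keywork_lorder', it fails with "Raw query must include the primary key": the id field cannot be dropped.

Second, since this is raw MySQL, the duplicates are removed with a statement like 'SELECT DISTINCT id,finish_time FROM keywork_lorder group by finish_time'. Here GROUP BY does the deduplication; DISTINCT over id and finish_time would change nothing on its own, because id is unique in every row. Selecting the non-aggregated id column only works when MySQL's ONLY_FULL_GROUP_BY mode is off (it is on by default since MySQL 5.7.5); with it on, aggregate the key instead, for example MIN(id) AS id.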
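
As a rough illustration of a query that returns one row per finish_time while still including a primary key, the sketch below aggregates the key with MIN(id). It runs against an in-memory SQLite database purely so that it is self-contained. The table name keywork_lorder comes from the article, while the sample rows and the variable name `dedup_sql` are assumptions.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL table used in the article.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE keywork_lorder (id INTEGER PRIMARY KEY, finish_time TEXT)"
)
conn.executemany(
    "INSERT INTO keywork_lorder (finish_time) VALUES (?)",
    [("2021-03-01",), ("2021-03-01",), ("2021-03-02",)],  # made-up sample data
)

# One row per finish_time. MIN(id) keeps a primary key in the result
# without selecting a non-aggregated column alongside GROUP BY.
dedup_sql = (
    "SELECT MIN(id) AS id, finish_time "
    "FROM keywork_lorder GROUP BY finish_time ORDER BY finish_time"
)
result = conn.execute(dedup_sql).fetchall()
print(result)  # [(1, '2021-03-01'), (3, '2021-03-02')]
```

Because the result includes an id column, the same SQL string should also be usable with LOrder.objects.raw(dedup_sql), assuming the LOrder model maps to this table as in the article.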

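Supplement: using Distinct to remove duplicate data

distinct returns the unique values of a column in a query (that is, it removes duplicates), and it works on a single column or on several columns.
In practice it is common for a column to contain repeated values, such as the dept (department) column of an employee table.
If you want all the different values of a column when querying, you can use distinct.

distinct syntax

select [distinct] column_name1, column_name2
from table_name;

Let's try it out.

Create a footprint table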
create table footprint(
	id int not null auto_increment primary key,
	username varchar(30) comment 'user name',
	city varchar(30) comment 'city',
	visit_date varchar(10) comment 'visit date'
);
Insert some data
insert into footprint(username, city, visit_date) values('mofei', 'Guiyang', '2019-12-05');
insert into footprint(username, city, visit_date) values('mofei', 'Guiyang', '2020-01-15');
insert into footprint(username, city, visit_date) values('mofei', 'Beijing', '2018-10-10');
insert into footprint(username, city, visit_date) values('zhangsan', 'Shanghai', '2020-01-01');
insert into footprint(username, city, visit_date) values('zhangsan', 'Shanghai', '2020-02-02');
insert into footprint(username, city, visit_date) values('lisi', 'Lhasa', '2016-12-20');

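Which cities have these users visited?
mysql> select distinct city from footprint;
group by gives the same result here, but distinct exists specifically for removing duplicates
mysql> select city from footprint group by city;


Find which users are using the system
mysql> select distinct username from footprint;

When distinct is applied to two columns, rows are treated as duplicates only if both values match, and one row is kept for each such combination.

The content above is a supplement from 墨菲墨菲.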
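
The two-column behaviour described above can be checked with a short sketch. It recreates the footprint table in an in-memory SQLite database (an assumption made so the example is self-contained; the article itself uses MySQL) and uses romanized city names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE footprint ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "username TEXT, city TEXT, visit_date TEXT)"
)
conn.executemany(
    "INSERT INTO footprint (username, city, visit_date) VALUES (?, ?, ?)",
    [
        ("mofei", "Guiyang", "2019-12-05"),
        ("mofei", "Guiyang", "2020-01-15"),
        ("mofei", "Beijing", "2018-10-10"),
        ("zhangsan", "Shanghai", "2020-01-01"),
        ("zhangsan", "Shanghai", "2020-02-02"),
        ("lisi", "Lhasa", "2016-12-20"),
    ],
)

# DISTINCT over two columns: a row is dropped only when BOTH values repeat.
pairs = conn.execute(
    "SELECT DISTINCT username, city FROM footprint ORDER BY username, city"
).fetchall()
for pair in pairs:
    print(pair)
```

The six inserted rows collapse to four (username, city) pairs: mofei keeps both Guiyang and Beijing, because only the city repeats across those rows, not the whole pair.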

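Additional notes: removing duplicate records with Distinct and Group by

Duplicate records come in two kinds. The first is fully duplicated records, where every field is repeated.

The second is records duplicated only on certain key fields, for example where the Name field repeats while the other fields may or may not repeat, or where their repetition can be ignored.

1. The first kind of duplication is easy to handle. Use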

select distinct * from tableName

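to get a result set without duplicate records.

If duplicate records need to be deleted from the table (keeping one copy of each), this can be done as follows (the temporary-table syntax here is SQL Server's)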

select distinct * into #Tmp from tableName
drop table tableName
select * into tableName from #Tmp
drop table #Tmp

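This kind of duplication comes from careless table design; adding a unique index column prevents it.

2. The second kind of duplication usually requires keeping the first record of each group of duplicates. Proceed as follows.

Assume the duplicated fields are Name and Address, and the goal is a result set that is unique on these two fields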
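
To illustrate how a unique index blocks such duplicates at insert time, here is a small sketch using an in-memory SQLite database. The table name `person`, the index name `uq_person` and the sample values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, address TEXT)")
# The unique index makes the database itself reject repeated (name, address) pairs.
conn.execute("CREATE UNIQUE INDEX uq_person ON person (name, address)")

conn.execute("INSERT INTO person VALUES ('Ann', '1 Main St')")
try:
    conn.execute("INSERT INTO person VALUES ('Ann', '1 Main St')")
    rejected = False
except sqlite3.IntegrityError:
    # The second, identical row violates the unique index.
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(rejected, count)  # True 1
```

A unique index cannot be created on a table that already holds duplicate rows, so the existing duplicates have to be cleaned up first.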

select identity(int,1,1) as autoID, * into #Tmp from tableName
select min(autoID) as autoID into #Tmp2 from #Tmp group by Name, Address
select * from #Tmp where autoID in(select autoID from #Tmp2)

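The last select returns a result set with no repeated Name, Address pairs (it has an extra autoID column, which can be left out by listing the wanted columns in the select clause).

Other databases can use a sequence, for example: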

create sequence seq1;
select seq1.nextval as autoID, * into #Tmp from tableName
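
When the table already has a unique integer key, the temporary autoID column is not needed: the existing key can pick the first row of each group directly. The sketch below assumes an `id` primary key and an extra `Note` column (both hypothetical) and runs on in-memory SQLite to stay self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableName ("
    "id INTEGER PRIMARY KEY, Name TEXT, Address TEXT, Note TEXT)"
)
conn.executemany(
    "INSERT INTO tableName (Name, Address, Note) VALUES (?, ?, ?)",
    [
        ("Ann", "1 Main St", "first"),
        ("Ann", "1 Main St", "second"),
        ("Bob", "2 Oak Ave", "only"),
    ],
)

# Ids of the first row in each (Name, Address) group.
first_ids = "SELECT MIN(id) FROM tableName GROUP BY Name, Address"

# Full rows, one per (Name, Address) pair.
first_rows = conn.execute(
    f"SELECT * FROM tableName WHERE id IN ({first_ids}) ORDER BY id"
).fetchall()
print(first_rows)

# Delete every other row, keeping only the first of each group.
conn.execute(f"DELETE FROM tableName WHERE id NOT IN ({first_ids})")
remaining = conn.execute("SELECT COUNT(*) FROM tableName").fetchone()[0]
print(remaining)  # 2
```

The DELETE form works in SQLite as written. MySQL rejects a subquery that reads the table being deleted from, so there the subquery would need to be wrapped in a derived table or staged in a temporary table.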

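zuolo: Based on the example above, the statement I needed turned out to be SELECT MAX(id) AS ID,Prodou_id,FinalDye FROM anwell..tblDBDdata GROUP BY Prodou_id,FinalDye ORDER BY id. Trying to use Distinct to get records that are unique on specific fields had been a misconception all along.

That is everything in this article on fixing distinct failing to remove duplicate rows in Django. I hope it serves as a useful reference, and thank you for supporting 腳本之家.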
