MongoDB中索引的用法
本文是一篇转载文章,作者在对 MongoDB文档进行了细致的阅读后,总结出了MongoDB的各种索引的用法。
原文链接: http://iamcaihuafeng.blog.sohu.com/151638529.html
索引能提高检索数据的速度,你可以想像成在MySQL中创建索引一样,同样索引也是用B-Tree也实现的。
1.单列索引
在字段x上创建索引,1 (ascending) or -1 (descending)
> db.data.ensureIndex({x:1})
显示表data里面的所有索引
> db.data.getIndexes() [ { "name" : "_id_", "ns" : "recommender.data", "key" : { "_id" : 1 } }, { "_id" : ObjectId("4befb146b0e29ba1ce20e0bb"), "ns" : "recommender.data", "key" : { "x" : 1 }, "name" : "x_1" } ]
查找字段x为6的值,此时已经用到索引了
> db.data.find({x:6}) { "_id" : ObjectId("4bee804ba23d558eb6687117"), "x" : 6, "name" : "caihuafeng1" } { "_id" : ObjectId("4bee804ba23d558eb6687118"), "x" : 6, "name" : "caihuafeng2" } { "_id" : ObjectId("4bee804ba23d558eb6687119"), "x" : 6, "name" : "caihuafeng3" } { "_id" : ObjectId("4bee804ba23d558eb668711a"), "x" : 6, "name" : "caihuafeng4" } { "_id" : ObjectId("4bee804ba23d558eb668711b"), "x" : 6, "name" : "caihuafeng5" } { "_id" : ObjectId("4bee804ba23d558eb668711c"), "x" : 6, "name" : "caihuafeng6" } { "_id" : ObjectId("4bee804ba23d558eb668711d"), "x" : 6, "name" : "caihuafeng7" } { "_id" : ObjectId("4bee804ba23d558eb668711e"), "x" : 6, "name" : "caihuafeng8" } { "_id" : ObjectId("4bee804ba23d558eb668711f"), "x" : 6, "name" : "caihuafeng9" } { "_id" : ObjectId("4bee804ba23d558eb6687120"), "x" : 6, "name" : "caihuafeng10" }
2.默认索引
上述1中db.data.getIndexes()显示出来的一共有2个索引,其中_id是创建表的时候自动创建的索引,此索引是不能够删除的。
An index is always created on _id. This index is special and cannot be deleted. The _id index enforces uniqueness for its keys.
3.文档作为索引的键值
a.单列索引
MongoDB的官方文档上面是这样说的:
Documents as Keys
Indexed fields may be of any type, including documents:
往数据库recommender的表data中插入三条记录
> db.data.insert({name:"1616",info:{url:"http://www.1616.net/",city:"beijing"}}); > db.data.insert({name:"hao123",info:{url:"http://www.hao123.com/",city:"beijing"}}); > db.data.insert({name:"ll4la",info:{url:"http://www.114la.com/",city:"dongguan"}});
对字段info创建索引
> db.data.ensureIndex({info: 1});
显示表data上的所有索引
> db.data.getIndexes(); [ { "name" : "_id_", "ns" : "recommender.data", "key" : { "_id" : 1 } }, { "_id" : ObjectId("4befb146b0e29ba1ce20e0bb"), "ns" : "recommender.data", "key" : { "x" : 1 }, "name" : "x_1" }, { "_id" : ObjectId("4befb76bb0e29ba1ce20e0bf"), "ns" : "recommender.data", "key" : { "info" : 1 }, "name" : "info_1" } ]
查找指定的记录,此时会用到索引
> db.data.find({info: {url:"http://www.1616.net/",city:"beijing"}}); { "_id" : ObjectId("4befb711b0e29ba1ce20e0bc"), "name" : "1616", "info" : { "url" : "http://www.1616.net/", "city" : "beijing" } }
b.组合索引
建立组合索引
> db.data.ensureIndex({"info.url":1, "info.city":1}); > db.data.getIndexes(); [ { "name" : "_id_", "ns" : "recommender.data", "key" : { "_id" : 1 } }, { "_id" : ObjectId("4befb146b0e29ba1ce20e0bb"), "ns" : "recommender.data", "key" : { "x" : 1 }, "name" : "x_1" }, { "_id" : ObjectId("4befb76bb0e29ba1ce20e0bf"), "ns" : "recommender.data", "key" : { "info" : 1 }, "name" : "info_1" }, { "_id" : ObjectId("4befb9d1b0e29ba1ce20e0c0"), "ns" : "recommender.data", "key" : { "info.url" : 1, "info.city" : 1 }, "name" : "info.url_1_info.city_1" } ]
下面几个操作均会用到索引
> db.data.find({"info.url": "http://www.1616.net/", "info.city": "beijing"}); { "_id" : ObjectId("4befb711b0e29ba1ce20e0bc"), "name" : "1616", "info" : { "url" : "http://www.1616.net/", "city" : "beijing" } } > db.data.find({"info.url": "http://www.1616.net/"}); { "_id" : ObjectId("4befb711b0e29ba1ce20e0bc"), "name" : "1616", "info" : { "url" : "http://www.1616.net/", "city" : "beijing" } }
1表示升序(asc),-1表示降序(desc)
> db.data.find({"info.url": /http:*/i}).sort({"info.url": 1, "info.city": 1}); { "_id" : ObjectId("4befb740b0e29ba1ce20e0be"), "name" : "ll4la", "info" : { "url" : "http://www.114la.com/", "city" : "dongguan" } } { "_id" : ObjectId("4befb711b0e29ba1ce20e0bc"), "name" : "1616", "info" : { "url" : "http://www.1616.net/", "city" : "beijing" } } { "_id" : ObjectId("4befb723b0e29ba1ce20e0bd"), "name" : "hao123", "info" : { "url" : "http://www.hao123.com/", "city" : "beijing" } } > db.data.find({"info.url": /http:*/i}).sort({"info.url": 1}); { "_id" : ObjectId("4befb740b0e29ba1ce20e0be"), "name" : "ll4la", "info" : { "url" : "http://www.114la.com/", "city" : "dongguan" } } { "_id" : ObjectId("4befb711b0e29ba1ce20e0bc"), "name" : "1616", "info" : { "url" : "http://www.1616.net/", "city" : "beijing" } } { "_id" : ObjectId("4befb723b0e29ba1ce20e0bd"), "name" : "hao123", "info" : { "url" : "http://www.hao123.com/", "city" : "beijing" } } > db.data.find({"info.url": /http:*/i}).sort({"info.url": -1}); { "_id" : ObjectId("4befb723b0e29ba1ce20e0bd"), "name" : "hao123", "info" : { "url" : "http://www.hao123.com/", "city" : "beijing" } } { "_id" : ObjectId("4befb711b0e29ba1ce20e0bc"), "name" : "1616", "info" : { "url" : "http://www.1616.net/", "city" : "beijing" } } { "_id" : ObjectId("4befb740b0e29ba1ce20e0be"), "name" : "ll4la", "info" : { "url" : "http://www.114la.com/", "city" : "dongguan" } }
4.组合索引
注意,这里的组合索引与上述3中的b中的组合索引是有点不同的,4里面是对一级字段建立组合索引,而上述3中是对二级字段建立组合索引。
在字段name及info上面创建组合索引
> db.data.ensureIndex({name: 1, info: -1});
当创建组合索引时,字段后面的1表示升序,-1表示降序,是用1还是用-1主要是跟排序的时候或指定范围内查询的时候有关的,具体看下面的英文原文的说明。
When creating an index, the number associated with a key specifies the direction of the index, so it should always be 1 (ascending) or -1 (descending). Direction doesn’t matter for single key indexes or for random access retrieval but is important if you are doing sorts or range queries on compound indexes.
显示所有的索引
> db.data.getIndexes(); [ { "name" : "_id_", "ns" : "recommender.data", "key" : { "_id" : 1 } }, { "_id" : ObjectId("4befb146b0e29ba1ce20e0bb"), "ns" : "recommender.data", "key" : { "x" : 1 }, "name" : "x_1" }, { "_id" : ObjectId("4befb76bb0e29ba1ce20e0bf"), "ns" : "recommender.data", "key" : { "info" : 1 }, "name" : "info_1" }, { "_id" : ObjectId("4befb9d1b0e29ba1ce20e0c0"), "ns" : "recommender.data", "key" : { "info.url" : 1, "info.city" : 1 }, "name" : "info.url_1_info.city_1" }, { "_id" : ObjectId("4befbfcfb0e29ba1ce20e0c1"), "ns" : "recommender.data", "key" : { "name" : 1, "info" : -1 }, "name" : "name_1_info_-1" } ]
下面的排序将用到上面的索引
最后一行的”name” : “ll4la”实际上是”name” : “114la”(就是将数字一写成了字母l),但是我录入的时候写成了”name” : “ll4la”,是我写错了,但是排序的结果是对的。
> db.data.find({"info.url": /http:*/i}).sort({name:1, info: -1}); { "_id" : ObjectId("4befb711b0e29ba1ce20e0bc"), "name" : "1616", "info" : { "url" : "http://www.1616.net/", "city" : "beijing" } } { "_id" : ObjectId("4befb723b0e29ba1ce20e0bd"), "name" : "hao123", "info" : { "url" : "http://www.hao123.com/", "city" : "beijing" } } { "_id" : ObjectId("4befb740b0e29ba1ce20e0be"), "name" : "ll4la", "info" : { "url" : "http://www.114la.com/", "city" : "dongguan" } }
MongoDB组合索引规则
If you have a compound index on multiple fields, you can use it to query on the beginning subset of fields. So if you have an index on
a,b,c
you can use it query on
a
a,b
a,b,c
如果用过MySQL的话,看起来是不是很熟悉,原理跟MySQL是一样的。
5.唯一索引
往表data中插入一条记录。
> db.data.insert({firstname: "cai", lastname: "huafeng"});
由于表data中只有一记录有字段firstname及lastname,其它的行均没有相应的值,也就是均为null,为null就说明是相同的,而唯一索引是不允许有相同的值的,所以下面创建唯一组合索引时报错了。
所以建立唯一索引时,不管是对单个字段还是多个字段建立索引,则最好每一行均有此字段,否则会报错。
> db.data.find(); { "_id" : ObjectId("4bee745a0863b1c233b8b7ea"), "name" : "caihuafeng" } { "_id" : ObjectId("4bee745f0863b1c233b8b7eb"), "website" : "1616.net" } { "_id" : ObjectId("4bee804ba23d558eb6687117"), "x" : 6, "name" : "caihuafeng1" } { "_id" : ObjectId("4bee804ba23d558eb6687118"), "x" : 6, "name" : "caihuafeng2" } { "_id" : ObjectId("4bee804ba23d558eb6687119"), "x" : 6, "name" : "caihuafeng3" } { "_id" : ObjectId("4bee804ba23d558eb668711a"), "x" : 6, "name" : "caihuafeng4" } { "_id" : ObjectId("4bee804ba23d558eb668711b"), "x" : 6, "name" : "caihuafeng5" } { "_id" : ObjectId("4bee804ba23d558eb668711c"), "x" : 6, "name" : "caihuafeng6" } { "_id" : ObjectId("4bee804ba23d558eb668711d"), "x" : 6, "name" : "caihuafeng7" } { "_id" : ObjectId("4bee804ba23d558eb668711e"), "x" : 6, "name" : "caihuafeng8" } { "_id" : ObjectId("4bee804ba23d558eb668711f"), "x" : 6, "name" : "caihuafeng9" } { "_id" : ObjectId("4bee804ba23d558eb6687120"), "x" : 6, "name" : "caihuafeng10" } { "_id" : ObjectId("4befb711b0e29ba1ce20e0bc"), "name" : "1616", "info" : { "url" : "http://www.1616.net/", "city" : "beijing" } } { "_id" : ObjectId("4befb723b0e29ba1ce20e0bd"), "name" : "hao123", "info" : { "url" : "http://www.hao123.com/", "city" : "beijing" } } { "_id" : ObjectId("4befb740b0e29ba1ce20e0be"), "name" : "ll4la", "info" : { "url" : "http://www.114la.com/", "city" : "dongguan" } } { "_id" : ObjectId("4befc51ab0e29ba1ce20e0c2"), "firstname" : "cai", "lastname" : "huafeng" } > db.data.ensureIndex({firstname: 1, lastname: 1}, {unique: true}); E11000 duplicate key error index: recommender.data.$firstname_1_lastname_1 dup key: { : null, : null }
下面我们用另外一个表person来进行测试
> db.person.ensureIndex({firstname:1, lastname: 1},{unique: true}); > db.person.insert({firstname: 'cai', lastname: 'huafeng'});
第二次插入同样值的时候报错了,说明唯一索引生效了,其实跟MySQL里面是一样的。
> db.person.insert({firstname: 'cai', lastname: 'huafeng'}); E11000 duplicate key error index: recommender.person.$firstname_1_lastname_1 dup key: { : "cai", : "huafeng" }
6.唯一索引中的重复值处理
删除上述5中的索引,插入两行一样的记录
> db.person.dropIndexes(); { "nIndexesWas" : 2, "msg" : "non-_id indexes dropped for collection", "ok" : 1 } > db.person.find(); { "_id" : ObjectId("4befcda6b0e29ba1ce20e0cf"), "firstname" : "cai", "lastname" : "huafeng" } > db.person.insert({firstname: 'cai', lastname: 'huafeng'}); > db.person.find(); { "_id" : ObjectId("4befcda6b0e29ba1ce20e0cf"), "firstname" : "cai", "lastname" : "huafeng" } { "_id" : ObjectId("4befcef0b0e29ba1ce20e0d1"), "firstname" : "cai", "lastname" : "huafeng" }
如果现在直接在字段firstname及lastname上面创建唯一组合索引的时候肯定会报错,我们来试一试:
> db.person.ensureIndex({firstname: 1, lastname: 1}, {unique: true}); E11000 duplicate key error index: recommender.person.$firstname_1_lastname_1 dup key: { : "cai", : "huafeng" }
查看表person的索引,我们可以看到,新创建的索引没有生成。
> db.person.getIndexes(); [ { "name" : "_id_", "ns" : "recommender.person", "key" : { "_id" : 1 } } ]
可以在第二个json对象加入一项dropDups: true,这样在创建唯一组合索引的时候不会报错,保留文档中第一个重复的值,其它重复的值均删除。
再次测试一下,加入dropDups选项,虽然报错了,但是唯一组合索引已经建立了。
> db.person.ensureIndex({firstname: 1, lastname: 1}, {unique: true, dropDups: true}); E11000 duplicate key error index: recommender.person.$firstname_1_lastname_1 dup key: { : "cai", : "huafeng" } > db.person.getIndexes(); [ { "name" : "_id_", "ns" : "recommender.person", "key" : { "_id" : 1 } }, { "_id" : ObjectId("4befcfd9b0e29ba1ce20e0d3"), "ns" : "recommender.person", "key" : { "firstname" : 1, "lastname" : 1 }, "name" : "firstname_1_lastname_1", "unique" : true, "dropDups" : true } ]
再次查询表person中的记录,发现重复的记录已经自动删除了。
> db.person.find(); { "_id" : ObjectId("4befcda6b0e29ba1ce20e0cf"), "firstname" : "cai", "lastname" : "huafeng" }
MongoDB官方文档的说明
A unique index cannot be created on a key that has duplicate values. If you would like to create the index anyway, keeping the first document the database indexes and deleting all subsequent documents that have duplicate values, add the dropDups option.
db.things.ensureIndex({firstname : 1}, {unique : true, dropDups : true})
7.删除索引
a.删除某个表中的所有索引
To delete all indexes on the specified collection:
db.collection.dropIndexes();
b.删除某个表中的单一索引
To delete a single index:
db.collection.dropIndex({x: 1, y: -1}) > db.data.dropIndex({firstname: 1, lastname: 1}); { "nIndexesWas" : 6, "ok" : 1 }
Running directly as a command without helper:
// note: command was "deleteIndexes", not "dropIndexes", before MongoDB v1.3.2 // remove index with key pattern {y:1} from collection foo db.runCommand({dropIndexes:'foo', index : {y:1}}) // remove all indexes: db.runCommand({dropIndexes:'foo', index : '*'}) > db.person.ensureIndex({firstname: 1, lastname: 1}); > db.runCommand({dropIndexes:'person', index:{firstname:1, lastname:1}}); { "nIndexesWas" : 2, "ok" : 1 }
延伸阅读:
http://www.mongodb.org/display/DOCS/Indexes#Indexes-DocumentsasKeys
http://www.mongodb.org/display/DOCS/min+and+max+Query+Specifiers
http://www.mongodb.org/display/DOCS/Advanced+Queries
已有 0 人发表留言,猛击->> 这里<<-参与讨论
ITeye推荐