We know the initial data size (120GB) and we know the default maximum chunk size in MongoDB is 64MB. If we divide 120GB by 64MB we get 1920 chunks - that is the minimum number of chunks we should consider to begin with. As it happens, 2048 is a power of 16 divided by 2, and given that the GUID (our shard key) is hex based, that is a much easier number to deal with than 1920 (see below).
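As a quick sanity check on that arithmetic, here is the same calculation as a throwaway shell snippet:
// minimum chunk count = total data size / default max chunk size
var totalMB = 120 * 1024;          // 120GB expressed in MB
var chunkMB = 64;                  // default maximum chunk size
var minChunks = totalMB / chunkMB; // 1920
// 2048 = (16^3) / 2, i.e. three hex digits with the last one stepping by 2 (see the loop below)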
NOTE: the pre-splitting must be done before any data is added to the collection. If sharding is enabled on a collection that already contains data (via shardCollection()), MongoDB will split the data itself, and you would then be running the split commands while chunks already exist - that can lead to a very odd chunk distribution, so beware.
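A quick way to confirm the collection is still empty before splitting (assuming the users.userInfo names used below):
// should return 0 - if it does not, the pre-split approach below does not apply
db.getSiblingDB("users").userInfo.count();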
For the purposes of this answer, let's assume that the database will be called users and the collection will be called userInfo. Let's also assume that the GUID will be written into the _id field. With those parameters, we connect to a mongos and run the following commands:
// first switch to the users DB
use users;
// now enable sharding for the users DB
sh.enableSharding("users");
// enable sharding on the relevant collection
sh.shardCollection("users.userInfo", {"_id" : 1});
// finally, disable the balancer (see below for options on a per-collection basis)
// this prevents migrations from kicking off and interfering with the splits by competing for meta data locks
sh.stopBalancer();
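Optionally, confirm the balancer really is off before proceeding; sh.getBalancerState() should report false at this point:
// returns false once the balancer has been disabled
sh.getBalancerState();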
Now, per the calculations above, we need to split the GUID range into 2048 chunks. To do that we need at least 3 hex digits (16^3 = 4096), and we will put them in the most significant positions of the range (i.e. the 3 leftmost characters). Again, this should be run from a mongos shell:
// Simply use a for loop for each digit
for ( var x=0; x < 16; x++ ){
    for ( var y=0; y<16; y++ ) {
        // for the innermost loop we will increment by 2 to get 2048 total iterations
        // make this z++ for 4096 - that would give ~30MB chunks based on the original figures
        for ( var z=0; z<16; z+=2 ) {
            // now construct the GUID with zeroes for padding - handily the toString method takes an argument to specify the base
            var prefix = "" + x.toString(16) + y.toString(16) + z.toString(16) + "00000000000000000000000000000";
            // finally, use the split command to create the appropriate chunk
            db.adminCommand( { split : "users.userInfo" , middle : { _id : prefix } } );
        }
    }
}
Once that is complete, let's check on the state of play using the sh.status() helper:
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "version" : 3,
    "minCompatibleVersion" : 3,
    "currentVersion" : 4,
    "clusterId" : ObjectId("527056b8f6985e1bcce4c4cb")
  }
  shards:
    { "_id" : "shard0000", "host" : "localhost:30000" }
    { "_id" : "shard0001", "host" : "localhost:30001" }
    { "_id" : "shard0002", "host" : "localhost:30002" }
    { "_id" : "shard0003", "host" : "localhost:30003" }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "users", "partitioned" : true, "primary" : "shard0001" }
      users.userInfo
        shard key: { "_id" : 1 }
        chunks:
          shard0001  2049
        too many chunks to print, use verbose if you want to force print
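Rather than force-printing every chunk, you can also verify the per-shard counts straight from the config database; a quick sketch (assuming MongoDB versions like those shown here, where config.chunks stores the namespace in an ns field):
// count chunks per shard for the collection, run via the same mongos
db.getSiblingDB("config").chunks.aggregate([
    { $match : { ns : "users.userInfo" } },
    { $group : { _id : "$shard", count : { $sum : 1 } } }
]);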
There we have our 2048 chunks (plus one extra thanks to the min/max chunks), but they are all still on the original shard because the balancer is off. So, let's re-enable the balancer:
sh.startBalancer();
This will immediately begin balancing, and it will be relatively quick because all the chunks are empty, but it will still take a while (and much longer if it is competing with migrations from other collections). Once some time has elapsed, run sh.status() again and there you (should) have it - the 2048 chunks all nicely split across the 4 shards and ready for the initial data load:
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "version" : 3,
    "minCompatibleVersion" : 3,
    "currentVersion" : 4,
    "clusterId" : ObjectId("527056b8f6985e1bcce4c4cb")
  }
  shards:
    { "_id" : "shard0000", "host" : "localhost:30000" }
    { "_id" : "shard0001", "host" : "localhost:30001" }
    { "_id" : "shard0002", "host" : "localhost:30002" }
    { "_id" : "shard0003", "host" : "localhost:30003" }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "users", "partitioned" : true, "primary" : "shard0001" }
      users.userInfo
        shard key: { "_id" : 1 }
        chunks:
          shard0000  512
          shard0002  512
          shard0003  512
          shard0001  513
        too many chunks to print, use verbose if you want to force print
    { "_id" : "test", "partitioned" : false, "primary" : "shard0002" }
You are now ready to start loading data. But to absolutely guarantee that no splits or migrations happen until your data load is complete, there is one more thing to do: turn off the balancer and auto-splitting for the duration of the import:
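The steps themselves are implied by the reversal described in the next paragraph; they amount to the following (note that the auto-split setting is a mongos startup option, not a shell command):
// disable all balancing in the cluster
sh.stopBalancer();
// or, to leave other collections balancing, disable it for just this collection
sh.disableBalancing("users.userInfo");
// then restart each mongos with the --noAutoSplit flag to turn off auto-splitting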
Once the import is complete, reverse the steps as needed (sh.startBalancer(), sh.enableBalancing("users.userInfo"), restart the mongos without --noAutoSplit) to return everything to its default settings.
**Update: Optimizing for Speed**
The approach above is fine if you are not in a hurry. As things stand, and as you will discover if you test this, the balancer is not very fast - even with empty chunks. Hence, as you increase the number of chunks you create, the longer balancing will take. I have seen it take more than 30 minutes to finish balancing 2048 chunks, though this will vary depending on the deployment.
That might be OK for testing, or for a relatively quiet cluster, but having the balancer off and requiring that no other updates interfere will be much harder to ensure on a busy cluster. So, how do we speed things up?
The answer is to do some manual moves early, then split the chunks once they are on their respective shards. Note that this is only desirable with certain shard keys (like a randomly distributed UUID) or certain data access patterns, so be careful that you do not end up with a poor data distribution as a result.
Using the example above, we have 4 shards, so rather than doing all the splits and then balancing, we split into 4 instead. We then put one chunk on each shard by moving them manually, and finally we split those chunks into the required number.
The ranges from that example would end up looking like this:
$min --> "40000000000000000000000000000000"
"40000000000000000000000000000000" --> "80000000000000000000000000000000"
"80000000000000000000000000000000" --> "c0000000000000000000000000000000"
"c0000000000000000000000000000000" --> $max
It is only 4 commands to create these, but since we have it, why not re-use the loop above in a simplified/modified form:
for ( var x=4; x < 16; x+=4){
    var prefix = "" + x.toString(16) + "0000000000000000000000000000000";
    db.adminCommand( { split : "users.userInfo" , middle : { _id : prefix } } );
}
Here is how things look now - we have our 4 chunks, all on shard0001:
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "version" : 4,
    "minCompatibleVersion" : 4,
    "currentVersion" : 5,
    "clusterId" : ObjectId("53467e59aea36af7b82a75c1")
  }
  shards:
    { "_id" : "shard0000", "host" : "localhost:30000" }
    { "_id" : "shard0001", "host" : "localhost:30001" }
    { "_id" : "shard0002", "host" : "localhost:30002" }
    { "_id" : "shard0003", "host" : "localhost:30003" }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "test", "partitioned" : false, "primary" : "shard0001" }
    { "_id" : "users", "partitioned" : true, "primary" : "shard0001" }
      users.userInfo
        shard key: { "_id" : 1 }
        chunks:
          shard0001  4
        { "_id" : { "$minKey" : 1 } } -->> { "_id" : "40000000000000000000000000000000" } on : shard0001 Timestamp(1, 1)
        { "_id" : "40000000000000000000000000000000" } -->> { "_id" : "80000000000000000000000000000000" } on : shard0001 Timestamp(1, 3)
        { "_id" : "80000000000000000000000000000000" } -->> { "_id" : "c0000000000000000000000000000000" } on : shard0001 Timestamp(1, 5)
        { "_id" : "c0000000000000000000000000000000" } -->> { "_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 6)
We will leave the $min chunk where it is and move the other three. You could do this programmatically, but it does depend on where the chunks initially reside, how you have named your shards, and so on, so I will leave it manual for now - it is not too onerous, just 3 moveChunk commands:
mongos> sh.moveChunk("users.userInfo", {"_id" : "40000000000000000000000000000000"}, "shard0000")
{ "millis" : 1091, "ok" : 1 }
mongos> sh.moveChunk("users.userInfo", {"_id" : "80000000000000000000000000000000"}, "shard0002")
{ "millis" : 1078, "ok" : 1 }
mongos> sh.moveChunk("users.userInfo", {"_id" : "c0000000000000000000000000000000"}, "shard0003")
{ "millis" : 1083, "ok" : 1 }
Let's check again and make sure the chunks are where we expect them to be:
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "version" : 4,
    "minCompatibleVersion" : 4,
    "currentVersion" : 5,
    "clusterId" : ObjectId("53467e59aea36af7b82a75c1")
  }
  shards:
    { "_id" : "shard0000", "host" : "localhost:30000" }
    { "_id" : "shard0001", "host" : "localhost:30001" }
    { "_id" : "shard0002", "host" : "localhost:30002" }
    { "_id" : "shard0003", "host" : "localhost:30003" }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "test", "partitioned" : false, "primary" : "shard0001" }
    { "_id" : "users", "partitioned" : true, "primary" : "shard0001" }
      users.userInfo
        shard key: { "_id" : 1 }
        chunks:
          shard0001  1
          shard0000  1
          shard0002  1
          shard0003  1
        { "_id" : { "$minKey" : 1 } } -->> { "_id" : "40000000000000000000000000000000" } on : shard0001 Timestamp(4, 1)
        { "_id" : "40000000000000000000000000000000" } -->> { "_id" : "80000000000000000000000000000000" } on : shard0000 Timestamp(2, 0)
        { "_id" : "80000000000000000000000000000000" } -->> { "_id" : "c0000000000000000000000000000000" } on : shard0002 Timestamp(3, 0)
        { "_id" : "c0000000000000000000000000000000" } -->> { "_id" : { "$maxKey" : 1 } } on : shard0003 Timestamp(4, 0)
That matches the ranges we proposed above, so it all looks good. Now run the original loop from above to split the chunks "in place" on each shard, and we should end up with a balanced distribution once the loop finishes. One more sh.status() should confirm things:
mongos> for ( var x=0; x < 16; x++ ){
...     for ( var y=0; y<16; y++ ) {
...         // for the innermost loop we will increment by 2 to get 2048 total iterations
...         // make this z++ for 4096 - that would give ~30MB chunks based on the original figures
...         for ( var z=0; z<16; z+=2 ) {
...             // now construct the GUID with zeroes for padding - handily the toString method takes an argument to specify the base
...             var prefix = "" + x.toString(16) + y.toString(16) + z.toString(16) + "00000000000000000000000000000";
...             // finally, use the split command to create the appropriate chunk
...             db.adminCommand( { split : "users.userInfo" , middle : { _id : prefix } } );
...         }
...     }
... }
{ "ok" : 1 }
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "version" : 4,
    "minCompatibleVersion" : 4,
    "currentVersion" : 5,
    "clusterId" : ObjectId("53467e59aea36af7b82a75c1")
  }
  shards:
    { "_id" : "shard0000", "host" : "localhost:30000" }
    { "_id" : "shard0001", "host" : "localhost:30001" }
    { "_id" : "shard0002", "host" : "localhost:30002" }
    { "_id" : "shard0003", "host" : "localhost:30003" }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "test", "partitioned" : false, "primary" : "shard0001" }
    { "_id" : "users", "partitioned" : true, "primary" : "shard0001" }
      users.userInfo
        shard key: { "_id" : 1 }
        chunks:
          shard0001  513
          shard0000  512
          shard0002  512
          shard0003  512
        too many chunks to print, use verbose if you want to force print
And there you have it - no waiting for the balancer, the distribution is already even.