MongoDB at Shanda: Applications with Large Data Volumes


郭理靖 @snda; guolijing@gmail.com

Wednesday, October 19, 2011

Sections

Basic introduction to MongoDB

Monitor MongoDB

Backup and Rollback of MongoDB

Case Study


What’s MongoDB

MongoDB (from "humongous") is a
scalable, high-performance, open
source, document-oriented
database.
Current version: 2.0.0


What’s MongoDB

MongoDB = JSON + Indexes


Philosophy Of MongoDB


Features

Document-oriented storage (JSON)

Full Index Support (Indexes)

Replication & High availability

Rich Document-based Queries and Updates

Map/Reduce

Auto-Sharding and GridFS


Terminology
RDBMS            MongoDB
Table            Collection
View/Row(s)      JSON Document
Column name      Field name
Index            Index
Join             Embedding & Linking
Partition        Shard
Partition Key    Shard Key


MySQL:

    SELECT
      Dim1, Dim2,                                -- (1)
      SUM(Measure1) AS MSum,                     -- (2)
      COUNT(*) AS RecordCount,                   -- (2)
      AVG(Measure2) AS MAvg,                     -- (3)
      MIN(Measure1) AS MMin,
      MAX(CASE
        WHEN Measure2 < 100                      -- (4)
        THEN Measure2
      END) AS MMax
    FROM DenormAggTable
    WHERE (Filter1 IN ('A','B'))                 -- (5)
      AND (Filter2 = 'C')
      AND (Filter3 > 123)
    GROUP BY Dim1, Dim2                          -- (1)
    HAVING (MMin > 0)                            -- (6)
    ORDER BY RecordCount DESC                    -- (7)
    LIMIT 4, 8

MongoDB:

    db.runCommand({
      mapreduce: "DenormAggCollection",
      query: {                                             // (5)
        filter1: { '$in': [ 'A', 'B' ] },
        filter2: 'C',
        filter3: { '$gt': 123 }
      },
      map: function() { emit(
        { d1: this.Dim1, d2: this.Dim2 },                  // (1)
        { msum: this.measure1, recs: 1, mmin: this.measure1,
          mmax: this.measure2 < 100 ? this.measure2 : 0 }  // (4)
      );},
      reduce: function(key, vals) {                        // (2)
        // Seed mmin from the first value; seeding it with 0 would hide
        // every positive minimum.
        var ret = { msum: 0, recs: 0, mmin: vals[0].mmin, mmax: 0 };
        for(var i = 0; i < vals.length; i++) {
          ret.msum += vals[i].msum;
          ret.recs += vals[i].recs;
          if(vals[i].mmin < ret.mmin) ret.mmin = vals[i].mmin;
          if((vals[i].mmax < 100) && (vals[i].mmax > ret.mmax))
            ret.mmax = vals[i].mmax;
        }
        return ret;
      },
      finalize: function(key, val) {                       // (3)
        val.mavg = val.msum / val.recs;
        return val;
      },
      out: 'result1',
      verbose: true
    });
    db.result1.
      find({ mmin: { '$gt': 0 } }).                        // (6)
      sort({ recs: -1 }).                                  // (7)
      skip(4).
      limit(8);

Notes:

(1) Grouped dimension columns are pulled out as keys in the map function, reducing the size of the working set.
(2) Measures must be manually aggregated.
(3) Aggregates depending on record counts must wait until finalization.
(4) Measures can use procedural logic.
(5) Filters have an ORM/ActiveRecord-looking style.
(6) Aggregate filtering must be applied to the result set, not in the map/reduce.
(7) Ascending: 1; Descending: -1

Revision 4, Created 2010-03-06, Rick Osborne, rickosborne.org
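
To make the map, reduce and finalize phases above easier to follow, here is a minimal in-memory sketch in plain JavaScript. It is not how mongod executes the command: `runMapReduce` is a hypothetical helper, the documents are made up, and `emit` is passed to the map function as an argument (in the mongo shell it is a global).

```javascript
// Imitates the three phases on an array: map emits (key, value) pairs,
// values are grouped by key, reduce folds each group, finalize runs last.
function runMapReduce(docs, map, reduce, finalize) {
  const groups = new Map();
  for (const doc of docs) {
    // As in MongoDB, `this` inside map is the current document.
    map.call(doc, (key, value) => {
      const k = JSON.stringify(key);
      if (!groups.has(k)) groups.set(k, { key, values: [] });
      groups.get(k).values.push(value);
    });
  }
  return [...groups.values()].map(({ key, values }) => {
    // reduce is skipped for keys that received a single value
    const reduced = values.length > 1 ? reduce(key, values) : values[0];
    return { _id: key, value: finalize ? finalize(key, reduced) : reduced };
  });
}

// Made-up sample data, one dimension and one measure.
const docs = [
  { Dim1: "x", measure1: 10 },
  { Dim1: "x", measure1: 30 },
  { Dim1: "y", measure1: 5 },
];

const result = runMapReduce(
  docs,
  function (emit) { emit({ d1: this.Dim1 }, { msum: this.measure1, recs: 1 }); },
  (key, vals) => vals.reduce((a, v) => ({ msum: a.msum + v.msum, recs: a.recs + v.recs })),
  (key, val) => ({ ...val, mavg: val.msum / val.recs })
);
console.log(JSON.stringify(result));
```

The average is computed only in finalize because it depends on the final record count, which matches note 3.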


Indexes

Unique Indexes or Duplicate Values Indexes

Single-key index (embedded key, document key). Default index on '_id': MongoID (GUID)

Compound-key index: db.user.ensureIndex({credit: -1, name: 1})

Sparse index (a sparse index can only have one field) // after 1.7.5


Geo Indexes

Geospatial index supported (good news for LBS)

db.c.find( {a:[50,50]} ) using index {a:'2d'}

db.c.find( {a:{$near:[50,50]}} ) using index {a:'2d'}

Results are sorted from closest to farthest

db.c.find( {a:{$within:{$box:[[40,40],[60,60]]}}} ) using index {a:'2d'}
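
As a rough picture of what the two geo queries return, the sketch below imitates them on a plain array. It assumes flat (Euclidean) distance and made-up points, and it uses no index at all; `near` and `withinBox` are hypothetical helper names, not MongoDB functions.

```javascript
// Flat distance between two [x, y] points.
function distance(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1]);
}

// Like {$near: center}: the points ordered closest first.
function near(points, center) {
  return points.slice().sort((p, q) => distance(p, center) - distance(q, center));
}

// Like {$within: {$box: [lowerLeft, upperRight]}}: points inside the rectangle.
function withinBox(points, box) {
  const [[x1, y1], [x2, y2]] = box;
  return points.filter(([x, y]) => x >= x1 && x <= x2 && y >= y1 && y <= y2);
}

const points = [[10, 10], [55, 52], [50, 50], [90, 40]];
const byDistance = near(points, [50, 50]);
const inBox = withinBox(points, [[40, 40], [60, 60]]);
console.log(byDistance, inBox);
```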


Atomic Operations
$set $unset $inc

$push - append a value to an array

$pushAll - append several values to an array

$pull - remove matching value(s) from an existing array

$pullAll - remove several values from an existing array

$bit - bitwise operations
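
The effect of these modifiers on a single document can be sketched in plain JavaScript. This is a simplified imitation under stated assumptions, not the server's implementation: `applyUpdate` is a hypothetical helper, it handles only top-level fields and primitive array values, and it leaves out `$bit`.

```javascript
// Returns a modified copy of `doc`; the input document is left untouched.
function applyUpdate(doc, update) {
  const out = JSON.parse(JSON.stringify(doc));
  const each = (op, fn) =>
    Object.entries(update[op] || {}).forEach(([field, arg]) => fn(field, arg));

  each("$set", (f, v) => { out[f] = v; });
  each("$unset", (f) => { delete out[f]; });
  each("$inc", (f, n) => { out[f] = (out[f] || 0) + n; });
  each("$push", (f, v) => { out[f] = (out[f] || []).concat([v]); });
  each("$pushAll", (f, vs) => { out[f] = (out[f] || []).concat(vs); });
  each("$pull", (f, v) => {
    if (Array.isArray(out[f])) out[f] = out[f].filter(x => x !== v);
  });
  each("$pullAll", (f, vs) => {
    if (Array.isArray(out[f])) out[f] = out[f].filter(x => !vs.includes(x));
  });
  return out;
}

// Made-up document for illustration.
const user = { name: "guolijing", credit: 1, tags: ["a", "b", "c"] };
const updated = applyUpdate(user, { $inc: { credit: 2 }, $push: { tags: "d" } });
const pulled = applyUpdate(updated, { $pullAll: { tags: ["a", "c"] }, $unset: { name: 1 } });
console.log(updated, pulled);
```

In the shell the same modifier documents are passed as the second argument of `db.collection.update(query, modifiers)`.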


Tips

An update requires a write lock.

The write lock is “greedy” right now.

An update on a big collection is slow and blocks other writes while it runs.

Querying the documents first and then updating them reduces the time the lock is held.
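
One way to read the last tip is sketched below: fetch the matching documents first, then issue one small update per `_id`. This is an assumption about what the tip means, not the author's code. `updateAfterQuery` is a hypothetical helper, and `coll` is expected to be a shell collection such as `db.user`.

```javascript
// Query first, then update document by document.
function updateAfterQuery(coll, query, change) {
  const ids = [];
  // Read phase: fetching the documents pulls them into memory
  // before any write lock is requested.
  coll.find(query).forEach(doc => ids.push(doc._id));
  // Write phase: each update touches one document through the _id index,
  // so every individual write is short.
  ids.forEach(id => coll.update({ _id: id }, change));
  return ids.length;
}
```

A call could look like `updateAfterQuery(db.user, { status: "new" }, { $set: { status: "seen" } })`, where the collection and field names are made up. Documents that start matching between the two phases are not updated, which may or may not be acceptable for a given workload.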


Replication


Replica set

Replica sets are basically master/slave
replication, adding automatic failover and
automatic recovery of member nodes.

A Replica Set consists of two or more nodes that
are copies of each other. (i.e.: replicas)

The Replica Set automatically elects a Primary
(master) if there is no primary currently
available.


Replica set

Automated fail-over

Distribute read load

Simplified maintenance

A Replica Set can have passive members that
cannot become primary (for reading and backup)

A Replica Set can have delayed nodes (in case of
user error)


step down primary

If you step down the primary manually while it is under heavy load, the replica set may fail to elect a new primary. (Fixed in 1.8.0.)

You may lose data when you step down the primary manually. (This happens sometimes; not yet tested on 1.8.0.)

Clients may not know what happened.


What happens when the primary is down?
The replica set will elect a new primary, but how?

What is an arbiter?

operation time (optime)

votes

priority

You may lose data when the primary is down
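
The interplay of votes, arbiters, optime and priority can be sketched as follows. This is a deliberately simplified model, not MongoDB's election protocol: it only checks that the reachable members hold a strict majority of votes and then prefers the freshest optime, with priority as a tie-breaker. The member objects and both function names are hypothetical.

```javascript
// members: [{ name, up, votes, priority, arbiter, optime }]
function canElectPrimary(members) {
  const total = members.reduce((n, m) => n + m.votes, 0);
  const reachable = members.filter(m => m.up).reduce((n, m) => n + m.votes, 0);
  return reachable * 2 > total; // strict majority of all votes
}

function pickCandidate(members) {
  if (!canElectPrimary(members)) return null;
  // Arbiters vote but hold no data; priority 0 members never become primary.
  const eligible = members.filter(m => m.up && !m.arbiter && m.priority > 0);
  eligible.sort((a, b) => b.optime - a.optime || b.priority - a.priority);
  return eligible.length ? eligible[0].name : null;
}

// Old primary "a" is down; "b" is a secondary; "c" is an arbiter.
const set = [
  { name: "a", up: false, votes: 1, priority: 1, arbiter: false, optime: 120 },
  { name: "b", up: true, votes: 1, priority: 1, arbiter: false, optime: 118 },
  { name: "c", up: true, votes: 1, priority: 0, arbiter: true, optime: 0 },
];
const withArbiter = pickCandidate(set);
const withoutArbiter = pickCandidate(set.map(m => (m.arbiter ? { ...m, up: false } : m)));
console.log(withArbiter, withoutArbiter);
```

In this made-up example the arbiter supplies the second vote that lets "b" take over. The operations "a" accepted after optime 118 never reached "b", which is one way data can be lost when the primary goes down.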


Auto-Sharding


Monitoring and Diagnostics

Query Profiler

HTTP Console

mongostat

db.stats()

db.serverStatus()


Database Profiler

>db.setProfilingLevel(1,20)

{ "was" : 0, "slowms" : 100, "ok" : 1 }

> db.getProfilingStatus() // after 1.7.x

{ "was" : 1, "slowms" : 20 }


Database Profiler

db.system.profile.find()

{"ts" : "Thu Jan 29 2009 15:19:32 GMT-0500 (EST)" , "info" : "query test.$cmd ntoreturn:1
reslen:66 nscanned:0 query: { profile: 2 } nreturned:1 bytes:50" , "millis" : 0}

db.system.profile.find().sort({$natural:-1}) // to see the newest information first
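
The profiler calls above can be combined into one small helper. This is a sketch under assumptions: `recentSlowOps` is a hypothetical name, `db` is the shell's database object, and the `millis` filter simply mirrors the field visible in the sample profile entry.

```javascript
// Turn on slow-operation profiling and return the newest slow entries.
function recentSlowOps(db, slowMs, limit) {
  db.setProfilingLevel(1, slowMs); // level 1: record only slow operations
  return db.system.profile
    .find({ millis: { $gte: slowMs } })
    .sort({ $natural: -1 }) // newest first
    .limit(limit)
    .toArray();
}
```

In the shell, `recentSlowOps(db, 20, 10)` would list up to ten of the most recent operations that took 20 ms or longer.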


mongostat


mongostat

inserts/s - # of inserts per second

query/s - # of queries per second

update/s - # of updates per second

delete/s - # of deletes per second

getmore/s - # of getmores (cursor batches) per second


mongostat

command/s - # of commands per second

flushes/s - # of fsync flushes per second

mapped - amount of data mmapped (total data size) in megabytes

vsize - virtual size of process in megabytes

res - resident size of process in megabytes


mongostat

faults/s - # of page faults per second (Linux only)

locked - percent of time in global write lock

idx miss - percent of btree page misses
(sampled)

q t|r|w - lock queue lengths (total|read|write)

conn - number of open connections


Admin UIs


MongoDB with NUMA

WARNING: You are running on a NUMA machine.

We suggest launching mongod like this to avoid performance problems:
numactl --interleave=all mongod [other options]


Security

Auth is supported in replica sets, master/slave, and auto-sharding

bind_ip

Enabling auth in a replica set may hit a bug. (Use the HTTP interface to check the status and try restarting the mongod processes.)


Memory

Memory-mapping

Keep indexes in memory

db.test.stats()

db.test.totalIndexSize()

RAM > indexes + hot data = better performance
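
To check the index part of that rule of thumb, the per-collection `totalIndexSize()` values can be added up for a whole database. A minimal sketch, assuming `db` is the shell's database object; `totalIndexBytes` and `indexesFitInRam` are hypothetical helper names, and the hot data set still has to be estimated separately.

```javascript
// Sum of totalIndexSize() over every collection in the database, in bytes.
function totalIndexBytes(db) {
  return db.getCollectionNames()
    .map(name => db.getCollection(name).totalIndexSize())
    .reduce((sum, bytes) => sum + bytes, 0);
}

// True when all indexes together are smaller than the given amount of RAM.
function indexesFitInRam(db, ramBytes) {
  return totalIndexBytes(db) < ramBytes;
}
```

For example, `indexesFitInRam(db, 8 * 1024 * 1024 * 1024)` asks whether the indexes of the current database would fit in 8 GB.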


some tips

No transactions

Fast-N-Loose by default (no safe/w/GLE)

Indexing order matters; query optimizer helps

One write thread, many query threads

Memory Mapped Data Files

BSON everywhere


Oplog of MongoDB

Oplog in a replica set:

local.system.replset: stores the config of the replica set; use rs.conf() to see the details

local.oplog.rs: capped collection; use --oplogSize to set its size

local.replset.minvalid: for sync status

NO INDEX


Oplog of MongoDB

> db.test.insert({'name': 'guolijing', 'fat': 'false'})

> db.oplog.rs.find().sort({$natural:-1})

{ "ts" : { "t" : 1318772440000, "i" : 1 }, "h" :
NumberLong( "1503388658822904667" ), "op" :
"i", "ns" : "test.test", "o" : { "_id" :
ObjectId("4e9aded8bbf25c4665f212fc"),
"name" : "guolijing", "fat" : "false" } }


Oplog of MongoDB

{ts:{}, h:{}, op:{}, ns:{}, o:{}, o2:{}}

ts: an 8-byte timestamp, made of a 4-byte Unix timestamp plus a 4-byte counter.
This value is very important: in an election (for example when the primary goes down), the member with the largest ts is chosen as the new primary.

h: a hash MongoDB uses to make sure we are reading the right flow of ops and aren't on an out-of-date "fork".

op: a 1-byte operation type, for example "i" for insert, "u" for update, "d" for delete.

ns: the namespace the operation applies to.

o: the document that belongs to the operation, that is, the content of the current operation (for an update, the fields and values to change).

o2: the query condition of an update; only update operations have this field.


Oplog of MongoDB

> db.test.update({'name':'guolijing'},{$set:
{'fat':'true'}})

> db.oplog.rs.find().sort({$natural:-1})

{ "ts" : { "t" : 1318775290000, "i" : 1 }, "h" :
NumberLong( "-5204953982587889486" ),
"op" : "u", "ns" : "test.test", "o2" : { "_id" :
ObjectId("4e9aded8bbf25c4665f212fc") }, "o" :
{ "$set" : { "fat" : "true" } } }


Feature of Oplog

Replaying the same oplog entries again is harmless

Replaying old oplog entries is harmless as long as the ts of the last replayed entry is newer than the current state of the database
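
Both properties can be tried out on a toy model. The sketch below replays entries shaped like the insert and update shown earlier against an in-memory object. It is an illustration only: `applyOp` and `replay` are hypothetical helpers, only "i", "u" with `$set`, and "d" are handled, and the ObjectId is kept as a plain string.

```javascript
// store: { "<ns>": { "<_id>": document } }
function applyOp(store, op) {
  const coll = store[op.ns] || (store[op.ns] = {});
  if (op.op === "i") {
    coll[op.o._id] = { ...op.o };             // re-inserting overwrites, like an upsert
  } else if (op.op === "u") {
    const doc = coll[op.o2._id];
    if (doc) Object.assign(doc, op.o.$set);   // only $set modifiers are modelled
  } else if (op.op === "d") {
    delete coll[op.o._id];
  }
  return store;
}

function replay(store, ops) {
  ops.forEach(op => applyOp(store, op));
  return store;
}

const id = "4e9aded8bbf25c4665f212fc";
const ops = [
  { op: "i", ns: "test.test", o: { _id: id, name: "guolijing", fat: "false" } },
  { op: "u", ns: "test.test", o2: { _id: id }, o: { $set: { fat: "true" } } },
];

const once = replay({}, ops);
const twice = replay(replay({}, ops), ops);         // full replay, done twice
const partial = replay(replay({}, ops), [ops[0]]);  // replay that stops too early
console.log(JSON.stringify(once), JSON.stringify(twice), JSON.stringify(partial));
```

Replaying the full sequence twice gives the same state as replaying it once. Replaying only the old insert on top of the newer state brings back the stale value, which is why the replay has to run past the point the database has already reached.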


Back up

mongoexport mongoimport

mongodump mongorestore

Data consistency? Use --oplog

Use incremental backup

Read the oplog and replay it (Wordnik tools)
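
The reading half of an incremental backup might look like the sketch below: remember the ts of the last entry already saved and fetch everything newer from `local.oplog.rs`. This is a sketch under assumptions, not the Wordnik tooling. `opsSince` is a hypothetical helper, `localDb` is the shell handle for the `local` database, and `lastTs` is the ts value saved by the previous run.

```javascript
// Return the oplog entries written after lastTs, oldest first.
function opsSince(localDb, lastTs) {
  return localDb.getCollection("oplog.rs")
    .find({ ts: { $gt: lastTs } })
    .sort({ $natural: 1 }) // insertion order of the capped collection
    .toArray();
}
```

In the shell the first argument could be `db.getSiblingDB("local")`. Because the oplog is a capped collection with no index on ts, each call scans it, and entries older than the capped size allows are gone, so the backup has to run often enough.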


Rollback MongoDB

Use snapshot + oplog

Use delayed secondary + oplog


Vertical Scaling

Old server: 8G RAM + 200G disk

New server: 64G RAM + 2T disk

What can I do if my service must stay online the whole time?


Vertical Scaling

You're lucky if you are using a replica set

Move everything (including the IP config) from a secondary to the new server

Step down the primary, then do the same for it

Well, problem solved


Build New Index

Building an index is quite easy...

Building an index on a huge collection, such as 500GB, while the database is very busy: is that easy?


Build New Index

Take a snapshot of the current MongoDB service

Build the new index on the snapshot

Replay the oplog of the working mongod

Replace the working mongod


Best practices

Replica set: one primary, one secondary, and one member for incremental backup, plus an arbiter.


Q&A

www.mongoic.com is online !
We’re hiring...

