This document provides an introduction and overview of MongoDB. It begins with basic introductions to MongoDB and its terminology. It then covers MongoDB features like indexing, replication, auto-sharding, and monitoring. It also provides tips on topics like vertical scaling, building indexes, backups, and rollbacks. The document uses several examples to illustrate MongoDB concepts and operations.
MongoDB at Shanda: Use with Large Data Volumes
1. MongoDB at Shanda: Use with Large Data Volumes
Guo Lijing (郭理靖), @snda; guolijing@gmail.com
Wednesday, October 19, 2011
2. Sections
Basic introduction to MongoDB
Monitor MongoDB
Backup and Rollback of MongoDB
Case Study
3. What’s MongoDB
MongoDB (from "humongous") is a scalable, high-performance, open source, document-oriented database.
Current version: 2.0.0
6. Features
Document-oriented storage (JSON)
Full Index Support (Indexes)
Replication & High availability
Rich Document-based Queries and Updates
Map/Reduce
Auto-Sharding and GridFS
7. Terminology
RDBMS            MongoDB
Table            Collection
View/Row(s)      JSON Document
Column name      Field name
Index            Index
Join             Embedding & Linking
Partition        Shard
Partition Key    Shard Key
8. SELECT
MySQL:
  SELECT
    Dim1, Dim2,                          -- (1)
    SUM(Measure1) AS MSum,               -- (2)
    COUNT(*) AS RecordCount,             -- (2)
    AVG(Measure2) AS MAvg,               -- (3)
    MIN(Measure1) AS MMin,
    MAX(CASE                             -- (4)
      WHEN Measure2 < 100
      THEN Measure2
    END) AS MMax
  FROM DenormAggTable
  WHERE (Filter1 IN ('A','B'))           -- (5)
    AND (Filter2 = 'C')
    AND (Filter3 > 123)
  GROUP BY Dim1, Dim2                    -- (1)
  HAVING (MMin > 0)                      -- (6)
  ORDER BY RecordCount DESC              -- (7)
  LIMIT 4, 8
MongoDB:
  db.runCommand({
    mapreduce: "DenormAggCollection",
    query: {                                           // (5)
      filter1: { '$in': [ 'A', 'B' ] },
      filter2: 'C',
      filter3: { '$gt': 123 }
    },
    map: function() { emit(
      { d1: this.Dim1, d2: this.Dim2 },                // (1)
      { msum: this.measure1, recs: 1, mmin: this.measure1,
        mmax: this.measure2 < 100 ? this.measure2 : 0 } // (4)
    );},
    reduce: function(key, vals) {                      // (2)
      var ret = { msum: 0, recs: 0, mmin: 0, mmax: 0 };
      for(var i = 0; i < vals.length; i++) {
        ret.msum += vals[i].msum;
        ret.recs += vals[i].recs;
        if(vals[i].mmin < ret.mmin) ret.mmin = vals[i].mmin;
        if((vals[i].mmax < 100) && (vals[i].mmax > ret.mmax))
          ret.mmax = vals[i].mmax;
      }
      return ret;
    },
    finalize: function(key, val) {                     // (3)
      val.mavg = val.msum / val.recs;
      return val;
    },
    out: 'result1',
    verbose: true
  });
  db.result1.
    find({ mmin: { '$gt': 0 } }).                      // (6)
    sort({ recs: -1 }).                                // (7)
    skip(4).
    limit(8);
Notes:
  1 Grouped dimension columns are pulled out as keys in the map function, reducing the size of the working set.
  2 Measures must be manually aggregated.
  3 Aggregates depending on record counts must wait until finalization.
  4 Measures can use procedural logic.
  5 Filters have an ORM/ActiveRecord-looking style.
  6 Aggregate filtering must be applied to the result set, not in the map/reduce.
  7 Ascending: 1; Descending: -1
Source: Revision 4, Created 2010-03-06, Rick Osborne, rickosborne.org
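The map, reduce, and finalize stages can be tried out without a server. The following is a minimal sketch in plain JavaScript, not the MongoDB implementation: runMapReduce and the sample rows are hypothetical, emit is passed to map as an argument (the shell provides it as a global), and reduce is called for every key, even one with a single value.

```javascript
// Hypothetical in-memory helper, not a MongoDB API: groups emitted values by key,
// reduces each group, then finalizes each reduced value.
function runMapReduce(docs, map, reduce, finalize) {
  const groups = new Map(); // JSON-serialized key -> { key, vals }
  for (const doc of docs) {
    // The shell exposes a global emit(); this sketch passes it in as an argument.
    map.call(doc, (key, value) => {
      const id = JSON.stringify(key);
      if (!groups.has(id)) groups.set(id, { key, vals: [] });
      groups.get(id).vals.push(value);
    });
  }
  return [...groups.values()].map(({ key, vals }) => ({
    _id: key,
    value: finalize(key, reduce(key, vals)),
  }));
}

// Made-up sample rows.
const rows = [
  { Dim1: 'x', measure1: 10 },
  { Dim1: 'x', measure1: 30 },
  { Dim1: 'y', measure1: 5 },
];

const result = runMapReduce(
  rows,
  // map: the grouped dimension becomes the key
  function (emit) { emit({ d1: this.Dim1 }, { msum: this.measure1, recs: 1 }); },
  // reduce: measures are aggregated by hand
  (key, vals) => vals.reduce(
    (ret, v) => ({ msum: ret.msum + v.msum, recs: ret.recs + v.recs }),
    { msum: 0, recs: 0 }
  ),
  // finalize: the average depends on the record count, so it is computed last
  (key, val) => ({ ...val, mavg: val.msum / val.recs })
);

// Rough equivalent of find(...).sort({ recs: -1 }) on the output collection.
const sorted = result
  .filter((r) => r.value.msum > 0)
  .sort((a, b) => b.value.recs - a.value.recs);
console.log(JSON.stringify(sorted));
```

The filter and sort run on the finished result set, which mirrors note 6: aggregate filtering happens after the map/reduce, not inside it.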
9. Indexes
Unique indexes or indexes that allow duplicate values
Single-key index (embedded key or document key). Default index: '_id': MongoID (GUID)
Compound-key index: db.user.ensureIndex({credit : -1, name: 1})
Sparse index (a sparse index can only have one field) // after 1.75
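To see what the compound key {credit: -1, name: 1} means for ordering, here is a small sketch in plain JavaScript. It only mimics the order in which such an index keeps its entries; compareByIndex and the sample users are hypothetical and are not part of MongoDB.

```javascript
// Builds a comparator from an index spec such as { credit: -1, name: 1 }.
// 1 means ascending, -1 means descending; earlier fields take precedence.
function compareByIndex(spec) {
  const fields = Object.entries(spec); // e.g. [['credit', -1], ['name', 1]]
  return (a, b) => {
    for (const [field, dir] of fields) {
      if (a[field] < b[field]) return -dir;
      if (a[field] > b[field]) return dir;
    }
    return 0;
  };
}

// Made-up sample documents.
const users = [
  { name: 'bob', credit: 10 },
  { name: 'alice', credit: 10 },
  { name: 'carol', credit: 50 },
];

const ordered = [...users].sort(compareByIndex({ credit: -1, name: 1 }));
console.log(ordered.map((u) => u.name).join(',')); // carol,alice,bob
```

Highest credit comes first, and users with equal credit are ordered by name, which is why the field order in the index definition matters.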
10. Geo Indexes
Geospatial indexes are supported (good news for LBS)
db.c.find( {a:[50,50]} ) using index {a:'2d'}
db.c.find( {a:{$near:[50,50]}} ) using index {a:'2d'}
Results are sorted closest to farthest
db.c.find( {a:{$within:{$box:[[40,40],[60,60]]}}} ) using index {a:'2d'}
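The semantics of the two geo queries can be sketched in plain JavaScript as a linear scan. This is only an illustration under the assumption of flat (planar) distance; the near and withinBox helpers and the sample places are hypothetical, and a real 2d index avoids scanning every document.

```javascript
// Planar distance between two [x, y] points.
const distance = (p, q) => Math.hypot(p[0] - q[0], p[1] - q[1]);

// Like {a: {$near: point}}: every document, sorted closest to farthest.
function near(docs, field, point) {
  return [...docs].sort(
    (x, y) => distance(x[field], point) - distance(y[field], point)
  );
}

// Like {a: {$within: {$box: [[minX, minY], [maxX, maxY]]}}}.
function withinBox(docs, field, [[minX, minY], [maxX, maxY]]) {
  return docs.filter((d) => {
    const [x, y] = d[field];
    return x >= minX && x <= maxX && y >= minY && y <= maxY;
  });
}

// Made-up sample documents.
const places = [
  { name: 'far', a: [90, 90] },
  { name: 'exact', a: [50, 50] },
  { name: 'close', a: [52, 49] },
];

const byDistance = near(places, 'a', [50, 50]).map((p) => p.name);
const inBox = withinBox(places, 'a', [[40, 40], [60, 60]]).map((p) => p.name);
console.log(byDistance.join(','), '|', inBox.join(','));
```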
11. Atomic Operations
$set, $unset, $inc
$push - append a value to an array
$pushAll - append several values to an array
$pull - remove a value (all occurrences) from an existing array
$pullAll - remove several values from an existing array
$bit - bitwise operations
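What these operators do to a document can be illustrated with a small plain-JavaScript sketch. applyUpdate is a hypothetical helper, not a MongoDB function: it handles only top-level fields and scalar array values, skips $bit, and says nothing about atomicity or locking on the server.

```javascript
// Returns a modified copy of doc; the original is left untouched.
function applyUpdate(doc, update) {
  const out = JSON.parse(JSON.stringify(doc)); // deep copy of plain data
  for (const [k, v] of Object.entries(update.$set || {})) out[k] = v;
  for (const k of Object.keys(update.$unset || {})) delete out[k];
  for (const [k, n] of Object.entries(update.$inc || {})) out[k] = (out[k] || 0) + n;
  for (const [k, v] of Object.entries(update.$push || {})) out[k] = [...(out[k] || []), v];
  for (const [k, vs] of Object.entries(update.$pushAll || {})) out[k] = [...(out[k] || []), ...vs];
  for (const [k, v] of Object.entries(update.$pull || {})) out[k] = (out[k] || []).filter((x) => x !== v);
  for (const [k, vs] of Object.entries(update.$pullAll || {})) out[k] = (out[k] || []).filter((x) => !vs.includes(x));
  return out;
}

// Made-up sample document and update.
const before = { _id: 1, views: 1, tags: ['a', 'b'], old: ['x', 'y', 'z'], tmp: true };
const after = applyUpdate(before, {
  $set: { title: 'hi' },
  $unset: { tmp: 1 },
  $inc: { views: 2 },
  $push: { tags: 'c' },
  $pullAll: { old: ['x', 'z'] },
});
console.log(JSON.stringify(after));
```

Each field is touched by only one operator in the example, since applying two operators to the same field in one update is rejected by the server.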
12. Tips
An update requires a write lock.
The write lock is "greedy" right now.
An update on a big collection is slow and blocks other writes while it runs.
Querying the documents first and then updating them reduces the lock time.
14. Replica set
Replica sets are basically master/slave replication, adding automatic failover and automatic recovery of member nodes.
A Replica Set consists of two or more nodes that are copies of each other (i.e., replicas).
The Replica Set automatically elects a Primary (master) if there is no primary currently available.
15. Replica set
Automated fail-over
Distribute read load
Simplified maintenance
A Replica Set can have passive members that cannot become primary (for reading and backup)
A Replica Set can have delayed nodes (in case of user error)
16. step down primary
If you step down the primary manually while it is under heavy load, the replica set may fail to elect a new primary (fixed in 1.8.0).
Data may be lost when stepping down the primary manually (happens sometimes; not yet tested in 1.8.0).
Clients may not know what happened.
17. What happens when the primary is down?
The replica set will elect a new primary, but how?
What's an arbiter?
optime (operation time)
votes
priority
Data may be lost when the primary goes down
23. mongostat
inserts/s - # of inserts per second
query/s - # of queries per second
update/s - # of updates per second
delete/s - # of deletes per second
getmore/s - # of getMores (cursor batches) per second
24. mongostat
command/s - # of commands per second
flushes/s - # of fsync flushes per second
mapped - amount of data mmapped (total data size), in megabytes
vsize - virtual size of process in megabytes
res - resident size of process in megabytes
25. mongostat
faults/s - # of page faults per second (Linux only)
locked - percent of time in global write lock
idx miss - percent of btree page misses (sampled)
q t|r|w - lock queue lengths (total|read|write)
conn - number of open connections
27. MongoDB with NUMA
WARNING: You are running on a NUMA machine.
We suggest launching mongod like this to avoid performance problems:
numactl --interleave=all mongod [other options]
28. Security
Auth is supported in replica sets, master/slave setups, and auto-sharding
bind_ip
Enabling auth in a replica set may be buggy (use the HTTP interface to check the status and try restarting the mongod processes)
29. Memory
Memory-mapping
Keep indexes in memory
db.test.stats()
db.test.totalIndexSize()
RAM > indexes + hot data = better performance
30. some tips
No transactions
Fast-N-Loose by default (no safe mode, w, or getLastError)
Indexing order matters; the query optimizer helps
One write thread, many query threads
Memory-mapped data files
BSON everywhere
31. Oplog of MongoDB
Oplog in Replica set:
local.system.replset: stores the replica set config; use rs.conf() to see the details
local.oplog.rs: capped collection; use --oplogSize to set its size
local.replset.minvalid: for sync status
NO INDEX
33. Oplog of MongoDB
{ts:{}, h:{}, op:{}, ns:{}, o:{}, o2:{}}
ts: an 8-byte timestamp, made of a 4-byte Unix timestamp plus a 4-byte incrementing counter. This value is very important: in an election (for example, when the primary goes down), the member with the largest ts is chosen as the new primary.
h: a hash MongoDB uses to make sure we are reading the right flow of ops and aren't on an out-of-date "fork".
op: a 1-byte operation type; it can take several values, such as "i" for insert and "d" for delete.
ns: the namespace the operation applies to.
o: the document for the operation, that is, the content of the current operation (for an update, the fields and values to update).
o2: the query condition of an update operation; only update operations have this field.
35. Feature of Oplog
Replaying the same oplog entries is harmless
Replaying old oplog entries is harmless if the ts of the last one is newer than the current database
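Why replaying is harmless can be seen in a small simulation. The sketch below is plain JavaScript with hypothetical names (applyOplog, dump, the sample entries); ts is simplified to an integer, h is omitted, and updates are assumed to be recorded as $set of the final values, which is what makes them safe to apply twice.

```javascript
// Applies oplog-like entries to an in-memory "database": ns -> Map(_id -> doc).
function applyOplog(db, entries) {
  for (const e of entries) {
    const coll = db[e.ns] || (db[e.ns] = new Map());
    if (e.op === 'i') {
      coll.set(e.o._id, { ...e.o }); // insert; overwrites on replay
    } else if (e.op === 'u') {
      const doc = coll.get(e.o2._id); // o2 selects the document
      if (doc) Object.assign(doc, e.o.$set); // o carries the new field values
    } else if (e.op === 'd') {
      coll.delete(e.o._id);
    }
  }
  return db;
}

// Made-up entries in the {ts, op, ns, o, o2} shape.
const oplog = [
  { ts: 1, op: 'i', ns: 'test.users', o: { _id: 1, name: 'guo', credit: 0 } },
  { ts: 2, op: 'u', ns: 'test.users', o2: { _id: 1 }, o: { $set: { credit: 5 } } },
  { ts: 3, op: 'i', ns: 'test.users', o: { _id: 2, name: 'tmp' } },
  { ts: 4, op: 'd', ns: 'test.users', o: { _id: 2 } },
];

const dump = (db) => JSON.stringify([...db['test.users']]);
const once = dump(applyOplog({}, oplog));
const twice = dump(applyOplog(applyOplog({}, oplog), oplog));
console.log(once === twice); // true
```

An entry such as "increment credit by 5" would not have this property, which is why the sketch assumes the final value is stored instead.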
36. Backup
mongoexport / mongoimport
mongodump / mongorestore
Data consistency? Use --oplog
Use incremental backups
Read the oplog and replay it (Wordnik tools)
37. Rollback MongoDB
Use snapshot + oplog
Use delayed secondary + oplog
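The snapshot + oplog idea amounts to replaying only the entries recorded after the snapshot and up to the moment just before the mistake. Here is a minimal sketch in plain JavaScript for a single collection; restoreTo, the snapshot shape, and the integer ts values are assumptions made for illustration, not a MongoDB tool or format.

```javascript
// Rebuilds the collection state as of targetTs from a snapshot plus later oplog entries.
function restoreTo(snapshot, oplog, targetTs) {
  const docs = new Map(snapshot.docs.map((d) => [d._id, { ...d }]));
  for (const e of oplog) {
    // Skip what the snapshot already contains and what happened after the target.
    if (e.ts <= snapshot.ts || e.ts > targetTs) continue;
    if (e.op === 'i') {
      docs.set(e.o._id, { ...e.o });
    } else if (e.op === 'u') {
      const doc = docs.get(e.o2._id);
      if (doc) Object.assign(doc, e.o.$set);
    } else if (e.op === 'd') {
      docs.delete(e.o._id);
    }
  }
  return [...docs.values()];
}

// Made-up data: the delete at ts 13 is the user error to roll back.
const snapshot = { ts: 10, docs: [{ _id: 1, balance: 100 }] };
const entries = [
  { ts: 11, op: 'u', o2: { _id: 1 }, o: { $set: { balance: 80 } } },
  { ts: 12, op: 'i', o: { _id: 2, balance: 7 } },
  { ts: 13, op: 'd', o: { _id: 1 } },
];

const restored = restoreTo(snapshot, entries, 12);
console.log(JSON.stringify(restored));
```

A delayed secondary plays the role of the snapshot in the second approach: it is a copy that has not yet applied the bad operation.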
38. Vertical Scaling
Old server: 8 GB RAM + 200 GB disk
New server: 64 GB RAM + 2 TB disk
What can I do if my service must always stay online?
39. Vertical Scaling
You're lucky if you are using a replica set
Move everything (including the IP config) from a secondary to the new server
Step down the primary, then do the same for it
Well, problem solved
40. Build New Index
Building an index is quite easy...
Building an index on a huge collection (say 500 GB) while the database is very busy: is it easy?
41. Build New Index
Take a snapshot of the current MongoDB service
Build the new index on the snapshot
Replay the oplog of the working mongod
Replace the working mongod
42. Best practices
Replica set: one primary, one secondary, and one member for incremental backup, plus an arbiter.
43. Q&A
www.mongoic.com is online!
We’re hiring...