High-Performance Server Architecture: Design and Tuning


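千石

Agenda
•  Tuning
•  Troubleshooting
•  Design
•  Device

Device

CPU
Intel Core 2
Intel Sandy Bridge
CPU Cache
CPU Cache line
Memory Controller
UMA/NUMA
UMA
NUMA
Mix architectures
IO
buffered io
mmap
direct io
From hard disk to SSD
NIC packet reception
NIC multi-queue
Binding NIC queues to CPUs

Design

Design principle for software performance: the software must not be what bottlenecks the hardware
•  Spread load evenly across the CPU cores
•  Use memory efficiently and keep it bounded
•  Maximize disk IOPS and throughput; process asynchronously
•  Saturate a 10GbE NIC with small packets; balance interrupts

CDN cache system model
•  Net module: high concurrency, saturates a 10GbE NIC
•  ACL module: efficient matching, lower CPU cost
•  Store module: higher hit rate, efficient use of disk IOPS
•  Origin-fetch module: L7 checks, persistent connections

Net module
•  I/O model
   – epoll + O_NONBLOCK
•  TCP options
   – TCP_DEFER_ACCEPT / TCP_SYNCNT
   – TCP_CORK / TCP_NODELAY / TCP_QUICKACK
•  Sending and receiving packets
   – RSS, smp_affinity
   – SO_REUSEPORT

SO_REUSEPORT
•  Listening on a single fd causes contention on accept
•  With SO_REUSEPORT, listen on multiple fds

ACL optimization
•  squid compares a request against the http_access rules one by one, in configuration order
•  Swift builds a trie of the ACLs, keyed by domain

Perfect hashing for HTTP headers
•  Perfect hash (Perfect Hash Function, PHF)
   – Hash table: key = value
   – A PHF maps a set of keys to a set of integers without collisions
   – Looking up a key becomes indexing into an integer table
•  Where a PHF fits and what it buys
   – Suited to key sets that are fixed or rarely updated
   – Its main benefit is faster hash lookups

Store module
•  Direct IO to the raw disk, bypassing the filesystem and the page cache
•  Sequential writes, random reads; 8MB stripes, 512B blocks
•  mem_buf merges writes in memory
•  No open and close system calls

Multi-threaded task interaction model

TCMalloc for memory management
< 32KB / > 32KB
•  thread free list
•  central free list
•  central heap

Replacing the hash table with a hash trie

Lock-free LRU list

Troubleshooting

CPU analysis tools

Locating problems with /proc/$pid
•  # top -cbp $pid
•  # strace -cp $pid
•  # ps -flp $pid
•  # pstack $pid
•  # cat /proc/$pid/wchan
•  # cat /proc/$pid/status
•  # cat /proc/$pid/sched
•  # cat /proc/$pid/schedstat
•  # cat /proc/$pid/syscall
•  # cat /proc/$pid/stack

Memory analysis tools

Relationships between /proc/meminfo fields
•  MemTotal = LowTotal + HighTotal
•  MemFree = LowFree + HighFree
•  Slab = SReclaimable + SUnreclaim
•  Active = Active(anon) + Active(file)
•  Inactive = Inactive(anon) + Inactive(file)
•  AnonPages + ?X? = Active(anon) + Inactive(anon)
•  Buffers + Cached = Active(file) + Inactive(file) + ?X?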
•  AnonPages + Buffers + Cached = Active + Inactive
•  SwapTotal = SwapFree + SwapCached

Locating memory problems
•  Memory leaks
   – # env HEAPPROFILE=/home/qianshi/dev/swift.hprof ./swift -f swift.conf
   – # pprof --pdf --base=swift.hprof.0001.heap ./swift swift.hprof.0002.heap > 1-2.pdf
•  Memory corruption
   – # clang -O1 -g -fsanitize=address -fno-omit-frame-pointer example_UseAfterFree.cc

Disk analysis tools

Network analysis tools

Tuning

CPU tuning
•  CPU affinity
   – Higher cache hit rate
   – Lower memory access latency
   – taskset -c -p $pid
•  Avoid false sharing
   – Have the compiler enforce alignment
   – Pad structs so they are cache-line aligned
   – Use thread-local data

Memory tuning
•  Turn off swap
   – /proc/sys/vm/swappiness
   – swapoff -a
•  OOM handling
   – /proc/$pid/oom_adj
   – /proc/sys/vm/overcommit_memory
   – /proc/sys/vm/overcommit_ratio

Disk tuning
•  Scheduler algorithm
   – echo deadline > /sys/block//queue/scheduler
•  IO request queue
   – echo 1024 > /sys/block//queue/nr_requests

Network tuning
•  Interrupts balance
   – /proc/irq/IRQ/smp_affinity
•  Backlogs
   – net.core.netdev_max_backlog
   – net.core.somaxconn
   – net.ipv4.tcp_max_syn_backlog
•  Reduce TCP overhead
   – net.ipv4.tcp_sack
   – net.ipv4.tcp_fack
•  Reduce connection overhead
   – net.ipv4.tcp_fin_timeout
   – net.ipv4.tcp_tw_reuse
•  Enable auto-tuning
   – net.ipv4.tcp_moderate_rcvbuf
   – net.ipv4.tcp_window_scaling

The tool isn’t important – it’s important to have a way to measure everything
-- Brendan Gregg

Architects look at thousands of buildings during their training, and study critiques of those buildings written by masters. In contrast, most software developers only ever get to know a handful of large programs well, usually programs they wrote themselves, and never study the great programs of history. As a result, they repeat one another’s mistakes rather than building on one another’s successes.
-- The Architecture of Open Source Applications

Two recommended books
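
The SO_REUSEPORT slide contrasts one shared listening fd, where every worker contends on accept, with several listening fds on the same port. A minimal sketch of the second arrangement follows, assuming Linux 3.9 or later. The function name reuseport_listeners and its parameters are illustrative only and do not come from the deck.

```python
import socket
from typing import List


def reuseport_listeners(host: str, port: int, count: int,
                        backlog: int = 128) -> List[socket.socket]:
    """Open `count` listening sockets that all share one (host, port)."""
    listeners: List[socket.socket] = []
    for _ in range(count):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # The option has to be set on every socket before bind().
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        # With port 0 the first bind picks a free port; the rest reuse it.
        bind_port = listeners[0].getsockname()[1] if listeners else port
        sock.bind((host, bind_port))
        sock.listen(backlog)
        listeners.append(sock)
    return listeners
```

In a real server each worker thread or process would own one of these sockets and run its own epoll loop on it, so the kernel spreads incoming connections across the listeners instead of waking every worker on a single fd.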
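
The ACL optimization slide says the ACLs are organized into a trie keyed by domain, so a lookup no longer walks the rules in configuration order. The sketch below shows one way such a trie could look. The class DomainTrie, the walk from the top-level label inward, and the "longest suffix wins" rule are assumptions made for illustration, not details taken from the deck.

```python
from typing import Dict, Optional


class DomainTrie:
    """Trie keyed on domain labels, walked from the top-level label inward."""

    def __init__(self) -> None:
        self.children: Dict[str, "DomainTrie"] = {}
        self.action: Optional[str] = None  # ACL verdict stored at this node, if any

    def insert(self, domain: str, action: str) -> None:
        node = self
        for label in reversed(domain.lower().split(".")):
            node = node.children.setdefault(label, DomainTrie())
        node.action = action

    def lookup(self, host: str) -> Optional[str]:
        """Return the action of the longest matching domain suffix, or None."""
        node = self
        best = None
        for label in reversed(host.lower().split(".")):
            node = node.children.get(label)
            if node is None:
                break
            if node.action is not None:
                best = node.action
        return best
```

The cost of a lookup depends on the number of labels in the host name rather than on the number of configured rules, which is where the CPU saving over a sequential comparison would come from.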
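
The perfect hash slide describes a PHF as a collision-free mapping from a fixed key set to integers, so that a lookup becomes an index into a table. As a rough illustration, the sketch below finds such a mapping by brute force: it tries seeds until every key lands in its own slot. The helper names (seeded_hash, build_phf, phf_index) and the choice of a salted BLAKE2b hash are assumptions for this sketch. Production code would more likely use a generated function, for example from a tool such as gperf.

```python
import hashlib
from typing import List, Optional, Sequence, Tuple


def seeded_hash(key: str, seed: int) -> int:
    """64-bit hash of key; the seed is passed to BLAKE2b as the salt."""
    digest = hashlib.blake2b(key.encode("ascii"), digest_size=8,
                             salt=seed.to_bytes(8, "little")).digest()
    return int.from_bytes(digest, "little")


def build_phf(keys: Sequence[str], max_seed: int = 1_000_000) -> Tuple[int, List[str]]:
    """Search for a seed that maps each key to a distinct slot in len(keys) slots."""
    n = len(keys)
    for seed in range(max_seed):
        slots: List[Optional[str]] = [None] * n
        for key in keys:
            i = seeded_hash(key, seed) % n
            if slots[i] is not None:
                break  # collision, try the next seed
            slots[i] = key
        else:
            return seed, slots  # every key found an empty slot
    raise ValueError("no collision-free seed found")


def phf_index(key: str, seed: int, slots: Sequence[str]) -> int:
    """Return the slot index of key, or -1 if key is not in the fixed set."""
    i = seeded_hash(key, seed) % len(slots)
    return i if slots[i] == key else -1
```

The search runs once, when the key set is built; each lookup afterwards is one hash and one comparison. This matches the slide's point that a PHF suits key sets that are fixed or rarely updated, since any change to the keys means searching for a new seed.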
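
The /proc/meminfo relationships listed in the troubleshooting part can be checked mechanically. The sketch below parses meminfo text and compares each total with the sum of its parts, for the three relations that involve only named fields. The function names and the RELATIONS table are illustrative, and the kernel's own field name SUnreclaim is used for unreclaimable slab.

```python
from typing import Dict, List, Tuple

# Relations that involve only named fields; values are in kB.
RELATIONS: List[Tuple[str, List[str]]] = [
    ("Slab", ["SReclaimable", "SUnreclaim"]),
    ("Active", ["Active(anon)", "Active(file)"]),
    ("Inactive", ["Inactive(anon)", "Inactive(file)"]),
]


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse 'Name:   123 kB' lines into {name: value}."""
    fields: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            fields[name.strip()] = int(rest.split()[0])
    return fields


def check_relations(fields: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Return {total: (reported, sum_of_parts)} for relations whose fields all exist."""
    result: Dict[str, Tuple[int, int]] = {}
    for total, parts in RELATIONS:
        if total in fields and all(p in fields for p in parts):
            result[total] = (fields[total], sum(fields[p] for p in parts))
    return result
```

On a Linux host the input would be the contents of /proc/meminfo, for example check_relations(parse_meminfo(open("/proc/meminfo").read())). Small differences between the two numbers of a pair are possible because the kernel does not take the counters as one atomic snapshot.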