• 1. Virtualization Best Practices and Planning
• 2. Agenda: Application Scope Considerations, Server Procurement Considerations, VM Deployment Considerations, Management and Maintenance Considerations
• 3. Agenda: Application Scope Considerations, Server Procurement Considerations, VM Deployment Considerations, Management and Maintenance Considerations
• 4. Application Scope: General Principles
  Applications not suited to virtualization:
  - Applications requiring special hardware access: high-performance graphics cards (not suitable); special serial/parallel encryption devices (not suitable)
  - Applications requiring USB-attached devices: possibly unsuitable; external networked USB devices may substitute, but test first
  - Applications that still run at very high load even on a high-end server: possibly unsuitable; analyze the current server configuration
  Applications that can be virtualized:
  - Everything outside the categories above
  - Sequence the work by P2V migration complexity: migrate the easy cases first (e.g., those the Converter tool handles directly); for hard or impossible P2V cases, consider a fresh install followed by migration
  Decide whether to virtualize based on management needs:
  - How much the transition disrupts existing business
  - How much virtualization changes existing management practices
  - How hard cross-department coordination will be
• 5. Deployment Models for Virtualization Host Servers (diagram): scale-up versus scale-out deployment, and the "quantization" model of different resource pools — one large physical host running many VM/OS/App stacks on a single hypervisor, versus several smaller physical hosts each running a few.
• 6. Resource Pool Types
  Scale-up host model:
  - Provides a larger contiguous resource space
  - More easily fits workloads of different shapes
  - Can deliver higher resource utilization
  Scale-out clustered host model:
  - Behaves more like a collection of small pools (many small-capacity resource pools)
  - Requires more monitoring and management
  - Modularity has both advantages and disadvantages
• 7. Parameters Affecting Virtualization Deployment (diagram): difficulty increases along two axes — functional diversity and server criticality — from unimportant standalone servers (independent, local storage) through important clustered, multi-host systems, up to server farms (scale-out server clusters, utility servers; back-office, local applications). Overlaid on these are workload constraints, technical constraints, and business constraints.
• 8. Workload Constraints
  Each resource group must be considered separately:
  - CPU utilization
  - Disk I/O
  - Network I/O
  - Memory utilization
  The overhead added by virtualization also deserves some allowance; for example:
  - Disk and network I/O add CPU load
  - iSCSI storage access also adds CPU load
  Periodic operational load swings must be factored in:
  - Month-end load changes
  - Year-end load changes
• 9. Technical Constraints
  Technical constraints usually mean:
  - Compatibility (of the system/application)
  - Affinity (e.g., the system is part of a logical group)
  In most environments these constraints include:
  - Network connectivity (down to the subnet level)
  - Interconnections between applications
  - The storage technologies in use
  - The hardware and peripherals in use
  - Software support and certification
  These constraints differ depending on whether virtualization is implemented above or below the kernel (shared versus separate OS image models)
• 10. Business and Process Constraints
  Easily overlooked when the environment is small or centralized:
  - They can be ignored in a lab test environment, but must be considered in production
  Common business and process constraints in virtualization include:
  - Maintenance windows and change freezes
  - Geographic location and other physical limits
  - Operational environment, security zones, and application-tier placement
  - Business organizations, departments, and customers
  - Regulatory and policy considerations and restrictions
  Ignoring these constraints can lead to unpredictable results
  Analyze each case on its own and plan according to the objectives
• 11. Agenda: Application Scope Considerations, Server Procurement Considerations, VM Deployment Considerations, Management and Maintenance Considerations
• 12. Hardware Used for Virtualization Must Meet the Compatibility Lists
  All equipment used to implement a VMware VI3 virtual infrastructure solution — server systems, storage systems, I/O cards, and so on — must meet the requirements of the VMware VI3 product compatibility lists; the latest lists are available at: http://www.vmware.com/resources/techresources/cat/119
  - Server systems: "HCL: Systems Compatibility Guide For ESX Server 3.5 and ESX Server 3i"
  - Storage systems: "HCL: Storage / SAN Compatibility Guide For ESX Server 3.5 and ESX Server 3i"
  - I/O cards, including NICs, FC HBAs, and iSCSI HBAs: "HCL: I/O Compatibility Guide For ESX Server 3.5 and ESX Server 3i"
• 13. ESX Server Hardware Configuration — CPUs
  - ESX schedules CPU cycles to satisfy the processing demands of virtual machines and the Service Console
  - The more target CPUs available, the better ESX's scheduling works (servers with more than 8 CPU cores give the best results)
  - Hyper-threading does not deliver benefits equivalent to additional cores; disabling CPU hyper-threading (where present) is recommended
  - CPUs with EM64T and Intel VT, or with AMD-V, can run 32-bit and 64-bit virtual machines side by side
  - Clusters built from servers with CPUs of the same vendor, product family, and generation get the best VMotion compatibility
  - ESX's Enhanced VMotion Compatibility (EVC) extends the original VMotion compatibility range — see "Alleviating Constraints with Resource Pools, Live Migration with Enhanced VMotion"
  — See "Best Practices for Successful VI Design"
• 14. ESX Server Hardware Configuration — Memory
  - Memory is more often the potential bottleneck than CPU
  - At times the virtual environment's memory use can exceed physical memory:
    - Host swap file (use as little as possible for best performance)
    - Transparent Page Sharing (multiple VMs share identical memory pages)
  - Watch for server-specific memory configuration requirements: DIMM sizes, bank pairing, parity, upgrade considerations (mix and match, or forklift replacement)
  - Configure servers toward maximum memory, using the largest-capacity DIMMs (especially when not all slots are populated)
  — See "Best Practices for Successful VI Design"
• 15. ESX Server Hardware Configuration — Network (diagram): the basic network building blocks of the virtual infrastructure are three kinds of port groups — Service Console (the management virtual machine), VMkernel (VMotion, iSCSI, NFS), and virtual machine connectivity. — See "Best Practices for Successful VI Design"
• 16. ESX Server Hardware Configuration — Network: Virtual Switches and Port Groups
  - Configure at least one virtual switch; two may do for a test environment, but a production environment should have at least three
  - A virtual switch can carry all three port group types (Service Console, VMkernel, VM) at once
  - Prefer giving the Service Console, VMkernel, and VM port groups each their own virtual switch
  - VLANs can be used to separate the port groups
  - For server clusters using VMotion and DRS, network configurations must match across hosts (keep virtual switch counts and network label names consistent)
  - Give the ESX Service Console a fixed IP, with correct speed and duplex settings (a configuration sketch follows below)
  — See "Best Practices for Successful VI Design"
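The deck itself contains no commands, so here is a minimal sketch of the vSwitch/port-group layout in pyVmomi, under assumed names (host esx01.example.com, credentials, vSwitch2, VLAN 30). On the ESX 3.x hosts this deck targets, the same result came from esxcfg-vswitch or the VI Perl Toolkit rather than this API.

```python
# Minimal pyVmomi sketch (hypothetical host and credentials): create a vSwitch
# dedicated to VM traffic and add a VLAN-tagged VM port group to it.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="esx01.example.com", user="root", pwd="secret",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
host = content.searchIndex.FindByDnsName(None, "esx01.example.com", False)
netsys = host.configManager.networkSystem

# One vSwitch for VM traffic; physical uplinks can be attached later.
netsys.AddVirtualSwitch(vswitchName="vSwitch2",
                        spec=vim.host.VirtualSwitch.Specification(numPorts=64))

# A VM port group isolated on its own VLAN, as recommended above.
pg = vim.host.PortGroup.Specification(name="VM Network VLAN30", vlanId=30,
                                      vswitchName="vSwitch2",
                                      policy=vim.host.NetworkPolicy())
netsys.AddPortGroup(portgrp=pg)
Disconnect(si)
```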
• 17. ESX Server Hardware Configuration — Basic Network Components
  ESX servers, virtual switches, physical NICs:
  - For redundancy, assign at least two physical NIC ports to each virtual switch
  - The number of physical NIC ports per ESX server depends on how many virtual switches it will carry
  - If the three port group types (SC, VMkernel, VM) sit on separate virtual switches, a production host should have at least six physical NIC ports
  - Assigning extra physical NIC ports to the virtual switch carrying VM port groups adds load-balancing benefits
  — See "Best Practices for Successful VI Design"
• 18. ESX Server Hardware Configuration — Connecting to the Physical Network
  Physical NIC ports and physical switches:
  - Different physical NIC ports on the same virtual switch should connect to different physical switches
  - Connect the physical NIC ports used by the VMotion port groups of every server in a cluster to the same set of physical switches (while still following the first rule above)
  — See "Best Practices for Successful VI Design"
• 19. Example 1: Blade Server with 2 NIC Ports
  Candidate design:
  - Team both NIC ports; create one virtual switch
  - Create three port groups, using an Active/Standby policy for each:
    - Portgroup1: Service Console (SC), VLAN 10
    - Portgroup2: VMotion, VLAN 20
    - Portgroup3: VM traffic, VLAN 30
  - Use VLAN trunking: trunk VLANs 10, 20, 30 on each uplink (vmnic0, vmnic1)
  (A sketch of the per-portgroup failover order follows below.)
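As a companion to Example 1, this hedged pyVmomi sketch expresses the per-portgroup Active/Standby policy (vmnic0 active, vmnic1 standby for the Service Console port group); the port-group and NIC names merely mirror the diagram.

```python
# Hypothetical sketch: explicit failover order implements Active/Standby.
from pyVmomi import vim

teaming = vim.host.NetworkPolicy.NicTeamingPolicy(
    policy="failover_explicit",                      # no load balancing; use NIC order
    nicOrder=vim.host.NetworkPolicy.NicOrderPolicy(
        activeNic=["vmnic0"], standbyNic=["vmnic1"]))

sc_pg = vim.host.PortGroup.Specification(
    name="Service Console", vlanId=10, vswitchName="vSwitch0",
    policy=vim.host.NetworkPolicy(nicTeaming=teaming))
# netsys.AddPortGroup(portgrp=sc_pg)   # 'netsys' as in the earlier sketch
```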
• 20. Example 2: Server with 4 NIC Ports
  Candidate design:
  - Create two virtual switches; team two NICs to each vSwitch
  - vSwitch0 (use active/standby for each portgroup):
    - Portgroup1: Service Console (SC), VLAN 10
    - Portgroup2: VMotion, VLAN 20
  - vSwitch1 (use Originating Virtual PortID):
    - Portgroup3: VM traffic #1, VLAN 30
    - Portgroup4: VM traffic #2, VLAN 40
  - Use VLAN trunking: vmnic1 and vmnic3 trunk VLANs 10, 20; vmnic0 and vmnic2 trunk VLANs 30, 40
• 21. Example 3: Server with 4 NIC Ports (Slight Variation)
  Candidate design:
  - Create one virtual switch with two NIC teams
  - Use active/standby for portgroups 1 & 2:
    - Portgroup1: Service Console (SC), VLAN 10
    - Portgroup2: VMotion, VLAN 20
  - Use Originating Virtual PortID for portgroups 3 & 4:
    - Portgroup3: VM traffic #1, VLAN 30
    - Portgroup4: VM traffic #2, VLAN 40
  - Use VLAN trunking: vmnic1 and vmnic3 trunk VLANs 10, 20; vmnic0 and vmnic2 trunk VLANs 30, 40
• 22. Servers with More NIC Ports
  More than 4 NIC ports — design considerations:
  - With trunks (VLAN tagging): use the previous approach and scale up to meet additional bandwidth and redundancy requirements; add NICs to the NIC team supporting VM traffic
  - VLAN tagging is always recommended, but if NICs are available, options include:
    - A dedicated NIC for VMotion (at least one)
    - Dedicated NICs for IP storage (NFS and/or iSCSI) — usually two teamed NICs (consider IP-hash and EtherChannel if there are multiple destinations and Multi-Chassis EtherChannel is employed on the physical switches)
    - Dedicated NIC(s) for the Service Console — at least two for availability
  - Note: it is easy to consume many physical NICs and switch ports when not using VLAN tagging
• 23. ESX Server Hardware Configuration — Storage
  - Store virtual machine files on external shared disk arrays whenever possible
  - ESX internal disks should be fully redundant; RAID 1 is recommended
  - ESX's own partitioning at install time: the installer's automatic partitioning is not recommended, because /, /var, and /home land in the same partition, and when / (root) fills up the ESX server runs into serious trouble. Recommended layout:
    - /boot: 50 to 100 MB (primary partition)
    - /: 8.0 to 18 GB (primary partition)
    - swap: 2x the Service Console memory; a fixed 1.6 GB is recommended
    - /var: 4 GB or larger
  - 18 GB is a sufficient total for the ESX server software itself
  - Allow space for locally stored ISOs and other files
• 24. How Storage Is Presented to Virtual Machines
  - Virtual disks are presented to the VM through a SCSI controller, which appears as a BusLogic or LSI Logic disk controller
  - One VM can have 1 to 4 virtual LSI Logic or BusLogic SCSI adapters
  - Each SCSI adapter carries 1 to 15 virtual SCSI storage devices
  - Virtual disks reside in datastores, which can be formatted as VMFS, NFS, or raw disk, reached over FC, iSCSI, or NAS
  - The file system type is determined by the underlying physical disk storage
• 25. Volumes, Datastores, and LUNs
  - A LUN is a logical space; it can be created from the storage array's entire capacity or from part of it (e.g., LUN 10 of 20 GB in the diagram)
  - Once a LUN is mapped to ESX it becomes a volume; once the volume is formatted with a file system it becomes a datastore
  - Do not mix different file system types within the same LUN
  - Each LUN corresponds to one VMFS volume
• 26. Virtual Machine Contents Live in Datastores
  ESX datastore types:
  - VMware File System (VMFS)
  - RDM backed by VMFS
  - Network File System (NFS) volumes
  Notes:
  - A datastore carries a file system format and can be operated on like files
  - Each system supports up to 256 VMFS datastores and 8 NFS datastores
  - Datastores also hold ISO images, VM templates, and floppy images
• 27. Virtual Machine Contents (one folder/subdirectory per VM)
  File name — contents:
  - .vmx — VM configuration
  - .vmdk — virtual disk (descriptor)
  - -flat.vmdk — preallocated virtual disk file (contains the data)
  - .vswp — swap file
  - nvram — non-volatile RAM file
  - .vmem — VM memory
  - .vmss — VM suspend file
  - .vmsd — snapshot data
  - -Snapshot.vmsn — snapshot state file
  - vmware-0.log, vmware-1.log, etc. — log files
• 28. Datastore Types (diagram): two ESX hosts and their VMs, with VM contents held on VMFS datastores reached over local SCSI, a Fibre Channel SAN array (through FC switches), or an iSCSI SAN array (through IP switches), and on an NFS datastore served by a NAS array over the IP network.
• 29. Recommended ESX Server Configurations — New Purchases
  To exploit virtualization fully and make the most of each server, virtualization hosts should meet or exceed the following, by CPU socket count:
  - 2-socket: CPU (2 GHz+ recommended) 2 sockets of quad-core; memory 16 GB+
  - 4-socket: CPU 4 sockets of dual- or quad-core; memory 32 GB+
  - 8-socket: CPU 4 sockets of dual- or quad-core or more; memory 64 GB+
  Common to all classes:
  - GigE ports: 4+/6+ with no external storage; 4+/6+ with FC storage; 6+/8+ with IP storage
  - FC HBA ports (4 Gb or 8 Gb products recommended): 2
  - Internal disks (when an external disk array is used): 2
  - Power: dual redundant
  On price/performance and availability grounds, deploying virtualization on single-socket servers is not recommended
• 30. Recommended Virtualization Host Configurations — Existing Servers
  For the 4-socket servers common in the industry today:
  - 4-socket single-core: weak compute capacity; keep to 10 or fewer VMs; 12-16 GB of memory recommended
  - 4-socket dual-core: moderate compute capacity; roughly 10-15 VMs; 16-24 GB of memory recommended
  - 4-socket quad-core: strong compute capacity; roughly 15-30 VMs; 24-32 GB of memory recommended
• 31. Recommended VirtualCenter Server Configuration
  - Processor: 2.0 GHz or faster Intel or AMD x86; VC supports multiprocessing, up to 2 CPUs
  - Memory: 2 GB minimum; if the database and VC share one machine, increase to 4 GB
  - Disk space: 560 MB minimum, 2 GB recommended
  - NIC: Gigabit recommended
  Minimum hardware (single 2 GHz CPU, 2 GB RAM, GigE port): supports about 20 concurrent connections, 50 managed physical hosts, around 1,000 VMs
  Recommended (dual CPU, 4 GB RAM, GigE port): supports about 50 concurrent connections, 200 managed physical hosts, around 2,000 VMs
• 32. Agenda: Application Scope Considerations, Server Procurement Considerations, VM Deployment Considerations, Management and Maintenance Considerations
• 33. Planning VM Counts
  What determines how many VMs a single server can host:
  Server hardware configuration:
  - CPU performance — multi-core, high-frequency parts make the CPU an ever less likely bottleneck
  - Memory size — the hard limit; the more memory, the more VMs
  - Network ports — gigabit networks are now common, so bandwidth is usually assured; this is more a management consideration
  - HBAs — disk access performance has some influence on VM count; 4 Gb or 8 Gb HBAs are recommended to keep the link from being the limit
  - Local disks — internal disks are weak in both availability and I/O throughput; do not store VMs on them; use external high-performance disk arrays
  Application load:
  - A physical server's resources are finite, so the heavier the application loads, the fewer VMs can run concurrently
  - Mix applications with different access patterns on the same physical server
  - Used flexibly, DRS and VMotion can tune the physical-to-virtual ratio to its optimum
  - Allowing for the resource headroom HA and DRS require, keeping total utilization below two-thirds at normal load works well (see the worked example below)
  Rules of thumb: about 10 VMs on a 2-socket quad-core host, 15-30 VMs on a 4-socket quad-core host (reference only)
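The two-thirds headroom rule reduces to simple arithmetic. A rough memory-based sizing check, with assumed numbers (32 GB host, 2 GB average VM, 2 GB set aside for the Service Console and per-VM overhead):

```python
# Rough VM-count estimate from memory, the usual hard limit named above.
host_memory_gb = 32
esx_overhead_gb = 2          # Service Console + per-VM overhead (assumed)
avg_vm_memory_gb = 2         # average VM size (assumed)
target_utilization = 2 / 3   # leave headroom for HA/DRS failover

usable_gb = (host_memory_gb - esx_overhead_gb) * target_utilization
print(int(usable_gb // avg_vm_memory_gb), "VMs")   # -> 10 VMs
```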
• 34. Allocating VM Resources — CPU and Memory
  CPU allocation principles:
  - Use as few vCPUs as possible; if the application is single-threaded and cannot use multiple threads, do not use virtual SMP
  - Do not give a VM as many or more vCPUs than the host has physical cores; e.g., a VM on a 2-socket dual-core server should use at most two vCPUs
  - Remember when sizing VMs that ESX itself carries some overhead; do not oversubscribe the combined VM utilization and total vCPU count
  - Watch the "idle loop spin" parameter: some guest operating systems do not truly release the virtual CPU when idle
  - Make sure single-processor VMs use the UP HAL/kernel; multiprocessor VMs must be set to the SMP HAL/kernel
  Memory allocation principles:
  - Total memory should be the sum of the physical memory the resource assessment found the VMs actually need; additional application demand beyond that can be absorbed by ESX's disk-backed memory (swap)
  - For critical applications, consider fixing (reserving) memory to keep performance stable (a sketch follows below)
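For the "fixed memory for critical applications" point, a hedged pyVmomi sketch: reserving a VM's full configured memory so it never depends on host swap. The 4096 MB figure and the `vm` object are assumptions.

```python
# Hypothetical sketch: pin a critical VM's memory with a full reservation.
from pyVmomi import vim

spec = vim.vm.ConfigSpec(
    memoryAllocation=vim.ResourceAllocationInfo(reservation=4096))  # MB, assumed
# task = vm.ReconfigVM_Task(spec=spec)   # 'vm' looked up as in earlier sketches
```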
• 35. DRS Best Practices: Hardware Configuration
  - Ensure hosts are CPU compatible: Intel vs. AMD; similar CPU family/SSE3 status; Enhanced VMotion Compatibility (EVC); see the "VMware VMotion and CPU Compatibility" whitepaper
  - CPU incompatibility => limited DRS VM migration options
  - Larger host CPU and memory sizes are preferred for VM placement (all else being equal)
  - Differences in cache or memory architecture => inconsistency in performance
• 36. DRS Best Practices: Cluster Configuration
  - A higher number of hosts => more DRS balancing options; up to 32 hosts per cluster recommended, though this may vary with the VC server configuration and the VM/host ratio
  - Network configuration on all hosts: VMotion network (security policies, VMotion nic enabled, GigE network, etc.); Virtual Machine network present on all hosts
  - VM datastores shared across all hosts
  - Avoid VM floppy/CD devices connected to a host device (they restrict migration)
• 37. DRS Best Practices: VM Resource Settings
  - Reservations, limits, and shares: shares take effect during resource contention; low limits can lead to wasted resources; high VM reservations may limit DRS balancing; remember overhead memory
  - Use resource pools (RP) for better manageability
  - Virtual CPUs and memory size: large memory sizes and many virtual CPUs => fewer migration opportunities; configure VMs based on need
• 38. DRS Best Practices: Algorithm Settings
  - Aggressiveness threshold: the moderate (default) threshold works well for most cases; aggressive thresholds are recommended when clusters are homogeneous, VM demand is relatively constant, and there are few affinity/anti-affinity rules
  - Use affinity/anti-affinity rules only when needed: affinity rules for closely interacting VMs; anti-affinity rules for I/O-intensive workloads and availability (see the sketch below)
  - Automatic DRS mode is recommended (cluster-wide); use manual/partially automatic mode for location-critical VMs (per VM); the per-VM setting overrides the cluster-wide setting
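A hedged pyVmomi sketch of the anti-affinity advice above — keeping two I/O-intensive VMs on different hosts. The `cluster`, `vm_a`, and `vm_b` objects are assumed to have been looked up already.

```python
# Hypothetical sketch: add a DRS anti-affinity rule to a cluster.
from pyVmomi import vim

rule = vim.cluster.AntiAffinityRuleSpec(name="separate-io-heavy",
                                        enabled=True, vm=[vm_a, vm_b])
spec = vim.cluster.ConfigSpecEx(
    rulesSpec=[vim.cluster.RuleSpec(operation="add", info=rule)])
task = cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
```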
• 39. HA Best Practices — Setup & Networking
  - Proper DNS and network settings are needed for initial configuration; after configuration, DNS resolutions are cached to /etc/FT_HOSTS (minimizing the dependency on DNS server availability during an actual failover)
  - DNS on each host is preferred (manual editing of /etc/hosts is error prone)
  - Redundancy for ESX Service Console networking is essential (several options); choose the option that minimizes single points of failure
  - Gateways/isolation addresses should respond via ICMP (ping)
  - Enable PortFast (or equivalent) on network switches to avoid spanning-tree-related isolations
  - Network maintenance activities should take into account dependencies on the ESX Service Console network(s); VMware HA can be temporarily disabled through the Cluster -> Edit Settings dialog
  - Valid VM network label names are required for proper failover: virtual machines use them to re-establish network connectivity upon restart
• 40. HA Network Configuration
  Network redundancy between the ESX service consoles is essential for reliable detection of host failures and isolation conditions. A single service console network with underlying redundancy is usually sufficient:
  - Use a team of 2 NICs connected to different physical switches to avoid a single point of failure
  - Configure the vNICs in the vSwitch for an Active/Standby configuration (rolling failover = "yes", default load balancing = route based on originating port ID)
  - Consider extending timeout values and adding multiple isolation addresses (see appendix); timeouts of 30-60 seconds will slightly extend recovery times, but will also ride out intermittent network outages
• 41. HA Network Configuration (Continued)
  Beyond NIC teaming, a secondary service console network can be configured to provide redundant heartbeating and isolation detection:
  - HA will detect and use a secondary service console network
  - Adding a secondary service console portgroup to an existing VMotion vSwitch avoids having to dedicate an additional subnet and NIC for this purpose
  - Also specify an additional isolation address for the cluster to account for the added redundancy (see appendix; a sketch follows below)
  - Continue using the primary service console network and IP address for management purposes
  - Be careful with network maintenance that affects both the primary service console network and the secondary/VMotion network
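A hedged pyVmomi sketch of adding the extra isolation address mentioned above, via the das.isolationaddress advanced option. The `cluster` object and the address are assumptions.

```python
# Hypothetical sketch: give HA a second isolation address to probe.
from pyVmomi import vim

das = vim.cluster.DasConfigInfo(option=[
    vim.option.OptionValue(key="das.isolationaddress2", value="192.168.20.254")])
task = cluster.ReconfigureComputeResource_Task(
    spec=vim.cluster.ConfigSpecEx(dasConfig=das), modify=True)
```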
• 42. HA Best Practices — Resource Management
  - Larger groups of homogeneous servers allow higher average utilization across an HA/DRS-enabled cluster; more nodes per cluster (current maximum is 16) can tolerate multiple host failures while still guaranteeing failover capacity
  - Admission control heuristics are conservatively weighted (so that large servers with many VMs can fail over to small servers)
  - To define the sizing estimates used for admission control, set reasonable reservations as the minimum resources each VM needs; admission control will exceed failover capacities when reservations are not set; otherwise HA uses the largest reservation specified as the "slot" size
  - At a minimum, set reservations for a few virtual machines considered "average"
  - Admission control may be too conservative when host and VM sizes vary widely (see the worked example below); in that case, do your own capacity planning by choosing "Allow virtual machines to be powered on even if they violate availability constraints" — HA will still try to restart as many virtual machines as it can
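The slot-size caveat is easiest to see with numbers. A worked example under assumed figures: one 8 GB reservation drags the computed capacity of a 4-host, 32 GB-per-host cluster down from 128 "typical" slots to 16.

```python
# HA sizes the "slot" from the largest reservation, so one oversized VM
# shrinks the whole cluster's computed failover capacity (figures assumed).
host_mb = 4 * 32768                 # four hosts, 32 GB each
largest_reservation_mb = 8192       # one VM reserved at 8 GB -> slot size
typical_reservation_mb = 1024       # what most VMs actually need

slots_conservative = host_mb // largest_reservation_mb   # 16 slots
slots_typical = host_mb // typical_reservation_mb        # 128 slots
print(slots_conservative, slots_typical)
```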
• 43. Agenda: Application Scope Considerations, Server Procurement Considerations, VM Deployment Considerations, Management and Maintenance Considerations
• 44. Impact of VirtualCenter Downtime
  Component — impact experienced:
  - Virtual machines — unaffected; management requires direct connections to the ESX Servers
  - ESX Servers — unaffected; management requires direct connections to the ESX Servers
  - Performance & monitoring statistics — historical records will have gaps during outages; still available via the ESX Servers
  - VMotion — unavailable
  - VMware DRS — unavailable
  - VMware HA — agents unaffected and still provide failover functionality; admission control unavailable
  — See "Bulletproof VirtualCenter - A Guide to Protecting VirtualCenter"
• 45. VirtualCenter Components: VirtualCenter Server, Web Access, License Server, AD Domain Controller, DNS Server, Database Server — See "Bulletproof VirtualCenter - A Guide to Protecting VirtualCenter"
• 46. VirtualCenter — Recommended Collocation
  - Collocating the VirtualCenter components (VirtualCenter Server, Web Access, License Server) on one server, physical or virtual, is desirable for most environments
  - The focus of this session is on protecting these components; industry-standard solutions are assumed for the other components (AD Domain Controller, DNS Server, Database Server)
  — See "Bulletproof VirtualCenter - A Guide to Protecting VirtualCenter"
• 47. VirtualCenter Components (Additional Details)
  - VirtualCenter Service: almost stateless; information about the inventory is stored in the database; some state files are stored locally on the VirtualCenter server
  - Web Access: no state information
  - License Server: license file stored locally; 14-day grace period if unavailable
  — See "Bulletproof VirtualCenter - A Guide to Protecting VirtualCenter"
• 48. VirtualCenter — Local Configuration Files (diagram): on the one server, physical or virtual, hosting the VirtualCenter Server, Web Access, and License Server sit the SSL certificate, license file, configuration file, and upgrade files; the database server is separate. — See "Bulletproof VirtualCenter - A Guide to Protecting VirtualCenter"
• 49. Step 1 for High Availability: Protect the Database
  - A database outage will terminate the VirtualCenter service; as of VirtualCenter 2.0.1 Patch 2, Windows Service Manager will automatically attempt to restart it every 5 minutes, indefinitely
  - The VirtualCenter database should be independently installed and managed
  - For local availability, use the preferred mechanism for the type of database being used (VMware HA, MSCS, database-specific mechanisms)
  - For disaster recovery, replicate the database to a remote site as part of an overall DR plan
  — See "Bulletproof VirtualCenter - A Guide to Protecting VirtualCenter"
• 50. Step 2 for High Availability: Protect VirtualCenter
  VMware HA and Microsoft Cluster Services (MSCS) are the two most popular options; other 3rd-party solutions are possible (supported directly by the 3rd party)
  - Option a) VMware HA: virtual instances only; subject to shared storage/network constraints; only requires a single OS and application instance, with no explicit replication
  - Option b) MSCS: VirtualCenter 2.0.2 patch 2 or beyond; physical or virtual instances; requires 2 identical OS and application installations and explicit replication of files; involves additional configuration effort and ongoing maintenance
  — See "Bulletproof VirtualCenter - A Guide to Protecting VirtualCenter"
• 51. VirtualCenter: Physical vs. Virtual
  - Backups: physical — done using traditional tools; virtual — possible through traditional tools, VCB, snapshots, cloning, etc.
  - Server: physical — dedicated server required; virtual — no dedicated server required, resources can be shared with other virtual machines
  - Performance: physical — limited only by the server hardware; virtual — drawn from shared resources; tuning may be needed
  For additional details refer to: http://www.vmware.com/pdf/vi3_vc_in_vm.pdf
• 52. VirtualCenter with VMware HA: Out-of-Band
  Two approaches:
  - Two VirtualCenter instances manage each other (pictured): both run in HA clusters, and each manages the other's HA cluster
  - A separate VirtualCenter instance is used to manage the 2-node HA cluster (not pictured)
• 53. VirtualCenter with VMware HA: In-Band
  - The VirtualCenter Server manages the VMware HA cluster that provides its own protection
  - When the ESX host running the VirtualCenter VM fails, the VM is restarted automatically by HA
  - The failover functionality HA provides is independent of VirtualCenter (once configured)
• 54. VirtualCenter with MSCS — Physical
  - Best practice: use a Majority Node Set quorum with a witness share
  - May be used as a geographically dispersed cluster for a disaster recovery solution
  - The VC database may live in another cluster group on the same cluster; this requires a third node
  For additional details refer to: http://www.vmware.com/pdf/VC_MSCS.pdf
• 55. VirtualCenter with MSCS — Virtual
  Requires quorum-disk clustering:
  - Quorum disk on the shared storage
  - System disks for both clustered virtual machines on local storage
  - Incompatible with VMotion and VMware HA
  For additional details refer to: http://www.vmware.com/pdf/VC_MSCS.pdf
• 56. Rise of the Phoenix — Disaster Recovery (diagram): the primary VirtualCenter server and its database server replicate to a standby site — the database via a standard DR solution, and the VI services' state files via file replication — so a standby VirtualCenter server can take over the VI services and VI inventory.
• 57. VirtualCenter Disaster Recovery Overview
  The disaster recovery solution consists of three pieces:
  - VI inventory data: use the standard DR solution of the database vendor
  - VI services: cold standby — re-install VirtualCenter and restore the local configuration files; warm standby — pre-install VirtualCenter and keep the local configuration files synchronized, but keep the 2nd instance disconnected
  - All other infrastructure services (AD, DNS, etc.): use existing product-specific solutions
• 58. Cold Standby Recovery Procedure
  - Beforehand: maintain a separate, up-to-date copy of the local configuration files
  - Disaster! Install a fresh VirtualCenter instance, install the configuration files, and connect it to the standby database
  - Able to assign the primary's IP address to the standby? If yes, the ESX Server hosts reconnect automatically; if no, run a script to reconnect all ESX Server hosts. Done.
  For additional details refer to the VI Perl Toolkit: http://www.vmware.com/go/viperl/scripts
• 59. Warm Standby Recovery Procedure
  - Beforehand: keep the local configuration files synchronized between primary and standby (through host-based replication or backup tools)
  - Disaster! Able to assign the primary's IP address to the standby? If yes, the ESX Server hosts reconnect automatically; if no, run a script to reconnect all ESX Server hosts (see the sketch below). Done.
  - Faster end-to-end recovery (RTO); also allows for scripting and automation
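The "script to reconnect all ESX Server hosts" step is sketched below in Python with pyVmomi; the deck points to the VI Perl Toolkit for the original, so treat this as an illustrative equivalent under assumed names.

```python
# Hypothetical sketch: reconnect every disconnected host to the standby VC.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vc-standby.example.com", user="administrator",
                  pwd="secret", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.HostSystem], True)
for host in view.view:
    if host.runtime.connectionState != "connected":
        host.ReconnectHost_Task()      # re-establish the management agent link
view.DestroyView()
Disconnect(si)
```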
• 60. Monitoring Individual VirtualCenter Components
  Entity / component — metric (tool); a sketch of one of these checks follows below:
  - VirtualCenter Server / VirtualCenter Service (vpxd) — vpxd.exe found; vpxd running as a service (WMI)
  - VirtualCenter Server / VirtualCenter certificates — files exist; size; modification date (WMI)
  - License Server / License Server service — exe file exists; service is up and running (WMI)
  - License Server / license files — files exist; size; modification date (WMI)
  - Web Service / web service — exe file exists; service is up and running (WMI)
  - Web Service / web page — http://localhost is reachable (Perl or VBS script, HTTP GET); critical files (?) are found (WMI)
  - Host system (where VC runs) — CPU load < 90%; system disk < 80% full; network OK (WMI)
  - Database server / ODBC connection — connection works (WMI)
  - Database server / database integrity — "select" statements on critical tables (SQL)
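One of the WMI checks from the table, sketched with the third-party Python `wmi` package (assumed to be installed on the VirtualCenter host); the service name vpxd comes from the table, and the automatic restart is an optional extra.

```python
# Minimal sketch: verify the vpxd service is running via WMI.
import wmi

c = wmi.WMI()
for svc in c.Win32_Service(Name="vpxd"):
    print(svc.Name, svc.State)       # expect "Running"
    if svc.State != "Running":
        svc.StartService()           # attempt an automatic restart
```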
• 61. VirtualCenter — Important Files (Details)
  File — contents — location:
  - Flex license files — license keys for all server-based licensed features — C:\Program Files\VMware\VMware License Server\Licenses
  - VirtualCenter configuration file — governs the behavior of VirtualCenter Server and its interaction with ESX Server hosts as well as client programs, such as the VI Client — C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\vpxd.cfg
  - SSL certificate — the certificate used to authenticate all communication with ESX Server hosts — %ALLUSERSPROFILE%\Application Data\VMware\VMware VirtualCenter\SSL
  - Upgrade files — in special circumstances, it might be necessary to make custom changes to the agent that gets pushed to ESX Server hosts as they are added to the environment — C:\Program Files\VMware\VMware VirtualCenter 2.0\upgrade
• 62. VirtualCenter Communication Ports
  Communication channel — default port and protocol — configurable?
  - VirtualCenter to ESX Server host — 902 TCP and UDP — not easily (requires numerous manual config file changes)
  - VirtualCenter to VI Client — 902 TCP, plus 443 TCP for initiation — yes, via the VI Client
  - Web browser to VirtualCenter — 80 & 443 TCP — yes, via the VI Client
  - 3rd-party SDK client to VirtualCenter — 80 & 443 TCP — yes, via the VI Client
  - Web browser to virtual machine console — 903 TCP — no
  - VI Client to virtual machine console — 903 TCP — no
  - VI Client to ESX Server host — 902 TCP, plus 443 TCP for initiation — no
  - Web browser to ESX Server host — 443 TCP — no
  - 3rd-party SDK client to ESX Server host — 443 TCP — no
• 63. Overview of Performance Statistics
  - Hostd collects real-time statistics; VPXD rolls the real-time statistics up into historical five-minute statistics and sends them to the database
  - Historical rollups: every 30 minutes for the past day, every 2 hours for the past week, and every 24 hours for the past month and past year
  - Data is purged after one year
• 64. Database Sizing for Optimal Performance
  - Memory: databases are most efficient when given enough memory to cache their working data set
  - Disk I/O: provide sufficient disk bandwidth for the log devices so transactions do not bottleneck on disk I/O
  - CPU: SQL Server is designed to use parallel processing whenever possible
  - SQL Server's TEMPDB: used extensively in VMware VirtualCenter 2.0, not as much in VMware VirtualCenter 2.5
• 65. VMware VirtualCenter 2.5 Performance Tips
  Configure the performance statistics collection levels:
  - Past day and past week statistics at level 3
  - Past month and past year statistics at level 1 — fewer performance statistics kept for historical data (a sketch follows below)
  Benefits: a significant reduction in the size of the VMware VirtualCenter database, and a significant reduction in rollup processing
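A hedged pyVmomi sketch of the collection-level change above, lowering the past-month and past-year intervals to statistics level 1 (`si` as in the earlier sketches; interval names as the server reports them).

```python
# Hypothetical sketch: tune historical statistics collection levels.
from pyVmomi import vim

perf = si.RetrieveContent().perfManager
for interval in perf.historicalInterval:
    if interval.name in ("Past month", "Past year"):
        interval.level = 1               # keep fewer counters for old data
        perf.UpdatePerfInterval(interval)
```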
• 66. Thank You