Eucalyptus Private Cloud -- Reference Architecture (Small Dev/Test Cloud)
Dev/Test (Small) Reference Architecture
The purpose of this document is to provide a reference architecture that describes the physical resources (server, network, storage), the Eucalyptus software topology and configuration, and the data center management facilities required to construct and maintain a production private cloud deployment.
This reference architecture is specifically for users who require a production private cloud environment composed of a relatively small set of commodity hardware components, and for which the ultimate size of the deployment is limited to the capacity described later in the document. If the target deployment will need to scale to accommodate more capacity in the future, the Dev/Test (Large) reference architecture should be used instead.
The advantages of this reference architecture are:
- Low cost: only a small number of servers, network devices, and storage devices are required
- Provides a fast, stable, self-service dev/test environment
- Contains only a few independent components, which reduces system monitoring and management overhead
This document is intended for readers who are familiar with AWS terminology and the Eucalyptus Installation/Admin/User guides, and who have experience implementing production data center solutions based on Linux environments.
System Architecture
The following two diagrams show the logical and physical relationships among the servers, networks, and Eucalyptus components in this reference architecture.
Use Case: Dev/Test
This particular reference architecture is intended to implement a private cloud for a dev/test use case. For dev/test, there are a variety of virtual machines used as development environments (dev) as well as virtual machines that are instantiated in order to perform efficient testing from a known environment (test).
Design Considerations
For the dev/test use case, so-called ‘dev’ virtual machine servers (instances) are expected to have short to medium lifetimes, while ‘test’ instances are expected to have relatively short lifetimes. Generally, this architecture is intended to provide fast self-service instantiation of development environments, followed by instantiation of test environments against the newly developed software that the development work produces. Under this type of general workflow, static data (data that needs to persist beyond the lifetimes of instantiated virtual machines) should be identified and decoupled from the instances themselves as much as possible, resulting in low to medium usage of the static data storage facilities of Eucalyptus. Boot from EBS (bfEBS) instances are expected to be rarely used in deployments based on this architecture. However, the deployment does support limited use of bfEBS. If, for example, certain servers that rely on static data must remain available to serve the other, more transient dev/test environments, these would be candidates for use as bfEBS instances.
The following list describes the workload capacity that deployments based on this reference architecture can support. Some of these capacity boundaries can be exceeded by deviating from the architecture; readers are encouraged to contact Eucalyptus for information on designing production Eucalyptus deployments.
Deployment Capacity
- Max of 128 running virtual machine instances
- One cluster composed of maximum 16 nodes
- Max of 8 virtual machines per node
- Max of 16 simultaneous attached elastic block storage volumes
- Max of 256 independent active users (max of 16 accounts, each with max of 16 users)
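The limits above are related by simple multiplication: 16 nodes at 8 virtual machines each gives 128 instances, and 16 accounts at 16 users each gives 256 users. The following is a minimal Python sketch of checking a planned deployment against these limits. All identifiers (`PlannedDeployment`, `capacity_violations`, and the constants) are hypothetical names chosen for illustration and are not part of any Eucalyptus API.

```python
from dataclasses import dataclass
from typing import List

# Limits copied from the Deployment Capacity list. The identifiers are
# hypothetical and are not part of any Eucalyptus API.
MAX_NODES = 16
MAX_VMS_PER_NODE = 8
MAX_INSTANCES = MAX_NODES * MAX_VMS_PER_NODE        # 128
MAX_ATTACHED_EBS_VOLUMES = 16
MAX_ACCOUNTS = 16
MAX_USERS_PER_ACCOUNT = 16
MAX_USERS = MAX_ACCOUNTS * MAX_USERS_PER_ACCOUNT    # 256


@dataclass
class PlannedDeployment:
    nodes: int
    vms_per_node: int
    attached_ebs_volumes: int
    accounts: int
    users_per_account: int


def capacity_violations(plan: PlannedDeployment) -> List[str]:
    """Return one message for every limit that the plan exceeds."""
    checks = [
        ("nodes", plan.nodes, MAX_NODES),
        ("VMs per node", plan.vms_per_node, MAX_VMS_PER_NODE),
        ("running instances", plan.nodes * plan.vms_per_node, MAX_INSTANCES),
        ("attached EBS volumes", plan.attached_ebs_volumes, MAX_ATTACHED_EBS_VOLUMES),
        ("accounts", plan.accounts, MAX_ACCOUNTS),
        ("users per account", plan.users_per_account, MAX_USERS_PER_ACCOUNT),
        ("active users", plan.accounts * plan.users_per_account, MAX_USERS),
    ]
    return [
        f"{name}: {value} exceeds maximum of {limit}"
        for name, value, limit in checks
        if value > limit
    ]
```

An empty result means the plan fits within this reference architecture; a non-empty result suggests that the Dev/Test (Large) architecture may be the better starting point.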
Physical Resources
Following are the minimum resource requirements for the physical servers, networks, and storage needed to support the architecture. For each category of physical resource, providing more than the minimum will not have a negative impact on the deployment (more cores, more RAM, more local disks, faster interfaces, higher bandwidth networking, more disk capacity, etc.).
Minimum Server Configuration
- 4 or more modern cores
- 16 or more GB of RAM
- 80 or more GB RAID 1/5/6 local disk for OS/Eucalyptus (see below for special storage requirements, based on component)
- Network: single 1 Gb or 10 Gb interface (see below for special network requirements, based on component)
Networking Configuration
- All servers connected to a single 1 Gb network switch fabric. If a limited number of 10 Gb host connections/adapters are available, they should be placed so that the Storage Controller, Cluster Controller, and Walrus (in that order of priority) are connected at 10 Gb
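The placement rule above can be sketched as a small priority assignment. This is an illustrative Python sketch only; the function name `assign_link_speeds` and the speed labels are assumptions made for the example, and components are identified by name rather than by the physical server that hosts them.

```python
from typing import Dict, List

# Priority order stated in the networking requirement above.
TEN_GB_PRIORITY = ["Storage Controller", "Cluster Controller", "Walrus"]


def assign_link_speeds(components: List[str], ten_gb_adapters: int) -> Dict[str, str]:
    """Map each component to "10 Gb" or "1 Gb", spending the limited
    10 Gb adapters in priority order and defaulting everything else to 1 Gb."""
    speeds = {name: "1 Gb" for name in components}
    remaining = ten_gb_adapters
    for name in TEN_GB_PRIORITY:
        if remaining <= 0:
            break
        if name in speeds:
            speeds[name] = "10 Gb"
            remaining -= 1
    return speeds
```

With two 10 Gb adapters available, for example, the Storage Controller and Cluster Controller would be connected at 10 Gb and Walrus would stay on the 1 Gb fabric.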
Storage Configuration
- Local RAID disk storage for Walrus, Storage Controller, and Node Controllers
- Walrus: 300 or more GB RAID 1/5/6
- Walrus capacity impacts the total number of template images and the amount of S3-accessible data that can be stored
- Storage Controller: 200 or more GB RAID 1/5/6
- Storage Controller capacity impacts the total size of all EBS and bfEBS volumes that are available
- Node Controller: 200 or more GB RAID 1/5/6
- Node Controller capacity impacts the total size of all instance store images that can run concurrently on a single node and the total number of images that can be cached on a single node
- RAID volumes are separate from the OS disk (partition/volume) for all servers with Eucalyptus-accessible RAID volumes
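As a rough aid, the server and storage minimums listed above can be collected into one check. The Python sketch below is an assumption-laden illustration: the role keys (`"walrus"`, `"storage_controller"`, `"node_controller"`) and the function `spec_shortfalls` are hypothetical names, and roles without a listed data volume minimum are treated as needing none.

```python
from typing import List

# Minimums taken from the Minimum Server Configuration and Storage
# Configuration lists. Role names are hypothetical labels for this sketch.
MIN_CORES = 4
MIN_RAM_GB = 16
MIN_OS_DISK_GB = 80
MIN_DATA_DISK_GB = {
    "walrus": 300,
    "storage_controller": 200,
    "node_controller": 200,
}


def spec_shortfalls(role: str, cores: int, ram_gb: int,
                    os_disk_gb: int, data_disk_gb: int = 0) -> List[str]:
    """Return one message for every minimum the server fails to meet.

    data_disk_gb is the Eucalyptus-accessible RAID volume, which is kept
    separate from the OS disk; roles not listed need no such volume here.
    """
    checks = [
        ("cores", cores, MIN_CORES),
        ("RAM (GB)", ram_gb, MIN_RAM_GB),
        ("OS disk (GB)", os_disk_gb, MIN_OS_DISK_GB),
        ("data RAID volume (GB)", data_disk_gb, MIN_DATA_DISK_GB.get(role, 0)),
    ]
    return [
        f"{name}: have {have}, need at least {need}"
        for name, have, need in checks
        if have < need
    ]
```

Because exceeding a minimum never harms the deployment, the check only reports values that fall below it.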
Deployment Topology
Following is a description of the Eucalyptus platform topology atop the physical resources. For this use case, the topology is designed to use a minimal number of servers for the Eucalyptus platform, while providing enough capacity to give acceptable performance up to the maximums specified in the Deployment Capacity list earlier in this document.
Eucalyptus Component Topology
Each server in the above physical model diagram will be running one or more Eucalyptus software components which together form the Eucalyptus platform. Listed here are the mappings of physical server to Eucalyptus component, where each server is configured to conform to at least the minimum requirements for servers defined previously.
- Front-end server 1: Cloud Controller/Walrus/UserConsole
- Cluster server 1: Cluster Controller
- Cluster server 2: Storage Controller
- Node server 1-16: Node Controller x 16
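The server-to-component mapping above can be written down as a simple data structure, which is often a convenient input for inventory or configuration tooling. This Python sketch uses hypothetical server labels (`front-end-1`, `cluster-1`, `node-1`, and so on) that are not hostnames defined by this document.

```python
from typing import Dict, List

# Server labels are illustrative only; components follow the mapping above.
TOPOLOGY: Dict[str, List[str]] = {
    "front-end-1": ["Cloud Controller", "Walrus", "User Console"],
    "cluster-1": ["Cluster Controller"],
    "cluster-2": ["Storage Controller"],
}
# Node servers 1 through 16 each run a single Node Controller.
TOPOLOGY.update({f"node-{i}": ["Node Controller"] for i in range(1, 17)})


def servers_running(component: str) -> List[str]:
    """Return the servers that host the named component."""
    return [server for server, parts in TOPOLOGY.items() if component in parts]
```

At full size the mapping covers 19 physical servers: one front-end server, two cluster servers, and 16 node servers.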
Eucalyptus Configuration Options
The Eucalyptus platform is highly configurable, covering a wide variety of data center topologies, devices, software management systems and network/security policies. For this reference architecture, we list here certain fundamental configuration options which will provide the necessary service of the reference architecture balanced against minimal performance and management overhead. Please refer to the Eucalyptus Installation and Admin Guides for information on how to implement these configurations.
- Networking mode: MANAGED-NOVLAN
- Public addresses: 160 (maximum number of virtual machines + 32 for allocatable elastic IPs)
- Security group size: minimum 32, maximum 128
- Storage Controller driver: DAS
- High Availability: no
- Linux Distribution: CentOS 6 + KVM
- Java components (Cloud Controller, Walrus, Storage Controller): configured to run with increased heap size (60% of total available memory)
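Two of the options above are derived values, and the arithmetic can be sketched in Python. The function names are hypothetical. The sketch assumes that "total available memory" means the physical RAM of the server, and it formats the result as the standard JVM `-Xmx` flag; where that flag is set for each Eucalyptus component is not shown here and should be taken from the Eucalyptus Installation and Admin Guides.

```python
# Values taken from the configuration options above.
MAX_INSTANCES = 128       # maximum number of running virtual machines
ELASTIC_IP_POOL = 32      # extra addresses for allocatable elastic IPs
HEAP_PERCENT = 60         # share of total memory given to the Java heap


def public_address_count(max_instances: int = MAX_INSTANCES,
                         elastic_ips: int = ELASTIC_IP_POOL) -> int:
    """Number of public addresses to reserve for the deployment."""
    return max_instances + elastic_ips


def jvm_heap_flag(total_ram_mb: int, percent: int = HEAP_PERCENT) -> str:
    """Return a JVM maximum-heap flag sized to a percentage of total RAM.

    Assumes total_ram_mb is the server's physical memory in megabytes.
    """
    heap_mb = total_ram_mb * percent // 100
    return f"-Xmx{heap_mb}m"
```

For the default values, `public_address_count()` gives 160, which matches the figure in the list. For a server with the minimum 16 GB of RAM (16384 MB), `jvm_heap_flag(16384)` gives `-Xmx9830m`.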
Other Features
Eucalyptus includes a number of features that are in place to support specific aspects of production deployments that may or may not be required based on the user’s preferences and constraints. Listed here are descriptions of some of these features as they apply to this particular reference architecture. Please refer to the Eucalyptus Installation and Admin Guides for information on how to implement these features, if required.
- The reporting feature should be lightly used (configured to be either disabled or to poll at infrequent intervals) for this architecture. If the deployment is required to supply fine-grained or long-term reporting information, a data warehouse topology (extra machine) should be added to the deployment, with tooling in place to enforce periodic export/flush of reporting data to the warehouse.
- LDAP integration should be implemented only if required.
Related Services
The preceding sections of this reference architecture outline the design of a Eucalyptus software deployment and define the minimum physical resource capacity and configuration. Next, we address the additional technologies and techniques surrounding the Eucalyptus software and hardware that are required to run a complete Eucalyptus private cloud in production.
Services provided by the Eucalyptus private cloud software platform
- EC2 Compatible private cloud virtual machine management platform
- S3 Compatible storage platform
- Eucalyptus end-user web based GUI console
- Eucalyptus end-user and admin CLI tools
- Service of creating, managing, and cleaning up virtual machines and related resource artifacts (EBS volumes, virtual networks, etc.)
- Eucalyptus service troubleshooting and problem resolution
Additional required services
- Data-center server, network, storage, OS installation system
- Physical machine health and status monitoring
- Automatic resource performance monitoring and load-balancing
- Virtual machine, storage, network performance optimization
- Linux Distribution OS software and configuration management
- Dynamic deployment topology/physical infrastructure re-configuration
Data Center Management
The Eucalyptus cloud platform software that provides AWS compatible infrastructure as a service must be integrated with standard data center configuration, management, and monitoring software for production use. Each Eucalyptus component runs as a Linux process that must be configured through both configuration files and run-time configuration parameters, and must additionally be monitored along with physical resource health and status characteristics. There are a variety of User Interfaces that are available for use with your Eucalyptus deployment, including those that are included as part of the Eucalyptus platform as well as third party API, command-line and graphical interface software that is AWS compatible.
While the Eucalyptus software does not currently include the deployment of configuration management or system health/status monitoring solutions itself, there are several third party solutions that existing production deployments rely upon to perform these functions.
Configuration Management
Production deployments based on this reference architecture should include the use of a third-party configuration management system in order to ensure that Eucalyptus configuration is correct both for initial deployment as well as under cases where a particular Eucalyptus server and software must be re-deployed. Several options exist, and here we recommend those which are produced by organizations who have partnered with Eucalyptus to provide high quality integrations.
- Puppet
- Chef
For an example of how to integrate Eucalyptus into your Puppet environment, please refer to the following resources:
- TODO: produce/reference Puppet+Eucalyptus example (http://www.eucalyptus.com/sites/all/files/sg-eucalyptus-puppetlabs.en.pdf)
Monitoring
In addition to automated/controlled configuration management, a production Eucalyptus deployment based on this reference architecture should also be monitored via a third party solution to watch the health and status of the deployment, as well as to notify the cloud administrator when unexpected conditions are occurring. Basic monitoring includes but is not limited to:
- Physical resource availability (network ping and/or ssh access to physical servers running Eucalyptus components)
- Physical resource load
- Physical resource faults (as indicated by Linux fault notification mechanisms)
- Eucalyptus component faults (please refer to the Eucalyptus Admin Guide for information on monitoring for Eucalyptus faults)
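The first item in the list, physical resource availability, can be illustrated with a basic TCP reachability probe. This Python sketch is not a replacement for a monitoring system and is not an integration with any of the tools named in this document. The function names are hypothetical, and port 22 is assumed only because the list mentions ssh access.

```python
import socket
from typing import Iterable, List


def tcp_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_hosts(hosts: Iterable[str], port: int = 22,
                      timeout: float = 3.0) -> List[str]:
    """Return the hosts that did not accept a TCP connection on the port."""
    return [host for host in hosts if not tcp_reachable(host, port, timeout)]
```

A monitoring system would typically run a check of this kind on a schedule against every server that hosts a Eucalyptus component and notify the cloud administrator when the list of unreachable hosts is not empty.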
There are several solutions for monitoring physical and software components of a data center, and here we recommend those which are developed by Eucalyptus partners:
- Nagios
- Ganglia
- TODO: Others?
For an example of how to integrate Eucalyptus into your Nagios environment, please refer to the following resources:
- TODO: produce and then reference Nagios+Eucalyptus example
User Interface
As an AWS compatible platform, Eucalyptus offers a variety of user interface tools of its own, as well as the option to use third party AWS compatible interfaces that interoperate with both AWS and Eucalyptus. For information on installing and using the interfaces that are installed by default with Eucalyptus, please refer to the Eucalyptus Install, Admin and User Guides.
- Eucalyptus admin CLI tools (included)
- Euca2ools user CLI tools (included)
- Eucalyptus graphical user console (included)
- TODO: Others? (hybrid fox, enstratus, s3cmd, rightscale, etc)
Workload Management
In addition to monitoring and managing the deployment’s physical resources, application workload images and workflows must also be managed and configured. The Eucalyptus platform offers AWS compatible APIs and services which allow external workload management systems to interoperate with AWS and Eucalyptus, and works to ensure that VM image environments between AWS and Eucalyptus are interoperable.
Image Management
- TODO: produce and reference Eucalyptus+Image Management example(s). reference image management techniques and examples (AWS image -> Euca image, image creators, etc)
Workflow Management
- Jenkins, Electric Cloud
- TODO: produce and reference Eucalyptus + external workflow management examples and techniques
Summary
The reference architecture presented here is meant to encapsulate a bounded production Eucalyptus system. As with all use cases, there are variations that cannot reasonably be generalized, but we add here some comments and observations that will help to tune individual use case variations to achieve efficient, stable performance within Eucalyptus.
- Keep individual system load low. If physical systems are over-provisioned with virtual machines (whether it be too many VMs running on a single system or few but resource intensive VMs that interfere with one another), the underlying operating system and Linux dependencies can become fragile and difficult to debug. Eucalyptus has many features that are designed to function even if the underlying system is underperforming and/or misbehaving, but it is always best to provide Eucalyptus and your workload environments enough resources to function smoothly.
- Consider bottlenecks. When designing a deployment, deciding on capacity to be provided to your applications and making capacity and performance hardware decisions, it is best to consider the data paths that Eucalyptus either provides or works in concert with at run-time. Please refer to the Datapaths series of diagrams to aid in identification of potential shared resource bottlenecks.