Authors: William Dupley & Tom Scott
Introduction to Hybrid DevOps
DevOps is a framework that allows development, quality assurance, and operations to meet the needs of the business and align with customer demand. It contains capabilities related to:
- Integrating development and operations teams to facilitate communication, collaboration, and integration to manage today’s rapidly changing business landscape
- Enabling developers to provision, change and manage their development environments without operations involvement
- Allowing developers to promote cloud-native applications to production without the need for operations involvement
- Facilitating both conventional application development acceleration and cloud-native application development techniques
Benefits of DevOps
- Aligning IT with the speed of business
- Increasing the speed of release of new applications
- Releasing higher quality applications to production
- Having a cloud architecture that Operations and Development define collaboratively. This provides alignment with business requirements and objectives and acts as a catalyst for leveraging the cloud-native capabilities inherent in cloud technologies
- Enabling Operations to review features, provide feedback into the development process, and reduce production issues
- Enabling the delivery of infrastructure as code
Approaches to achieving a DevOps operating model
DevOps is about streamlining development and optimizing operations to enhance service delivery while decreasing the time it takes to design, build, deploy, and support applications.
Three main approaches to achieving a DevOps operating model are outlined in this paper. They are:
- Traditional application acceleration: Applying Six Sigma theory to the IT Operating Model.
- Native cloud application deployment: Deploying SaaS, PaaS, and FaaS application services.
- Cloud native application development: Implementing Cloud Foundry technology.
All three create a tight integration between the Development and Operations teams and increase the speed of release to production. The cloud-native approach, however, gives the development group significant new power over Operations functions. This shift in power is often met with resistance from the Operations group.
Any approach to DevOps requires a significant culture change in the development, quality assurance, and operations groups. This culture change is often the greatest barrier to successfully implementing a DevOps operating model. To address this reality, the best solution is to change the KPIs for the development group from release-to-production time to the number of faults in production measured at the user’s screen. One of the biggest challenges in DevOps is that speed can be the enemy of reliability. Changing the metrics for developers makes them more sensitive to this trade-off, and the metrics for Operations must change as well. Operations can no longer blame developers for production faults if Operations did not raise its concerns during development. This requires Operations to be much more involved in reviewing features and development activities than in the past.
In the traditional “on-prem” environment, the roles and responsibilities of development teams and operations personnel do not so much change as combine to achieve DevOps. When this is applied to “Cloud,” it becomes more challenging due to the variety of consumption models being used, e.g., SaaS, PaaS, IaaS, and FaaS. Each type represents a different development model and support challenge. Add the Hybrid IT environment, and roles and responsibilities can become blurred. As businesses struggle to find the right fit for their workloads, IT is also struggling to align with this new paradigm. Best practices are still being defined, and configurations are being optimized, usually manually, in response to overages or, worse, security breaches.
The Hybrid IT model is unique in that services can be delivered from multiple suppliers and any supplier’s service can compromise the service level. When implementing DevOps in a Hybrid IT climate, actors and roles must be aligned to the same business objectives, timelines, and concepts of Hybrid IT to fully leverage the capabilities of DevOps.
Example: DevOps teams traditionally deployed to on-prem infrastructure where changes were controlled through rigid processes and strict planning. The on-prem infrastructure teams and DevOps build teams were synchronized to deploy on infrastructure that would be “frozen” in order to reduce the impact on the deployed application. The business then decides to adopt an “On-Demand” service provider, from which infrastructure is consumed as a service, in order to support its “cloud first” initiative. When the DevOps teams deploy their builds to the “Cloud Environment,” they find that the provider has implemented a patch that changes a needed API and produces an error that causes the application to crash. The teams are now faced with an environment that they no longer control and in which they cannot implement “freezes” to prevent such an occurrence from happening. Infrastructure now changes in the same way as the application, and the two need to be kept in sync. This requires new roles and responsibilities to be defined for how resources are consumed and utilized across both the cloud and the traditional workloads. A cross-organizational transformation needs to occur, from a basic “IT operations department” into a “cloud consultation team.”
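One way a team might guard against the kind of provider-side API change described in this example is to run a small contract check before promoting a build. The following is a minimal sketch under stated assumptions, not a prescribed implementation: the field names in EXPECTED_FIELDS and the function contract_violations are hypothetical and do not correspond to any particular provider's API.

```python
from typing import Any, List, Mapping

# Hypothetical contract: the fields (and types) the application relies on
# in a provider API response. These names are illustrative assumptions.
EXPECTED_FIELDS = {"id": str, "status": str, "created_at": str}


def contract_violations(
    response: Mapping[str, Any],
    expected: Mapping[str, type] = EXPECTED_FIELDS,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the
    response still matches what the application expects."""
    problems = []
    for name, expected_type in expected.items():
        if name not in response:
            problems.append(f"missing field: {name}")
        elif not isinstance(response[name], expected_type):
            actual = type(response[name]).__name__
            problems.append(
                f"{name}: expected {expected_type.__name__}, got {actual}"
            )
    return problems
```

A deployment pipeline could call contract_violations on a sample response from the cloud environment and stop the release if the list is not empty, so that a provider patch is detected before it reaches users.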
Hybrid DevOps impact
The roles of developers are significantly impacted. They are required to embrace the following four new responsibilities:
- Rapidly deliver applications to users for traditional, Hybrid IT, and cloud-native workloads
- Facilitate continuous integration and deployment
- Enforce consistent standards of development and deliver high quality, error-free software
- Ensure security compliance and privacy policies are adhered to in the software development process
The historical approach to software development was that Development, Quality Assurance, and Operations operated separately. Development would create a solution and throw it over the wall to Quality Assurance, who would test it and then throw it back to Development. Eventually, Quality Assurance would authorize it for release and throw it over the wall to Operations. As a result, Operations needed to learn the application and often discovered problems in production that affected its availability, reliability, and performance, leading to extensive change requests in the initial releases. All of this resulted in long delays in releasing new applications and in reliability issues in production.
SaaS and Cloud services challenged this entire model. Cloud services enabled a business developer to order services directly. The result was much faster release to production of new capabilities. With the introduction of cloud-native application services, release times could also be radically changed. New code could be deployed in a few seconds, and a developer could deploy multiple releases in one day.
Traditional application development versus cloud-native.
The above graphic describes the eight major processes in a software development process. Traditional application development has four groups of individuals who are involved in an application’s release to production. The Development group is responsible for the following activities:
- Plan and release sprints
- Define requirements, features, user stories, and use cases
- Develop code and perform unit testing
- Build and integrate code
Operations is responsible for moving the code to the Quality Assurance systems. The Quality Assurance group manages the testing process and executes the functional tests. If the tests pass, the Release Management group schedules the release. Operations moves the application to production at the required time and assumes responsibility for monitoring and operating the code. In some companies, if databases are necessary, a fifth group (database administrators) is involved in provisioning databases for the development group.
All of these different groups and handoffs add more and more time to the provisioning of the service.
In contrast, the cloud-native application development approach makes the developer responsible for conducting all eight processes and promoting the release to production under their own approval. The result is a much faster release to production process.
DevOps aims to equip IT to release new capabilities rapidly. There are three key approaches to accomplishing this goal.
- Traditional application acceleration
- Native cloud applications
- Cloud-native application design
Each of these strategies requires a significant change to the current development and operations relationship and operating model.
In most large companies the majority of applications are still implemented using the waterfall software development methodology as described below.
Using total quality control techniques, companies are now applying Six Sigma methods to their software development process. This approach enables these companies to eliminate the excess time and handoff issues between the different groups. It requires the Application Development, DBA, Quality Assurance, and Operations groups to work closely together to identify all non-value-added activities, eliminate delays, and compress the cycle time of events. The above graphic describes the quality improvement tactics that can be used.
Example Company #1
Using the quality improvement approach, Company #1 identified that the number one delay in provisioning new systems to production was the provisioning of databases. DBAs were taking up to nine weeks to provision databases for the development groups. The company developed a private cloud that delivered standardized databases for Microsoft SQL Server, Oracle, MongoDB, and Cassandra and reduced the time to provide a new database from nine weeks to two minutes.
Example Company #2
Company #2 wanted to solve its application release-to-production problems. Of the 150 applications deployed in 2014, none deployed correctly on the first attempt. The deployment process had many manual steps and lacked quality: every deployment had a different end result, and implementations took too long. To solve this problem, the company focused on its two most business-critical applications and implemented Continuous Delivery using a continuous deployment solution that provided automation and release management of complex multi-tier applications across the application lifecycle. The results of this approach were:
- The three-month manual preparation process was reduced to twenty minutes
- Deployment time of an application was reduced from five hours to two minutes and twenty-seven seconds
- Multiple deployments were done per day. Deployment results were always the same, and there were no more incidents in production
- Overall results: deployment time became 228x faster, with improved quality and better integration between Development and Quality Assurance
“A Native Cloud application is a program that is designed specifically for a cloud computing architecture. Native Cloud applications are developed to take advantage of cloud computing frameworks, which are composed of loosely-coupled cloud services. That means that developers must break down tasks into separate services that can run on several servers in different locations. Because the infrastructure that supports a Native Cloud App does not run locally, NCAs must be planned with redundancy in mind so the application can withstand equipment failure and be able to re-map IP addresses automatically should hardware fail.
The design paradigm is cost-effective but more complex, however, because services and resources for computation and storage can be scaled out horizontally as needed, which negates the need for overprovisioning hardware and having to plan for load balancing. Virtual servers can quickly be added for testing and, in theory, an NCA can be brought to market on the same day it’s created. In general, a native app is an application program that has been developed for use on a particular platform or device.”[i]
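The redundancy requirement in the definition quoted above can be illustrated with a small failover sketch. It assumes each replica of a service is represented by a callable that raises ConnectionError when the replica is unreachable; the function name call_with_failover is hypothetical and chosen only for this illustration.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def call_with_failover(replicas: Sequence[Callable[[], T]]) -> T:
    """Try each redundant replica in order and return the first result.

    Raises RuntimeError if every replica is unreachable.
    """
    last_error: Optional[Exception] = None
    for call in replicas:
        try:
            return call()
        except ConnectionError as exc:
            # This replica is down; remember why and try the next one.
            last_error = exc
    raise RuntimeError("all replicas failed") from last_error
```

In practice this role is usually played by the platform's load balancer or service discovery layer rather than by application code; the sketch only shows the behavior the application has to tolerate.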
Example Company #3
Company #3 had a new product that was to be released, and the marketing team felt the web experience was not compelling enough to reflect the image of the new product. It was decided that an entirely new web presence was needed to reflect the creativity of the new product.
A new web page system was built on their private cloud. Apache and MariaDB were chosen to deliver the new environment. The private cloud delivered the underpinning infrastructure and platforms (13 servers) for development, test, and production, and the time to develop the new application and provision the infrastructure was 48 hours.
Example Company #4
After the Madoff scandal, contact volume at the Office of Investor Education and Advocacy increased to over 90,000 contacts annually, and the decade-old system for managing contacts was incapable of keeping up. The office decided to move to a cloud solution that allowed representatives around the country to access all documents in real time. The results of this change were:
- Reduced a 30-day response to investors to less than 7 business days.
- Created a completely paperless system for handling investor inquiries.
- Provided staff with contact history to better serve consumers.
- Reduced timeline for system configuration from months to minutes.[ii]
Cloud-native computing takes advantage of many modern techniques, including PaaS, multicloud, microservices, agile methodology, containers, CI/CD, DevOps, and elastic cloud infrastructure, and it supports pipeline build methodologies. This approach requires a significant change in application design disciplines. It allows a developer to promote directly to production, with no gateways to ensure that what is released will not cause production problems. As a result, the number one change in using this approach is that accountability for production reliability no longer rests with Operations but with the developers.
Cloud native requires a developer to follow the 12-factor application development disciplines. “The twelve-factor app is a methodology for building software-as-a-service apps that:
- Use declarative formats for setup automation, to minimize time and cost for new developers joining the project;
- Have a clean contract with the underlying operating system, offering maximum portability between execution environments;
- Are suitable for deployment on modern cloud platforms, obviating the need for servers and systems administration;
- Minimize divergence between development and production, enabling continuous deployment for maximum agility;
- Can scale up without significant changes to tooling, architecture, or development practices.”[iii]
The 12 factors are:
- I. Codebase: One codebase tracked in revision control, many deploys
- II. Dependencies: Explicitly declare and isolate dependencies
- III. Config: Store config in the environment
- IV. Backing services: Treat backing services as attached resources
- V. Build, release, run: Strictly separate build and run stages
- VI. Processes: Execute the app as one or more stateless processes
- VII. Port binding: Export services via port binding
- VIII. Concurrency: Scale out via the process model
- IX. Disposability: Maximize robustness with fast startup and graceful shutdown
- X. Dev/prod parity: Keep development, staging, and production as similar as possible
- XI. Logs: Treat logs as event streams
- XII. Admin processes: Run admin/management tasks as one-off processes
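Several of these factors translate directly into code. The sketch below illustrates two of them, "Store config in the environment" and "Treat logs as event streams," using only the Python standard library. The variable names DATABASE_URL, PORT, and LOG_LEVEL and the Settings class are assumptions made for this example, not names required by the methodology.

```python
import logging
import os
import sys
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str  # backing service, attached via its URL
    port: int
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read all configuration from environment variables, so the same
    build can run unchanged in development, staging, and production."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env["DATABASE_URL"],  # required; KeyError if missing
        port=int(env.get("PORT", "8080")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


def configure_logging(level: str) -> logging.Logger:
    """Write logs as a stream of events to stdout and leave routing and
    storage to the execution environment."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger = logging.getLogger("app")
    logger.handlers = [handler]
    logger.setLevel(level)
    return logger
```

Passing a plain dictionary to load_settings instead of relying on os.environ keeps the function easy to test, while production code would call it with no arguments.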
Example Company #5
Company #5 had a .NET application that was over ten years old. It took 300k hits/day. It was not reliable and did not meet the company’s current needs. They built the replacement application using cloud-native disciplines. Here is the summary of the results:
- Developed a cloud-native application on two Stackato development platforms using Docker containers
- Developed three microservices
- Implemented weekly releases
- Implemented autoscaling
- Implemented a Continuous Delivery Pipeline development process that supports the entire system, not just the application source code
- Implemented Infrastructure as code on their private cloud
The results of this initiative were:
- Demo app ready in 11 days; minimum viable product (MVP) in 7 weeks
- New UX in 4 days, Alpha in 4 sprints, Beta in 5 sprints
- 454 builds and deploys to Stackato
- 250,000 users worldwide
The transformation to a shared responsibility DevOps model is one of the most challenging activities a technology organization has to achieve due to the magnitude of the changes in process, people, and technology. It requires the adoption of new technologies, new methods, and new people measures. There are, however, two significant issues in DevOps transformations. The first is people-related: there is often resistance to change. The second is process-related: the current change and release process can often be an obstacle.
Both of these issues arise from the fear that developers will release poor code into production and impact the availability and performance metrics that typically reflect on Operations. Shifting some of this responsibility to developers requires three key changes:
- The measurement system for the development community has to change from release-to-production time to the quality of software in production.
- The change and release management processes need to be modified to allow a developer to self-authorize release to production.
- Both Operations and Development need to be measured on the Performance, Reliability, and Availability of an application measured at the user’s screen.
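A shared measure "at the user’s screen" could be computed from samples of real or synthetic user requests. The sketch below is one possible way to do this, under stated assumptions: the UserRequest record, the user_facing_metrics function, and the two-second slow_ms threshold are hypothetical choices made for illustration, not metrics defined in this paper.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class UserRequest:
    succeeded: bool    # did the user receive a correct response?
    latency_ms: float  # time until the response reached the user


def user_facing_metrics(
    samples: Iterable[UserRequest], slow_ms: float = 2000.0
) -> Dict[str, float]:
    """Summarize availability, faults, and slow responses as the user
    experienced them, so both teams can be measured on the same numbers."""
    samples = list(samples)
    total = len(samples)
    if total == 0:
        raise ValueError("no samples")
    faults = sum(1 for s in samples if not s.succeeded)
    slow = sum(1 for s in samples if s.succeeded and s.latency_ms > slow_ms)
    return {
        "availability": (total - faults) / total,
        "faults": faults,
        "slow_responses": slow,
    }
```

Because the same figures are reported to both Development and Operations, neither group can improve its own score at the other's expense.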
DevOps is a critical discipline for all IT environments to adopt. All enterprises require IT to operate much faster, and DevOps enables IT to meet this requirement.
For additional information on the OACA Cloud Maturity Model, go to: https://www.oaca-project.org/cmm40/
About the Authors:
Bill is the Digital Strategist for Liam Associates Inc. Formerly the Cloud Chief Technologist for Hewlett-Packard Enterprise Canada, Bill has provided Hybrid IT and IoT Strategic Planning advisory and planning services to over fifty Private and Public sector clients to help them migrate to a Hybrid IT Cloud Operating model. These transformation plans have helped both government and industry reduce the cost of IT, re-engineer their IT governance models, and reduce the overall complexity of IT.
Tom Scott is currently a Principal Technology Architect at The Walt Disney Company. He is a future-focused technologist with over 30 years of technology innovation and experience. His work ranges from being awarded a US patent for mobile device news delivery, to building innovative workflow solutions for broadcast television, to architecting cloud solutions. He is also a member of the OACA Business Transformation Workgroup and has authored and co-authored numerous whitepapers and documents ranging from the published Cloud Maturity Model through Integration and Business Strategy for Cloud Adoption. As a pragmatic technologist, Tom consistently implements technology that drives business