DIGIT - Release Management

Problem Statement:

DIGIT is an evolving platform and product, and a growing number of states are adopting it with their own customisation demands. We need efficient release management to improve coordination, prioritisation, scoping and planning of parallel releases across states. Release management is the process of ensuring that all checks and balances have been met, so that value is delivered to the customers/states, the risk of failure is reduced, and adoption increases as a result. It must keep customers informed about changes through effective communication channels and artefacts so that the upgrade experience is seamless. It must also handle multidirectional releases across states that run multiple versions with permitted customisations. This complexity trickles down into every phase of our existing release approach.

What are we missing today, either partially or completely, to drive effective release rigour?

  1. Most of our release cycles are unidirectional

  2. Not on Agile Release Train. ART aligns teams to a common business and technology mission.

  3. No standard release cycles and release calendar.

  4. Effective Estimation process and predictable Timeline.

  5. Missing XP, TDD, Contract testing rigour.

  6. Strict Branching strategies and PR Process.

  7. Missing test documentation and reports to gate the release quality.

  8. Very little test automation; testing is mostly manual, which limits our regression cycles and bandwidth.

  9. Traceability or feedback mechanism from PRD to Prod.

  10. Release checklist, artefacts versioning and release sign-off.

  11. Versioning, Tagging and packaging the artefacts and the release notes.

  12. No release management as a Role and Process discipline/diligence.

Proposed Solution:

To mitigate these challenges, we need efficient release management that embraces the following:

 

  1. Development phase being really agile

    • PRD to UAT Sign-off, Standard checklists, Single threaded visibility and traceability.

    • Standard PRD templates with Usecases/scenarios to help the TDD and automated UAT Testcases.

    • Effective Jira workflow for designated categories of issue types: New Feature, Enhancement, Bug Fix, Production Issue, Task, etc.

    • Effective Sprint Planning, Estimation, Daily Scrum, Sprint Dashboard, Sprint Demo & Sprint Closure.

    • Technical Program management to facilitate and drive dependencies.

    • Daily scrums with the Storyboard, adoption of Jira workflows, dashboard for metrics, Confluence.

    • Test documentation and quality Index of the Test cycles.

  2. Embrace standard release cycles (Both Platform & Features)

    • Fixed release cycles for the platform (Every X Sprints)

    • Delivery Team Release plan (Every Y Sprints)

  3. Process standards through Definition of Done and workflow

  • Quality Gates and DoD at every phase of the Development cycle.

    1. Make DoD Essential: To complete the sprint on time, it is essential to follow the DoD in each phase of the workflow. Walk through the DoD for each PBI (product backlog item) to make it transparent to team members and stakeholders; this establishes a shared understanding and mutual trust between the development teams, product owners and other stakeholders.

    2. DoD Checklist: Use the DoD as a checklist for each PBI. Proceed to the next step only after the complete checklist has been satisfied.

    3. Make DoD Task-Oriented: Create a specific task for each DoD element to keep the focus on DoD items. An expanded DoD checklist is easier to manage as individual tasks.

    4. Involve PO to Review DoD at Mid-Sprint: Develop a culture of showing PBIs to the PO mid-sprint. This keeps the PO informed about DoD status.

    5. Apply DoD with a Retrospective Approach: Always be ready to improve, and explore ways to make the processes more robust. Ask Scrum teams for the innovative practices that helped them, and assess whether those tips suit your project.

  • Test Automation coverage for CI/CD and release testing.

  • Every commit - CI Tests to validate Code Quality, Security & Vulnerability checks, Unit & Integration Tests, Contract Tests.

  • Functional automation to determine the build promotion across dev envs through UAT and PROD. 

  • Linking Story + Jira + Git + Jenkins for the transition transparency

  • Embrace the right release and change management, process and adoption.
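The DoD checklist idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the BacklogItem class, the issue key DIGIT-101 and the three checklist entries are hypothetical and are not part of any existing DIGIT tooling.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BacklogItem:
    """A product backlog item (PBI) together with its DoD checklist."""

    key: str  # hypothetical issue key, e.g. a Jira ticket id
    # Maps each DoD element to True once that element has been completed.
    dod: Dict[str, bool] = field(default_factory=dict)

    def pending(self) -> List[str]:
        """Return the DoD elements that are still open, in checklist order."""
        return [element for element, done in self.dod.items() if not done]

    def can_proceed(self) -> bool:
        """Allow the next workflow step only when every DoD element is done.

        An empty checklist is treated as not done, so a PBI cannot slip
        through a gate simply because nobody filled in its DoD.
        """
        return bool(self.dod) and not self.pending()


# Hypothetical PBI with one open DoD element.
story = BacklogItem(
    key="DIGIT-101",
    dod={"unit tests pass": True, "code reviewed": True, "UAT sign-off": False},
)
print(story.can_proceed())  # False
print(story.pending())      # ['UAT sign-off']
```

Treating each DoD element as its own entry mirrors the "task for each DoD element" suggestion: the list of pending elements doubles as the list of remaining tasks.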

 

The following sections discuss what the new release management looks like and how it addresses these challenges.

 

 

Solution Approach:

The release cycle framework has to change by adopting ART, XP, workflow and CI/CD practices. All of these practices relate to the development phase of the cycle and make the process more rapid, responsive and adaptive. An Agile Release Train (ART) aligns teams to a common business and technology mission. Each ART is a virtual organisation that plans, commits, develops and deploys together.

Principles:

  • The release schedule is fixed – The train departs the station on a known, reliable schedule, as determined by the chosen Program Increment (PI) cadence. If a Feature misses a timed departure, and does not get planned into the current PI, it can catch the next one.

    • Release Cycle 1: Product & Platform releases synchronised to a fixed release window.

    • Release Cycle 2: State-specific releases can run as separate trains with their own release windows.

  • Versioning and compatibility matrix

  • Sprint Cycles  – Each train delivers a new system increment every X weeks.

  • Sprint Demo - The System Demo provides a mechanism for evaluating the working system, which is an integrated increment from all the teams.

  • Release Synchronisation is applied – All teams on the train are synchronised to the same release length (typically 8 – 12 weeks) and have common Iteration start/end dates and duration.

  • Sprint velocity – Each train can reliably estimate how much cargo (new features) can be delivered in a Sprint.

  • Agile Teams – Agile teams embrace the ‘Agile Manifesto’ and Core Values and Principles. They apply Scrum, Extreme Programming (XP), Kanban, and other Built-In Quality practices.

  • Dedicated people – Most people needed by the ART are dedicated full time to the train, regardless of their functional reporting structure.

  • Sprint planning & estimation – The ART plans its work at periodic, largely face-to-face sprint planning events with the team (PO, Dev, QA), covering estimation, buffer, task breakdown and fast lane.

  • Inspect and Adapt – An I&A event is held at the end of every sprint. The current state of the solution is demonstrated and evaluated. Teams and management then identify improvement backlog items via a structured, problem-solving workshop.

  • Develop on Cadence, Release on Demand – ARTs apply cadence and synchronisation to help manage the inherent variability of research and development. However, releasing is typically decoupled from the development cadence. ARTs can release a solution, or elements of a solution, at any time, subject to governance and release criteria.
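The "fixed schedule" principle above (a feature that misses a departure catches the next one) reduces to simple date arithmetic. The sketch below is a hedged illustration: the function name next_departure, the 10-week cadence and the calendar dates are assumptions chosen for the example, not values from this document.

```python
from datetime import date, timedelta


def next_departure(first_departure: date, cadence_weeks: int, ready_on: date) -> date:
    """Return the first train departure on or after the day a feature is ready."""
    if ready_on <= first_departure:
        return first_departure
    cadence = timedelta(weeks=cadence_weeks)
    elapsed = ready_on - first_departure
    # Ceiling division: how many whole cadences are needed to reach ready_on.
    trains_after_first = -(-elapsed // cadence)
    return first_departure + trains_after_first * cadence


# Assumed example: trains leave every 10 weeks starting 1 January 2024.
first = date(2024, 1, 1)
print(next_departure(first, 10, date(2024, 1, 2)))  # 2024-03-11
```

A feature that is ready one day after a departure waits a full cadence, which is the trade-off the train model accepts in exchange for a predictable schedule.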

Approach:

1. Product: Start with an efficient PRD

User stories

The journey towards a good pull request starts with a well-written user story. It should be scoped to a single thing that a user can do in the system being built.

Breaking down user stories

User stories normally start out very generic and get broken down into smaller, more specific stories as a project progresses. Here’s an example of what a generic user story might look like:

“As a customer, I can log in to the administration system.”

It would probably take a developer more than a day to create a fully operational login system. How granularly it gets broken up is up to the team, but a user story should ideally require one day or less of development work. Here’s an example of what one part of this story might look like:

 “As a customer, I can input my username and password into a login form and click submit.”

 Other stories following up from this one could cover what happens when the user provides invalid input, or what happens after clicking submit. Many teams do this during sprint planning, so everyone on the team gets an overview of upcoming work.

Go vertical

There is one more thing worth mentioning about breaking up user stories that many teams get wrong. When slicing stories, they should be sliced vertically, rather than horizontally.

A vertically sliced story normally requires work on the entire stack: from the views, to any services/libraries, and to the database. A horizontally sliced story will only touch one of those. Consider our user story example from before. Here’s a poorly sliced, horizontal version of it.

“As a user, I can have my details stored in the database”

That barely passes as a user story in the first place, but teams often end up structuring the work around the stack, rather than the action being performed. As a result, it can take completing multiple stories to see any real customer-facing value being added.

By creating vertically-sliced stories, the team has a clear picture of what specific tasks need to be completed to add value for the end-user, and the tasks are more discrete and easier to keep track of.

User story dependencies

Ideally, no user story should depend on another user story being done. That way, all work can be done in parallel. In reality, stories depend on each other quite often. It’s important to document which stories are dependent on other stories.

When working with a series of stories that depend on each other, it’s common to end up with a “tower” of pull requests, where developers end up with a series of nested feature branches. The easiest solution to this is to merge the first dependent story before the others are finished. Things don’t always run that smoothly though.

Consider these example branches with three levels of nesting.

master  < branch1 < branch2 < branch3

If changes are made in branch1 after branch3 is made, then branch2 has to be rebased on branch1, and then branch3 has to be rebased on branch2. Once branch1 is rebased on master, difficult merge conflicts can start happening as well.

Ideally this would be resolved from the bottom to the top. Merge branch3 into branch2, then branch2 into branch1 and then finally into master. However, usually this will get merged from old to new, where branch1 will get merged first, then branch2 is rebased on master.

To avoid dealing with situations like this, avoid dependencies whenever possible and when they happen, try to merge the pull requests quickly so that work can continue. In teams where pull requests get dealt with almost immediately, this is not a problem, but if that’s not the case it might be better to notify the team about prioritizing the pull request as it will need building on.

Estimating stories accurately

Many businesses struggle with this, but it can be vital for planning. We’ve already covered the key element in getting better at this: breaking down stories into something that can be done in a day or less. But that raises the question: how do we know if a story takes a day or less?

A good method for this is “planning poker”. Before or during sprint planning, have each developer independently vote on how long a story will take. The median of the votes should give a reasonably accurate estimate of how long the story will take. If it’s more than a day, break the story down and try again.

It can take some trial and error to get good at this. Having developers use time tracking to record the time spent on each story is a good way to measure the real effort and validate estimates. Before the next planning poker, review how far off the previous estimates were and use that as a reference to adjust future estimates.
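The planning poker arithmetic described above can be sketched as follows. This is a minimal illustration, assuming votes are given in hours and that "a day" means 8 working hours; both are assumptions, and the function names are hypothetical.

```python
from statistics import median

WORKDAY_HOURS = 8  # assumption: one day of development work


def poker_estimate(votes_hours):
    """Return the median of the independent votes as the story estimate."""
    return median(votes_hours)


def needs_split(votes_hours, workday_hours=WORKDAY_HOURS):
    """A story estimated at more than one day should be broken down further."""
    return poker_estimate(votes_hours) > workday_hours


def estimation_error(estimated_hours, actual_hours):
    """Relative error of an estimate; positive means the story was underestimated."""
    return (actual_hours - estimated_hours) / estimated_hours


print(poker_estimate([3, 5, 8, 5, 13]))  # 5
print(needs_split([8, 13, 13]))          # True
```

The median is used rather than the mean so that a single very high or very low vote does not drag the estimate. The estimation_error helper supports the review step: tracked actual time is compared with the earlier estimate before the next planning session.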

 

 

 

2. Engineering: Embrace eXtreme Programming:

Write clear tests

Writing good tests has many benefits. One of them is to make sure the code does what it should. If there are good tests that pass, chances are that the code works. However, this is not a replacement for running code before submitting it to a pull request. Running code in a QA environment can reveal bugs caused by the environment being different, or something that wasn’t tested for.

Tests as documentation

There is another big reason to write tests: tests double as documentation. An experienced developer will look at a user story and start by writing an acceptance test for it. This is another reason good user stories are needed, so developers can write good tests and ultimately good pull requests.

When another developer comes along to review the code someone wrote, it can be helpful if they can look at the tests to see what this pull request covers. Many testing frameworks, such as RSpec, also allow the test output to print out every expectation to state what is happening. Consider the following acceptance test example from RSpec.

RSpec.feature "Widget management", :type => :feature do
  scenario "User creates a new widget" do
    visit "/widgets/new"
    click_button "Create Widget"
    expect(page).to have_text("Widget was successfully created.")
  end
end

Looking at this, the developer reviewing the code will immediately see that this pull request includes changes allowing the user to create widgets. The output from running this test can be seen below. Failed tests would be displayed in red.

Widget management

   User creates a new widget

When a pull request is being code reviewed, this also allows the developer reviewing the code to double check if all the requirements of the user stories have been met and tested or if the scope of this pull request goes outside of the original user story.

Unit testing in Development Phase

The rule of thumb when testing in the development phase is that the person who writes the code tests the code. However, the commitment goes beyond writing tests: any test written must be able to run repeatedly in an automated testing environment well after the code has left the developer’s hands. Apply test-first practices, including Test-Driven Development (TDD) for unit tests and Behaviour-Driven Development (BDD) for automated acceptance tests. Code coverage reporting shows which lines of code have been exercised by tests.

Code comments

Sometimes our code needs a little help explaining why we are doing something. This can often be resolved by writing descriptive method/function and variable names to make the code read more like spoken language. Another popular solution is to use code comments to explain what the code is doing.

Some teams encourage liberal use of code comments, whereas other teams consider them a code smell. When looking to add a comment to explain code, it’s a good rule to first consider changing the code to be more self documenting. Failing that, adding a comment will probably help the next person who reads the code.

For example, instead of naming a method rider and adding a comment to explain which rider it returns, we could name it rider_with_least_ride_time. The comment is then no longer required because the intent is clear.
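A minimal sketch of the naming point, in Python. The Rider type and its fields are hypothetical and exist only for illustration; only the method name rider_with_least_ride_time comes from the text.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Rider:
    name: str
    ride_time_minutes: int


# Before: a vague name would need a comment such as
# "returns the rider who has spent the least time riding".
# After: the name states the intent, so the comment can go.
def rider_with_least_ride_time(riders: Iterable[Rider]) -> Rider:
    return min(riders, key=lambda rider: rider.ride_time_minutes)


riders = [Rider("Asha", 42), Rider("Ben", 17), Rider("Chen", 30)]
print(rider_with_least_ride_time(riders).name)  # Ben
```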

Code comment tags

It might be more descriptive to use code comment tags such as TODO or FIXME to explain the nature of a comment, and why it was left in the first place. If a comment was chosen to explain the code, prefixing the comment with a TODO to improve the code explains to the reviewer and anyone who arrives at it in the future that this needs improving.

Write good commit messages

Writing good commit messages allows others to go through the history of the branch and get an overview of what was done. This requires small and frequent commits. A commit should ideally include a complete, isolated change. This enables developers to safely cherry pick or roll back commits as needed. That’s difficult or impossible to do with big commits, or lots of “work in progress” commits that are incomplete.

When writing commit messages, always stick to the team’s conventions above anything else. To write useful commit messages, it’s good to prefix them with a verb, such as “add”, “remove”, or “fix”, followed by what was done. Keep it short and simple and don’t be scared to use the commit body to provide more information. Here are some examples.

   add login page for administration panel

   change route to point to new login page

   remove old login page
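A convention like this can be checked automatically, for example from a commit hook or a CI step. The sketch below is one possible check, not an established tool: the allowed verb list and the 72-character limit are assumptions that a team would replace with its own conventions.

```python
ALLOWED_VERBS = ("add", "change", "remove", "fix")  # assumption: team-specific list
MAX_SUBJECT_LENGTH = 72  # assumption: a commonly used soft limit


def check_commit_subject(subject):
    """Return a list of problems found in a commit subject line (empty if none)."""
    words = subject.split()
    if not words:
        return ["subject is empty"]
    problems = []
    if words[0].lower() not in ALLOWED_VERBS:
        problems.append("subject should start with one of: " + ", ".join(ALLOWED_VERBS))
    if len(subject) > MAX_SUBJECT_LENGTH:
        problems.append("subject is longer than %d characters" % MAX_SUBJECT_LENGTH)
    return problems


print(check_commit_subject("add login page for administration panel"))  # []
print(len(check_commit_subject("wip")))                                 # 1
```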

Some teams prefer to squash a branch into a single commit before merging. The good thing about this is that merge conflicts become much easier to deal with and the commit history is much more succinct. However, this removes a lot of information. If a team is good at doing small, frequent pull requests this can work well.

Spend more time on development and less time on version control. To make the most of GitOps, however, developers are expected to commit directly to the main branch, or merge changes from their local branches into it, at least once a day.

The importance of good pull requests

Having a culture of writing good pull requests within a team can make a big difference in productivity. If pull requests are small, frequent, and easy to review and test, they will be opened and merged quickly.

On the other hand, having infrequent, big pull requests that bounce back and forth between developers, reviewers, testers and product owners can slow progress down significantly, causing developers to waste a lot of time dealing with merge conflicts and putting out tiny fires everywhere. 

 

GitHub Action and Workflows:

We’ll create two workflows:

  1. Build workflow - is for daily builds, triggered on push to the master branch. It checks out the code, runs unit tests, uploads code coverage reports, and builds a release snapshot. Each release snapshot is an application binary, plus the Docker image containing that binary. The Docker image is tagged with the Git commit hash and then scanned for vulnerabilities. The scan fails the build if any critical vulnerabilities are found.

  2. Release workflow - is for release builds. Whenever you create an annotated tag on a release branch or a state branch, the Release workflow is triggered to run unit tests and build the release binary and the Docker image. This time, the Docker image is tagged with the release version. If everything is okay, the build artefacts are uploaded to the GitHub repository’s releases page and to GitHub Package Registry. After a successful push to the registry, the image is scanned to make sure that the released containers are not vulnerable.

As you’ll see, whether we build a snapshot or a release it’s very easy to configure and detect vulnerabilities early on in the development stage. This gives you more time to fix security issues by upgrading or replacing vulnerable dependencies.
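The tagging and gating rules shared by the two workflows can be summarised in a short Python sketch. It is not the workflow definition itself, and it does not call any real scanner: the function names and the list-of-severity-strings input format are assumptions made for illustration.

```python
from typing import Iterable, Optional


def image_tag(commit_sha: str, release_version: Optional[str] = None) -> str:
    """Snapshot images are tagged with the commit hash, release images with the version."""
    return release_version if release_version else commit_sha


def scan_passes(finding_severities: Iterable[str]) -> bool:
    """Fail the build as soon as any finding is of critical severity.

    finding_severities is assumed to be the severity label of each finding
    reported by whichever vulnerability scanner the pipeline uses.
    """
    return not any(severity.upper() == "CRITICAL" for severity in finding_severities)


print(image_tag("3f2a9c1"))            # 3f2a9c1
print(image_tag("3f2a9c1", "v2.4.0"))  # v2.4.0
print(scan_passes(["LOW", "HIGH"]))    # True
print(scan_passes(["critical"]))       # False
```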

Build as little as possible:

Eliminate any practices where source code is built multiple times. Even if the software has to be built, packaged, or bundled, you should execute that step only once and promote your binaries.

Pull request creation process

Most teams have some sort of pull request creation process to organize pull requests and make sure everything is documented. An example could be as follows.

  1. Run the code manually to make sure it works

  2. Make sure tests document all changes

  3. Add tags to the pull request

  4. Update status of user story and link to pull request

  5. Add reference to user story in pull request

  6. Update the change log if not done automatically

  7. Request a review from someone in a non-disruptive manner

Security first approach:

Distinguish between internal and forked Git repositories, and scan for vulnerabilities at every possible step as part of the CI/CD ritual. Manage infrastructure and deployment secrets through effective encryption systems.

Shift Left Testing:

In an Agile world, there is a constant need to move faster. Typically, this means shortening delivery time while continuing to improve quality with each successive release. At the same time, there is always a need to minimize testing costs. The adoption of Agile development requires that testers with different skill sets become involved in testing. Since Agile work products are built in short sprints (iterations), developers must also be involved in testing as early as possible. To test properly, testers need to work with product owners and developers so that they can prepare as early as possible to test effectively. This quality assurance movement has become known as “shifting left”.

Survey results showed that the more automated tests a team has, the less satisfied its testers are with their testing process. Some of it could be due to the tedious, time-consuming work of analysing failures of “flaky” regression tests and trying to fix them.

Despite the clear lean towards certain QA activities, when correlated against customer happiness, % increase of bugs found after release, and deployment schedule slip, there was no indication that any type of QA activity had more impact on improving those metrics than another.

 

 

Test automation, which processes and tests to automate first:

While an incremental approach to automation sounds good, teams transitioning from manual to automated processes often find it difficult to decide which processes to automate first. For example, it is beneficial to automate code compilation first. As developers commit code on a daily basis, it makes sense to automate smoke tests. Unit tests are usually automated first to reduce the workload on developers.

Next, we can automate functional testing, followed by UI testing. Functional tests do not usually require frequent updates to the automation scripts, unlike UI tests, which change more often. The main idea is to think about all possible dependencies and evaluate their impact to prioritise automation sensibly.

Tracking and version control tools:

Use a version control system to create a ‘single source of truth’, track changes in the code base, and roll back whenever required.

 

 

Release often - Canary, Alpha, Beta:

Frequent releases are only possible if the product is in a release-ready state and we have tested it in a production-like environment. That’s why the best practice is to add a deployment stage which closely resembles the production environment before the release. Some release best practices include:

  • Canary deployment. Releasing to a subset of users, testing with that base and rolling it out to the wider population if successful (or rolling it back for iteration if it’s not).

  • Blue-green deployment. You begin with two identical production environments. One is live in production. The other is idle. When a new release is rolled out, the changes are pushed to the idle environment. Then they switch: the environment containing the new release becomes the live environment. If something goes wrong, you can immediately roll back to the other environment (the one that does not contain the new release). If all is well, the environments are brought to parity once more.

  • A/B Testing. A/B testing is similar in flavour to, but not to be confused with, blue-green deployment. A/B testing is a way of testing features within the application for things like usability. The better performing variant of the feature wins. This is not a release methodology.
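For the canary deployment described above, one common way to pick the "subset of users" is deterministic bucketing: hash a stable user identifier and compare it with the rollout percentage. The sketch below assumes string user ids and a percentage-based rollout; the function name in_canary is hypothetical.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the canary slice.

    The user id is hashed so that the same user always lands in the same
    bucket (0-99), and raising the percentage only ever adds users.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


users = ["user-%d" % n for n in range(1000)]
canary_users = [u for u in users if in_canary(u, 10)]
print(len(canary_users))  # roughly 10% of the 1000 hypothetical users
```

Rolling back is a matter of setting the percentage to 0, and a full rollout is 100. Because the buckets are stable, users who saw the canary at 10% still see it at 50%.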

Use on-demand testing environments:

We should consider running tests headless in containers, as this approach allows the quality assurance team to reduce the number of environment variables and differences between the development and production environments. The primary advantage of using such ephemeral testing environments is that they add agility to our CI/CD cycle. The QA team does not have to pull a build from a CI server and install it in a separate testing environment; instead, it can run tests against a container image. It is much easier to spin up containers (they have no separate installation or configuration requirement) and to destroy them when they are no longer needed.

 

Deployment Strategies

 

Release Rituals:

  • Collaborate with the technology-centric teams using similar cadence structures and alignment to shared objectives

  • Collaborate with the Product Owner to create and refine user stories and acceptance criteria

  • Participate in Sprint Planning and create Iteration plans and Team Sprint Objectives

  • Estimate the size and complexity of their work

  • Conduct research, design, prototype, and other exploration activities

  • Determine the technical design in their area of concern, within the architectural guidelines

  • Develop and commit to Team Sprint Objectives and iteration goals

  • Use pairing and other practices for frequent review

  • Apply test-first practices including Test-Driven Development (TDD) for unit tests and Behaviour-Driven Development (BDD) for automated acceptance tests.

  • Implement and integrate changes in small batches with static code analysis (SCA) and static analysis security testing (SAST) on CI.

  • Create the work products defined by their features

  • Test the work products defined by their features

  • Deploy the work products to staging and production

  • Support and/or create the automation necessary to build the continuous delivery pipeline.

  • Embrace efficient PR process, branching strategy, Code Quality checks.

  • Enable agile teams to operate on a DevOps model in which automation lets Dev and QA carry out their respective operational tasks.

  • Enhance and mandate the UAT Signoff from the POs along with supported automated testing.

  • Continuously improve the team’s process


Release Managers’ Functions:

With the incorporation of CI/CD into the development and release of software, release cycles have become continual. This has brought about a significant shift in release managers’ responsibilities, including those discussed here.

Shift from Linear Phase to Operations

In the past, release managers would usually supervise the linear phases (planning, development and testing) that culminate in a final release. Now, with the advent of CI/CD, release managers are responsible for the operational side of things. For instance, a modern release manager has to ensure a smooth flow of continuously integrated code into production.

More Fronts to Deal With

Modern release managers are no longer limited to development and quality assurance. They have to work on several fronts. Because of continuous delivery, releases are now expected to be fast. Hence, release managers need to work with sales and support, management/marketing, and customer feedback to ensure the required changes make it into the next release.

Modern Release Manager as a QA Manager

Before the new paradigm of CI/CD, release managers relied solely on reports from the quality assurance (QA) manager to determine and assess release quality. With more automation involved in software development, the tasks of QA managers have shifted to release managers. With the help of layered test automation suites, a release manager now has to assure the quality of a release from development to production.

Release Management Ensures Swift Decision-Making

In the past, a whole board of decision makers would decide whether or not to release the software product. But with the fast and never-ending cycle of releases, this collective decision-making slowed the release process down. Therefore, release managers are now required to produce a decisive final report that enables even a single stakeholder to call the shots.

Release managers today must have comprehensive knowledge of CI/CD protocols and DevOps auto deployment tools. They must understand how the CI/CD pipeline, which is central to the quick release cycle, works, and be able to identify defects early. In addition, release managers must understand:

  • Feature Toggles: A widely used technique that enables developers to change system behavior without altering the code, which helps expedite the release cycle.

  • Branch Handling: Used for parallel development of different patches of software.
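A feature toggle can be as simple as a flag read from configuration at run time. The sketch below reads flags from environment variables; the FEATURE_ prefix, the accepted truthy values and the flag name new_login are assumptions chosen for the example rather than an existing DIGIT convention.

```python
import os
from typing import Mapping, Optional

TRUTHY = ("1", "true", "on", "yes")


def feature_enabled(name: str,
                    environ: Optional[Mapping[str, str]] = None,
                    default: bool = False) -> bool:
    """Return whether a feature is switched on, without changing any code.

    The flag is looked up as FEATURE_<NAME> in the given mapping, or in the
    process environment when no mapping is passed.
    """
    env = os.environ if environ is None else environ
    raw = env.get("FEATURE_" + name.upper())
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


# Hypothetical configuration for one state deployment.
config = {"FEATURE_NEW_LOGIN": "true"}
if feature_enabled("new_login", config):
    print("serving the new login page")
else:
    print("serving the old login page")
```

Because the flag lives in configuration, the same build can be promoted to every environment and the behaviour switched per state or per environment.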

CI/CD is still in its nascent phase and therefore faces issues of insufficient management and infrastructure. Because of these shortcomings, some naysayers also question its viability. It becomes, then, a shared responsibility between all departments—including release management—to address the concerns and meet the challenges in a better way.

In this backdrop, below are some important points integral to the release management role.

  • Quality of release must not suffer for the sake of agility.

  • An appropriate change notification system must be implemented so every stakeholder knows which patch is about to be released and when.

  • With release becoming more of a continuous process than an event, it is important for release management to make sure that decision-making power resides with small, independent release teams, so that deployment runs smoothly.

Apart from the explicit responsibilities discussed above, release managers must have the interpersonal skills to implement and steer changes in the work culture, as well as to structure the hierarchy of the team. Furthermore, they should have a thorough grasp of the tools and processes that are necessary in a CI/CD working environment.

There is no doubt that the job of release manager has become more demanding with the introduction of CI/CD. But on the bright side, release management has become the linchpin of any software release that is developed in the environment of CI/CD.

Release KPIs for keeping the team on track:

  • Completed vs. Committed Stories: Always compare how many stories you committed to during sprint planning with the number identified as completed during your sprint review. This will make it easier for your team to better estimate its capabilities going forward.

  • Technical Debt Management: You can measure this a few different ways, but it usually involves the number of bugs found. Other known problems may also be included depending on your project.

  • Team Velocity: From one sprint to the next, you want to know how consistent your team is being. One easy way to do this is by comparing completed story points in the current sprint with those completed in previous sprints.

  • Communication: Open and honest communication is vital between the scrum master, product owner, members of your team, stakeholders, and customers. You need to pay close attention to ensure this is happening and step in immediately if it isn’t.

  • Adherence to Best Practices: This isn’t just about scrum rules, though those are certainly important. You should also be monitoring to ensure that your team is keeping to engineering best practices, as well.

  • Everyone’s Understanding of the Scope and Goal: Although there’s no doubt that it’s a subjective measurement (similar to monitoring for honest communication), it’s a good idea to check in and see how well your product and development teams and the customer understand the sprint stories and goal. The latter is typically aligned with the desired customer value that’s intended for continuous delivery and is objectively defined in the stories’ acceptance criteria. The best way to determine this is with day-to-day contact and interactions with the team. Processing customer feedback will help, too.

  • Portfolio Management: Demand streams are managed in a portfolio.

  • Release Calendar Management: IT Global Release Calendar defined and applicable to ALL Regions

  • Requirements Schedule: Upcoming and Future Releases scope is visible to all regional and global stakeholders.

  • Release Risk Cockpit: Manage global real-time exceptions within a single collaborative framework

  • Release Quality Progress: Test coverage readiness and test execution for each project’s release scope are monitored in real time.

  • Release Approval Readiness: Global release methodology, with unified readiness-requirement milestones.

  • Requirements Structuring: Lean requirements (Epics) decomposition, spanning over sprints, driven by Minimal Viable Product concepts. Close hand-shake with business owners.

  • Demand Efforts Assessment: Agile requirements are constantly improved during a release. Therefore, refining effort estimates against fixed capacity (the Scrum team is fixed) is crucial for a planned release schedule.

  • Requirements Design and Specs: Requirement Definition, Design, and Specs documents are managed as part of the Entity Lifecycle flow – unified view of content, development processes, and quality.

  • Risk Management: In Agile space, Test Plan Validation and Code Quality fixes are managed before moving to Test Execution.

  • Requirements Traceability Matrix: Test execution and defects are traceable within the Project and Release space. Use the RTM report to identify poor-quality items in a release scope for further action, such as descoping or postponing to the next release.
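The first and third KPIs in the list above (Completed vs. Committed Stories and Team Velocity) are simple ratios and differences. The sketch below shows one way to compute them; the function names and the sample sprint numbers are assumptions for illustration only.

```python
def completion_ratio(committed_stories: int, completed_stories: int) -> float:
    """Share of the stories committed at sprint planning that were completed."""
    if committed_stories <= 0:
        raise ValueError("a sprint needs at least one committed story")
    return completed_stories / committed_stories


def velocity_change(points_per_sprint) -> float:
    """Compare the current sprint with the average of the previous sprints.

    points_per_sprint lists completed story points, oldest sprint first.
    A positive result means the team completed more than its recent average.
    """
    *previous, current = points_per_sprint
    if not previous:
        raise ValueError("need at least one previous sprint to compare against")
    return current - sum(previous) / len(previous)


print(completion_ratio(10, 8))            # 0.8
print(velocity_change([20, 24, 22, 25]))  # 3.0
```

A completion ratio that stays well below 1.0 suggests the team is over-committing at sprint planning, while large swings in velocity_change point to inconsistent estimation or scope.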