
<!--- Comment: "Having Gov in the title may turn off some readers. Maybe a title something like "Case study adapting CI to a large-scale, complex government organization" would allow it to cross over to non-gov large orgs." Siqueira/Paulo/Fabio: We have not reached a final title yet, but we are close to converging. -->

title: "Continuous Delivery: building trust in a large-scale, complex government organization"

papersize: a4

geometry: "left=1in,right=1.5in"

Authors

Rodrigo Siqueira is a master's student at IME, the Institute of Mathematics and Statistics of the University of São Paulo. His research interests include software engineering, operating systems, and computer architecture. He worked for two years with the Brazilian federal government as a coach and developer. Contact him at siqueira@ime.usp.br.

Diego Camarinha is a master's student at IME, the Institute of Mathematics and Statistics of the University of São Paulo. His research interests include software engineering, computer networks, and source code metrics. He worked for two years with the Brazilian federal government as a senior developer. Contact him at diegoamc@ime.usp.br.

Melissa Wen is a software developer. She worked on the SPB project as a senior developer and also served as a professor of Computer Science at UFBA, the Federal University of Bahia. Her areas of interest include software engineering and open source software development. Contact her at melissa.srw@gmail.com.

Paulo Meirelles received his Ph.D. in Computer Science from the Institute of Mathematics and Statistics at the University of São Paulo. He is a full-time professor at the University of Brasília and coordinated the new SPB portal project. His research interests include free software, agile methods, static analysis, and source code metrics. Contact him at paulormm@ime.usp.br.

Fabio Kon is a Full Professor in the Department of Computer Science of the University of São Paulo. (...). His research interest areas are: (...). Contact him at fabio.kon@ime.usp.br.

Abstract

For many software development teams, the first aspects that come to mind regarding Continuous Delivery (CD) are the operational challenges and the competitive benefits. In our experience, CD was much more: it was a survival technique. This article presents how and why we applied CD in a Brazilian government project for the development of a Collaborative Development Environment (CDE), sharing the challenges we faced and the strategies we used to overcome them.

Introduction

<!--- Comment: "Generally, the piece comes across as part advertisement for CI and part challenges paper. The authors should figure out their message and make it punch." Siqueira/Paulo: In my view, the introduction makes our "punch" quite clear, and the rest of the text develops it well. -->

We worked on a three-year-long Brazilian government project to evolve an existing platform that had technical issues and lacked political support. This project, started in 2014, was a partnership between the Ministry of Planning, Budget, and Management (MP) and two public universities, the University of Brasília (UnB) and the University of São Paulo (USP), to modernize the Brazilian Public Software (SPB) portal.

With this partnership, the SPB portal (www.softwarepublico.gov.br) evolved into a Collaborative Development Environment[1], and this evolution brought significant benefits not just to the Brazilian government, but also to society as a whole. The government could minimize the bureaucracy and costs of software development, encouraging the use of the same set of applications across different government agencies. Society also gained a mechanism of transparency and collaboration, since anyone can check the government's expenses on software and contribute to project communities.

In this article, we discuss the use of Continuous Delivery (CD) during our experience as the academic partner in this project. We focus on how we managed to implement CD in a large institution with traditional values and how CD helped to build trust between the government and the university development team. CD enabled us to show our progress and earned the government’s confidence that we could adequately fulfill their requests, becoming an essential aspect of our interaction with them. Our experience suggests that using CD as a tool to build such trust relationships is yet another of its benefits[2].

Context

<!--- Comment: "Background/intro - This could be more streamlined and focused. Maybe center around a question such as - What is different between gov and non-gov context using your data as to illustrate then tie that to what you had to do in your CI process to address this gap." Paulo: The point was not gov versus non-gov. Let's clarify that the SPB was really a particular case, full of nuances. -->

The SPB is a governmental program of the MP created to foster sharing and collaboration on Open Source Software (OSS) development for the Brazilian public administration. For its projects, the MP managed both software requirements and server infrastructure. However, its hierarchical and traditional processes left its agents unfamiliar with newer software development techniques, such as CD. Any of our requests had to pass through layers of bureaucracy before being answered. Requesting access to their infrastructure to deploy the portal was no different.

During its lifetime, the project suffered significant interference from the board of directors because the portal represents an interface between government and society. In light of political interests, directors continually imposed changes to the platform while ignoring our technical advice. In 2015, the board of directors changed and, with it, the vision of the project. The new directors had different political agendas, which affected the project's previously approved requirements.

<!--- Comment: "The authors present 3 challenges in the intro and then go on to expand them. However, the expansion (line 21-43 in page 2), is has pints about all three challenges mixed together. It would be better I think to split it into one para for each challenge. " Siqueira/Paulo: Note that this point generated a lot of discussion and confusion. I changed the strategy a little so as to answer the reviewer and, at the same time, make the text more fluid. Melissa: Rephrase the statement of the first problem (DONE by Siqueira and Diego). -->

In this context, we overcame three distinct challenges: (1) finding a system solution on which the government and the development team agreed, (2) deconstructing the widespread belief among the government agents that any project in partnership with a university is doomed to fail, and (3) dealing with the bureaucracy involved in the MP's deployment process.

<!--- Comment: "Some more details about the project that was developed in terms of what the project did could be shared. Right now it just seems like some platform. I am not sure a platform that does what? And why were these 5 tools (Noosfero, Gitlab, Mailman, Mezuro and Colab) were integrated? What purpose did they each serve?" Siqueira/Paulo: We tell this story without having to go into details here. However, the pipeline section will need to be harmonized with it. -->

Firstly, to find a system solution on which the MP agents and the development team agreed, we designed the SPB portal as a CDE with additional social features. Due to the complexity of creating such a system from scratch, we decided to adapt and integrate existing OSS tools to build a system-of-systems [3]. We created a solution that orchestrated these tools and allowed us to provide a unified interface for end users, including single sign-on and global searches [4]. On top of that, the new SPB portal was an unprecedented platform for the Brazilian government, with a complicated deployment process.

<!--- Comment: "Can the authors provide some data as well? As I said this is a unique scenario and the community can greatly benefit. For example, how many devs at any given time, how many features/unit time (month/year), how frequently were releases made, how frequently did you meet with the client, how frequently did the requirements change are some example questions to which if we had data, it would place the study in greater context." Siqueira/Paulo: We show part of the data here, and further on we also talk about the double change of the board of directors. -->

<!--- Comment: " In page 3, the authors say 10 SW components were integrated, but only 5 were presented in Page 2. I see in Page 3/4 that the authors mean that they used the the tools in page 2 to manage the building of the SW. But that is not clear. This needs to be clarified. " Siqueira/Paulo: This is also resolved here. -->

Secondly, we had to face the widespread belief among MP agents that any project in partnership with a university is doomed to fail. Our team was not a typical company team: it consisted mainly of undergraduate students coordinated by two professors. In the first year, we had a group composed of 24 undergraduate students, one designer, and two senior developers. In 2015, our team grew to 36 students, two designers, and eight senior developers. In the end, due to budget constraints, we had 20 students, one designer, and two developers. On the government side, the SPB portal evolution was the first software development collaboration between a university and the government experienced by the MP agents involved in the project.

Lastly, our team thought about software deployment differently from the MP. On our side, we believed that frequent deliveries were better for the project’s success. The MP, however, worked with the idea of a single deployment at the end of the project. In other words, neither the bureaucratic structure of the MP nor its technical abilities were conducive to our style of work. Furthermore, there was little effort to deploy new versions of the system promptly. That ended up hampering the benefits of the tool and preventing us from showing the fruits of the project to those responsible for evaluating it.

<!--- "the article is missing the ah,ha moment. Was there something interesting that you learned (could be a small thing) that you use to adapt the CI process when applying it in a gov or large org context versus a smaller org? " Siqueira/Paulo: here is the AH-HA moment! -->

These challenges made our relationship with the MP agents tense, particularly in the first year, and alerted us to the fact that they could terminate the project at any time. The deployment limitation was the one substantial technical issue we could tackle in the short term. As a result, we worked to deploy one version of the project onto our own infrastructure and showed it to the government evaluators. This strategy proved to them that we could efficiently deliver new features and fulfill their expectations regarding the delivery of the requirements, and it prompted them to demand the latest version running on the MP infrastructure. These results, in turn, generated more pressure on the IT department responsible for the deployment routines. With each CD cycle, we gradually built a new relationship among all parties and, by the end of the project, we had become active participants in the deployment operations.

Our Continuous Delivery Pipeline

Deployment Pipeline

Figure 1 represents our CD pipeline. A new feature might require changes to more than one SPB integrated software project. Notice that each one of them could be modified independently. The pipeline started when new code arrived. As it went through each step, it was tested and improved until it finally reached the production environment. At this point, we would restart the pipeline to release more new code.
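As a rough illustration of this flow, the sketch below models the pipeline as an ordered list of stages that stops at the first failure. It is not the project's actual tooling: the stage names simply follow the subsections of this article, and `run_pipeline`, `example_stages`, and the placeholder callables are hypothetical names introduced here.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that reports success (True) or failure (False).
# The callables below are placeholders, not the project's real scripts.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order and stop at the first one that fails.

    Returns whether the whole pipeline succeeded and which stages completed.
    """
    completed: List[str] = []
    for name, step in stages:
        if not step():
            return False, completed
        completed.append(name)
    return True, completed


def example_stages(acceptance_ok: bool) -> List[Stage]:
    """Build the six stages described in this article (hypothetical outcomes)."""
    return [
        ("automated tests", lambda: True),
        ("release preparation", lambda: True),
        ("packaging", lambda: True),
        ("validation environment deployment", lambda: True),
        ("acceptance tests", lambda: acceptance_ok),
        ("production environment deployment", lambda: True),
    ]


if __name__ == "__main__":
    ok, done = run_pipeline(example_stages(acceptance_ok=False))
    print("reached production:", ok)
    print("completed stages:", done)
```

When a stage fails, nothing after it runs, which mirrors the idea that code only reaches production after passing every earlier step.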

Automated Tests

<!--- Diego: I kept the short explanation using Colab because I think it still makes sense and is well placed. If we decide to remove it, we will have to rethink the paragraph. -->

The SPB portal is a system-of-systems with several integrated software projects. Each of them, as well as the entire platform, had to be tested. Each software component has its own test suite. Colab (www.github.com/colab), a systems integration platform for web applications based on a plugin architecture, orchestrates communication between all of them. Therefore, we developed specific plugins for each portal software component, such as Gitlab (www.gitlab.com) and Noosfero (www.noosfero.org). Each plugin also has its own test suite, and this set of suites also worked as our integration tests.

Together, unit and integration tests gave us the confidence in performance and security that we needed to ensure the stability of the components and of the platform. If any test suite failed, through either a test error or coverage falling below a certain threshold, the process stopped. Only when all tests passed did the pipeline proceed to the step of preparing the release.
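The gating rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the article does not give the coverage threshold, so the `min_coverage` default of 80.0 is hypothetical, as are the `SuiteResult` type, the `blocking_suites` function, and the sample numbers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SuiteResult:
    name: str          # component or plugin whose suite was run
    failed_tests: int  # number of failing tests in the suite
    coverage: float    # measured coverage, in percent


def blocking_suites(results: List[SuiteResult], min_coverage: float = 80.0) -> List[str]:
    """Return the suites that stop the pipeline.

    A suite blocks the pipeline when it has any failing test or when its
    coverage is below the threshold (hypothetical default value).
    """
    return [
        r.name
        for r in results
        if r.failed_tests > 0 or r.coverage < min_coverage
    ]


if __name__ == "__main__":
    # Hypothetical results for three suites.
    results = [
        SuiteResult("colab", failed_tests=0, coverage=91.0),
        SuiteResult("noosfero-plugin", failed_tests=0, coverage=72.5),
        SuiteResult("gitlab-plugin", failed_tests=2, coverage=95.0),
    ]
    blockers = blocking_suites(results)
    print("proceed to release" if not blockers else f"stopped by: {blockers}")
```

An empty result means every suite passed and the pipeline may proceed to release preparation.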

Preparing a New Release

An SPB portal release was composed of the releases of all its software components. Each software component release was a Git tag that referred to a specific feature or bug fix. When all tests passed for a given component, we manually created a new tag for it. Therefore, a new tag on any software component yielded a new SPB portal release. More precisely, SPB had a script that produced a single release for the entire system based on the tag of each component. At the end of this process, we started packaging.
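One way to picture how a single portal release can follow from the component tags is sketched below. The article does not describe how the SPB script named or computed a release, so deriving an identifier by hashing the sorted tag manifest is purely an assumption for illustration; `portal_release` and the tag values are hypothetical.

```python
import hashlib
from typing import Dict


def portal_release(component_tags: Dict[str, str]) -> str:
    """Derive one release identifier from the current tag of every component.

    Assumption for illustration: the identifier is a short hash of the sorted
    "component=tag" manifest, so any new component tag yields a new release.
    """
    manifest = "\n".join(
        f"{name}={tag}" for name, tag in sorted(component_tags.items())
    )
    return hashlib.sha256(manifest.encode("utf-8")).hexdigest()[:12]


if __name__ == "__main__":
    # Hypothetical tags for three of the integrated components.
    before = {"colab": "v1.0", "noosfero": "v2.0", "gitlab": "v3.1"}
    after = dict(before, noosfero="v2.1")  # a bug fix tagged in one component
    print(portal_release(before))
    print(portal_release(after))
```

Because the manifest is sorted before hashing, the identifier depends only on which tags are present, not on the order in which the components are listed.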

Packaging

The platform runs on the CentOS 7 GNU/Linux distribution. Packaging software for that distribution takes three steps: writing the packaging script for the specific environment (RPM), building the package, and uploading it to a package repository.

We decided to create our own packages for each software component for the following five reasons:

  1. Not all of the software had been packaged by the community, and the packages that did exist were outdated;
  2. Packaging makes it easy to manage the software on a given distribution;
  3. Packaging simplifies the deployment;
  4. Packaging follows the distribution's best practices; and
  5. Packaging allows control over configurations and permissions.

After creating a new tag for a component, the developers informed our DevOps [5] team, and the packaging process began. A set of scripts fully automated the three packaging steps described above. Once all of them had run successfully, the new packages were ready to be used by our deployment scripts.
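The sketch below illustrates how the packaging work triggered by a new tag might be planned. It does not reproduce the project's scripts: `packaging_plan`, the step labels, and the sample tags are hypothetical, and the sketch only lists the three steps for each component whose latest tag has not been packaged yet.

```python
from typing import Dict, List, Tuple

# The three packaging steps named in this section, in order.
PACKAGING_STEPS = (
    "write RPM packaging script",
    "build package",
    "upload to package repository",
)


def packaging_plan(
    component_tags: Dict[str, str],
    packaged_tags: Dict[str, str],
) -> List[Tuple[str, str, str]]:
    """List (component, tag, step) for every component that needs packaging.

    A component needs packaging when its latest tag differs from the tag of
    the package already available in the repository.
    """
    plan: List[Tuple[str, str, str]] = []
    for name in sorted(component_tags):
        tag = component_tags[name]
        if packaged_tags.get(name) != tag:
            plan.extend((name, tag, step) for step in PACKAGING_STEPS)
    return plan


if __name__ == "__main__":
    # Hypothetical state: only colab received a new tag since the last packaging.
    tags = {"colab": "v1.1", "noosfero": "v2.0"}
    packaged = {"colab": "v1.0", "noosfero": "v2.0"}
    for component, tag, step in packaging_plan(tags, packaged):
        print(f"{component} {tag}: {step}")
```

Components whose packages are already up to date produce no entries, so an empty plan means there is nothing new to hand to the deployment scripts.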

Validation Environment Deployment

The Validation Environment (VE) was a replica of the Production Environment (PE), with two exceptions: only the government officers and project leaders had access to it, and all of its data was anonymized. To configure the environment, we used the configuration management tool Chef together with Chake (www.github.com/terceiro/chake), a serverless configuration tool created by our team. This kept the environments consistent and simplified the deployment process. Additionally, the packages we built in the previous step were readily available to the management tool.

The MP agents used the VE to validate new features and required changes. The VE was also used to verify the integrity of the entire portal as part of the next step in the pipeline.

Acceptance Tests

After we deployed a new SPB portal version in the VE, the MP agents were responsible for checking the features and bug fixes they had requested. If the MP agents identified a problem, they would notify the developers via comments on the SPB portal's issue tracker. The development team then fixed the problem, and the pipeline restarted from scratch. If everything was validated, we moved forward.

Production Environment Deployment

When the MP agents finished the VE check, we could finally begin the deployment in the PE. For this, we used the same configuration management tool, scripts, and package versions as in the VE. After the deployment was completed, both VE and PE were running identical software. This was the point at which new features and bug fixes finally became available to the end users.
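The claim that both environments end up running identical software can be checked mechanically. The sketch below is one hypothetical way to do so, by comparing the package versions reported by each environment; `environment_drift` and the version strings are illustrative names and values, not part of the project's tooling.

```python
from typing import Dict, Optional, Tuple


def environment_drift(
    ve_packages: Dict[str, str],
    pe_packages: Dict[str, str],
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {package: (ve_version, pe_version)} for every mismatch.

    A package missing from one environment is reported with None on that side.
    An empty result means both environments run identical package versions.
    """
    drift: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for name in sorted(set(ve_packages) | set(pe_packages)):
        ve_version = ve_packages.get(name)
        pe_version = pe_packages.get(name)
        if ve_version != pe_version:
            drift[name] = (ve_version, pe_version)
    return drift


if __name__ == "__main__":
    # Hypothetical package versions before the PE deployment has run.
    ve = {"colab": "1.1-1", "noosfero": "2.0-3"}
    pe = {"colab": "1.0-4", "noosfero": "2.0-3"}
    print(environment_drift(ve, pe))
```

Running such a comparison after the PE deployment and getting an empty result would confirm that the validated versions are the ones in production.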

Benefits

Research points out many advantages of CD usage in the industry[2, 6], such as accelerated time to market, building the right product, productivity and efficiency improvements, stable releases, and better customer satisfaction. Working with the government, we noticed the following additional benefits.

Strengthening Trust in Work Relationship with the Government

Continuous delivery was also a tool that helped to strengthen trust in the relationship between developers and MP agents. Before using CD, the MP agents had access to the features developed only at the end of the release, usually every four months.

With the implementation of CD, intermediate and candidate versions became available, allowing the MP agents to perform small validations over time. This constant monitoring of the development work reassured the MP leaders and improved their interactions with our development team.

Responsiveness to Change

Responsiveness was one of the direct benefits of adopting the CD pipeline. The ability to react quickly to changes requested by the MP agents was vital for the renewal of the project over the years. Every meeting with the MP leaders resulted in changes to requirements and priorities, several of them motivated by political needs. We observed that if we took too long to meet their demands, the MP would use undelivered requirements as a means to justify withdrawing financial support and ending the project.

CD helped us keep the production environment up-to-date, even with partial versions of a feature. That way, we always had something to show at meetings, which reduced the anxiety about getting the platform finished. For our team, it made the developers more confident that the project would last a little longer and that they would not need to look for other jobs.

Shared Responsibility

According to the MP process, the development team could not track what happened to the code after its delivery, since the MP agents were the only ones responsible for deployment. The implementation of CD made our development team feel equally responsible for what was getting into production and take ownership of the project.

Interestingly, the CD pipeline had the same effect on the MP agents. They became more engaged in the whole process, opening and discussing issues during the platform evolution. Additionally, developers worked to improve the CD pipeline to speed up the process of making new features available in the production environment for the MP agents' validation.

Synchronicity Between Government and Development

Despite the positive impacts that the CD pipeline brought to the project, its implementation was not smooth at first. The performance of the CD pipeline depended on the synchronicity between our development team and the MP agents: each side had to be prepared to start a step as soon as the other concluded the previous one. Initially, the agenda of the MP agents did not account for this, which generated delays in the validation of new features. This situation, combined with the governmental bureaucracy involved in releasing access to the production environment (up to 3 days), resulted in additional delays before the deployment step could begin. The problem eased when the MP agents realized the impact of these delays on the final product and decided to fit the reviews into their work schedule and to request access to production in time.

Challenges

By successfully building the CD pipeline, we improved the MP deployment process and kept the project alive. Here we map the lessons learned from this case.

Build CD From Scratch

Taking on the responsibility for implementing CD affected the whole team. Most of our team members had no experience with this approach, and we had few working hours available to allocate to building the pipeline. The construction and maintenance of the CD process were made possible by a few decisions that helped the project mature:

  1. Select the most experienced senior developers and some advanced software engineering students of the project to work on a specific team for DevOps. These senior developers used their experience in OSS projects to draft an initial proposal for the deployment process. The solution enabled us to automate the deployment, even though the process was still rudimentary.

  2. Rotate team members and encourage teammates to move to the DevOps team. The benefits of these movements were twofold: easing the transfer of knowledge between DevOps developers and feature developers, and evolving the process on-the-fly.

Building a CD pipeline was hard in the beginning. We believe that more tools to provide out-of-the-box standardized CD pipelines would be of great help for inexperienced teams. Tools that track each step of the pipeline and organize logs in a human-manageable way are necessary too.

Overcoming Mistrust

Taking an unfamiliar approach requires trust. In the MP, software is the product delivered at the end of a development contract. They expected and were prepared to validate and deploy a single delivery. Because the SPB portal is a system-of-systems, the steady growth of its complexity made large deliveries unsustainable. The long time needed to validate and approve developed features also gave the government room to change requirements and priorities. The CD approach was necessary, but how could we build trust and gain the autonomy to implement a process that was not yet part of the dynamics of the Ministry?

  1. Demonstrate actual results; do not simply tell. Initially, we did not have access to the MP infrastructure, so we created our own validation environment. Thus, we were able to follow the CD pipeline until the stage of production deployment, where we faced two problems. Our pace of intermediate deliveries to the government was faster than the deployment in production by the MP agents. Furthermore, specific issues of the MP infrastructure made some validated features not work as expected in the PE. That situation gave us arguments to negotiate access to the PE.

  2. Make our project management transparent and collaborative for the MP agents. By allowing the MP agents to follow our process for version deliveries and bug fixes, we showed them we were meeting our commitments. They started to interact more actively in the generation of versions and became part of the process. After understanding the process, the MP agents helped us in negotiations with the MP leaders. Finally, they created a VE as an isolated replica of the PE and gave us access to it.

  3. Gain the confidence of government agents. With the replica of the PE, we were able to run the entire pipeline and won the trust of the MP agents involved in the process. They saw the mobilization and responsiveness of our team in generating a new version package. They also recognized the quality of our packages and our deployment process. Finally, the MP agents realized that it could be beneficial for the project if they granted us access to the project infrastructure, both VE and PE.

<!--- Paulo: I think we need something tied to the AH-HA moment to close the text here. -->

Acknowledgements

We thank our colleagues, Nelson Lago, Lucas Kanashiro and Rafael Manzo, and this article's reviewers.

References

  1. G. Booch and A. W. Brown, "Collaborative Development Environments", Advances in Computers, vol. 59, 2003, pp. 1-27.
  2. L. Chen, "Continuous Delivery: Huge Benefits, but Challenges Too", IEEE Software, vol. 32, no. 2, 2015, pp. 50-54.
  3. C. B. Nielsen, P. G. Larsen, J. Fitzgerald, J. Woodcock, and J. Peleska, "Systems of Systems Engineering: Basic Concepts, Model-Based Techniques, and Research Directions", ACM Computing Surveys, vol. 48, no. 2, Article 18, 2015, 41 pages.
  4. P. Meirelles, M. Wen, A. Terceiro, R. Siqueira, L. Kanashiro, and H. Neri, "Brazilian Public Software Portal: an integrated platform for collaborative development", Proceedings of the 13th International Symposium on Open Collaboration (OpenSym '17), ACM, Article 16, 2017, 10 pages.
  5. J. Davis and K. Daniels, "Effective DevOps: Building a Culture of Collaboration, Affinity, and Tooling at Scale", O'Reilly Media, Inc., 2016.
  6. T. Savor, M. Douglas, M. Gentili, L. Williams, K. Beck, and M. Stumm, "Continuous Deployment at Facebook and OANDA", 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C), Austin, TX, 2016, pp. 21-30.
  7. J. Humble and D. Farley, "Continuous Delivery: Reliable Software Releases Through Build, Test, and Deployment Automation", Addison-Wesley Professional, 2010. <!--- This reference "8" really has only one page: evaluate whether we really need to use it. -->
  8. L. Chen, "Research Opportunities in Continuous Delivery: Reflections from Two Years' Experiences in a Large Bookmaking Company", 2015 IEEE/ACM 3rd International Workshop on Release Engineering, Florence, 2015, pp. 2-2.
  9. L. Chen, "Towards Architecting for Continuous Delivery", 2015 12th Working IEEE/IFIP Conference on Software Architecture, Montreal, QC, 2015, pp. 131-134.