
<!--- Comment: "Having Gov in the title may turn off some readers. Maybe a title something like "Case study adapting CI to a large-scale, complex government organization" would allow it to cross over to non-gov large orgs." Siqueira/Paulo: We have not reached a final title yet, but we are converging. -->

title: "Continuous Delivery: a tool to build trust in a large-scale, complex Government organization"
papersize: a4

geometry: "left=1in,right=1.5in"

Authors

Rodrigo Siqueira is a master's student at IME, the Institute of Mathematics and Statistics of the University of São Paulo. His research interests include software engineering, operating systems, and computer architecture. He worked for two years with the Brazilian federal government as a coach and developer. Contact him at siqueira@ime.usp.br.

Diego Camarinha is a master's student at IME, the Institute of Mathematics and Statistics of the University of São Paulo. His research interests include software engineering, computer networks, and source code metrics. He worked for two years with the Brazilian federal government as a senior developer. Contact him at diegoamc@ime.usp.br.

Melissa Wen is a software developer. She worked on the SPB project as a senior developer and also served as a professor of Computer Science at UFBA, the Federal University of Bahia. Her areas of interest include software engineering and open source software development. Contact her at melissa.srw@gmail.com.

Paulo Meirelles received his Ph.D. in Computer Science from the Institute of Mathematics and Statistics at the University of São Paulo. He is a full-time professor at the University of Brasilia and coordinated the new SPB Portal project. His research interests include free software, agile methods, static analysis, and source code metrics. Contact him at paulormm@ime.usp.br.

Fabio Kon ...

Abstract

For many software development teams, the first aspects that come to mind regarding Continuous Delivery (CD) are the operational challenges and the competitive benefits. In our experience, CD was much more: it was a survival technique. This article presents how and why we applied CD in a Brazilian Government project for the development of a Collaborative Development Environment (CDE), sharing the unconventional challenges we faced and the strategies used to overcome them.

Introduction

<!--- Comment: "Generally, the piece comes across as part advertisement for CI and part challenges paper. The authors should figure out their message and make it punch." Siqueira/Paulo: In my view, the introduction makes our "punch" quite clear, and the rest of the text develops it well. -->

We worked on a three-year-long Brazilian government project to evolve an existing platform that had technical issues and lacked political support. In 2014, the Ministry of Planning, Budget, and Management (MP) initiated a project to modernize the Brazilian Public Software (SPB) portal in partnership with two public universities: the University of Brasilia (UnB) and the University of São Paulo (USP).

The SPB Portal (www.softwarepublico.gov.br) evolved into a Collaborative Development Environment (CDE) [1], and this evolution brought important benefits not just to the Brazilian government but also to society as a whole. The portal aims to minimize bureaucracy and costs by encouraging different government agencies to use the same set of applications. Society also gained a mechanism for transparency and collaboration, since anyone can check government spending on software and contribute to project communities.

In this article, we discuss the use of Continuous Delivery (CD) during our experience as the academic partner in this project. We focus on how we managed to implement CD in a large institution with traditional values and how CD helped to build trust between the government and the development team. CD enabled us to clearly show our progress and earned us the government's confidence that we could adequately fulfill their requests, becoming an essential aspect of our interaction with them. Our experience suggests that using CD as a tool to build such trust relationships is yet another of its benefits [2].

Context

<!--- Comment: "Background/intro - This could be more streamlined and focused. Maybe center around a question such as - What is different between gov and non-gov context using your data as to illustrate then tie that to what you had to do in your CI process to address this gap." Paulo: We will disagree with the reviewer. The point was not gov versus non-gov. We will clarify that the SPB was indeed a particular case, full of nuances. -->

The SPB is a program launched in 2005 to foster sharing and collaboration on Open Source Software (OSS) projects for the public administration. An SPB solution is considered a public good, and the Federal Government assumes some responsibilities related to its use. The first version of the SPB Portal became available in 2007, but since 2009 it has had several technical issues. These issues illustrate the consequences of hierarchical, traditional processes and of public agents' lack of expertise in software development, in particular among those from the Brazilian Ministry of Planning (MP).

<!--- Comment: "The authors present 3 challenges in the intro and then go on to expand them. However, the expansion (line 21-43 in page 2), is has pints about all three challenges mixed together. It would be better I think to split it into one para for each challenge. " Siqueira/Paulo: Note that this point generated a lot of discussion and confusion. I changed the strategy a little in order to answer the reviewer and, at the same time, make the text flow better. -->

The SPB evolution project started in 2014, a presidential election year, and everyone involved was under pressure to show results. Even with the re-election of the Brazilian President, the leadership of governmental agencies ended up changing in 2015. Each new leader had a different political agenda, which affected previously approved project requirements. Besides this instability, we had to overcome three distinct issues: (1) achieving the goals that guided the platform development, (2) a widespread belief among government agents that any project in partnership with a university is doomed to fail, and (3) the rudimentary and bureaucratic deployment approach in the MP infrastructure. Handling the interaction of these elements was challenging, and the unstable Brazilian political scenario only made things worse.

<!--- Comment: "Some more details about the project that was developed in terms of what the project did could be shared. Right now it just seems like some platform. I am not sure a platform that does what? And why were these 5 tools (Noosfero, Gitlab, Mailman, Mezuro and Colab) were integrated? What purpose did they each serve?" Siqueira/Paulo: We tell this story without having to go into details here. However, the pipeline section will need to be made consistent with it. -->

To achieve the SPB project goals, we designed the SPB Portal as a CDE with additional social features. Due to the complexity of creating such a system from scratch, we decided to evolve and integrate existing open source tools to build a system-of-systems [?]. We worked on a number of distinct systems: a social and collaboration network, a Git repository manager, a mailing list, a source code metrics evaluation tool, and a systems integration platform [OPENSYM]. We created a solution that orchestrated these systems and allowed us to provide a unified interface for end users, including single sign-on and global search. To do so, we had to learn how each system worked and come up with ways to integrate them as quickly as possible, with a team of mostly inexperienced developers.

<!--- Comment: "Can the authors provide some data as well? As I said this is a unique scenario and the community can greatly benefit. For example, how many devs at any given time, how many features/unit time (month/year), how frequently were releases made, how frequently did you meet with the client, how frequently did the requirements change are some example questions to which if we had data, it would place the study in greater context." Siqueira/Paulo: We show part of the data here, and further on we also talk about the two changes in the board of directors. -->

<!--- Comment: " In page 3, the authors say 10 SW components were integrated, but only 5 were presented in Page 2. I see in Page 3/4 that the authors mean that they used the the tools in page 2 to manage the building of the SW. But that is not clear. This needs to be clarified. " Siqueira/Paulo: This is also resolved here. -->

Our team was not from a typical company, but consisted mainly of undergraduate students. This led to a vast diversity and quick turnover of members. We had 42 different software engineering undergraduate students during the project: 24 in 2014, 36 in 2015, and 20 in 2016. We also had 2 designers and 6 senior developers from the Brazilian open source communities in charge of handling complex tasks and transferring knowledge to the undergrads. Finally, we had 2 professors responsible for interacting with the Brazilian government and controlling political pressures applied to the project.

Beyond these technical and organizational issues, we had to overcome a strong political bias and a relatively low budget. The project suffered significant interference from the board of directors because the SPB Portal represents an interface between government and society. In light of political interests, directors continually imposed changes to the platform while ignoring our technical advice. During 2015, Brazil faced a political crisis that impacted the SPB: the board of directors was changed twice and, with it, the vision of the project. New directors primarily focused on keeping their distance from previous administrations.

<!--- Paulo: the second problem was changed relative to the first version, but the explanation did not turn out very different. -->

The second barrier we had to face was the widespread belief among government agents that any project in partnership with a university is doomed to fail, which complicated their interaction with us. In particular, the SysAdmin team from the MP lacked the technical knowledge to deploy new versions of the SPB Portal in a timely manner, which hampered the benefits of the tool and prevented us from showing the results of the project to those responsible for evaluating it. The requirement analysts from the MP were the actual representatives of the Brazilian government in the project; their role was to test new features, provide feedback to the development team, and report to the MP leaders. The SysAdmins were in charge of managing the host machines on which the SPB Portal ran. They were, in theory, responsible for deploying the project, but the new SPB Portal was an unprecedented platform for the Brazilian government, which made the deployment process complex for them.

<!--- Paulo: the third problem is new relative to the first version, but in practice we had already addressed the issue -->

To overcome the deployment limitation, we realized we needed to take control of the deployment process. We used CD as a means to fulfill the government's expectations and to respond quickly to their requests, which were influenced most of the time by uncertainty about the project's continuity. We believed that doing so would keep the project alive, even in a politically unstable and technically complex scenario. For this reason, we focused on automating deployment and formed a team dedicated to it. This team was responsible for maturing our CD pipeline, giving us the confidence and agility to comply with government requirements.

<!--- "the article is missing the ah,ha moment. Was there something interesting that you learned (could be a small thing) that you use to adapt the CI process when applying it in a gov or large org context versus a smaller org? " Siqueira/Paulo: here is the AH-HA moment! -->

These challenges made our relationship with the government tense during the first year of the project. The political and technical issues also alerted us to the fact that the project could be terminated at any time if we did not overcome these problems within that first year. Of these problems, the deployment limitation was the only one we could realistically tackle in the short term. We therefore deployed a version of the project onto our own infrastructure and showed it to the government evaluators. This strategy proved to them that we could efficiently deliver new features and fulfill their expectations, and it enticed them to demand these features in production. This, in turn, put more pressure on the IT department responsible for the deployment routines. The effect compounded with each CD cycle, allowing us to gradually build a new relationship among all parties and, by the end of the project, to become active participants in deployment operations. CD kept the project alive for years during the worst political crisis since Brazil's re-democratization.

Our Continuous Delivery Pipeline

Deployment Pipeline

Figure 1 represents our CD pipeline. A new feature might require changes to more than one of the software projects integrated into the SPB Portal, each of which could be modified independently. The pipeline started when new code arrived. As the code went through each step, it was tested and improved until it finally reached the production environment. At this point we would restart the pipeline to release more new code.
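The flow described above can be read as an ordered list of stages in which any failure stops the run and sends the change back to the start. The Python sketch below is only an illustration under that reading: the stage names follow the section headings of this article, while `run_pipeline` and the boolean stage checks are hypothetical and do not reproduce the project's actual scripts.

```python
from typing import Callable, Dict, List, Optional

# Stage names mirror the section headings of this article; the function
# names and the boolean checks are hypothetical illustrations.
STAGES: List[str] = [
    "automated_tests",
    "release_preparation",
    "packaging",
    "validation_deployment",
    "acceptance_tests",
    "production_deployment",
]


def run_pipeline(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run the stages in order; return the first failing stage, or None."""
    for stage in STAGES:
        if not checks[stage]():
            return stage  # the run stops here; a fix restarts it from the top
    return None


# Hypothetical runs: one where every stage succeeds, one rejected by analysts.
all_green = {stage: (lambda: True) for stage in STAGES}
rejected = dict(all_green, acceptance_tests=lambda: False)
print(run_pipeline(all_green))
print(run_pipeline(rejected))
```

Returning the name of the failing stage, instead of a bare boolean, makes it clear where a change was sent back from, which matters when several components move through the pipeline independently.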

Automated Tests

The SPB Portal consists of more than 10 integrated software projects, and each of them, as well as the entire platform, had to be tested. Each software component has its own test suite. Communication between all components is orchestrated by Colab, a systems integration platform for web applications based on a plugin architecture. We therefore developed a specific plugin for each portal software component, such as Gitlab and Mailman. Each plugin has its own test suite, and together these suites also worked as integration tests.

Together, the unit and integration tests gave us the performance and security assurances needed to guarantee the stability of the components and of the platform. If any test suite failed, either because of a test error or because coverage fell below a certain threshold, the pipeline stopped. Only when all tests passed did the pipeline proceed to the release preparation step.
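The stopping rule can be made concrete with a small gate function. This is a minimal sketch, assuming each suite reports a failure count and a coverage percentage; the `SuiteResult` type, the component labels, and the 80% default are assumptions made for illustration, since the text only mentions "a certain threshold".

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SuiteResult:
    component: str   # e.g. a component or Colab plugin name (labels are illustrative)
    failures: int    # number of failing tests in the suite
    coverage: float  # test coverage, in percent


def gate_passes(results: Iterable[SuiteResult], min_coverage: float = 80.0) -> bool:
    """True only if every suite has no failures and meets the coverage floor.

    The 80.0 default is a placeholder, not the project's actual threshold.
    """
    return all(r.failures == 0 and r.coverage >= min_coverage for r in results)


healthy = [SuiteResult("colab", 0, 91.0), SuiteResult("gitlab-plugin", 0, 85.5)]
regressed = [SuiteResult("colab", 0, 91.0), SuiteResult("mailman-plugin", 0, 72.0)]
print(gate_passes(healthy))
print(gate_passes(regressed))
```

Treating a coverage drop like a failing test means that untested code cannot reach the release step, even when every existing test still passes.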

Preparing a New Release

An SPB Portal release was composed of the releases of all its software components. Each software component release was a git tag that referred to a specific feature or bug fix. When all tests passed for a given component, we manually created a new tag for it. A new tag on any software component therefore yielded a new SPB Portal release. More precisely, SPB had a script that produced a single release for the entire system based on each component's tag. At the end of this process, we started packaging.
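One way to picture "a single release based on each component tag" is a function that maps the current set of tags to one platform identifier. The sketch below is an assumption-laden illustration: the article does not say how the project's script named releases, so the hash-based identifier, the `spb-` prefix, and the tag values are all hypothetical.

```python
import hashlib
from typing import Dict


def platform_release(component_tags: Dict[str, str]) -> str:
    """Derive one platform release id from the tag of every component.

    Hypothetical scheme: hash the sorted "name=tag" manifest so that a new
    tag on any single component produces a new platform release id.
    """
    manifest = "\n".join(
        f"{name}={tag}" for name, tag in sorted(component_tags.items())
    )
    digest = hashlib.sha256(manifest.encode("utf-8")).hexdigest()
    return f"spb-{digest[:12]}"


# Illustrative tags only; these are not the project's real version numbers.
before = {"colab": "v1.2.0", "gitlab": "v8.5.1", "mailman": "v2.1.0"}
after = dict(before, colab="v1.2.1")
print(platform_release(before) != platform_release(after))
```

Sorting the manifest makes the identifier independent of the order in which components are listed, so the same set of tags always maps to the same platform release.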

Packaging

The platform ran on the CentOS 7 GNU/Linux distribution. Packaging software for that distribution has three steps: writing the build script for the specific environment (an RPM spec file), building the package, and uploading it to a package repository.

We decided to create our own packages for each software component for the following five reasons:

  • Not all of the software had been packaged by the community, and the packages that existed were outdated;
  • Packaging makes it easy to manage the software on a given distribution;
  • Packaging simplifies the deployment;
  • Packaging follows the distribution's best practices; and
  • Packaging allows configurations and permissions control.

After creating a new tag for a component, the developers informed the DevOps [3] team, and the packaging process began. The three packaging steps mentioned above were fully automated by a set of scripts. Once all these scripts ran successfully, the new packages were ready to be used by our subsequent deployment scripts.
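As a rough illustration of what automating the build and repository steps can look like, the sketch below assembles the commands for one component without executing them. It assumes the standard `rpmbuild` and `createrepo` command-line tools; the directory layout and the `packaging_commands` helper are hypothetical, and copying the built RPM into the repository directory is deliberately left out. It is not the project's actual tooling.

```python
from pathlib import Path
from typing import List


def packaging_commands(component: str, spec_dir: Path, repo_dir: Path) -> List[List[str]]:
    """Return the commands for one component as a dry-run plan (nothing is executed).

    Assumes the spec file (step 1) already exists at spec_dir/<component>.spec.
    """
    spec = spec_dir / f"{component}.spec"
    return [
        # Step 2: build the binary RPM from the spec file.
        ["rpmbuild", "-bb", str(spec)],
        # Step 3 (partial): regenerate the repository metadata after the
        # built package has been copied into repo_dir (copy step omitted).
        ["createrepo", str(repo_dir)],
    ]


# Hypothetical paths, printed instead of executed.
for command in packaging_commands("colab", Path("specs"), Path("repo")):
    print(" ".join(command))
```

Keeping the plan as data makes it easy to log or review the exact commands before handing them to something like `subprocess.run`.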

Validation Environment Deployment

The Validation Environment (VE) was a replica of the Production Environment (PE), with two exceptions: only government officers and project leaders had access to it, and all its data was anonymised. To configure the environment, we used the configuration management tool Chef together with Chake (serverless configuration for Chef). This setup kept the environments consistent, which simplified the deployment process. Additionally, the packages we built in the previous step were readily available to the configuration management tool.

Government agents used the VE to validate new features and required changes. Also, the VE was useful to verify the integrity of the entire portal as part of the next step in the pipeline.

Acceptance Tests

After we deployed a new SPB Portal version in the VE, the government agents were responsible for checking the features and bug fixes they had requested. If the requirement analysts identified a problem, they would notify the developers via comments on the SPB Portal's issue tracker. The problems were fixed and the pipeline restarted from scratch. If everything was validated, we moved forward.

Production Environment Deployment

Once the government had finished checking the VE and cleared the version for deployment, we could finally begin deploying to the Production Environment (PE). For this we used the same configuration management tool, scripts, and package versions as in the VE. After the deployment was completed, both VE and PE were running identical software. This was the point where new features and bug fixes were finally available to the end users.
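The claim that VE and PE end up running identical software can be checked mechanically by comparing the package versions installed in each environment. The sketch below assumes those versions are already available as plain name-to-version mappings; how they are collected, the `version_drift` helper, and the sample versions are hypothetical.

```python
from typing import Dict, Optional, Tuple


def version_drift(
    ve: Dict[str, str], pe: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return packages whose version differs between VE and PE.

    Each entry maps a package name to (VE version, PE version); None means
    the package is missing from that environment.
    """
    names = sorted(set(ve) | set(pe))
    return {n: (ve.get(n), pe.get(n)) for n in names if ve.get(n) != pe.get(n)}


# Illustrative versions only.
ve_packages = {"colab": "1.2.1", "gitlab": "8.5.1"}
pe_packages = {"colab": "1.2.0", "gitlab": "8.5.1"}
print(version_drift(ve_packages, pe_packages))
print(version_drift(ve_packages, ve_packages))
```

An empty result after a production deployment is the condition described in the text: both environments run the same package versions.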

Benefits

Research points out many advantages of CD usage in industry [2], such as accelerated time to market, building the right product, productivity and efficiency improvements, reliable releases, and better customer satisfaction. Working with the government, we noticed the following additional benefits.

Responsiveness to Change

Responsiveness was one of the direct benefits of adopting the CD pipeline. The ability to react quickly to changes requested by the government was vital for the renewal of the project over the years. Every meeting with government leaders resulted in changes to requirements and priorities, most of them motivated by political needs. We believed that if we took too long to meet their demands, the government would use undelivered requirements as a means to justify cutting financial support and ending the project. CD helped us keep the production environment up-to-date, even with partial versions of a feature. That way, we always had something to show at meetings, which reduced the anxiety to see the platform concluded. Within our team, it made the developers more confident that the project would last a little longer, so they would not go looking for other jobs.

Shared Responsibility

Initially, the development team could not track what happened to the code after delivery, since government technicians were the only ones responsible for deploying the project. The implementation of CD made developers feel equally responsible for what went into production and led them to take ownership of the project. Interestingly, the CD pipeline had the same effect on the team of requirement analysts. They became more engaged in the whole process, opening and discussing issues as the platform evolved. Additionally, developers worked to improve the CD pipeline to speed up the process of making new features available in the production environment for the analysts' validation.

Synchronicity Between Government and Development

Despite the positive impact that the CD pipeline had on the project, its implementation was not easy at first. The performance of the CD pipeline depended on synchronicity between developers and government analysts, so that the latter were prepared to start a step as soon as the former concluded the previous one, and vice versa. Initially, the governmental team did not account for this in its agenda, which delayed the validation of new features. This situation, combined with the governmental bureaucracy involved in releasing access to the production environment (up to 3 days), further delayed the start of the deployment step. The problem eased when the analysts realized the impact of these delays on the final product and decided to reserve time for the reviews in their work schedule and to request access to production in time.

Strengthening Trust in Work Relationship with the Government

Continuous delivery was also a tool that helped to strengthen trust in the relationship between developers and government analysts, as well as between the analysts and their superiors. Before we used CD, analysts had access to the features developed only at the end of a release, usually every four months. However, this periodicity did not meet the requirements of their leaders, who demanded monthly reports on the progress of the project. With the implementation of CD, intermediate and candidate versions became available, allowing analysts to perform small validations over time. The constant monitoring of the development work gave the government side greater assurance and improved its interactions with our development team.

Challenges

We successfully built a CD pipeline. In the end, we took over the deployment process from the government. That allowed us to survive in an unstable political scenario. However, we recognize that many challenges still need to be addressed by industry and academia together.

Build CD From Scratch

Taking on CD responsibilities had a significant impact on the team. We did not have the know-how and had little time to come up with a working pipeline. The senior developers were crucial at this point. They came up with an initial solution to get the team started. That already enabled us to automate deployment, even though the process was still rudimentary. We had to evolve our solution on the fly, and we dedicated a few developers to this task.

Building a CD pipeline was hard in the beginning. More tools that provide out-of-the-box standardized CD pipelines would be of great help for inexperienced teams. Tools that track each step of the pipeline and organize logs in a human-manageable way are necessary too.

Handling Inexperienced Teams

After the developers learned how CD worked, it was difficult to pass the knowledge along to other teammates. We tried to mitigate this problem by encouraging members to migrate to the DevOps team. We suggest further research on how to effectively spread knowledge among inexperienced developers in a high-turnover scenario.

Overcoming Mistrust

At the beginning of the project, we struggled with deployment issues in the government structure. We were in a paradoxical situation: the government demanded fast deliveries but would not give us access to their production infrastructure. After some interactions with government agents, they created the VE as an isolated replica of the PE in their own infrastructure. The government agents then realised that it could be beneficial for the project if they granted us access to part of the infrastructure. More research is required on development protocols and policies to improve the relationship between industry and government, especially regarding CD.

Acknowledgements

We thank our colleagues, Lucas Kanashiro and Rafael Manzo, and this article's reviewers.

References

  1. G. Booch, A. W. Brown, "Collaborative Development Environments", in Advances in Computers, vol. 59, 2003, pp. 1–27.
  2. L. Chen, "Continuous Delivery: Huge Benefits, but Challenges Too", in IEEE Software, vol. 32, no. 2, 2015, pp. 50-54.
  3. J. Davis, K. Daniels, Effective DevOps: Building a Culture of Collaboration, Affinity, and Tooling at Scale, O'Reilly Media, 2016.