Reinventing AI Research & Development: Part V

Through this uncertain time, as people all around the world adjusted to the reality of the COVID-19 pandemic, the AI R&D Operations Team continued to support new ideas and projects to help develop the best methods for internal communication and knowledge sharing — and to establish low-effort and high-value processes.

Our goal was to continue playing to our strengths, building off the work of previous Operations Team members, and setting up the AI R&D program for future success. As a testament to the results of prior efforts, WWT was selected as the Deep Learning AI Partner of the Year for the Americas for the third time by the Nvidia Partner Network (NPN) specifically because of our AI R&D Program.

In the last article of our Reinventing AI R&D series, we described progress in tracking R&D projects, branding outputs from the program and building out machine learning operations (MLOps) as a new capability area.

This article dives deeper into the primary focus points for this rotation, namely:

Maturing internal AI R&D team processes and operations.
Creating an MLOps proof of concept, a change ambassador group and a training platform.
Developing a new project and application of artificial intelligence.
Publishing new articles and insights onto the platform, leveraging our newly designed peer review process.
Next steps.

Maturing internal AI R&D Team processes and operations

Continuous improvement has been a tenet of the Operations Team since its founding in 2018. Over the past quarter, we iterated on three major processes to help improve the AI R&D program:

Collective new project ideation.
Redesigning the weekly meeting format.
Revamping the peer review process.

Collective new project ideation

In the past, projects within the R&D program began as an idea submitted via form by anyone interested in putting forward a proposal. Our submission form is straightforward, requiring merely a description of the initial idea and end result, as well as thoughts towards each of the following areas: innovation/novelty, new learnings, marketability, client benefits, feasibility and platform benefits.

We are interested in all internal ideas, each of which is considered by the Project Selection Panel (PSP) for official inclusion in the R&D Project Certified Backlog.

If the PSP greenlights a project idea, it is formally added to the certified backlog, which we use Trello to maintain. This serves as a tracker and allows projects to move through the stages shown above.

Each in-progress project is discussed every Thursday morning at the weekly AI R&D meeting, which is open to anyone within WWT and serves as an opportunity to keep the R&D community up to date, flag potential problems and discuss future plans and areas of interest. Once out of the Model Training stage, project members document their work on the platform and research potential internal or external applications.

Like the Operations Team, the PSP serves on a rotation scheme, and it decided to hold a project ideation session as its rotation came to an end. The purpose of this session was to brainstorm collectively with the broader R&D community about potential project ideas to submit, giving the new PSP team a pipeline of innovative, value-based ideas to evaluate.

Despite being limited to a purely virtual setting, the ideation session was nothing short of a smashing success. Using breakout rooms in virtual meetings and virtual whiteboarding tools, we were able to recreate the setting of a physical ideation session as closely as possible, and we encouraged the team to think divergently about use cases they were passionate about while staying client-driven and skillset-focused. The result was a shortlist of ideas with the potential to leverage the latest data analytics and science processes and technologies, all while developing our client-facing expertise and skillsets.

The team was then encouraged to think about how these ideas served the criteria mentioned above (innovation/novelty, new learnings, marketability, client benefits, feasibility and platform benefits) and was asked to present those ideas formally to the PSP at a later date.

The result of those presentations is not yet known, since the evaluation process is currently underway. We hope to share which ideas were selected and placed into the project pipeline in our next article.

Redesigning the weekly meeting format

The weekly AI R&D meetings have always been one of the most important drivers of new idea development for the AI R&D team. Given the growing attendance at these meetings from the broader WWT community, we decided it was time to improve their format. While the R&D team always has engaging content to share, the previous format was a bit disjointed, which made it harder for larger audiences to engage. As such, we decided to restructure the meetings to encourage more participation.

Perhaps the most observable change we made to the weekly meetings was our implementation of a standardized presentation deck that our viewers could easily follow. Instead of navigating through Trello boards and piecemeal slides to present updates and roadblocks on ongoing projects, R&D team members now place their updates and supporting visuals into a formalized presentation with a clearly defined agenda.

Providing an easy-to-follow structure has made R&D content more consumable and standardized for our audience, thereby encouraging more participation from the group.

A notable trademark of the new weekly meeting presentation format is the inclusion of a project spotlight that takes place in the first half of each meeting. One of our key inspirations for revamping the structure of the weekly meetings was to increase group engagement; the changes we made needed to reflect that goal beyond just the creation of a new PowerPoint template.

Placing a project spotlight at the beginning of each meeting was a foundational component of captivating our audience. In addition, it put a healthy pressure on the AI R&D project teams to be able to synthesize their ideas and present them more regularly.

While previous meetings started off with logistical updates followed by brief status updates on each ongoing project, the new project spotlight gives R&D members the opportunity to kick off the weekly meetings with a deep dive into the projects they are working on. Going beyond surface-level updates has, so far, generated unexpected levels of additional group participation, often to the point that project spotlights extend well beyond the thirty-minute period allotted to them.

Our hope moving forward is that increased audience participation encourages the type of knowledge sharing and collaboration between the AI R&D team and the greater WWT community that leads to tangible and formative idea creation and iteration for the program.

Revamping the peer review process

The R&D Program has always encouraged data scientists and other participants in R&D projects to document their work afterwards via an article or a whitepaper. And although each piece was carefully edited before publication, we decided to add some additional rigor and structure to the reviewing process. See the new and improved peer review process below:

Now as scientists are completing a draft of the piece for publication, project team members work with an Accountability Group to find peer reviewers for their work and ensure that it meets the highest standards for quality and accuracy. The Accountability Group includes members of the R&D Operations Team who keep the process running smoothly, as well as any additional volunteers interested in the peer review pipeline.

Instead of reaching out to everyone in the WWT AI R&D community for potential editors, authors are now connected with individuals associated with one of the five topic "buckets" below. We collectively decided on these buckets based on the skillsets and backgrounds of the people most involved in the R&D Program.

These subject matter experts have previously expressed interest in reviewing and editing papers coming out of the R&D Program, generally to sustain, enhance or share their knowledge. Due to the pace of publication and the number of interested people, each topic bucket expert is expected to review no more than a few articles or whitepapers per year. These peer reviewers have provided the additional reviews on the last three articles published from the R&D program, and we look forward to their involvement in many projects to come.

Creating an MLOps POC, an ambassador group and a training platform

Spinning up an ambassador group to promote training and learning

The AI R&D team is always looking for opportunities to get ahead of the curve in the AI/ML space, and the creation of an MLOps ambassador team is a testament to that ideal. Our previous article goes into great depth on what MLOps is, its growing demand in the industry and our efforts to build out an MVP applied to an existing project.

We decided to further mature these initiatives, as we recognize that most industries are trending towards increasingly advanced and complex data science. With this maturation, however, comes the consequential difficulty of scaling out and managing ML operations.

One of the biggest benefits of focusing on the internal productionalization of MLOps within our R&D team is that it not only serves to build out our internal capability, but also helps us gain hands-on experience, allowing us to leverage our understanding of change management, technical bottlenecks and areas of steep learning curves when going to market.

To ensure that MLOps received the attention it deserved, we decided to create a cross-functional ambassador group consisting of consultants, data scientists and engineers. The group helps build out our internal capabilities by experimenting with different MLOps platforms and tools, while also creating a knowledge-sharing platform to train interested data scientists. The primary role of an ambassador is to represent the voice of MLOps across the BAA team, ensuring that the AI R&D program leverages MLOps and spreading awareness across the team.

*Figure 1: Roles and responsibilities of the MLOps Ambassador Team*

Building out a proof of concept to gain hands-on experience with Kubeflow and TFX

We previously wrote about our experience applying TensorFlow to an already-published project in Image Classification of Race Cars. Our next step was to build a proof of concept from the ground-up, leveraging the power of MLOps through an end-to-end Kubeflow-based ML pipeline.

The aim was to build a POC that would serve as a more powerful search tool on the WWT homepage, taking in natural language inputs and returning the relevant articles and content for customers and employees navigating the website. As an additional challenge, we decided to leverage BERT-Small, a compact variant of Google's language model BERT (Bidirectional Encoder Representations from Transformers), whose base version has roughly 110 million trainable parameters. This complex combination of Kubeflow on AWS with a BERT-Small model stretched our team's thinking and allowed us to deeply understand the pros and cons of this novel environment.
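To make the search idea concrete, here is a minimal sketch of ranking articles against a natural-language query. It is not the POC's code: a simple bag-of-words count vector stands in for BERT embeddings, and the article snippets and the function names (`embed`, `cosine`, `search`) are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a BERT embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, articles, top_k=2):
    """Return up to top_k article titles ranked by similarity to the query."""
    q = embed(query)
    scored = [
        (cosine(q, embed(title + " " + body)), title)
        for title, body in articles.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop articles that share nothing with the query.
    return [title for score, title in scored[:top_k] if score > 0]


# Hypothetical article snippets, for illustration only.
ARTICLES = {
    "Intro to MLOps": "machine learning operations pipelines deployment monitoring",
    "Reinforcement learning for networks": "reinforcement learning agents reward data center networking",
    "Image classification of race cars": "image classification convolutional model race cars",
}

results = search("how do I deploy machine learning pipelines", ARTICLES)
print(results)
```

In a real system, `embed` would be replaced by a call to a trained language model, which can match a query to an article even when they share no exact words. The ranking step would stay largely the same.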

The visual below shows how MLOps played a role in the entire process, offering increased modularity and a decrease in the amount of compute required, especially during the ingest and data processing step of the pipeline.

*Figure 2: Kubeflow allowed us to traverse a smooth pipeline from start to end*
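One way a modular pipeline can reduce compute is by reusing the output of a step whose input has not changed. The sketch below illustrates that idea in plain Python. It does not use the Kubeflow API, and the article does not state that caching is what produced the savings in the POC; the step functions (`ingest`, `preprocess`, `train`) and the `run_step` helper are hypothetical.

```python
import hashlib
import json

CACHE = {}        # maps (step name, input hash) to that step's output
RUN_COUNTS = {}   # how many times each step actually executed


def run_step(name, func, payload):
    """Run one pipeline step, reusing a cached result for identical input."""
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    key = (name, digest)
    if key not in CACHE:
        RUN_COUNTS[name] = RUN_COUNTS.get(name, 0) + 1
        CACHE[key] = func(payload)
    return CACHE[key]


def ingest(source):
    """Hypothetical ingest step: pretend to load raw documents."""
    return [doc.strip() for doc in source["docs"]]


def preprocess(docs):
    """Hypothetical processing step: lowercase and tokenize."""
    return [doc.lower().split() for doc in docs]


def train(job):
    """Hypothetical training step: the 'model' is a capped, sorted vocabulary."""
    vocab = sorted({tok for doc in job["tokens"] for tok in doc})
    return vocab[: job["max_vocab"]]


def pipeline(source, max_vocab):
    """Chain the three steps; each one is cached independently."""
    docs = run_step("ingest", ingest, source)
    tokens = run_step("preprocess", preprocess, docs)
    return run_step("train", train, {"tokens": tokens, "max_vocab": max_vocab})


SOURCE = {"docs": ["MLOps at scale ", "Kubeflow pipelines at WWT"]}
first = pipeline(SOURCE, max_vocab=3)
second = pipeline(SOURCE, max_vocab=5)  # only the train step reruns
print(RUN_COUNTS)
```

Because each step is a separate unit with explicit inputs and outputs, changing only the training configuration leaves the ingest and processing results untouched, so they are not recomputed.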

Building out an accessible and comprehensive MLOps training platform

While our role as an AI R&D team is to expand our current day capabilities by stretching the limits of what new tools and technology we operate with, we were aware that a lot of these learnings can dissipate without regularly maintained and updated training resources for the rest of our team to leverage.

Moving to new methods of operation and learning can be a significant source of friction for any data science team, and increasingly so for those with a higher number of operational models and teams. Therefore, we decided that integrating foundational organizational change management (OCM) components into this broader rollout of MLOps within our team would make our gradual shift to these processes as streamlined as possible.

The creation of such a training platform and pathway for our data science team was a key push of the MLOps Ambassador group, especially with the organic interest we received from the broader BAA group as we showcased our POC.

Instead of boiling the ocean to collate a set of randomized training material, our process involved creating four personas that we defined as the primary interactors with MLOps tools in any organization.

*Figure 3: Four primary personas for training*

We then sought to create a set of expected skills and capabilities for each of these personas and mapped each one either to an existing or home-grown training resource, with a lot of learnings extracted from our development of the POC mentioned earlier in this article.

The data science team has been getting to "touch and feel" Kubeflow, KALE and MiniKF to try and get exposure to a variety of tools so that we are prepared as we start to develop a formalized offering.

Our current training platform consists of a Wiki, as well as a resource center available to all members across the BAA team, with relevant links to resources and a record of all the learning curves, road bumps and lessons our data science team comes across as they get hands-on experience with a variety of MLOps platforms. Below is a description of our data science team's current training efforts, which we hope to continually build and mature.

*Figure 4: The different training groups and their respective efforts so far*

Considering our development of internal MLOps capabilities, we have been engaged in some exciting business development opportunities with interested clients. This is evidence of the maturing market and our readiness to get ahead of the curve to respond aptly.

Our aim is to keep developing a formalized MLOps offering that supplements our current analytics and AI offerings and provides the degree of scale that more technologically mature organizations seek.

Project focus during our time on the AI R&D Ops rotation

Novel application of reinforcement learning

The R&D Operations team collaborates closely with WWT data scientists who are always welcome to work on R&D projects but who, when on rotation, work in pairs on a project chosen by the PSP.

During the past few months, multiple data scientist rotations tackled the same novel project, an exciting new application of reinforcement learning (RL) in the data center networking space. We were particularly excited about the research because of the way this project brings together different areas of WWT expertise, from core IT infrastructure to advanced analytics. The data scientist rotations made excellent progress, iterating through several different RL algorithms and reward matrices to find the most effective combination.
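For readers new to RL, the sketch below shows how a reward matrix can drive a simple agent. The article does not name the algorithms the team tried, so tabular Q-learning is used here purely as an illustration. The five-node topology, the reward values and the hyperparameters (`alpha`, `gamma`, `episodes`) are all hypothetical.

```python
import random

# Hypothetical 5-node network. REWARD[s][a] is the reward for forwarding
# from node s to node a; None means there is no link. Node 4 is the goal.
REWARD = [
    [None, 0, None, None, None],     # node 0 links to 1
    [0, None, 0, None, 100],         # node 1 links to 0, 2 and 4 (goal)
    [None, 0, None, 0, None],        # node 2 links to 1 and 3
    [None, None, 0, None, 100],      # node 3 links to 2 and 4 (goal)
    [None, None, None, None, None],  # node 4 is terminal
]
GOAL = 4


def valid_actions(state):
    """Next hops reachable from a node."""
    return [a for a, r in enumerate(REWARD[state]) if r is not None]


def train(episodes=500, alpha=0.5, gamma=0.8, seed=0):
    """Learn a Q-table by random exploration (tabular Q-learning)."""
    rng = random.Random(seed)
    n = len(REWARD)
    q = [[0.0] * n for _ in range(n)]
    starts = [s for s in range(n) if s != GOAL]
    for _ in range(episodes):
        state = rng.choice(starts)
        while state != GOAL:
            action = rng.choice(valid_actions(state))
            reward = REWARD[state][action]
            # No future value once the goal (terminal state) is reached.
            future = 0.0 if action == GOAL else max(
                q[action][a] for a in valid_actions(action)
            )
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = action
    return q


def greedy_path(q, start, max_steps=10):
    """Follow the highest-valued hop from each node until the goal."""
    path = [start]
    state = start
    while state != GOAL and len(path) <= max_steps:
        state = max(valid_actions(state), key=lambda a: q[state][a])
        path.append(state)
    return path


q_table = train()
route = greedy_path(q_table, 0)
print(route)
```

Changing the entries of `REWARD` changes which route the agent prefers, which is why iterating on the reward matrix alongside the algorithm matters.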

Now the problems to solve are more network engineering- and development-oriented, so the next data scientists on rotation will move on to experiment with different data sets and objectives. But there is more to come on this application of RL, so keep an eye out for future updates.

New publications

On top of the major project of the R&D Program over the last three months, the data scientists involved with the program also completed and documented two additional projects from the certified backlog. Their efforts show the benefit of opening up the slate of projects deemed interesting by the PSP, which gives the data scientists additional opportunity to learn and get a head start on experimenting outside of their project work.

The resulting papers were the first to work through the newly formalized peer review process and are also indicative of the increasing breadth of the R&D community across WWT, as more groups and areas of the company become interested in getting involved in the exciting R&D work.

What's next?

Another highlight of the past few months has been the number of new and exciting technologies coming to market, even in the midst of COVID-19. The R&D team would like to highlight the following technology in particular for its applicability to WWT and our partners.

It's time to admit something. This article was not really written by a person...

It was, in fact, written by several people! Though we are very excited about the possibility of custom-prompted and artificially generated prose that is indistinguishable from real human-created text, this article truly was written by the authors. As is clear from some lighthearted examples of fake content generated by OpenAI's new language model GPT-3 (here and here), we are closer than ever before to being able to speak to computers and have them answer.

This latest and even more advanced model, described in a recently released paper, builds off OpenAI's 2019 success using scale to create a production-ready generative pre-trained transformer (GPT) text generator. OpenAI's engineers again leveraged an astronomically sized training data set and billions more parameters for the latest edition, and the organization has opened an API access point to its GPT-3 model for experimentation by academic, commercial and individual collaborators.

The AI R&D Program has a surfeit of ideas and use cases ready to tackle with GPT-3's help, so stay tuned for future articles once we have hopefully been granted beta API access!

In the future, the R&D Team will continue to encourage new ideas and experimentation as we solidify the MLOps capability, push forward different RL applications and experiment with new technologies. We are excited to continue bringing more people from across WWT into the program and to establish and expand our external partnerships. The R&D Operations Team will continue developing processes and operations to enable data scientists and engineers to realize and publicize their ideas and ongoing efforts.