Chapter 6 Customer churn and deep learning

drake is designed for workflows with long runtimes, and deep learning is a major use case. This chapter demonstrates how to use drake to manage a deep learning workflow. The original example comes from a blog post by Matt Dancho, and the chapter’s content comes directly from an R notebook that is part of an RStudio Solutions Engineering example demonstrating TensorFlow in R. The notebook is modified and redistributed under the terms of the Apache 2.0 license, copyright RStudio.

6.1 Packages

First, we load our packages into a fresh R session.
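The package listing is not reproduced here. A plausible set, inferred from the steps this chapter describes, is sketched below. The choice of rsample, recipes, and yardstick for splitting, preprocessing, and scoring is an assumption, not something the text confirms.

```r
library(drake)      # workflow management: drake_plan(), make()
library(keras)      # R interface to Keras and TensorFlow
library(tidyverse)  # dplyr, readr, tibble, and the pipe
library(rsample)    # train/test splitting (assumed)
library(recipes)    # preprocessing pipelines (assumed)
library(yardstick)  # model performance metrics (assumed)
```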

6.2 Functions

drake is R-focused and function-oriented. We create functions to preprocess the data; define a keras model, exposing arguments to set the dimensionality and activation functions of the layers; train a model; compare predictions against reality; and compare the performance of multiple models.
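As a rough illustration of the model-definition and training steps only, here is a minimal sketch. The function names `define_model()` and `train_model()`, the argument names, the layer sizes, the dropout rate, the optimizer, and the epoch count are all assumptions chosen for illustration. The sketch also assumes the predictors have already been converted to a numeric matrix and the response to a 0/1 vector.

```r
library(keras)

# Define a small feed-forward network for binary classification.
# Layer sizes and activations are arguments so that different
# configurations can be compared later. All defaults are illustrative.
define_model <- function(input_shape, units1 = 16, units2 = 16,
                         act1 = "relu", act2 = "relu", act3 = "sigmoid") {
  keras_model_sequential() %>%
    layer_dense(
      units = units1,
      activation = act1,
      input_shape = input_shape
    ) %>%
    layer_dropout(rate = 0.1) %>%
    layer_dense(units = units2, activation = act2) %>%
    layer_dropout(rate = 0.1) %>%
    layer_dense(units = 1, activation = act3)
}

# Train on a numeric predictor matrix and a 0/1 response vector.
# Extra arguments (units1, act1, ...) are forwarded to define_model().
train_model <- function(x_train, y_train, ...) {
  model <- define_model(input_shape = ncol(x_train), ...)
  # compile() and fit() modify the keras model in place.
  compile(
    model,
    optimizer = "adam",
    loss = "binary_crossentropy",
    metrics = "accuracy"
  )
  fit(
    model,
    x = x_train,
    y = y_train,
    batch_size = 32,
    epochs = 32,
    validation_split = 0.3,
    verbose = 0
  )
  # A keras model is a pointer to external memory, so serialize it
  # to get an ordinary R object that can be stored and reloaded.
  serialize_model(model)
}
```

Returning a serialized model is one way to make the result safe to store as a workflow target; `unserialize_model()` restores it before prediction.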

6.4 Dependency graph

The graph visualizes the dependency relationships among the steps of the workflow.
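The plan is not shown in this section, so the sketch below assumes one with a data split, a preprocessing recipe, one model per activation function, a confusion matrix per model, and a final comparison. The file name `customer_churn.csv`, the split proportion, and the helper names `prepare_recipe()`, `train_model()`, `confusion_matrix()`, and `compare_models()` are hypothetical. Commands in a plan are quoted rather than run, so the helpers do not need to exist to draw the graph.

```r
library(drake)

plan <- drake_plan(
  # file_in() marks the CSV as a tracked input file (hypothetical name).
  data = initial_split(
    read_csv(file_in("customer_churn.csv")),
    prop = 0.3
  ),
  rec = prepare_recipe(data),
  # map() creates one model target per activation function.
  model = target(
    train_model(rec, act1 = act),
    transform = map(act = c("relu", "sigmoid"))
  ),
  # One confusion matrix per model, named after the activation.
  conf = target(
    confusion_matrix(data, rec, model),
    transform = map(model, .id = act)
  ),
  # combine() gathers all confusion matrices into one comparison.
  metrics = target(
    compare_models(conf),
    transform = combine(conf)
  )
)

# Interactive dependency graph (requires the visNetwork package).
vis_drake_graph(plan)
```

Older drake releases may require `vis_drake_graph(drake_config(plan))` instead of passing the plan directly.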

6.6 Inspect the results

The two models performed about the same.

6.7 Add models

Let’s try the softmax activation function.

make() skips the relu and sigmoid models because they are already up to date. (Their dependencies did not change.) Only the softmax model needs to run.
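One way to add the new model is to extend the list of activation functions in the plan. The sketch below assumes a plan built with drake's `map()` and `combine()` transformations; the file name and the helper names `prepare_recipe()`, `train_model()`, `confusion_matrix()`, and `compare_models()` are hypothetical.

```r
library(drake)

plan <- drake_plan(
  data = initial_split(
    read_csv(file_in("customer_churn.csv")),
    prop = 0.3
  ),
  rec = prepare_recipe(data),
  model = target(
    train_model(rec, act1 = act),
    # "softmax" is the only change from the two-model plan.
    transform = map(act = c("relu", "sigmoid", "softmax"))
  ),
  conf = target(
    confusion_matrix(data, rec, model),
    transform = map(model, .id = act)
  ),
  metrics = target(
    compare_models(conf),
    transform = combine(conf)
  )
)
```

With a plan like this, `outdated(plan)` lists the targets that need to be rebuilt, and `make(plan)` should build only `model_softmax` and the targets downstream of it, leaving `model_relu` and `model_sigmoid` untouched.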

6.8 Inspect the results again

6.10 History and provenance

drake versions 7.5.0 and above track history and provenance. You can see which models you ran, when you ran them, how long they took, and which settings you tried (i.e., named arguments to function calls in your commands).

And as long as you did not run clean(garbage_collection = TRUE), you can get the old data back. Let’s find the oldest run of the relu model.
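The history lookup and recovery can be sketched with a toy example. To keep it fast and free of TensorFlow, the sketch replaces the keras training step with a hypothetical `toy_train()` function and stores the cache in a temporary directory; both are assumptions made for illustration. It also assumes the txtq package is installed, which drake uses to record history.

```r
library(drake)
library(dplyr)

# Toy stand-in for the chapter's keras training function (hypothetical).
toy_train <- function(act1, epochs) {
  paste0(act1, " model, ", epochs, " epoch(s)")
}

# Keep the cache in a temporary directory instead of ./.drake.
cache <- new_cache(path = tempfile())

# First run of the relu model.
plan <- drake_plan(model_relu = toy_train(act1 = "relu", epochs = 1))
make(plan, cache = cache)

# Second run with a changed setting, which rebuilds the target.
plan <- drake_plan(model_relu = toy_train(act1 = "relu", epochs = 2))
make(plan, cache = cache)

# One row per build. With analyze = TRUE, named arguments in the
# commands (here act1 and epochs) become columns of the history.
history <- drake_history(cache = cache, analyze = TRUE)

# Oldest run of the relu model: sort by build time, take the first hash.
oldest_hash <- history %>%
  filter(act1 == "relu") %>%
  arrange(built) %>%
  pull(hash) %>%
  head(n = 1)

# Retrieve the old value from the cache by its hash.
old_model <- cache$get_value(oldest_hash)
```

In a real project that uses the default cache, `drake_history()` and `drake_cache()$get_value(hash)` can be called without the `cache` argument.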

Copyright Eli Lilly and Company