LLMs have no DELETE button. There is no straightforward mechanism to “unlearn” specific information, no equivalent to deleting a row in your database user table. In a world where “right to be forgotten” is central to many privacy regulations, using LLMs presents some difficult challenges.
The Data Privacy Vault is IEEE’s recommended architecture for securely storing, managing, and utilizing customers’ sensitive Personally Identifiable Information (PII).
Data of a sensitive nature can seep into an LLM during training as well as inference. During training, sensitive information may be ingested from documents that have not been anonymized or redacted. During inference, a prompt may inadvertently provide sensitive information – for example, a prompt that asks the LLM to summarize a will containing sensitive details.
The only way to delete information from an LLM is to train it from scratch! Hence, don’t let sensitive information get in in the first place.
A key consideration for anonymization is referential integrity: each piece of synthetic data must consistently stand in for the same piece of original sensitive information (Synthetic Data ⇄ Original Sensitive Information).
Is a private LLM a solution, as opposed to a managed service like OpenAI’s ChatGPT? Perhaps, but who will update the base model to keep up with new releases? It is expensive!
Tokenization: Swap sensitive data for tokens. A token is a reference to some sensitive data stored somewhere else; it lets us reference the data while providing obfuscation. Any sensitive data ingested via the application frontend, including Personally Identifiable Information (PII), is replaced by tokens generated by the Data Privacy Vault. For example: 777-123-4567 → ABC4567.

Fig. shows sensitive data being replaced by tokens by the application frontend through the Data Privacy Vault. The assets downstream of the app – the app database, warehouse, reports and/or analytics – then only “know” the tokenized data. These are not tokens in the sense of tokenization in LLMs, but tokens that hold a reference to the original data, which is stored in the Data Privacy Vault. The vault not only stores and generates de-identified data, it tightly controls access to sensitive data through a zero-trust model, where user accounts are managed through explicit access control policies.
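To make the idea concrete, here is a minimal sketch of tokenization in Python. The ToyVault class and its method names are hypothetical stand-ins; a production Data Privacy Vault adds encryption, token-uniqueness guarantees, access policies, and audit logging:

import secrets

class ToyVault:
    """Illustrative in-memory vault mapping tokens to sensitive values."""
    def __init__(self):
        self._store = {}  # token -> original sensitive value

    def tokenize(self, value: str) -> str:
        # Keep the last four digits so the token remains useful downstream;
        # a real vault guarantees each token is unique.
        suffix = value[-4:]
        token = "ABC" + suffix if suffix.isdigit() else "TOK" + secrets.token_hex(4)
        self._store[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # In a real vault, this call is gated by zero-trust access policies.
        return self._store[token]

vault = ToyVault()
token = vault.tokenize("777-123-4567")
print(token)                    # ABC4567 -- safe to store downstream
print(vault.detokenize(token))  # 777-123-4567 -- authorized callers only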
Fig. shows sensitive data under explicit access control to address WHO sees WHAT.
WHO sees WHAT? The team that has access to the Data Privacy Vault is verifiably in-scope of Identity & Access Management (IAM). Sensitive information can be redacted according to subscriber roles.
Using privacy enhancing techniques such as polymorphic encryption and tokenization, sensitive data can be de-identified in a way that preserves referential integrity.
Prompt Seepage: Sensitive data may also enter a model during inference. For example, a prompt is created asking for a summary of a will. The vault detects the sensitive information, de-identifies it, and shares a non-sensitive version of the prompt with the LLM. Since the LLM was trained on non-sensitive and de-identified data, inference can be carried out as normal.
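As a sketch of prompt de-identification, the snippet below redacts phone-like numbers from a prompt before it reaches the LLM, reusing the ToyVault from the sketch above. A real vault would detect many more PII categories (names, addresses, account numbers), typically with trained detection models rather than a single regex:

import re

PHONE_PATTERN = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

def deidentify_prompt(prompt: str, vault: ToyVault) -> str:
    # Replace each detected number with a vault-issued token.
    return PHONE_PATTERN.sub(lambda m: vault.tokenize(m.group()), prompt)

safe_prompt = deidentify_prompt(
    "Summarize this will. The executor can be reached at 777-123-4567.", vault)
print(safe_prompt)  # the LLM only ever sees the tokenized prompt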
Data Privacy Vault Architecture
Fig. shows the flow of information in a Data Privacy Vault architecture.
🚀 Dive into the cutting-edge world of Artificial Intelligence with my hands-on class using FastAI! In this immersive learning experience, you’ll not only grasp the fundamentals of AI but also explore contemporary challenges and solutions, including the privacy and compliance issues associated with powerful tools like large language models (LLMs). Get hands-on experience with state-of-the-art techniques while unraveling the complexities of generative AI. Join me on this exciting journey to master FastAI and gain insights into the latest advancements in AI technology. Don’t just follow the AI wave—ride it with confidence in my dynamic and practical AI class! 🤖💡 #AI #FastAI #HandsOnLearning #TechInnovation
They share a common origin – each one was generated by a Deep Learning model. Intrigued to understand how? Large Language Models (LLMs) are multifaceted, handling complex tasks such as sentence completion, Q&A, text summarization, and sentiment analysis. LLMs are, as the name emphasizes, substantial: intricate models with tens or hundreds of billions of parameters, honed on vast datasets totaling 10 terabytes. However, it is possible to appreciate the foundation of how machines learn meaning from text starting from a seemingly straightforward concept – the bigram model.
The bigram model operates on the principle of predicting one token from another. For simplicity, let’s consider tokens as characters in the English alphabet. This principle closely aligns with the essence of LLMs like ChatGPT, which predict subsequent tokens based on preceding ones, iteratively generating coherent text and even entire computer programs. In our bigram model, however, we predict the next character from the current one, utilizing a 26×26 matrix of probabilities. Each entry in the matrix represents the probability of a particular character appearing after another. This matrix, with some modifications, constitutes our model. Our goal? To generate names.
Bigram Matrix
The bigram matrix shows the frequency of occurrence of one token following another (“bigram”) in a given dataset. The tokens here are characters of the English alphabet plus one additional token to mark the start or end of a word. The dataset is a collection of 30,000+ names from a public database. The entry in a cell is the count of occurrences of the character in the column following the character in the row.
We introduce an extra character to mark the start or end of a word, expanding from a 26×26 matrix to a 27×27 matrix. The matrix entries arise from patterns observed in a training dataset comprising over 30,000 names from a public database. Raw occurrence counts shown are transformed into probabilities for sampling. Generating a name involves starting with the character that marks the start of a word, sampling the 1st character from the multinomial probability distribution in the 1st row, recycling that character as input to predict the 2nd character, and so forth until reaching the end character. The resulting names, like junide, janasah, p, cony, and a, showcase the model’s unique outputs.
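A sketch of this machinery in PyTorch follows. The three-name list is a stand-in for the 30,000+ name dataset; with the real dataset, outputs like those above emerge:

import torch

names = ["emma", "olivia", "ava"]  # stand-in for the 30,000+ name dataset

chars = ['.'] + [chr(c) for c in range(ord('a'), ord('z') + 1)]  # '.' marks start/end
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

# Count occurrences of each bigram: row = current character, column = next character.
counts = torch.zeros(27, 27)
for name in names:
    tokens = ['.'] + list(name) + ['.']
    for a, b in zip(tokens, tokens[1:]):
        counts[stoi[a], stoi[b]] += 1

# Turn raw counts into row-wise probability distributions for sampling.
probs = counts / counts.sum(dim=1, keepdim=True)

# Generate one name: start at '.', sample the next character from the current row.
ix, out = 0, []
while True:
    ix = torch.multinomial(probs[ix], num_samples=1).item()
    if ix == 0:  # sampled the end-of-word marker
        break
    out.append(itos[ix])
print(''.join(out))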
Considering these names, one might favor Janasah! But there’s room for enhancement. Enter the neural network! How would this transition occur? Instead of relying on a lookup matrix, the neural network would predict one character from another. Here’s how:
Representation: Numerically represent each character for input and output with vectors of length 27, accounting for the extra character.
Data Sets: Divide the data into training, validation, and testing sets to train the model, guard against overfitting, and assess performance.
Loss Function: Utilize negative log-likelihood, common in such scenarios, calculated through a softmax layer to generate a probability distribution.
Training: Adjust model parameters using calculated gradients and backpropagation through the neural network.
Refer to the Colab notebook for the implementation with detailed notes. So we have trained a neural network to do what we could do with a matrix. What’s the big deal?
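For reference, here is a compact, reconstructed sketch of the steps above (not the notebook’s exact code; the data splits are omitted for brevity), reusing names and stoi from the bigram sketch:

import torch
import torch.nn.functional as F

# Representation: build (current char -> next char) training pairs as indices.
xs, ys = [], []
for name in names:
    tokens = ['.'] + list(name) + ['.']
    for a, b in zip(tokens, tokens[1:]):
        xs.append(stoi[a])
        ys.append(stoi[b])
xs, ys = torch.tensor(xs), torch.tensor(ys)

W = torch.randn(27, 27, requires_grad=True)  # the network's weights

for step in range(100):
    xenc = F.one_hot(xs, num_classes=27).float()            # length-27 input vectors
    logits = xenc @ W
    probs = F.softmax(logits, dim=1)                        # softmax layer
    loss = -probs[torch.arange(len(ys)), ys].log().mean()   # negative log-likelihood
    W.grad = None
    loss.backward()                                         # backpropagation
    with torch.no_grad():
        W -= 50 * W.grad                                    # gradient step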
For one, we can use a longer sequence of characters as input to the neural network, giving the model more material to work with to make better predictions. This block of characters provides not just one sequence, but all sequences including and up to the last character as context to the neural network. This already goes beyond what we can do with matrices with counts of occurrences of bigrams.
But how does a neural network learn meaning in text? Part of the answer lies in embeddings. Every token is converted into a numerical vector of fixed size, thus allowing a spatial representation in which meaningful associations can take shape. We allow the embeddings to emerge as properties of a neural network during the training process. The deeper layers of the neural network use these associations as stepping stones to enrich structure in keeping with the nuances and intricacies of linguistic constructs.
Talk about layered meaning!
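As a taste of embeddings, PyTorch provides a learnable lookup table; here each of our 27 tokens gets a vector of length 10 (an arbitrary choice), trained along with the rest of the network so that related tokens drift toward one another in space:

import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=27, embedding_dim=10)  # 27 tokens, 10-dim vectors
vec = emb(torch.tensor([1]))  # embedding for token index 1 ('a' in our scheme)
print(vec.shape)              # torch.Size([1, 10])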
Wrapping up our baby steps in language models, we’ve transitioned from basic bigram models to deep neural networks, exploring the evolution from mechanical predictions to embeddings that allow associations that capture primitives of nuanced linguistic structure. We get a glimpse into the potential of these models to grasp the intricacies of language, beyond generating names. As we take these initial steps, the horizon of possibilities widens, promising not only enhanced language generation but also advancements in diverse applications, hinting at a future where machines engage with human communication in increasingly sophisticated ways.
Explore the fascinating world of Artificial Intelligence in my upcoming class, powered by FastAI! We’ll embark on a hands-on journey through the evolving landscape of AI, building models with state-of-the-art architecture and learning to wield the power of Large Language Models (LLMs). Whether you’re a beginner or seasoned enthusiast, this class promises a dynamic and engaging exploration into the realm of AI, equipping you with the skills to navigate and innovate in this rapidly evolving field. Join me for an exciting learning experience that goes beyond theory, fueled by the practical insights and advancements offered by FastAI.
In the world of app development, we often find ourselves humming to the tune of “Little boxes on the hillside, little boxes made of ticky tacky.” While our digital boxes may lack the charming hues of green, pink, blue, and yellow, they still bear a striking resemblance to their real-world counterparts.
You can find out more about this delightful song by folk singer-songwriter Malvina Reynolds on this wiki.
Little boxes on the hillside, Little boxes made of tickytacky
Little boxes on the hillside, little boxes all the same
There’s a green one and a pink one and a blue one and a yellow one
And they’re all made out of ticky tacky and they all look just the same.
These “ticky tacky” boxes are none other than the trusty rectangular frames called arrangements, the unsung heroes of screen design in App Inventor. Much like their physical counterparts, these arrangements serve as the foundation for organizing the various elements that make up your app’s user interface. It’s a simple concept but one that works wonders.
Within these rectangular frames, elements align themselves neatly, jostling for space from left to right in a horizontal arrangement. If you want to pile them up, just opt for the vertical arrangement. Buttons, labels, images, and other visual components can be neatly arranged using these building blocks.
But what if you prefer a bit of artistic flair? Don’t fret! When you choose not to employ an arrangement, elements will naturally stack up. Moreover, these arrangements can play nice with each other, nestling neatly within one another to create a more complex layout.
Image Viewer
The selected image is displayed in a viewport at the top of the screen.
Image Selector
The buttons are arranged using a HorizontalArrangement from the Layout drawer.
Map With Navigation
The Map component from the Maps drawer is used along with the Navigation component from the same drawer.
Take, for instance, the “Slideshow” app. It employs the default Screen layout, so the image viewer, the row of buttons, and the map component stack vertically. The first two have Height set to Automatic and Width set to Fill parent; they take up just enough height to fit their contents. The last has both Height and Width set to Fill parent, so it takes up the remaining available real estate. We only use a HorizontalArrangement for the buttons, which sit side by side in a row. It’s like a choreographed dance of elements, all thanks to these arrangements.
One of the perks of this layout method is “relative dimensioning.” This means that you can specify the size of elements relative to their containers, all the way up to the outermost container—the screen itself. This ensures a consistent look across devices with different screen sizes, making your app appear polished and professional.
Despite its simplicity, AppInventor empowers creative minds to craft aesthetically pleasing arrangements with complex structures, much like the iconic works of Piet Mondrian. With the versatile use of horizontal and vertical arrangements, you can orchestrate a symphony of colors, shapes, and proportions, evoking the spirit of abstract art within the confines of mobile app design. Your Mondrian-inspired creation in AppInventor showcases the fusion of technology and artistic expression, proving that even the most unassuming tools can spark a touch of genius.
Readers with a background in web development may want to check out Jen Simmons’s blogpost on Mondrian layouts with CSS Grid.
For animations, AppInventor offers a Canvas component on which to arrange ImageSprite components. Find these in the Drawing and Animation drawer. These allow for shaping, positioning, and sizing animated characters and props on a backdrop in line with the requirements of animation. It’s a world of possibilities, but with a boundary—sprites are confined within the canvas perimeter, and only sprites can “live” there. We will delve more into animation with canvas and sprites in another blogpost.
Happy designing!
Do you have a brilliant app idea but feel overwhelmed, thinking it’s a task best left to computer wizards? It may be easier than you think. Our AppInventor course is here to demystify app development and empower you to turn your concepts into reality. Let our hands-on approach guide you through the process. Each week, you’ll design and build real apps, learning key concepts such as layouts discussed in this blogpost. By the end of the course, you’ll be equipped with the skills and confidence to bring your unique ideas to life. Join us, and discover the art of making apps, one step at a time.
Imagine you are eagerly awaiting a package delivery at your doorstep. In this scenario, you have two distinct approaches to staying informed about when your eagerly anticipated package arrives. First, you could repeatedly venture to your doorstep, hoping to catch a glimpse of your precious cargo. Alternatively, you could simply relax and await the familiar sound of the doorbell, which signals the arrival of the deliveryman and your package. The moment the doorbell rings, you spring into action, promptly picking up your long-awaited package.
These two methods of package monitoring are symbolic of different paradigms in the world of programming. The event-driven paradigm, often exemplified by the doorbell scenario, is akin to following a set of instructions as if they were a “mad-lib,” constructed as follows: “WHEN event occurs, DO action.” In this construct, the first part represents the event, while the second part signifies the callback or action to be executed in response to the event. But how does this analogy relate to the realm of Android app development?
In the context of Android app development, this paradigm is of paramount importance. It hinges on the execution of a callback function when a specific event transpires. These events can be intrinsically tied to the actions of the phone’s user, such as tapping the screen or swiping a finger across it. However, user interactions are just one facet of the vast spectrum of events. Another category of events revolves around time, driven by clocks and timers. The callback function springs into action when a predetermined time interval lapses, and the timer signals its completion.
To illustrate this concept, let’s consider an app known as “Mole Mash”. This app employs a timer to move the mole to a random location on the screen. The objective is to tap the mole and score points. In this instance, the event handler implements the logic as follows: WHEN the timer goes off, DO move the mole to a random location. In AppInventor, the green block represents the event handler that sets off the timer at regular intervals. The interval is a configurable property and can be set manually or programmatically. The purple block inside is the callback procedure that moves the mole to a random location. The callback uses a canned procedure for random number generation to set the X and Y coordinates of the mole independently. Thus, the mole comes to life!
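For readers who like to see the construct in text code, here is the same WHEN/DO pattern sketched in Python. The Sprite class and the toy loop are hypothetical stand-ins for AppInventor’s ImageSprite and Clock components:

import random

CANVAS_WIDTH, CANVAS_HEIGHT = 320, 480  # hypothetical canvas dimensions

class Sprite:
    """Minimal stand-in for an AppInventor ImageSprite."""
    def __init__(self):
        self.x, self.y = 0, 0
    def move_to(self, x, y):
        self.x, self.y = x, y

mole = Sprite()

# WHEN the timer goes off, DO move the mole to a random location.
def move_mole():
    mole.move_to(random.randint(0, CANVAS_WIDTH), random.randint(0, CANVAS_HEIGHT))

# Toy event loop standing in for the Clock component firing at each interval.
for tick in range(3):
    move_mole()
    print(f"tick {tick}: mole at ({mole.x}, {mole.y})")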
In essence, event-driven programming in the context of Android app development hinges on the idea of responding to various events, whether initiated by the user’s actions or the passage of time, by executing predefined actions or callbacks. This flexible and dynamic paradigm enables developers to create interactive and responsive applications that cater to a multitude of scenarios and user experiences.
📱 Explore Android App Development with AppInventor! 🤖
Discover the exciting world of Android app development with AppInventor, where you can build practical skills while crafting innovative apps. Our class offers a hands-on approach to learning key software concepts, such as event-driven programming. Throughout the course, students have the opportunity to create a new app every week, gaining valuable experience and insights into the world of app development. Join us on this coding journey and unlock the potential of interactive, responsive applications. Dive into the world of possibilities with AppInventor! 💡👩💻📦
train_model(epochs=30, lr=0.1): This function acts as the outer wrapper of our training process. It requires access to the training data, trainingIn and trainingOut, which should be defined in the environment. train_model orchestrates the training process by calling the execute_epoch function for a specified number of epochs.
execute_epoch(coeffs, lr): Serving as the inner wrapper, this function carries out one complete training epoch. It takes the current coefficients (weights and biases) and a learning rate as input. Within an epoch, it calculates the loss and updates the coefficients. To estimate the loss, it calls calc_loss, which compares the predicted output generated by calc_preds with the target output. After this, execute_epoch performs a backward pass to compute the gradients of the loss, storing these gradients in the grad attribute of each coefficient tensor.
calc_loss(coeffs, indeps, deps): This function calculates the loss using the given coefficients, input predictors indeps, and target output deps. It relies on calc_preds to obtain the predicted output, which is then compared to the target output to compute the loss. The backward pass is subsequently invoked to compute the gradients, which are stored within the grad attribute of the coefficient tensors for further optimization.
calc_preds(coeffs, indeps): Responsible for computing the predicted output based on the given coefficients and input predictors indeps. This function follows the forward pass logic and applies activation functions where necessary to produce the output.
update_coeffs(coeffs, lr): This function plays a pivotal role in updating the coefficients. It iterates through the coefficient tensors, applying gradient descent with the specified learning rate lr. After each update, it resets the gradients to zero using the zero_ function, ensuring the gradients are fresh for the next iteration.
init_coeffs(n_hidden=20): The initialization function is responsible for setting up the initial coefficients. It shapes each coefficient tensor based on the number of neurons specified for the sole hidden layer.
model_accuracy(coeffs): An optional function that evaluates the prediction accuracy on the validation set, providing insights into how well the trained model generalizes to unseen data.
In this blog post, we’ll take a deep dive into constructing a powerful deep learning neural network from the ground up using PyTorch. Building upon the foundations of the previous simple neural network, we’ll refactor some of these functions for deep learning.
Deep Learning
Refactor code for multiple hidden layers
Initializing Weights and Biases
To prepare our neural network for deep learning, we’ve revamped the weight and bias initialization process. The init_coeffs function now allows for specifying the number of neurons in each hidden layer, making it flexible for different network configurations. We generate weight matrices and bias vectors for each layer while ensuring they are equipped to handle the deep learning challenges.
def init_coeffs(hiddens=[10, 10]):
    sizes = [trainingIn.shape[1]] + hiddens + [1]
    n = len(sizes)
    weights = [(torch.rand(sizes[i], sizes[i+1]) - 0.3) / sizes[i+1] * 4 for i in range(n-1)]  # weight initialization
    biases = [(torch.rand(1)[0] - 0.5) * 0.1 for i in range(n-1)]  # bias initialization
    for wt in weights: wt.requires_grad_()
    for bs in biases: bs.requires_grad_()
    return weights, biases
We define the architecture’s structure using sizes, where hiddens specifies the number of neurons in each hidden layer. We ensure that weight and bias initialization is suitable for deep networks.
Forward Propagation With Multiple Hidden Layers
Our revamped calc_preds function accommodates multiple hidden layers in the network. It iterates through the layers, applying weight matrices and biases at each step and introducing non-linearity using the ReLU activation function in the hidden layers and the sigmoid activation in the output layer. This enables our deep learning network to capture complex patterns in the data.
def calc_preds(coeffs, indeps):
    weights, biases = coeffs
    res = indeps
    n = len(weights)
    for i, wt in enumerate(weights):
        res = res @ wt + biases[i]
        if i != n-1:
            res = F.relu(res)  # apply ReLU activation in hidden layers
    return torch.sigmoid(res)  # sigmoid activation in the output layer
Note that weights is now a list of tensors containing layer-wise weights and, correspondingly, biases is the list of tensors containing layer-wise biases.
Backward Propagation With Multiple Hidden Layers
Loss calculation and gradient descent remain consistent with the simple neural network implementation. We use the mean absolute error (MAE) for loss as before and tweak the update_coeffs function to apply gradient descent to update the weights and biases in each hidden layer.
def update_coeffs(coeffs, lr):
    weights, biases = coeffs
    for layer in weights + biases:
        layer.sub_(layer.grad * lr)
        layer.grad.zero_()
Putting It All Together in Wrapper Functions
Our train_model function can be used ‘as is’ to orchestrate the training process, with the execute_epoch wrapper function helping as before. The model_accuracy function also does not change.
Summary
Conclusion and Takeaways
With these modifications, we’ve refactored our simple neural network into a deep learning model that has greater capacity for learning. The beauty of it is we have retained the same set of functions and interfaces that we implemented in a simple neural network, refactoring the code to scale with multiple hidden layers.
train_model(epochs=30, lr=0.1): No change!
execute_epoch(coeffs, lr): No change!
calc_loss(coeffs, indeps, deps): No change!
calc_preds(coeffs, indeps): Tweak to use the set of weights and corresponding set of biases in each hidden layer, iterating over all layers from input to output.
update_coeffs(coeffs, lr): Tweak to iterate over the set of weights and accompanying set of biases in each layer.
init_coeffs(hiddens=[10, 10]): Tweak for compatibility with an architecture that can potentially have any number of hidden layers of any size.
model_accuracy(coeffs): No change!
Such a deep learning model has greater capacity for learning. However, it is also hungrier for training data! In subsequent posts, we will examine the breakthroughs that have made deep learning models practically feasible and reliable. These include advancements such as:
Batch Normalization
Residual Connections
Dropout
Are you eager to dive deeper into the world of deep learning and further enhance your skills? Consider joining our coaching class in deep learning with FastAI. Our class is designed to provide hands-on experience and in-depth knowledge of cutting-edge deep learning techniques. Whether you’re a beginner or an experienced practitioner, we offer tailored guidance to help you master the intricacies of deep learning and empower you to tackle complex projects with confidence. Join us on this exciting journey to unlock the full potential of artificial intelligence and neural networks.
In this blog post, we will walk you through the process of creating a simple neural network from scratch in PyTorch for binary classification. We will implement a neural network with one hidden layer containing multiple neurons followed by a single output neuron. We will also discuss the design choices made for this network, including the use of ReLU activation in the hidden layer and sigmoid activation in the output layer.
Neural Network Architecture
The architecture of our simple neural network can be summarized as follows:
Input Layer
Hidden Layer with n neurons and ReLU activation.
Output Layer with a single neuron and sigmoid activation.
This structure allows us to demonstrate the gradient descent algorithm in PyTorch with multiple iterations of two steps as follows:
Forward-propagate inputs to generate outputs and compute loss
Backward-propagate loss by computing gradients and applying them to update model parameters.
We show how PyTorch uses tensors to parallelize operations for efficiency.
Training Data
It is customary to split the available data into three distinct sets: training, validation, and testing. These sets serve specific roles in the model development process.
Training Data: The training set is the largest portion of the data and is primarily used for training the model. During training, the gradients are computed on this data to update the weights and biases iteratively, allowing the model to learn from the provided examples.
Validation Data: The validation set is essential for assessing the model’s performance during training. It is not used for gradient computation but serves as a means to measure the loss. This monitoring helps prevent overfitting, a scenario where the model memorizes the training data rather than generalizing from it. Adjustments to the model can be made based on the validation loss.
Test Data: The test set is a reserved subset and should be used sparingly. It comes into play only after the model has completed its training phase. It serves the purpose of evaluating the model’s generalization performance on unseen data and reporting the final results. It ensures that the model can make accurate predictions on new, previously unseen examples, thus providing a reliable measure of its effectiveness.
This partitioning strategy allows for rigorous model assessment and ensures that the model’s performance is accurately estimated on data it has not encountered during training or validation. Before running the code, ensure that trainingIn and trainingOut are defined as global variables. These are represented as tables where rows correspond to individual examples, and each column represents a specific field or feature.
trainingIn contains the independent variables and has the shape (#examples x #variables), where #examples is the number of data points or examples in our training dataset and #variables is the number of independent variables or features.
trainingOut contains the dependent variable and has the shape (#examples x 1), where #examples is the same as in trainingIn.
Likewise, we’d want the validationIn and validationOut sets as global variables.
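For concreteness, here is a hypothetical setup with synthetic data that has the shapes described above; substitute your own dataset in practice:

import torch

n_examples, n_vars = 800, 12  # arbitrary sizes for illustration
trainingIn = torch.rand(n_examples, n_vars)                 # (#examples x #variables)
trainingOut = torch.randint(0, 2, (n_examples, 1)).float()  # (#examples x 1), binary labels
validationIn = torch.rand(200, n_vars)
validationOut = torch.randint(0, 2, (200, 1)).float()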
Backpropagation
Apply gradient descent for training
Initializing Weights and Biases
We start by defining the initialization function init_coeffs to set up the initial weights and biases for the neural network. The initialization process includes the following steps:
We divide the weights in the hidden layer by the number of hidden neurons to help with convergence.
We introduce a bias for the output layer.
Note that we set requires_grad on weights and biases during initialization. This is a crucial step, as it informs PyTorch to track and compute gradients for these parameters during the subsequent forward and backward passes. When the loss is calculated as a function of weights and biases, PyTorch automatically computes the gradients of the loss with respect to these parameters and stores them for gradient descent optimization.
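A sketch of init_coeffs consistent with these steps might look as follows; the exact scaling constants are illustrative:

import torch

def init_coeffs(n_hidden=20):
    # Hidden-layer weights, scaled down by the number of hidden neurons
    layer1 = (torch.rand(trainingIn.shape[1], n_hidden) - 0.5) / n_hidden
    # Output-layer weights plus a single bias for the output layer
    layer2 = torch.rand(n_hidden, 1) - 0.3
    const = torch.rand(1)[0]
    # requires_grad_ tells PyTorch to track gradients for these tensors
    return layer1.requires_grad_(), layer2.requires_grad_(), const.requires_grad_()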
Forward Pass
Next, we define the function calc_preds to perform the forward pass of the neural network:
We use the sigmoid activation in the output layer.
The use of non-linearity is key. Without it, stacked linear layers are equivalent to a single layer. More importantly, the superposition of non-linearities is what gives the neural network the property of being a universal function approximator. We have chosen ReLU for the hidden layer and sigmoid for the output layer, enabling the interpretation of the output as a likelihood score.
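Putting these choices together, a sketch of calc_preds consistent with the description:

import torch
import torch.nn.functional as F

def calc_preds(coeffs, indeps):
    layer1, layer2, const = coeffs
    res = F.relu(indeps @ layer1)               # ReLU non-linearity in the hidden layer
    return torch.sigmoid(res @ layer2 + const)  # sigmoid output, readable as a likelihood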
Loss Calculation
We calculate the loss using the mean absolute error (MAE) in the calc_loss function:
def calc_loss(coeffs, indeps, deps):
    predictions = calc_preds(coeffs, indeps)
    loss = torch.abs(predictions - deps).mean()
    return loss
Notice that the loss is a function of the weights and biases. By setting requires_grad on these parameters, we inform PyTorch that we are interested in computing the gradients of the loss with respect to these parameters for the purpose of optimization.
Training the Model
To train the neural network, we define the training process using the train_model function:
def train_model(epochs=30, lr=0.1):
    torch.manual_seed(442)
    coeffs = init_coeffs()
    for i in range(epochs):
        execute_epoch(coeffs=coeffs, lr=lr)
    return coeffs
The train_model function:
Initializes the coefficients.
Iterates through a specified number of epochs.
Calls execute_epoch for each epoch to update the coefficients.
Executing an Epoch
The execute_epoch function calculates the loss using calc_loss and applies the gradients using update_coeffs as follows:
def execute_epoch(coeffs, lr):
    loss = calc_loss(coeffs, trainingIn, trainingOut)
    loss.backward()
    with torch.no_grad():
        update_coeffs(coeffs, lr)
    print(f'{loss:.3f}', end='; ')
When we call backward on the loss, PyTorch automatically calculates gradients for all the parameters that contribute to the loss and have requires_grad set. These gradients are stored with the respective parameters and can be accessed using the grad attribute.
Updating Coefficients
The update_coeffs function is used to update the coefficients using gradient descent as follows:
def update_coeffs(coeffs, lr):
    for layer in coeffs:
        layer.sub_(layer.grad * lr)
        layer.grad.zero_()
Note that PyTorch accumulates gradients unless they are reset to zero between successive steps. That is why we call zero_ once the gradients have been used to update the weights and biases.
Running the Training
Finally, we run the training with different learning rates and for varying numbers of epochs:
coeffs = train_model(lr=1.4) # Example 1
coeffs = train_model(lr=20) # Example 2
coeffs = train_model(epochs=100, lr=10) # Example 3
You can observe how the loss changes during training and evaluate the model’s accuracy based on your dataset.
Model Accuracy
Optionally, we can implement a function model_accuracy(coeffs), to evaluate the accuracy of the trained model on the validation dataset.
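One possible implementation, assuming validationIn and validationOut are defined as described earlier, thresholds the sigmoid output at 0.5:

def model_accuracy(coeffs):
    with torch.no_grad():
        preds = calc_preds(coeffs, validationIn)
    # Compare thresholded predictions with the binary labels
    return ((preds > 0.5).float() == validationOut).float().mean()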
That’s it! We now have a simple neural network implemented from scratch in PyTorch for binary classification. We can customize the architecture, hyperparameters, and activation functions to suit our specific problem.
Summary
Conclusion and Takeaways
We split the dataset into subsets for training and validation. We then wrote a series of functions to parcel out the code for each step in the training process, culminating in the train_model() wrapper that requires data cuts trainingIn and trainingOut in the environment. The steps are as follows:
train_model(epochs=30, lr=0.1): This function acts as the outer wrapper of our training process. It requires access to the training data, trainingIn and trainingOut, which should be defined in the environment. train_model orchestrates the training process by calling the execute_epoch function for a specified number of epochs.
execute_epoch(coeffs, lr): Serving as the inner wrapper, this function carries out one complete training epoch. It takes the current coefficients (weights and biases) and a learning rate as input. Within an epoch, it calculates the loss and updates the coefficients. To estimate the loss, it calls calc_loss, which compares the predicted output generated by calc_preds with the target output. After this, execute_epoch performs a backward pass to compute the gradients of the loss, storing these gradients in the grad attribute of each coefficient tensor.
calc_loss(coeffs, indeps, deps): This function calculates the loss using the given coefficients, input predictors indeps, and target output deps. It relies on calc_preds to obtain the predicted output, which is then compared to the target output to compute the loss. The backward pass is subsequently invoked to compute the gradients, which are stored within the grad attribute of the coefficient tensors for further optimization.
calc_preds(coeffs, indeps): Responsible for computing the predicted output based on the given coefficients and input predictors indeps. This function follows the forward pass logic and applies activation functions where necessary to produce the output.
update_coeffs(coeffs, lr): This function plays a pivotal role in updating the coefficients. It iterates through the coefficient tensors, applying gradient descent with the specified learning rate lr. After each update, it resets the gradients to zero using the zero_ function, ensuring the gradients are fresh for the next iteration.
init_coeffs(n_hidden=20): The initialization function is responsible for setting up the initial coefficients. It shapes each coefficient tensor based on the number of neurons specified for the sole hidden layer.
model_accuracy(coeffs): An optional function that evaluates the prediction accuracy on the validation set, providing insights into how well the trained model generalizes to unseen data.
While we have demonstrated gradient descent with a simple neural network, we can extend this implementation to a deep learning model by adding more hidden layers. All we need to do is refactor the code, keeping the same set of six functions and their interfaces. Following the approach presented here, we can create a versatile and scalable neural network architecture tailored to specific requirements.
Happy coding!
Are you eager to dive deeper into the world of deep learning and further enhance your skills? Consider joining our coaching class in deep learning with FastAI. Our class is designed to provide hands-on experience and in-depth knowledge of cutting-edge deep learning techniques. Whether you’re a beginner or an experienced practitioner, we offer tailored guidance to help you master the intricacies of deep learning and empower you to tackle complex projects with confidence. Join us on this exciting journey to unlock the full potential of artificial intelligence and neural networks.
Have you ever dreamed of creating your own amazing app but felt overwhelmed by the complexities of traditional app development? Meet AppInventor, the revolutionary platform developed by MIT and Google that empowers everyone, regardless of their technical background, to build incredible apps with ease. With over 18 million users spanning all ages and backgrounds, AppInventor has been used to craft more than 85 million apps. All you need to get started is a laptop with a web browser like Chrome and an Android phone to turn your ideas into reality!
What Can You Create with AppInventor?
The possibilities with AppInventor are endless. Here’s a glimpse of the innovative apps that creators have brought to life:
Show-and-Tell: Craft interactive apps that seamlessly blend images, audio, or video to share knowledge. Whether it’s introducing people to local art pieces or aiding in meditation practices, you can make learning engaging and fun.
Animated Games: Dive into the world of classic games like Space Invaders, Mole Mash, or Pong. Build games that captivate and challenge users with animated characters, touch gestures, and sensor interactions.
Maps and Navigation: Develop location-based apps that provide directions and guide users to local attractions, ensuring they never miss out on exciting experiences.
Artificial Intelligence: Explore the realm of chatbots and computer vision to create intelligent applications that understand and interact with users in a human-like manner.
Building the App Screen
AppInventor boasts a user-friendly drag-and-drop interface that simplifies app creation. Begin by crafting the app’s screen, effortlessly placing graphical elements like images, buttons, sliders, and spinners. Organize these components using layouts to maximize screen real estate. For drawing and animation, leverage the canvas element, which responds to touch gestures, and populate it with animated characters or sprites for interactive games. Beyond visible elements, AppInventor lets you integrate various functionalities, such as sensors, social media connectors, databases, and even AI components.
Building the App Logic
Modern apps rely on event-driven architecture, where actions are triggered by events like user gestures, sensor data, or timers. With AppInventor, you build these logic routines using blocks that you simply drag and drop onto the coding canvas, where they seamlessly snap together to form programs. These blocks act as the fundamental building blocks of your app, representing either data or operations on data. Whether you’re performing arithmetic operations, text analysis, or time-related tasks, AppInventor offers an array of blocks to fulfill your programming needs, from creating a simple pedometer to crafting an AI-powered chatbot.
Demo: Mole Mash
Take, for instance, the classic game of Mole Mash, where the mole rapidly changes its location on the screen, and players score points by tapping it. AppInventor makes it a breeze to create this game with just a few blocks. By adding a timer element and configuring it to trigger every second, you can implement a callback function called ‘moveMole’ using blocks. In this custom function, you calculate the mole’s coordinates using a random number generator and ensure it stays within the screen boundaries. With minimal code, you breathe life into your game!
A Vibrant User Community
AppInventor encourages collaboration and learning within a community of fellow coders. Its active user community provides invaluable support, fostering an environment where problem-solving becomes a shared journey. This mirrors the collaborative nature of the tech industry, where community assistance plays a crucial role in the coding process.
Skillsets You’ll Develop
Coding with AppInventor hones a wide range of valuable skills, including:
Designing captivating user interfaces with stunning graphics.
Harnessing smartphone sensors to enhance app functionality.
Creating animated games featuring sprites and canvases.
Orchestrating events using timers.
Crafting program flow control with loops and conditional statements.
Modeling data using contemporary structures like lists and dictionaries.
Storing data locally or in cloud databases.
Mastering event-driven programming.
Grasping principles of object-oriented programming.
Exploring the fascinating world of artificial intelligence.
And many more!
So, what would you like to build with the limitless potential of MIT AppInventor? Let your creativity soar, and turn your app ideas into reality today!
Unlock Your App Potential with Our MIT AppInventor Course!
🚀 Ready to transform your app dreams into reality? 🌟 Join us and discover the power of MIT AppInventor. 🧠 Learn to create engaging apps, from games to AI-driven solutions. 👩💻 No coding experience needed – just bring your ideas!
📆 Enroll now and embark on your app-building journey!
We have coded CRUD operations and provided data models in the Data layer to support I/O. We have modeled data to align with domain business logic and housed these models in the Domain layer for other layers to use. We have provided mapping functionality between the two sets of data models. So are we ready to access data?
There is some more work (!) to do in order to build a professional app that is responsive and doesn’t throw up surprises when contingencies occur.
We want to be mindful of the following:
Remote I/O operations take an undefined amount of time and are unpredictable due to contingencies such as poor internet connectivity, insufficient access privilege or remote service outages. We need to account for these contingencies in our app.
We want to offer a responsive User Experience (UX) that doesn’t keep the user waiting or require unnecessary clicks for user action to be completed.
The repository pattern is where this management overhead resides. At the very least, we implement the following features as we open the door to data in our app:
Exception Handling: We wrap I/O operations in a try-catch block and print diagnostic error messages. We can further enhance this to look for specific errors and fine-tune how we handle contingencies.
Event Emissions: We emit results, whether success or failure, as events to take advantage of Kotlin flows in our app. Streaming events in this way offers a superbly elegant way to serve content in a dynamically updating UI that is listening for events.
The effect is magical!
Data Contract
Define the contract in Domain layer
Let’s look at sample code. Following the pattern of programming to an interface, we define the repository’s interface in the package “repository” in the Domain layer:
interface QuizRepository {
    suspend fun getModuleNames(
        fetchFromRemote: Boolean
    ): Resource<List<ChapterInfo>>

    suspend fun getModuleContents(
        name: String
    ): Flow<Resource<List<QuizQuestion>>>
}
The code defines an interface named QuizRepository that outlines the functions required for interacting with data related to the quiz modules. The functions are designed for asynchronous I/O with Kotlin Coroutines. They can take advantage of Kotlin flows to sequentially emit events in place of a single return value. Let’s unpack the functions defined in this interface:
suspend fun getModuleNames(fetchFromRemote: Boolean): Resource<List<ChapterInfo>>: This function fetches a list of module names (representing different sections or chapters) that are part of the quiz app. It is marked with suspend to indicate that it should be called from a coroutine or another suspending function, as it is designed to perform network I/O operations asynchronously. It takes a Boolean parameter fetchFromRemote, which suggests whether the data should be fetched from a remote data source (e.g., a server) or if it can be retrieved from a local cache or other sources. The exact behavior can be fine-tuned in the implementation. It returns a Resource object wrapping a List of ChapterInfo. The Resource class is often used to represent the result of asynchronous operations and can hold data, loading states, or error messages.
suspend fun getModuleContents(name: String): Flow<Resource<List<QuizQuestion>>>: This function is responsible for fetching the contents of a specific quiz module based on its name. It is also marked with suspend, indicating it’s a suspending function suitable for asynchronous operations. It takes a name parameter representing the identifier of the quiz module to retrieve. It returns a Flow of Resource wrapping around a List of QuizQuestion objects. A Flow is a cold asynchronous data stream that emits values over time. The Resource class is designed to wrap around a generic type so as to encapsulate any result type from a successful I/O operation.
In summary, the QuizRepository interface defines functions to fetch module names and module contents. These functions return data wrapped in a Resource object, indicating the status of the operation (e.g., success, loading, or error). The implementation of such an interface provides the actual logic for fetching data, including network requests, database queries, or other data retrieval methods, using modulating flags like fetchFromRemote to tune the behavior. This structure allows for flexibility in handling data sources and provides a clean separation of concerns in the app’s architecture.
Data Access
Flesh out the contract in Data layer
Being in the Domain layer, the interface only knows the data models in its home layer. The implementation, however, requires use of assets we have built out in the Data layer. Hence, we put the implementation in the package “repository” in the Data layer.
Do you see how the separation of concerns works? The repository opens a door to the Data layer and defines the contract in the interface. The contract, in the Domain layer, is written in the language of the domain. The contract’s implementation resides in the Data layer. We can choose to change the implementation should we make changes in the Data layer, for example, cache remote data locally. These changes need not affect the contract, wherein this detail is abstracted away.
Now let’s look at a concrete function where the I/O operation is implemented.
override suspend fun getModuleContents(name: String): Flow<Resource<List<QuizQuestion>>> {
    return flow {
        emit(Resource.Loading(true))
        try {
            val knowledgeBank = fireStoreCRUD.fetchModuleContents(name)
            emit(Resource.Success(data = knowledgeBank))
        } catch (e: Exception) {
            emit(Resource.Failure(message = e.message ?: "No contents found for $name in Firestore!"))
        } finally {
            emit(Resource.Loading(false))
        }
    }
}
Here, we have an implementation of the getModuleContents function within the repository, which fetches the contents of a specific quiz module. This function returns a Flow of Resource<List<QuizQuestion>>. Let’s break down how it works:
override suspend fun getModuleContents(name: String): Flow<Resource<List<QuizQuestion>>> { ... }: This function is marked with override as it provides the concrete implementation of the abstract suspend function of the same name in the interface.
return flow { ... }: The function returns a Kotlin Flow, which is a cold asynchronous data stream that emits values over time. The emissions use the Resource class to represent the state of the I/O operation – whether success, failure or work-in-progress.
emit(Resource.Loading(true)): The initial emission within the flow indicates that the operation is in a loading state, with Resource.Loading(true) signaling that loading has started.
try { ... } catch (e: Exception) { ... } finally { ... }: The core logic of the function is enclosed within a try-catch-finally block to handle different scenarios during the data retrieval process.
val knowledgeBank = fireStoreCRUD.fetchModuleContents(name): Inside the try block, it attempts to fetch the contents of the specified quiz module by calling function fetchModuleContents from a fireStoreCRUD instance passed to the repository via dependency injection. The function is responsible for making a request to the Firestore database.
emit(Resource.Success(data = knowledgeBank)): If the data retrieval is successful (i.e., no exceptions are thrown), it emits an event of type Resource.Success with the fetched List<QuizQuestion> in variable knowledgeBank as the data payload. This indicates that the operation was successful and makes the retrieved data available to the consumers of this Flow.
catch (e: Exception) { ... }: If an exception is thrown during the data retrieval process, it catches the exception. It then emits an event of type Resource.Failure with an error message derived from the exception, or a default message if no message is available. This indicates that there was a failure in retrieving the data.
finally { ... }: The finally block is used to ensure that the loading state is correctly updated, regardless of whether the data retrieval was successful or encountered an error. It emits an event of type Resource.Loading(false) to signal that the loading has completed.
In summary, this function returns a Flow that emits a sequence of Resource objects to represent the status of the data retrieval operation. It starts by emitting a loading state, then attempts to fetch the data. If successful, it emits a success state with the retrieved data; if an error occurs, it emits a failure state with an error message. Finally, it emits a loading state indicating that the operation has completed, whether successfully or with an error. This design allows the UI to respond to different states (loading, success, failure) and appropriately display the data or error messages to the user.
Resource
Create class Resource for state emissions
A traditional function returns a single result at the end of the operation. By contrast, a Kotlin Flow allows us to emit events asynchronously as many times as needed in an operation. A sealed class is commonly used for handling different states or outcomes of such an asynchronous operation, when making network requests or performing database queries.
sealed class Resource<T>(val data: T? = null, val message: String? = null) {
    class Success<T>(data: T?) : Resource<T>(data)
    class Failure<T>(message: String, data: T? = null) : Resource<T>(data, message)
    class Loading<T>(val isLoading: Boolean = true) : Resource<T>(null)
}
Here, we have defined a sealed class named Resource<T> that has three subclasses:
Success<T>: This subclass represents a successful outcome of an operation. It contains a nullable data attribute to hold the result of the operation, which is of a generic type T. It inherits from the Resource<T> class, passing the data to its parent class constructor.
Failure<T>: This subclass represents a failed outcome of an operation. It contains a message attribute, which can hold an error message or description, and an optional data attribute that can contain additional data related to the failure. It also inherits from the Resource<T> class, passing both data and message to its parent class constructor.
Loading<T>: This subclass represents a loading or in-progress state of an operation. It includes an isLoading attribute, which is a boolean indicating whether the operation is still loading. It does not carry any data or message, so it passes null to the parent class constructor.
By using this sealed class, we can create instances of Resource to represent different states of our app’s operations, making it easier to handle and communicate the results, errors, and loading states in a consistent and safe manner throughout our app. We will use this class in the ViewModel to provide real-time updates to our UI based on the state of asynchronous operations.
Conclusion
Takeaway and Next Steps
Recap:
Create a repository interface in the package “repository” in Domain layer, which will be used by the Presentation layer.
Implement the repository interface in the package “repository” in the Data layer.
Implement sealed class Resource in package “util”.
Now we are ready to use the repository to access data in the Presentation layer. However, before we can use the repository in the View Model, there is one more setup step we need to take care of: setting up the repository for injection into the View Model using Dagger-Hilt!
Business Logic
Model data for ingestion into the app
We define data models in the Data layer for storage and separate data models in the Domain layer for implementing the core business logic of our app. Storage can take various forms, such as when we introduce caching, requiring its own set of data models in the Data layer for local database operations.
But why the need for distinct data models? The choice of storage technology significantly influences our data design. In the case of relational databases, like Room DB for caching, data must conform to an atomic entity structure. Consequently, we place the corresponding data models in the package “local” within the Data layer and employ them for CRUD operations.
In contrast, NoSQL databases, like Firebase Firestore, offer more flexibility. Firestore’s data organization revolves around collections of documents, where each document contains named slots for numeric or textual data. This organizing principle is reflected in the design of our data models, which we place in the package “remote” within the Data layer.
The Domain layer smooths over these differences by presenting data models in a manner closer to how we typically conceptualize domain entities, ensuring they fit seamlessly into our app’s business logic. We introduce a package “mapper” in the Data layer to house the code for mapping between these different sets of models. This code lives in the Data layer, adhering to the principle that the Domain layer may not know about other layers but the other layers know about the Domain layer.
Mapper
Write extension functions for mapping
We implement a mapper as an extension function of the source class that returns an object of the target class. This one-to-one mapping is fairly typical. The simplicity of the business logic in our use case means we do not need to deal with nested data structures, complex joins, or aggregates.
Here is a sample mapper function.
fun ChapterInfoFire.toChapterInfo(): ChapterInfo {
    return ChapterInfo(
        name = name,
        title = title,
        description = description
    )
}
The ChapterInfoFire.toChapterInfo() function is an extension function that maps an instance of the ChapterInfoFire data class into an instance of the ChapterInfo data class. In Kotlin, an extension function allows adding new functionality to an existing class without modifying its source code.
Here’s a breakdown of how the function works:
ChapterInfoFire.toChapterInfo(): This is the function signature, indicating that it’s an extension function for the ChapterInfoFire class, and it will return a ChapterInfo object.
return ChapterInfo(...): The function returns a new ChapterInfo object, and within the parentheses, it initializes the properties of the ChapterInfo object.
name = name, title = title, description = description: Here, it’s copying the values of the properties from the ChapterInfoFire object to the corresponding properties in the ChapterInfo object. The name, title, and description properties in the ChapterInfo object are set equal to the values of the same-named properties in the ChapterInfoFire object.
In summary, this extension function simplifies the process of converting a ChapterInfoFire object into a ChapterInfo object by copying the values from one to the other.
This example shows how we map data from the Data layer to the Domain layer. It allows seamless data flows while ensuring separation of concerns in MVVM architecture.
Conclusion
Takeaways and Next Steps
Recap:
Create data models in the “remote” package in Data layer for Firestore CRUD operations.
Create data models in the package “model” in Domain layer to use in the app’s business logic.
Implement mapper functions in the “mapper” package to map data models between Data and Domain layers.
We have now set the stage for ingestion of data into the app. But we have yet to address all aspects of data management for a professional app. For example, our query may fail due to a poor network connection. We need to address this possibility and ensure the app handles off-the-happy-path scenarios gracefully.
Another important consideration is, we want to use Kotlin’s data structures to manage data flows to deliver a responsive user experience that doesn’t keep the user waiting and minimizes or eliminates unnecessary clicks. In other words, we need a repository pattern. That’s coming up next.
A key feature of the Domain layer is that it must not know about the Data and Presentation layers, but the other layers may know about it. The Domain layer serves as a crucial bridge between the Data and Presentation layers. Why do we need this bridge?
Because the design considerations that drive how we shape data differ between when we store data and when we work with data in our app. That means we need to maintain two sets of data models and the code that maps between them. So far, we have seen the storage side of the story. Now let us look at the domain side.
Data Models
Define data models for other layers
On our screen, we want to present a ToC, which is a list of chapters. Here is what the model for a single chapter looks like:
data class ChapterInfo(
    val name: String,
    val title: String,
    val description: String
)
Compare with the model ChapterInfoFire in the Data layer’s package “remote”. These models are very alike.
The domain model for a quiz is as follows:
data class Quiz(
    val questionNumber: Int,
    val question: String,
    val answer: String,
    val image: String?,
    val youTubeVideoId: String,
    val timestamp: Int,
    val chapterKey: String,
    val isActive: Boolean
)
Here, we employ a questionNumber attribute to facilitate both the ranking and unique identification of questions within a chapter or module. The quiz question may have an explanatory image to accompany the answer. The quiz includes a YouTube video identified by youTubeVideoId with explanatory content. A timestamp marks the start time of relevant content on the video track. The chapterKey attribute is employed as a reference to the parent chapter or module, enhancing the ease of navigating associated information.
In the realm of relational databases, the navigation of data relationships across tables is guided by foreign key constraints, ensuring a structured approach. However, Firestore’s NoSQL design opts for a more flexible approach by discarding these constraints to reduce upfront storage design complexity. Nevertheless, this flexibility comes at the cost of requiring additional logic in our application code to navigate these relationships. For intricate relationship scenarios, it may be beneficial to explore alternatives, such as leveraging a “built-to-suit” backend solution with technologies like PostgreSQL and a Python FastAPI wrapper to streamline complex data interactions.
Both the data models we have shown (and any others we need in the future) are housed in package “model” in Domain layer.
Conclusion
Takeaways and Next Steps
We have defined the data models in the Domain layer. Our “Android Ally” quiz app does not have complex processing logic at this stage and the models sufficiently structure data for presentation. Where should the logic of mapping between sets of models reside? For this, we have package “mapper” in the Data layer. Let’s look at that next.