| Oct 27, 2023 | Biraj sarmah |
The Inner Workings of Neural Network |2|
The brightest neuron of that output layer is the network’s choice, so to speak, for what digit this image represents. And before jumping into the math for how one layer influences the next, or how training works, let’s just talk about why it’s even reasonable to expect a layered structure like this to behave intelligently.
What are we expecting here? What is the best hope for what those middle layers might be doing? Well, when you or I recognize digits, we piece together various components. A 9 has a loop up top and a line on the right. An 8 also has a loop up top, but it’s paired with another loop down low. A 4 basically breaks down into three specific lines, and things like that.
Now in a perfect world, we might hope that each neuron in the second to last layer corresponds with one of these subcomponents. That anytime you feed in an image with, say, a loop up top, like a nine or an eight, there’s some specific neuron whose activation is going to be close to one. And I don’t mean this specific loop of pixels.
The hope would be that any generally loopy pattern towards the top sets off this neuron. That way, going from the third layer to the last one just requires learning which combination of subcomponents corresponds to which digits. Of course, that just kicks the problem down the road, because how would you recognize these subcomponents, or even learn what the right subcomponents should be?
And I still haven’t even talked about how one layer influences the next, but run with me on this one for a moment. Recognizing a loop can also break down into subproblems. One reasonable way to do this would be to first recognize the various little edges that make it up. Similarly, a long line, like the kind you might see in the digits 1, 4, or 7, is really just a long edge, or maybe you think of it as a certain pattern of several smaller edges.
So maybe our hope is that each neuron in the second layer of the network corresponds with one of the various relevant little edges. Maybe when an image like this one comes in, it lights up all of the neurons associated with around 8 to 10 specific little edges, which in turn light up the neurons associated with the upper loop and a long vertical line, and those light up the neuron associated with a 9.
Whether or not this is what our final network actually does is another question, one that I’ll come back to once we see how to train the network. But this is a hope that we might have, a sort of goal with a layered structure like this. Moreover, you can imagine how being able to detect edges and patterns like this would be really useful for other image recognition tasks.
And even beyond image recognition, there are all sorts of intelligent things you might want to do that break down into layers of abstraction. Parsing speech, for example, involves taking raw audio and picking out distinct sounds, which combine to make certain syllables, which combine to form words, which combine to make up phrases, and more abstract thoughts, etc.
But getting back to how any of this actually works, picture yourself right now designing how exactly the activations in one layer might determine the activations in the next. The goal is to have some mechanism that could conceivably combine pixels into edges, or edges into patterns, or patterns into digits.
And to zoom in on one very specific example, let’s say the hope is for one particular neuron in the second layer to pick up on whether or not the image has an edge in this region here. The question at hand is: what parameters should the network have? What dials and knobs should you be able to tweak so that it’s expressive enough to potentially capture this pattern, or any other pixel pattern, or the pattern that several edges can make a loop, and other such things?
Well, what we’ll do is assign a weight to each one of the connections between our neuron and the neurons from the first layer. These weights are just numbers. Then take all of those activations from the first layer and compute their weighted sum according to these weights. I find it helpful to think of these weights as being organized into a little grid of their own.
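To make the weighted sum concrete, here is a minimal sketch in Python. The function name weighted_sum and the four example numbers are made up for illustration. In the actual network there would be one weight for every pixel in the first layer, not just four.

```python
def weighted_sum(weights, activations):
    """Return w1*a1 + w2*a2 + ... + wn*an for one neuron's incoming connections."""
    if len(weights) != len(activations):
        raise ValueError("need exactly one weight per first-layer activation")
    return sum(w * a for w, a in zip(weights, activations))


# Hypothetical values: four first-layer activations and the four weights
# on the connections from those neurons to our second-layer neuron.
activations = [0.0, 0.5, 1.0, 0.25]
weights = [2.0, -1.0, 0.5, 4.0]

# 2.0*0.0 + (-1.0)*0.5 + 0.5*1.0 + 4.0*0.25
total = weighted_sum(weights, activations)
print(total)
```

Each connection contributes its weight times the activation it carries, so a neuron that is off (activation 0) contributes nothing no matter how large its weight is.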
And I’m gonna use green pixels to indicate positive weights and red pixels to indicate negative weights, where the brightness of that pixel is some loose depiction of the weight’s value. Now, if we made the weights associated with almost all of the pixels zero, except for some positive weights in this region that we care about, then taking the weighted sum of all the pixel values really just amounts to adding up the values of the pixels just in the region that we care about.
And if you really wanted to pick up on whether there’s an edge here, what you might do is have some negative weights associated with the surrounding pixels. Then, the sum is largest when those middle pixels are bright, but the surrounding pixels are darker. When you compute a weighted sum like this, you might come out with any number.
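Here is a rough sketch of that idea on a toy 5 by 5 grid instead of the full image. The grid size, the weight values of +1 and -1, and the three test images are all assumptions chosen to keep the arithmetic easy to follow. They are not the weights a trained network would actually learn.

```python
def weighted_sum_2d(weights, image):
    """Multiply each pixel by the weight in the same grid position and add it all up."""
    return sum(
        w * p
        for w_row, p_row in zip(weights, image)
        for w, p in zip(w_row, p_row)
    )


# Positive weights on the middle row (the region we care about),
# negative weights on the rows just above and below, zero everywhere else.
weights = [
    [ 0,  0,  0,  0,  0],
    [-1, -1, -1, -1, -1],
    [ 1,  1,  1,  1,  1],
    [-1, -1, -1, -1, -1],
    [ 0,  0,  0,  0,  0],
]

# A bright horizontal stripe with dark pixels around it.
edge_image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

# Every pixel bright: the middle row is lit, but so are its neighbors.
blob_image = [[1] * 5 for _ in range(5)]

# Every pixel dark.
blank_image = [[0] * 5 for _ in range(5)]

edge_score = weighted_sum_2d(weights, edge_image)
blob_score = weighted_sum_2d(weights, blob_image)
blank_score = weighted_sum_2d(weights, blank_image)
print(edge_score, blob_score, blank_score)
```

The stripe scores highest because only the positively weighted pixels are lit. The fully bright image lights the same middle row, but the negative weights on the neighboring rows drag its sum below that of a blank image.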
But for this network, what we want is for activations to be some value between 0 and 1. So a common thing to do is to pump this weighted sum into some function that squishes the real number line into the range between 0 and 1. And a common function that does this is called the sigmoid function, also known as a logistic curve.
Basically, very negative inputs end up close to 0. Very positive inputs end up close to 1, and it just steadily increases around the input 0. So the activation of the neuron here is basically a measure of how positive the relevant weighted sum is. But maybe it’s not that you want the neuron to light up when the weighted sum is bigger than 0.
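The standard formula for the sigmoid is 1 / (1 + e^(-x)). Below is a small Python sketch of it. Splitting the computation into two branches is my own addition so that very large negative inputs don’t overflow. Mathematically, both branches compute the same function.

```python
import math


def sigmoid(x):
    """Squish any real number into the range between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the equivalent form e^x / (1 + e^x),
    # which avoids computing exp() of a large positive number.
    z = math.exp(x)
    return z / (1.0 + z)


for x in (-10, -1, 0, 1, 10):
    print(x, sigmoid(x))
```

An input of 0 lands exactly at 0.5, inputs like -10 land very close to 0, and inputs like 10 land very close to 1.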
Maybe you only want it to be active when the sum is bigger than, say, 10. That is, you want some bias for it to be inactive. What we’ll do then is just add in some other number, like negative 10, to this weighted sum before plugging it through the sigmoid squishification function. That additional number is called the bias.
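Putting the pieces together, one neuron’s activation is the sigmoid of its weighted sum plus its bias. The sketch below uses the bias of negative 10 from the text. The function name neuron_activation, the four weights of 5.0, and the input activations are hypothetical values picked so the weighted sums come out to round numbers.

```python
import math


def sigmoid(x):
    """Squish any real number into the range between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def neuron_activation(weights, activations, bias):
    """Weighted sum of the previous layer's activations, plus bias, through the sigmoid."""
    z = sum(w * a for w, a in zip(weights, activations)) + bias
    return sigmoid(z)


weights = [5.0, 5.0, 5.0, 5.0]  # hypothetical weights
bias = -10.0                    # the neuron only becomes active once the sum passes 10

weak = neuron_activation(weights, [1.0, 0.0, 0.0, 0.0], bias)        # weighted sum is 5
threshold = neuron_activation(weights, [0.5, 0.5, 0.5, 0.5], bias)   # weighted sum is 10
strong = neuron_activation(weights, [1.0, 1.0, 1.0, 1.0], bias)      # weighted sum is 20
print(weak, threshold, strong)
```

With this bias, a weighted sum of exactly 10 gives an activation of 0.5, so the point where the neuron starts to light up has shifted from 0 to 10.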