We train a NN to understand, in general, what is correct and what is not.
The point of passing a small amount is for the network to learn slowly, avoid overfitting, and perform better overall.
I will illustrate this with an example of finding the maximum value of a function:
If the learning rate is too high, the NN may end up jumping back and forth between the first two local maxima, never reaching the third (true) maximum. The lower the learning rate, the smaller the chance of skipping over something.
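
To make the jumping effect concrete, here is a minimal sketch in plain Python. It is a simplification, not the multi-peak function described above: the toy function f(x) = -(x - 3)**2 has a single maximum at x = 3, and the names `grad` and `gradient_ascent`, the starting point, and the two learning rates are all assumptions chosen for illustration. With too large a step the search bounces between the same two points forever, while a small step creeps toward the maximum.

```python
def grad(x):
    # Derivative of the toy function f(x) = -(x - 3)**2 (maximum at x = 3).
    return -2.0 * (x - 3.0)


def gradient_ascent(lr, x0=0.0, steps=100):
    """Run plain gradient ascent and return every point visited."""
    xs = [x0]
    for _ in range(steps):
        # Step uphill: move in the direction of the gradient, scaled by lr.
        xs.append(xs[-1] + lr * grad(xs[-1]))
    return xs


high = gradient_ascent(lr=1.0)  # step too large: overshoots the peak every time
low = gradient_ascent(lr=0.1)   # small step: approaches the peak gradually

print(high[:5])  # [0.0, 6.0, 0.0, 6.0, 0.0]
print(abs(low[-1] - 3.0) < 1e-6)  # True
```

The same trade-off shows up in the other direction too: a very small learning rate needs many more steps, so in practice the value is tuned rather than simply made as small as possible.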