I see a lot of answers describing what tensors are, but none really describe why they're important. To understand this, let's go back to some first-year Calculus. If we have a function f, we can approximate f as
f(x+dx) = f(x) + f'(x)dx + O(dx^2)
This should look familiar: taking a = f(x) and b = f'(x), this is just the line a + b.dx! In other words, Calculus is just a way of transforming questions about (differentiable) functions into questions about lines.
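To make this concrete, here's a quick numerical sketch (my own illustration, with f = sin chosen arbitrarily): the error of the linear approximation f(x) + f'(x)dx should shrink like dx^2, as the O(dx^2) term promises.

```python
# Hypothetical example: check that the linear approximation
# f(x+dx) ≈ f(x) + f'(x)·dx has error of order dx^2, using f = sin.
import math

def f(x):
    return math.sin(x)

def fprime(x):
    return math.cos(x)

x = 1.0
for dx in (0.1, 0.01, 0.001):
    exact = f(x + dx)
    linear = f(x) + fprime(x) * dx
    # Shrinking dx by 10 should shrink the error by about 100.
    print(dx, abs(exact - linear))
```

Each factor of 10 in dx buys roughly a factor of 100 in accuracy, which is exactly the O(dx^2) behavior.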
So this is all well and good, but what if x is a vector? Or f(x) is a vector? Or both? Well, now we can approximate
f(x+dx) = f(x) + Df(x).dx + O(|dx|^2)
Here, Df(x) is nothing other than the matrix [(df_i/dx_j)(x)]. In other words, we still get a linear approximation. But now suppose we get greedy and want to take higher order derivatives: what is the derivative of Df(x)? It's a tensor!
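Here's a small numerical sketch of the vector case (the particular f is my own made-up example): Df(x) is the matrix of partials, and the residual of the linear approximation is O(|dx|^2).

```python
# Hypothetical example: for f : R^2 -> R^2, the Jacobian
# Df(x) = [df_i/dx_j] gives the linear approximation
# f(x+dx) ≈ f(x) + Df(x)·dx.
import numpy as np

def f(x):
    return np.array([x[0] * x[1], x[0] ** 2 + x[1]])

def Df(x):
    # Jacobian entries df_i/dx_j, worked out by hand for this f.
    return np.array([[x[1], x[0]],
                     [2 * x[0], 1.0]])

x = np.array([1.0, 2.0])
dx = np.array([1e-3, -2e-3])
exact = f(x + dx)
linear = f(x) + Df(x) @ dx
# The residual is O(|dx|^2): far smaller than |dx| itself.
print(np.linalg.norm(exact - linear))
```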
There are a few ways to think about this (which gives rise to the different interpretations of a tensor):
1. It's just df_i/(dx_jdx_k). This is a multi-dimensional array.
2. In 1D, we have
f(x+dx) = f(x) + f'(x)dx + (1/2)f''(x)dx^2 + O(dx^3)
In higher dimensions, we get
f(x+dx) = f(x) + Df(x).dx + (1/2)dx'.D^2f(x).dx + O(|dx|^3)
Here, D^2f(x) is a multi-linear map: it takes in two copies of the vector dx, and returns a new vector.
Tensors generalize further in various directions, but I think this intuition captures why they show up so often.