Hacker News

by Ryan Harman

The quadratic sandwich (fedemagnani.github.io)

113 points by cpp_frog 3 days ago

explainforwhat 29 minutes ago

It frustrates me when math explainers, and textbooks, seem to start from the "here's why our methods are insufficient to solve our problem" and fail to provide an example of the problem they are trying to solve.

What's the question this method is attempting to answer? What does an answer look like? How does this method lead to it?

> If you have ever tried to minimize a function with gradient descent

"and if otherwise, go kick sand," I guess.

laGrenouille 9 hours ago

Great visualizations. Really enjoyed having a well-written example where mathematical proofs directly help with understanding a practical application.

I wonder what would happen with this analysis if a momentum term were added to the gradient descent. It seems that it would fix the specific failure modes in the examples, but I wonder if there's a corresponding mathematical way of categorizing what kinds of functions can(not) be quickly optimized with GD + momentum.
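
For anyone who wants to poke at the momentum question numerically, here is a minimal sketch. It does not reproduce the article's examples: the test function (an ill-conditioned quadratic with curvatures `MU = 1` and `L = 100`), the starting point, and the tolerance are all assumptions chosen for illustration. It compares plain gradient descent with Polyak's heavy-ball momentum, each using the textbook step size for a quadratic with known curvature bounds.

```python
import math

# Assumed test problem (not from the article): f(x, y) = 0.5 * (MU*x^2 + L*y^2)
MU, L = 1.0, 100.0


def grad(p):
    """Gradient of the assumed quadratic at point p = (x, y)."""
    x, y = p
    return (MU * x, L * y)


def steps_to_converge(step, beta, start=(1.0, 1.0), tol=1e-6, max_iter=10_000):
    """Heavy-ball iteration: v <- beta*v - step*grad, p <- p + v.

    With beta = 0 this reduces to plain gradient descent.
    Returns the number of updates taken before ||p|| < tol.
    """
    p, v = start, (0.0, 0.0)
    for k in range(max_iter):
        if math.hypot(*p) < tol:
            return k
        g = grad(p)
        v = (beta * v[0] - step * g[0], beta * v[1] - step * g[1])
        p = (p[0] + v[0], p[1] + v[1])
    return max_iter


# Plain GD with the classical step 2 / (L + MU)
plain_steps = steps_to_converge(step=2.0 / (L + MU), beta=0.0)

# Heavy ball with Polyak's parameters for a quadratic
sq_l, sq_mu = math.sqrt(L), math.sqrt(MU)
momentum_steps = steps_to_converge(
    step=4.0 / (sq_l + sq_mu) ** 2,
    beta=((sq_l - sq_mu) / (sq_l + sq_mu)) ** 2,
)

print("plain GD steps:  ", plain_steps)
print("heavy-ball steps:", momentum_steps)
```

On strongly convex quadratics like this one, the number of heavy-ball steps grows roughly with the square root of the condition number L/MU rather than with the condition number itself. Whether that carries over to the non-quadratic functions in the article is exactly the open part of the question.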

20k an hour ago

This is a great article and it's super helpful, thanks to whoever wrote it!

Scene_Cast2 4 hours ago

There is one very clear example that I ran across due to the reasons outlined in the article. If you have a wavelet and you're trying to slide it around to make it fit, that will fail spectacularly. There are lots of problems that boil down to basically the above.

The neural net answer is being able to spawn a wavelet at any position, as opposed to tweaking the position of an existing one.
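
A rough sketch of the sliding-wavelet problem, under assumptions that are mine rather than the article's: the wavelet is a Gaussian-windowed cosine, the target sits at position 0, and the loss is the integrated squared difference, approximated by a Riemann sum. The names `wavelet`, `loss`, `local_minima`, and `far_slope` are hypothetical.

```python
import math


def wavelet(x, center):
    """Assumed wavelet shape: a Gaussian-windowed cosine centered at `center`."""
    u = x - center
    return math.exp(-0.5 * u * u) * math.cos(5.0 * u)


DX = 0.02
XS = [-20.0 + DX * i for i in range(2001)]   # sample grid on [-20, 20]
TARGET = [wavelet(x, 0.0) for x in XS]       # target wavelet sits at 0


def loss(center):
    """Riemann-sum approximation of the squared error for a given position."""
    return DX * sum((wavelet(x, center) - t) ** 2 for x, t in zip(XS, TARGET))


# Scan the loss as the wavelet slides across positions in [-3, 3].
shifts = [-3.0 + 0.05 * i for i in range(121)]
losses = [loss(c) for c in shifts]

# Grid points that are strictly lower than both neighbours.
local_minima = [
    shifts[i]
    for i in range(1, len(shifts) - 1)
    if losses[i] < losses[i - 1] and losses[i] < losses[i + 1]
]

# Finite-difference slope of the loss when the wavelet starts far away.
far_slope = (loss(10.01) - loss(10.0)) / 0.01

print("local minima near the target:", [round(s, 2) for s in local_minima])
print("slope of the loss at position 10:", far_slope)
```

In this setup the scan shows several local minima around the target, and the slope far from the target is essentially zero, so a gradient step on the position either stalls or gets trapped in a side lobe.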

xuzhenpeng 11 hours ago

The animation is very good and makes the article easy to understand.

Guestmodinfo 10 hours ago

We studied it in our preparation for college entrance exams in India, though the article goes into exhaustive detail. I thought this may be common or almost common knowledge. We used to call it the sandwich theorem.

thaumasiotes 9 hours ago

The sandwich theorem would normally refer to this one: https://en.wikipedia.org/wiki/Squeeze_theorem

quietbritishjim 5 hours ago

vzaliva 3 hours ago

Kudos for the beautiful formulae rendering.

CarVac 6 hours ago

Simplex methods can handle those tough situations, though.

FabHK an hour ago

Simplex is not applicable. Simplex only minimises a linear function (f(x)=c'x) under linear inequality constraints (Ax≤b). The minimisation problem here is unconstrained, but (very) non-linear.
