**Physicist**: If you’ve taken calculus, then at some point you learned that to find the area under a function (generally written $\int_A^B f(x)\,dx$) you need to find the anti-derivative of that function. The most natural response to this type of theorem is “wait… what?… why?”.

This theorem is so important and widely used that it’s called the “fundamental theorem of calculus”, and it ties together the integral (area under a function) with the antiderivative (opposite of the derivative) so tightly that the two words are essentially interchangeable. However, there are some mathematicians who may take issue with mixing up the two terms.

It comes back (in a roundabout way) to the fact that the derivative of a function is the slope of that function or the “rate of change”. In what follows “f” is a function, and “F” is its anti-derivative (that is: F’ = f).

*Intuitively*: Say you’ve got a function f(x), and the area under f(x) (up to some value x) is given by A(x).

Then the statement “the area, A, is given by the anti-derivative of f” is equivalent to “the derivative of A is given by f”.

In other words, the rate at which the area increases (as you slide x to the right) is given by the height, f(x).

For example, if the height of the function were 3, then, for a moment, the area under the function is increasing by 3 for every 1 unit of distance you slide to the right. Keep in mind that the function can move up and down as much as it wants. As far as the function “knows”, at any particular moment it may as well be constant (dotted line in picture above).

So if the height of the function (which is just the function) is the rate at which the area changes, then f is the derivative of the area: A’=f. But that’s exactly the same as saying that the area is the anti-derivative of the function.
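Here is a quick numerical check of that claim, as a rough sketch rather than a proof. The helper names `area_so_far` and `area_growth_rate`, and the choice of f = cos, are illustrative assumptions and not part of the original argument. The code approximates the area A(x) with a pile of thin rectangles, then measures how fast that area grows when x slides a little to the right.

```python
import math

def area_so_far(f, x, start=0.0, n=100_000):
    """Approximate A(x), the area under f from `start` to x, using n thin rectangles."""
    width = (x - start) / n
    # Each rectangle takes its height from the middle of its base.
    return sum(f(start + (i + 0.5) * width) for i in range(n)) * width

def area_growth_rate(f, x, h=1e-3):
    """Estimate A'(x): extra area gained per unit distance when x slides right by h."""
    return (area_so_far(f, x + h) - area_so_far(f, x)) / h

f = math.cos   # hypothetical example function
x = 1.0
rate = area_growth_rate(f, x)
print("rate of change of area:", rate)
print("height of the function:", f(x))
```

The two printed numbers should agree to about three decimal places, and the agreement should improve as `h` shrinks (until floating-point noise takes over).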

*Mathematically*: There’s a theorem called the mean value theorem that states that if you have a “smooth” function with no sudden bends or kinks, then over any interval the derivative will be equal to the average slope at least once. This needs a picture:

More precisely, if you have a function f on the interval [A,B], then there’s a point c between A and B such that $f'(c) = \frac{f(B)-f(A)}{B-A}$. You can just as easily write this as $f'(c)(B-A) = f(B)-f(A)$ or $f(c)(B-A) = F(B)-F(A)$ (since F’ = f).

So if you drive 60 miles in one hour, then at some instant you must have been driving at exactly 60 mph, even though for almost the entire trip you may have been traveling much faster or much slower than 60 mph.
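To make the driving example concrete, here is a small sketch with a made-up trip. The functions `position` and `speed` and the helper `find_instant` are hypothetical, chosen only so that the car covers 60 miles in one hour while starting and ending at rest. The code hunts (by bisection) for an instant at which the speedometer reads exactly the average speed.

```python
def position(t):
    """Miles travelled after t hours (made-up trip: 0 miles at t=0, 60 miles at t=1)."""
    return 60 * t**2 * (3 - 2 * t)

def speed(t):
    """Derivative of position, in mph: zero at both ends, 90 mph at the midpoint."""
    return 360 * t * (1 - t)

def find_instant(target, lo, hi, tol=1e-12):
    """Bisection for speed(c) == target, assuming speed - target changes sign on [lo, hi]."""
    if (speed(lo) - target) * (speed(hi) - target) > 0:
        raise ValueError("speed - target must change sign on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (speed(lo) - target) * (speed(mid) - target) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

A, B = 0.0, 1.0
average_speed = (position(B) - position(A)) / (B - A)   # average slope
c = find_instant(average_speed, 0.0, 0.5)               # search the first half of the trip
print("average speed:", average_speed)
print("instant c:", c, "speed at c:", speed(c))
```

For this particular trip there is a second such instant in the back half of the hour, which is why the theorem only promises "at least once".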

Keep that stuff in the back of your mind for a moment, and ponder instead how to go about approximating the area under a function.

You can divide up the area between x=A and x=B under a function by putting a mess of rectangles under it. Divide up the interval [A,B] by picking a string of points x_{0}, x_{1}, x_{2}, …, x_{N}, and use these as the left and right sides of your rectangles (and set x_{0}=A and x_{N}=B).

Each rectangle gets its height, f(c_{i}), from a point c_{i} somewhere between x_{i-1} and x_{i}, and which point you pick is unimportant. To get the exact area you let N, the total number of rectangles, go flying off to infinity, and you’ll find that the highest value of f and the lowest value of f in each tiny interval get squeezed together.

So, why not choose a value of c_{i} so that in each rectangle you can say $f(c_i)(x_i - x_{i-1}) = F(x_i) - F(x_{i-1})$?
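Written out, the choice is the one the mean value theorem hands you on each little interval: a c_{i} with f(c_{i})(x_{i} - x_{i-1}) = F(x_{i}) - F(x_{i-1}). With that choice the sum of the rectangle areas collapses, because each F(x_{i}) shows up once with a plus sign and once with a minus sign. This is a sketch of the standard telescoping step, using the same x_{0}=A and x_{N}=B as above:

```latex
\[
\begin{aligned}
\text{Area} &\approx \sum_{i=1}^{N} f(c_i)\,(x_i - x_{i-1}) \\
            &= \sum_{i=1}^{N} \bigl[F(x_i) - F(x_{i-1})\bigr] \\
            &= \bigl[F(x_1) - F(x_0)\bigr] + \bigl[F(x_2) - F(x_1)\bigr] + \cdots + \bigl[F(x_N) - F(x_{N-1})\bigr] \\
            &= F(x_N) - F(x_0) \\
            &= F(B) - F(A)
\end{aligned}
\]
```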

Holy crap! The area under the function (the integral) is given by the antiderivative! Again, this approximation becomes an equality as the number of rectangles becomes infinite.
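If you’d rather see it happen than take it on faith, here is a minimal sketch. The example f(x) = x² on [0, 3] and the helper name `rectangle_area` are assumptions picked for illustration. It stacks up rectangles (using the left edge of each one for the height) and compares the total with F(B) - F(A).

```python
def rectangle_area(f, a, b, n):
    """Total area of n equal-width rectangles, each as tall as f at its left edge."""
    width = (b - a) / n
    return sum(f(a + i * width) for i in range(n)) * width

f = lambda x: x ** 2        # the function
F = lambda x: x ** 3 / 3    # an anti-derivative of f, since F' = f

A, B = 0.0, 3.0
exact = F(B) - F(A)
for n in (10, 100, 1000):
    approx = rectangle_area(f, A, B, n)
    print(n, "rectangles:", approx, " shortfall:", exact - approx)
```

The shortfall should shrink by roughly a factor of ten each time the number of rectangles goes up by a factor of ten.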

As an aside (for those of you who really wanted to read an entire post about integrals), integrals are surprisingly robust. That is to say, if your function has a kink in it (the way |x| has a kink at zero, for example) then you can’t find a derivative at that kink, but integrals don’t have that problem. If there’s a kink or even a discontinuity; no problem!

You can just put the edge of a rectangle at the problem point, and then ignore it. In fact, think of (almost) any function in your head… You can take the integral of that. It may have an infinite value, or something awful like that, but you can still take the integral.
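A small sketch of that trick, with two example functions that are assumptions for illustration: |x| (a kink at zero) and a made-up `step` function (a jump at x = 1). In both cases the number of rectangles is chosen so that a rectangle edge lands on the problem point, and the sum neither knows nor cares that anything is wrong there.

```python
def rectangle_area(f, a, b, n):
    """Total area of n equal-width rectangles, each as tall as f at the middle of its base."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

def step(x):
    """Hypothetical discontinuous function: height 1 left of x=1, height 3 from x=1 on."""
    return 1.0 if x < 1 else 3.0

# |x| on [-1, 2]: two triangles with areas 1/2 and 2. With 3000 rectangles, one edge sits at x=0.
kink_area = rectangle_area(abs, -1.0, 2.0, 3000)

# step on [0, 2]: a 1-by-1 block plus a 1-by-3 block. With 1000 rectangles, one edge sits at x=1.
jump_area = rectangle_area(step, 0.0, 2.0, 1000)

print("area under |x|: ", kink_area)
print("area under step:", jump_area)
```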

To make a function that can’t be integrated you have to make it infinitely messed up. Mathematicians live for this sort of thing. There is almost nothing in the world they enjoy more than coming up with ways to break each other’s theories. One of the classic examples is the function $f(x) = 1$ if x is rational, and $f(x) = 0$ if x is irrational.

Over any interval you pick, f still jumps around infinitely often, so the whole “things will get better as the number of rectangles increases” thing can never get off the ground. There are fixes to this, but they come boiling and howling up out of the ever-darker, stygian abyss that is measure theory.
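To see why the rectangles never settle down, here is a short worked computation. It assumes the function in question is the usual textbook example, equal to 1 on the rationals and 0 on the irrationals. Every interval, no matter how thin, contains both kinds of number, so the tallest rectangle you could draw on it has height 1 and the shortest has height 0. For any choice of points x_{0}=A, …, x_{N}=B:

```latex
\[
\begin{aligned}
\text{largest possible sum}  &= \sum_{i=1}^{N} 1\cdot(x_i - x_{i-1}) = B - A, \\
\text{smallest possible sum} &= \sum_{i=1}^{N} 0\cdot(x_i - x_{i-1}) = 0 .
\end{aligned}
\]
```

The gap between the two is B - A for every N, so the high and low values never get squeezed together.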

neat!

What did you use to generate the plots?

Power Point

I am two thirds mathematician and one third physicist, and am very upset by your explanation.

I have never heard of *antiderivative*, but perhaps that is because I am from England. Even so, it seems as useful as using *antiquicker* instead of *distance*.

You seem to be *antiabstracting* notions to a physical graph, from where you can explain the idea of ‘area under’ a function in geometric terms. You have devised a ladder back out of the hole you imagined yourself into.

When I accelerate to work there is no ‘area under’ my journey: I stop and start as I please, and when I arrive I have covered enough miles for me to walk the remaining distance to my desk.

IMO you do mathematics a disservice by pretending that there is an *x* and a *f(x)* that can be drawn on a piece of paper and crayoned in. Even the relationship between trigonometry and imaginary numbers breaks down straight away.

British people really don’t use the term “anti-derivative” or crayons?

Borodin: anyone who’s taken an undergraduate degree in mathematics and has done a first year analysis course should have heard of an “antiderivative”! The more-meaty proof (which is what was explained heuristically, and usefully, above) is proving that this antiderivative is in fact the original function.

I don’t understand why it does maths a disservice by pretending there’s an x and an f(x)? No one’s pretending – we’re just saying that you could write it down in those terms if you wished and you could also find areas etc. also if you wished.

In England we don’t use crayons, no. We still use quill and ink of course – the Queen doesn’t allow otherwise.

I have never heard “anti-derivative” before, although of course it is possible I blinked at the wrong times. And, erm, yes we have crayons – don’t you? I can’t think what American for crayon might be. I survived my time in Kansas City by being able to say “vanilla ice cream”.

Why F’ = f?

It’s just a definition. I’m *defining* it that way.

I have a question regarding the paragraph, “More precisely, if you have…”

We end by saying F(B)-F(A) gives the area under the curve and it all seemed to make logical sense. However, if I then return to the aforementioned paragraph and use what I’ve just learned, I come to the conclusions that

“f(c)(B-A) = F(B) – F(A)”

f(c)(B-A) is the area under the middle chart. However, it looks very much larger. It is also not intuitive to me that the x-coordinate, c, whose derivative matches the average gradient has a height that matches the corresponding average height. What am I doing wrong?

Many thanks.

Different c’s.

What you’ve done is turned the mean value theorem on its head. Your statement boils down to: “for a smooth function, there exists a “c” between “A” and “B”, such that f(c) is equal to the average height of the function”. It’s this statement that actually gives the “mean value theorem” its name.

Imagine that the area under the curve is made of wax. When you melt it you end up with a rectangle with the same area and the same base length (B-A), but its height will be the *average* height of the original function.

So, if F(B) - F(A) is the area, then [F(B)-F(A)]/(B-A) is the average (mean) height. And the mean value theorem states that, for a smooth function, the function will assume its “mean value” at some point (c).
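A small worked example of the melted-wax picture, with numbers chosen purely for illustration (f(x) = x² on the interval from A = 0 to B = 3, so F(x) = x³/3):

```latex
\[
\begin{aligned}
\text{area} &= F(B) - F(A) = \frac{3^3}{3} - \frac{0^3}{3} = 9, \\
\text{average height} &= \frac{F(B) - F(A)}{B - A} = \frac{9}{3} = 3, \\
f(c) = c^2 = 3 \;&\Longrightarrow\; c = \sqrt{3} \approx 1.73 .
\end{aligned}
\]
```

So the function hits its mean value, 3, at c ≈ 1.73, which does sit between 0 and 3, and a rectangle of height 3 on a base of length 3 has the same area, 9, as the region under the curve.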

Thank you for responding so quickly but I’m not sure that entirely answered my question. Do you mean to say you are referring to different c’s between the two equations:

f'(c)(B-A)=f(B)-f(A) and f(c)(B-A)=F(B)-F(A)

in which case how can I justify this step? I’m comfortable with the idea that on the interval [A,B] there exists a c whose gradient matches the average gradient and I’m comfortable with the idea that there exists a c whose f(c) corresponds to the average height and that these are not necessarily the same c. But I don’t understand how to get between these two equations.

Many thanks.

There’s no generalizable relationship between the c’s for F, f, f’, f”, …

The Mean Value Theorem is just a property of functions. It’s nice to have a picture (the post had one, and you found another) to understand what’s going on, but it’s not the whole story.

The MVT just says: “for a smooth function, f, on (A,B) there’s at least one value, c, between A and B such that (B-A) f'(c) = f(B)-f(A)”. It establishes a relation between functions and their derivatives (f and f’) or (exactly equivalently) between anti-derivatives of functions and their original functions (F and f).

Did that just run us in circles?

@the physicist

thank you very much for the post..exactly what i was looking for..a mathematical proof that the antiderivative is indeed equal to the area under the curve…and i didnt find any flaws in it..pretty straightforward. i didnt think the MVT was of much use, but guess it is.

I get it!

This whole time I’ve been struggling to see how you got from f’(c)(B-A)=f(B)-f(A) to f(c)(B-A)=F(B)-F(A) as though you were somehow integrating it up but you’re not doing that. You’re simply applying MVT to F and f as opposed to f and f’, and saying there must be some other (unrelated) c on the interval [A,B] that satisfies:

f(c)(B-A)=F(B)-F(A)

Physicist. Sorry for being so slow. Thanks for all your help.

Thank you for helping the other people with exactly the same question!

This proof was in the back of my head, but, because of the confusion between the F(x), f(x) and f ‘(x) graphs, I would always lose track writing it down on paper. Thanks to this, I now have, hopefully, a permanent concept of integration.

So, would it be safe to say that when the number of rectangles approaches infinity the mean value theorem for integrals becomes an identity rather than a relationship? In other words, do the two endpoints of each rectangle become synonymous with the point c as the distance between the endpoints approaches zero? If this is the case then I get it! Rather simple compared to Lebesgue measure I bet!

Mechanical Engineering Major

Thanks Alot

Basically, yes!

What you can do is show that it doesn’t matter what point you pick. As the rectangles get thinner and thinner they’ll all give you very nearly the same answer. One of those points is “c”, so any point (for a very thin rectangle) will give you about the same answer as c. As the width of every rectangle is taken to zero, the difference between “nearly the same” and “exactly the same” disappears.
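A quick sketch of that point. The function `rectangle_sum`, its `pick` parameter, and the example f = exp on [0, 1] are all assumptions for illustration. Each rectangle takes its height from its left edge, its middle, or its right edge, and the three totals are compared as the rectangles get thinner.

```python
import math

def rectangle_sum(f, a, b, n, pick):
    """Sum of n equal-width rectangles; `pick` in [0, 1] places c_i inside each one
    (0 = left edge, 0.5 = middle, 1 = right edge)."""
    width = (b - a) / n
    return sum(f(a + (i + pick) * width) for i in range(n)) * width

exact = math.e - 1   # F(1) - F(0) for f = F = exp
for n in (10, 100, 1000):
    left = rectangle_sum(math.exp, 0.0, 1.0, n, 0.0)
    mid = rectangle_sum(math.exp, 0.0, 1.0, n, 0.5)
    right = rectangle_sum(math.exp, 0.0, 1.0, n, 1.0)
    print(n, "left:", left, "mid:", mid, "right:", right, "spread:", right - left)
print("exact:", exact)
```

Since exp is increasing, the left and right choices are the worst you can do in each rectangle, and the special point “c” always lands somewhere in between. As the spread between them shrinks, every other choice gets dragged along to the same answer.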

@physicist: About the method you use in this post to show the relationship between antiderivative and area under a curve. Is it your own method, or is it someone else’s? Does it have a popular name?

The “intuitive” part of the post is my own approach (not a proof), but the “mathematical” part is just the standard proof.

Sooo I tried to explain to myself how u go from f’(c)(B-A)=f(B)-f(A) to f(c)(B-A)=F(B)-F(A) and even after reading al’s comments and your concomitant answers I still end up nowhere. I do fully (at least that’s what my mind is saying) comprehend f’(c)(B-A)=f(B)-f(A), but I cannot make the logical transition to f(c)(B-A)=F(B)-F(A). Please help me put an end to this struggle

So is it because the relationship between f’ and f is exactly the same as that of f and F and vice versa ?

Yup! Exactly.

Hey guys, I’m a Lebanese GS (physics-math) student. I’d like to travel abroad but I don’t know if my level is sufficient to do that, so if someone would help me... We’re taking: logic, metric relations, irrational functions, parametric curves, conics (hyperbola, ellipse, parabola, curves of 2nd degree), level curves, mean value theorem, applications of complex numbers, plane transformations, complements for integrals, numerical sequences, sphere, functions (limits, continuity, derivative), inverse functions, trigonometric functions, systems of linear equations, vector product and mixed product, lines and planes in space, complex numbers, integration, logarithms, exponentials, differential equations, binary operations, statistics, counting, probability.

We’ve taken this proof, and moreover we had a test (4-hour tests usually). In it she asked us to prove that the area of a circle is πr^2, and we didn’t do such a thing in class, and we had only covered 1/4 of the integral course at that time.


Hi,

Thank you for that explanation. I’m a physicist (not really, just have a bachelor’s degree) and I never really questioned why the slope of the tangent line was the opposite of the area under the curve. This makes perfect sense.

“f ‘(c) (B-A) = f(B)-f(A) or f(c) (B-A) = F(B)-F(A) (since F’ =f).”

This doesn’t make any sense to me. And, if this is wrong, your entire derivation is wrong!

You’re right, it would be.

The mean value theorem is a statement about any function and the derivative of that function.

If the function is f, the derivative is f’.

If the function is F, the derivative is f.

If we integrate f(x) with dx, it gives the area between the curve f(x) and the x-axis, with appropriate limits.

In the same way, if we integrate f(x) with dg(x), can it show the area between both curves f(x) and g(x) or not? And what limits do we use in ……. Int.f(x)dg(x).d(x)/d(x)

can u give the proof that integration is inverse of differentiation??

It is a fallacy that integration is the reverse of differentiation.

An ante-derivative or primitive function can be used in the “summation” process, but it need not be. Finding an ante-derivative is NOT integration. Integration involves determining the product of two averages (more about this in my New Calculus). Integration has nothing to do with summation which stems from the idiotic ideas of Leibniz and Riemann.

@The Physicist:

You wrote:

The mean value theorem is a statement about any function and the derivative of that function.

If the function is f, the derivative is f’.

If the function is F, the derivative is f.

Correction: The mean value theorem states that the average value of the ordinates of any function f ‘ (x) on the interval (a,b) is given by the ratio: {f(b)-f(a)}/(b-a)

One of the non-mathematicians who is a moderator on this site, deleted one of my comments. That comment contained a link:

http://www.spacetimeandtheuniverse.com/math/6661-0-999-really-equal-1-a-5.html#post23696

This link is the actual explanation of the connection between derivative and integral. All else is rubbish.

@Sung wrote: “f ‘(c) (B-A) = f(B)-f(A) or f(c) (B-A) = F(B)-F(A) (since F’ =f).”

This doesn’t make any sense to me. And, if this is wrong, your entire derivation is wrong!

Gabriel: “f ‘(c) (B-A) = f(B)-f(A) or f(c) (B-A) = F(B)-F(A) (since F’ =f).” is correct.

I’m having trouble believing the area under the curve should approach the area of rectangles under the curve as the width of those rectangles decreases. It makes sense that the error for a particular rectangle decreases, but there are also more rectangles, so it’s not clear to me that the sum of those errors decreases or ever reaches zero.

So,

1. How do you know the decrease in error per rectangle is dominant over the quantity of rectangles as the width of the rectangles approaches zero?

2. How do you know the error reaches (not just approaches) zero at the limit?

@Borodin: I am a mathematician who did his mathematical training in England (at what is generally considered an important and famous UK university) and who now works in North America. I can tell you that the expression “antiderivative” is used widely on both sides of the pond. It is standard terminology in calculus, and it is not made up. In any case, it doesn’t matter because the author of the post explains clearly what he / she means by “antiderivative”. A definition is a definition. Mathematicians must introduce new definitions all of the time, in order to communicate their research effectively. As long as a definition is written clearly, it is “fair game”. If we did not introduce new words as we come across new ideas, mathematics would grind to a halt.

I don’t think the author is doing any kind of disservice to mathematics. When you write “When I accelerate to work there is no ‘area under’ my journey…”, I see that you are failing to understand the point of the author’s article, and to understand the necessity for abstraction in the development of mathematical tools.

The crux of this article which I liked the most was this: “the rate at which the area increases is given by f ” – a very interesting view to represent the integral/area relation in a reverse notion! This is what I was exactly looking for in understanding the “why”!

Thanks!

Steve

A good name is the “Area So Far” function A(x) of a function f(x). The rate of change of this area function thing is clearly f (x) if one considers a thin slice of x width dx say.

Alternately or equivalently using time t as variable, for an object in motion then the distance moved is clearly the sum of velocity multiplied by time intervals. So the distance is just the area under velocity graph. While velocity is the rate of change of distance. And that’s it explained! ( if you don’t see it yet drink more beer!)

These simple intuitive ideas may need more care with weird functions, but for the present question, which wants a simple explanation, it’s OK.

GBC has a good concern about the little triangular errors when adding up the myriad of thin rectangles. As our increment gets smaller the error gets proportionately smaller and even though we have more rectangles the total error is the sum of rectangle base areas which is the finite distance along x , say X, multiplied by the error … which is like a thin strip of wonky triangular error bits. However we see the strip gets narrower as dx gets smaller but the strip length stays the same, roughly X, so the error just gets smaller proportionately as dx which we can take as small as we like and finally let be vanishingly tiny. The bits do effectively vanish in limit.
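The shrinking-strip argument can be written down in a couple of lines. This is a sketch under a simplifying assumption that is not in the comments above: f is increasing on [A,B] and all N rectangles have the same width Δx = (B-A)/N. Then the tallest rectangles sit at the right edges, the shortest at the left edges, and the total error is trapped inside the difference between the two sums:

```latex
\[
\begin{aligned}
\text{(tallest sum)} - \text{(shortest sum)}
  &= \sum_{i=1}^{N} f(x_i)\,\Delta x - \sum_{i=1}^{N} f(x_{i-1})\,\Delta x \\
  &= \Delta x \sum_{i=1}^{N} \bigl[f(x_i) - f(x_{i-1})\bigr] \\
  &= \bigl[f(B) - f(A)\bigr]\,\Delta x
   = \frac{\bigl[f(B) - f(A)\bigr](B - A)}{N} \;\longrightarrow\; 0 .
\end{aligned}
\]
```

So even though the number of error bits grows like N, their combined size is a strip of fixed height f(B) - f(A) and width Δx, and it shrinks like 1/N. For any finite N the rectangles are still only approximately right; the exact area is the one number that stays caught between the shortest and tallest sums for every N, which is what “the limit” means here.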

What’s the formula to calculate the area covered by any sphere on a plane surface?

This comment is in response to GBC and Pablo Jeynes, regarding the question of whether the remainder area of triangles does or does not approach zero. I think that the following intuitive view indicates that the remainder area does in fact tend to zero:

Draw a right-angled triangle with a horizontal base and vertical side. For the purpose of this discussion, let the triangle be representative of the top end of any sliver. Next subdivide that triangle into two with a vertical line, and where the vertical line meets the hypotenuse, go across horizontally to the side of the triangle. This process represents what happens when the number of the slivers dividing the function is doubled. Just by looking at the original triangle, there is now a portion of that triangle contained within the top of a rectangular segment. So, doubling the number of slivers has reduced the remainder area by some proportion.

Repeating this process will further reduce the remainder areas, hence the remainder area tends to zero as the divisions increase. Pretend that each increase of divisions is a doubling of the number of divisions if that helps create a picture.