# 246. Optimal culling

While trying to figure out how bot tanks saw each other, I revisited culling (deciding which objects to draw), and it proved an exciting adventure that resulted in some blazingly fast code, which I’m happy to share with you now.

We are in a 3D scene, but everything is on a flat surface, so we move around in two dimensions (this is sometimes referred to as 2.5D).

Our camera is looking along the surface, as shown by the blue arrow. The dotted blue lines show the “field of view”, ie what the camera can see.

At the left, there is a tree, which is a 2D image that rotates to face the camera.

We need to know if the camera can see it. If you rotate the tree through 360 degrees (on the vertical y axis), it forms a cylinder with a circle at its base, with a radius of half the width of the image. If that circle is inside the “field of view”, then our tree can be seen. So we are going to work with this circle and the field of view, to find an efficient test for whether they overlap or not.

### Tangent

At the point where the circle first touches the left hand side of field of view, the dotted line is a tangent, ie the line is at right angles to the centre of the circle, as shown on the left.

If, given camera position and a circle, we can calculate the tangent point where the line from the camera just touches the circle, at right angles to the centre, we can then test if this point is inside the field of view or not.

However, there are two tangent points, one on each side of the circle. We want to use the one that is closest to the field of view.

Figuring all this out could cost us quite a few lines of code.

But would you believe everything can be done with just three (3) lines of code?

And here is the first line.

p = circlePos+cameraDirection*(radius/math.sin(math.rad(FOV/2)))

It looks like Greek, I know, so I’ll use a picture to explain. Stay with me, though, it’s not hard when you “get” it, and the result is magical.

We’ll start with the circle just touching the field of view, at a right angle to the circle centre.

We draw a line (marked d alongside) from the circle centre, in the same direction as the camera (straight up in this case), up to where it crosses the dotted line marking the edge of the field of view.

We’ll call the angle between the blue lines a (which is half the field of view). Using geometry, the angle between the green and (dotted) blue line at top left, must also be a, (because the dotted line touches the circle at right angles).

How long is the line d? Well, if the circle radius is r, then sin(a) = r / d, so

d = r / sin(a)

And that’s where the formula above comes from. It gives the position of the point p at the end of line d. And the line running through the tangent from the camera to the edge of the circle will always pass through p as well.

### Is the point inside the field of view?

But what happens when the circle is not touching the edge of the field of view? We still calculate p exactly the same way, but now it won’t be touching the field of view – it will be inside or outside. So now we need to test this, which can be done quite easily using a dot product.

v=(p-camPos):normalize() if v:dot(camDir)>cosFOV then visible=true end

We normalize the direction from the camera to p, then calculate the dot product, which gives us the cos of the angle between them. We can compare this with cos(a), and if it is greater, it means p is inside the field of view, and visible.

## But which tangent do we use?

There are two tangents where the camera touches the circle. Which one do we use?

Amazingly, we don’t need any code for this.

When we calculated the point p, we used the camera direction, and *this will always point us toward the tangent that is closest to the field of view*, so we don’t need any extra calculations.And that is a major reason for calculating p in this way.

My thanks to forum user yojimbo2000 for this awesome shortcut, which I would never have thought of in a million years.

### Optimising for speed

We’re nowhere near done. This function needs to be very fast, and even if it’s only three lines, there’s a lot we can do to speed it up. And I guarantee you will be surprised by what works best.

So here is our visibility function.

```
--cameraDirection must be normalised
--FOV is field of view (default is 45)
function IsVisible2(circlePos,cameraPos,cameraDirection,radius,FOV)
local p=circlePos+cameraDirection*(radius/math.sin(math.rad(FOV/2)))
local v=(p-cameraPos):normalize()
return v:dot(cameraDirection)>math.cos(math.rad(FOV/2))
end
```

If you have a lot of objects to test for visibility, like all the trees in my tank scene, then you’ll quickly realise that there is no point calculating the sin and cos of FOV/2 for each object, because it’s going to be the same for all the objects. So we should “pre-calculate” all the items that won’t be different between objects, giving us something like this.

```
--precalculated global variables are
--radiusAdjust = 1/math.sin(math.rad(FOV/2)))
--cosFOV = math.cos(math.rad(FOV/2))
function IsVisible2(circlePos,cameraPos,cameraDirection,radius,FOV)
local p = circlePos + cameraDirection*radiusAdjust
local v=(p-cameraPos):normalize()
return v:dot(cameraDirection)>cosFOV
end
```

That makes a big difference to speed, but would you guess I can still speed it up by a factor of three (3) ?

One reason is that the normalize function involves a square root. We could make the function faster by not taking the square root, and comparing the squares of the results.

So our code will change

--instead of local v=(p-cameraPos):normalize() --which is the same as local v=(p-cameraPos) v = v / v:len() --which is the same as v = v / (v.x*v.x + v.y*v.y) ^ 0.5 --we can use v = v * v / (v.x*v.x + v.y*v.y) --ie everything is squared --and compare with the square of cosFOV return v:dot(cameraDirection)>cosFOV*cosFOV

(and we can pre-calculate cosFOV*cosFOV to make it a little faster still).

Now the surprising thing is that we aren’t done, even though our code looks like bare bones, using optimised built in functions.

It seems vector math is slow.

Really quite slow.

If we decompose the vectors into ordinary variables as shown below, the speed improves dramatically. Note that

- I’m breaking the vectors into x,y values. Normally, a 3D surface will have y as height, so the surface will use the x and z axes. But I was using 2D code for testing, and it is easy to change.
- When you only compare squared values, you get a second (and incorrect) “visible” result when the circle is directly behind the camera. So I’ve included an initial test for this (it makes things a little slower, but for any trees behind the camera, it will save us calculating tangents etc, so it may even be faster on average)

--these global variables are pre-calculated --tangentAdjust = 1 / math.sin(math.rad(FOV/2)) --cosFOV2 = square of math.cos(math.rad(FOV/2)) --camposX, camposY = x,y values of camera position --camdirX,camdirY = x,y values of camera direction function IsVisible(pos,radius) local px,py=pos.x,pos.y --if circle is behind the camera, it's invisible local dx,dy=px-camposX,py-camposY if dx*camdirX+dy*camdirY<0 then return end --calculate the tangent point & subtract camera position local u=radius*tangentAdjust local ptx,pty=px+camdirX*u-camposX,py+camdirY*u-camposY --(squared) length of this direction local sq=ptx*ptx+pty*pty --numerator of dot function local a=ptx*camdirX+pty*camdirY --compare squares of the results --we multiply by sq on the right instead of dividing by it on --the left, as multiplying is faster return a*a>cosFOV2*sq end

The result is blazingly fast. On my iPad Air 2, I can do 18,500 visibility tests per frame, for objects in front of the camera, and if they are behind the camera, I can do 30,000 tests. By comparison, the original 3 line function I started with above, could only manage 4,500 tests per frame! That’s a 4x speed improvement.

And the video below shows that it works. The yellow circle brightens when IsVisible is true. The green line shows the tangent.

### Why vectors are slower in this case

Normally, vectors perform pretty well. Why is it possible to improve on them in this case?

Simeon, developer of Codea, explains it like this.

The performance difference appears to be due to allocations, every vector mult / sub / add has to allocate a new vector object as Lua user data to return its results. The overhead in the allocations accounts for all the difference in performance.

This problem exhibits itself because the vectors are short-lived and created / deleted constantly. Using vectors in a more long-term scenario should be totally fine (the overhead will not really be noticeable without lots of operations).