
191. Locating the 3D position of a 2D point

January 4, 2015

Didn’t we just have this one? Actually, it was the other way round. But this was a followup question.

This post shows one way to do it. It’s not beautiful, but it works, and I’m posting it because it has some interesting features worth talking about.

So suppose we have a rectangular surface drawn on the screen. We want to touch it and know which point on the surface we touched. Easy!?

There are just two problems. It is drawn in 3D, so the size depends on distance from the camera, and it is partly rotated, so it looks like this.

This is quite tricky. If I touch the screen, how do I figure out which point in the rectangle was touched?

As I said above, this is the reverse of finding the 2D screen position of a point drawn in 3D. That involved calculating a matrix

m = modelMatrix() * viewMatrix() * projectionMatrix()

so surely all we need to do is invert (ie reverse) this matrix?

Perhaps, yes, but in a previous thread about this topic, our resident mathematician said inverting a matrix was a Very Bad Idea *, because the inversion can be numerically unstable and some matrices can’t be inverted at all. However, there seemed to be no other (mathematical) solution.

* which sounds very Winnie The Pooh, except that Pooh was no mathematician.

That doesn’t bother me, because near enough is usually good enough in computer graphics.

So I looked for a suitable kludge.

Binary search

My first thought was to use binary search: start with the mid point of the rectangle and calculate where it is drawn in 3D; if the touch is to the left of that point, take the mid point of the left half, calculate where that is drawn, and keep narrowing it down like that. It would work pretty fast, except that if I had just looked at the diagram above, I would have realised that it is easy to find two points A and B where A is to the left of B in the original rectangle, but is on its right in the 3D picture above. So I wasted a lot of time there – and I won’t waste your time with any more of it.
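If you want to see that left/right swap in numbers, here is a rough sketch. It is plain Python rather than Codea's Lua, just so it runs anywhere. The `project` helper, the two rotation angles and the camera distance are all my own assumptions for illustration, not the actual setup in the picture.

```python
import math

def project(u, v, spin_deg=60.0, tilt_deg=45.0, camera_distance=3.0):
    """Return the 2D screen position of point (u, v) on a unit rectangle.

    Hypothetical setup: the rectangle is centred on the origin, spun in
    its own plane, tilted about the x axis, then viewed by a camera
    sitting camera_distance away on the z axis.
    """
    x, y = u - 0.5, v - 0.5                      # centre the rectangle
    spin, tilt = math.radians(spin_deg), math.radians(tilt_deg)
    # rotate about the z axis (spin in the rectangle's own plane)
    x, y = (x * math.cos(spin) - y * math.sin(spin),
            x * math.sin(spin) + y * math.cos(spin))
    # tilt about the x axis, which gives the point some depth
    y, z = y * math.cos(tilt), y * math.sin(tilt)
    # perspective divide: nearer points are drawn bigger
    scale = camera_distance / (camera_distance - z)
    return x * scale, y * scale

a = project(0.4, 0.0)   # A is to the left of B on the rectangle (0.4 < 0.5)
b = project(0.5, 1.0)
print("A on screen:", a)
print("B on screen:", b)
print("A ends up to the right of B:", a[0] > b[0])
```

With these numbers A lands to the right of B on the screen, so "is the touch left of the mid point?" tells you nothing reliable about which half of the rectangle to search.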

I only told you about this in case you think I find this stuff easy. I can be very dumb.

Hidden image

I went back to something that had worked for a similar problem – if you have a number of objects drawn on a 3D screen, and you touch one of them, how can Codea figure out which one you touched? If the objects are regular shapes like cubes, then you can use math, but if they are irregular or have transparent areas (so you should be able to touch an object behind another object with transparent pixels), it doesn’t really work.

So that needed a kludge too, which was to redraw the entire screen to an image in memory, using a shader which drew each object in a unique colour, then Codea could look up the pixel that was touched, get the colour, and figure out which object it was.
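The lookup side of that trick can be sketched like this, again in plain Python rather than Codea. The packing of an object number into three colour bytes, the black background, and the names `id_to_colour`, `colour_to_id` and `object_at` are my assumptions; the tiny nested list stands in for the image drawn in memory.

```python
BACKGROUND = (0, 0, 0)  # assumed: black means "no object here"

def id_to_colour(object_id):
    """Pack an object number (1 to 16777215) into an (r, g, b) triple of bytes."""
    if not 1 <= object_id < 1 << 24:
        raise ValueError("object_id must be between 1 and 16777215")
    return (object_id >> 16) & 255, (object_id >> 8) & 255, object_id & 255

def colour_to_id(r, g, b):
    """Reverse id_to_colour."""
    return (r << 16) | (g << 8) | b

# Stand-in for the offscreen image: 3 rows of 4 pixels, two objects
picking_image = [
    [BACKGROUND, id_to_colour(1), id_to_colour(1), BACKGROUND],
    [BACKGROUND, id_to_colour(1), id_to_colour(2), id_to_colour(2)],
    [BACKGROUND, BACKGROUND,      id_to_colour(2), id_to_colour(2)],
]

def object_at(image, x, y):
    """Return the number of the object drawn at pixel (x, y), or None."""
    pixel = image[y][x]
    return None if pixel == BACKGROUND else colour_to_id(*pixel)

print(object_at(picking_image, 1, 0))
print(object_at(picking_image, 2, 1))
print(object_at(picking_image, 0, 2))
```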

Our current problem is similar, except we want to know more than which object was touched – we want to know where it was touched. Luckily, though, we only have one object to worry about.

So we can use a similar approach. Let’s think it through.

  1. We put our rectangle in a mesh, so a shader can work on it.
  2. We provide texture mappings which run from (0,0) at the bottom left, to (1,1) at top right
  3. However, we don’t need a texture image
  4. The fragment shader knows the texture mapping of each pixel, as a fraction 0-1. So the texture mapping of the point which is halfway along x and 3/4 of the way up y, is (0.5, 0.75). And that is exactly the information we want to store.
  5. We can put these values into the red and green colours of the pixel (this is easy because colour values in the shader also run from 0 to 1, not 0 to 255)
  6. Codea can look up the pixel that was touched, and get the x,y position on the original rectangle from the red and green colour values


So far so good, but there is one little problem. Codea (or maybe the shader) converts all the colour values to a number 0-255. This means we only have 256 possible values for x and y. However, if we are working with the full screen, it is 1024 pixels wide – ie 4 times bigger – and this means our solution is only accurate to within 4 pixels! (or maybe 8 pixels, for retina screens, which have double the number of pixels in each direction).
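To make the loss of precision concrete, here is a small Python sketch. I am assuming the colour byte is simply the texture fraction times 256, rounded down; the real conversion on the GPU may round differently, but the number of distinct values is the same. `encode_8bit` is a made-up name.

```python
def encode_8bit(t):
    """Squeeze a texture fraction (0 to 1) into one colour byte (0 to 255)."""
    return min(255, int(t * 256))

screen_width = 1024

# One byte per x position across the full width of the screen
bytes_seen = [encode_8bit(px / screen_width) for px in range(screen_width)]

print("distinct values:", len(set(bytes_seen)))
print("bytes for pixels 0 to 7:", bytes_seen[:8])
```

Every run of four neighbouring pixels shares the same byte, which is where the "accurate to within 4 pixels" comes from.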

The way out is to use the blue colour of each pixel to store additional information. It can store 256 values, which is 16 x 16, so it can hold 16 extra values for each of x and y. That gives us 16 times the precision in each direction, if we do this in the shader:

  • look up the texture mapping position for the current pixel, eg (0.325,0.7512)
  • multiply by 4096 (which is 256 * 16)
  • divide by 16, and keep the integer and remainder results
  • the integer results for x and y go into the red and green pixel values (they are between 0 and 255)
  • take the remainders (which are 0 to 15), multiply the x value by 16 and add y, and put the result (0 to 255) in the blue pixel value

When Codea looks up the pixel, it reverses the process. This approach enables us to accurately map a screen of up to 4096 x 4096 pixels.
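Here is the arithmetic for both directions as a Python sketch. This is not the shader itself (in the shader the colours are fractions from 0 to 1, not bytes); it just works through the steps above on the byte values that end up in the image. The names `encode` and `decode` are mine.

```python
def encode(u, v):
    """Pack texture fractions u, v (0 to 1) into red, green, blue bytes."""
    x = min(4095, int(u * 4096))      # 4096 = 256 * 16 steps in each direction
    y = min(4095, int(v * 4096))
    x_high, x_low = divmod(x, 16)     # integer result and remainder
    y_high, y_low = divmod(y, 16)
    red, green = x_high, y_high       # coarse position, 0 to 255
    blue = x_low * 16 + y_low         # both remainders share one byte
    return red, green, blue

def decode(red, green, blue):
    """Reverse encode: recover the texture fractions from the pixel colour."""
    x_low, y_low = divmod(blue, 16)
    x = red * 16 + x_low
    y = green * 16 + y_low
    return x / 4096, y / 4096

colour = encode(0.325, 0.7512)
print(colour)           # (83, 192, 52)
print(decode(*colour))  # within 1/4096 of (0.325, 0.7512)
```

The decoded position is always rounded down to the nearest 1/4096, which is fine for any screen up to 4096 pixels across.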

More accuracy?

You might expect that if we need even more accuracy, we could store numbers in the alpha value of the pixel colour. To quote my mathematician friend, that is a Very Bad Idea.

This is because OpenGL is set up so it adjusts the alpha values to make lines look straighter, and to make contrasting colours blend smoothly together. This means you can’t rely on alpha values to stay the way you set them.

Programming for speed

If your rectangle is always in the same position and orientation (rotation), then you can create this “image map” just once, during setup, then when there is a touch, you just look up the pixel and decode the colour values. OpenGL draws this image extremely quickly.

However, if your rectangle is changing position or orientation (rotation), then this won’t work, and you’ll have to draw the shader image each time there is a touch. In this case, you can tell OpenGL only to draw a couple of pixels around the touch point. It will not bother drawing anything else, which makes it much faster.

The code for this, if your touch point is (x,y), is

clip(x-1,y-1,3,3) --start x,y, width,height

I usually include a pixel on each side of the touch point, because clipping the pixel on its own doesn’t always seem to work.

I won’t go into all the formulae and shader details, but here is a code example, which you can explore if you wish.


