No, you don't have to alter the audio in any way; just leave it untouched.
Let's say you have 100 seconds of film material with audio of the same duration. At 24 fps, that is exactly 2400 frames. When you do 2:3 pulldown (one of the methods used for telecine, i.e. converting film to a television format), you get exactly 3000 video frames. But if you leave the audio untouched, so that its duration is still 100 seconds, NTSC needs 2997 video frames for those 100 seconds, not 3000. So you just throw away 1 frame out of every 1000 to get exactly 29.97 fps.
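The arithmetic above can be checked with a short Python sketch. The function names are made up for illustration, and it assumes the rounded rate of 29.97 fps used in this answer, not any particular tool's behavior.

```python
from fractions import Fraction

FILM_FPS = 24
NTSC_FPS = Fraction(2997, 100)  # 29.97 fps, the rounded figure used in the text


def pulldown_frame_count(film_frames):
    # 2:3 pulldown turns every 4 film frames into 5 video frames.
    return film_frames * 5 // 4


def frames_to_drop(duration_s):
    # Returns (film frames, frames after pulldown, frames NTSC needs, surplus).
    film_frames = FILM_FPS * duration_s
    video_frames = pulldown_frame_count(film_frames)
    needed = int(NTSC_FPS * duration_s)
    return film_frames, video_frames, needed, video_frames - needed


print(frames_to_drop(100))  # (2400, 3000, 2997, 3)
```

The surplus of 3 frames in 3000 is the "1 frame out of every 1000" mentioned above.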
In the worst case this gives an audio/video desynchronization of 33 ms, which I don't think is noticeable (I tend to notice desynchronization visually only when it reaches about 100 ms or more). And if you drop the extra frame in the middle of each sequence of 1000 frames rather than at the end, the desynchronization only varies from about -17 ms to +17 ms. That is not worth the trouble of fixing by going the other way around: leaving the video untouched and slowing down the audio by 0.1%.
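To see where these numbers come from, here is a rough Python sketch that compares the two drop positions over one block of 1000 frames. It assumes the pulldown output is nominally 30 fps and playback is at exactly 29.97 fps; `desync_range_ms` is a name invented for this example.

```python
from fractions import Fraction

PULLDOWN_FPS = Fraction(30)     # nominal rate of the 2:3 pulldown output (assumption)
NTSC_FPS = Fraction(2997, 100)  # 29.97 fps, the rounded figure used in the text
BLOCK = 1000                    # one frame is dropped out of every 1000


def desync_range_ms(drop_index):
    """Return (min, max) of display time minus content time, in ms, over one block."""
    kept = [i for i in range(BLOCK) if i != drop_index]
    offsets = [
        # Kept frame number j is shown at j / 29.97 s,
        # but its picture belongs at i / 30 s on the audio timeline.
        (Fraction(j) / NTSC_FPS - Fraction(i) / PULLDOWN_FPS) * 1000
        for j, i in enumerate(kept)
    ]
    return float(min(offsets)), float(max(offsets))


print(desync_range_ms(999))  # drop at the end of the block: about 0 to 33.3 ms
print(desync_range_ms(500))  # drop in the middle: about -16.65 to +16.65 ms
```

Each block of 999 kept frames lasts as long as 1000 frames at 30 fps, so the offset returns to zero at every block boundary and never accumulates.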