Old 2009-01-07, 19:50   Link #850
pichu
Senior Member
*Fansubber
 
Join Date: Jul 2004
Quote:
Originally Posted by TheFluff
wayyyyyy to miss the point there brosef
the selected area doesn't mark a "syllable" or anything else for that matter (what kind of meaningful audio clip lasts 10 milliseconds?), it is only there to demonstrate the scale of the image
the point is that it is possible to exactly determine the start of the song to within +/- 10ms (ASS doesn't allow higher precision unless you mod vsfilter), so that if you have a shifting line at the start of the song you can shift it to the exact start and be assured that the karaoke/lyrics sync up perfectly

of course all of this exxxxtreme precision is kind of pointless since your karaoke timing is likely to be a lot more off than +/- 10ms per syllable but the point was that it isn't inaccurate like getfresh claims it is, in fact it is more accurate than shifting to a video cue


What I don't understand is why you keep centering the whole discussion on frames per second, frame rates, and video, when they are irrelevant. Here are facts known to subbers:
  1. In karaoke timing, video scenes are almost never used (i.e., no one even bothers with scene-timing). Therefore, everything is timed by ear and by the audio spectrum.
  2. In karaoke shifting, I normally work by ear and/or the audio spectrum too, which gets within +/- 20ms of the original version. (1/50 of a second is acceptable in my opinion; it's possible to get under 10ms through extensive trial and error, but read on to the third point.) Of course, a quick and dirty way is to pick a known line in the original version and shift the karaoke to the matching scene, but this can be off by about +/- 1.5 frames, which is far less accurate than shifting by audio alone.
  3. In karaoke effects, lead-in and lead-out effects are often used. They can be as long as 5 frames in 24fps footage (about 208ms), so an extremely accurate time-shift (i.e. under 40ms difference) is irrelevant in most cases.
  4. You just made the whole discussion sound as if timing by ear is incompetent compared to a visual-only approach. It makes me wonder if you're legally deaf, since you want to avoid using your ears as much as possible out of a lack of trust in your own senses. It's either that, or you don't have enough timing experience to know which is better and which isn't.
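
For scale, the figures in points 2 and 3 can be put on the same footing by converting everything to milliseconds. This is only a rough sketch: it assumes 24 fps for the 1.5-frame scene-shift error as well (the post only states 24fps for the lead-in/lead-out case), and `frames_to_ms` is a made-up helper name, not part of any subtitling tool.

```python
def frames_to_ms(frames, fps=24.0):
    """Convert a frame count to milliseconds at the given frame rate.

    fps=24.0 is an assumption taken from the "24fps footage" in point 3.
    """
    return frames * 1000.0 / fps


# Point 2: shifting by ear / audio spectrum, "1/50" of a second.
audio_shift_error_ms = 1000.0 / 50

# Point 2: shifting to a video scene, off by about 1.5 frames.
scene_shift_error_ms = frames_to_ms(1.5)

# Point 3: a lead-in or lead-out effect of up to 5 frames.
lead_in_ms = frames_to_ms(5)

print(f"audio shift error : +/- {audio_shift_error_ms:.1f} ms")
print(f"scene shift error : +/- {scene_shift_error_ms:.1f} ms")
print(f"5-frame lead-in   : {lead_in_ms:.1f} ms")
```

Under that assumption the scene-based shift error comes out to roughly three times the audio-based one, and both are small next to a 5-frame lead-in.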