Lip Reading Demonstration LiP25w
Dec. 7, 2018: Released
- This demonstration introduces our research results on word-level lip reading technology.
- Recognition results may be incorrect. We kindly ask for your understanding.
- Most of the training data comes from young people, so recognition errors are more likely for children and elderly people.
- Depending on the network speed, it may take more than 10 seconds for the results to be displayed after uploading the video file.
- Uploaded video files, experiment dates, and recognition results (collectively, "user data") are recorded on a server that we manage.
- User data will be used to improve the performance of lip reading technology.
- The recognition method is based on the method presented at ViEW2018, but may be changed without notice. Major changes will be announced on this site.
25 Japanese words
| 20 | /do-u-i-ta-shi-ma-shi-te/ | you are welcome |
| 22 | /ha-ji-me-ma-shi-te/ | nice to meet you |
How to record the video
- Audio is not used in the recognition process. You may simply move your mouth without making a sound.
- Please refer to the following recording example.
- To obtain a correct recognition result:
- Please start moving your mouth after recording starts.
- Your face should be located at the center of the screen.
- Look at the center of the screen and speak.
- Close your mouth before and after your speech.
- Please speak at a natural speed and with natural mouth movement.
- Speaking too fast or too slowly, or opening your mouth too wide or too little, may cause recognition errors.
- Please set the image size to VGA (640 x 480 pixels) or larger.
How to use
- Please register your user ID in advance.
- After entering your registered user ID and selecting a video file, press the [Upload] button.
- Please upload a video file in which one of the 25 words in the table above is uttered.
- On smart devices (tablets and smartphones), you can also launch the video recording application and upload the recorded speech scene directly.
- The maximum video file size is 10 MB.
- Even if a face appears in the video, it may not be detected. In that case, an error will occur.
- If the video play time is less than 1 second or more than 10 seconds, an error will occur.
- After uploading, it takes time for the results to be displayed. Please wait a moment.
- Please refer to the following video (in Japanese).
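The limits listed above (10 MB file size, play time between 1 and 10 seconds, VGA or larger image size) can be checked locally before uploading. The following is only a minimal sketch under stated assumptions, not the checks the demonstration server actually performs: `VideoInfo` and `precheck` are hypothetical names, "10 MB" is assumed to mean 10 x 1024 x 1024 bytes, the image size check is assumed to be independent of orientation, and the metadata values are assumed to have been read beforehand with a tool of your choice.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the usage notes; the byte interpretation of "10 MB"
# (binary megabytes) is an assumption.
MAX_FILE_BYTES = 10 * 1024 * 1024
MIN_DURATION_S = 1.0
MAX_DURATION_S = 10.0
VGA_LONG_SIDE = 640
VGA_SHORT_SIDE = 480


@dataclass
class VideoInfo:
    """Hypothetical container for metadata read from a video file."""
    size_bytes: int
    duration_s: float
    width: int
    height: int


def precheck(info: VideoInfo) -> List[str]:
    """Return a list of problems; an empty list means all limits are met."""
    problems = []
    if info.size_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 10 MB")
    if info.duration_s < MIN_DURATION_S:
        problems.append("play time is less than 1 second")
    elif info.duration_s > MAX_DURATION_S:
        problems.append("play time is more than 10 seconds")
    # Assumption: portrait videos are compared by long and short side.
    long_side = max(info.width, info.height)
    short_side = min(info.width, info.height)
    if long_side < VGA_LONG_SIDE or short_side < VGA_SHORT_SIDE:
        problems.append("image size is smaller than VGA (640 x 480)")
    return problems


if __name__ == "__main__":
    sample = VideoInfo(size_bytes=4_500_000, duration_s=3.2, width=1280, height=720)
    issues = precheck(sample)
    print("OK to upload" if not issues else "; ".join(issues))
```

Passing this check does not guarantee a successful upload, since face detection can still fail on the server side.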
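Takeshi Saitoh, Michiko Kubokawa (Kyushu Institute of Technology), "Development of a word lip reading system using motion features with a recurrent neural network", Vision Engineering Workshop (ViEW2018), pp. 430-435, Dec. 2018.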
- 2019/1/24 Published in the Asahi Shimbun.
- 2019/2/14 Published in the Nishinippon Shimbun.
- 2019/3/11 Broadcast on RKB Radio "Umeko Shokudo".
- 2019/3/11 Released the instruction video.
- 2019/3/15 Published in the Mainichi Shimbun.