0% Complete
فارسی
Home
/
شانزدهمین کنفرانس بین المللی فناوری اطلاعات و دانش
From Faces to Words: An Efficient Persian Visual Lip Reading
Authors :
Mana Amini
1
Sajjad Aemmi
2
Azadeh Ashouri
3
Reza Akhoundzadeh
4
Kourosh Hassanzadeh
5
Mohammad Reza Mohammadi
6
1- PART AI Research Center
2- PART AI Research Center
3- PART AI Research Center
4- PART AI Research Center
5- PART AI Research Center
6- Iran University of Science and Technology
Keywords :
Lip Reading،Visual Speech Recognition،CTC Loss،LSTM،Video-Based Authentication
Abstract :
Visual speech recognition, or lip reading, is the task of transcribing spoken content directly from video frames of a speaker’s mouth without relying on audio. We develop an end-to-end visual lip reading system that processes cropped mouth regions from video sequences and decodes them into text using recurrent neural networks trained with CTC loss. To extend beyond existing English datasets, we collected and manually annotated a new Persian Lip Reading Dataset (PLRD), providing valuable resources for studying morphologically rich languages. Our experiments show that the proposed system achieves competitive word error rates on our custom Persian dataset. Beyond transcription, the model can also be employed in authentication scenarios, where it verifies whether a spoken phrase in a video matches a given reference text. This demonstrates the potential of lip reading systems not only for accessibility and robust speech recognition in noisy environments, but also for secure user verification.
Papers List
List of archived papers
Aligning the Brick and Mortar cosmetic with digital transformation as the right way to overhaul the In-store Experience
Mehrgan Malekpour - Dr Federica Caboni
توسعه ی کارآفرینی دیجیتال در بخش کشاورزی
شایان مظاهری - فاطمه قربانی پیرعلیدهی - فاطمه رزاقی بورخانی
Combinatorial Auction Based on Social Choice in the Internet of Things
Maede Esmaeili - Faria Nassiri-Mofakham - Fatemeh Hassanvand
آرتمیا: پروتکل مسیریابی مبتنی بر انجمن و آگاه به نظم تماس در شبکة اجتماعی متحرک تأخیرپذیر
سعید مرادی - جمشید باقرزاده محاسفی
Experimental analysis of automated negotiation agents in modeling Gaussian bidders
Fatemeh Hassanvand - Dr Faria Nassiri-Mofakham
Data Analysis to Reduce Electrical Power Plants
Amirali Sahraei - Jamshid Shanbehzadeh
Context Awareness Gate for Retrieval Augmented Generation
Mohammad Hassan Heydari - Arshia Hemmat - Erfan Naman - Afsaneh Fatemi
Statistical distance-base acceptance strategy for desirable offers in bilateral automated negotiation
Arash Ebrahimnezhad - Dr Hamid Jazayeriy - Dr Faria Nassiri-mofakham
A No-Code Platform for Developing Customizable Recommender Systems for Restaurants
Moein-Aldin AliHosseini - MohammadReza Sharbaf
BMPA- DSL: Binary Marine Predators Algorithm to Identify Driver's Different Levels of Stress
Mahtab Vaezi - Mehdi Nasri - Farhad Azimifar - Mahdi Mosleh
more
Samin Hamayesh - Version 43.8.0