16th International Conference on Information and Knowledge Technology

Integrating Wasserstein GANs for High-Speed Transformer-Based Neural Machine Translation

Authors:

Parisa Nekoogol¹ Mostafa Salehi²

1- University of Tehran 2- University of Tehran

Keywords:

Neural Machine Translation, Generative Adversarial Networks, Reinforcement Learning, Transformer

Abstract:

Neural machine translation (NMT), a key achievement in natural language processing (NLP), still struggles with complex sentences, where its output is often low in quality and lacks natural fluency. This study aims to improve translation quality by integrating Generative Adversarial Networks (GANs) with an NMT model. First, a baseline NMT model from previous research, based on recurrent neural networks (RNNs), was reconstructed and implemented. The recurrent architecture was then replaced with the Transformer, and the system was developed as a Wasserstein Generative Adversarial Network (WGAN). Because text is a sequence of discrete tokens, the generator's output is non-differentiable; to overcome this problem, Self-Critical Sequence Training (SCST), a reinforcement learning (RL) method, was employed. A core objective was to analyze how much adversarial training benefits an already robust Transformer-based generator. The study concludes that adversarial training does help the model generate more fluent translations, but that this improvement is more substantial for RNN-based models than for the Transformer architecture.
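The two training signals named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-token probabilities, and the reward values are hypothetical, and the abstract does not say how the reward is computed (a critic score or a metric such as BLEU are both plausible assumptions). The sketch shows the generic SCST loss, where the reward of a greedy decode serves as the baseline for a sampled translation, and the generic Wasserstein critic loss.

```python
import math


def log_prob_of_sequence(step_probs):
    """Sum of the log-probabilities the generator gave to each chosen token."""
    return sum(math.log(p) for p in step_probs)


def scst_loss(sample_step_probs, sample_reward, greedy_reward):
    """Self-critical policy-gradient loss for one sentence (hypothetical helper).

    The greedy decode's reward is the baseline, so minimizing this loss raises
    the probability of the sampled translation only when it beats the greedy one.
    How the rewards are obtained is an assumption, not stated in the abstract.
    """
    advantage = sample_reward - greedy_reward
    return -advantage * log_prob_of_sequence(sample_step_probs)


def wgan_critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss: mean score on generated minus mean score on real."""
    mean_fake = sum(fake_scores) / len(fake_scores)
    mean_real = sum(real_scores) / len(real_scores)
    return mean_fake - mean_real


if __name__ == "__main__":
    # Illustrative numbers only: a two-token sampled translation whose reward
    # is slightly higher than that of the greedy translation.
    print(scst_loss([0.5, 0.25], sample_reward=0.8, greedy_reward=0.6))
    # Critic scores for two reference and two generated translations.
    print(wgan_critic_loss(real_scores=[1.0, 3.0], fake_scores=[0.0, 1.0]))
```

In a real system the log-probabilities would come from the generator's softmax outputs and the losses would be backpropagated by an automatic differentiation framework; the plain-Python version above only shows the arithmetic.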

