A SURVEY OF METHODS OF TEXT-TO-IMAGE TRANSLATION

Author(s)

  • I. Konarieva, Complutense University of Madrid (UCM), Spain
  • D. Pidorenko, Kharkiv National University of Radio Electronics, Ukraine https://orcid.org/0000-0003-0232-4634
  • O. Turuta, Kharkiv National University of Radio Electronics, Ukraine

DOI:

https://doi.org/10.30837/bi.2019.2(93).11

Keywords:

Image generation, Text keywords, Image-to-image translation, Text-to-image translation, Text compression

Abstract

This work surveys the existing methods of text compression (extracting keywords or creating a summary) using the RAKE, LexRank, Luhn, LSA, and TextRank algorithms; image generation; and text-to-image and image-to-image translation, including GANs (generative adversarial networks). Different types of GANs are described, such as StyleGAN, GauGAN, Pix2Pix, CycleGAN, BigGAN, and AttnGAN. The work aims to show ways to create illustrations for a text. First, key information should be obtained from the text. Second, this key information should be transformed into images. Several ways to transform keywords into images are proposed: generating images, or selecting them from a dataset and transforming them further, for example by generating new images based on the selected ones or by combining selected images (e.g., applying the style of one image to another). Based on the results, possibilities for further improving the quality of image generation are also outlined: combining image generation with image selection from a dataset, and limiting the topics of image generation.
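As an illustration of the first step (obtaining key information from the text), the sketch below extracts keywords and an extractive summary from a short sample text. It assumes the third-party Python packages rake_nltk (RAKE) and sumy (which provides the LexRank, Luhn, LSA, and TextRank summarizers); the package choice, the sample text, and the parameter values are illustrative assumptions and not part of the surveyed work. The resulting keywords or summary sentences would then serve as the conditioning input for the text-to-image stage.

# A minimal sketch of the text-compression step, assuming the rake_nltk and
# sumy packages; the sample text and parameters are illustrative only.
import nltk
from rake_nltk import Rake
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lex_rank import LexRankSummarizer  # Luhn/LSA/TextRank live in sibling modules

nltk.download("stopwords", quiet=True)  # rake_nltk and sumy rely on NLTK data
nltk.download("punkt", quiet=True)

text = (
    "Generative adversarial networks can synthesize photorealistic images. "
    "Conditioning the generator on a text embedding enables text-to-image translation. "
    "Style transfer applies the artistic style of one image to the content of another."
)

# Keyword extraction with RAKE: candidate phrases are ranked by word co-occurrence scores.
rake = Rake()
rake.extract_keywords_from_text(text)
keywords = rake.get_ranked_phrases()[:5]

# Extractive summarization with LexRank: the most central sentences are selected.
parser = PlaintextParser.from_string(text, Tokenizer("english"))
summary = LexRankSummarizer()(parser.document, sentences_count=1)

print("Keywords:", keywords)
print("Summary:", " ".join(str(sentence) for sentence in summary))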

Author Biographies

I. Konarieva, Complutense University of Madrid (UCM)

Master's student at the Complutense University of Madrid

D. Pidorenko, Kharkiv National University of Radio Electronics

Master's student at the Department of Software Engineering

O. Turuta, Kharkiv National University of Radio Electronics

Candidate of Technical Sciences (PhD), Associate Professor at the Department of Software Engineering

References

Karras T., Laine S., Aila T. A Style-Based Generator Architecture for Generative Adversarial Networks // IEEE Conference on Computer Vision and Pattern Recognition (CVPR) – 2019. – P. 4401-4410.

Park T., Liu M.-Y., Wang T.-C., Zhu J.-Y. Semantic Image Synthesis with Spatially-Adaptive Normalization // IEEE Conference on Computer Vision and Pattern Recognition (CVPR) – 2019. – P. 2337-2346.

Isola P., Zhu J.-Y., Zhou T., Efros A. A. Image-to-Image Translation with Conditional Adversarial Networks // IEEE Conference on Computer Vision and Pattern Recognition (CVPR) – 2017. – P. 1125-1134.

Zhu J.-Y., Park T., Isola P., Efros A. A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks // IEEE International Conference on Computer Vision (ICCV) – 2017. – P. 2223-2232.

Gatys L. A., Ecker A. S., Bethge M. A Neural Algorithm of Artistic Style // arXiv e-prints, arXiv:1508.06576v2 – 2015. – P. 1.

Brock A., Donahue J., Simonyan K. Large Scale GAN Training for High Fidelity Natural Image Synthesis // arXiv e-prints, arXiv:1809.11096v2 – 2019. – P. 8.

Xu T., Zhang P., Huang Q., Zhang H., Gan Z., Huang X., He X. AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks // IEEE Conference on Computer Vision and Pattern Recognition (CVPR) – 2018. – P. 1316-1324.

Shafkat I. Intuitively Understanding Variational Autoencoders // Medium – 2018. – URL: https://towardsdatascience.com/intuitively-understanding-variational-autoencoders-1bfe67eb5daf

Shaham T. R., Dekel T., Michaeli T. SinGAN: Learning a Generative Model from a Single Natural Image // IEEE International Conference on Computer Vision (ICCV) – 2019. – P. 4570-4580.

Rose S., Engel D., Cramer N., Cowley W. Automatic Keyword Extraction from Individual Documents // M. W. Berry & J. Kogan (eds.), Text Mining: Theory and Applications – John Wiley & Sons, Hoboken, 2010. – P. 1-20.

Pranay M., Aman G., Aayush Y. Text Summarization in Python: Extractive vs. Abstractive Techniques Revisited // Rare Technologies – 2017. – URL: https://rare-technologies.com/text-summarization-in-python-extractive-vs-abstractive-techniques-revisited/

Naskar A. Extract Custom Keywords Using NLTK POS Tagger in Python // ThinkInfi – 2018. – URL: https://www.thinkinfi.com/2018/10/extract-custom-entity-using-nltk-pos.html

Ma L., Jia X., Sun Q., Schiele B., Tuytelaars T., Gool L. V. Pose Guided Person Image Generation // IEEE Conference on Computer Vision and Pattern Recognition (CVPR) – 2019. – P. 2337-2346.

Bondarenko M., Konoplyanko Z., Chetverikov G. Analiz problemy sozdaniya novich technicheskich sredstv dlya realizatsii lingvisticheskogo interfeisa // Proc. of the 10th International Conference KDS-2003, Varna, Bulgaria – 2003. – P. 3-15.

Bondarenko M. F., Konoplyanko Z. D., Chetverikov G. G. Theory Fundamentals of Multiple-Valued Structures and Coding in Artificial Intelligence Systems – Kharkiv: Factor-druk, 2003. – 336 p.


Published

2019-12-02