Viviify (2021)


How can we make reading more lively, accessible, and easier to comprehend? Viviify is a web application that converts text to video for a more engaging, easier-to-understand experience.


During my graduate program at NYU, I was given many reading assignments, and I found reading long texts frustrating. Learning by reading cost me a lot of time and energy. So I wondered, "What if a tool could turn text into something more engaging, such as a video? Wouldn't that improve my learning experience?" I did some research and found sources suggesting that our brains process visuals much faster than text and that video viewers tend to retain more information than text readers.

So I designed and built a web application that automatically converts any English text into a video with subtitles and voiceover. The tool takes the user's text, along with options for voice type, speed, and subtitle size, and uses Natural Language Processing (NLP) to find the stock video clips most relevant to the meaning of each sentence. It then outputs a video with subtitles and a text-to-speech voiceover that the user can watch directly in the browser. A dedicated video player lets the user navigate sentence by sentence, which supports comprehension. Generated videos can be stored in the user's account, shared via a link, or embedded in a web page or blog post. The app works across browsers and devices for a better user experience and accessibility.



I began my research by looking for existing tools that convert text to video. I found a few similar projects, but none were publicly available or shared my approach and goal. I also found a few research papers on generating video from text using a Generative Adversarial Network (GAN), but this approach requires large training datasets and long training times, and is presumably too slow to generate video for on-demand viewing.

I also tried quite a few online video-making tools that let users create a video by arranging clips on a timeline. Because these tools are built for production, they demand the user's time and energy, which makes them unsuitable for my purpose: a video that makes reading more engaging needs to be available almost immediately and effortlessly.

I also researched whether learning through video can be more engaging than text alone. I found a number of articles and studies suggesting that video can be more effective for learning than text. They claim that our brains process visuals 60,000 times faster than text and that viewers retain 95% of a video’s message compared to 10% when reading text. They also report that approximately 65% of the population are visual learners.

Technical Details

Viviify was built using HTML, CSS, JavaScript, Node.js, Express, MongoDB, Amazon Polly, Python, Flask, gevent, and spaCy.
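
To make the sentence-by-sentence clip matching described earlier more concrete, here is a minimal sketch of the idea in Python. It is not Viviify's actual code. To stay self-contained it uses only the standard library: a regular expression stands in for spaCy's sentence segmentation, a tiny stop-word list stands in for real NLP, and CLIP_TAGS is a made-up catalog standing in for a stock video search. All function and file names are hypothetical.

```python
import re

# Tiny stand-in stop-word list (assumption); a real pipeline would rely on
# an NLP library such as spaCy instead.
STOP_WORDS = {
    "a", "an", "the", "and", "or", "of", "in", "on", "at", "to", "is",
    "are", "was", "were", "it", "its", "with", "for", "by", "that",
    "this", "as", "over",
}

# Hypothetical clip catalog: file name -> descriptive tags.
CLIP_TAGS = {
    "ocean.mp4": {"ocean", "sea", "waves", "water"},
    "city.mp4": {"city", "traffic", "buildings", "street"},
    "forest.mp4": {"forest", "trees", "birds", "nature"},
}


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def keywords(sentence):
    """Lowercase the sentence and keep the words that are not stop words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return {w for w in words if w not in STOP_WORDS}


def best_clip(sentence, catalog=CLIP_TAGS):
    """Return the clip whose tags overlap most with the sentence, or None."""
    kws = keywords(sentence)
    clip = max(catalog, key=lambda name: len(kws & catalog[name]))
    return clip if kws & catalog[clip] else None


def build_storyboard(text):
    """Pair each sentence (used as the subtitle) with its best-matching clip."""
    return [{"subtitle": s, "clip": best_clip(s)} for s in split_sentences(text)]


if __name__ == "__main__":
    sample = "Waves crash over the sea. Traffic fills the city street!"
    for scene in build_storyboard(sample):
        print(scene["clip"], "|", scene["subtitle"])
```

Each storyboard entry holds one sentence and one clip, which is also what would let a player jump between sentences. A sentence with no matching tags gets None here; a real app would need some fallback clip for that case.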

Software Engineer | UX Designer | Creative Technologist