This article is automatically translated.
I’m Sasaki from the research and development department of Dwango Media Village. Dwango Media Village ran a corporate exhibition booth for two days at the 21st Symposium on Image Recognition and Understanding (MIRU2018), held in August. MIRU2018 is a symposium for academic exchange among Computer Vision researchers in Japan. Alongside the main event, there was a Young Researchers Program, which gave young researchers, including students, postdocs, and industry professionals, a venue to interact. This article introduces the symposium and the Young Researchers Program.
At the MIRU2018 corporate booth, Dwango Media Village presented posters and demos. The posters introduced projects such as automatic coloring of manga pages, normal map estimation from line drawings, and high-speed clustering of large-scale, multi-dimensional image feature vectors. Additionally, voice conversion and Dwango’s horse racing prediction project were showcased.
The demos featured 2D-to-3D pose estimation, a recent internship outcome, and the Artificial Life Project, which was unveiled on the day of the event. The 3D pose estimation demo showcased a deep learning model that learns to estimate 3D poses from 2D data alone, without the paired 2D-3D pose data that was previously required. Visitors could watch 3D poses being estimated in real time from images captured by a webcam. The artificial life demo showed virtual creatures in a 3D virtual space learning to survive and move their bodies through reinforcement learning. Some visitors even returned on the second day to see the progress of the creatures born the previous day.
Deep learning, a machine learning technique, is gaining attention in Computer Vision for its high performance and versatility, with new methods and algorithms being proposed daily. The tutorials covered representative deep learning models, Generative Adversarial Networks (GANs), which excel at image generation tasks, and the latest research trends in reinforcement learning. There was also a session on automatic hyperparameter tuning, which is crucial in machine learning. The tutorials were practical, covering both theory and useful know-how.
Many studies using Convolutional Neural Networks (CNNs), deep learning models for image recognition and generation, were presented. The oral presentations included papers recently accepted at CVPR, a top conference in the Computer Vision field.
Here are some presentations that particularly interested me:
For image classification with CNNs, a dataset with pairs of input images and correct labels is essential. Incorrect labels can degrade performance. Normally, data cleaning is performed, but this is challenging with large datasets. This presentation proposed a method to train CNNs while correcting incorrect labels.
This study involves image generation from descriptions. While CNNs can create high-quality images, generating images exactly as users envision is difficult. This research introduced an interactive method where users can add instructions to refine the avatar image generation model’s results.
MIRU2018 actively included presentations from other fields. It was impressive to hear that deep learning could revolutionize robotics, much as it has speech and natural language processing. One example was research on reinforcement learning for manipulating soft materials (e.g., fabric), which is challenging with conventional robotics technology.
In the Young Researchers Program, each team conducted a research survey in a field other than Computer Vision. About two months before MIRU2018, teams of about five members were formed, each assigned a different field (e.g., natural language, data mining). The survey and the poster and oral presentations required a meticulous literature review. Team members, scattered nationwide, communicated frequently via Slack to read papers and prepare.
During the symposium, various exchange activities were organized. The first day featured an icebreaker with 90-second self-introduction lightning talks, where participants introduced their research and hobbies. Many were university lab students focusing on Computer Vision-related fields, with a strong interest in deep learning models. After the self-introductions, we enjoyed a social gathering at a nearby izakaya, savoring Hokkaido’s food and drinks. Throughout the symposium, there were sponsor-provided lunches, spot presentations of survey results, and a 20-minute oral presentation for each team, concluding with discussions on the future of young researchers.
The Young Researchers Program was a tough but rewarding experience. It required reading numerous papers and compiling them into a single presentation, demanding significant effort. However, it was a great opportunity to connect with new research peers across universities and companies. The committee members were very friendly, fostering a warm and amicable environment for interaction.
MIRU2018 not only focused on Computer Vision research trends but also encouraged exploring different fields. It was impressive to see the emphasis on finding new challenges in emerging areas, not just learning about the latest research.
August in Sapporo was far more comfortable than Tokyo, making it hard to leave. The city hosted beer gardens, where we could leisurely enjoy beer and jingisukan ("Genghis Khan," Hokkaido's grilled lamb dish) outdoors in the cool evenings. The conference banquet featured Hokkaido delicacies like soup curry, miso ramen, and sake. MIRU2019 will be held in Osaka next year, and I definitely plan to attend!