News On Japan

Why Tokyo University’s Data Science Course Has Teens Hooked

TOKYO - A rapidly growing data science program at the University of Tokyo is attracting an unusually wide range of participants, with junior high and high school students studying alongside university students and working adults.

The course, known as GCI, is offered nationwide online and free of charge for students, eliminating barriers even for complete beginners and fueling a surge of interest from across Japan and abroad.

At a recent completion ceremony, organizers reported 10,579 total enrollees and 1,490 graduates, a completion rate of about 14% that underscores the program’s rigor. “I thought I might fail the final assignment, but I managed to finish,” said a second-year junior high school student with little programming experience. GCI is held twice a year, with the next session starting in mid-October. Its popularity has also gone global: the English-language version drew 7,700 applicants from 32 countries and 430 universities.

To explore why the course is so compelling, GCI instructor and AI startup researcher Masayuki Sera walked through its approach, from fundamentals to practical applications. Sera works at Twins, a company spun out of the university’s AI lab, and applies data science to real business problems. “The work is wide-ranging,” he said. “For a telecom company, for example, we might predict whether customers are likely to cancel their contracts and then suggest changes to their plans. We also assess whether current strategies are effective and adjust them if necessary.”

The program’s curriculum follows a structured process: explore and clean data, build models, evaluate results, and iterate. A signature assignment involves the “Home Credit Default Risk” challenge, where students predict whether customers will default on loans based on tabular data such as income, family size, and loan type. The training dataset includes about 170,000 rows and 51 columns, while the test set has around 60,000 rows and 50 columns, with the default labels hidden.

Exploratory data analysis (EDA) is emphasized early on, teaching students to identify missing values, outliers, and skewed distributions. In one example, missing entries in household size and product price had to be filled before modeling. Students also learn how class imbalance can distort results: 92% of customers repay their loans while 8% default, so a model that simply predicts everyone will repay scores about 92% accuracy without catching a single default. That is why metrics like AUC are preferred over raw accuracy. Visualization reveals useful patterns: income distributions become more interpretable after log transformations, and certain features, like education level and loan type, strongly correlate with default rates.
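The imbalance point can be checked with a few lines of Python. The sketch below is illustrative only and is not taken from the GCI materials: the labels are synthetic, drawn to roughly match the 92/8 split, and the `roc_auc` helper is a hypothetical name for a plain rank-based AUC calculation.

```python
import random

random.seed(0)

# Synthetic labels (an assumption, not course data): 1 = default, 0 = repaid.
n = 10_000
labels = [1 if random.random() < 0.08 else 0 for _ in range(n)]

# A "model" that predicts that every customer repays.
always_repay = [0] * n
accuracy = sum(p == y for p, y in zip(always_repay, labels)) / n


def roc_auc(scores, labels):
    """Rank-based AUC: chance that a random defaulter gets a higher score
    than a random repayer, with ties counted as half."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


auc = roc_auc(always_repay, labels)
print(f"accuracy={accuracy:.3f}, AUC={auc:.3f}")
```

The constant prediction looks strong on accuracy but its AUC is 0.5, the same as random guessing, because it cannot rank any defaulter above any repayer.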

Before modeling, text categories must be encoded as numbers and missing values filled. Although one-hot encoding is generally safer, GCI demonstrates label encoding for simplicity with tree-based models. A basic random forest model trained on a 70/30 split achieves an AUC of around 0.65—“not exceptional but proof the features contain predictive power,” Sera noted.
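As a rough illustration of that baseline, the following sketch fills missing values, label-encodes the text columns, and trains a random forest on a 70/30 split using pandas and scikit-learn. It is a minimal example under stated assumptions, not the course's notebook: the data is synthetic, the column names are hypothetical, and the resulting AUC says nothing about the 0.65 reported for the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2_000

# Synthetic stand-in for the loan table; all column names are hypothetical.
df = pd.DataFrame({
    "income": rng.lognormal(mean=12.0, sigma=0.5, size=n),
    "family_size": rng.integers(1, 6, size=n).astype(float),
    "loan_type": rng.choice(["cash", "revolving"], size=n),
    "education": rng.choice(["secondary", "higher"], size=n),
})

# Assumed relationship: cash loans default more often than revolving loans.
p_default = 0.05 + 0.08 * (df["loan_type"] == "cash")
df["target"] = (rng.random(n) < p_default).astype(int)

# Knock out about 5% of two columns to imitate missing entries.
df.loc[rng.random(n) < 0.05, "family_size"] = np.nan
df.loc[rng.random(n) < 0.05, "education"] = None

# Fill missing values: median for numbers, a "missing" category for text.
df["family_size"] = df["family_size"].fillna(df["family_size"].median())
for col in ["loan_type", "education"]:
    # Label encoding: each category becomes an integer code.
    df[col] = df[col].fillna("missing").astype("category").cat.codes

X = df.drop(columns="target")
y = df["target"]

# 70/30 split, stratified so both parts keep a similar default rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# AUC is computed from predicted default probabilities, not hard labels.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"validation AUC: {auc:.3f}")
```

Label encoding is tolerable here because tree-based models split on thresholds rather than treating the integer codes as quantities; with a linear model, one-hot encoding would be the safer choice.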

Students then learn how to improve performance through feature engineering, such as creating new variables like the ratio of loan amount to income (repayment burden) or product price to loan amount (self-financing ratio). These changes can nudge AUC scores upward—sometimes by just 0.5 percentage points, a difference that can significantly impact leaderboard rankings. Other techniques include comparing individual loan amounts to group averages, trying different encoding or imputation strategies, tuning hyperparameters, or even switching algorithms. This iterative cycle—hypothesizing, testing, and refining—is where many learners find themselves “hooked.”
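The ratio and group-average features described above take only a few lines of pandas. This is a sketch with made-up numbers and hypothetical column names, not the actual Home Credit schema.

```python
import pandas as pd

# Four made-up loans; the column names are assumptions for illustration.
loans = pd.DataFrame({
    "income": [200_000.0, 150_000.0, 90_000.0, 300_000.0],
    "loan_amount": [400_000.0, 600_000.0, 270_000.0, 300_000.0],
    "product_price": [360_000.0, 600_000.0, 300_000.0, 240_000.0],
    "loan_type": ["cash", "cash", "revolving", "revolving"],
})

# Repayment burden: loan amount relative to income.
loans["repayment_burden"] = loans["loan_amount"] / loans["income"]

# Self-financing ratio: product price relative to loan amount.
loans["self_financing_ratio"] = loans["product_price"] / loans["loan_amount"]

# Each loan compared with the average loan amount of its own loan type.
group_mean = loans.groupby("loan_type")["loan_amount"].transform("mean")
loans["loan_vs_group_mean"] = loans["loan_amount"] / group_mean

print(loans[["repayment_burden", "self_financing_ratio", "loan_vs_group_mean"]])
```

Each new column would then be added to the training data and the model re-evaluated on the held-out split, keeping the feature only if the validation AUC improves.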

What keeps even teenagers engaged, instructors say, is the immediate feedback and sense of discovery. With only a few lines of Python, beginners can build a competitive model, and a single visualization can reshape their understanding of the data. “You don’t need to master every algorithm to start,” said Sera. “What matters is rigorous analysis, thoughtful feature design, and relentless iteration.”

GCI’s success reflects a broader trend: data science has become the gateway to artificial intelligence. By grounding learners in core skills—predictive modeling, fair evaluation, and careful data preparation—the course demystifies AI and builds practical foundations. For companies, the message is similar: rather than chasing buzzwords, start by examining existing data, asking the right questions, and letting evidence guide strategy.

Source: テレ東BIZ
