Xiaojie Yang

Ph.D. Candidate, The University of Tokyo
Koshizuka Laboratory
The Daiwa Ubiquitous Computing Research Building
7 Chome-3-1 Hongo, Bunkyo City, Tokyo, Japan
xiaojieyang [at] g.ecc.u-tokyo.ac.jp
Google Scholar || GitHub || ORCID

I specialize in Spatial Information Science with Deep Learning, leveraging big data for innovative applications. My academic background in Geographic Information Science has fueled my ambition to integrate advanced computer science technologies to enhance spatial research capabilities. My Ph.D. thesis explores causality analysis using spatio-temporal data. Currently, I am interested in uncovering the potential of large language models to empower urban computing in scenarios such as trajectory generation and geo-related fake information detection.

Research interests: urban computing, causal inference, LLMs, human mobility prediction

News

Aug 10, 2025	Our paper has been accepted by IEEE Transactions on Intelligent Transportation Systems!
Nov 17, 2024	Our paper was accepted by KDD 2025 Research Track (August Cycle)!

Highlights

I worked as a research intern in the INTPART DTRF project at Western Norway Research Institute in Jul–Aug 2024.

Selected publications

CausalMob: Causal Human Mobility Prediction with LLMs-derived Human Intentions toward Public Events

Xiaojie Yang, Hangli Ge, Jiawei Wang, Zipei Fan, Renhe Jiang, Ryosuke Shibasaki, and Noboru Koshizuka

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1 2025 | [ arXiv ]

Large-scale human mobility exhibits spatial and temporal patterns that can assist policymakers in decision-making. Although traditional prediction models attempt to capture these patterns, they are often affected by nonperiodic public events, such as disasters and occasional celebrations. Since regular human mobility patterns are affected by these events, estimating their causal effects is critical to accurate mobility prediction. News articles provide unique perspectives on these events, though processing them is challenging. In this study, we propose a causality-based prediction model, CausalMob, to analyze the causal effects of public events. We first utilize large language models (LLMs) to extract human intentions from news and transform them into features that act as causal treatments. Next, the model learns representations of spatio-temporal regional covariates from multiple data sources to serve as confounders for causal inference. Finally, we present a causal effect estimation framework to ensure that event features remain independent of confounders during prediction. Based on large-scale real-world data, the experimental results show that the proposed model excels in human mobility prediction, outperforming state-of-the-art models.

FRTP: Federating Route Search Records to Enhance Long-term Traffic Prediction

Hangli Ge, Xiaojie Yang, Itsuki Matsunaga, Dizhi Huang, and Noboru Koshizuka

2024 IEEE International Conference on Big Data (BigData) 2024 | [ arXiv ]

Causality-Aware Next Location Prediction Framework based on Human Mobility Stratification

Xiaojie Yang, Zipei Fan, Hangli Ge, Takashi Michikata, Ryosuke Shibasaki, and Noboru Koshizuka

2024 IEEE Smart World Congress (SWC) 2024 | [ arXiv ]

Online trajectory prediction for metropolitan scale mobility digital twin

Zipei Fan, Xiaojie Yang, Wei Yuan, Renhe Jiang, Quanjun Chen, Xuan Song, and Ryosuke Shibasaki

Proceedings of the 30th International Conference on Advances in Geographic Information Systems 2022 | [ arXiv ]

Knowing "what is happening" and "what will happen" with mobility in a city is the building block of a data-driven smart city system. In recent years, the mobility digital twin, which creates a virtual replica of human mobility and predicts or simulates the fine-grained movements of subjects in a virtual space at a metropolitan scale in near real-time, has shown great potential in modern urban intelligent systems. However, few studies have provided practical solutions. The main difficulties are fourfold: 1) the daily variation of human mobility is hard to model and predict; 2) the transportation network imposes complex constraints on human mobility; 3) generating a rational fine-grained human trajectory is challenging for existing machine learning models; and 4) making a fine-grained prediction incurs high computational costs, which is challenging for an online system. Bearing these difficulties in mind, in this paper we propose a two-stage human mobility predictor that stratifies the coarse and fine-grained level predictions. In the first stage, to encode the daily variation of human mobility at a metropolitan level, we automatically extract citywide mobility trends as crowd contexts and predict long-term and long-distance movements at a coarse level. In the second stage, the coarse predictions are resolved to a fine-grained level via a probabilistic trajectory retrieval method, which offloads most of the heavy computations to the offline phase. We tested our method using a real-world mobile phone GPS dataset in the Kanto area in Japan, and achieved good prediction accuracy and a time efficiency of about 2 min in predicting the next 1 h of movements of about 220K mobile phone users on a single machine, supporting higher-level analysis of mobility prediction.

Education

Ph.D. in Applied Computer Science
The University of Tokyo, 2022 – 2025
Funded by Shibasaki Scholarship (3 years)
M.Sc. in Socio-Cultural Environmental Studies
The University of Tokyo, 2020 – 2022
B.Eng. in Geographical Information Science
Wuhan University, 2016 – 2020