Sen Yang
I am a Senior Machine Learning Engineer at Google, working on the Search Intelligence team.
My work focuses on large-scale personalized retrieval systems, representation learning,
and end-to-end ML system design under production constraints.
Previously, I worked at Waymo, where I led evaluations of Waymo Driver models using large-scale representation learning and foundation models for driving log understanding.
Before that, I was a
Senior Machine Learning Engineer at LinkedIn, building large-scale personalization and
feed ranking systems.
I received my Ph.D. in Electrical and Computer Engineering from Rutgers University, advised by Dr. Ivan Marsic and Dr. Hui Xiong. My
research interests span machine learning, representation learning, retrieval, and
learning-based evaluation and decision systems, with publications in venues such as
KDD, SIGIR, CVPR, and ACM TKDD.
Contact: Email | Website | CV | Google Scholar | LinkedIn
News
| Jun 2025 | Joined Google as a Senior Machine Learning Engineer on the Search Intelligence team. |
| 2025 | Journal paper accepted by Journal of Biomedical Informatics (JBI). |
| 2024 | Journal paper accepted by ACM Transactions on Knowledge Discovery from Data (TKDD). |
| 2023 | Paper accepted to SemEval 2023. |
| 2022 | Joined Waymo as a Senior Machine Learning Engineer. |
Research
Coding Agent Development
This work builds on my Ph.D. research in learning-based evaluation and decision systems,
extending these ideas to agent architectures for code generation, tool use, and verification.
Large-scale Personalized Ranking and Retrieval Models
I work on large-scale personalized ranking and retrieval models across both feed and search
settings. At LinkedIn, I built and scaled home feed ranking models, learning user and content
representations and validating improvements through online A/B experimentation. At Google
Search, I focus on building personalized retrieval channels, designing representation learning
pipelines that optimize end-to-end relevance under strict latency and production constraints.
Waymo Driver Evaluation with Foundation Models
I led learning-based evaluation efforts for Waymo Driver models, leveraging large-scale
representation learning and foundation models to replace brittle rule-based metrics and
improve robustness and generalization in driving log understanding.
Process Mining (a.k.a. Workflow Mining)
Process mining aims to discover, monitor, and improve real-world processes by extracting knowledge from activity logs. It has three main problems: process discovery, conformance checking, and model enhancement. Process discovery takes activity logs as input and produces data-driven graphical models. Conformance checking aligns activity logs with process models to highlight their differences and commonalities. Model enhancement repairs a process model using activity logs. My research focused on developing new techniques and algorithms for these problems. In addition, I explored a new research direction, the process recommender system (PRS), which aims to provide people with real-time, step-by-step recommendations on real-world processes.
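As a rough illustration of these ideas (not the algorithms from my publications), the sketch below builds a directly-follows graph from a toy activity log, flags steps in a new trace that the graph has never seen, and suggests the most frequent next activity. The log contents and the function names `directly_follows_graph`, `unseen_steps`, and `recommend_next` are hypothetical and chosen only for this example.

```python
from collections import Counter

def directly_follows_graph(traces):
    """Process discovery (simplified): count how often activity a
    is immediately followed by activity b across all traces."""
    edges = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += 1
    return edges

def unseen_steps(trace, edges):
    """Conformance checking (simplified): return consecutive pairs
    in `trace` that never occur in the discovered graph."""
    return [(a, b) for a, b in zip(trace, trace[1:]) if (a, b) not in edges]

def recommend_next(activity, edges):
    """Recommendation (simplified): most frequent successor of
    `activity`, or None if it has no observed successor."""
    successors = {b: n for (a, b), n in edges.items() if a == activity}
    return max(successors, key=successors.get) if successors else None

# Hypothetical activity log: each trace is one ordered case.
log = [
    ["arrive", "triage", "treat", "discharge"],
    ["arrive", "triage", "test", "treat", "discharge"],
    ["arrive", "triage", "treat", "discharge"],
]

dfg = directly_follows_graph(log)
deviations = unseen_steps(["arrive", "treat", "discharge"], dfg)
suggestion = recommend_next("triage", dfg)
```

Real process discovery and alignment methods handle noise, concurrency, and timing, which this frequency-count sketch ignores.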
Workflow Data Visual Analytics
Workflow event logs (temporal activity sequences) are challenging to analyze and interpret at scale. I focused on building user-friendly, machine learning-driven visual analytics tools to support interactive workflow exploration and diagnosis. Working with students I supervised, we developed VIT-PLA 1.0 (Java/Java Swing) and VIT-PLA 2.0 (web-based with JSP and D3.js) for workflow data analysis.
Deep Representation Learning for Time Series (Bell Labs)
During my Bell Labs research internship, I studied deep representation learning for time-series
data using RNN-based autoencoders. I identified key failure modes where low reconstruction error
does not translate to strong downstream classification performance, especially under oscillatory
patterns and limited training data.
I explored architectural improvements inspired by ensemble and inception-style designs, and
demonstrated that combining deep representations with statistical features (e.g., TSFRESH)
yields more robust and discriminative representations. This work emphasized expert-free,
scalable feature learning for time-series analytics.
Smart Trauma Resuscitation Decision Support System (funded by NIH)
I worked on an NIH-funded project to build a decision support system for a smart trauma resuscitation room using data mining, machine learning, and process mining techniques. During trauma resuscitation, multidisciplinary teams rapidly identify and treat potentially life-threatening injuries, then develop and execute a short-term management plan for the identified injuries. To improve medical team performance and reduce adverse patient outcomes, we are developing a computerized decision support system for trauma resuscitations and other fast-paced, high-risk critical care settings. The system monitors the workflow and alerts users to errors, allowing remedial actions to be taken to prevent adverse outcomes.
Publications
Selected publications are listed below. A complete list is available on
Google Scholar.
- Keyi Li, Mary S. Kim, Wenjin Zhang, Sen Yang, Genevieve J. Sippel, Aleksandra Sarcevic, Randall S. Burd, Ivan Marsic.
Human Intention Recognition for Trauma Resuscitation: An Interpretable Deep Learning Approach for Medical Process Data. Journal of Biomedical Informatics, 161:104767, 2025.
- Keyi Li, Sen Yang, Travis M. Sullivan, Randall S. Burd, Ivan Marsic.
ProcessGAN: Generating Privacy-Preserving Time-Aware Process Data with Conditional Generative Adversarial Nets. ACM Transactions on Knowledge Discovery from Data, 18(9):1-31, 2024.
- Wenjin Zhang, Keyi Li, Sen Yang, Sifan Yuan, Ivan Marsic, Genevieve J. Sippel, Mary S. Kim, Randall S. Burd.
Focusing on What Matters: Fine-grained Medical Activity Recognition for Trauma Resuscitation via Actor Tracking. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4950-4958, 2024.
- Wenjin Zhang, Keyi Li, Sen Yang, Chenyang Gao, Wanzhao Yang, Sifan Yuan, Ivan Marsic.
MaskMatch: Boosting Semi-supervised Learning through Mask Autoencoder-driven Feature Learning. arXiv preprint arXiv:2405.06227, 2024.
- Keyi Li, Sen Yang, Chenyang Gao, Ivan Marsic.
Rutgers Multimedia Image Processing Lab at SemEval-2023 Task 1: Text-Augmentation-based Approach for Visual Word Sense Disambiguation. Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval), pp. 1483-1490, 2023.
- Xin Dong, Yaxin Zhu, Yupeng Zhang, Zuohui Fu, Dongkuan Xu, Sen Yang, Gerard de Melo.
Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification. Proceedings of the 43rd ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 1541-1544, 2020.
- Sen Yang, Aleksandra Sarcevic, Richard A. Farneth, Shuhong Chen, Omar Z. Ahmed, Ivan Marsic, Randall S. Burd.
An Approach to Automatic Process Deviation Detection in a Time-Critical Clinical Process. Journal of Biomedical Informatics, 85:155-167, 2018.
- Wangsu Hu, Zijun Yao, Sen Yang, Shuhong Chen, Peter J. Jin.
Discovering Urban Travel Demands through Dynamic Zone Correlation in Location-based Social Networks. European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD), pp. 88-104, 2018.
- Sen Yang, Weiqing Ni, Xin Dong, Shuhong Chen, Richard A. Farneth, Aleksandra Sarcevic, Ivan Marsic, Randall S. Burd.
Intention Mining in Medical Process: A Case Study in Trauma Resuscitation. IEEE International Conference on Healthcare Informatics (ICHI), pp. 36-43, 2018.
- Ran He, Sen Yang, Jingyuan Yang, Jin Cao.
Automated Mining of Approximate Periodicity on Numeric Data: A Statistical Approach. ACM International Conference on Compute and Data Analysis, 2018.
- Sen Yang, Moliang Zhou, Shuhong Chen, Xin Dong, Omar Ahmed, Ivan Marsic, Randall S. Burd.
Medical Workflow Modeling Using Alignment-Guided State-Splitting HMM. IEEE International Conference on Healthcare Informatics (ICHI), 2017.
- Sen Yang, Xin Dong, Leilei Sun, Yichen Zhou, Richard A. Farneth, Hui Xiong, Randall S. Burd, Ivan Marsic.
A Data-driven Process Recommender Framework. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 2111-2120, 2017.
- Sen Yang, Xin Dong, Moliang Zhou, Xinyu Li, Shuhong Chen, Rachel Webman, Aleksandra Sarcevic, Ivan Marsic, Randall S. Burd.
VIT-PLA: Visual Interactive Tool for Process Log Analysis. KDD Workshop on Interactive Data Exploration and Analytics (IDEA), 2016.
- Sen Yang, Moliang Zhou, Rachel Webman, JaeWon Yang, Aleksandra Sarcevic, Ivan Marsic, Randall S. Burd.
Duration-Aware Alignment of Process Traces. Industrial Conference on Data Mining (ICDM), pp. 379-393, 2016.
- Sen Yang.
Applied Process Mining, Recommendation, and Visual Analytics. Ph.D. Thesis, Rutgers University, 2019.
Academic Services
Journal Reviewing
- ACM Transactions on Knowledge Discovery from Data (TKDD)
- Journal of Biomedical Informatics (JBI)
- ACM Transactions on Management Information Systems (TMIS)
- IEEE Transactions on Big Data
Conference Program Committee / Reviewing
- ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
- ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops
- IEEE International Conference on Healthcare Informatics (ICHI)
- Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Annual Meeting of the Association for Computational Linguistics (ACL)