Welcome to my personal website!

I currently serve as a Software Engineer at Databricks Inc., working on the Apache Spark team to advance one of the fastest and most scalable data engines in the world. I also serve as an Adjunct Assistant Professor at UMass Amherst.

My research focuses on big data management, data-processing systems, and machine learning systems. I take a systems-driven approach by co-designing the key components of modern data-intensive pipelines, including workflow engines, UDF debugging frameworks, pipelining optimizers, and machine learning acceleration systems for streaming data. To optimize performance, usability, and scalability, I integrate techniques across data management, distributed systems, program analysis, and machine learning.

I have contributed extensively to the Apache Texera (Incubating) project, a collaborative and interactive system for data science and AI/ML using workflows. My research has been published in database venues such as SIGMOD, VLDB, and ICDE, and my interdisciplinary work spans venues including TOCHI, PNAS Nexus, JAMIA, AMIA, and PLOS ONE.

Education

Work

  • Software Engineer
    Databricks Inc. · Full-time
    Aug 2025 - Present · 7 mos
    Working on PySpark.
  • Adjunct Assistant Professor
    University of Massachusetts Amherst · Part-time
    May 2025 - Present · 10 mos
    Serving in an adjunct capacity.
  • Software Engineer Intern
    Observe Inc. · Internship
    Jun 2024 - Sep 2024 · 4 mos
    Dataset transformer, Log analytics, Streaming windows, Snowflake Time Travel
  • Research Intern
    VISA Inc. · Internship
    Jun 2022 - Sep 2022 · 3 mos
    Real-time window aggregation, Out-of-order events, In-memory data structures, Flink optimization
  • Research Intern
    ByteDance Inc. · Internship
    Jun 2020 - Sep 2020 · 3 mos
    HTAP database, Real-time query processing, MySQL-to-Kudu schema conversion, Lock-free heap structure

Awards

2026

2025

2024

2023

2020

  • Best Lecturer Award
    CUCS
    Recognized for excellence in teaching performance.

Selected Publications

2025

  1. ML-Asset Management: Curation, Discovery, and Utilization
    Mengying Wang, Moming Duan, Yicong Huang, Chen Li, Bingsheng He, and 1 more author
    Proc. VLDB Endow., 2025
  2. UDF-Centric Dataflow Systems for Supporting User-Defined Functions in Collaborative Data Science, AI, and ML
    Yicong Huang
    Dissertation, University of California, Irvine, 2025
  3. DS4ALL: Teaching High-School Students Data Science and AI/ML Using the Texera Workflow Platform as a Service
    Jiadong Bai, Xiaozhen Liu, Anthony Cuturrufo, Alexander Kundu Taylor, Jeehyun Hwang, and 7 more authors
    In Data Science Education K-12: Research to Practice Annual Conference (DSE-K12), Feb 2025

2024

  1. IcedTea: Efficient and Responsive Time-Travel Debugging in Dataflow Systems
    Shengquan Ni, Yicong Huang, Zuozhi Wang, and Chen Li
    Proc. VLDB Endow., Feb 2024
  2. Pasta: A Cost-Based Optimizer for Generating Pipelining Schedules for Dataflow DAGs
    Xiaozhen Liu, Yicong Huang, Xinyuan Lin, Avinash Kumar, Sadeem Alsudais, and 1 more author
    Proc. ACM Manag. Data, Feb 2024
  3. Texera: A System for Collaborative and Interactive Data Analytics Using Workflows
    Zuozhi Wang, Yicong Huang, Shengquan Ni, Avinash Kumar, Sadeem Alsudais, and 4 more authors
    Proc. VLDB Endow., Feb 2024
  4. Demonstration of Udon: Line-by-line Debugging of User-Defined Functions in Data Workflows
    Yicong Huang, Zuozhi Wang, and Chen Li
    In Companion of the 2024 International Conference on Management of Data, SIGMOD/PODS 2024, Santiago, Chile, June 9-15, 2024
  5. Data Science Tasks Implemented with Scripts versus GUI-Based Workflows: The Good, the Bad, and the Ugly
    Alexander K. Taylor, Yicong Huang, Junheng Hao, Xinyuan Lin, Xiusi Chen, and 2 more authors
    In 40th International Conference on Data Engineering, ICDE 2024 - Workshops, Utrecht, Netherlands, May 13-16, 2024
  6. Brain image data processing using collaborative data workflows on Texera
    Yunyan Ding, Yicong Huang, Pan Gao, Andy Thai, Atchuth Naveen Chilaparasetti, and 3 more authors
    Frontiers in Neural Circuits, Feb 2024