Bo Yang

I am an incoming Assistant Professor (2020.11-) in the Department of Computing at The Hong Kong Polytechnic University. I completed my D.Phil degree (2016.10-2020.09) in the Department of Computer Science at the University of Oxford, supervised by Profs. Niki Trigoni and Andrew Markham. Prior to Oxford, I obtained an M.Phil degree from The University of Hong Kong, where I was supervised by Prof. S.H. Choi, and a B.Eng degree from Beijing University of Posts and Telecommunications.

During my D.Phil study, I interned with the Augmented Reality team at Amazon (Palo Alto, CA). During my M.Phil study, I interned at Hong Kong Applied Science and Technology Research Institute. During my undergraduate study, I was an exchange student at Universitat Politècnica de València (Valencia, Spain).

Email / Google Scholar / GitHub


I'm interested in machine learning, computer vision, and robotics. My research goal is to build intelligent systems that enable machines to recover, understand, and eventually interact with the real 3D world. This includes accurate and efficient recognition, segmentation, and reconstruction of all individual objects within large-scale 3D scenes.

Research Positions: If you're interested in working with me, don't hesitate to drop me an email.

[2020.09.10] Successfully defended my D.Phil thesis, examined by Profs. M. Pawan Kumar and Andrew Davison.

[2020.03.08] Invited to present our RandLA-Net and 3D-BoNet at Shenlan. Here are the Video and Slides.

[2020.02.27] One co-authored paper for 3D semantic segmentation is accepted by CVPR 2020.

[2019.10.24] Successfully defended my D.Phil confirmation, examined by Profs. Andrew Zisserman and Alessandro Abate.

[2019.09.03] One first-authored paper for 3D instance segmentation is accepted as a spotlight at NeurIPS 2019.

[2019.08.16] One first-authored paper for multi-view 3D reconstruction is accepted in IJCV.

Publications / Preprints

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
Q. Hu, B. Yang*, S. Khalid, W. Xiao, N. Trigoni, A. Markham
arXiv, 2020
arXiv / Demo / Project page

We introduce an urban-scale photogrammetric point cloud dataset and extensively evaluate and analyze the state-of-the-art algorithms on the dataset.
(* indicates corresponding author)


PointLoc: Deep Pose Regressor for LiDAR Point Cloud Localization
Wei Wang, Bing Wang, Peijun Zhao, Changhao Chen, Ronald Clark, B. Yang, Andrew Markham, Niki Trigoni
arXiv, 2020

We present a learning-based LiDAR relocalization framework to efficiently estimate 6-DoF poses from LiDAR point clouds.


RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
Q. Hu, B. Yang*, L. Xie, S. Rosa, Y. Guo, Z. Wang, N. Trigoni, A. Markham
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
arXiv / Semantic3D Benchmark / News: (新智元, AI科技评论, CVer)/ Video/ Code
(* indicates corresponding author)

We introduce an efficient and lightweight neural architecture to directly infer per-point semantics for large-scale point clouds.


Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds
B. Yang, J. Wang, R. Clark, Q. Hu, S. Wang, A. Markham, N. Trigoni
Advances in Neural Information Processing Systems (NeurIPS), 2019 (Spotlight, 200/6743)
arXiv / ScanNet Benchmark / Reddit Discussion / News: (新智元, 图像算法, AI科技评论, 将门创投, CVer, 泡泡机器人)/ Video/ Code

We propose a simple and efficient neural architecture for accurate 3D instance segmentation on point clouds. It achieves state-of-the-art performance on ScanNet and S3DIS (as of June 2019).


DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network
W. Wang, M.R.U. Saputra, P. Zhao, P. Gusmao, B. Yang, C. Chen, A. Markham, N. Trigoni
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019

We propose a novel end-to-end deep parallel neural network to estimate the 6-DOF poses using consecutive 3D point clouds.


Robust Attentional Aggregation of Deep Feature Sets for Multi-view 3D Reconstruction
B. Yang, S. Wang, A. Markham, N. Trigoni
International Journal of Computer Vision (IJCV), 2019 (IF=6.07)
arXiv/ Springer Open Access/ Code

We propose an attentive aggregation module together with a training algorithm for multi-view 3D object reconstruction. It outperforms all existing pooling methods and recurrent neural networks.


Learning Semantically Meaningful Embeddings Using Linear Constraints
S. Lin, B. Yang, R. Birke, R. Clark
IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPR-W), 2019
CVF Open Access

We propose a simple embedding learning method that jointly optimises for an auto-encoding reconstruction task and for estimating the corresponding attribute labels.


Dense 3D Object Reconstruction from a Single Depth View
B. Yang, S. Rosa, A. Markham, N. Trigoni, H. Wen
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018 (IF=17.73)
arXiv/ IEEE Xplore/ Code

We propose a novel neural architecture to reconstruct the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks.


3D-PhysNet: Learning the Intuitive Physics of Non-Rigid Object Deformations
Z. Wang, S. Rosa, B. Yang, S. Wang, N. Trigoni, A. Markham
International Joint Conference on Artificial Intelligence (IJCAI), 2018
arXiv/ Code

We present a neural framework to predict how a 3D object will deform under an applied force using intuitive physics modelling.


Learning 3D Scene Semantics and Structure from a Single Depth Image
B. Yang*, Z. Lai*, X. Lu, S. Lin, H. Wen, A. Markham, N. Trigoni
IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPR-W), 2018
CVF Open Access / IEEE Xplore
(* indicates equal contribution)

We propose an efficient and holistic pipeline to simultaneously learn the semantics and structure of a scene from a single depth image.


Defo-Net: Learning Body Deformation Using Generative Adversarial Networks
Z. Wang, S. Rosa, L. Xie, B. Yang, S. Wang, N. Trigoni, A. Markham
IEEE International Conference on Robotics and Automation (ICRA), 2018
arXiv / Video/ IEEE Xplore/ Code

We present a novel generative adversarial network to predict body deformations under external forces from a single RGB-D image.


3D Object Reconstruction from a Single Depth View with Adversarial Learning
B. Yang, H. Wen, S. Wang, R. Clark, A. Markham, N. Trigoni
IEEE International Conference on Computer Vision Workshops (ICCV-W), 2017
arXiv / IEEE Xplore/ News: 机器之心/ Code

We propose a novel approach to reconstruct the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks.


Updating Wireless Signal Map with Bayesian Compressive Sensing
B. Yang, S. He, S-H G. Chan
ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM), 2016

We propose Compressive Signal Reconstruction (CSR), a novel learning system employing Bayesian compressive sensing (BCS) for online signal map update.


A mechanised 3D scanning method for item-level radio frequency identification of palletised products
S.H. Choi, B. Yang, H.H. Cheung
Computers in Industry, 2015 (IF=4.77)
Elsevier ScienceDirect

We propose a mechanised 3D scanning method for identification of tagged products in large numbers to facilitate supply chain management.


Item-level RFID for Enhancement of Customer Shopping Experience in Apparel Retail
S.H. Choi, Y.X. Yang, B. Yang, H.H. Cheung
Computers in Industry, 2015 (IF=4.77)
Elsevier ScienceDirect

We propose an item-level RFID-enabled retail store management system for relatively high-end apparel products, giving customers a more leisurely shopping experience and easier interaction with product information.


RFID Tag Data Processing in Manufacturing for Track-and-Trace Anti-counterfeiting
S.H. Choi, B. Yang, H.H. Cheung, Y.X. Yang
Computers in Industry, 2015 (IF=4.77)
Elsevier ScienceDirect

We present a track-and-trace anti-counterfeiting system, and propose a tag data processing and synchronization algorithm to generate initial e-pedigrees for products.


Hilary Term, 2019:    Knowledge Representation & Reasoning (University of Oxford).

Michaelmas Term, 2018:    Machine Learning (University of Oxford).

Michaelmas Term, 2017:    Machine Learning (University of Oxford).

Second Semester, 2014:    C++ Programming (The University of Hong Kong).


Alexander Trevithick (Oct 2019 - ):    Exeter College at University of Oxford.

Qingyong Hu (Oct 2018 - ):    Department of Computer Science at University of Oxford.

Jianan Wang (May - Dec 2018):    Now with Google DeepMind.

Zihang Lai (Oct 2017 - Mar 2018):    Now with VGG.

About Me

In my free time, I like playing tennis on grass, clay, and hard courts. I also like to fly drones for landscape photography. Here's a video of historic Oxford [Youtube, 腾讯视频], and another video of the scenic Lake District [Youtube]. Remember to turn up the volume for the background music.

Last update: 2020.09.12. Thanks.