Summary: | Master's === National Tsing Hua University === Department of Computer Science === 104 === We introduce a method for understanding road scenes and simultaneously predicting the hazard levels of three categories of objects in road scene images, using a fully convolutional network (FCN) architecture. Given a single input image, the multi-task model produces a fine segmentation result and a prediction of hazard levels in the form of a heatmap. The model is divided into three parts: a shared net, a segmentation net, and a hazard level net. The shared net and segmentation net use the encoder-decoder architecture of Badrinarayanan et al. [2]. The hazard level net is a fully convolutional network that estimates the hazard level of each segment from a coarse segmentation result.
We also provide a dataset with object segmentation ground truth and hazard levels for training and evaluating the proposed deep networks. To show that our network can learn highly semantic attributes of objects, we evaluate its performance with two measurements and compare it with a saliency-based method, which highlights the difference between predicting hazard levels and estimating human eye fixations.
|
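The data flow of the three-part model described in the summary can be illustrated with a minimal sketch. It only shows how one input passes through a shared stage once and then feeds two heads. Every function body below is a toy stand-in (clamping, thresholding, averaging) chosen for illustration, not the convolutional layers of the thesis; the names `shared_net`, `segmentation_net`, `hazard_net`, and `multi_task_forward` are hypothetical. For simplicity the sketch also reuses one segmentation for both outputs, whereas the summary distinguishes a fine segmentation output from the coarse segmentation used by the hazard level net.

```python
from typing import List, Tuple

Image = List[List[float]]    # H x W toy grayscale image
LabelMap = List[List[int]]   # H x W class ids (0 = background, 1 = object)
Heatmap = List[List[float]]  # H x W hazard levels


def shared_net(image: Image) -> Image:
    # Toy stand-in for the shared encoder: clamp pixel values to [0, 1].
    return [[min(max(p, 0.0), 1.0) for p in row] for row in image]


def segmentation_net(features: Image) -> LabelMap:
    # Toy stand-in for the segmentation decoder: threshold the features.
    return [[1 if f > 0.5 else 0 for f in row] for row in features]


def hazard_net(features: Image, seg: LabelMap) -> Heatmap:
    # Toy stand-in for the hazard head: every object pixel receives the
    # mean feature value of the object region; background receives 0.0.
    values = [f for f_row, s_row in zip(features, seg)
              for f, s in zip(f_row, s_row) if s == 1]
    level = sum(values) / len(values) if values else 0.0
    return [[level if s == 1 else 0.0 for s in s_row] for s_row in seg]


def multi_task_forward(image: Image) -> Tuple[LabelMap, Heatmap]:
    # Shared features are computed once and consumed by both heads.
    features = shared_net(image)
    seg = segmentation_net(features)
    heat = hazard_net(features, seg)
    return seg, heat


if __name__ == "__main__":
    seg, heat = multi_task_forward([[1.0, 0.25], [0.75, 0.0]])
    print(seg)
    print(heat)
```

The point of the structure is that the expensive shared computation happens once per image, while the two task-specific heads stay separate and the hazard head is conditioned on the segmentation.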