Summary: | Transient simulations of high-speed channels can be very time-intensive. Recurrent neural network (RNN) based methods can speed up the process by training an RNN model on a relatively short bit sequence and then using a multi-step rolling forecast to predict subsequent bits. However, the performance of an RNN model is highly sensitive to its hyperparameters. We propose an algorithm named adaptive successive halving automated hyperparameter optimization (ASH-HPO), which combines successive halving, Bayesian optimization (BO), and progressive sampling to tune the hyperparameters of RNN models. We also propose modifications to the successive halving and progressive sampling algorithms for better efficiency on time-series data. The ASH-HPO algorithm begins training on small subsets of the dataset, then progressively expands the training set while adaptively adding or removing candidate models along the way. In this paper, we use the ASH-HPO algorithm to optimize the hyperparameters of convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and CNN-LSTM networks. We demonstrate its effectiveness on a PCIe Gen 2 channel, a PCIe Gen 5 channel, and a PAM4 differential channel, and we investigate how several settings and tunable variables of the ASH-HPO algorithm affect its convergence speed. As benchmarks, we compare ASH-HPO against three state-of-the-art HPO methods: BO, successive halving, and Hyperband. The results show that ASH-HPO converges faster than the other methods on transient simulation problems.
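
As an illustration only, the following Python sketch shows the general shape of such a loop: candidates are trained on a small data budget, the worse half is discarded, the budget grows, and fresh candidates are injected between rounds. Every name here (sample_candidates, train_and_score, the search space, and the budget schedule) is a hypothetical placeholder; the paper's actual RNN training and BO surrogate are not reproduced, and plain random sampling stands in for the BO proposal step.

import random

def sample_candidates(n):
    """Draw n random hyperparameter configurations (hypothetical search space)."""
    return [{"lr": 10 ** random.uniform(-4, -2),
             "hidden": random.choice([32, 64, 128]),
             "layers": random.choice([1, 2, 3])}
            for _ in range(n)]

def train_and_score(config, n_samples):
    """Stand-in for training an RNN on the first n_samples time steps of the
    waveform and returning a validation error (lower is better)."""
    return random.random() / (config["hidden"] * config["lr"] * n_samples)

def ash_hpo(n_initial=16, budgets=(1_000, 2_000, 4_000, 8_000), n_new=2):
    pool = sample_candidates(n_initial)
    best = None
    for i, budget in enumerate(budgets):
        # progressive sampling: train every candidate on a growing data budget
        scored = sorted(((train_and_score(c, budget), c) for c in pool),
                        key=lambda t: t[0])
        best = scored[0]
        # successive halving: keep only the better-performing half
        pool = [c for _, c in scored[:max(1, len(scored) // 2)]]
        if i < len(budgets) - 1:
            # adaptive step: the paper fits a BO surrogate on (config, score)
            # pairs to propose replacements; random sampling stands in here
            pool += sample_candidates(n_new)
    return best

err, cfg = ash_hpo()
print(f"best config: {cfg}, validation error: {err:.4g}")

Under these toy assumptions, ash_hpo() returns the best (error, config) pair found; in a real setting, train_and_score would train the RNN on a prefix of the simulated bit sequence and return its rolling-forecast validation error.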
|