Civil-Comp Proceedings ISSN 1759-3433, CCP: 80
Proceedings of the Fourth International Conference on Engineering Computational Technology
Edited by: B.H.V. Topping and C.A. Mota Soares

Paper 131
On Training Sample Selection for Artificial Neural Networks using Number-Theoretic Methods
F. Tong+ and X.L. Liu*
+Department of Civil Engineering, Tsinghua University, Beijing, China
Full Bibliographic Reference for this paper
F. Tong, X.L. Liu, "On Training Sample Selection for Artificial Neural Networks using Number-Theoretic Methods", in B.H.V. Topping, C.A. Mota Soares, (Editors), "Proceedings of the Fourth International Conference on Engineering Computational Technology", Civil-Comp Press, Stirlingshire, UK, Paper 131, 2004. doi:10.4203/ccp.80.131

Keywords: artificial neural networks, number-theoretic methods (NTMs), NT-net, discrepancy, good lattice points (GLP-net), Hammersley-net.

Summary
Flexibility in generalization is the chief goal when an artificial neural network (ANN) model is set up. To this end, this paper seeks to improve the quality of the "teacher", i.e., to ensure the uniformity of the training-sample distribution by means of Number-Theoretic Methods (NTMs). NTMs are a family of deterministic number-theoretic algorithms that generate points scattered uniformly over the s-dimensional unit cube $C^s = [0,1]^s$. Since ANN prediction is in essence nonlinear interpolation, uniformity of the training samples helps to keep errors small on new samples unseen during training.
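As an illustration (ours, not part of the paper), the following minimal Python sketch generates one such NT-net, the Hammersley net, whose first coordinate is k/n and whose remaining coordinates are radical inverses of k in successive prime bases; the function names and the choice of n and s are assumptions for the example.

import numpy as np

def radical_inverse(k, base):
    # Van der Corput radical inverse: reflect the base-b digits of k
    # about the radix point, e.g. k = 6 = 110_2 -> 0.011_2 = 3/8.
    inv, denom = 0.0, 1.0
    while k > 0:
        denom *= base
        k, digit = divmod(k, base)
        inv += digit / denom
    return inv

def hammersley_net(n, s):
    # n-point Hammersley net in C^s: first coordinate is k/n, remaining
    # coordinates are radical inverses in the first s-1 prime bases.
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29][:s - 1]
    pts = np.empty((n, s))
    for k in range(n):
        pts[k, 0] = k / n
        for j, p in enumerate(primes, start=1):
            pts[k, j] = radical_inverse(k, p)
    return pts

samples = hammersley_net(64, 3)   # 64 candidate sample points in C^3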
Within the theoretical framework of NTMs, discrepancy is defined as a quantitative measure of the uniformity of a set of points: the smaller the discrepancy, the more uniformly the samples are distributed. In effect, discrepancy describes how well a set of points represents the uniform distribution on $C^s$.
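For reference, the standard star discrepancy of a point set $\mathcal{P} = \{x_1, \ldots, x_n\} \subset C^s$ (our notation; the summary itself does not reproduce the formula) is

$$ D^*(n, \mathcal{P}) = \sup_{\gamma \in C^s} \left| \frac{N(\gamma, \mathcal{P})}{n} - \prod_{i=1}^{s} \gamma_i \right|, $$

where $N(\gamma, \mathcal{P})$ counts the points of $\mathcal{P}$ falling in the box $[0, \gamma_1) \times \cdots \times [0, \gamma_s)$, whose volume is $\prod_{i=1}^{s} \gamma_i$; a low-discrepancy set keeps the empirical fraction of points in every such box close to the box's volume.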
In this paper, the GLP-net, Halton-net, and Hammersley-net are introduced as typical NT-nets. Training samples are prepared by the GLP-net (sketched below) and by the Hammersley-net, and are compared with equal-spaced samples for uniformity in terms of discrepancy value. Trained on these three types of samples, the ANN models show quite different performance in computational precision and stability: the ANNs trained on NTM-based samples outperform the others in generalization flexibility, as demonstrated through an engineering case study in this paper.

In conclusion, good uniformity of the training samples, rather than the indiscriminate piling up of more and more data, is what really helps to enhance the generalization performance of ANNs, and it can be proven mathematically that NTMs, unlike equal-spaced sampling, yield uniformly scattered samples.
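To make the GLP-net concrete, here is a short sketch (again ours, following the standard good-lattice-point construction) that maps a generating vector (n; h_1, ..., h_s) to the point set x_k = ({k h_1 / n}, ..., {k h_s / n}), k = 1, ..., n; the vector (1, 13) with n = 21 is the 2-D Fibonacci lattice, while vectors for practical use are taken from published GLP tables.

import numpy as np

def glp_net(n, h):
    # Good-lattice-point set for generating vector (n; h_1, ..., h_s):
    # x_k = ({k h_1 / n}, ..., {k h_s / n}), k = 1..n ({.} = fractional part).
    k = np.arange(1, n + 1).reshape(-1, 1)
    return np.mod(k * np.asarray(h), n) / n

pts = glp_net(21, (1, 13))   # Fibonacci lattice, a classic 2-D GLP-net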