Despite decades of research, it remains unclear which features of an input audio stimulus are encoded in the electroencephalogram (EEG). We investigate whether a frequency-band coupling model that estimates cortical neural activity from the EEG can capture the important features of an input audio stimulus. To do so, EEG recordings were acquired from 8 subjects during a listening task in which the vowels a, i and u were presented in random order. The neural activity was estimated from the EEG using a frequency-band coupling model that combines the phase of the EEG in the delta band (2-4 Hz) with its amplitude in the gamma band (30-100 Hz). To test whether the estimated neural activity captures relevant features of the input audio stimulus, we fitted a generalized linear model (GLM) to the estimated neural activity and applied a statistical relative deviance metric to quantify how much the input audio stimulus contributes to it. We demonstrate that the input audio stimulus is the main component explaining the estimated neural activity, whereas other factors, such as the surrounding network dynamics, do not contribute significantly. These results confirm that the features of the EEG used in the coupling model, namely the phase of the delta band and the power of the gamma band, do encode relevant aspects of an input audio signal. This non-invasive approach could be used, for example, to study how the presence of spectro-temporal features in the estimated neural activity changes with listening conditions or types of input sound.
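
As an illustration of the two EEG features named above, the following minimal sketch extracts the delta-band phase and the gamma-band amplitude from a single channel. It is not the coupling model itself, whose form is not specified here. The zero-phase Butterworth filter, its order, the use of the Hilbert transform, the 500 Hz sampling rate, the synthetic test signal, and the function names `bandpass` and `delta_phase_gamma_amplitude` are all assumptions made for the example; only the band limits (2-4 Hz and 30-100 Hz) come from the text.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert


def bandpass(x, low_hz, high_hz, fs, order=4):
    """Zero-phase Butterworth band-pass filter (design choice is an assumption)."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


def delta_phase_gamma_amplitude(eeg, fs):
    """Return (delta-band phase in radians, gamma-band amplitude envelope)."""
    delta = bandpass(eeg, 2.0, 4.0, fs)      # delta band: 2-4 Hz
    gamma = bandpass(eeg, 30.0, 100.0, fs)   # gamma band: 30-100 Hz
    delta_phase = np.angle(hilbert(delta))   # instantaneous phase in [-pi, pi]
    gamma_amplitude = np.abs(hilbert(gamma)) # instantaneous amplitude (envelope)
    return delta_phase, gamma_amplitude


# Synthetic single-channel signal (not real EEG): a 3 Hz rhythm plus a 40 Hz
# oscillation whose amplitude is modulated at 3 Hz.
fs = 500.0                                   # assumed sampling rate in Hz
t = np.arange(0, 10, 1 / fs)
envelope = 0.5 * (1 + 0.8 * np.cos(2 * np.pi * 3 * t))
eeg = np.sin(2 * np.pi * 3 * t) + envelope * np.sin(2 * np.pi * 40 * t)

delta_phase, gamma_amplitude = delta_phase_gamma_amplitude(eeg, fs)
```

On this synthetic signal, the recovered gamma amplitude should closely follow the imposed 3 Hz envelope away from the edges of the recording, where filtering and the Hilbert transform introduce boundary effects.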