Identifying perceived loudness in audio signals

When processing audio, it is often useful to have algorithms that extract volume and loudness information, for example to locate speech within long intervals of silence. The raw sample values stored in a digital audio file do not necessarily represent how a human perceives the sound: tones of equal intensity at high and low frequencies are perceived as having different loudness. Previously used methods of calculating perceived loudness include analyzing the frequency spectrum of the sound. This paper presents a series of three processing operations on a sound signal that attempt to estimate its loudness as perceived by a human. The first operation accounts for the human perception of loudness at different frequencies. The second creates an envelope of the sound signal while preventing impulses, short bursts of sound such as a gunshot or a handclap, from being filtered out. The third operation accounts for the human perception of impulse loudness. The output of this process is an envelope of the original sound signal that represents how a human would perceive the audio volume. The envelope removes the sinusoidal components of the signal while retaining its amplitude information, with some other minor adjustments.
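
To make the first two operations concrete, the following is a minimal sketch under stated assumptions, not the method developed in this paper. It assumes the standard A-weighting curve as a stand-in for frequency-dependent loudness perception, and a peak follower with instant attack and exponential release as one simple way to build an envelope that does not smooth away impulses. The function names and the 50 ms release time are hypothetical choices made for illustration. The third operation (perceived impulse loudness) is not sketched here.

```python
import math


def a_weight_db(freq_hz):
    """Gain of the standard A-weighting curve, in dB, at freq_hz.

    Assumption: A-weighting stands in for the frequency-dependent
    loudness correction; the paper's own weighting may differ.
    """
    f2 = freq_hz ** 2
    num = (12194.0 ** 2) * f2 ** 2
    den = ((f2 + 20.6 ** 2)
           * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    return 20.0 * math.log10(num / den) + 2.00


def peak_envelope(samples, sample_rate, release_s=0.05):
    """Rectify the signal, then track peaks.

    Attack is instantaneous, so a one-sample impulse is kept at full
    height; release is exponential with time constant release_s
    (a hypothetical default), which smooths out the carrier cycles.
    """
    decay = math.exp(-1.0 / (release_s * sample_rate))
    envelope = []
    level = 0.0
    for x in samples:
        level = max(abs(x), level * decay)
        envelope.append(level)
    return envelope


# Usage: a 1 kHz tone of amplitude 0.5 with one full-scale impulse.
sample_rate = 8000
signal = [0.5 * math.sin(2 * math.pi * 1000 * n / sample_rate)
          for n in range(800)]
signal[100] = 1.0  # single-sample impulse, like a click
envelope = peak_envelope(signal, sample_rate)
```

In this sketch the envelope rides along the peaks of the tone rather than oscillating with it, and the single-sample impulse survives at its full height before the envelope decays back toward the tone level. A plain low-pass smoother would instead average most of the impulse away, which is the failure the second operation is meant to avoid.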