In this paper, we propose a method for embedding images into a speech signal. The image is embedded in the high-frequency band of the speech signal, above 16 kHz, so that it is difficult for humans to perceive. Before embedding, the image is binarized by the error diffusion method. During embedding, we apply an iterative phase restoration method to avoid generating noise. We also investigate a method for compensating the degradation in extracted-image quality caused by transmission channel characteristics. Simulations and real-environment experiments showed that the proposed method significantly improved the quality of the extracted image at the receiver.
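
As an illustration of the binarization step only, the following is a minimal sketch of error diffusion in Python. The abstract does not state which diffusion kernel is used, so the Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16) are an assumption here, as are the function name `error_diffusion_binarize`, the threshold of 0.5, and the input scaling to [0, 1]. It is not the authors' implementation.

```python
import numpy as np


def error_diffusion_binarize(gray, threshold=0.5):
    """Binarize a grayscale image with values in [0, 1].

    Hypothetical helper: uses the Floyd-Steinberg kernel, which is an
    assumption, since the source does not name a specific kernel.
    """
    work = np.asarray(gray, dtype=np.float64).copy()
    height, width = work.shape
    binary = np.zeros((height, width), dtype=np.uint8)

    # Raster scan: quantize each pixel, then push the quantization
    # error onto neighbours that have not been visited yet.
    for y in range(height):
        for x in range(width):
            old = work[y, x]
            new = 1.0 if old >= threshold else 0.0
            binary[y, x] = int(new)
            err = old - new

            if x + 1 < width:
                work[y, x + 1] += err * 7 / 16      # right
            if y + 1 < height:
                if x > 0:
                    work[y + 1, x - 1] += err * 3 / 16  # below left
                work[y + 1, x] += err * 5 / 16          # below
                if x + 1 < width:
                    work[y + 1, x + 1] += err * 1 / 16  # below right
    return binary


if __name__ == "__main__":
    # A uniform mid-gray patch becomes a mix of 0s and 1s whose
    # average stays close to the original brightness.
    patch = np.full((32, 32), 0.5)
    print(error_diffusion_binarize(patch).mean())
```

Because the quantization error is passed on to neighbouring pixels rather than discarded, the local average of the 1-bit output stays close to the original gray level, which is what lets a binary image still convey tone.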

