Recognizing Human Actions by Fusing Spatio-temporal Appearance and Motion Descriptors