Speech Emotion Recognition by Combining a Unified First-Order Attention Network With Data Balance